
First image neural network


I have wanted to do a project for a while now, but I don't know where to start, since I have never trained a neural network before and this task is quite complex. Could you point me in the right direction with resources such as tutorials or examples? I can't find anything about this topic on Google.


I have programmed before, but mostly in robotics, nothing fancy. Now I want to create an "AI" image generator similar to the cartoon or anime filters you find in apps or on the internet, where you upload a photo or video and it generates a counterpart image that is similar but goofy or cartoonified. I am looking for ways to take motion data and image data and use them to make a video-to-cartoon-style AI video generator. I know some of you may think this is off limits for a programming novice like myself, but I assure you I understand the theoretical workings of all the subjects needed, and I am eager to learn how to do it, since creating this is a passion of mine.


The schematic for something like this is going to look like this:

Motion kinematic data ----|
                          |        ____________________________
                          |------> |                          |
                                   | Neural network framework | ---> newly generated frame that has been cartoonified
                          |------> |__________________________|
Rendered 3D figure -------|
   video or image

I am not looking for straight code (if you have some, I would be more than happy to take it), just for guidance on how to start coding this.


Neural networks as such aren't programmed, they are trained.


What you will need is a large amount of input and the desired output for each input. You will have to "show" both of them to the NN, so that over time it learns how to generate the desired output from the input. Once it has been trained, you can give it an image and it should generate a modified image.
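To make the "show it input and desired output" idea concrete, here is a minimal sketch in plain Python. It is only a toy: the "network" is a single weight and bias applied to every pixel, and the "style" it has to learn (`0.5 * pixel + 0.2`) is a made-up stand-in for a cartoon filter. A real project would use a convolutional network in a framework such as PyTorch or TensorFlow, but the training loop has the same shape.

```python
import random


def make_pair(rng, n_pixels=16):
    """Return one (input image, desired output image) pair as flat pixel lists.

    The target transform is an assumption for illustration only.
    """
    x = [rng.random() for _ in range(n_pixels)]
    y = [0.5 * p + 0.2 for p in x]
    return x, y


def train(pairs, epochs=100, lr=0.3):
    """Fit output = w * input + b by gradient descent on mean squared error."""
    w, b = 1.0, 0.0  # start from the identity mapping
    for _ in range(epochs):
        for x, y in pairs:
            n = len(x)
            # Gradients of the mean squared error with respect to w and b.
            grad_w = sum(2 * (w * xi + b - yi) * xi for xi, yi in zip(x, y)) / n
            grad_b = sum(2 * (w * xi + b - yi) for xi, yi in zip(x, y)) / n
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


rng = random.Random(0)
pairs = [make_pair(rng) for _ in range(50)]  # the "training material"
w, b = train(pairs)

# Once trained, the model is applied to an image it has never seen.
new_image = [0.0, 0.5, 1.0]
stylized = [w * p + b for p in new_image]
print(round(w, 3), round(b, 3), [round(p, 3) for p in stylized])
```

Nothing in the loop knows the target transform; it is recovered purely from the pairs, which is why the quantity and quality of those pairs matter so much.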


The biggest issue here isn't going to be programming (the underlying NN), but generating a suitable amount of training material to train the NN.



  • 3 weeks later...

I agree with the above, but I'm thinking about this from a robotics perspective.


So let's shrink down the scope of the problem: we take video of people walking around and doing stuff along with a sample image of an art style and output a video of the same objects and interactions but cartoonified/animated.


As an MVP, I would think about using an existing AI that can describe what it sees in an image as text, then plugging that into an AI that can translate text into video. You'd have to do a bit of scripting to split the video into images and to maintain an open context across each image, but it'll probably be doable.
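The scripting part of that MVP could be sketched roughly like this. Everything model-related is an assumption: `describe` and `generate` are hypothetical callables standing in for whatever captioning and text-to-video (or text-to-image) models you end up using, and the prompt format is made up. The sketch only shows the glue: walking the frames and carrying a short rolling context of earlier captions.

```python
from collections import deque


def stylize_video(frames, describe, generate, style, context_size=3):
    """Caption each frame, then generate a stylized frame from a prompt
    that includes the captions of the previous few frames.

    describe: hypothetical callable, frame -> caption text
    generate: hypothetical callable, prompt text -> generated output
    """
    context = deque(maxlen=context_size)  # rolling window of earlier captions
    outputs = []
    for frame in frames:
        caption = describe(frame)
        previously = " | ".join(context) if context else "nothing"
        prompt = f"Style: {style}. Previously: {previously}. Now: {caption}"
        outputs.append(generate(prompt))
        context.append(caption)
    return outputs


# Stub models so the sketch runs without any AI library installed.
frames = ["person standing", "person walking", "person waving"]
result = stylize_video(
    frames,
    describe=lambda frame: frame,              # pretend the frame is its own caption
    generate=lambda prompt: {"prompt": prompt},  # pretend output just records the prompt
    style="watercolor cartoon",
)
for item in result:
    print(item["prompt"])
```

Passing the models in as functions keeps the loop independent of any particular library, so either stub can be swapped for a real model later without touching the rest.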


This obviously isn't going to be very good, so perhaps I would use a separate AI for object detection and also add pose estimation for detected humans. Parse that data and attach it to what we had above, and the accuracy should go up.
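Parsing and attaching that extra data might look something like the following. This is a sketch under assumptions: the `Detection` structure, its field names, and the text format are all hypothetical, and a real object detector or pose estimator will return its own output format that you would have to convert first.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """One detected object (hypothetical structure, not a real library type)."""
    label: str
    box: tuple  # (x, y, width, height) in pixels
    keypoints: dict = field(default_factory=dict)  # joint name -> (x, y), humans only


def build_conditioning(caption, detections):
    """Append object positions and human poses to the frame caption."""
    parts = [caption]
    for det in detections:
        x, y, w, h = det.box
        text = f"{det.label} at ({x},{y}) size {w}x{h}"
        if det.keypoints:
            joints = ", ".join(
                f"{name}=({jx},{jy})"
                for name, (jx, jy) in sorted(det.keypoints.items())
            )
            text += f" pose[{joints}]"
        parts.append(text)
    return "; ".join(parts)


detections = [
    Detection("person", (10, 20, 50, 120), {"head": (35, 25), "left_hand": (12, 80)}),
    Detection("dog", (70, 100, 40, 30)),
]
conditioning = build_conditioning("a person walks a dog", detections)
print(conditioning)
```

Whether the generating model actually makes good use of coordinates written out as text is an open question; some models accept pose or layout as a separate input instead, which would replace the string formatting here.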


None of this is easy and I have no idea how doable it would be, especially as a hobbyist. I'd consider this at least master's thesis level work.

