Why can't AI get hands right?

Mark Kaine

It's so weird to me: faces, body shapes... not a problem, but hands always look like an alien or a 2-year-old drew them...?

What gives? Is it some kind of "copyright protection"?

Just an observation I made, especially baffling for "art" that gets sold for 2.50 or more per JPG lol.

The direction tells you... the direction

-Scott Manley, 2021

 

Softwares used:

Corsair Link (Anime Edition) 

MSI Afterburner 

OpenRGB

Lively Wallpaper 

OBS Studio

Shutter Encoder

Avidemux

FSResizer

Audacity 

VLC

WMP

GIMP

HWiNFO64

Paint

3D Paint

GitHub Desktop 

Superposition 

Prime95

Aida64

GPUZ

CPUZ

Generic Logviewer

 

 

 


I guess you must be looking at older models or something, there are many where this is a non-issue.

 

Older models just weren't properly trained to deal with that.

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


3 minutes ago, igormp said:

I guess you must be looking at older models or something, there are many where this is a non-issue.

 

Older models just weren't properly trained to deal with that.

is October 2023 old? if yes then yes, if no then no.

 

You're right, I've seen some that weren't as bad, but in most AI "art" the hands look horrendous.

 


Well, because hand models don't show all of their hand.

wd-40 is god there is no other god than wd-40 wd-40 is the solution to all problems


13 minutes ago, Mark Kaine said:

is October 2023 old? if yes then yes, if no then no.

 

you're right ive seen some that weren't as bad, but most ai "art" the hands look horrendous. 

 

 

Yep, that recent batch of AI videos that came out, from OpenAI's Sora, also struggled with hands in at least one of the examples (grandma blowing out candles, I think). I also wondered about that. Is it as simple as hands being a small detail, the models struggling with all small details, and hands just being super noticeable to us?

 

I'm not sure.


7 minutes ago, Mark Kaine said:

its so weird to me, faces, body shapes... not a problem but hands always look like its an alien or a 2yo drew it...?

 

what gives? is it some kind of "copyright protection"?

 

just an observation i made, especially baffling for "art" that gets sold for 2.50 or more per jpg lol.

Machine learning algorithms generally struggle with any irregular pattern made of (almost) identical things.

Pasta is really hard:

[attached image: pasta]

Screws are also quite complicated:

[two attached images: screws]

And nobody has trained the poor AI generator on the difference between nails and fingernails, leading to this abomination:

[two attached images]

or this one:

[attached image]


I think it may have a lot to do with the data they're trained on and how often hands are obscured in photos. Fingers can be blocked by other body parts, clothing, objects a person is holding, or even obscured by the hand itself.

 

For example this image (top search result for "photo of model"). 

The hands are there, and we know they probably have 4 fingers and a thumb on each hand, but how many fingers are really visible? If you had no idea what hands looked like and you were shown this image you would see a hand with a thumb and finger and another hand with a thumb and two fingers. 

[attached image: photo of a model]

 

 

I think AI will get better at this stuff when it's trained on 3D models, rather than photos. It'll give the AI a much better sense of objects and where they belong in the world and how they interact with other objects.

CPU: Intel i7 6700k  | Motherboard: Gigabyte Z170x Gaming 5 | RAM: 2x16GB 3000MHz Corsair Vengeance LPX | GPU: Gigabyte Aorus GTX 1080ti | PSU: Corsair RM750x (2018) | Case: BeQuiet SilentBase 800 | Cooler: Arctic Freezer 34 eSports | SSD: Samsung 970 Evo 500GB + Samsung 840 500GB + Crucial MX500 2TB | Monitor: Acer Predator XB271HU + Samsung BX2450


Have you tried drawing hands? It's difficult for humans too, so no wonder AI is struggling..

Note: Users receive notifications after Mentions & Quotes. 

Feel free to ask any questions regarding my comments/build lists. I know a lot about PCs but not everything.

PC:

Ryzen 5 5600 |16GB DDR4 3200Mhz | B450 | GTX 1080 ti

PCs I used before:

Pentium G4500 | 4GB/8GB DDR4 2133Mhz | H110 | GTX 1050

Ryzen 3 1200 3,5Ghz / OC:4Ghz | 8GB DDR4 2133Mhz / 16GB 3200Mhz | B450 | GTX 1050

Ryzen 3 1200 3,5Ghz | 16GB 3200Mhz | B450 | GTX 1080 ti


16 minutes ago, podkall said:

Have you tried drawing hands? It's difficult for humans too, so no wonder AI is struggling..

Sure, it's difficult for a human to draw them accurately, but as a human you generally know that humans aren't supposed to have more than 4 fingers and 1 thumb.

 

The AI basically just learned that there's a variable number of things there, so that's what it replicates

Remember to either quote or @mention others, so they are notified of your reply


40 minutes ago, HenrySalayne said:

Machine learning algorithms generally struggle with any irregular pattern made of (almost) identical things.

body hair when static


1 hour ago, Mark Kaine said:

is October 2023 old? if yes then yes, if no then no.

 

you're right ive seen some that weren't as bad, but most ai "art" the hands look horrendous. 

 

Depending on which model you used, it may as well be.

 

Most AI "art" usually uses cheaper/lightweight models; I'm not sure if DALL-E or similar services still have this issue.


51 minutes ago, Spotty said:

I think it may have a lot to do with the data they're trained on and how often hands are obscured in photos. Fingers can be blocked by other body parts, clothing, objects a person is holding, or even obscured by the hand itself.

 

For example this image (top search result for "photo of model"). 

The hands are there, and we know they probably have 4 fingers and a thumb on each hand, but how many fingers are really visible? If you had no idea what hands looked like and you were shown this image you would see a hand with a thumb and finger and another hand with a thumb and two fingers. 

[attached image: photo of a model]

 

 

I think AI will get better at this stuff when it's trained on 3D models, rather than photos. It'll give the AI a much better sense of objects and where they belong in the world and how they interact with other objects.

Yeah, one thing I thought of: sometimes in photos or drawings hands can look "weird" due to the angle or something, but they still usually look anatomically correct or natural. AI, though, seems to get this just wrong more often than not... probably down to "training", sure, but as shown it generally seems to have issues with certain things.

Still, I think details like a hand are something it should probably get right before seeing commercial usage (of course that's wishful thinking I guess, my opinion nonetheless).


5 minutes ago, igormp said:

Depending on which model you used, it may as well be.

 

Most ai "art" usually uses cheaper/lightweight models, I'm not sure if dall-e or similar services still have this issue.

Well, I didn't use anything, I just said it's something I've repeatedly noticed and have also seen others pointing out. And it's a pretty interesting subject IMO why these AIs can get certain things astonishingly right and others astonishingly wrong.

Like, there doesn't seem to be any kind of "fact checking" mechanism (there probably is, but it just doesn't seem to work in these cases then).


22 minutes ago, Eigenvektor said:

Sure, it's difficult for a human to draw them accurately, but as a human you generally know that humans aren't supposed to have more than 4 fingers and 1 thumb.

 

The AI basically just learned that there's a variable number of things there, so that's what it replicates

I mean yeah, it's an AI algorithm: it's doing it ASAP and it doesn't have built-in reasoning for things like "did I put 2 thumbs there", it just sees 5 fingers which approximately scale in size and shape based on their placement.


11 minutes ago, podkall said:

I mean yeah, it's an AI algorithm: it's doing it ASAP and it doesn't have built-in reasoning for things like "did I put 2 thumbs there", it just sees 5 fingers which approximately scale in size and shape based on their placement.

Yeah, as others have said the number of fingers you see in an image can vary a lot depending on the angle of the hand, whether the person is holding an object and so on. Without some form of understanding what a hand actually is and how it works, it's a difficult thing to replicate. As you said, it's difficult for a human to do, and we generally know how hands work from personal experience.

 

Current AI doesn't really have any form of understanding, it just replicates patterns.


1 minute ago, Eigenvektor said:

Yeah, as others have said the number of fingers you see in an image can vary a lot depending on the angle of the hand, whether the person is holding an object and so on. Without some form of understanding what a hand actually is and how it works, it's a difficult thing to replicate. As you said, it's difficult for a human to do, and we generally know how hands work from personal experience.

We also analyze 3D space with those things called eyes, and a brain, right?

So we understand what's a shadow and what's a gap, whereas AI could mistake a shadow for a gap and assume there are more than 5 fingers, or see more fingers at an angle and assume that when the hand isn't angled there are even more fingers?


12 minutes ago, podkall said:

we also analyze 3D space with the things called eyes, and brain right?

 

so we understand what's shadow and what's gap, because AI could mistake shadow for gap and assume there's more than 5 fingers, or see more fingers at angle assuming when the hand isn't angled there's even more fingers?

Exactly. You know what a three-dimensional object is, that your hand is such an object, and you can move and observe your own hand at various angles to figure things out in general.

 

An AI has no such understanding. It's just a bunch of data that has a specific pattern. It replicates that pattern. There's a bunch of small appendages and there's apparently a variable number of them. So it replicates a variable number. Without some form of constraint it simply doesn't know there can't be more than five.

 

I mean, if you had to draw a cat's face, you would probably draw a bunch of whiskers on it. But without looking into it, you probably wouldn't know how many is too many, or how few is too few. So you'd look at a bunch of images and draw an approximate match. But if you wanted to draw a photorealistic image, you'd likely need to look up some basic information about cats to learn how many there actually should be.

 

AI simply doesn't have that level of understanding. It can't go "Hm, maybe I should learn the basics about human anatomy first".
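
To make the "variable number" point concrete, here is a toy sketch in Python. It is not how a real image generator works: the visibility probability `p_visible`, the clasped-hands probability `p_clasped`, and all function names are made up for illustration. It only shows that something which copies the frequencies it sees in photos will reproduce a variable digit count, sometimes more than five, unless a constraint is added from outside.

```python
import random
from collections import Counter

def visible_digits(rng, p_visible=0.7, p_clasped=0.15):
    """Simulate one photo: how many digits can be seen in the 'hand' region.

    Assumption: each digit is independently visible with probability
    p_visible, and with probability p_clasped two hands overlap, so the
    region contains digits from both hands (up to 10).
    """
    digits_in_region = 10 if rng.random() < p_clasped else 5
    return sum(rng.random() < p_visible for _ in range(digits_in_region))

def learn_distribution(rng, n_photos=10_000):
    """'Training': tally how often each visible count shows up."""
    counts = Counter(visible_digits(rng) for _ in range(n_photos))
    return {k: counts[k] / n_photos for k in sorted(counts)}

def generate_count(rng, dist, max_digits=None):
    """'Generation': sample a count from the learned frequencies.

    max_digits is the optional hard constraint the model never learned.
    """
    values = list(dist)
    count = rng.choices(values, weights=[dist[v] for v in values], k=1)[0]
    return count if max_digits is None else min(count, max_digits)

rng = random.Random(0)
dist = learn_distribution(rng)
unconstrained = [generate_count(rng, dist) for _ in range(1000)]
constrained = [generate_count(rng, dist, max_digits=5) for _ in range(1000)]
print("learned frequencies:", dist)
print("largest unconstrained count:", max(unconstrained))
print("largest constrained count:", max(constrained))
```

The `max_digits=5` cap is the kind of outside knowledge a human brings along; the sampled frequencies alone never contain it.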


The tl;dr 

 

AIs are just giant probability engines at the moment. They don't have any concept of "fingers", they just know that when someone asks for a finger, a certain type of result should fall out. They don't know that a hand should have 5 fingers, they just know that if it sees a finger to one side, it should be drawing another until it "looks" right to it.
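
A toy way to picture that last sentence, in Python. Real generators do not draw finger by finger, so treat this purely as an analogy; `p_looks_unfinished`, `max_fingers` and `draw_hand` are invented for the sketch. The loop never counts to five: it just keeps adding a finger while a local check says the hand still looks unfinished.

```python
import random

def draw_hand(rng, p_looks_unfinished=0.8, max_fingers=12):
    """Add fingers one at a time, guided only by a local 'looks unfinished' check."""
    fingers = 1  # there is already one finger on the canvas
    # No global rule like "stop at five" exists anywhere in this loop.
    while fingers < max_fingers and rng.random() < p_looks_unfinished:
        fingers += 1
    return fingers

rng = random.Random(42)
hands = [draw_hand(rng) for _ in range(1000)]
share_with_five = sum(h == 5 for h in hands) / len(hands)
print("share of hands with exactly five fingers:", share_with_five)
print("most fingers on a single hand:", max(hands))
```

With these made-up numbers only a small fraction of hands come out with exactly five fingers, and some come out with far more.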

3735928559 - Beware of the dead beef


16 hours ago, Eigenvektor said:

An AI has no such understanding

I like this whole post, but this in particular just means that "AI" is even worse than I thought. I mean, that's just a basic thing that *should* be programmed in; it's what I meant by fact checking...

And yes, that mostly explains it: a face is a lot more static than a hand, obviously. Still, with some basic algorithm it should be possible. I mean, games mostly get it right too, and they're not hand-drawn for every single frame either.

Also, I guess it's possible anyway, just that most publicly available engines aren't sophisticated, aka "intelligent", enough.


24 minutes ago, Mark Kaine said:

i mean games get it right , mostly, too, and they're not handdrawn for every single frame either. 

Hands in games are three-dimensional models made and animated by humans, who intuitively know what hands are and presumably have at least one to reference at all times, in any possible pose, from any angle.

 

The near-infinite variability in angles and poses hands can make, projected into two-dimensional images, makes for a huge variation in what hands are if you're a pattern-recognition algorithm processing a pile of JPEGs.

 

28 minutes ago, Mark Kaine said:

also i guess its possible anyway just that most publicly available engines aren't sophisticated, aka "intelligent" enough. 

I think it's mostly because image generation algorithms don't know what a hand is, they "know" the word "hand" represents a pattern of pixels. Maybe they can get better if they're fed more hand gestures from multiple angles, or are trained to follow objects in three-dimensional space.

 

Maybe all the artists flipping off image generation algorithms are onto something...

I sold my soul for ProSupport.


Fingers are not consistently shown in photos whereas faces tend to be fully shown. Finger count is also something we can immediately spot as "wrong", whereas stuff like face wrinkles, for example, can vary a lot from person to person and are not necessarily "right" or "wrong".

Don't ask to ask, just ask... please 🤨

sudo chmod -R 000 /*


48 minutes ago, Needfuldoer said:

Hands in games are three-dimensional models made and animated by humans, who intuitively know what hands are and presumably have at least one to reference at all times, in any possible pose, from any angle.

 

The near-infinite variability in angles and poses hands can make, projected into two-dimensional images, makes for a huge variation in what hands are if you're a pattern-recognition algorithm processing a pile of JPEGs.

 

I think it's mostly because image generation algorithms don't know what a hand is, they "know" the word "hand" represents a pattern of pixels. Maybe they can get better if they're fed more hand gestures from multiple angles, or are trained to follow objects in three-dimensional space.

 

Maybe all the artists flipping off image generation algorithms are onto something...

Yes, but that's the thinking behind my post: if there's an "intelligent" algorithm that's supposed to make realistic pictures of humans, it needs some kind of realistic 3D model included, just to check whether what it does is at least somewhat correct... But it really seems to be mostly just collecting data from existing 2D pictures, so it isn't that and it cannot work...

I just think it makes all the hype about the proposed power (and the companies behind it) even more, well, overblown...

I mean, some of the stuff can't even be explained with "uh, it doesn't understand what a hand is", because it gets it so wrong I don't even know where it got the blueprints for it from lol.

So basically, what I learned is that there's no fact checking and the promotional stuff we see is probably all tinkered with... I mean, most of these errors can be easily rectified with some human interaction/Photoshop, I would think...

 


4 hours ago, Mark Kaine said:

i mean that's just a basic thing that *should* be programmed in, it's what i meant with fact checking...

The thing is, AIs aren't programmed in the classical sense, they are trained. You give them some input and a desired output. Then you repeat that step millions or even billions of times. That process adjusts some internal weights in a neural network. Once trained, that somehow produces more or less useful output given some input.

 

For example you would give it the input "Dog", then provide multiple images of dogs as possible outputs for that prompt. Then you repeat that step with another prompt like "Chair", "Dog in front of chair", "Dog on top of chair", etc etc… and provide appropriate images each time. Repeat that with millions of different inputs and billions of possible outputs.

 

Once training is complete, you should only need to provide an input to get an output that matches what you describe (to some degree of accuracy).

 

We know that this approach works, but we don't really understand why, in the same sense we don't really know how the human brain works. Because of that you can't effectively debug an AI. You can train it some more, add some constraints to it, but you can't fundamentally understand and fix any issues it has.

 

There is no real intelligence behind that algorithm, it doesn't understand what it's doing, it can't second guess itself. You provide an input, that input runs through the neural network and that produces some output.
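
As a rough sketch of what "trained, not programmed" means, here is a tiny Python example. It is nothing like an image model: a single weight and bias, a made-up rule (output = 2 * input + 1) and an arbitrary learning rate, all chosen only for illustration. Nobody writes the rule into the code; the loop just nudges the weights toward the desired outputs over and over.

```python
def train(examples, epochs=2000, learning_rate=0.05):
    """Repeatedly show (input, desired output) pairs and nudge the weights."""
    weight, bias = 0.0, 0.0  # the "internal weights" start out knowing nothing
    for _ in range(epochs):
        for x, desired in examples:
            error = (weight * x + bias) - desired  # how far off the output is
            weight -= learning_rate * error * x    # adjust to shrink the error
            bias -= learning_rate * error
    return weight, bias

# Training data for the hidden rule "output = 2 * input + 1" (an assumption).
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train(examples)

def predict(x):
    """Run an input through the trained 'network' to get an output."""
    return weight * x + bias

print(f"learned weight={weight:.3f}, bias={bias:.3f}")
print(f"prediction for unseen input 10: {predict(10.0):.3f}")
```

The learned numbers end up close to 2 and 1, but nothing in `train` "knows" the rule, and there is no line you could edit to add a fact like "hands have five fingers".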


6 minutes ago, Eigenvektor said:

We know that this approach works

Well yes we do, but only to a certain degree, and that's the problem: it's not reliable...

 

7 minutes ago, Eigenvektor said:

There is no real intelligence behind that algorithm,

Which is why there are laws against false advertising.

"Experimental intelligence" would be way more fitting and not as misleading.

Basically, at this point you'd literally need huge warning signs on everything that "AI" produces, just like you do with cigarettes and drugs etc., or alternatively stop calling it "AI" when there's nothing really intelligent about it. That would solve so many issues.

Like, I'm not saying it can't be useful, but the current usage is overall probably more harmful than good.

 

 


Just now, Mark Kaine said:

like im not saying it can't be useful, but the current usage is overall probably more harmful than good.

Absolutely. There's a big issue with how AI is marketed. That often leads less tech-savvy people to grossly overestimate its capabilities, or to be unaware of its limitations.

 

Things like DLSS are certainly on the useful end of the spectrum, and the marketing around it isn't misleading in the same sense the marketing around other AI products is. At least I'm not aware of Nvidia selling it as something that has actual intelligence behind it.

 

I'm much more torn on something like ChatGPT. It has its uses, but I think a lot of people aren't exactly aware of its limitations. And its creators aren't necessarily interested in pointing that out, since selling a novelty toy to people makes them money. Which allows for further research into a field that likely has a much more lucrative use elsewhere.
