Previous posts: https://programming.dev/post/3974121 and https://programming.dev/post/3974080
Original survey link: https://forms.gle/7Bu3Tyi5fufmY8Vc8
Thanks for all the answers, here are the results for the survey in case you were wondering how you did!
Edit: People working in CS or a related field have a 9.59 avg score while the people that aren’t have a 9.61 avg.
People that have used AI image generators before got a 9.70 avg, while people that haven’t have a 9.39 avg score.
Edit 2: The data has slightly changed! Over 1,000 people have submitted results since posting this image, check the dataset to see live results. Be aware that many people saw the image and comments before submitting, so they’ve gotten spoiled on some results, which may be leading to a higher average recently: https://docs.google.com/spreadsheets/d/1MkuZG2MiGj-77PGkuCAM3Btb1_Lb4TFEx8tTZKiOoYI
So if the average is roughly 10/20, that’s about the same as responding randomly each time, does that mean humans are completely unable to distinguish AI images?
In theory, yes. In practice, not necessarily.
I found that the images were not very representative of typical AI art styles I’ve seen in the wild. So not only would that render preexisting learned queues incorrect, it could actually turn them into obstacles to guessing correctly pushing the score down lower than random guessing (especially if the images in this test are not randomly chosen, but are instead actively chosen to dissimulate typical AI images).
Maybe you didn’t recognize the AI images in the wild and assumed they were human made. It’s a survival bias; the bad AI pictures are easy to figure out, but we might be surrounded by them and would not even know.
Same as green screens in movies. It’s so prevalent we don’t see them, but we like to complain a lot about bad green screens. Every time you see a busy street there’s a 90+ % chance it’s a green screen. People just don’t recognize those.
One thing I’m not sure if it skews anything, but technically ai images are curated more than anything, you take a few prompts, throw it into a black box and spit out a couple, refine, throw it back in, and repeat. So I don’t know if its fair to say people are getting fooled by ai generated images rather than ai curated, which I feel like is an important distinction, these images were chosen because they look realistic
Well, it does say “AI Generated”, which is what they are.
All of the images in the survey were either generated by AI and then curated by humans, or they were generated by humans and then curated by humans.
I imagine that you could also train an AI to select which images to present to a group of test subjects. Then, you could do a survey that has AI generated images that were curated by an AI, and compare them to human generated images that were curated by an AI.
All of the images in the survey were either generated by AI and then curated by humans, or they were generated by humans and then curated by humans.
Unless they explained that to the participants, it defeats the point of the question.
When you ask if it’s “artist or AI”, you’re implying there was no artist input in the latter.
The question should have been “Did the artist use generative AI tools in this work or did they not”?
I think if you consider how people will use it in real life, where they would generate a bunch of images and then choose the one that looks best, this is a fair comparison. That being said, one advantage of this kind of survey is that it involves a lot of random, one-off images. Trying to create an entire gallery of images with a consistent style and no mistakes, or trying to generate something that follows a design spec is going to be much harder than generating a bunch of random images and asking whether or not they’re AI.
Technically you’re right but the thing about AI image generators is that they make it really easy to mass-produce results. Each one I used in the survey took me only a few minutes, if that. Some images like the cat ones came out great in the first try. If someone wants to curate AI images, it takes little effort.
But they were generated by AI. It’s a fair definition
I mean fair, I just think that kind of thing stretches the definition of “fooling people”
LLMs are never divorced from human interaction or curation. They are trained by people from the start, so personal curation seems like a weird caveat to get hung up on with this study. The AI is simply a tool that is being used by people to fool people.
To take it to another level on the artistic spectrum, you could get a talented artist to make pencil drawings to mimic oil paintings, then mix them in with actual oil paintings. Now ask a bunch of people which ones are the real oil paintings and record the results. The human interaction is what made the pencil look like an oil painting, but that doesn’t change the fact that the pencil generated drawings could fool people into thinking they were an oil painting.
AIs like the ones used in this study are artistic tools that require very little actual artistic talent to utilize, but just like any other artistic tool, they fundamentally need human interaction to operate.
But not all AI generated images can fool people the way this post suggests. In essence this study then has a huge selection bias, which just makes it unfit for drawing any kind of conclusion.
This is true. This is not a study, as I see it, it is just for fun.
I think getting a good image from the AI generators is akin to people putting in effort and refining their art rather than putting a bunch of shapes on the page and calling it done
I still don’t believe the avocado comic is one-shot AI-generated. Composited from multiple outputs, sure. But I have not once seen generative AI produce an image that includes properly rendered text like this.
Yeah I’m sceptical too, what tool and prompt was used to produce this?
Did you not check for a correlation between profession and accuracy of guesses?
I have. Disappointingly there isn’t much difference, the people working in CS have a 9.59 avg while the people that aren’t have a 9.61 avg.
There is a difference in people that have used AI gen before. People that have got a 9.70 avg, while people that haven’t have a 9.39 avg score. I’ll update the post to add this.
Sampling from Lemmy is going to severely skew the respondent population towards more technical people, even if their official profession is not technical.
If you do another one of these, I would like to see artist vs non-artist. If anything I feel like they would have the most experience with regular art, and thus most able to spot incongruency in AI art.
I got a 17/20, which is awesome!
I’m angry because I could’ve gotten an 18/20 if I’d paid attention to the thispersondoesnotexists’ glasses, which in hindsight, are clearly all messed up.
I did guess that one human-created image was made by AI, “The End of the Journey”. I guessed that way because the horses had unspecific legs and no tails. And also, the back door of the cart they were pulling also looked funky. The sky looked weirdly detailed near the top of the image, and suddenly less detailed near the middle. And it had birds at the very corner of the image, which was weird. I did notice the cart has a step-up stool thing attached to the door, which is something an AI likely wouldn’t include. But I was unsure of that. In the end, I chose wrong.
It seems the best strategy really is to look at the image and ask two questions:
- what intricate details of this image are weird or strange?
- does this image have ideas indicate thought was put into them?
About the second bullet point, it was immediately clear to me the strawberry cat thing was human-made, because the waffle cone it was sitting in was shaped like a fish. That’s not really something an AI would understand is clever.
One the tomato and avocado one, the avocado was missing an eyebrow. And one of the leaves of the stem of the tomato didn’t connect correctly to the rest. Plus their shadows were identical and did not match the shadows they would’ve made had a human drawn them. If a human did the shadows, it would either be 2 perfect simplified circles, or include the avocado’s arm. The AI included the feet but not the arm. It was odd.
The anime sword guy’s armor suddenly diverged in style when compared to the left and right of the sword. It’s especially apparent in his skirt and the shoulder pads.
The sketch of the girl sitting on the bench also had a mistake: one of the back legs of the bench didn’t make sense. Her shoes were also very indistinct.
I’ve not had a lot of practice staring at AI images, so this result is cool!
About the second bullet point, it was immediately clear to me the strawberry cat thing was human-made, because the waffle cone it was sitting in was shaped like a fish. That’s not really something an AI would understand is clever.
It’s a Taiyaki cone, something that already exists. Wouldn’t be too hard to get AI to replicate it, probably.
I personally thought the stuff hanging on the side was oddly placed and got fooled by it.
Yes, "the end of the journey’ got me too. So AI-looking. It’s this one:
https://www.deviantart.com/davidevart/art/The-End-Of-The-Journey-952592148
And this is why AI detector software is probably impossible.
Just about everything we make computers do is something we’re also capable of; slower, yes, and probably less accurately or with some other downside, but we can do it. We at least know how. We can’t program software or train neutral networks to do something that we have no idea how to do.
If this problem is ever solved, it’s probably going to require a whole new form of software engineering.
I don’t know… My computer can do crazy math like 13+64 and other impossible calculations like that.
Wow, I got a 12/20. I thought I would get less. I’m scared for the future of artists
One thing I’d be interested in is getting a self assessment from each person regarding how good they believe themselves to have been at picking out the fakes.
I already see online comments constantly claiming that they can “totally tell” when an image is AI or a comment was chatGPT, but I suspect that confirmation bias plays a big part than most people suspect in how much they trust a source (the classic “if I agree with it, it’s true, if I don’t, then it’s a bot/shill/idiot”)
I have it on very good authority from some very confident people that all ai art is garbage and easy to identify. So this is an excellent dataset to validate my priors.
Sketches are especially hard to tell apart because even humans put in extra lines and add embellishments here and there. I’m not surprised more than 70% of participants weren’t able to tell that one was generated.
What I learnt is that I’m bad at that task.
Is there any chance that the human artists experimented with AI and posted the result along with their usual work? :)
Oof. I got about 65% on the images I hadn’t seen in the post. I must be pretty close to being replaceable by an adversarial network.
I did 12 and I thought it was a pretty bad result. Apparently I was wrong!
I tried to answer by instinct, without careful examination, to make it more realistic
15 here and when I reviewed the answers I realized most were sheer luck.
The fact that it’s a clean gaussian shows that it’s mostly luck anyway.
Can people tell tell