AI-generated image, prompt: Weather forecast --ar 16:9 --v 3

The case for AI hallucination

Louis Charron
Published in UX Collective
5 min read · Sep 15, 2023


AI-generated images are getting more realistic. From the first version of DALL-E in 2021 to Midjourney’s latest model, the trajectory is clear. Realistic images might be more pleasing to viewers and more impressive for the users generating them, but from a creative point of view, AI images are becoming more similar to one another and less surprising. When did they become dull? How did the mystery and the poetry of the early AI images disappear?

Over the past months, AI tool makers have been working hard to limit AI hallucinations. For large language models (such as ChatGPT), fighting hallucinations means making sure the chatbot won’t cite fake science publications or judicial decisions. For image generation, it means avoiding dreamlike and uncanny images. As we put more trust in these tools, this makes a lot of sense, especially in factual uses: when writing an article or generating stock photography, you would not want to let the AI wander too much. But in creative use, when images and words are meant to open up possibilities and you are looking to produce something new, these tools feel less useful and harder to use. Did the poetry disappear with the hallucinations?

Psychedelic image generated with Google DeepDream
Google DeepDream, What was once the view towards Chase, British Columbia, James Temperton, Wired

New pixels

The term AI hallucination was first used in 2000 in the field of computer vision: it described creating new pixels in surveillance camera images to increase their resolution. The algorithm learned from the original pixels to generate new ones that seemed believable.

Fifteen years later, Google launched DeepDream. Also built on an image recognition algorithm, it reverses the function: instead of looking for faces and patterns, DeepDream adds them to any image, creating uncanny, psychedelic pictures. With DeepDream the word hallucination takes on its full meaning: both as the machine’s process of creating new pixels and as the experience of seeing something that does not exist.

Initially, the term hallucination carried a positive meaning, seen as something to take advantage of. But recently, the meaning of the word has changed. Today, Wikipedia defines it as “a confident response by an AI that does not seem to be justified by its training data”. Are hallucinations just errors?

Why Midjourney’s images look like Midjourney’s images

To get a better sense of how AI images have evolved, I ran a small series of empirical tests with Midjourney. I started with prompts I had used for a project last year. These prompts are deliberately vague and abstract; the goal was to produce inspiring images at the early stage of a creative process. Through Midjourney’s version parameter I was able to test the prompts across different versions of the software, from v1 to the most recent v5.2 (the parameter syntax is sketched below the images). Here is what it looks like for each prompt.

AI generated image with the prompt: Weather forecast
Prompt: Weather Forecast
AI generated images with the prompt: Renovating old house with biomaterials
Prompt: Renovating old house with biomaterials
AI generated images with the prompt: Bioregions and micro climate
Prompt: Bioregions and micro climate
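For reference, each series was generated by pinning the same prompt to a given model version with Midjourney’s version flag. As a rough sketch (the flag is Midjourney’s, the exact versions listed here are illustrative), the first series came from commands along these lines:

/imagine prompt: Weather forecast --v 1
/imagine prompt: Weather forecast --v 3
/imagine prompt: Weather forecast --v 4
/imagine prompt: Weather forecast --v 5.2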

Two shifts are noticeable across the three series. The first is between versions 3 and 4, where most of the poetry of the images seems to disappear. The weather is now shown through a giant app screen, and the landscape adopts a cute 3D illustration style. Version 4 images are technically good, but the abstraction and the weirdness are gone, leaving no space for interpretation. The second shift is between versions 4 and 5.2, where the weirdness of the past versions comes back, but through photorealism. The house is once again made of strange organic materials, but now it is rendered photorealistically. Looking at them, I’m left with a feeling: haven’t I seen these images before?

Version 5.2 images feel familiar (especially the one with the woman in red and the one with the dramatic landscape) because Midjourney’s images are realistic on both the visual and the conceptual level. To avoid generating weird images, the AI tries to be consistent at all costs: everything has to make sense in the image. Avoiding uncanny images thus turned into generating photorealistic images. I see photorealism as a symptom of the developers’ control over the AI.

Midjourney’s images are becoming soulless AI-generated stock photography: familiar images that leave no room for interpretation, no blanks to fill with our imagination, and very little space for poetry.

Let the AI dream

So, here’s the question that got this note started: if AI image generators can produce anything with their giant datasets and their powerful algorithms, why should they produce photorealistic images by default? Midjourney and DALL-E are more than just cameras that can take any photograph; they are machines producing a new visual material, a material that could be enriched by hallucinations.

For the tools’ developers, treating AI hallucination as a parameter might feel like a loss of control. And that’s also why AI hallucinations are fascinating to me: they show us images that are beyond the control of their makers. Like glitches, hallucinations show us the machine from the inside.

To regain a bit of strangeness and poetry, I see several approaches. First, we could go beyond the default images by using more complex prompts. Midjourney’s new “weirdness” parameter could be a step in that direction: the images it generates usually look more unexpected, but still photorealistic. Here are images from the same prompts with the weirdness parameter at different levels (the flag syntax is sketched after the examples).

Weird = 100, Weird = 500, Weird = 1000
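For the record, here is a rough sketch of what such a prompt looks like, assuming Midjourney’s --weird flag (which, as far as I know, accepts values from 0 to 3000). The values match the caption above; the prompt is one of the three used earlier, shown only as an illustration:

/imagine prompt: Weather forecast --weird 100 --v 5.2
/imagine prompt: Weather forecast --weird 500 --v 5.2
/imagine prompt: Weather forecast --weird 1000 --v 5.2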

We could also imagine new AI tools that would go beyond the default photorealistic aesthetic. Tools that would let us play with hallucinations through a better user interface than the limited prompts. Tools that would not be afraid of producing images that question their generation process and that question us. Tools that could help us generate new pixels and develop new visual languages.


About me

I’m a designer specializing in science communication. I help scientists, research labs, and companies communicate complex ideas to broad audiences. You can see more here: https://www.louischarron.io

Thanks for reading! These notes are a way to share things that matter to me. Feel free to comment; I’ll be happy to read your thoughts!
