| Oct 29, 2023 | Biraj sarmah |

How Do DALL-E 2 and Other AI Generators Work?

Once upon a time, a text-to-image AI art generator was tasked with creating a humanoid robot reading the newspaper, sitting on a yellow bench. Actually, there were lots of different kinds of robots who liked to sit and read the paper. But only one of these robots is actually real. Well, a real human, in a real robot suit.

How DALL-E 2 and Other AI Art Generators Create Images From Text

The rest of these robots were created by artificial intelligence using text-to-image generators. But how do they work? With new tools like OpenAI’s DALL-E 2 and Stability AI’s DreamStudio, you can type in a phrase and out comes an image. Nothing in these generated images you’re looking at right now is copied and pasted from Google Images or some photo agency.

But how does the quality compare to real photos? We came up with a challenge to test the limits of AI photo generation, and to see if it can compete with humans. Or, in our case, a real photographer. And a real man named Peter Kokos. “Oh, am I lined up?”

With his real, elaborate, handmade robot suit. Which takes an hour to put on.

What does this mean for telling real from fake on the internet? I’m gonna explain it all. With the help of… this guy. I, I don’t think he’s actually gonna be that much help. Let’s start with how this works. Both DALL-E 2 and DreamStudio provide a simple text box. You type whatever comes to mind. Say, “a photo of a robot in the forest with a balloon.”

Then both systems take a few seconds to generate some pretty accurate and funny images. You might think, oh, the system is just looking at images of robots and balloons and putting them together. But, no. In the background, your text is sent to OpenAI or Stability AI, where their powerful artificial intelligence has learned to make sense of that text and translate it into completely original images.
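To make “your text is sent to OpenAI” a little more concrete, here is a minimal sketch of what packaging a prompt for a hosted image generator could look like. It assumes OpenAI’s public image-generation REST endpoint and its `prompt`, `n`, and `size` fields; the function name `build_image_request` and the placeholder API key are made up for illustration, and the request is only built, never sent. The text box in the web tools may well talk to the service differently.

```python
import json
import urllib.request

# Assumption: OpenAI's public image-generation endpoint. Check the current
# API reference before relying on the URL or the field names below.
OPENAI_IMAGES_URL = "https://api.openai.com/v1/images/generations"


def build_image_request(prompt, api_key, n=1, size="1024x1024"):
    """Package a text prompt as an HTTP POST request (hypothetical helper).

    The request object is returned without being sent, so this runs offline.
    """
    payload = {"prompt": prompt, "n": n, "size": size}
    return urllib.request.Request(
        OPENAI_IMAGES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_image_request(
    "a photo of a robot in the forest with a balloon",
    api_key="YOUR_API_KEY",  # placeholder, not a real key
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

All of the actual image-making happens on the company’s servers after a request like this arrives; the typed phrase is the only input.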

How did it learn what is what? By looking at billions of labeled images. Think of it like flashcards. The AI sees them all and starts to learn that many sci-fi robots have eyes, and boxy heads, and a square stomach. Same with balloons. It learned that most balloons are this shape, and have strings on them, and even have this little tie thing.
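The flashcard idea can be sketched in a few lines of code. This is a toy illustration only: real generators learn statistical patterns from pixels inside a neural network, not from hand-written trait lists, and the flashcard data, the function `learn_typical_traits`, and the 60% threshold are all invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical "flashcards": a label plus the traits visible in one image.
flashcards = [
    ("robot", {"eyes", "boxy head", "square stomach"}),
    ("robot", {"eyes", "boxy head", "antenna"}),
    ("robot", {"eyes", "square stomach"}),
    ("balloon", {"round shape", "string", "tie"}),
    ("balloon", {"round shape", "string"}),
]


def learn_typical_traits(cards, min_share=0.6):
    """Return, per label, the traits seen in at least min_share of its cards."""
    label_counts = Counter()
    trait_counts = defaultdict(Counter)
    for label, traits in cards:
        label_counts[label] += 1
        trait_counts[label].update(traits)
    return {
        label: sorted(
            trait
            for trait, count in counts.items()
            if count / label_counts[label] >= min_share
        )
        for label, counts in trait_counts.items()
    }


typical = learn_typical_traits(flashcards)
for label, traits in typical.items():
    print(label, "->", traits)
```

With these made-up cards, traits that show up on most robot cards (eyes, boxy head, square stomach) are kept, while a one-off trait like the antenna is dropped. Seeing more examples is what lets the common patterns stand out.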

Through deep learning, it not only knows that that’s a robot, and that’s a balloon, but it knows the relationship between two distinct objects. So it automatically puts the balloon in the robot’s hand. Just like I knew to put the balloon in his hand on this shoot. Or at least DALL-E knew to put the balloon in a hand.

DreamStudio was pretty confused by the whole robot balloon thing. Creating basic images is just the start of what you can do with these tools. Say I wanted to take a Polaroid of this real robot sitting by the pool, drinking a beer. Smile! I’d need to get an instant camera. Perfect. And a beer. Wall Street Journal and a beer.


Yeah, this is going to be the keeper. Or I could just type in “a humanoid robot sitting by the pool, drinking beer, taken with a Polaroid camera.” The systems understand different art and photography styles. So you can do things like “fixing the office printer in the style of a medieval painting.” Or “an Andy Warhol style painting of a bunny rabbit wearing sunglasses.”

Realistic photographs are definitely one of the hardest challenges for these systems. Take “a high-quality photograph of a robot walking a Cavapoo.” Go, Browser! Come on! Our photos turned out great. DALL-E, on the other hand, had an idea of what a Cavapoo looked like, but the rest was, uh, yeah. And I have no idea what’s going on here with DreamStudio’s creation.

Now, here is one of the coolest tricks. Peter, can you put your hand in like this for me? Okay, then I can take this shot, upload it to DALL-E, and add in whatever I’d like in this area, with a feature called inpainting. And then you can export all those stills and make something like this. So, in real life, or even in an AI-generated image, these scenes are pretty fun.

A robot doing some everyday things. But what if you wanted to make something not as fun? Maybe something gruesome, or based on a real-life event, or a real-life public figure? Say, “Joe Biden eating a cupcake.” OpenAI’s DALL-E will immediately block prompts like that, as the company prevents prompts containing the names of public figures and various hateful or harmful words.
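As a rough illustration of that kind of restriction, here is a toy blocklist check. It is not OpenAI’s actual filter, whose details aren’t described here; the blocklist contents and the function `is_prompt_allowed` are hypothetical, and a real system would need far more than simple phrase matching.

```python
import re

# Hypothetical blocklist: a public figure's name plus a stand-in for
# whatever hateful or harmful terms a real service might screen for.
BLOCKED_TERMS = {"joe biden", "example harmful phrase"}


def is_prompt_allowed(prompt, blocked_terms=BLOCKED_TERMS):
    """Return False if the prompt contains any blocked term as whole words."""
    # Lowercase and collapse runs of whitespace so "Joe   BIDEN" still matches.
    normalized = re.sub(r"\s+", " ", prompt.lower()).strip()
    return not any(
        re.search(r"\b" + re.escape(term) + r"\b", normalized)
        for term in blocked_terms
    )


print(is_prompt_allowed("Joe Biden eating a cupcake"))
print(is_prompt_allowed("a humanoid robot eating a cupcake"))
```

A check like this runs before any image is generated, which is why a restricted prompt is rejected right away instead of producing a result.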

Prompts with other names do work. Like, this is what it thinks of “Joanna Stern in space.” When I put that Joe Biden prompt into DreamStudio, it generated this image. Stability AI’s founder, Emad Mostaque, said he saw no reason to restrict the ability to generate images of public figures. Now say you try “a photograph of a terrorist attack.”

DALL-E didn’t generate any really believable images, but it did process the prompt. DreamStudio generated clearer images with guns and fire. If you look closely, you can see issues with the quality. But let’s face it, people don’t always look so closely on social media. This all leads to the bigger tech question.

How are we going to be able to tell the difference between the real, the fake, and the AI-generated images on the internet? OpenAI’s policies encourage users to indicate that content is AI-generated. It also places this watermark in the bottom right of every image, but it can easily be cropped out.

Stability AI doesn’t require any crediting. Right now, another way to tell the real from the fake is quality. It depends on what you need. A realistic photo? Well, you’re still probably going to need a photographer and props like we had today. I don’t need any help. Don’t worry. But quick, compelling graphics and illustrations for a presentation, website, or more?

This would do the trick. And Microsoft is even incorporating DALL-E into its new Designer app and Bing Image Creator website. Bottom line, AI art tools are getting real good, real fast. But your AI-generated robots… won’t be able to explain the art behind their creation.

