DALL·E 2 - Making Anything From Text With AI

By Owain Brennan

DALL·E 2 is a new AI system developed by OpenAI that can generate realistic images and art from natural-language text descriptions. At SeerBI, we were delighted to receive early research access from OpenAI, and we have been exploring the system over the past few months. From research and machine learning to a bit of fun, DALL·E 2 has proven to be an exciting and versatile tool.

Given a simple text input, the system encodes the description with a neural network and then uses a diffusion model to generate images from it. It currently generates four images at a time, each a variation on the same input.

The system works best when the text input is specific, for example when it names an art style such as “oil painting,” “digital art,” or “3D render.” Even the featured image for this article was generated using the AI tool, with the phrase “Teesside in the year 3000 oil painting.”
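As a rough illustration of the prompt pattern described here, the sketch below joins a subject with an art style and bundles the result with generation settings. The helper names `build_prompt` and `build_request` are hypothetical, and the `n` and `size` fields are assumptions modelled on the four-image, 1024 x 1024 behaviour described in this article; nothing here calls a real service.

```python
from typing import Optional

# Style hints mentioned in the article; any free-text style would work too.
ART_STYLES = ("oil painting", "digital art", "3D render")


def build_prompt(subject: str, style: Optional[str] = None) -> str:
    """Join a subject and an optional art style into one text prompt."""
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    if style is None:
        return subject
    return f"{subject} {style.strip()}"


def build_request(prompt: str, n: int = 4, size: str = "1024x1024") -> dict:
    """Hypothetical helper: bundle a prompt with generation settings.

    The field names are assumptions for illustration only; this function
    builds a plain dictionary and does not contact any service.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return {"prompt": prompt, "n": n, "size": size}


request = build_request(build_prompt("Teesside in the year 3000", ART_STYLES[0]))
print(request)
```

Keeping the subject and the style separate makes it easy to try the same idea in several styles and compare the results.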

The model builds on CLIP, OpenAI’s earlier neural network that learns to match images with natural-language descriptions. A prior first maps the text to a CLIP image embedding, and a decoder then generates an image conditioned on that embedding using a diffusion model. Two upsampler models then enhance the result, taking it from 64 x 64 to 256 x 256 and finally to 1024 x 1024 px. During training, the low-resolution images fed to the upsamplers are slightly corrupted, with Gaussian blur in the first stage, which makes the upsamplers more robust to imperfect inputs.
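To make the role of CLIP a little more concrete, here is a toy sketch of how a CLIP-style model can score captions against an image: both are represented as vectors in a shared space and compared by cosine similarity. The three-dimensional vectors and the `best_caption` helper are made up for illustration; a real CLIP model produces much larger embeddings from trained encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def best_caption(image_embedding, caption_embeddings):
    """Return the caption whose embedding is most similar to the image's."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine_similarity(
            image_embedding, caption_embeddings[caption]
        ),
    )


# Made-up embeddings, for illustration only.
image_embedding = [0.9, 0.1, 0.2]
caption_embeddings = {
    "an oil painting of a city": [0.8, 0.2, 0.1],
    "pixel art of a raccoon": [0.1, 0.9, 0.3],
}

print(best_caption(image_embedding, caption_embeddings))
```

Because text and images live in the same embedding space, an image generator can be conditioned on an embedding that already captures what the text describes.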

The use cases for the DALL·E 2 model are expansive, ranging from the immediate generation of social media images for marketing to reducing the need for some graphic design work. The model can also create photo-realistic images for use in social media posts, such as this generation of “an Italian vista with a galaxy clear in the sky.”

Another use case is prototyping designs or scenes for movies, video games, or other art projects. This eliminates the need for lengthy design tasks, allowing prototyping to happen much faster.

Here is an example of an idea for a video game: “pixel art of a raccoon in a street dressed as a superhero fighting a squirrel,” generated by the AI:

Illustrations for books and the visualisation of concepts are another impressive use case of the AI. For example, a writer creating a children’s book could input their writing into the model along with a selected art style to generate illustrations. This can save time and money, or the output can serve as a prototype to take to a designer for full development.

Here is a line taken from the book Where The Wild Things Are: “The wild things roared their terrible roars and gnashed their terrible teeth and rolled their terrible eyes and showed their terrible claws,” and here is the line visualised as an image generated by the AI:
