Imagine describing a scene to a piece of AI software and, seconds later, seeing it render exactly what you were picturing. That dream is now a reality, thanks to the researchers and engineers behind DALL-E 2.
Its predecessor, the original DALL-E, used a 12-billion-parameter version of the GPT-3 Transformer model; DALL-E 2 instead pairs OpenAI's CLIP model with a diffusion-based image decoder. Either way, the system interprets natural language inputs (such as “four oranges shaped like bananas on top of a pyramid made out of candy floss” or “an isometric view of a zebra wearing a three-piece suit”) and generates corresponding images. It can create images of realistic objects (“a stained-glass window with an image of a purple orangutan eating an ice cream”) as well as objects that do not exist in reality (“a cube with the texture of a hedgehog”). Its name is a portmanteau of WALL-E and Salvador Dalí.
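For those with API access, generating an image programmatically takes only a few lines. Below is a minimal sketch assuming the official `openai` Python package and OpenAI's Images API; the model name, prompt and size parameters are illustrative assumptions, not details taken from this article.

```python
# Minimal sketch: generating an image from a text prompt via OpenAI's
# Images API. Assumes the `openai` package is installed and the
# OPENAI_API_KEY environment variable is set; the prompt and size
# below are illustrative choices, not from the article.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-2",  # assumed model identifier for DALL-E 2
    prompt="an isometric view of a zebra wearing a three-piece suit",
    n=1,               # number of images to generate
    size="1024x1024",  # one of the supported square resolutions
)

# Each result carries a URL pointing at the generated image
print(response.data[0].url)
```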
You too can try out this amazing piece of AI engineering from OpenAI, but beware: the waiting list is huge. The possibilities for DALL-E are vast, and one can imagine a future in which many artists, art directors, visualisers and creative directors find themselves out of a job. Executives will simply describe what they want…et voilà.
The quality of the images, complete with convincing shadows and reflections, and the speed at which they are rendered are both outstanding. DALL-E also interprets the user's words, however obscure they may be, drawing on the vast range of concepts encoded in its neural network.