Visual Pie In The AI Sky
We are living in one of the most exciting times ever for an area of technology that is evolving faster than the Tuatara 🦎 *. Those with an eye on, or even a foot in, this area of computational research will be absorbed, astounded and occasionally bemused by the pace of change and the wider ramifications for creativity. It’s not gaming, web3 or the Metaverse. It’s AI visual generation that’s firmly at the frontier of technology.
Artificial Intelligence is redefining and advancing visual creativity at an astonishing pace, and it’s such an exciting time to be alive!
Of course, Artificial Intelligence is a very broad church that numbers many in its congregation, with generative art being one of them. There are many different ways that AI can be used to create and adapt artworks, from using text to generate images and images to generate videos, to creating 3D models from imagery and bringing them to life.
Let’s dive into some examples of tools that are operating at the cutting edge of the very sharp blade of visual Machine Learning-based mania!
Using natural language to generate imagery with state-of-the-art AI implementations is becoming more familiar in the visual arts. Whilst nascent, the field is progressing at a considerable pace.
Midjourney is a text-to-image generating tool capable of producing wild output, with one of its strengths being its ability to generate fantastical sci-fi scenes.
Stable Diffusion 2.0
Stable Diffusion 2.0 has just been released, building on the original open source version, and its results are increasingly incredible. It’s open source too, so the advances can drive everyone forward!
DALL-E 2 is perhaps the most versatile and adaptable of the current text-to-image tools, as the underwater bears below aptly illustrate!
Imagen is also wildly impressive. Step along to its site to see the kinds of imagery you can create with just some natural language input.
It’s a natural step from using text to generate static images, to using text to generate video content.
Meta AI’s recently launched Make-A-Video generates short videos from text and image input.
Google’s Imagen Video makes videos around five seconds long and can generate animated text.
Google also has Phenaki under development, a model able to generate videos from more fully formed descriptions.
Generative 3D Objects
Let’s up the ante. It’s now possible to generate full 3D models from 2D images as well as textual descriptions.
NVIDIA has GET3D on its books, an AI system which generates 3D models from static images.
NVIDIA has more 3D magic in the works with the aptly named Magic3D, which creates higher quality 3D meshes from text descriptions than its competitors.
Establishing its presence in this space, Google has DreamFusion, an AI system which also generates 3D models from text descriptions.
Generative 3D Animations & Behaviours
The final piece of the puzzle is the ability for AI to bring static 3D models to life.
The ability to capture human movement using AI and map it to 3D models is a movement gathering pace, led by startups such as Move.ai.
Mixamo brings static biped models, like humans in a T-pose, to life using Machine Learning. It was bought by Adobe in 2015 and remains the go-to solution for quick and easy animation of human avatars today.
Anything World has seven proprietary Machine Learning steps ready to bring any static 3D model to life. Using its next generation animation system, it can generate rigs and create animations for any 3D object at any time (even at runtime) for games and other 3D experiences. It’s a bit like Mixamo on steroids! 💪🦊
So Who Will Win?
Given the pace of developments spanning organisations as diverse as startups, university research departments and multinational corporations, who really stands to win from this boom in AI-powered visual creativity? I would argue it’s the organisation that can significantly lower the barrier to creative output in one or more of the sectors above for a mainstream audience.
The Utopian Nirvana In Our Sights
Artificial Intelligence allows anyone to be an artist and significantly lowers the barrier to visual creativity.
Imagine for a zeptosecond that you could do all of the above AI-based creativity with just one easy service. That’s a goal that’s firmly in our sights.