Join our Discord Community
December 6, 2022

Visual Pie in the AI sky!


We are living in one of the most exciting times ever for an area of technology that is evolving faster than the Tuatara 🦎 *. Those with an eye on, or even a foot in, this area of computational research will be absorbed, astounded and occasionally bemused by the pace of change and its wider ramifications for creativity. It’s not gaming, web3 or the Metaverse. It’s AI visual generation that’s at the frontier of technology.

Artificial Intelligence is redefining and advancing visual creativity at an astonishing pace, and it’s such an exciting time to be alive!
“The Future” as generated by Midjourney

Of course, Artificial Intelligence is a very broad church that numbers many in its congregation, with generative art being one of them. There are many different ways that AI can be used to create and adapt artworks, from using text to generate images, images to generate videos, to creating 3D models from imagery and bringing them to life.

Let’s dive into some examples of tools that are operating at the cutting edge of the very sharp blade of visual Machine Learning based mania!

Generative Imagery

Using natural language to generate imagery with state-of-the-art AI is becoming more familiar in the visual arts. Whilst nascent, the field is progressing at considerable pace.

Midjourney

Midjourney is a text-to-image generation tool capable of producing wild output, with one of its strengths being its ability to generate fantastical sci-fi scenes.

“Time Travel” as generated by Midjourney

Stable Diffusion 2.0

Stable Diffusion 2.0 has just been released, building on the original version, and its results are increasingly impressive. It’s open source, so the advances can drive everyone forward!

An astronaut mowing the lawn by Stable Diffusion 2.0

DALL-E 2

DALL-E 2 is perhaps the most versatile and adaptable of the current text-to-image systems, as the underwater bears below aptly illustrate!

“Teddy bears working on new AI research underwater with 1990s technology” as generated by DALL-E 2

Imagen

Imagen is also wildly impressive. Step along to its site to see the kinds of imagery you can create with just some natural language input.

“A transparent sculpture of a duck made out of glass. The sculpture is in front of a painting of a landscape” as generated by Imagen

Generative Video

It’s a natural step from using text to generate static images, to using text to generate video content.

Make-A-Video

Meta AI’s recently launched Make-A-Video generates short videos from text and image input.

“A fluffy baby sloth with an orange knitted hat trying to figure out a laptop close up highly detailed studio lighting screen reflecting in its eye” as generated by Make-A-Video

Imagen Video

Google’s Imagen Video makes videos under 5 seconds long and can generate animated text.

“A panda taking a selfie” as generated by Imagen Video

Phenaki

Google has Phenaki under development, able to generate videos from more fully formed descriptions.

“Side view of an astronaut is walking through a puddle on mars The astronaut is dancing on mars The astronaut walks his dog on mars The astronaut and his dog watch fireworks” as generated by Phenaki

Generative 3D Objects

Let’s up the ante. It’s now possible to generate full 3D models from 2D images as well as textual descriptions.

GET3D

NVIDIA has GET3D on its books, an AI system which generates 3D models from static images.

3D models from imagery as generated by GET3D

Magic3D

NVIDIA has more 3D magic in the works with the aptly named Magic3D, which creates higher quality 3D meshes from text descriptions than competing approaches.

“A metal bunny sitting on top of a stack of broccoli” by Magic3D

https://deepimagination.cc/Magic3D/

DreamFusion

Establishing its presence in this space, Google has DreamFusion, an AI system which also generates 3D models from text descriptions.

“a DSLR photo of a squirrel wearing a kimono” by DreamFusion

Generative 3D Animations & Behaviours

The final piece of the puzzle is the ability for AI to bring static 3D models to life.

Move AI

Capturing human movement using AI and mapping it to 3D models is a movement gathering pace, led by startups such as Move.ai.

Mixamo

Mixamo uses Machine Learning to bring static biped models, such as humans in a T-pose, to life. It was bought by Adobe in 2015 and remains the go-to solution for quick and easy animation of human avatars today.

Anything World

Anything World has seven proprietary Machine Learning steps ready to bring any static 3D model to life. Using its next generation animation system, it can generate rigs and create animations for any 3D object at any time (even at run time) for games and other 3D experiences. It’s a bit like Mixamo on steroids! 💪🦊

So Who Will Win?

Given the pace of developments spanning organisations as diverse as startups, university research departments and multinational corporations, who really stands to win from this boom in AI-powered visual creativity? I would argue it’s the organisation that can significantly lower the barrier to creative output in one or more of the sectors above for a mainstream audience.

The Utopian Nirvana In Our Sights

Artificial Intelligence allows anyone to be an artist and significantly lowers the barrier to visual creativity.

Let’s imagine for a zeptosecond that you could do all of the AI-based creativity above with just one easy service. That’s a goal that’s firmly in our sights.

* https://www.livescience.com/2396-fastest-evolving-creature-living-dinosaur.html