Greg Robison

Impossible Hybrid Fruits: Exploring Latent Space as Creativity

“There is no doubt that creativity is the most important human resource of all. Without creativity, there would be no progress, and we would be forever repeating the same patterns.” -- Edward de Bono
6 March 2024

The last couple of years in AI have been dominated by Generative AI’s ability to create images (with diffusion models like DALL-E 3) or human-like text (with large language models (LLMs) like ChatGPT). Diffusion models’ ability to create almost anything from a simple text prompt has opened new creative avenues and brought “artistic” abilities to the masses. Similarly, LLMs like GPT-4 have taken the ability to comprehend and create human-like text to another level, creating nuanced chat partners. These two developments have us questioning AI’s ability to mimic human intelligence, creativity and innovation.


The Magic of Diffusion Models

Recently, diffusion models have emerged as an effective and efficient way to create synthetic images that can be both visually coherent and “creative”. These models are trained to reverse a process that adds random noise to an image. Guided by a text prompt like “a photo of two Coton-de-Tulear puppies surfing Mavericks”, they remove noise step by step, steering the image towards your instructions. The generated images can be highly detailed and visually interesting, and they can feel creative. They have crept their way into photography, art, design, and science (even with some interesting consequences).


two Coton-de-Tulear puppies surfing Mavericks
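The noising process described above can be made concrete with a toy sketch. The article does not say which formulation its models use, so this assumes the common DDPM-style setup: a linear noise schedule and a closed-form blend of signal and Gaussian noise. All function names and schedule values here are illustrative assumptions, and the "image" is just a short list of numbers. A real model would learn to predict the noise with a neural network conditioned on the text prompt; the sketch simply hands the true noise back to show what that prediction is used for.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear noise schedule.

    Assumed (not from the article): DDPM-style betas rising linearly
    from beta_start to beta_end. Requires num_steps >= 2.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process: blend the clean signal with fresh Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    noisy = [signal_scale * x + noise_scale * n for x, n in zip(x0, noise)]
    return noisy, noise


def remove_noise(noisy, predicted_noise, alpha_bar):
    """Invert the blend, given an estimate of the noise that was added.

    In a trained diffusion model the estimate comes from a network;
    here the caller supplies it.
    """
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(x - noise_scale * n) / signal_scale
            for x, n in zip(noisy, predicted_noise)]
```

Calling `add_noise` with a late-step value from `make_alpha_bars(1000)` leaves almost nothing of the original signal, while an early-step value barely changes it. That gradient from "nearly clean" to "nearly pure noise" is what the model is trained to walk backwards.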

Why do I put “creative” in quotes? We don’t fully agree on how creativity works in us, let alone in machines. Creativity usually means creating something new, and these models, like an artist with a blank canvas, can create images of things that don’t or can’t exist in real life. We can create fruit combinations like apple/kiwi or pineapple/watermelon hybrids that might only exist as a chewing gum flavor, but never a handheld fruit. But we now have the power to visualize them. Is this creative? Or is the human the creative one, while the AI merely optimizes a function based on user-submitted tokens? Either way, we can bring our imaginations to life no matter how mundane, creative, or delicious.


Pick the impossible fruit you want to taste

  • Apple-Kiwi

  • Banana-Corn

  • Pineapple-Watermelon

  • Eggplant-Avocado

Exploring Creativity with Large Language Models

LLMs like OpenAI’s ChatGPT have the ability to be a chat partner, writer, programmer, meal planner, joke-maker, even chess partner. These models are trained on huge datasets from books to large chunks of the internet (and now pictures and video), enabling them to grasp the complexities and subtleties of human languages. Their deep-learning architectures allow them to extract syntactic and semantic rules of our language to sound just like us. Like their diffusion-model cousins, LLMs too can “create” unique text that might be considered “creative”. No human has ever tasted a pineapple/watermelon hybrid, yet an LLM can still provide a lot of detail on what it might taste like:


“Imagine the initial taste to be a burst of sweet and tangy flavors, reminiscent of pineapple's tropical acidity, softened by the lush, juicy sweetness characteristic of watermelon. The texture might be an intriguing blend as well; the crispness of pineapple could give way to the succulent and watery crunch of watermelon, making each bite both satisfying and thirst-quenching.

The aroma of this hybrid fruit would likely be intoxicating, with the heady, exotic scent of pineapple mingling with the subtle, fresh fragrance of watermelon. As you bite into it, the complexity of flavors could unfold in layers—first the sharp, vibrant pineapple notes, followed by the cooling, mellow sweetness of watermelon, leaving a lingering, pleasantly sweet aftertaste with a hint of tropical tartness.

This hybrid could also carry a slight creaminess to its texture, blending the fibrous, meaty texture of pineapple with the smooth, almost buttery feel of watermelon's flesh. The balance between the acidity of the pineapple and the watermelon's watery sweetness would make it an ideal fruit for summer days, offering both refreshment and a rich tapestry of flavors that dance on the palate.”  -- GPT 

This type of creative potential can help inspire writers and storytellers to develop their imagination and stories even further.


The Latent Space of Hybrid Fruits

So how can diffusion models visualize and LLMs describe these impossible fruits? The apple/kiwi is certainly not part of their training datasets, but both apples and kiwis are present separately. After training, these concepts exist in what is called “latent space”, where they can be accessed to create potentially new images and text. Latent space is high-dimensional, and each dimension is a learned feature of the data. It’s like a compressed representation of all the data the model was trained on (the more dimensions and the bigger the model, the better it can represent the training data, all else being equal). For an LLM, the latent space represents the ways words, phrases, and sentences can be understood and related to each other, enabling the model to perform tasks like text generation, translation, programming, or answering questions with a nuanced understanding of language. Apples and kiwis exist separately but close to each other in latent space, grouped with other fruits. And in between the apple and kiwi concepts lie all manner of combinations of apple and kiwi in various strengths and potential meanings. The same is true for diffusion models:


AI generated image of apples and kiwis combining
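To make "in between the apple and kiwi concepts" more tangible, here is a minimal sketch of walking a straight line between two points in a latent space. The four-dimensional vectors are invented for illustration (real models use hundreds or thousands of learned dimensions), and straight-line interpolation is only one simple way to move between concepts; it is an assumption here, not a description of how any particular model blends them.

```python
import math

# Hypothetical 4-dimensional "embeddings", made up for this example.
APPLE = [0.9, 0.1, 0.8, 0.3]
KIWI = [0.2, 0.9, 0.7, 0.4]
BICYCLE = [0.0, 0.1, 0.05, 0.95]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def interpolate(a, b, t):
    """Point a fraction t of the way from a to b (t=0 gives a, t=1 gives b)."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def walk(a, b, steps):
    """Evenly spaced points from a to b, endpoints included (steps >= 1)."""
    return [interpolate(a, b, i / steps) for i in range(steps + 1)]


if __name__ == "__main__":
    # Each intermediate point is a different "strength" of apple versus kiwi.
    for point in walk(APPLE, KIWI, 4):
        print([round(v, 3) for v in point],
              "similarity to kiwi:", round(cosine_similarity(point, KIWI), 3))
```

With these toy numbers, apple sits closer to kiwi than to bicycle, mirroring the idea that fruits cluster together, and every point along the walk is a candidate "hybrid" that was never in the training data as such.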

The latent space between and connecting different concepts represents potential novelty and thus what might be considered sources of creativity. We can blend disparate concepts in novel ways with infinite potential. Just as hybrid fruits that represent unique combinations exist in latent space, there are unique areas to create innovative concepts and designs. By exploring latent spaces, we can create unique images, stories, sounds and videos.

“The concept of ‘latent space’ is important because its utility is at the core of ‘deep learning’ — learning the features of data and simplifying data representations for the purpose of finding patterns.” -- Ekin Tiu

Embracing latent space can also impact our use of language, where we can describe new concepts to each other with positions in latent space. Suppose we met someone new at a party and we wanted to describe them to someone else – ordinarily we would describe their distinctive features to try to conjure that image in someone else’s head. But what if we could describe them using the latent space of people we know? We could describe this new person as 30% Brad Pitt and 70% George Clooney, a face I’ve never seen before, but can now imagine vividly. Our language can move from discrete nouns and adjectives to continuous combinations of concepts which can better reflect the nature of our reality.


AI merging of the faces of Brad Pitt and George Clooney

(BP = the share of Bradpittness and GC = the share of Georgeclooneyness, representing the dimensions of the latent space of Cloon-ittism. If we used even more dimensions, we could describe someone more exactly, for example by throwing 25% of Jonahhillism into the mix. Imagined by Stable Diffusion.)
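The percentage recipe can be read as a weighted average of points in latent space. The sketch below is one hedged way to write that down: the three-dimensional "face" vectors are placeholders, not real embeddings, and the helper name `blend` is hypothetical. Because adding 25% of a third person to a 30/70 mix pushes the total past 100%, the sketch rescales the shares so they sum to one; that normalization is an assumption of this example, not something the caption specifies.

```python
def blend(vectors, shares):
    """Weighted average of named vectors, with shares rescaled to sum to 1.

    vectors: dict mapping a name to a list of floats (all the same length).
    shares:  dict mapping a subset of those names to non-negative weights.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    dim = len(next(iter(vectors.values())))
    mixed = [0.0] * dim
    for name, share in shares.items():
        weight = share / total
        for i, value in enumerate(vectors[name]):
            mixed[i] += weight * value
    return mixed


# Placeholder vectors: each person is simply their own axis.
FACES = {
    "brad_pitt": [1.0, 0.0, 0.0],
    "george_clooney": [0.0, 1.0, 0.0],
    "jonah_hill": [0.0, 0.0, 1.0],
}

two_way = blend(FACES, {"brad_pitt": 0.30, "george_clooney": 0.70})
three_way = blend(FACES, {"brad_pitt": 0.30,
                          "george_clooney": 0.70,
                          "jonah_hill": 0.25})
```

In a real system the blended vector would then be handed to a decoder or image generator; here it only shows that "30% this, 70% that" is a position, not a new category.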


Implications for Creativity and Innovation

Generative AI has enabled the creation of content that can match the complexity and subtlety of human art and even surprise us with flashes of novelty that feel genuinely new and creative. Diffusion models allow us to blend textures, colors, and forms in ways we may not have imagined and LLMs can craft pieces of writing that elicit genuine emotion from the reader. These benefits can extend to areas like product design by combining features in novel and unprecedented ways, much like creating bespoke fruit. We can push boundaries from reality to the novel via their connecting latent space.


But we as humans need to partner with Generative AI to open doors to these new forms of creativity and expression and craft a new future of creative work. The conceptual space between apples and kiwis, pineapples and watermelons, is like the future of human and artificial creativity through collaboration, a space where we can explore and discover new forms of beauty and expression. We can embrace the limitless potential of latent space for creativity.


It takes humans and AI models working together, usually iteratively, in the creative workflow. This combination makes us reevaluate ideas such as innovation, authorship, and inspiration in new ways. Artists (and non-artists) have tools that can generate ideas at a scale and speed like we’ve never seen before. Do AI-generated outputs dilute the value of human creativity, or do they enrich it by offering new perspectives and possibilities? And as these technologies continue to evolve, our dialog about artistic expression must also evolve. We may see an increase in innovation and diversity of art or an averaging of artistic expression. The democratization of creative expression across industries is beginning, lowering the barrier to entry to realize new ideas, but also raising the risk of misinformation.


the latent space between concepts in a large language model

Conclusion

Impossible hybrid fruits landed us in the concept of latent space and questioning the creative potential of today’s AI models. This conceptual space represents potential new forms of expression and innovation – potential that is already rippling throughout many industries. More than ever, we question creativity and our unique place in it as humans. Just as we have throughout history, we are merely using new tools to bring our human visions to life. But we can also continue to work with and learn from them in a way that augments human creativity rather than stifles it. AI will continue to evolve and our complicated relationship with it will continue to evolve too. While AI-driven creativity will open new doors and provide tangible visions, it also potentially transforms our language. It shouldn’t diminish our creativity; it should augment it by allowing us to explore the infinite possibilities of latent space.

