The concept of ‘stability’ is a cornerstone of how we understand the world. Whether it’s a building, a relationship, or a business model, everything needs a certain degree of stability to function; without it, things collapse. As I’ve been playing around with AI and using it to generate more and more content, I keep discovering that finding stability is a key part of getting the results you’re looking for.
That particular revelation dawned on me while exploring a suite of AI tools developed by a company called Stability AI. At first I just thought the name was cool, but as I delved deeper into trying to get the results I wanted, the significance of that word in this context began to reveal itself.
A month ago, the videos I watched of AI animation were like a baby taking its first uncertain steps across the uncanny valley. The technology could generate intriguing characters and intricate worlds, and it could set these creations into motion. But the animations were a psychedelic, chaotic whirlwind of forms that morphed and changed from one frame to the next. The characters and environments were caught in a perpetual state of transformation, unable to maintain their shape over a sequence of frames.
This lack of consistency was problematic, especially in a medium where continuity and coherence are key. What AI animation needed was what has been referred to as “Temporal Stability” – the ability for characters and environments to remain consistent over time, while still allowing for movement and changes in location and perspective.
Within a short span of a month, the strides AI animation has made are nothing short of remarkable. The second generation of these tools already displays a far greater sense of place and time, presenting animations that, while still slightly uncanny, prove beyond doubt that AI is already making great progress towards being a valuable tool in the world of animation.
It was by observing these developments that I realized that the notion of stability extends beyond just AI animation. Whether I’m working on creating an AI-driven art piece with Midjourney, or using AI to help craft these articles, maintaining a consistent tone, voice, and context is paramount. In essence, I am applying a form of temporal stability to these contexts, as well.
Let’s consider my writing process, for example. I don’t let an AI chatbot directly write my articles. Instead, I collaborate with it, using it as a sounding board to expand and refine my ideas. This process often involves utilizing a common structure I’ve termed “Role Playing Prompts”. These statements outline a specific role for the AI, such as “you are an editor” or “act as an AI expert”. They provide a stable context for the AI’s responses, helping it produce output that aligns with the desired tone and scope of the article.
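To make that concrete: I do this in a chat window, but the same idea maps onto the system message in OpenAI’s chat API, which pins down a stable role for every response that follows. Here’s a minimal Python sketch; the role text and model name are illustrative placeholders, not my actual setup:

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

# The "role playing prompt" lives in the system message: it establishes a
# stable role and scope that every subsequent response is anchored to.
messages = [
    {"role": "system", "content": (
        "You are an editor for a technology blog. Critique drafts for tone "
        "and clarity, and suggest improvements rather than rewriting them."
    )},
    {"role": "user", "content": "Here's my draft opening paragraph: ..."},
]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any chat model works
    messages=messages,
)
print(response.choices[0].message.content)
```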
However useful these prompts are, achieving fine control over the AI’s responses still requires careful iteration. I often find myself using ChatGPT’s ability to go back to an earlier point in the conversation and edit the input there, shaping the AI’s output before any weirdness has crept in. This iterative process lets me control the ‘stability’ of the conversation, refining it until the AI’s contributions align with my original vision.
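The chat interface handles this rewinding for you, but the underlying idea is simple: the API is stateless, so ‘rewinding’ is just truncating the message history and editing the turn where things went wrong. A rough sketch, with placeholder content:

```python
from openai import OpenAI

client = OpenAI()

# The API is stateless: the "conversation" is just a list we keep locally.
history = [
    {"role": "system", "content": "You are an editor for a technology blog."},
    {"role": "user", "content": "Suggest three angles for a piece on AI stability."},
    {"role": "assistant", "content": "...the model's earlier reply..."},
    {"role": "user", "content": "Expand on the second angle."},
    {"role": "assistant", "content": "...a reply that drifted off-topic..."},
]

# Rewind: drop everything after the point where the weirdness crept in,
# then re-ask with a tighter, more constrained prompt.
history = history[:3]
history.append({
    "role": "user",
    "content": "Expand on the second angle, staying focused on animation.",
})

response = client.chat.completions.create(model="gpt-4o", messages=history)
print(response.choices[0].message.content)
```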
Similarly, in the realm of AI art, tools like Midjourney allow me to maintain consistency in the narrative and stylistic elements of the image while modifying other aspects to meet my artistic objectives. However, as in the writing process, achieving this balance is often a battle to understand and control the model’s underlying assumptions.
Midjourney lets you do this by iterating on an existing image. You can also add parameters to the prompt that weight particular elements of it. You can influence just about every aspect of the generation process, save for the final result.
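For example, Midjourney’s multi-prompt syntax uses ‘::’ to assign relative weights to parts of a prompt, and a fixed ‘--seed’ keeps the starting noise consistent between runs. The subject and numbers below are purely illustrative:

```
/imagine prompt: ancient stone city::2 morning fog::1 flocks of birds::0.5 --seed 1234
```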
For Stable Diffusion there is a popular control tool called “AUTOMATIC1111” (the Stable Diffusion web UI) that functions as a prompt control dashboard. You can still give it a description, but there are numerous dials and controls that let you more precisely decide which elements change from one generation to the next.
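If you’d rather script this than click through the web UI, the same dials are exposed in code by libraries like Hugging Face’s diffusers. A minimal img2img sketch, where the model, file names, and settings are assumptions for illustration:

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("previous_frame.png").convert("RGB")

# A fixed seed plus a low strength keeps most of the source image intact,
# so only the prompted elements shift between generations.
generator = torch.Generator("cuda").manual_seed(1234)
result = pipe(
    prompt="ancient stone city, morning fog",
    image=init_image,
    strength=0.35,        # how far the model is allowed to drift from the input
    guidance_scale=7.5,   # how strongly the prompt steers the result
    generator=generator,
).images[0]
result.save("next_frame.png")
```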
Moreover, the integration of “LoRA” (Low-Rank Adaptation) models allows you to directly influence the underlying model being used to generate the image.
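As a rough sketch of how that looks in diffusers: the LoRA file path below is a placeholder for whatever adapter you’ve trained or downloaded.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load a LoRA adapter trained on a particular style or character.
# The file path is a placeholder, not a real checkpoint.
pipe.load_lora_weights("path/to/my_style_lora.safetensors")

image = pipe("a portrait in the adapter's style", guidance_scale=7.5).images[0]
image.save("styled_portrait.png")
```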
While critics may argue that AI-generated art lacks the unique touch of a human artist, tools like AUTOMATIC1111 and LoRA are helping to bridge this gap. They enable artists to infuse AI-generated pieces with their distinct style and flair, restoring the element of individuality that many people fear generative models have obliterated.
Reflecting on my experiences with these tools, I realize that the quest for stability is a key part of mastering them. The skill lies in analyzing and iterating on the output, adjusting the input, and maintaining context until you get the desired results. Whatever the tool, stability serves as a useful anchor in the dynamic world of AI.
It’s ironic that a technology as volatile and rapidly evolving as AI requires stability at its core, but it’s these kinds of discoveries that make it endlessly intriguing to use.
Hopefully you’ll try applying this concept to your own work and let me know if it works for you.