On what I’ve been up to (Halluzinator)
I know, I know, it has been far too long since I last wrote something. I can only defend myself by saying that I have at least been very busy all this time. What I've been doing is the subject of this post, as well as what will follow from it.
When I encountered Ryan Murdock's (aka Advadnoun) Aleph notebooks, I began right away to make animations of how the images form, like developing photographs, from an inchoate brown to mysteriously glowing spectacles of weirdness. A little later, other members of the Patreon group started to come up with all kinds of amazing improvements. One of them was using an initial starting image instead of the brown square and then applying all kinds of graphical augmentations.
I realized that by using the starting image technique, one could have a new kind of “text to hallucination” notebook, which I called Halluzinator. After trying it out, and witnessing the beautiful subtle morphs, I began to have even more ambitious plans. I imagined a kind of movie factory in a box where you feed in a series of text inputs and out comes a short film, complete with soundtrack and all.
That is what I've been working on with nearly all my free (and sometimes non-free) time. During this period I also finally got access (after 4-5 months of waiting) to OpenAI's API, which allows me to interact with the famous GPT-3, so it was natural to use that for producing the text inputs. So far I have generated three short films that you can see on my YouTube playlist:
These are almost purely AI-generated movies. My own part has been merely to provide the initial idea, like "Poem about diamonds", and then let GPT-3 run with it. Halluzinator then hallucinates imagery based on the lines I get from GPT-3, and afterwards the same lines are turned into music using the Magenta library with the marvelous method and code developed by Robert A. Gonsalves, which he explains here.
Lastly, the GPT-3-generated lines are synthesized into speech using Google TTS, and it only remains for me to put it all together in a video editor. It is hardly an exaggeration to say that I have my own "film studio" where all the actors, artists and employees are robots!
My dream was to automate the production itself as well, so that the movie would be edited entirely programmatically inside Halluzinator. Although the idea is kind of cool, I dropped it, and instead focused on implementing some graphical enhancements, such as simple camera movements (rotation and zoom). And when it was pointed out to me that with CLIP you can add and multiply the encoded text inputs, I made it so that at all times two consecutive prompts are blending into each other, producing buttery smooth visual morphing!
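The blending trick can be sketched roughly like this, assuming the two prompts have already been passed through CLIP's text encoder and come back as unit-length vectors. The names `blend_prompts`, `embed_a`, `embed_b` and the weight `t` are mine, not Halluzinator's; this is just the general shape of the idea, not the actual notebook code:

```python
import numpy as np

def blend_prompts(embed_a: np.ndarray, embed_b: np.ndarray, t: float) -> np.ndarray:
    """Linearly interpolate between two CLIP text embeddings.

    t = 0.0 returns embed_a, t = 1.0 returns embed_b; values in between
    give the intermediate targets that produce the smooth visual morph.
    The result is re-normalized because CLIP compares images against
    text embeddings by cosine similarity.
    """
    mixed = (1.0 - t) * embed_a + t * embed_b
    return mixed / np.linalg.norm(mixed)

# Toy stand-ins for two encoded prompts (real CLIP embeddings are 512-d):
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
halfway = blend_prompts(a, b, 0.5)  # unit vector pointing midway between the two
```

Ramping `t` from 0 to 1 over a stretch of frames, and optimizing the image against each intermediate target, is what makes one scene dissolve into the next instead of cutting abruptly.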
I am very happy and proud to see that several other people in Advadnoun’s Patreon group have now started to experiment with Halluzinator, combining it with other tools and production pipelines, and have come up with some remarkable art that exceeds my own imagination. Here are some examples:
Glenn Marshall, who has a real knack for conjuring up truly breathtaking, sharp and exquisite imagery.
Bokar N’Diaye, who, I feel, belongs to the select few who are truly beginning to grasp the poetical (and dare I say, spiritual) implications of this emerging art form.
and lastly but absolutely not leastly, Joshua Holmes, whose creativity and indefatigable inventiveness in coding has influenced Halluzinator’s graphics to a very large degree. In other words I pretty much stole everything he came up with 🙂 and in return, he has graciously accepted some of my meager modifications.
Now, if you have read this far, you will probably ask: where is the link? Where can I see this Halluzinator that you're hyping and huffing about? And I have to answer, with a semi-heavy heart, that I am not at liberty to make it publicly accessible. Halluzinator is based on Advadnoun's notebooks, which are restricted to his Patreon group, so the only way to get it is to sign up. Which I wholeheartedly endorse if you have any interest at all in cutting-edge AI art. My Halluzinator is merely one of an ever-growing collection of wonderful CLIP-based notebooks available there, and every week brings new ones, all filled with new innovations and tricks.
By immersing myself in this addictive new way of fusing art and science, I have learnt something. From the stage of an eager beginner who thinks, hey, this is easy, I've now progressed to the next level of enlightenment, where I realize that I don't know jack shit. Only a short time ago I thought I'd simply transfer the mechanics of Halluzinator to some of the earlier releases of Aleph that are publicly available – but the situation is both more complex and, in another sense, simpler than I thought.
It is more complex in the sense that, despite the name, the earlier editions of Aleph are quite different from the later ones. The ones available on the web are basically CLIP using DALL-E's autoencoder as the image generator, whereas the version of Aleph that Halluzinator is based on is CLIP using the so-called Taming Transformers' VQGAN, and I still haven't quite figured out how to tweak the earlier Aleph to produce the continuous morphing that is at the heart of Halluzinator.
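For readers unfamiliar with the general recipe, both flavours of Aleph follow the same loop: a generator (DALL-E's decoder or VQGAN) turns a latent vector into an image, CLIP scores how well that image matches the prompt, and the latent is nudged uphill on that score. The sketch below shows only the shape of that loop, with toy numpy stand-ins for both the generator and CLIP (real notebooks use PyTorch autograd, not finite differences); every name here is mine, invented for illustration:

```python
import numpy as np

def clip_score(image: np.ndarray, text_embed: np.ndarray) -> float:
    """Toy stand-in for CLIP: cosine similarity between 'image' and prompt."""
    return float(image @ text_embed /
                 (np.linalg.norm(image) * np.linalg.norm(text_embed)))

def generate(latent: np.ndarray) -> np.ndarray:
    """Toy stand-in for the VQGAN / DALL-E decoder: here just the identity."""
    return latent

def optimize(latent, text_embed, steps=200, lr=0.1, eps=1e-4):
    """Gradient ascent on the CLIP score, via central finite differences."""
    z = latent.astype(float).copy()
    for _ in range(steps):
        grad = np.zeros_like(z)
        for i in range(len(z)):
            dz = np.zeros_like(z)
            dz[i] = eps
            grad[i] = (clip_score(generate(z + dz), text_embed)
                       - clip_score(generate(z - dz), text_embed)) / (2 * eps)
        z += lr * grad  # nudge the latent toward a higher score
    return z

target = np.array([1.0, 0.0, 0.0])   # pretend this is the encoded prompt
z0 = np.array([0.2, 0.9, -0.3])      # pretend this is the starting latent
z_final = optimize(z0, target)       # 'image' now aligns with the prompt
```

The point of the comparison in the text is that only the `generate` step differs between the Aleph generations; the scoring and the update loop are the same idea throughout, which is why in principle almost any image-producing network can be slotted in.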
On the other hand, the situation is also simpler insofar as almost anything can be used to provide images to be scrutinized by CLIP. Indeed, one of the more recent of Ryan's notebooks, called "Wake up!", has a relatively small convolutional neural network making the pictures, and it is faster than the Alephs while outputting quality graphics.
So eventually I will put out a notebook very similar to Halluzinator that will be freely available, but I need to learn a little more about the technical side of things. How long that will take, I honestly can't tell, since more than anything I love to make music and videos with AI, and studying the fundamentals tends to take a back seat. But with Halluzinator now completed, I have a little more time at my disposal, although I also have some new ideas… more on that later…
In conclusion, I wish to stress again that in making Halluzinator I am only a Lilliputian standing on the shoulders of giants. As I've made clear, the system is entirely built on Advadnoun's Aleph, and besides the persons already mentioned, I'm indebted to CobaltOwl, who first came up with the idea of using a custom starting image in Aleph, and also to Hotgrits, whose immensely helpful suggestions inspired the image blending code. Kudos also to numerous other members of the Patreon group and people in the wider AI arts scene, of whom Hannu Töyrylä deserves a special mention, as his generous sharing of knowledge has given me a solid foundation for understanding machine learning as applied to the graphical arts.
PS. As one last example of Halluzinator in action, check out my latest music video! Although as always, it employs a few other animators as well. Excelsis!