Halluzinator 0.5, user guide
I finally released my old workhorse, the Halluzinator notebook for making animations.
This is the notebook I used to make all my music animations. It may be no match for some newer and fancier notebooks (like Pytti or Disco Diffusion), but it still has a certain charm. It was designed so that parameters can be changed easily during breaks in execution, and it is quite fast too, so it is well suited for quick, stream-of-consciousness, make-it-up-as-you-go dreamy surrealist/impressionist animations.
For more info about the “philosophy” behind the design and how I use it, see my essay “The Faking of ‘The Woman in the White Dress’”.
The notebook is also not very robust and is prone to bugs and breakdowns, so keep that in mind too.
To get started, get the notebook from here.
I assume that you, the user, have some previous practical knowledge about using AI art Colab notebooks, so I’m not going to cover any basics here.
Ok, so what do we have here?
We have a settings cell, where we apply the large-scale settings. Then we run an init cell, which installs libraries and modules and initializes a lot of necessary stuff. Then we have a “Run” cell which allows us to animate interactively in sprints.
There are also extra cells for resetting the project and starting from a blank state, loading settings from a file, and making the video. Now let’s go through them one by one.
After the optional “show gpu” cell, we enter our settings. These are:
Rendered frames will be saved to the halluzinator/frames directory on your Google Drive. I usually leave this unchecked to save space on my Drive; in that case the frames go to the session content directory /content/frames.
Downloaded models will be saved on your Google Drive, so they will load faster next time. I recommend keeping this checked. The model save path is the halluzinator/models directory on your Google Drive.
Frames per second of the rendered video.
Keeping the seed constant across sessions can give similar results.
Output width of the frames.
Output height of the frames. Large values are likely to cause an out-of-memory (OOM) error.
Path to an initial image, either in /content or on Google Drive, so you need to upload it manually.
If no path is given, a nice Perlin-noise init image will be generated.
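The generated init is just a cloudy random image. As a rough illustration (not the notebook’s actual code), here is a single octave of smoothly interpolated value noise, which looks Perlin-like; the function name and grid parameter are made up for this sketch:

```python
import random

def value_noise(h, w, grid=8, seed=0):
    """Smoothly interpolated random grid values -- a rough stand-in
    for a Perlin-noise init image, with values in [0, 1]."""
    rng = random.Random(seed)
    # random values on a coarse (grid+1) x (grid+1) lattice
    coarse = [[rng.random() for _ in range(grid + 1)] for _ in range(grid + 1)]

    def smooth(t):  # smoothstep easing for softer transitions
        return t * t * (3 - 2 * t)

    img = []
    for y in range(h):
        fy = y * grid / h
        y0 = int(fy)
        ty = smooth(fy - y0)
        row = []
        for x in range(w):
            fx = x * grid / w
            x0 = int(fx)
            tx = smooth(fx - x0)
            # bilinear interpolation between the four surrounding lattice values
            a, b = coarse[y0][x0], coarse[y0][x0 + 1]
            c, d = coarse[y0 + 1][x0], coarse[y0 + 1][x0 + 1]
            top = a + (b - a) * tx
            bot = c + (d - c) * tx
            row.append(top + (bot - top) * ty)
        img.append(row)
    return img
```

Summing a few of these octaves at different grid sizes would give the classic fractal-noise look.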
Short for “multimodal”. The notebook uses the open source implementation of OpenAI’s CLIP by MLFoundations.
These models come in pairs, each consisting of a vision transformer and the name of a dataset.
Generative models providing the imagery; imagenet_16384 is the best.
Use dark theme for the interactive widgets.
Not much to say; just run it and wait for it to finish.
This is where we animate. When we run the cell for the first time, it sets up the UI. This state is indicated by the “Ready” text at the top.
At minimum, one needs to enter a prompt and then re-run the cell.
The UI widgets are grouped around the preview area in three groups: in the upper right, camera movements; below that, image controls; and below the preview, prompt controls.
This shows a video made from up to the last 50 frames already rendered, and the frames display their numbers.
Interval is how many frames to render in this run.
There are two prompt inputs with their weights. A weight can also be negative.
If we want to use an image prompt, we first need to upload the image to session storage or Google Drive, then copy its path and paste it here.
Any prompt containing a forward slash (“/”) is interpreted as a path to an image.
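The dispatch rule above is simple enough to sketch in a few lines; the function and field names here are illustrative, not the notebook’s actual API:

```python
def parse_prompt(entry, weight=1.0):
    """Sketch of the prompt rule: any "/" marks an image path,
    otherwise the entry is a text prompt. Weights may be negative
    to push the animation *away* from a prompt."""
    kind = "image" if "/" in entry else "text"
    return {"kind": kind, "value": entry, "weight": weight}
```

So `parse_prompt("/content/ref.jpg", 0.5)` would be treated as an image prompt, while `parse_prompt("text, watermark", -0.3)` is a negatively weighted text prompt.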
Burn means how many iterations are run per frame. The idea is that the more iterations, the more deeply the prompt is “burned” into the frame.
Looseness determines how much a frame is allowed to differ from the previous one. Large values produce hectic, twitchy “stop-motion” movement, whereas low values result in more coherent but much slower movement.
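The interplay of burn and looseness can be shown with a toy scalar optimization (this is only a sketch of the idea, not the notebook’s real loss or optimizer): each frame takes `burn` gradient steps toward a “prompt” target, while a penalty weighted by `1/looseness` keeps it near the previous frame.

```python
def render_frame(prev, target, burn=20, looseness=0.5, lr=0.01):
    """Toy sketch: gradient descent on
        |x - target|^2 + (1/looseness) * |x - prev|^2
    Low looseness -> strong penalty -> slow, coherent drift;
    more burn iterations -> frame moves closer to the target."""
    x = prev
    for _ in range(burn):
        grad = 2 * (x - target) + (2.0 / looseness) * (x - prev)
        x -= lr * grad
    return x
```

With a low looseness the frame barely moves per run; raising looseness or burn lets it travel much further toward the target.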
If you are not happy with the result of a run, you can check the Rewind box and type into Frame start the number of the frame from which you wish to start animating anew. The old frames will be overwritten as new ones come out.
We can choose 0–2 camera movements from the dropdowns. They are all self-explanatory except for warp, which is a zooming movement using a perspective transform. We can also specify the speeds of the movements. Note that the lower speed slider also accepts negative numbers. This is an experimental feature; some movements won’t work with negative numbers.
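A camera movement is just a resampling of the previous frame before the next optimization run. A minimal nearest-neighbour zoom sketch (not the notebook’s actual transform, which works on tensors) looks like this; note how a negative speed simply flips the direction, zooming out:

```python
def zoom(frame, speed=0.02):
    """Zoom into a 2D list-of-lists image by sampling a slightly
    smaller centered window. Negative speed zooms out (edge pixels
    repeat via clamping). Nearest-neighbour only -- a sketch."""
    h, w = len(frame), len(frame[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    s = 1.0 - speed  # scale factor around the image center

    def clamp(v, hi):
        return max(0, min(hi, int(round(v))))

    return [[frame[clamp(cy + (y - cy) * s, h - 1)][clamp(cx + (x - cx) * s, w - 1)]
             for x in range(w)] for y in range(h)]
```

Applying a small `speed` every frame gives the familiar continuous zoom drift.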
These affect the way the frame is rendered.
Noise injects noise into the (latent) image, often giving the image more detail during generation.
Lr controls the learning rate. This is somewhat similar to the looseness control: large values can make the animation jumpy, while low values keep changes more conservative.
Smoothness determines how smooth the overall image is; large values try to eliminate jaggedness and pixel noise (it uses a total variation loss).
Cuts is the number of cutouts used during generation. The more cutouts, the more detail, but a large number of cutouts consumes more memory.
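The two less obvious controls here, smoothness and cuts, correspond to standard CLIP-guided-generation techniques. A plain-Python sketch of both (illustrative names; the notebook’s real versions operate on tensors):

```python
import random

def tv_loss(img):
    """Total variation: sum of absolute differences between neighbouring
    pixels. Minimizing it smooths out jaggedness and pixel noise."""
    h, w = len(img), len(img[0])
    tv = sum(abs(img[y + 1][x] - img[y][x])   # vertical neighbours
             for y in range(h - 1) for x in range(w))
    tv += sum(abs(img[y][x + 1] - img[y][x])  # horizontal neighbours
              for y in range(h) for x in range(w - 1))
    return tv

def random_cutouts(img, n, size, rng=None):
    """Crop n random square patches; each cutout is scored against the
    prompt separately, so more cutouts give CLIP more local views
    (and cost more memory)."""
    rng = rng or random.Random()
    h, w = len(img), len(img[0])
    cuts = []
    for _ in range(n):
        t = rng.randrange(h - size + 1)
        l = rng.randrange(w - size + 1)
        cuts.append([row[l:l + size] for row in img[t:t + size]])
    return cuts
```

A perfectly flat image has zero total variation, which is why cranking smoothness up too far flattens fine detail.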
Clean up and start new generation
This wipes out all the frames and running parameters; after this, when we run the “Run” cell, it will again be in the ready state.
Restore from a file
This cell can be useful for restoring the whole notebook state if something breaks.
Our settings are written to /content/options.json, and the last frame is always copied to /content/view.jpg.
Using these two, we ought to be able to get back on track if needed.
We can also save view.jpg and options.json and start from them in another session.
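Backing the pair up for a later session amounts to copying two files and round-tripping a JSON dict. A stdlib sketch (the helper names and backup directory are my own, not from the notebook):

```python
import json
import pathlib
import shutil

def save_state(options, frame_path, out_dir):
    """Write the settings dict and copy the latest frame, mirroring the
    options.json / view.jpg pair the notebook keeps in /content."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "options.json").write_text(json.dumps(options, indent=2))
    shutil.copy(frame_path, out / "view.jpg")

def load_state(in_dir):
    """Read the settings dict back from a saved backup directory."""
    return json.loads((pathlib.Path(in_dir) / "options.json").read_text())
```

In a new session you would restore options.json and view.jpg to /content before running the restore cell.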
Makes a video from all the generated frames and stores it at /content/video.mp4.
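Stitching numbered frames into an mp4 is typically done with ffmpeg; I’m assuming that’s what happens under the hood here, and the `%04d.png` naming pattern below is an assumption too, not necessarily what the notebook produces:

```python
def ffmpeg_command(frame_dir, fps, out_path):
    """Build an ffmpeg invocation that encodes numbered frames
    (0001.png, 0002.png, ...) into an H.264 mp4."""
    return [
        "ffmpeg", "-y",                   # overwrite output if it exists
        "-framerate", str(fps),           # input frame rate
        "-i", f"{frame_dir}/%04d.png",    # numbered-frame input pattern
        "-c:v", "libx264",                # widely compatible codec
        "-pix_fmt", "yuv420p",            # needed by many players
        out_path,
    ]
```

Passing the resulting list to `subprocess.run` would produce the video; adjust the pattern to match your actual frame filenames.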
That’s it, hope you make wonderful animations with it!