Final Video

StreamDiffusion
image to image
draw sketch / model to generate video
short story script:

prompt:
A minimalist geometric figure of a small person made from simple cylinders and spheres, walking through a desert landscape at sunset. The scene has a lonely, desolate atmosphere with a cartoonish, hand-drawn, graffiti comic style. The camera is positioned behind the small figure, who takes up about one-third of the image. In the distance, there are 3-5 more figures of the same simple geometric style, creating a sense of depth in the scene.

At first I tried generating images from text alone, but none of the outputs matched the precise shape I wanted.
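
A rough sketch of why image to image can help with the shape problem: the input sketch fixes the composition, and a strength setting decides how much of it gets repainted. The snippet below assumes the convention used by the Hugging Face diffusers img2img pipeline, where about `int(steps * strength)` of the denoising steps actually run; StreamDiffusion may handle this differently. The function name `img2img_schedule` is hypothetical and only for illustration.

```python
from typing import Tuple


def img2img_schedule(num_inference_steps: int, strength: float) -> Tuple[int, int]:
    """Return (skipped_steps, executed_steps) for an image-to-image run.

    Assumption: follows the diffusers-style rule where a lower strength
    skips more of the early denoising steps, so more of the input
    sketch survives in the output.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    # Number of denoising steps that are actually applied to the noised sketch.
    executed = min(int(num_inference_steps * strength), num_inference_steps)
    # Steps skipped at the start of the schedule.
    skipped = max(num_inference_steps - executed, 0)
    return skipped, executed


if __name__ == "__main__":
    for strength in (0.25, 0.5, 0.75, 1.0):
        skipped, executed = img2img_schedule(40, strength)
        print(f"strength={strength}: skip {skipped} steps, run {executed} steps")
```

With a low strength the drawn figure keeps its cylinders-and-spheres outline and the model mostly adds style. With a strength near 1 the result behaves almost like text to image again, which is where the shape was getting lost.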