Bringing 'The Birth of Venus' to Life with Deep Learning

September 06, 2021

Even after analyzing thousands of artworks throughout my Art History degree, Sandro Botticelli’s work remains one of my favourites. I wanted to create something at the intersection of my experience in Art History and Computer Science, and “The Birth of Venus” just felt right. For fun, I decided to bring Venus to life within Botticelli’s work.

The final product leverages Deep Learning to transform “The Birth of Venus” into an interactive interpretation where Venus’ gaze follows your mouse/touch. You can view the effect on my site, erinhavens.com.

In this post, I’ll walk through the inspiration and technical implementation behind my work.

 

Background on the Project

Why ‘The Birth of Venus’?

Figure 1: Sandro Botticelli. ‘The Birth of Venus’. Tempera on canvas. 67.9 inches x 109.6 inches. Wikimedia Commons.

“The Birth of Venus” — in Botticelli’s native Italian, “Nascita di Venere” — is an expression of symbolism, virtue, and ideals. While Botticelli’s motivations for creating the work are widely disputed, most agree that it depicts a traditional scene from classical mythology — a floating Venus, driven to the shore by the breeze of Zephyrus, the wind god, shortly after her birth. The work traverses many themes, including purity and other classical values, like the appreciation of beauty and the concept of perfection.

The work, as one of the most influential, cult-favourite artworks of all time, has been a pillar of my degree and my exploration of the art world, so it was an easy choice. (See the notes for more information on Botticelli’s work.)

Inspiration for the Effect

During the month of August, I was lucky enough to take a 4-week sabbatical before starting my new role at GitHub. I’ve always been wildly excited about the realm of art tech and wanted to do a light hackathon project in the space. I also wanted to use this time to redo my 3-year-old (read: ancient!) personal website.

I wanted to do something: 1) interactive and magnetic, 2) which leveraged some of my favourite artworks, and 3) which could be incorporated into my new site design. I spent a while prototyping different designs, but nothing quite felt like me.

Responsive Eyes

Recently, I came across a trend on CodePen where developers draw cartoon eyes with CSS that follow the user’s mouse:


Figure 2: See the effect live at @J-Roel’s CodePen here.

It was in this moment of serendipity that I wondered: what if I could create a similar interactive effect using a static, stylized image?

  

Animating ‘Venus’ with Deep Learning

I chose the First Order Motion Model (FOMM) by Aliaksandr Siarohin et al., a popular approach for similar projects, for its open-source and straightforward implementation.

Conveniently, the project includes a pre-trained model for animating input images, which saved time and resources and allowed for rapid prototyping of my idea. The model simply needs a cropped driving video of a face, whose motion it transfers onto a static source image to generate an output video.
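
To give a sense of what running the pre-trained model looks like, here is a condensed sketch based on the usage shown in the repo’s demo.py — the filenames are placeholders, and vox-256 is the pre-trained face checkpoint the authors provide:

```python
import imageio
from skimage import img_as_ubyte
from skimage.transform import resize

# load_checkpoints and make_animation come from the FOMM repo's demo.py
from demo import load_checkpoints, make_animation

# Placeholder filenames; the pre-trained vox model expects 256x256 inputs
source_image = resize(imageio.imread('venus_crop.png'), (256, 256))[..., :3]
driving_video = [resize(f, (256, 256))[..., :3]
                 for f in imageio.mimread('driving_crop.mp4', memtest=False)]

generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml',
                                          checkpoint_path='vox-cpk.pth.tar')
predictions = make_animation(source_image, driving_video, generator, kp_detector,
                             relative=True, adapt_movement_scale=True)

# fps here is an assumption; match it to your driving video
imageio.mimsave('venus_output.mp4', [img_as_ubyte(p) for p in predictions], fps=30)
```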

To animate Venus, my plan was:

  1. Create an input video of myself using my webcam, and crop the video appropriately.
  2. Use FOMM to generate an output video using a cropped image of Venus.
  3. Divide our Venus output into discrete frames (see the sketch after this list).
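
Step 3 then reduces to a short loop; a minimal sketch, again with placeholder filenames:

```python
import os
import imageio

# Split the generated output video into numbered frames:
# frames/venus_000.png, frames/venus_001.png, ...
os.makedirs('frames', exist_ok=True)
for i, frame in enumerate(imageio.get_reader('venus_output.mp4')):
    imageio.imwrite(f'frames/venus_{i:03d}.png', frame)
```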

The next phase of the project is bringing Venus to life through post-processing, CSS, and some simple JS:

  1. Map our numbered frames to specific positions on the screen, for our interactive mouse-following effect.
  2. Blend the frames to ensure a seamless transition between our images, undetectable to the naked eye.
  3. Overlay the output frames onto the original painting.
  4. Transform the frames to the correct rotation.

Creating the Frames

I created a crop of Venus and experimented with a cropped webcam video of myself looking from left to right. Unfortunately, the output was wonky at best and terrifying at worst, with the pre-trained model only able to detect half of Venus’ face.

Due to its training data, the model responded better to a well-lit, unrotated, straight-on view of the subject. As such, it was necessary to rotate Venus’ face:

Figure 3: Rotational transformation of Venus.
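
In code, the straightening step is a single rotation; here’s a minimal Pillow sketch, where the 20° angle is an assumed placeholder (I tuned the real angle by eye):

```python
from PIL import Image

# Rotate the crop so the model sees an upright, straight-on face.
# A negative angle rotates clockwise; expand=True avoids clipping corners.
crop = Image.open('venus_crop.png')
upright = crop.rotate(-20, resample=Image.BICUBIC, expand=True)
upright.save('venus_upright.png')
```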

Mapping the Frames

Next, we divide the desired output into about 30 discrete frames looking from left to right:

Figure 4: Sample output frames.

The left-most frame, frame 0, will correspond to the left-most mouse position. The right-most frame will correspond to the right-most mouse position.
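
On the live site this logic runs as simple JS; the mapping itself is just a linear scaling, sketched here in Python:

```python
def frame_for_mouse(mouse_x, viewport_width, n_frames=30):
    """Map a horizontal mouse/touch position to a frame index:
    frame 0 is the left-most gaze, frame n_frames - 1 the right-most."""
    t = min(max(mouse_x / viewport_width, 0.0), 1.0)  # normalize to [0, 1]
    return round(t * (n_frames - 1))
```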

However, I would need some additional image processing to seamlessly integrate the frames back into the original image: determining the transformation angle, figuring out how the size of each frame corresponds to the background image, and blurring or blending out the edges for a seamless effect.

Transforming and Blending the Frames

To blend the frames, the first step was to determine the transformation angle by superimposing a semi-transparent output frame on top of the original image.
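
A Pillow sketch of that alignment step — the offsets here are assumed placeholders, tuned by eye until the faces line up:

```python
from PIL import Image

painting = Image.open('birth_of_venus.jpg').convert('RGBA')
frame = Image.open('frames/venus_000.png').convert('RGBA')
frame.putalpha(128)  # make the frame ~50% transparent

x, y = 150, 80  # placeholder offsets, adjusted by hand
preview = painting.copy()
preview.alpha_composite(frame, dest=(x, y))
preview.show()
```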

Next, it became a matter of blending the edges of each frame with the original source image to ensure a seamless, undetectable transition, which I did in post-processing using this solution by Emilie Xie.
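
The edge blending can be approximated with a feathered alpha mask; this is a generic sketch of the idea rather than Emilie Xie’s exact method, and the feather radius is an assumption:

```python
from PIL import Image, ImageDraw, ImageFilter

def feathered_paste(background, frame, position, feather=12):
    """Paste `frame` onto `background` with softened edges so the
    seam is undetectable; `feather` is an assumed blur radius."""
    w, h = frame.size
    # Opaque interior, transparent border; the blur softens the boundary
    mask = Image.new('L', frame.size, 0)
    draw = ImageDraw.Draw(mask)
    draw.rectangle([feather, feather, w - feather, h - feather], fill=255)
    mask = mask.filter(ImageFilter.GaussianBlur(feather))
    out = background.copy()
    out.paste(frame, position, mask)
    return out
```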

Finally, I simply had to perfectly align the output frames and ensure the alignment remained responsive:

Figure 5: Overlaid output on the original.

Output

Success! We now have a Venus whose gaze follows the mouse or touch input.

For fast movements, the output was choppy, so I adjusted the animation speed and “forced” the animation to show the frames in between the current state and the new frame dictated by the mouse’s position.
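
In essence, rather than jumping straight to the frame under the mouse, the animation walks the indices one at a time (sketched in Python; the site does the equivalent in JS):

```python
def frames_between(current, target):
    """Indices to play, one per animation tick, so a fast mouse movement
    steps through every intermediate frame instead of jumping."""
    step = 1 if target >= current else -1
    return list(range(current + step, target + step, step))

# e.g. frames_between(3, 7) -> [4, 5, 6, 7]
```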

Finally, at rest, I implemented an idle animation in which Venus looks from left to right, then right to left. The final product:

Figure 6: Final animation bringing Venus to life!

It was an extremely fun and thrilling project — I feel like I have the coolest personal site on the internet now. 😉 And most importantly, it feels like me.

Thanks for reading!

Notes

[1] For an interpretation of “The Birth of Venus”: HeadStuff.

[2] For technical inspiration: “Bringing the Mona Lisa to Life” by Emilie Xie, from TensorFlow.js

[3] First Order Motion Model (FOMM) by Aliaksandr Siarohin et al. 2019.

[4] FOMM Demo by Aliaksandr Siarohin and GUI by @graphemecluster.

[5] Mentor for this project: @anassinator