SketchRNN: When Technology meets Art

Introduction

Hello again, folks! In this blog post, we are going to summon our creative selves and dive into a cool pre-trained machine-learning model called SketchRNN. After a brief introduction to SketchRNN, we will take a look at a few sketches generated by the model.

Imagine a neural network capable of creating free-hand sketches, an artist behind your screen. SketchRNN is exactly that: a generative model that helps your imagination come to life with the stroke of a virtual pen. We are all aware of the power art holds, and with the advancement of artificial intelligence and machine learning, we have pushed the boundaries of what is feasible in the world of art.

SketchRNN is one such intersection where art and technology meet, and as it develops further, we may see machines assisting us humans and helping amplify our innate creativity.

Recurrent Neural Networks

Before we get into SketchRNN itself: if you have no idea what RNNs are, they are a type of neural network designed to handle sequential data. RNNs have a memory that lets them take previous inputs into account when producing an output. They can also generate sequences, which makes them well suited to sketches, since each stroke in a sketch is one element of a sequence.
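To make the idea of "memory" a bit more concrete, here is a minimal sketch of a recurrent step in plain JavaScript. It is a toy with a single hidden value, and the weights (wIn, wRec, bias) are made-up numbers for illustration, not anything taken from SketchRNN. Real RNNs work with vectors and learned weights.

```javascript
// A toy recurrent cell with a single hidden value.
// The default weights are illustrative assumptions, not trained values.
function rnnStep(input, hidden, { wIn = 0.5, wRec = 0.8, bias = 0 } = {}) {
  // The new hidden state mixes the current input with the previous state.
  return Math.tanh(wIn * input + wRec * hidden + bias);
}

function runRnn(inputs) {
  let hidden = 0; // the "memory" starts empty
  const states = [];
  for (const input of inputs) {
    hidden = rnnStep(input, hidden);
    states.push(hidden);
  }
  return states;
}

// The same final input (1) leads to different states,
// because the network remembers what came before it.
const statesA = runRnn([1, 1]);
const statesB = runRnn([-1, 1]);
console.log(statesA[1], statesB[1]);
```

The last input is identical in both runs, yet the final states differ. That carried-over state is what lets an RNN decide the next stroke based on the strokes drawn so far.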

Sequential data

Sequential data, as the name implies, is data that comes in a sequence. That could be text, as a sequence of characters or a sequence of words, or even music, which is a sequence of notes and rhythms. Each unit in the sequence can be called a "state". For music, the state could be which note it is or how long it lasts.

Drawings can be thought of as a sequence in the same way. They are a sequence of vector paths, or stroke paths. Each element of the sequence is one vector path: a change in x, a change in y, and a pen status.
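Here is a small sketch of what such a sequence might look like in plain JavaScript, and how the relative offsets turn into positions on a canvas. The field names dx, dy and pen match the ones used in the code later in this post. The sample values and the toAbsolute() helper are invented for illustration.

```javascript
// Each stroke is an offset from the previous pen position plus a pen state.
// The sample data below is invented for illustration.
const strokes = [
  { dx: 10, dy: 0, pen: "down" },
  { dx: 0, dy: 10, pen: "down" },
  { dx: -10, dy: 0, pen: "up" },
];

// Turn relative offsets into absolute canvas positions.
function toAbsolute(strokeList, startX = 0, startY = 0) {
  let x = startX;
  let y = startY;
  return strokeList.map(({ dx, dy, pen }) => {
    x += dx;
    y += dy;
    return { x, y, pen };
  });
}

const points = toAbsolute(strokes, 100, 100);
console.log(points);
```

Storing offsets rather than absolute positions means the same drawing can be started anywhere on the canvas just by changing the starting point.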

Insights of the Model

Coming back to SketchRNN: it is not one model but a collection of models. The data used to train them comes from "Quick, Draw!", a game from Google where you are prompted to draw something and the website tries to guess what you drew. The drawings collected this way are available as an open-source dataset.

SketchRNN is designed to produce diverse outputs for a given input. This diversity is achieved through a sampling process, providing a range of possibilities for each generated sketch. In an interactive setting, users can begin a sketch and the model will finish it in real time based on their input. This makes interactive and group drawing experiences possible.
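To get a feel for how sampling creates variety, here is a minimal sketch of sampling with a "temperature" in plain JavaScript. It is a simplification: it only picks a pen state from three hypothetical probabilities, whereas the real model also has to sample the dx and dy offsets. The function names and numbers are assumptions made up for this example.

```javascript
// Rescale probabilities by a temperature, then renormalize.
// Low temperature sharpens the distribution; high temperature flattens it.
function applyTemperature(probs, temperature) {
  const scaled = probs.map((p) => Math.pow(p, 1 / temperature));
  const total = scaled.reduce((sum, p) => sum + p, 0);
  return scaled.map((p) => p / total);
}

// Draw one index from a categorical distribution.
// `rand` is injectable so the example can be made repeatable.
function sampleIndex(probs, rand = Math.random) {
  const r = rand();
  let cumulative = 0;
  for (let i = 0; i < probs.length; i++) {
    cumulative += probs[i];
    if (r < cumulative) return i;
  }
  return probs.length - 1; // guard against rounding error
}

// Hypothetical probabilities for the next pen state.
const penStates = ["down", "up", "end"];
const penProbs = [0.7, 0.2, 0.1];

const cautious = applyTemperature(penProbs, 0.25);
const adventurous = applyTemperature(penProbs, 2);
console.log(cautious, adventurous);
console.log(penStates[sampleIndex(adventurous)]);
```

With a low temperature the most likely option wins almost every time, so sketches look alike. With a higher temperature the less likely options get picked more often, which gives more varied (and sometimes messier) drawings.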

Variational AutoEncoder (VAE)

SketchRNN combines its sequence-to-sequence model with a variational autoencoder (VAE), which is a crucial component of the model. The VAE learns to represent sketches concisely and meaningfully in a lower-dimensional latent space. The encoder compresses sketches into this space, and the decoder reconstructs sketches from points in it. During sketch generation, the VAE permits sampling from this space to create a variety of organized sketches. It enhances the creative capabilities of the model by allowing meaningful interpolation and exploration within the learned latent space.
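Interpolation in a latent space can be sketched in a few lines of plain JavaScript. The example below uses simple linear interpolation between two made-up latent vectors. The vectors, their size and the lerpLatent() helper are assumptions for illustration only, and the decoder that would turn each point back into a drawing is left out.

```javascript
// Linear interpolation between two latent vectors of equal length.
function lerpLatent(zA, zB, t) {
  if (zA.length !== zB.length) {
    throw new Error("latent vectors must have the same length");
  }
  return zA.map((a, i) => a + (zB[i] - a) * t);
}

// Hypothetical 4-dimensional latent codes; real models use many more dimensions.
const zOctopus = [0.0, 1.0, -2.0, 0.5];
const zOther = [1.0, -1.0, 2.0, 0.5];

// Five evenly spaced points on the path from one code to the other.
const path = [0, 0.25, 0.5, 0.75, 1].map((t) => lerpLatent(zOctopus, zOther, t));
console.log(path);
```

If each of those five points were passed through a decoder, the idea is that the resulting sketches would morph gradually from the first drawing into the second.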

Creating cute little Octopuses

Now for the fun part, putting the boring theory to use, we shall create a consortium (a group of octopuses for those who don't know). But how do we do it? This is where we need to trigger our creativity. Let's go through the entire step-by-step process.

Creating a canvas in p5.js
Setting up the draw() function and drawing the setup() function.

For those who don't understand, read it again :)

In the setup() function, we load the SketchRNN model from the ml5.js library. The code looks something like this:

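  let model; // the SketchRNN model
  let x, y; // current pen position

  function setup() {
    createCanvas(windowWidth, windowHeight);
    // Pick a random starting point for the sketch.
    x = random(-width / 2, width / 2);
    y = random(-height / 2, height / 2);
    // Load the pre-trained "octopus" model; modelReady runs once it has loaded.
    model = ml5.sketchRNN("octopus", modelReady);
    background(0);
  }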

Once the model has loaded, we reset it and ask it to generate the first stroke:

  function modelReady() {
    console.log("model ready");
    model.reset();
    model.generate(gotSketch);
  }
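
modelReady() hands a callback named gotSketch to model.generate(), but the callback itself is not shown in this post. A minimal sketch of what it might look like, assuming an error-first callback of the form (error, result) and a global strokePath variable for draw() to read:

```javascript
// Holds the most recent stroke returned by the model (null until one arrives).
let strokePath = null;

// Assumed error-first callback: stores the stroke so draw() can pick it up.
function gotSketch(error, result) {
  if (error) {
    console.error(error);
    return;
  }
  strokePath = result;
}
```

Keeping the callback this small means all the actual drawing stays in draw(), which simply checks whether a new stroke is waiting.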

Now we set up the draw() function. As mentioned earlier, drawings are a sequence of vector paths, each consisting of a change in x (dx), a change in y (dy) and a pen status.

The pen status says whether the pen is up or down. Note that the pen state attached to a stroke actually describes what you should be doing for the next stroke, not the current one. Since a drawing has to begin with the pen on the canvas, the initial status of the pen is "down".
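The "pen state applies to the next stroke" rule is easy to get wrong, so here is a minimal sketch of it in plain JavaScript, without p5.js. The collectSegments() helper and the sample strokes are invented for illustration, and the "end" state that marks a finished drawing is an assumption about how the model signals completion.

```javascript
// Walk a stroke sequence and collect the line segments that should be drawn.
// The pen state stored on a stroke applies to the NEXT stroke, so we start "down".
function collectSegments(strokeList, startX = 0, startY = 0) {
  let x = startX;
  let y = startY;
  let pen = "down";
  const segments = [];
  for (const s of strokeList) {
    if (pen === "end") break; // assumed marker for a finished sketch
    const newX = x + s.dx;
    const newY = y + s.dy;
    if (pen === "down") {
      segments.push({ x1: x, y1: y, x2: newX, y2: newY });
    }
    x = newX;
    y = newY;
    pen = s.pen; // takes effect on the next iteration
  }
  return segments;
}

// Invented sample: draw, lift the pen, move, draw again, then finish.
const sample = [
  { dx: 5, dy: 0, pen: "up" },
  { dx: 0, dy: 5, pen: "down" },
  { dx: 5, dy: 0, pen: "end" },
  { dx: 9, dy: 9, pen: "down" },
];
const segments = collectSegments(sample);
console.log(segments);
```

The first stroke is drawn even though it carries "up", because that "up" only tells us to lift the pen before the second stroke.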

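  if (strokePath != null) {
    // Scale the offsets down so the sketch fits on the canvas.
    let newX = x + strokePath.dx * 0.2;
    let newY = y + strokePath.dy * 0.2;
    if (pen == "down") {
      stroke(0, 200, 200);
      strokeWeight(2);
      line(x, y, newX, newY);
    }
    // Move on: the stroke's pen state applies to the next stroke.
    x = newX;
    y = newY;
    pen = strokePath.pen;
    // The stroke is used up; requesting the next one is not shown in this excerpt.
    strokePath = null;
  }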

After some debugging and final coding, we arrive at our beautiful consortium:

https://youtu.be/0KpyinwtStQ

Conclusion

From this pretty long blog post, we can see how machines can learn to draw from humans and create beautiful octopuses and other creations. We can also see how, like a human, the model can be inaccurate at times. To conclude, SketchRNN is fun. Hope you enjoyed reading, byee!