<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[First Principles]]></title><description><![CDATA[Engineer at heart. Solves problems for fun. Exploring WebRTC and Distributed systems. Building cool stuff @100mslive.]]></description><link>https://blog.tushartripathi.me</link><generator>RSS for Node</generator><lastBuildDate>Sat, 09 May 2026 00:38:01 GMT</lastBuildDate><atom:link href="https://blog.tushartripathi.me/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Adding AR Filters in a 100ms Video Call - Part 1]]></title><description><![CDATA[How cool would it be if you could build your own Video Call app with Snapchat-like filters in it!

Ikr! That's what I was thinking when I came across Jeeliz. Now I have worked with tensorflow.js based libraries in the past but they're usually quite C...]]></description><link>https://blog.tushartripathi.me/adding-ar-filters-in-a-100ms-video-call-part-1</link><guid isPermaLink="true">https://blog.tushartripathi.me/adding-ar-filters-in-a-100ms-video-call-part-1</guid><category><![CDATA[WebRTC]]></category><category><![CDATA[video]]></category><category><![CDATA[AR]]></category><category><![CDATA[React]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Tushar Tripathi]]></dc:creator><pubDate>Tue, 22 Feb 2022 16:14:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1635679144298/ATme9cjey.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>How cool would it be if you could build your own Video Call app with Snapchat-like filters in it!
<img src="https://c.tenor.com/QctWuksBNPQAAAAC/super-cool-and-amazing-corey-vidal.gif" alt="Super Cool" />
Ikr! That's what I was thinking when I came across <a target="_blank" href="https://github.com/jeeliz/jeelizFaceFilter">Jeeliz</a>. I have worked with <code>tensorflow.js</code> based libraries in the past, but they're usually quite CPU intensive for a live video use case. Jeeliz looked promising as it's designed for exactly this use case. So I thought, why not try it out by adding some 3D AR filters to our video calls? Well, that's what we're going to do.</p>
<p>We'll use React and 100ms' React SDK for the video call part of our application. <a target="_blank" href="https://100ms.live/">100ms</a>, in short, is building developer-focused live SDKs which abstract away the low-level complexities. Support for <a target="_blank" href="https://docs.100ms.live/javascript/v2/plugins/custom-video-plugins">video plugins</a> was released recently, which makes it easier to experiment with AR filters once a basic app is set up. And so I set forth on the journey. In this blog, I'll mostly be talking about the implementation details of the filters themselves rather than setting up the video call app from scratch. You can check out the <a target="_blank" href="https://docs.100ms.live/javascript/v2/guides/react-quickstart">quickstart guide</a> for a quick overview of the SDK and how it works, or you can just fork it (it's also the first step 😀) and follow along with my exploration.</p>
<p>I have split the blog into parts so it's not overwhelming. In this part, we'll try to understand the plugin interface exposed by the SDK, learn a bit about HTML Canvas elements and implement a basic filter. We'll go into more details about AR, WebGL, and implementing the AR filter plugin in further parts. </p>
<p>Everything we'll do is available in <a target="_blank" href="https://github.com/triptu/100ms-face-filters">this Github repo</a>, and I have linked to the relevant commit for each step. By the end of this blog, we'll be able to build a simple grayscale filter -</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1635601059289/TXwyJDmXc.gif" alt="grayscale.gif" />
Looks cool? You can check the demo of the final thing <a target="_blank" href="https://hms-face-filters.netlify.app/">here</a>. Let's get started with the code part.</p>
<h3 id="heading-fork-the-quickstart">Fork the quickstart</h3>
<p>This step can be skipped if you're integrating filters into an existing web app already using the 100ms SDKs. If that's not the case, let's start with forking the <a target="_blank" href="https://codesandbox.io/s/happy-meddling-syndrome-q4ukf">codesandbox</a> linked in the doc to a GitHub repo. I have already done this, so forking my GitHub <a target="_blank" href="https://github.com/triptu/100ms-face-filters/tree/original">repo</a> will be much faster. The initial code lies in the branch named <code>original</code>.</p>
<p>You can also check out the branch to follow along locally -</p>
<pre><code class="lang-sh">git <span class="hljs-built_in">clone</span> -b original https://github.com/triptu/100ms-face-filters.git
</code></pre>
<h3 id="heading-run-the-app-locally">Run the app locally</h3>
<p>We can clone the repo now and run it locally. Feel free to update the SDKs to their latest versions (listed <a target="_blank" href="https://docs.100ms.live/javascript/v2/release-notes/release-notes">here</a>) and then run <code>yarn install</code> followed by <code>yarn start</code>. We'll see a screen like this if everything worked fine -</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1635600953391/0x85ze9hD.png" alt="Screenshot 2021-10-30 at 7.05.38 PM.png" /></p>
<p>In case you're wondering what that auth token is, you can think of it as the meeting ID that tells 100ms which room to put you in. Getting such a token is fairly straightforward (it doesn't require anything technical or any code) and is covered in more detail <a target="_blank" href="https://docs.100ms.live/javascript/v2/guides/token">here</a>. Once you get the token, verify that everything is working fine. You can try joining from multiple tabs or sharing the link with your friends (after exposing your local server with <a target="_blank" href="https://ngrok.com/">ngrok</a>, of course). You can also join the same room from the link available on the dashboard (where the token was copied from).</p>
<h3 id="heading-grayscale-filter">Grayscale Filter</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1635604908662/56U3xd7jq.png" alt="colorpixels.png" />
Let's say we have to convert a colorful image to grayscale and we're wondering what it would take. To answer this, let's break the image down into its parts. An image is a matrix of pixels, where a single pixel can be described using three numbers from 0-255: the intensity values of red, green and blue. For a grayscale image, each pixel is described by only one number ranging from 0-255, with 0 being black (lowest intensity) and 255 being white (highest intensity).
Now, to convert a colored pixel with RGB values into grayscale, we need some sort of mapping between the two. A fairly straightforward way is to average out the three intensities -
<pre><code class="lang-js">intensity = (red + blue + green)/<span class="hljs-number">3</span>
</code></pre>
<p>But this won't result in a balanced grayscale image. The reason is that our eyes react differently to each color: they're most sensitive to green and least sensitive to blue. For our filter, we'll go with <a target="_blank" href="https://en.wikipedia.org/wiki/Luma_(video)">Luma</a>, which is a weighted sum of the RGB values and maps to the perceived luminance much more accurately.</p>
<pre><code class="lang-js"><span class="hljs-comment">// Luma</span>
intensity = red * <span class="hljs-number">0.299</span> + green * <span class="hljs-number">0.587</span> + blue * <span class="hljs-number">0.114</span>
</code></pre>
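<p>To see the difference between the two mappings, here's a small sketch (the function names are illustrative) comparing naive averaging with Luma for a pure green pixel:</p>

```javascript
// Naive average: treats all three channels equally
const averageIntensity = (r, g, b) => Math.floor((r + g + b) / 3);

// Luma: weights each channel by how sensitive our eyes are to it
const lumaIntensity = (r, g, b) =>
  Math.floor(r * 0.299 + g * 0.587 + b * 0.114);

// For a pure green pixel, Luma correctly reports it as much brighter
console.log(averageIntensity(0, 255, 0)); // 85
console.log(lumaIntensity(0, 255, 0)); // 149
```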
<h3 id="heading-going-through-the-plugin-docs">Going through the Plugin Docs</h3>
<p>Now that we're all set with the algorithm to convert an RGB image to grayscale, let's move ahead with checking out how we can write a plugin to implement this. The documentation is <a target="_blank" href="https://docs.100ms.live/javascript/v2/plugins/custom-video-plugins">here</a>, and fortunately, I've read it so you don't have to. </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1635607283807/BVOvsDJtV.gif" alt="plugindocs.gif" /></p>
<p>The gist of it is that we have to write a class that implements a method <code>processVideoFrame(inputCanvas, outputCanvas)</code>, where we're passed an image on the input canvas and have to put the resulting image on the output canvas. This makes the job fairly easy for us, as we don't have to worry about the video stream but just one image at a time. So as long as we can find a way to get RGB values from the input canvas and put the grayscale values on the output canvas, we can implement the algorithm discussed above and we'll be good.</p>
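<p>For intuition, a do-nothing plugin that simply copies the input frame to the output would look something like this (a minimal sketch with hypothetical names, not part of the repo):</p>

```javascript
// Minimal pass-through sketch of the plugin interface (illustrative only)
class PassthroughPlugin {
  processVideoFrame(input, output) {
    // keep the output dimensions in sync with the input
    output.width = input.width;
    output.height = input.height;
    // copy the input frame onto the output canvas unchanged
    output.getContext("2d").drawImage(input, 0, 0);
  }

  getName() {
    return "passthrough-plugin";
  }

  // the SDK also expects init/stop/isSupported/getPluginType,
  // which we'll see in the full grayscale implementation
}
```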
<h3 id="heading-implementing-the-grayscale-plugin">Implementing the Grayscale Plugin</h3>
<p>Check out the full commit <a target="_blank" href="https://github.com/triptu/100ms-face-filters/commit/be082a87e1330d66d18eb32e702881558d26e183">here</a>.</p>
<p>So as we figured out from the docs, it's the <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API">HTML Canvas</a> that we're going to deal with. Now, a canvas has something called a context, which exposes direct methods both for getting the RGB values from a canvas (<a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/getImageData">getImageData</a>) and for writing them back (<a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/putImageData">putImageData</a>). With this information, we can begin writing our grayscale plugin. I have added further comments in the code below. Note that some other methods are present too, as they're required by the SDK.</p>
<pre><code class="lang-js"><span class="hljs-comment">// HMSVideoPluginType is exported by the SDK</span>
<span class="hljs-keyword">import</span> { HMSVideoPluginType } <span class="hljs-keyword">from</span> <span class="hljs-string">"@100mslive/react-sdk"</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">GrayscalePlugin</span> </span>{
   <span class="hljs-comment">/**
   * <span class="hljs-doctag">@param </span>input {HTMLCanvasElement}
   * <span class="hljs-doctag">@param </span>output {HTMLCanvasElement}
   */</span>
  processVideoFrame(input, output) {
    <span class="hljs-comment">// we don't want to change the dimensions so set the same width, height</span>
    <span class="hljs-keyword">const</span> width = input.width;
    <span class="hljs-keyword">const</span> height = input.height;
    output.width = width;
    output.height = height;
    <span class="hljs-keyword">const</span> inputCtx = input.getContext(<span class="hljs-string">"2d"</span>);
    <span class="hljs-keyword">const</span> outputCtx = output.getContext(<span class="hljs-string">"2d"</span>);
    <span class="hljs-keyword">const</span> imgData = inputCtx.getImageData(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, width, height);
    <span class="hljs-keyword">const</span> pixels = imgData.data; 
    <span class="hljs-comment">// pixels is an array of all the pixels with their RGBA values, the A stands for alpha</span>
    <span class="hljs-comment">// we will not actually be using alpha for this plugin, but we still need to skip it(hence the i+= 4)</span>
    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; pixels.length; i += <span class="hljs-number">4</span>) {
      <span class="hljs-keyword">const</span> red = pixels[i];
      <span class="hljs-keyword">const</span> green = pixels[i + <span class="hljs-number">1</span>];
      <span class="hljs-keyword">const</span> blue = pixels[i + <span class="hljs-number">2</span>];
      <span class="hljs-comment">// the luma algorithm as discussed above, floor because intensity should be an integer</span>
      <span class="hljs-keyword">const</span> lightness = <span class="hljs-built_in">Math</span>.floor(red * <span class="hljs-number">0.299</span> + green * <span class="hljs-number">0.587</span> + blue * <span class="hljs-number">0.114</span>);
      <span class="hljs-comment">// all of RGB is set to the calculated intensity value for grayscale</span>
      pixels[i] = pixels[i + <span class="hljs-number">1</span>] = pixels[i + <span class="hljs-number">2</span>] = lightness;
    }
    <span class="hljs-comment">// and finally now that we have the updated values for grayscale we put it on output</span>
    outputCtx.putImageData(imgData, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>);
  }

  getName() {
    <span class="hljs-keyword">return</span> <span class="hljs-string">"grayscale-plugin"</span>;
  }

  isSupported() {
    <span class="hljs-comment">// we're not doing anything complicated, it's supported on all browsers</span>
    <span class="hljs-keyword">return</span> <span class="hljs-literal">true</span>;
  }

  <span class="hljs-keyword">async</span> init() {} <span class="hljs-comment">// placeholder, nothing to init</span>

  getPluginType() {
    <span class="hljs-keyword">return</span> HMSVideoPluginType.TRANSFORM; <span class="hljs-comment">// because we transform the image</span>
  }

  stop() {} <span class="hljs-comment">// placeholder, nothing to stop</span>
}
</code></pre>
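<p>Since the per-pixel loop doesn't actually need a canvas, it can also be pulled out into a pure function and exercised on raw RGBA data, which makes it easy to unit test (the helper name here is illustrative, not from the repo):</p>

```javascript
// Mirrors the plugin's loop: mutates RGBA data in place, leaving alpha untouched
function grayscalePixels(pixels) {
  for (let i = 0; i < pixels.length; i += 4) {
    const lightness = Math.floor(
      pixels[i] * 0.299 + pixels[i + 1] * 0.587 + pixels[i + 2] * 0.114
    );
    pixels[i] = pixels[i + 1] = pixels[i + 2] = lightness;
  }
  return pixels;
}

// A single fully red, fully opaque pixel becomes a dark gray
console.log(grayscalePixels([255, 0, 0, 255])); // [76, 76, 76, 255]
```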
<h3 id="heading-adding-a-button-component-to-add-the-plugin">Adding a button component to add the plugin</h3>
<p>Check out the full commit <a target="_blank" href="https://github.com/triptu/100ms-face-filters/commit/934137bf423a9b4a2256279a2d1b9981167a9ffd">here</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1635612600249/7YliY5Gdk.png" alt="buttonadded.png" /></p>
<p>Let's now write a toggle button component which will turn the filter on/off. The component takes in a plugin and the button name to display.</p>
<pre><code class="lang-jsx"><span class="hljs-comment">// also initialise the grayscale plugin for use by the Button's caller</span>
<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> grayScalePlugin = <span class="hljs-keyword">new</span> GrayscalePlugin();

<span class="hljs-keyword">export</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">PluginButton</span>(<span class="hljs-params">{ plugin, name }</span>) </span>{
  <span class="hljs-keyword">const</span> isPluginAdded = <span class="hljs-literal">false</span>;
  <span class="hljs-keyword">const</span> togglePluginState = <span class="hljs-keyword">async</span> () =&gt; {};

  <span class="hljs-keyword">return</span> (
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">button</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"btn"</span> <span class="hljs-attr">onClick</span>=<span class="hljs-string">{togglePluginState}</span>&gt;</span>
      {`${isPluginAdded ? "Remove" : "Add"} ${name}`}
    <span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span></span>
  );
}
</code></pre>
<p>We'll use it as below; this is added in the header component in the above commit.</p>
<pre><code class="lang-jsx">&lt;PluginButton plugin={grayScalePlugin} name={<span class="hljs-string">"Grayscale"</span>} /&gt;
</code></pre>
<p>Clicking on the button won't work yet though, because we're not adding the plugin to the video track. Let's see how to do that in the next section.</p>
<h3 id="heading-making-the-button-functional">Making the button functional</h3>
<p>Check out the full commit <a target="_blank" href="https://github.com/triptu/100ms-face-filters/commit/9f129591907847283185ae1c29d093557da14b90">here</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1635611462076/B1mft3uOV.gif" alt="working.gif" /></p>
<p>With some help from the <a target="_blank" href="https://docs.100ms.live/javascript/v2/plugins/custom-video-plugins#adding-and-removing-plugins">documentation</a>, we can make our button component functional using the hooks exposed by the SDK. There are two hooks we need to implement our toggle function -</p>
<ol>
<li><code>useHMSStore</code> for knowing the current state i.e. whether the plugin is currently part of the video track.</li>
<li><code>useHMSActions</code> to get access to the methods for adding and removing the plugin.</li>
</ol>
<pre><code class="lang-jsx"><span class="hljs-keyword">import</span> {
  selectIsLocalVideoPluginPresent,
  useHMSActions,
  useHMSStore,
} <span class="hljs-keyword">from</span> <span class="hljs-string">"@100mslive/react-sdk"</span>;

<span class="hljs-keyword">export</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">PluginButton</span>(<span class="hljs-params">{ plugin, name }</span>) </span>{
  <span class="hljs-keyword">const</span> isPluginAdded = useHMSStore(
    selectIsLocalVideoPluginPresent(plugin.getName())
  );
  <span class="hljs-keyword">const</span> hmsActions = useHMSActions();

  <span class="hljs-keyword">const</span> togglePluginState = <span class="hljs-keyword">async</span> () =&gt; {
    <span class="hljs-keyword">if</span> (!isPluginAdded) {
      <span class="hljs-keyword">await</span> hmsActions.addPluginToVideoTrack(plugin);
    } <span class="hljs-keyword">else</span> {
      <span class="hljs-keyword">await</span> hmsActions.removePluginFromVideoTrack(plugin);
    }
  };

  <span class="hljs-keyword">return</span> (
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">button</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"btn"</span> <span class="hljs-attr">onClick</span>=<span class="hljs-string">{togglePluginState}</span>&gt;</span>
      {`${isPluginAdded ? "Remove" : "Add"} ${name}`}
    <span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span></span>
  );
}
</code></pre>
<h3 id="heading-voila">Voilà!</h3>
<p>And that's it, our button is functional now. Everything works and looks amazing. To recap, we wrote a grayscale filter from scratch which transforms our video for everyone in the room.</p>
<p><img src="https://c.tenor.com/ZlbaQ4CijJMAAAAC/theo-theo-lawrence.gif" alt="We did it" /></p>
<p>You can go on from here to add more filters (e.g. <a target="_blank" href="https://stackoverflow.com/questions/1061093/how-is-a-sepia-tone-created">sepia</a>, saturation, contrast), or experiment with other image processing algorithms to explore the possibilities. Check out <a target="_blank" href="https://ai.stanford.edu/~syyeung/cvweb/tutorial1.html">this</a> and <a target="_blank" href="https://towardsdatascience.com/image-processing-and-pixel-manipulation-photo-filters-5d37a2f992fa">this</a> for some starting points. We'll talk about creating an AR filter in upcoming parts, which will build upon the fundamentals learned here.</p>
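<p>As a quick starting point for sepia, here's what the per-pixel transform could look like, using the commonly cited weight matrix from the StackOverflow answer linked above (a sketch only; the function name is illustrative). It slots into a plugin's pixel loop the same way the luma formula did:</p>

```javascript
// Sepia per-pixel transform: each output channel is a weighted sum of the
// input RGB, clamped to 255 (weights from the commonly used sepia matrix)
function sepiaPixel(r, g, b) {
  const clamp = (v) => Math.min(255, Math.floor(v));
  return [
    clamp(r * 0.393 + g * 0.769 + b * 0.189),
    clamp(r * 0.349 + g * 0.686 + b * 0.168),
    clamp(r * 0.272 + g * 0.534 + b * 0.131),
  ];
}

// Pure white picks up a warm tint: red and green clamp, blue drops a little
console.log(sepiaPixel(255, 255, 255)); // [255, 255, 238]
```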
]]></content:encoded></item></channel></rss>