GSoC 2017 Second Phase Summary


A summary of my progress on implementing the face tracking mechanism in Jitsi Meet during the second phase of GSoC 2017.


In the second phase of GSoC (Google Summer of Code), I substantially improved the face tracking mechanism in Jitsi Meet. Specifically, for the tracking.js library used for face recognition, I added essential new features to make it npm compatible and to significantly improve performance when tracking large videos. Building on the face tracking mechanism implemented in the first phase, I improved the code structure by moving FaceTracker to middleware, fixed some bugs such as (bug link), and added new features such as full-screen support.

Here are the main results of my work in the second phase:

Modifications of tracking.js

tracking.js is a popular computer vision library for the web. However, after using it in my project for a while, I found some issues:

  1. It lacks support for npm. Although tracking.js is currently published on npm, it doesn't support CommonJS or ES6 modules, so the library cannot be used directly in Jitsi Meet.
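To illustrate the problem, here is a minimal sketch of the kind of export wrapper such a browser-global library needs in order to be consumed via `require` or `import`. The `tracking` namespace and `ObjectTracker` name mirror the library's public API, but the wrapper itself is illustrative, not the actual patch I submitted.

```javascript
// Sketch: a library that historically attached itself to the browser's
// global scope, made loadable as a CommonJS module so bundlers
// (webpack, browserify) used by Jitsi Meet can import it.
const tracking = {};

// ...the library's code attaches its constructors to the `tracking`
// namespace; ObjectTracker shown here as a stand-in...
tracking.ObjectTracker = function ObjectTracker(types) {
  this.types = types;
};

// Export for CommonJS environments; harmless in a plain <script> tag.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = tracking;
}
```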

  2. It has a performance issue when tracking large videos. I found that when the video is in full-screen mode on my 13'' MacBook Air, each face recognition pass took about 2 seconds.
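The fix for the second issue is auto-scaling: run detection on a downscaled copy of the frame, then map the detected rectangles back to full-resolution coordinates. Below is a hedged sketch of that idea; the 320px target width and the function names are assumed tuning values for illustration, not tracking.js's actual internals.

```javascript
// Assumed target width for the downscaled detection frame.
const TARGET_WIDTH = 320;

// Compute the shrink factor for a frame; never upscale small videos.
function getScale(videoWidth) {
  return Math.min(1, TARGET_WIDTH / videoWidth);
}

// Detection ran on a frame shrunk by `scale`; divide each coordinate
// to recover the rectangle in the original video's resolution.
function mapRectToFullSize(rect, scale) {
  return {
    x: rect.x / scale,
    y: rect.y / scale,
    width: rect.width / scale,
    height: rect.height / scale
  };
}
```

In the browser this pairs with drawing the `<video>` element onto a small canvas via `drawImage` before handing the pixels to the detector, so the detector's work shrinks quadratically with the scale factor.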

To solve these problems, my original idea was to completely rewrite the library in ES6, which I did in a new library named jstracking. However, one of my mentors, Saúl, recommended against this approach, because it loses the commit history of the original library and introduces too many changes to merge back into the tracking.js mainline. (Thanks a lot for his advice!) So in the end, I made the minimal changes needed to solve the issues and sent several pull requests to the official tracking.js library. Until those requests are merged, I'll use the jitsi branch of my forked repository in my work; this branch contains all the features I need.

Structure of Face Tracking in Jitsi Meet

The structure of my face tracking code also changed a lot. In my original design, shown below, both FaceTracker and FacePrompt are React components.

Structure of original design

The VideoInputPreview component renders FaceTracker, which in turn renders FacePrompt. The problem with this structure, pointed out by Yana, is that FaceTracker has no meaning for UI rendering; it acts only as a functional layer above the FacePrompt component. It is therefore better to move FaceTracker into a middleware module, which leads to my current design.

Structure of current design

In my current design, FaceTracker is a plain module used by the middleware. VideoInputPreview is responsible for dispatching actions to add a new FaceTracker instance and to enable it when appropriate. These actions are intercepted by the middleware, where FaceTracker instances are created and invoked. To make this more maintainable, I created a FaceTrackerFactory class to manage the collection of FaceTracker instances.
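The design above can be sketched as a Redux-style middleware. Note that the action type names, the `FaceTrackerFactory` API shape, and the tracker internals below are assumptions for illustration; they are not Jitsi Meet's actual code.

```javascript
// Minimal stand-in for the tracker managed by the middleware.
class FaceTracker {
  constructor(id) {
    this.id = id;
    this.enabled = false;
  }
  enable() {
    this.enabled = true;
  }
}

// Factory that owns the collection of trackers, keyed by an id.
class FaceTrackerFactory {
  constructor() {
    this._trackers = new Map();
  }
  getTracker(id) {
    if (!this._trackers.has(id)) {
      this._trackers.set(id, new FaceTracker(id));
    }
    return this._trackers.get(id);
  }
}

const factory = new FaceTrackerFactory();

// Redux-style middleware: intercepts the actions dispatched by
// VideoInputPreview and manages tracker instances here, keeping the
// UI components free of tracking logic.
const faceTrackingMiddleware = store => next => action => {
  switch (action.type) {
    case 'ADD_FACE_TRACKER':
      factory.getTracker(action.id);
      break;
    case 'ENABLE_FACE_TRACKER':
      factory.getTracker(action.id).enable();
      break;
  }
  return next(action);
};
```

The point of the middleware layer is that the React components only describe intent (add, enable) via actions, while the lifecycle of tracker instances lives entirely outside the render tree.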

After a FaceTracker instance is enabled and a bad position is detected for a while, the showPrompt action is dispatched, and the corresponding FacePrompt instance updates its UI.
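The "bad position for a while" check can be sketched as a small debounce around the detection callback. The 2000ms threshold, the action shape, and the helper name are illustrative assumptions, not the values used in my implementation.

```javascript
// Assumed threshold: how long the face must stay off-center before
// the prompt is shown.
const BAD_POSITION_THRESHOLD_MS = 2000;

// Returns a callback to be invoked on every detection result. Only
// dispatches the prompt action once the bad position has persisted.
// `now` is injectable to keep the logic testable.
function createPromptTrigger(dispatch, now = Date.now) {
  let badSince = null;
  return function onDetection(isBadPosition) {
    if (!isBadPosition) {
      badSince = null; // position recovered; reset the timer
      return;
    }
    if (badSince === null) {
      badSince = now(); // first bad frame; start timing
    } else if (now() - badSince >= BAD_POSITION_THRESHOLD_MS) {
      dispatch({ type: 'SHOW_PROMPT' });
      badSince = null; // avoid re-dispatching every frame
    }
  };
}
```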

Future Work

The todos below are listed in descending order of priority.

  1. Fix bugs and improve the current implementation. There is still much room for improvement in my current design, such as improving the code structure, polishing comments, and removing unnecessary debugging code.

  2. Create a pull request for auto-scaling to tracking.js. Even though I have implemented all the necessary features in the jitsi branch, I haven't sent a pull request for auto-scaling to the tracking.js library yet.

  3. Maybe work on another project. If time permits, I plan to work on another project, following Yana's instructions. ^_^
