A summary of my progress implementing the face tracking mechanism in Jitsi Meet during the second phase of GSoC 2017.
In the second phase of GSoC (Google Summer of Code), I significantly improved the face tracking mechanism in Jitsi Meet. Specifically, I added essential new features to the tracking.js library used for face recognition, making it npm compatible and greatly improving its performance when tracking large videos. Building on the face tracking mechanism implemented in the first phase, I improved the code structure by moving FaceTracker to middleware, fixed some bugs such as (bug link), and added new features such as full-screen support.
Here, I list the results of my recent work in the second phase:
A report of my progress in the second phase - this blog post.
Modifications of tracking.js
tracking.js is a very popular computer vision library for the web. However, after using it in my project for a while, I found some issues:
It lacks npm support. Even though tracking.js is currently published on npm.js, it doesn't support CommonJS or ES6 modules, so the library cannot be used directly in Jitsi Meet.
It has a performance issue when tracking large videos. I found that when the video was in full-screen mode on my 13'' MacBook Air, each face recognition pass took about 2 seconds.
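The first issue boils down to how the library is exposed to consumers. A minimal sketch of the idea behind my npm-support change is a UMD-style wrapper that exports the library for CommonJS while preserving the original browser global; the structure below is illustrative, not the actual tracking.js source:

```javascript
// Sketch: expose a browser-global style library through module.exports as
// well, so bundlers (webpack, as used by Jitsi Meet) can require() it.
// The version string and internals here are placeholders.
const tracking = (function (root, factory) {
    const lib = factory();

    if (typeof module === 'object' && module.exports) {
        // CommonJS consumers: const tracking = require('tracking');
        module.exports = lib;
    } else {
        // Browser global, the only style the library supported originally.
        root.tracking = lib;
    }

    return lib;
}(typeof self !== 'undefined' ? self : globalThis, function () {
    const tracking = {};

    // ...the library's code would attach its API to `tracking` here...
    tracking.version = '1.1.3'; // placeholder

    return tracking;
}));
```

The key design point is that the wrapper only changes how the library is packaged, not how it works, which keeps the diff against upstream small and mergeable.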
To solve these problems, my original idea was to completely rewrite the library in ES6, which I did in a new library named jstracking. However, one of my mentors, Saúl, advised against this approach, because it loses the commit history of the original library, and the changes are too extensive to be merged back into the tracking.js mainline. (Thanks a lot for his advice!) So in the end, I made the minimal changes needed to solve the issues and sent several pull requests to the official tracking.js library. Until those requests are merged, I'll use the jitsi branch of my forked repository in my work; this branch contains all the features I need.
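For the performance issue, the core idea of auto-scaling can be sketched as follows: run detection on a downscaled copy of each frame, then map the detected rectangles back to the original video's coordinates. The function names, the 320px working width, and the rect shape below are illustrative assumptions, not the actual tracking.js API:

```javascript
// Sketch of the auto-scaling idea: large frames are shrunk before
// detection, since detection cost grows with pixel count, and results
// are projected back to full-size coordinates afterwards.
const MAX_WORKING_WIDTH = 320; // detect on frames at most this wide (assumed)

function getScale(videoWidth) {
    // Only shrink large videos; never upscale small ones.
    return Math.min(1, MAX_WORKING_WIDTH / videoWidth);
}

function mapRectToVideo(rect, scale) {
    // A rect found on the scaled-down frame, projected back to the
    // original video's coordinate space.
    return {
        x: rect.x / scale,
        y: rect.y / scale,
        width: rect.width / scale,
        height: rect.height / scale
    };
}

// Example: a 1280px-wide full-screen video is detected at 320px wide,
// so a face found at (40, 30, 60x60) maps back to (160, 120, 240x240).
const scale = getScale(1280); // 0.25
const face = mapRectToVideo({ x: 40, y: 30, width: 60, height: 60 }, scale);
```

Detecting on a quarter-width frame processes roughly 1/16 of the pixels, which is why scaling has such a large effect on full-screen tracking cost.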
Structure of Face Tracking in Jitsi Meet
There have also been many changes in the structure of my face tracking code. In my original design, shown below, both
FaceTracker and
FacePrompt are React components.
Structure of original design
The
VideoInputPreview component renders
FaceTracker and
FacePrompt in sequence. The problem with this structure, pointed out by Yana, is that
FaceTracker has no meaning for UI rendering; it's just a functional layer above the
FacePrompt component. So it's better to move
FaceTracker to a middleware module, which leads to my current design.
Structure of current design
In my current design,
FaceTracker is just a module used in middleware.
VideoInputPreview takes charge of adding a new
FaceTracker instance, and enabling it when appropriate. These actions are intercepted by the middleware, where
FaceTracker instances are created and invoked. To make the code more maintainable, I created a
FaceTrackerFactory class to manage the collection of
FaceTracker instances. When a
FaceTracker instance is enabled, and a bad position is detected for a while, the
showPrompt action will be dispatched, and the corresponding
FacePrompt instance will update its UI.
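The middleware design above can be sketched in Redux terms as follows. The action types, the FaceTrackerFactory API, and the shape of the actions are illustrative assumptions; the real Jitsi Meet code differs:

```javascript
// Sketch: UI components only dispatch plain actions; FaceTracker
// bookkeeping lives in middleware, outside the rendering layer.
// Action types and the factory API below are hypothetical.
const ADD_FACE_TRACKER = 'ADD_FACE_TRACKER';
const SET_FACE_TRACKER_ENABLED = 'SET_FACE_TRACKER_ENABLED';

class FaceTrackerFactory {
    constructor() {
        this._trackers = new Map();
    }

    add(id, videoElement) {
        this._trackers.set(id, { videoElement, enabled: false });
    }

    setEnabled(id, enabled) {
        const tracker = this._trackers.get(id);

        if (tracker) {
            tracker.enabled = enabled;
        }
    }

    isEnabled(id) {
        const tracker = this._trackers.get(id);

        return Boolean(tracker && tracker.enabled);
    }
}

const factory = new FaceTrackerFactory();

// Redux-style middleware: intercept the actions dispatched by
// VideoInputPreview and drive the trackers here.
const faceTrackingMiddleware = store => next => action => {
    switch (action.type) {
    case ADD_FACE_TRACKER:
        factory.add(action.id, action.videoElement);
        break;

    case SET_FACE_TRACKER_ENABLED:
        factory.setEnabled(action.id, action.enabled);
        break;
    }

    return next(action);
};
```

The benefit of this split is that the React components stay purely declarative, while the factory gives one place to create, enable, and tear down trackers.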
The todos below are listed in descending priority order.
Fix bugs and improve the current implementation. There is still much room for improvement in my current design, such as improving the code structure, polishing comments, removing unnecessary debugging code, etc.
Create a pull request for auto-scaling to tracking.js. Even though I have implemented all the necessary features in the jitsi branch, I haven't yet sent a PR for auto-scaling to the tracking.js library.
Maybe work on another project. If time permits, I plan to work on another project, following Yana's instructions.^_^