A significant challenge in teaching machines to automatically analyze video and glean object-related insights from it is efficiently and accurately preparing the large numbers of examples needed to train and evaluate models. With frame rates around 30 to 60 fps, accurately labelling objects in even short spans of video can be extremely time-consuming and expensive.
Today, we have the pleasure of introducing you to Vannot—an open source, web-based, easy-to-integrate video annotation tool we created to help efficiently annotate objects for use in machine learning tasks like video segmentation and imagery quantification. Vannot takes advantage of the similarity of nearby frames to enable efficient object annotation in a web context with geographically distributed labelers.
We took inspiration from some of the industry's most venerable drawing and illustration applications, and reframed their ideas around the workflows involved in annotating large amounts of video data. It is easy, for example, to advance a few frames or seconds and carry over the most recent shapes and annotations, so that all you have to do each time is make a few small adjustments. More advanced features are available as well: it's possible to group adjacent or disjoint shapes into the same instance if, for example, an object is composed of many parts or is obscured behind some interloper.
We're excited for you to use Vannot in your own efforts, and we hope you'll contribute back to this free and open source project. We've strived to make it very easy to integrate — Vannot is just a webpage: HTML, CSS, and JavaScript. You configure it entirely through the URL you use to load the page. More information on using, integrating, and developing Vannot can be found on GitHub at github.com/xyonix/vannot.
Have a look at the video below to see Vannot in action. In it, Vannot designer and sailor Issa Tseng walks through the preparation of sailing-related training data, such as segmentation of hull, jib, and mainsail objects.