Video Class Agnostic Segmentation

Video Class Agnostic Segmentation Benchmark


Video class agnostic segmentation is the task of segmenting objects without regard to their semantics, combining appearance, motion and geometry cues from monocular video sequences. The main motivation is to account for unknown objects in the scene and to provide a redundant signal alongside the segmentation of known classes for better safety, as shown in the following figure.

There are two main formulations of this problem, and we provide a benchmark with one track for each.

Motion Segmentation Track


This track poses the problem as segmenting moving objects, such as animals crossing the street or unusual construction vehicles.

To that end, we provide an improved dataset for motion instance segmentation, where we mainly focus on increasing the number of sequences and categories to avoid overfitting to a particular semantic class of moving objects. We build upon the publicly available Cityscapes-VPS[6] and KITTI-MOTS[7] datasets. We further provide baselines for real-time joint panoptic and motion instance segmentation, which are publicly released under Models.
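As a rough illustration of what class-agnostic motion instance labels look like, the sketch below keeps only the instances marked as moving and discards their semantic class. The label layout (a 2D map of instance IDs with 0 as background, plus a set of moving IDs) and the function names are assumptions made for this example, not the format of the released dataset.

```python
from typing import List, Set

Mask = List[List[int]]  # hypothetical layout: rows of per-pixel instance IDs, 0 = background


def motion_instance_mask(instance_map: Mask, moving_ids: Set[int]) -> Mask:
    """Keep the instance ID of every moving object; set all other pixels to 0."""
    return [[pix if pix in moving_ids else 0 for pix in row] for row in instance_map]


def binary_motion_mask(instance_map: Mask, moving_ids: Set[int]) -> Mask:
    """Collapse moving instances into a single moving (1) vs. static (0) mask."""
    return [[1 if pix in moving_ids else 0 for pix in row] for row in instance_map]


# Toy 3x3 frame with three instances; instances 1 and 3 are moving.
instance_map = [
    [0, 1, 1],
    [2, 2, 0],
    [0, 3, 3],
]
moving = {1, 3}

motion_instances = motion_instance_mask(instance_map, moving)
motion_binary = binary_motion_mask(instance_map, moving)
```

The instance-level mask keeps moving objects separable from each other, while the binary mask corresponds to plain moving-object segmentation without instances.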

| Dataset | #Frames | #Seqs | #Cats | Instances | Panoptic | Tracking | Annotation Type |
|---|---|---|---|---|---|---|---|
| DAVIS[1] | 6208 | 90 | 78 | Yes | No | Yes | Manually Labelled |
| Kitti-Motion[5] | 455 | - | 1 | No | No | No | Manually Labelled |
| Kitti-MoSeg[2][3] | 12919 | 38 | 1 | No | No | No | Weak Annotations |
| Cityscapes-Motion[5] | 3475 | - | 1 | Yes | No | No | Manually Labelled |
| Kitti-MoSeg Extended[4] | 12919 | 38 | 5 | Yes | No | No | Weak Annotations |
| Ours | 11008 | 520 | 8 | Yes | Yes | Yes | Manually Labelled |

Figure panels: Cityscapes-VPS Motion, Kitti-MOTS Motion

[1] Sergi Caelles, Jordi Pont-Tuset, Federico Perazzi, Alberto Montes, Kevis-Kokitsi Maninis, and Luc Van Gool. The 2019 DAVIS challenge on VOS: Unsupervised multi-object segmentation. arXiv preprint arXiv:1905.00737, 2019.
[2] Mennatullah Siam, Heba Mahgoub, Mohamed Zahran, Senthil Yogamani, Martin Jagersand, and Ahmad El-Sallab. Modnet: Moving object detection network with motion and appearance for autonomous driving. arXiv preprint arXiv:1709.04821, 2017.
[3] Hazem Rashed, Mohamed Ramzy, Victor Vaquero, Ahmad El Sallab, Ganesh Sistu, and Senthil Yogamani. Fusemodnet: Real-time camera and lidar based moving object detection for robust low-light autonomous driving. In The IEEE International Conference on Computer Vision (ICCV) Workshops, Oct 2019.
[4] Eslam Mohamed, Mahmoud Ewaisha, Mennatullah Siam, Hazem Rashed, Senthil Yogamani, and Ahmad El-Sallab. Instancemotseg: Real-time instance motion segmentation for autonomous driving. arXiv preprint arXiv:2008.07008, 2020.
[5] Johan Vertens, Abhinav Valada, and Wolfram Burgard. Smsnet: Semantic motion segmentation using deep convolutional neural networks. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, 2017.
[6] Dahun Kim, Sanghyun Woo, Joon-Young Lee, and In So Kweon. Video panoptic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9859–9868, 2020.
[7] Paul Voigtlaender, Michael Krause, Aljosa Osep, Jonathon Luiten, Berin Balachandar Gnana Sekar, Andreas Geiger, and Bastian Leibe. Mots: Multi-object tracking and segmentation. In Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

Open-set Segmentation Track


This track poses the problem as segmenting classes outside the closed set of known classes. Thus, it can segment unknown static objects, such as traffic warnings or other rare objects in parking lots near markets.

We build custom Carla scenarios and provide synthetic data for open-set segmentation with fine-grained class annotations for the unknown objects. The fine-grained annotations make it possible to run controlled experiments on which objects are labelled as unknown during training versus testing, in order to understand the generalization ability of the model. To collect large-scale data, the Carla basic agent is modified to cope with unknown objects on the road and avoid them by changing lanes. A large-scale dataset of approximately 70,000 frames is collected with random traffic and weather conditions across different towns and scenarios.
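The controlled train/test experiments described above can be sketched as a label remapping step. The snippet below is a minimal illustration under stated assumptions: the class IDs, the `UNKNOWN` and `IGNORE` values, and the choice to ignore held-out unknown objects during training are all hypothetical and are not taken from the dataset, which has not been released yet.

```python
from typing import List, Set

LabelMap = List[List[int]]  # hypothetical layout: rows of per-pixel fine-grained class IDs

UNKNOWN = 255  # assumed ID for the single merged "unknown" class
IGNORE = 254   # assumed ID for pixels excluded from the loss or the metric


def remap_open_set(label_map: LabelMap, known_ids: Set[int],
                   active_unknown_ids: Set[int]) -> LabelMap:
    """Keep known classes, merge the active unknown classes into UNKNOWN,
    and mark every other fine-grained class as IGNORE."""
    remapped = []
    for row in label_map:
        new_row = []
        for cls in row:
            if cls in known_ids:
                new_row.append(cls)
            elif cls in active_unknown_ids:
                new_row.append(UNKNOWN)
            else:
                new_row.append(IGNORE)
        remapped.append(new_row)
    return remapped


# Toy example: classes 1 and 2 are known; 10, 11 and 12 are fine-grained unknown objects.
labels = [
    [1, 2, 10],
    [11, 1, 12],
]
known = {1, 2}
train_unknown = {10, 11}  # unknown objects seen (as "unknown") during training
test_unknown = {12}       # unknown objects held out for testing only

train_labels = remap_open_set(labels, known, train_unknown)
test_labels = remap_open_set(labels, known, test_unknown)
```

Evaluating only on the held-out unknown classes, as in `test_labels`, is one way to measure how well a model generalizes to objects it never saw labelled as unknown; other splits are equally possible with the same fine-grained annotations.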

Figure panels: Construction, Parking, Barrier

Dataset Coming Soon ...