We collected IVOS for the purpose of benchmarking (I)nteractive (V)ideo (O)bject (S)egmentation in the human-robot interaction (HRI) setting. The dataset was collected in two different settings:

Human teaching setting

[Figure: sample teaching frames showing translation, scale, and rotation of the taught objects]

The final teaching videos contain ∼50,000 frames for 12 object categories, with a total of 36 instances across these categories. Detection crops are provided for all frames.
Segmentation annotations are currently provided for 20 instances, totaling ∼18,000 segmentation masks.
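Since segmentation masks cover only 20 of the 36 instances, code consuming the dataset has to handle frames without annotations. The sketch below shows one way to pair frames with masks by matching instance and frame identifiers; the directory layout and file names are hypothetical, not the dataset's actual structure:

```python
from pathlib import PurePosixPath

def pair_frames_with_masks(frames, masks):
    """Pair each frame with its segmentation mask, or None when the
    instance has no annotation. Matching is done on the (instance,
    frame-stem) key, which assumes frames and masks share file stems."""
    mask_index = {
        (PurePosixPath(m).parts[0], PurePosixPath(m).stem): m for m in masks
    }
    pairs = []
    for f in frames:
        p = PurePosixPath(f)
        pairs.append((f, mask_index.get((p.parts[0], p.stem))))
    return pairs

# Hypothetical file lists: bottle_01 is annotated, mug_03 is not.
frames = [
    "bottle_01/frames/000001.jpg",
    "bottle_01/frames/000002.jpg",
    "mug_03/frames/000001.jpg",
]
masks = ["bottle_01/masks/000001.png"]
pairs = pair_frames_with_masks(frames, masks)
```

Here `pairs` holds one tuple per frame, with `None` in place of a mask for the unannotated instance, so downstream training code can filter or weight accordingly.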

Manipulation tasks setting

[Figure: robot during a teaching/manipulation session]

The dataset covers four main manipulation tasks, cutting, pouring, stirring, and drinking, performed in both robot and human manipulation sessions.
It contains ∼8,984 frames from robotic manipulation sessions, covering a total of 56 tasks with different objects and configurations.
Segmentation annotations are provided for the main objects of interest. The robot trajectories are also recorded, to enable further research on learning these trajectories from visual cues.
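To learn trajectories from visual cues, the recorded robot trajectories must first be aligned with the video frames. Since the source does not specify the recording rates or format, the sketch below assumes hypothetical timestamps (a ∼30 Hz camera and 100 Hz joint logging) and aligns each frame to its nearest-in-time trajectory sample:

```python
import bisect

def nearest_trajectory_index(traj_times, t):
    """Index of the trajectory sample closest in time to frame time t.
    traj_times must be sorted ascending."""
    i = bisect.bisect_left(traj_times, t)
    if i == 0:
        return 0
    if i == len(traj_times):
        return len(traj_times) - 1
    # Pick whichever neighbor is closer to t.
    return i if traj_times[i] - t < t - traj_times[i - 1] else i - 1

# Hypothetical timestamps in seconds (not from the dataset).
frame_times = [0.000, 0.033, 0.066]          # ~30 Hz camera frames
traj_times = [i * 0.01 for i in range(10)]   # 100 Hz joint-state log
aligned = [nearest_trajectory_index(traj_times, t) for t in frame_times]
```

Each entry of `aligned` indexes the trajectory sample to supervise the corresponding frame with; binary search keeps the alignment O(log n) per frame.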
The human manipulation setting covers 11 tasks similar to the robot tasks.