Interactive Video Object Segmentation using an HRI Dataset.
Description of IVOS Dataset
We collected IVOS to benchmark (I)nteractive (V)ideo (O)bject (S)egmentation in the human-robot interaction (HRI) setting. The dataset was collected in two different settings:
Human teaching objects.
Manipulation tasks setting.
Human teaching objects
The human shows the object to the robot in different poses while verbally teaching it the corresponding label. The robot is required to learn to segment the object without an explicit manual segmentation mask, and then to segment that object in other scenes for manipulation.
Manipulation tasks setting
Different tasks performed by the robot or a human are recorded and annotated with segmentation masks. The robot tasks are recorded along with the corresponding robot trajectories to aid further research in robot manipulation. The manipulation tasks focus on food preparation for assistive robotics that can aid the elderly in unstructured environments.
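For concreteness, below is a minimal sketch of loading one frame and its segmentation mask from a manipulation sequence. The directory layout, file names, and mask encoding here are illustrative assumptions, not the dataset's actual structure.

# Minimal loading sketch -- the directory layout and file naming below are
# assumptions for illustration, not the dataset's actual structure.
from pathlib import Path

import numpy as np
from PIL import Image

def load_frame_and_mask(sequence_dir, frame_idx):
    """Load one RGB frame and its per-pixel object-ID mask (hypothetical layout)."""
    seq = Path(sequence_dir)
    frame = np.array(Image.open(seq / "frames" / f"{frame_idx:05d}.jpg"))
    mask = np.array(Image.open(seq / "masks" / f"{frame_idx:05d}.png"))
    return frame, mask  # mask: integer object IDs, 0 assumed to be background

# Example usage with a hypothetical sequence path:
# frame, mask = load_frame_and_mask("IVOS/manipulation/pouring_01", 0)
# object_ids = np.unique(mask)[1:]  # skip background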
Why IVOS Dataset?
Compared to other datasets collected in the HRI setting, IVOS is unique in the following ways:
Segmentation annotations are provided for the teaching and manipulation sequences.
Manipulation tasks for food preparation such as stirring, cutting, pouring and drinking are recorded with the corresponding robot trajectories, and segmentation annotation for the objects being manipulated is provided.
It enables ablation studies on the effect of different transformations, such as translation, scale, planar rotation, and out-of-plane rotation, on incrementally learned object segmentation (a sketch of such an evaluation is given below).
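As one way of running the transformation ablation mentioned in the last item, the sketch below computes mean IoU per transformation category. The category tags and the way results are grouped are assumptions; only the intersection-over-union computation itself is standard.

# Sketch of a transformation ablation: mean IoU per transformation category.
# The category names and the tagging of sequences are assumptions; the IoU
# computation is the standard intersection-over-union for binary masks.
from collections import defaultdict

import numpy as np

def binary_iou(pred, gt):
    """IoU between two boolean masks of the same shape."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 1.0

def ablation_miou(results):
    """results: list of (category, predicted_mask, ground_truth_mask) tuples,
    where category is e.g. 'translation', 'scale', 'planar_rotation',
    or 'out_of_plane_rotation' (hypothetical tags)."""
    per_category = defaultdict(list)
    for category, pred, gt in results:
        per_category[category].append(binary_iou(pred, gt))
    return {c: float(np.mean(v)) for c, v in per_category.items()}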
What's the HRI setup?
Team Alberta was one of the five finalists in the KUKA Innovation Challenge, with the goal of incrementally teaching the robot about the unstructured environment it operates in. We performed online learning of novel tools and their corresponding motion tasks, where the human teaches the robot about the novel items to manipulate [1]. Watch the demo video to see the human-robot interaction setting: