Mennatullah Siam, PhD, PEng

Currently, in transition between positions

Calendar, Scholar, LinkedIn, GitHub, Email: menna.seyam@gmail.com, Academic CV

Research Highlights, Teaching, My PhD Thesis

I started as an assistant professor at Ontario Tech University in 2023, leading the Image and Video Understanding (IVU) lab, and as an affiliate professor at UBC, funded by NSERC among other Canadian agencies. My research interests include pixel-level scene and video understanding, data-efficient learning, interpretability and responsible AI. Previously, I was a postdoctoral researcher working with Professor Richard Wildes at York University, 2021-2023. I was also a Vector affiliate in 2022. I obtained my PhD in 2021 under the supervision of Professor Martin Jagersand, working on vision for robotics. My thesis focused on learning video object segmentation from limited labelled data, where I worked at the intersection of video object segmentation and few-shot object segmentation, with applications to both autonomous driving and robot manipulation. I was a member of a team of four in the KUKA Innovation Challenge 2018, where our team received a finalist award. Before that, I finished my MSc at NU and my BSc at Ain Shams University, Egypt.
Research Interests: Computer Vision, Deep Learning, Robotics, Few-shot Learning, Foundation Models, Video Understanding, Interpretability, Responsible AI.
Research Statement: This statement only provides a rough description of my research program without delving into details.
I am heavily promoting "No to Killing children and civilians", "No to Genocide!", "No to undermining Human Rights for some money that will be gone anyway within a limited lifetime". This is an integral component of my teaching and research, to regain humanity in fast-paced AI research.
I also support African researchers who are passionate about learning the Computer Vision field, through founding the Ro'ya community, Twitter.
Note: I don't sleep with anyone! So please do not extend work offers or affiliations that are tied to that, because it won't happen! My apologies for being explicit, but we have entered dark days in which abnormal things are being normalized.

Talks and Recognitions

"Learning Image and Video Understanding with Limited Labelled Data", Black in AI Keynote, NeurIPS 2022.
"From Image to Video Understanding, what to Consider?", University of British Columbia, 2022.
"On the Intersection of Few-shot and Video Object Segmentation", Doctoral Consortium, CVPR 2021.
"Few-shot Learning Tutorial", Samsung AI, 2022.
"Segmentation with Transformers Tutorial", Ro'ya Workshop, Deep Learning Indaba, Accra, Ghana, 2024.
KUKA Innovation Finalist Award, 2018.
Outstanding Reviewing in ICCV 2023.
PhD/Postdoc scholarships: VISTA Fellowship, AITF, Verna Tate Graduate Scholarship, Alberta Graduate Excellence Scholarship.

News

Jan 2026: I am an IJCAI 2026 Senior PC.
Jan 2026: Our work with UBC on "Segmentation From Attention: Training-Free Layer Selection and One-Shot Tuning for Segmentation in VLMs" has been accepted to TMLR. Congratulations to Rayat, and also on recently finishing his PhD.
Dec 2025: Our work on "Pixel-level Understanding of a World in Motion within a Neural Encoding Framework" has been accepted to Nature Scientific Reports, Motion Perception special issue. Congratulations to Mai and the team, and to Mai on her PhD graduation.
June 2025: I am a WACV 2026 Area Chair.
April 2025: Our work on building a vision-centric remote sensing benchmark, led by an AMMI/AIMS MSc student jointly supervised by me and Prof. Naoto, has been accepted to the Eval-FoMo workshop at CVPR 2025. Congrats to the team, and to him for finishing his MSc.
February 2025: My work on few-shot video object segmentation was accepted to IJCV 2025. I acknowledge the guidance of my postdoc supervisors, although they were not able to continue the work through to its final accepted form.
January 2025: I am an organizer of PixFoundation: 1st Workshop on Pixel-level Vision Foundation Models at CVPR 2025. You can follow us on Twitter for updates.
November 2024: Our work on TAM-VT video segmentation and tracking is accepted in WACV 2025. Our work with the RIKEN institute was also accepted in IEEE Geoscience and Remote Sensing Letters.
October 2024: Our work is accepted in the NeuroAI workshop and WiML, part of NeurIPS 2024.
September 2024: Our work is accepted in TPAMI, which was an extension of our CVPR 2022 paper.
August 2024: Our work on the current state of Computer Vision research in Africa is accepted in the JAIR special issue on Fairness and Bias in AI.
June 2024: I am a WACV 2025 Area Chair.
May 2024: I acquired the NSERC Alliance International grant, thanks to NSERC.
April 2024: Happy to announce that I acquired the Discovery grant and launch supplements for my IVU Lab on "Learning pixel-level video understanding". Postdocs and PhD students interested in applying can reach out by email.
March 2024: I am glad to announce that I am an affiliate assistant professor with the University of British Columbia, Canada.
March 2024: I am a supporting organizer of the first African Computer Vision Summer School, ACVSS, Nairobi, Kenya, co-located at Microsoft Research (MARI).
February 2024: One paper was accepted to CVPR 2024 on prompting pixel-level image understanding models, and our work on studying video understanding models from a neuroscience perspective is released on arXiv.

...

December 2023: I am co-organizing the 3rd workshop on L3D-IVU at CVPR 2024.
I am an outstanding reviewer in ICCV 2023.
July 2023: I started as an assistant professor at Ontario Tech University.
June 2023: I am a WACV 2024 Area Chair.
February 2023: Our paper on Multiscale Video Transformers for Video Object Segmentation is accepted in CVPR 2023.
December 2022: Co-organizing 2nd Workshop on L3D-IVU: Learning with Limited Labelled Data for Image and Video Understanding in CVPR 2023.
November 2022: I was a keynote speaker at the Black in AI workshop co-located with NeurIPS 2022, on Learning Scene and Video Understanding with Limited Labelled Data.
September 2022: I am a guest editor of the special issue on "Signal Processing and Machine Learning for Autonomous Driving" in the Remote Sensing journal.
April 2022: Gave a talk on few-shot learning and its extension beyond single images to videos in Samsung AI.
March 2022: Our paper on the interpretability of spatiotemporal models has been accepted in CVPR 2022.
December 2021: Co-organizing Workshop on L3D-IVU: Learning with Limited Labelled Data for Image and Video Understanding in CVPR 2022.
December 2021: Our short paper in the Machine Learning for Autonomous Driving Workshop at NeurIPS 2021 was accepted.
July 2021: Officially started my postdoc at York University under the supervision of Prof. Richard Wildes and Kostas Derpanis.
May 2021: I officially finished my PhD at the University of Alberta, with convocation in Fall 2021, Thesis.

IVU Lab - Snapshot May 2025

Open Positions

Thanks for your interest in joining my lab. I am currently not accepting students as I am relocating. I will post more details soon.

Graduate Students and Interns

Yousef Hesham (MSc student, Nile University)
Abduljaleel Adejumo (MSc student, AMMI/AIMS)
Mohamed Rashad (MSc intern)
Omid Reza Heidari (Research Engineer)

Postdocs and Researchers

Faegheh Yeganli (Research Scientist)
Maria Siddiqua (Postdoctoral Fellow, awaiting work permit)

Alumni

Leila Cheshmi (MEng student, Ontario Tech University - 2024)
Mai Gamal (Visiting PhD Student, GUC - Summer'2023, 2024)

Publications

2026

The power of one: A single example is all it takes for segmentation in VLMs.

Mir Rayat Hossain, Mennatullah Siam, Leonid Sigal, Jim Little

TMLR 2026 (Journal2Conference Track).

TMLR Paper Arxiv Paper

Pixel-level Understanding of a World in Motion within a Neural Encoding Framework.

Mai Gamal, Mohamed Rashad, Eman Ehab, Saif ElDawlatly, Mennatullah Siam

Nature Scientific Reports (Motion Perception Collection).

Paper (in-press)

2025

PixFoundation 2.0: Do Video Multi-Modal LLMs Use Motion in Visual Grounding?

Mennatullah Siam

Arxiv.

Project Webpage Paper Code

PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models?

Mennatullah Siam

Arxiv.

Project Webpage Paper Code Datasets

Temporal Transductive Inference for Fewshot Video Object Segmentation

Mennatullah Siam

IJCV 2025.

Paper Code Demo

TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking

Raghav Goyal, Wan-Cyuan Fan, Mennatullah Siam, Leonid Sigal

WACV 2025.

Paper Project Webpage Code

A Vision Centric Remote Sensing Benchmark.

Abduljaleel Adejumo*, Faegheh Yeganli*, Clifford Broni-bediako, Aoran Xiao, Naoto Yokoya+, Mennatullah Siam+

(* equally contributing, + equally advising)

Eval-FoMo Workshop, CVPR 2025, Non Archival.

Paper Dataset

Multiscale Video Transformers for Class Agnostic Segmentation in Autonomous Driving.

Leila Cheshmi, Mennatullah Siam

Arxiv.

Paper

2024

MEDVT++: A Unified Multiscale Encoder-Decoder Transformer for Video Segmentation

Rezaul Karim, He Zhao, Richard P. Wildes, Mennatullah Siam

Journal Extension Under Review.

Paper Project Webpage

Quantifying and Learning Static vs. Dynamic Information in Deep Spatiotemporal Networks

Matthew Kowal, Mennatullah Siam, Md Amirul Islam, Neil D. B. Bruce, Richard P. Wildes, Konstantinos G. Derpanis

TPAMI.

Paper

Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach

Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal, James J. Little

CVPR 2024.

Paper Code

Generalized Few-Shot Semantic Segmentation in Remote Sensing: Challenge and Benchmark

Clifford Broni-Bediako, Junshi Xia, Jian Song, Hongruixuan Chen, Mennatullah Siam, Naoto Yokoya

IEEE Geoscience and Remote Sensing Letters (accepted).

Paper

A Survey on African Computer Vision Datasets, Topics and Researchers

Abdul-Hakeem Omotayo*, Ashery Mbilinyi*, Lukman Ismaila*, Houcemeddine Turki, Mahmoud Abdien, Karim Gamal, Idriss Tondji, Yvan Pimi, Naome A. Etori, Marwa M. Matar, Clifford Broni-Bediako, Abigail Oppong, Mai Gamal, Eman Ehab, Gbetondji Dovonon, Zainab Akinjobi, Daniel Ajisafe, Oluwabukola G. Adegboro, Mennatullah Siam

JAIR - Fairness and Bias in AI Special Issue.

Paper Datasets List Code

System Identification of Neural Systems: Going Beyond Images to Modelling Dynamics

Mai Gamal, Mohamed Rashad, Eman Ehab, Saif ElDawlatly, Mennatullah Siam

Short Paper in NeuroAI Workshop NeurIPS 2024 & WiML workshop.

Paper

2023

MED-VT: Multiscale Encoder-Decoder Video Transformer with Application to Object Segmentation

Rezaul Karim, He Zhao, Richard P. Wildes, Mennatullah Siam

CVPR 2023.

Paper Project Webpage Code

Multiscale Memory Comparator Transformer for Few-Shot Video Segmentation

Mennatullah Siam, Rezaul Karim, He Zhao, Richard P. Wildes

Arxiv.

Paper Code

Towards a Better Understanding of the Computer Vision Research Community in Africa

Abdul-Hakeem Omotayo, Mai Gamal, Eman Ehab, Gbetondji Dovonon, Zainab Akinjobi, Ismaila Lukman, Houcemeddine Turki, Mahmod Abdien, Idriss Tondji, Abigail Oppong, Yvan Pimi, Karim Gamal, and Mennatullah Siam

EAAMO 2023.

Paper

Two-Stage Joint Transductive and Inductive Learning for Nuclei Segmentation

Hesham Ali, Idriss Tondji, Mennatullah Siam

ML4H Symposium 2023, Findings Track.

Paper

2022

A Deeper Dive into What Deep Spatiotemporal Networks Encode: Quantifying Static vs. Dynamic Information

Matthew Kowal, Mennatullah Siam, Md Amirul Islam, Neil D. B. Bruce, Richard P. Wildes, Konstantinos G. Derpanis

CVPR 2022.

Paper Video Demo Project Webpage Code Interpretability Code AVOS

2021

Temporal Transductive Inference for Fewshot Video Object Segmentation

Mennatullah Siam, Konstantinos G. Derpanis, Richard P. Wildes

ML4AD Workshop, NeurIPS 2021.

Full Paper Paper Video Demo

Video Class Agnostic Segmentation Benchmark for Autonomous Driving

Mennatullah Siam, Alex Kendall, Martin Jagersand

CVPR 2021 Workshops.

Paper Project Webpage

2020

Weakly Supervised Few-shot Object Segmentation using Co-attention with Visual and Semantic Embeddings

Mennatullah Siam*, Naren Doraiswamy*, Boris N. Oreshkin*, Hengshuai Yao, Martin Jagersand (equally contributing)

IJCAI 2020.

Paper

2019

AMP: Adaptive Masked Proxies for Few-Shot Segmentation

Mennatullah Siam, Boris N. Oreshkin, Martin Jagersand

ICCV 2019.

Paper Code

Video Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting

Mennatullah Siam, Chen Jiang, Steve Lu, Laura Petrich, Mosta Gamal, Mohamed Elhoseiny, Martin Jagersand

ICRA 2019.

Paper Dataset

Online Object and Task Learning via Human Robot Interaction

Masood Dehghan*, Zichen Zhang*, Mennatullah Siam*, Jun Jin, Laura Petrich, Martin Jagersand (equally contributing)

ICRA 2019.

Paper Video Demo

2018

Real-time Segmentation with Appearance, Motion and Geometry

Mennatullah Siam, Sara Elkerdawy, Mostafa Gamal, Moemen Abdel-Razek, Martin Jagersand, Hong Zhang

IROS 2018.

Paper

Moving Object Detection Network for Autonomous Driving

Mennatullah Siam, Heba Mahgoub, Mohamed Zahran, Senthil Yogamani, Martin Jagersand, Ahmed El-Sallab

ITSC 2018.

Paper Dataset Video Demo Patent

Teaching

Ontario Tech University

Fall 2023, Fall 2024, ELEE2110 Discrete Mathematics, Undergraduate Course. Course Outline, Feedback
Winter 2024, SOFE4620 Machine Learning and Data Mining, Undergraduate Course. Course Outline, Feedback
Winter 2024, SOFE2715 Data Structures, Undergraduate Course.

Nile University

Spring 2023, CIT-670 Computer Vision, Graduate Course.
Spring 2022, CIT-670 Computer Vision, Graduate Course. Course Outline, Feedback

University of Alberta

Winter 2021, MM-805 Computer Vision and 3DTV, Graduate Course. Feedback

Teaching Samples

Lecture sample on optimization that I used in my Ontario Tech University interview. Lecture
Assignment sample that I used in the MM805 course at the University of Alberta. Assignment

Volunteer Teaching

ACVSS 2024 Tutorials

PhD Thesis

Learning Video Object Segmentation from Limited Labelled Data

PDF Degree Transcripts w/ Awards