Ph.D. thesis Susana Ruano

 

"Augmented reality over video stream acquired from UAVs for operations support" 

Susana Ruano

E.T.S. Ing. Telecomunicación, Universidad Politécnica de Madrid, November 2018, "Cum Laude".

Ph.D. thesis directors: Carlos Cuevas Rodríguez and Guillermo Gallego Bonet.

Thanks to recent technological developments, augmented reality (AR) has become a fast-growing discipline. Its potential justifies studying it not only for dedicated devices such as glasses or helmets, but for any device equipped with a camera. Following this idea, Airbus promoted an innovation project, Situational Awareness Virtual EnviRonment (SAVIER), to incorporate AR into its ground control stations and thereby enhance the video stream captured from Unmanned Aerial Vehicles (UAVs). This thesis is framed within that project and explores different approaches to improving the situational awareness of UAV operators during a mission. Initially, the thesis focuses on geo-registration, a strategy used to localize the UAV in GPS-denied environments. This is of interest because knowing the position of the UAV is essential for providing information about its surroundings. For this reason, the thesis proposed two key geo-registration systems that use different reference data.

The first is a multi-view stereo processing pipeline that builds a dense terrain model from images of the UAV video feed. This is helpful when a reference terrain model is needed for geo-registration but is unavailable, outdated, or of low resolution. The proposed variational method enforces continuity not only along epipolar lines but also across them, over the full image domain. Second, the thesis proposed a joint geometric and photometric image registration method that can deal with generic types of distortion: parametric warpings (such as homographies) and non-linear photometric transformations. It is built on top of area-based registration methods so that it can operate in scenarios where feature-based geo-registration methods are not reliable. Finally, the general case was considered, in which every sensor measurement is known with enough accuracy, and the thesis focused on displaying virtual elements over the video stream acquired by the UAV.
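
As a rough illustration of what an area-based objective with a joint geometric and photometric model can look like, the sketch below evaluates the mean squared intensity difference between a reference image and a second image warped by a homography and corrected photometrically. It is not the method of the thesis: the gain/gamma/bias photometric model, the bilinear sampling, and all function names are assumptions made only for this example.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through the 3x3 homography H (list of rows)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def bilinear(img, x, y):
    """Bilinearly sample img (list of rows, at least 2x2) at (x, y).

    Returns None when (x, y) falls outside the image.
    """
    h, w = len(img), len(img[0])
    if not (0 <= x <= w - 1 and 0 <= y <= h - 1):
        return None
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x0 + 1]
    bot = (1 - ax) * img[y0 + 1][x0] + ax * img[y0 + 1][x0 + 1]
    return (1 - ay) * top + ay * bot


def photometric(intensity, gain, gamma, bias):
    """Hypothetical non-linear photometric model: gain * I**gamma + bias."""
    return gain * (intensity ** gamma) + bias


def registration_cost(ref, img, H, gain, gamma, bias):
    """Mean squared residual between ref and the warped, corrected img.

    Pixels of ref whose warped position falls outside img are skipped.
    """
    total, count = 0.0, 0
    for y in range(len(ref)):
        for x in range(len(ref[0])):
            u, v = apply_homography(H, x, y)
            sample = bilinear(img, u, v)
            if sample is None:
                continue
            residual = photometric(sample, gain, gamma, bias) - ref[y][x]
            total += residual * residual
            count += 1
    return total / count if count else float("inf")
```

A registration method of this family would search for the homography and photometric parameters that minimize such a cost, typically with a gradient-based optimizer; the sketch only evaluates the cost for given parameters.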

An AR tool was developed to improve the situational awareness of UAV operators during intelligence and surveillance missions. The AR system provides information about the flight path and the targets, so that the operator can find them more quickly, even in the presence of occlusions. The usability of the proposed AR tool was demonstrated through the adoption of NATO standards, and the tool was fully integrated with the Airbus SAVIER demonstrator in Getafe, Madrid.
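
To make the idea of overlaying virtual elements on the video stream concrete, the following minimal sketch projects a 3D point (for instance, a waypoint of the flight path) into pixel coordinates with a standard pinhole camera model. It assumes the camera pose and intrinsics are already known, as in the general case described above; the function name, the parameter names, and the absence of lens distortion are assumptions for illustration and do not describe the actual SAVIER implementation.

```python
def project_point(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point into the image with a pinhole model.

    R (3x3, list of rows) and t (length 3) map world coordinates to
    camera coordinates: Xc = R * Xw + t. fx, fy are focal lengths in
    pixels and (cx, cy) is the principal point.
    Returns (u, v) in pixels, or None if the point is not in front
    of the camera.
    """
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i]
          for i in range(3)]
    if xc[2] <= 0:
        return None  # behind the camera: nothing to draw
    u = fx * xc[0] / xc[2] + cx
    v = fy * xc[1] / xc[2] + cy
    return u, v
```

An overlay would call such a function for each waypoint or target and draw a marker at the returned pixel, skipping points for which it returns None.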