Ph.D. thesis: Filippo Casu

 

"Optimization of Protection Techniques Based on FEC Codes for the Transmission of Multimedia Packetized Streams" 

Filippo Casu

E.T.S. Ing. Telecomunicación, Universidad Politécnica de Madrid, May 2017, "Cum Laude".

Ph.D. thesis Director: Julián Cabrera Quesada.

This thesis presents two enhanced FEC-based schemes for protecting real-time packetized multimedia streams in bursty channels. Both architectures aim to optimize existing FEC codes, namely Reed-Solomon codes and LDPC codes. For Reed-Solomon codes, the optimization targets a lower computational cost, since their well-known robust recovery capability against any type of loss comes at the price of high complexity. For LDPC codes, the optimization aims to increase the recovery capability in bursty channels, since these codes are not specifically designed for the scenario considered in this thesis.

The scheme based on Reed-Solomon codes is called the inter-packet symbol approach. It consists of an alternative bit structure that spreads each symbol of a Reed-Solomon code across several media packets, which makes better use of the recovery capability of Reed-Solomon codes against bursty packet losses. The performance of this scheme has been studied in terms of encoding/decoding time versus recovery capability and compared with other schemes proposed in the literature. The theoretical analysis shows that the proposed approach allows the use of a smaller Galois field than other solutions. The smaller field reduces the required encoding/decoding time while keeping a comparable recovery capability.
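The field-size argument can be illustrated with a minimal Python sketch. It relies on two assumptions that are not taken from the thesis: a primitive Reed-Solomon code over GF(2^m) has at most 2^m - 1 symbols per codeword, and each symbol is spread over `packets_per_symbol` consecutive packets (a hypothetical layout chosen only for illustration). The function names and the numbers are likewise illustrative.

```python
def min_field_exponent(num_symbols):
    """Smallest m such that 2**m - 1 >= num_symbols.

    Assumption: a primitive Reed-Solomon code over GF(2**m) has a
    codeword length of at most 2**m - 1 symbols.
    """
    m = 1
    while 2 ** m - 1 < num_symbols:
        m += 1
    return m


def symbols_hit_by_burst(burst_start, burst_len, packets_per_symbol=1):
    """Number of codeword symbols touched by a burst of lost packets.

    Hypothetical layout: symbol j of a codeword occupies packets
    j * packets_per_symbol .. (j + 1) * packets_per_symbol - 1.
    With packets_per_symbol == 1 every lost packet erases one symbol.
    """
    first = burst_start // packets_per_symbol
    last = (burst_start + burst_len - 1) // packets_per_symbol
    return last - first + 1


block_packets = 60  # packets protected by one codeword (illustrative)
spread = 4          # packets sharing one symbol (illustrative)

# One symbol per packet: 60 symbols per codeword.
print(min_field_exponent(block_packets))            # 6, i.e. GF(2**6)
# Each symbol spread over 4 packets: 15 symbols per codeword.
print(min_field_exponent(block_packets // spread))  # 4, i.e. GF(2**4)

# A burst of 8 lost packets starting at packet 2:
print(symbols_hit_by_burst(2, 8))          # 8 symbols erased
print(symbols_hit_by_burst(2, 8, spread))  # 3 symbols erased
```

Under these assumptions, spreading symbols across packets shortens the codeword in symbols, which is what permits a smaller field. The actual bit structure and its trade-offs are defined in the thesis itself.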

Although LDPC codes are typically used in channels where losses are uniformly distributed (memoryless channels) and with large information blocks, this thesis proposes using this type of FEC code at the application layer, in bursty channels, and in real-time scenarios where low transmission latency is required. To meet these constraints, the appropriate configuration parameters of an LDPC scheme have been determined using small information blocks, and the FEC code has been adapted to recover packet losses in bursty environments. This is achieved in two steps. In the first step, an algorithm estimates the recovery capability of a given LDPC code in a network with burst packet losses. In the second step, an algorithm optimizes the code structure in terms of recovery capability against the specific behavior of the channel with memory, generating a burst-oriented version of the considered LDPC code.
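The two steps can be sketched in Python under stated assumptions. The thesis algorithms are not reproduced here: the estimate below uses standard iterative (peeling) erasure decoding, and the "optimization" is a plain random search over column orderings of the parity-check matrix, swapped in purely for illustration. The matrix `H` is a toy example (a Hamming(7,4) parity-check matrix, not an actual LDPC code), and all function names are hypothetical.

```python
import random

import numpy as np


def peel(H, erased):
    """Iterative erasure decoding: returns the positions still erased."""
    erased = set(int(j) for j in erased)
    progress = True
    while erased and progress:
        progress = False
        for row in H:
            unknown = [int(j) for j in np.flatnonzero(row) if int(j) in erased]
            if len(unknown) == 1:  # this check can solve its single unknown
                erased.discard(unknown[0])
                progress = True
    return erased


def burst_recovery_ratio(H, burst_len):
    """Fraction of burst start positions that are fully recovered."""
    n = H.shape[1]
    starts = range(n - burst_len + 1)
    ok = sum(1 for s in starts if not peel(H, range(s, s + burst_len)))
    return ok / len(starts)


def search_column_order(H, burst_len, trials=200, seed=0):
    """Random search for a column order with a better burst recovery ratio."""
    rng = random.Random(seed)
    n = H.shape[1]
    best_perm = list(range(n))
    best = burst_recovery_ratio(H, burst_len)
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        score = burst_recovery_ratio(H[:, perm], burst_len)
        if score > best:
            best_perm, best = perm, score
    return best_perm, best


# Toy parity-check matrix (assumption: used only to keep the example small).
H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])

print(burst_recovery_ratio(H, 2))  # 1.0: every burst of 2 is recovered
print(burst_recovery_ratio(H, 3))  # 0.6: 3 of the 5 bursts of 3 are recovered
perm, ratio = search_column_order(H, 3)
print(perm, ratio)
```

Reordering columns corresponds to changing which transmitted packet each code position is mapped to, so it changes which positions a burst erases together. A real design would also have to respect latency and block-size constraints, which this sketch ignores.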

Finally, both proposed FEC schemes have been evaluated experimentally in a simulated transmission channel to assess their performance and compare it with that of several other schemes.
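The text does not specify the channel model used in the simulations. As a hedged illustration of what a bursty channel with memory can look like, the following Python sketch implements a simple two-state Gilbert model (an assumption, not necessarily the model used in the thesis): every packet sent in the bad state is lost and every packet sent in the good state arrives. The function names and parameter values are illustrative.

```python
import random


def simulate_gilbert(num_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Return a list of booleans, True where the packet is lost."""
    rng = random.Random(seed)
    bad = False  # start in the good (loss-free) state
    lost = []
    for _ in range(num_packets):
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        elif rng.random() < p_good_to_bad:
            bad = True
        lost.append(bad)
    return lost


def burst_lengths(lost):
    """Lengths of the runs of consecutive lost packets."""
    runs, current = [], 0
    for is_lost in lost:
        if is_lost:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


trace = simulate_gilbert(100_000, p_good_to_bad=0.05, p_bad_to_good=0.5)
runs = burst_lengths(trace)
print("loss rate:", sum(trace) / len(trace))
print("mean burst length:", sum(runs) / len(runs))
```

For this model the stationary loss rate is p_good_to_bad / (p_good_to_bad + p_bad_to_good) and the mean burst length is 1 / p_bad_to_good, so the printed values should be close to 0.09 and 2 for the parameters shown.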