GTI Data   

 

Open databases created and software developed by the GTI, together with supplementary material for papers.

 

Databases  


SportCLIP (2025): Multi-sport dataset for text-guided video summarization.
Ficosa (2024): The FNTVD dataset, generated using Ficosa's recording car.
MATDAT (2023): More than 90K labeled images of martial arts tricking.
SEAW – DATASET (2022): 3 stereoscopic contents in 4K resolution at 30 fps.
UPM-GTI-Face dataset (2022): 11 different subjects captured in 4K, under 2 scenarios and 2 face mask conditions.
LaSoDa (2022): 60 annotated images from soccer matches in five stadiums with different characteristics and lighting conditions.
PIROPO Database (2021): People in Indoor ROoms with Perspective and Omnidirectional cameras.
EVENT-CLASS (2021): High-quality 360-degree videos in the context of tele-education.
Parking Lot Occupancy Database (2020)
Nighttime Vehicle Detection database (NVD) (2019)
Hand gesture dataset (2019): Multi-modal Leap Motion dataset for hand gesture recognition.
ViCoCoS-3D (2016): VideoConference Common Scenes in 3D.
LASIESTA database (2016): More than 20 sequences to test moving object detection and tracking algorithms.
Hand gesture database (2015): Hand-gesture database composed of high-resolution color images acquired with the Senz3D sensor.
HRRFaceD database (2014): Face database composed of high-resolution images acquired with Microsoft Kinect 2 (second generation).
Lab database (2012): Set of 6 sequences to test moving object detection strategies.
Vehicle image database (2012): More than 7000 images of vehicles and roads.

 

Software  


Empowering Computer Vision in Higher Education (2024): A Novel Tool for Enhancing Video Coding Comprehension.
Engaging students in audiovisual coding through interactive MATLAB GUIs (2024)
TOP-Former: A Multi-Agent Transformer Approach for the Team Orienteering Problem (2023)
Solving Routing Problems for Multiple Cooperative Unmanned Aerial Vehicles using Transformer Networks (2023)
Vision Transformers and Traditional Convolutional Neural Networks for Face Recognition Tasks (2023)
Faster GSAC-DNN (2023): A Deep Learning Approach to Nighttime Vehicle Detection Using a Fast Grid of Spatial Aware Classifiers.
SETForSeQ (2020): Subjective Evaluation Tool for Foreground Segmentation Quality.
SMV Player for Oculus Rift (2016)
Bag-D3P (2016): Face recognition using depth information.
TSLAB (2015): Tool for Semiautomatic LABeling.
 

   

Supplementary material  


Soccer line mark segmentation and classification with stochastic watershed transform (2022)
A fully automatic method for segmentation of soccer playing fields (2022)
Grass band detection in soccer images for improved image registration (2022)
Evaluating the Influence of the HMD, Usability, and Fatigue in 360VR Video Quality Assessments (2020)
Automatic soccer field of play registration (2020)   
Augmented reality tool for the situational awareness improvement of UAV operators (2017)
Detection of static moving objects using multiple nonparametric background-foreground models on a Finite State Machine (2015)
Real-time nonparametric background subtraction with tracking-based foreground update (2015)  
Camera localization using trajectories and maps (2014)

IEEE ICIP 2022

Carlos Cortés and Narciso García, members of the Grupo de Tratamiento de Imágenes, attended the IEEE ICIP 2022 Conference held in Bordeaux from 16 to 20 October.

"I had a pretty big audience, although it was one of the few papers that did not deal with neural networks," Carlos said.

 


 

The goal of the presented work was to find the limits of self-view delay in immersive XR environments in terms of quality of experience. For this purpose, Carlos adapted for virtual reality a task from the ITU-T P.920 recommendation that consists of building figures with Lego-like blocks. During the experiment, subjects had to build specific figures while the self-view delay varied from one construction to the next. After each construction, the users completed a questionnaire previously validated for immersive interactive environments. The results of the experiment showed that the acceptance limit of the self-view delay for presence and global perception of quality was above 450 ms (including the camera delay). On the other hand, users needed more severe delays before noticing any effect on their adaptive ability.
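As a rough illustration of how an acceptance limit like this can be read off questionnaire data, the sketch below linearly interpolates the delay at which a mean score drops below an acceptance level. The delay conditions, scores, and acceptance level are invented for the example and are not the data or analysis from the paper.

# Hypothetical sketch: estimating a self-view delay acceptance limit from
# per-condition questionnaire scores. All numbers below are illustrative,
# not the data or analysis reported in the paper.
import numpy as np

# Assumed self-view delay conditions in milliseconds (camera delay included).
delays_ms = np.array([50, 150, 250, 350, 450, 600, 900])

# Assumed mean questionnaire scores (1-5 scale) for global quality per delay.
mean_scores = np.array([4.6, 4.5, 4.3, 4.1, 3.8, 3.2, 2.4])

# Score level below which the delay is treated as unacceptable (assumed).
ACCEPTANCE_SCORE = 3.5

def acceptance_limit(delays: np.ndarray, scores: np.ndarray, level: float) -> float:
    """Linearly interpolate the delay at which the mean score crosses `level`."""
    # np.interp needs increasing x values; scores decrease with delay,
    # so reverse both arrays and interpolate delay as a function of score.
    return float(np.interp(level, scores[::-1], delays[::-1]))

print(f"Estimated acceptance limit: {acceptance_limit(delays_ms, mean_scores, ACCEPTANCE_SCORE):.0f} ms")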

The conference paper is available here: https://ieeexplore.ieee.org/document/9897983

 


Carlos also recounted his experience at the conference: "During the conference, I was able to attend numerous presentations on the current state of the art in image processing methods. Obviously, most of them were based on deep learning techniques. However, I observed that many methods adapted their problem domains to the image domain, the purpose being to reuse existing neural networks for image processing, because they are very robust and well developed. Other contributions focused on cleaning or selecting only a part of the input information to increase the performance of the networks. Finally, I attended a tutorial on point cloud compression, where I learned a lot about the different compression strategies."