FVV LIVE

GTI Data

Open databases created and software developed by the GTI and supplemental material to papers.

Databases

SportCLIP (2025): Multi-sport dataset for text-guided video summarization.
Ficosa (2024): The FNTVD dataset was generated using Ficosa's recording car.
MATDAT (2023): More than 90K labeled images of martial arts tricking.
SEAW – DATASET (2022): 3 stereoscopic contents in 4K resolution at 30 fps.
UPM-GTI-Face dataset (2022): 11 different subjects captured in 4K, under 2 scenarios, and 2 face mask conditions.
LaSoDa (2022): 60 annotated images from soccer matches in five stadiums with different characteristics and light conditions.
PIROPO Database (2021): People in Indoor ROoms with Perspective and Omnidirectional cameras.
EVENT-CLASS (2021): High-quality 360-degree videos in the context of tele-education.
Parking Lot Occupancy Database (2020)
Nighttime Vehicle Detection database (NVD) (2019)
Hand gesture dataset (2019): Multi-modal Leap Motion dataset for Hand Gesture Recognition.
ViCoCoS-3D (2016): VideoConference Common Scenes in 3D.
LASIESTA database (2016): More than 20 sequences to test moving object detection and tracking algorithms.
Hand gesture database (2015): Hand-gesture database composed of high-resolution color images acquired with the Senz3D sensor.
HRRFaceD database (2014): Face database composed of high-resolution images acquired with Microsoft Kinect 2 (second generation).
Lab database (2012): Set of 6 sequences to test moving object detection strategies.
Vehicle image database (2012): More than 7000 images of vehicles and roads.

Software

Empowering Computer Vision in Higher Education (2024): A Novel Tool for Enhancing Video Coding Comprehension.
Engaging students in audiovisual coding through interactive MATLAB GUIs (2024)
TOP-Former: A Multi-Agent Transformer Approach for the Team Orienteering Problem (2023)
Solving Routing Problems for Multiple Cooperative Unmanned Aerial Vehicles using Transformer Networks (2023)
Vision Transformers and Traditional Convolutional Neural Networks for Face Recognition Tasks (2023)
Faster GSAC-DNN (2023): A Deep Learning Approach to Nighttime Vehicle Detection Using a Fast Grid of Spatial Aware Classifiers.
SETForSeQ (2020): Subjective Evaluation Tool for Foreground Segmentation Quality.
SMV Player for Oculus Rift (2016)
Bag-D3P (2016): Face recognition using depth information.
TSLAB (2015): Tool for Semiautomatic LABeling.

Supplementary material

Soccer line mark segmentation and classification with stochastic watershed transform (2022)
A fully automatic method for segmentation of soccer playing fields (2022)
Grass band detection in soccer images for improved image registration (2022)
Evaluating the Influence of the HMD, Usability, and Fatigue in 360VR Video Quality Assessments (2020)
Automatic soccer field of play registration (2020)
Augmented reality tool for the situational awareness improvement of UAV operators (2017)
Detection of static moving objects using multiple nonparametric background-foreground models on a Finite State Machine (2015)
Real-time nonparametric background subtraction with tracking-based foreground update (2015)
Camera localization using trajectories and maps (2014)

FVVLive System

The third edition of The Observatory of New Technologies took place on 17 and 18 July 2019 at the Digital Content Hub in Málaga, Spain. Labelled as The New Tech Observatory, it was once again the key Spanish event bringing together experts, producers, professionals, students, trainers, and entrepreneurs from sectors such as Immersive Experiences, Artificial Intelligence, IoT, eSports, Video Games, Serious Games, eHealth, and eTourism, among others.

The Grupo de Tratamiento de Imágenes (GTI) of the Universidad Politécnica de Madrid (UPM) presented a complete free-viewpoint video (FVV) system in the Demo Area of the Observatory. This system, called FVV-Live, lets viewers choose dynamically, through a simple interface, the point from which they wish to observe the scene of a live FVV transmission. It thus offers an enriched audiovisual experience that conveys a feeling of immersion within the scene. FVV-Live has been developed by the members of the GTI.

FVV-Live was the first free-viewpoint video system able to operate in real time. In addition, the system can work with sparse camera configurations. Its performance improves on the results obtained by state-of-the-art research teams.

The future is here!

FVV-Live operates in real time with minimal end-to-end latency, using a sparse camera configuration and consumer electronics equipment. To achieve this, the researchers of the GTI have designed, developed, and verified lightweight schemes for the acquisition, transmission, and rendering of video. Linear and planar camera configurations have been used, and depth information has been obtained by combining stereo-vision depth estimation with multiview stereo. New schemes for lossless and lossy transmission deliver the multiview-plus-depth information to a remote receiver, where the free-viewpoint video is generated. The motion-to-photon latency (the latency of interaction) associated with rendering the video for the virtual viewpoint chosen by the user has been minimized. New algorithms developed for FVV-Live combine texture and depth to achieve better subjective quality.
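To make the idea of rendering a virtual viewpoint from a sparse, linear camera configuration more concrete, the following is a minimal sketch and not the FVV-Live implementation. It assumes the cameras sit at known, strictly increasing positions along a line, and that a virtual view is synthesized from the two physical cameras that bracket the requested viewpoint, weighted linearly by proximity. The function name `select_reference_cameras`, the position units, and the linear weighting are all hypothetical choices made for illustration.

```python
from bisect import bisect_right


def select_reference_cameras(camera_positions, viewpoint):
    """Pick the two cameras bracketing `viewpoint` and their blending weights.

    camera_positions: strictly increasing positions of the physical cameras
        along a line (hypothetical units; an assumption of this sketch).
    viewpoint: position of the virtual camera on the same line.
    Returns [(left_index, left_weight), (right_index, right_weight)].
    """
    if len(camera_positions) < 2:
        raise ValueError("need at least two cameras")

    # Clamp the viewpoint to the span covered by the physical cameras.
    v = min(max(viewpoint, camera_positions[0]), camera_positions[-1])

    # Index of the right-hand camera of the bracketing pair.
    right = min(max(bisect_right(camera_positions, v), 1),
                len(camera_positions) - 1)
    left = right - 1

    # Linear weights: the closer camera contributes more to the blend.
    span = camera_positions[right] - camera_positions[left]
    w_right = (v - camera_positions[left]) / span
    return [(left, 1.0 - w_right), (right, w_right)]


if __name__ == "__main__":
    cameras = [0.0, 1.0, 2.0, 3.0]
    print(select_reference_cameras(cameras, 1.25))
```

In a real system the weights would typically feed a depth-aware warping and blending stage; here they only indicate which reference views matter most for a given viewpoint.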
