Results 1 - 4 of 4
1.
Data Brief ; 53: 110088, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38357450

ABSTRACT

The proposed dataset is a collection of pedestrian navigation data sequences combining visual and spatial information. The pedestrian navigation sequences are situations encountered by a pedestrian walking in an urban outdoor environment, such as moving on the sidewalk, navigating through a crowd, or crossing a street when the pedestrian traffic light is green. The acquired data are timestamped RGB-D images associated with GPS and inertial data (acceleration, rotation). These recordings were acquired by separate processes, avoiding capture delays, to guarantee synchronisation between the moment of acquisition by the sensor and the moment of recording on the system. The acquisition was made in the city of Dijon, France, including narrow streets, wide avenues, and parks. Annotations of the RGB-D images are also provided as bounding boxes indicating the positions of relevant static or dynamic objects present in a pedestrian area, such as a tree, bench, or person. This pedestrian navigation dataset aims to support the development of smart mobile systems to assist visually impaired people in their daily movements in an outdoor environment. In this context, the visual data and localisation sequences we provide can be used to elaborate appropriate visual processing methods that extract relevant information about obstacles and their current positions on the path. Alongside the dataset, a visual-to-auditory substitution method has been employed to convert each image sequence into corresponding stereophonic sound files, allowing for comparison and evaluation. Synthetic sequences associated with the same information set are also provided, based on recordings of a displacement within the 3D model of a real place in Dijon.
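For illustration only, a minimal Python sketch of how separately recorded streams of this kind could be re-associated by timestamp; the record layout and function names are assumptions made for the example, not the dataset's actual file format.

import bisect

def nearest_sample(timestamps, t):
    """Return the index of the sample whose timestamp is closest to t."""
    i = bisect.bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1

def align_streams(frame_times, gps_records, imu_records):
    """Associate each RGB-D frame with the GPS and IMU samples closest in time.

    gps_records and imu_records are lists of (timestamp, payload) tuples
    sorted by timestamp (an assumed layout for this sketch).
    """
    gps_times = [t for t, _ in gps_records]
    imu_times = [t for t, _ in imu_records]
    aligned = []
    for t in frame_times:
        gps = gps_records[nearest_sample(gps_times, t)][1]
        imu = imu_records[nearest_sample(imu_times, t)][1]
        aligned.append({"frame_time": t, "gps": gps, "imu": imu})
    return aligned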

2.
Front Psychol ; 14: 1079998, 2023.
Article in English | MEDLINE | ID: mdl-36777233

ABSTRACT

Introduction: Visual-to-auditory sensory substitution devices are assistive devices for the blind that convert visual images into auditory images (or soundscapes) by mapping visual features to acoustic cues. To convey spatial information with sounds, several sensory substitution devices use a Virtual Acoustic Space (VAS) built with Head Related Transfer Functions (HRTFs) to synthesize the natural acoustic cues used for sound localization. However, elevation perception is known to be inaccurate with generic spatialization, since it relies on notches in the audio spectrum that are specific to each individual. Another method used to convey elevation information is based on the audiovisual cross-modal correspondence between pitch and visual elevation. The main drawback of this second method is that the narrow spectral bandwidth of the sounds it uses limits the ability to perceive elevation through HRTFs. Method: In this study we compared the early ability to localize objects with a visual-to-auditory sensory substitution device in which elevation is conveyed either by a spatialization-only method (Noise encoding) or by pitch-based methods with different spectral complexities (Monotonic and Harmonic encodings). Thirty-eight blindfolded participants had to localize a virtual target using soundscapes before and after being familiarized with the visual-to-auditory encodings. Results: Participants localized elevation more accurately with the pitch-based encodings than with the spatialization-only method. Only slight differences in azimuth localization performance were found between the encodings. Discussion: This study suggests that a pitch-based encoding is intuitive, with the cross-modal correspondence providing a facilitation effect when non-individualized sound spatialization is used.
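As a reading aid, a minimal Python sketch of a pitch-based encoding of this general kind, mapping image row to tone frequency and image column to stereo panning; the frequency range, image-coordinate convention, and function names are illustrative assumptions, not the encodings evaluated in the study.

import numpy as np

SAMPLE_RATE = 44100
F_MIN, F_MAX = 200.0, 2000.0   # assumed pitch range for bottom/top image rows

def elevation_to_pitch(row, n_rows):
    """Rows nearer the top of the image map to higher pitch (log interpolation)."""
    frac = 1.0 - row / max(n_rows - 1, 1)
    return F_MIN * (F_MAX / F_MIN) ** frac

def encode_point(row, col, n_rows, n_cols, duration=0.3):
    """Render one target pixel as a stereo tone (a monotonic, single-tone encoding)."""
    t = np.arange(int(duration * SAMPLE_RATE)) / SAMPLE_RATE
    tone = np.sin(2 * np.pi * elevation_to_pitch(row, n_rows) * t)
    pan = col / max(n_cols - 1, 1)            # 0 = full left, 1 = full right
    left, right = tone * (1.0 - pan), tone * pan
    return np.stack([left, right], axis=1)    # (samples, 2) stereo buffer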

3.
Sensors (Basel) ; 24(1)2023 Dec 27.
Article in English | MEDLINE | ID: mdl-38203027

ABSTRACT

Blindness affects millions of people worldwide, leading to difficulties in daily travel and a loss of independence due to a lack of spatial information. This article proposes a new navigation aid to help people with severe blindness reach their destination. Blind people are guided by a short 3D spatialised sound that indicates the target point to follow. This sound is combined with other sonified information about potential obstacles in the vicinity. The proposed system is based on inertial sensors, GPS data, and cartographic knowledge of pedestrian paths to define the trajectory. In addition, visual cues are used to refine the trajectory with ground-floor information and obstacle information, using a camera to provide 3D spatial information. The proposed method is based on a deep learning approach. The different neural networks used in this approach are evaluated on datasets that group navigation sequences recorded from a pedestrian's point of view. The method achieves low latency and real-time processing without relying on remote connections, instead using a low-power embedded GPU and a multithreaded approach for video processing, sound generation, and acquisition. This system could significantly improve the quality of life and autonomy of blind people, allowing them to navigate reliably and efficiently in their environment.
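As a rough illustration of such a multithreaded organisation, a minimal Python sketch with one thread per stage communicating through short queues; the stage callables (grab_frame, detect_obstacles, sonify) are placeholders for this example, not components of the system described above.

import queue
import threading

frames = queue.Queue(maxsize=2)      # short queues keep end-to-end latency bounded
detections = queue.Queue(maxsize=2)

def capture_loop(grab_frame, stop):
    while not stop.is_set():
        frames.put(grab_frame())

def detection_loop(detect_obstacles, stop):
    while not stop.is_set():
        detections.put(detect_obstacles(frames.get()))

def audio_loop(sonify, stop):
    while not stop.is_set():
        sonify(detections.get())

def run_pipeline(grab_frame, detect_obstacles, sonify):
    stop = threading.Event()
    workers = [
        threading.Thread(target=capture_loop, args=(grab_frame, stop), daemon=True),
        threading.Thread(target=detection_loop, args=(detect_obstacles, stop), daemon=True),
        threading.Thread(target=audio_loop, args=(sonify, stop), daemon=True),
    ]
    for w in workers:
        w.start()
    return stop   # caller sets this event to shut the pipeline down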


Subject(s)
Pedestrians , Quality of Life , Humans , Blindness , Knowledge , Neural Networks, Computer
4.
Artif Intell Med ; 94: 54-66, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30871683

ABSTRACT

Computer vision-based clinical gait analysis is the subject of ongoing research. However, very few datasets are publicly available, so comparing existing methods with one another is not straightforward. Even when test data are openly accessible, existing databases contain very few test subjects and single-modality measurements, which limits their usage. The contributions of this paper are three-fold. First, we propose a new open-access multi-modal database acquired with the Kinect v2 camera for the task of gait analysis. Second, we use the skeleton joint orientation data to calculate kinematic gait parameters that match those of gold-standard MOCAP systems, and propose a new set of features based on 3D lower-limb flexion dynamics to analyze gait symmetry. Third, we design a Long Short-Term Memory (LSTM) ensemble model to create an unsupervised gait classification tool. The results show that the joint orientation data provided by Kinect can be successfully used in an inexpensive clinical gait monitoring system, with results moderately better than the reported state of the art for three normal/pathological gait classes.
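For illustration, a minimal Python sketch of one way to compute a lower-limb flexion angle from 3D joint positions, the kind of quantity such flexion-dynamics features build on; the joint naming and frame layout are assumptions for the example, not the paper's actual feature extraction.

import numpy as np

def flexion_angle(hip, knee, ankle):
    """Angle (degrees) at the knee between the thigh and shank segments."""
    thigh = np.asarray(hip) - np.asarray(knee)
    shank = np.asarray(ankle) - np.asarray(knee)
    cos_a = np.dot(thigh, shank) / (np.linalg.norm(thigh) * np.linalg.norm(shank))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

def flexion_series(frames):
    """Per-frame flexion angles from a sequence of {'hip', 'knee', 'ankle'} dicts."""
    return np.array([flexion_angle(f["hip"], f["knee"], f["ankle"]) for f in frames])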


Subject(s)
Computer Simulation , Gait , Models, Biological , Adult , Algorithms , Female , Humans , Male , Monitoring, Physiologic , Young Adult