Results 1 - 9 of 9
1.
Heliyon; 10(6): e27596, 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38510055

ABSTRACT

Sports physiotherapists and coaches are tasked with evaluating the movement quality of athletes across the spectrum of ability and experience. However, the accuracy of visual observation is low, and existing technology outside of expensive lab-based solutions has seen limited adoption, leaving an unmet need for an efficient and accurate means of measuring static and dynamic joint angles during movement and converting them into movement metrics usable by practitioners. This paper proposes a set of pose landmarks for computing frequently used joint angles as metrics of interest to sports physiotherapists and coaches in assessing common strength-building exercise movements. It then proposes a set of rules for computing these metrics for a range of common exercises (single and double drop jumps, counter-movement jumps, deadlifts and various squats) from anatomical key-points detected in video, and evaluates their accuracy using a published 3D human pose model trained with ground-truth data derived from VICON motion capture of common rehabilitation exercises. Results show that the chosen pose landmarks are sufficient to compute the mathematically defined metrics for each of the exercises under consideration. Comparison to ground-truth data showed that root mean square angle errors were within 10° for all exercises for the following metrics: shin angle, knee varus/valgus and left/right flexion, hip flexion and pelvic tilt, trunk angle, spinal flexion lower/upper/mid and rib flare. Larger errors (though still within 15°) were observed for shoulder flexion and ASIS asymmetry in some exercises, notably front squats and drop jumps. In conclusion, the contribution of this paper is a uniquely defined set of sufficient key-points and associated metrics for exercise assessment from 3D human pose. Further, the Strided Transformer 3D pose model predicted these metrics with generally very good accuracy for the chosen set of exercises from a single mobile-device camera, when trained on a suitable set of functional exercises recorded with a VICON motion capture system. Future assessment of generalization is needed.

2.
J Sports Sci; 40(17): 1885-1900, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36093680

ABSTRACT

Injury assessment during sporting collisions requires estimation of the associated kinematics. While marker-based solutions are widely accepted as providing accurate and reliable measurements, setup times are lengthy and it is not always possible to outfit athletes with restrictive equipment in sporting situations. A new generation of markerless motion capture based on deep learning techniques holds promise for enabling measurement of movement in the wild. The aim of this work is to evaluate the out-of-the-box performance of a popular deep learning model for human pose estimation on a dataset of ten staged rugby tackle movements, performed in a marker-based motion capture laboratory with a system of three high-speed video cameras. An analysis of the discrepancy between joint positions estimated by the marker-based and markerless systems shows that the deep learning approach performs acceptably well in most instances, although large errors occur during challenging intervals of heavy occlusion and self-occlusion. In total, 75.6% of joint position estimates have a mean absolute error (MAE) of 25 mm or less, 17.8% have an MAE between 25 and 50 mm, and 6.7% have an MAE greater than 50 mm. The mean per-joint position error is 47 mm.


Subjects
Deep Learning, Sports, Humans, Motion, Biomechanical Phenomena, Movement
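
The evaluation code is not part of the abstract; below is a minimal sketch of how the per-joint MAE and the reported error bins could be computed, assuming (frames, joints, 3) position arrays in millimetres (the array shapes and synthetic stand-in data are placeholders, not the study's data):

```python
import numpy as np

def per_joint_mae(estimated, reference):
    """MAE per joint (mm): Euclidean distance per frame, averaged over frames."""
    dist = np.linalg.norm(estimated - reference, axis=-1)  # (frames, joints)
    return dist.mean(axis=0)                               # (joints,)

rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 15, 3)) * 100.0          # synthetic stand-in
estimated = reference + rng.normal(size=reference.shape) * 10.0
mae = per_joint_mae(estimated, reference)
# Shares of joints with MAE <= 25 mm, 25-50 mm, and > 50 mm:
bins = [(mae <= 25).mean(), ((mae > 25) & (mae <= 50)).mean(), (mae > 50).mean()]
```
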
3.
Signal Image Video Process; 15(8): 1829-1836, 2021.
Article in English | MEDLINE | ID: mdl-34721702

ABSTRACT

We address the problem of exposure correction for dark, blurry and noisy images captured in low-light conditions in the wild. Classical image-denoising filters work well in frequency space but are constrained by several factors, such as the correct choice of thresholds and frequency estimates. Traditional deep networks, on the other hand, are trained end-to-end in RGB space by formulating this task as an image translation problem; however, this is done without any explicit constraint on the inherent noise of the dark images, and thus produces noisy and blurry outputs. To this end, we propose a DCT/FFT-based multi-scale loss function which, when combined with traditional losses, trains a network to translate the important features into a visually pleasing output. Our loss function is end-to-end differentiable, scale-agnostic and generic; i.e., it can be applied to both RAW and JPEG images in most existing frameworks without additional overhead. Using this loss function, we report significant improvements over the state of the art in quantitative metrics and subjective tests.
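
The exact loss is specified in the paper rather than the abstract; a minimal PyTorch sketch of the general idea (an L1 penalty on FFT magnitudes over an image pyramid; the scale set and the weighting factor lam are assumptions, not the paper's values) might look like this:

```python
import torch
import torch.nn.functional as F

def multiscale_fft_loss(pred, target, scales=(1, 2, 4)):
    """L1 distance between FFT magnitudes of (N, C, H, W) image pyramids."""
    loss = 0.0
    for s in scales:
        p = F.avg_pool2d(pred, s) if s > 1 else pred
        t = F.avg_pool2d(target, s) if s > 1 else target
        loss = loss + (torch.fft.fft2(p).abs() - torch.fft.fft2(t).abs()).abs().mean()
    return loss / len(scales)

# Combined with a traditional pixel-space loss, as the abstract describes:
#   total = F.l1_loss(pred, target) + lam * multiscale_fft_loss(pred, target)
```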

4.
Article in English | MEDLINE | ID: mdl-31995492

ABSTRACT

Light field technology has reached a certain level of maturity in recent years, and its applications in both computer vision research and industry are offering new perspectives for cinematography and virtual reality. Several methods of capture exist, each with its own advantages and drawbacks. One of these methods involves the use of handheld plenoptic cameras. While these cameras offer freedom and ease of use, they also suffer from various visual artefacts and inconsistencies. In this paper, we propose an advanced pipeline that enhances their output. After extracting sub-aperture images from the RAW images with our demultiplexing method, we perform three correction steps: we first remove hot-pixel artefacts, then correct colour inconsistencies between views using a colour transfer method, and finally apply a state-of-the-art light field denoising technique to ensure high image quality. An in-depth analysis is provided for every step of the pipeline, as well as their interaction within the system. We compare our approach to existing state-of-the-art sub-aperture image extraction algorithms, using a number of metrics as well as a subjective experiment. Finally, we showcase the positive impact of our system on a number of relevant light field applications.
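
As a sketch of the first correction step only (the paper's actual detector may differ; the 3x3 window and the threshold value are assumptions for grayscale images normalized to [0, 1]), hot pixels are commonly found and replaced by their deviation from a local median:

```python
import numpy as np
from scipy.ndimage import median_filter

def remove_hot_pixels(img, thresh=0.2):
    """Replace pixels deviating strongly from their 3x3 local median (grayscale)."""
    med = median_filter(img, size=3)
    hot = np.abs(img - med) > thresh   # outliers against the local neighbourhood
    out = img.copy()
    out[hot] = med[hot]
    return out
```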

5.
Article in English | MEDLINE | ID: mdl-31484117

ABSTRACT

A computationally fast tone mapping operator (TMO) that can quickly adapt to a wide spectrum of high dynamic range (HDR) content is essential for visualization on varied low dynamic range (LDR) output devices such as movie screens or standard displays. Existing TMOs can successfully tone-map only a limited range of HDR content and require extensive parameter tuning to yield the best subjective-quality output. In this paper, we address this problem by proposing a fast, parameter-free and scene-adaptable deep tone mapping operator (DeepTMO) that yields high-resolution, high-subjective-quality tone-mapped output. Based on a conditional generative adversarial network (cGAN), DeepTMO not only learns to adapt to a vast range of scene content (e.g., outdoor, indoor, human, structures) but also tackles HDR-specific challenges such as contrast and brightness, while preserving fine-grained details. We explore four possible combinations of generator-discriminator architectural designs to specifically address prominent issues in HDR-related deep learning frameworks, such as blurring, tiling patterns and saturation artifacts. By exploring different influences of scales, loss functions and normalization layers under a cGAN setting, we conclude by adopting a multi-scale model for our task. To further leverage the large-scale availability of unlabeled HDR data, we train our network by generating targets using an objective HDR quality metric, namely the Tone Mapped Image Quality Index (TMQI). We demonstrate results both quantitatively and qualitatively, and show that DeepTMO generates high-resolution, high-quality output images over a large spectrum of real-world scenes. Finally, we evaluate the perceived quality of our results by conducting a pairwise subjective study, which confirms the versatility of our method.
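
For contrast with the learned approach, here is a sketch of the kind of hand-tuned global operator DeepTMO is meant to replace (the classic Reinhard TMO, not the paper's method; its key parameter a is exactly the sort of per-scene tuning the paper eliminates):

```python
import numpy as np

def reinhard_global_tmo(hdr, a=0.18, eps=1e-6):
    """Classic global Reinhard operator on a linear-light (H, W, 3) HDR image."""
    lum = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
    key = np.exp(np.mean(np.log(lum + eps)))  # log-average scene luminance
    scaled = a * lum / key                    # map the scene key to middle grey
    ldr_lum = scaled / (1.0 + scaled)         # compress highlights into [0, 1)
    return hdr * (ldr_lum / (lum + eps))[..., None]
```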

6.
IEEE Trans Image Process; 28(11): 5266-5280, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31170073

ABSTRACT

In depth map coding, rate-distortion optimization for pixels that will cause occlusion in view synthesis is a challenging task, since the synthesis distortion estimation is complicated by warping competition, and the occlusion order can easily be changed by the adopted optimization strategy. In this paper, an efficient depth map coding approach using allowable depth map distortions is proposed for occlusion-inducing pixels. First, we derive the range of allowable depth level change for both the zero and non-zero disparity error cases, with theoretical and geometric proofs. Then, we formulate the problem of optimally selecting the depth distortion within the allowable range, with the objective of minimizing the overall synthesis distortion involved in the occlusion. The uniqueness and occlusion-order invariance properties of the allowable depth distortion range are demonstrated. Finally, we propose a dynamic-programming-based algorithm to locate the optimal depth distortion for each pixel. Simulation results illustrate the performance improvement of the proposed algorithm over other state-of-the-art depth map coding optimization schemes.
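
The abstract's "allowable depth distortion" builds on the standard linear relation between a quantized depth level and disparity in a one-dimensional parallel-camera setup. As background only (this is the conventional MPEG-style relation, not the paper's derivation; f, B, Z_near and Z_far are the usual focal length, baseline and depth range):

```latex
\[
  d(v) = f B \left[ \frac{v}{255}\left(\frac{1}{Z_{\mathrm{near}}}
         - \frac{1}{Z_{\mathrm{far}}}\right) + \frac{1}{Z_{\mathrm{far}}} \right],
  \qquad
  \Delta d = \frac{f B}{255}\left(\frac{1}{Z_{\mathrm{near}}}
             - \frac{1}{Z_{\mathrm{far}}}\right) \Delta v ,
\]
```

so a depth-level change \Delta v is "allowable" when the induced disparity change \Delta d stays within the warping precision and leaves the synthesized view unchanged.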

7.
IEEE Trans Image Process; 28(11): 5740-5753, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31217117

ABSTRACT

In this paper, we present a new Light Field representation for efficient Light Field processing and rendering called Fourier Disparity Layers (FDL). The proposed FDL representation samples the Light Field in the depth (or, equivalently, the disparity) dimension by decomposing the scene as a discrete sum of layers. The layers can be constructed from various types of Light Field inputs, including a set of sub-aperture images, a focal stack, or even a combination of both. From our derivations in the Fourier domain, the layers are simply obtained by a regularized least-squares regression performed independently at each spatial frequency, which is efficiently parallelized in a GPU implementation. Our model is also used to derive a gradient-descent-based calibration step that estimates the input view positions and an optimal set of disparity values required for the layer construction. Once the layers are known, they can simply be shifted and filtered to produce different viewpoints of the scene while controlling the focus and simulating a camera aperture of arbitrary shape and size. Our implementation in the Fourier domain allows real-time Light Field rendering. Finally, direct applications such as view interpolation or extrapolation and denoising are presented and evaluated.
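
A minimal single-frequency sketch of the layer construction (a simplification: the paper works in 2-D and solves all frequencies in parallel on the GPU, and the 1-D horizontal-parallax model and regularization weight lam here are assumptions):

```python
import numpy as np

def fdl_layers_at_frequency(view_spectra, u, d, wx, lam=1e-3):
    """Regularized least squares at one horizontal frequency wx.
    Model: view j at position u_j observes
        y_j = sum_k exp(2i*pi*u_j*d_k*wx) * L_k,
    so the layer spectra L solve a small ridge-regression system."""
    A = np.exp(2j * np.pi * np.outer(u, d) * wx)   # (n_views, n_layers)
    y = np.asarray(view_spectra, dtype=complex)    # (n_views,)
    lhs = A.conj().T @ A + lam * np.eye(len(d))
    return np.linalg.solve(lhs, A.conj().T @ y)    # (n_layers,)
```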

8.
IEEE Trans Image Process; 27(1): 265-280, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28976315

ABSTRACT

Spatio-temporal edge-aware (STEA) filtering methods have recently received increased attention due to their ability to efficiently solve or approximate important image-domain problems in a temporally consistent manner, which is a crucial property for video-processing applications. However, existing STEA methods are currently unsuited to real-time, embedded stream-processing settings due to their high processing latency, large memory and bandwidth requirements, and the need for accurate optical flow to enable filtering along motion paths. To this end, we propose an efficient STEA filtering pipeline based on the recently proposed permeability filter (PF), which offers high quality and halo-reduction capabilities. Using mathematical properties of the PF, we reformulate its temporal extension as a causal, non-linear infinite impulse response filter, which can be efficiently evaluated due to its incremental nature. We bootstrap our own accurate flow using the PF and its temporal extension by interpolating a quasi-dense nearest-neighbour field obtained with an improved PatchMatch algorithm, which employs binarized octal orientation maps (BOOM) descriptors to find correspondences between subsequent frames. Our method is able to create temporally consistent results for a variety of applications such as optical flow estimation, sparse data upsampling, visual saliency computation and disparity estimation. We benchmark our optical flow estimation on the MPI Sintel dataset, where we currently achieve a Pareto-optimal quality-efficiency tradeoff with an average endpoint error of 7.68 at 0.59 s single-core execution time on a recent desktop machine.
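
A sketch of the core idea behind the temporal reformulation (a hedged simplification: motion compensation, normalization and the spatial PF passes are omitted, and `permeability` is assumed to be a precomputed per-pixel map in [0, 1]):

```python
import numpy as np

def temporal_pf_pass(frames, permeability):
    """Causal first-order IIR blend over time:
        out_t = p_t * out_{t-1} + (1 - p_t) * x_t.
    High permeability propagates the past along motion paths; values near zero
    (e.g., at occlusions) reset the filter to the current frame."""
    out = [np.asarray(frames[0], dtype=float)]
    for x, p in zip(frames[1:], permeability[1:]):
        out.append(p * out[-1] + (1.0 - p) * np.asarray(x, dtype=float))
    return np.stack(out)
```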

9.
IEEE Trans Image Process; 22(9): 3329-3341, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23715602

ABSTRACT

Today, stereoscopic 3D (S3D) cinema is already mainstream, and almost all new display devices for the home support S3D content. S3D distribution infrastructure to the home is already partly established in the form of 3D Blu-ray discs, video-on-demand services, and television channels. The necessity of wearing glasses, however, is often considered an obstacle that hinders broader acceptance of this technology in the home. Multiview autostereoscopic displays enable glasses-free perception of S3D content for several observers simultaneously, and support head-motion parallax in a limited range. To support multiview autostereoscopic displays within an already established S3D distribution infrastructure, a synthesis of new views from S3D video is needed. In this paper, a view synthesis method based on image-domain warping (IDW) is presented that synthesizes new views directly from S3D video and functions completely automatically. IDW relies on an automatic and robust estimation of sparse disparities and image saliency information, and enforces target disparities in synthesized images using an image warping framework. Two configurations of the view synthesizer within a transmission and view synthesis framework are analyzed and evaluated. A transmission and view synthesis system using IDW was recently submitted to MPEG's call for proposals on 3D video technology, where it was ranked among the four best-performing proposals.
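
As a conceptual toy only (IDW solves for a smooth image-domain warp from sparse disparities and saliency rather than shifting every pixel, and the disparity sign convention depends on the camera baseline), view synthesis from a dense per-pixel disparity map reduces to a horizontal resampling:

```python
import numpy as np

def shift_view(img, disparity, alpha=0.5):
    """Synthesize an intermediate view by sampling each row at a fraction
    alpha of the per-pixel disparity (nearest-neighbour, no hole filling)."""
    h, w = img.shape[:2]
    cols = np.clip(np.arange(w)[None, :] + np.rint(alpha * disparity).astype(int),
                   0, w - 1)
    return img[np.arange(h)[:, None], cols]
```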
