Search | Portal Regional da BVS

Authenticity and presence: defining perceived quality in VR experiences.

Hameed, Asim; Perkis, Andrew.

Front Psychol ; 15: 1291650, 2024.

Article in English | MEDLINE | ID: mdl-38708019

ABSTRACT

This work expands the existing understanding of quality assessments of VR experiences. Historically, VR quality assessment has focused on presence and immersion, but current discourse emphasizes plausibility and believability as critical for lifelike, credible VR. However, the two concepts are often conflated, leading to confusion. This paper proposes viewing them as subsets of authenticity and presents a structured hierarchy delineating their differences and connections. Additionally, coherence and congruence are presented as complementary quality functions that integrate internal and external logic. The paper considers quality formation in the experience of authenticity inside VR, emphasizing that distinguishing authenticity in terms of precise quality features is essential for accurate assessments. Evaluating quality requires a holistic approach across perceptual, cognitive, and emotional factors. This model provides theoretical grounding for assessing the quality of VR experiences.

Effects of Spatial Speech Presentation on Listener Response Strategy for Talker-Identification.

Uhrig, Stefan; Perkis, Andrew; Möller, Sebastian; Svensson, U Peter; Behne, Dawn M.

Front Neurosci ; 15: 730744, 2021.

Article in English | MEDLINE | ID: mdl-35153653

ABSTRACT

This study investigates effects of spatial auditory cues on human listeners' response strategy for identifying two alternately active talkers ("turn-taking" listening scenario). Previous research has demonstrated subjective benefits of audio spatialization with regard to speech intelligibility and talker-identification effort. So far, the deliberate activation of specific perceptual and cognitive processes by listeners to optimize their task performance has remained largely unexamined. Spoken sentences selected as stimuli were either clean or degraded due to background noise or bandpass filtering. Stimuli were presented via three horizontally positioned loudspeakers: in a non-spatial mode, both talkers were presented through a central loudspeaker; in a spatial mode, each talker was presented through the central or a talker-specific lateral loudspeaker. Participants identified talkers via speeded keypresses and afterwards provided subjective ratings (speech quality, speech intelligibility, voice similarity, talker-identification effort). In the spatial mode, presentations at lateral loudspeaker locations entailed quicker behavioral responses, which were nonetheless significantly slower than in a talker-localization task. Under clean speech, response times globally increased in the spatial vs. non-spatial mode (across all locations); these "response time switch costs," presumably caused by repeated switching of spatial auditory attention between different locations, diminished under degraded speech. No significant effects of spatialization on subjective ratings were found. The results suggested that when listeners could utilize task-relevant auditory cues about talker location, they continued to rely on voice recognition instead of localization of talker sound sources as their primary response strategy. In addition, the presence of speech degradations may have led to increased cognitive control, which in turn compensated for incurring response time switch costs.

Effects of speech transmission quality on sensory processing indicated by the cortical auditory evoked potential.

Uhrig, Stefan; Perkis, Andrew; Behne, Dawn M.

J Neural Eng ; 17(4): 046021, 2020 08 20.

Article in English | MEDLINE | ID: mdl-32422617

ABSTRACT

OBJECTIVE: Degradations of transmitted speech have been shown to affect perceptual and cognitive processing in human listeners, as indicated by the P3 component of the event-related brain potential (ERP). However, research suggests that previously observed P3 modulations might actually be traced back to earlier neural modulations in the time range of the P1-N1-P2 complex of the cortical auditory evoked potential (CAEP). This study investigates whether auditory sensory processing, as reflected by the P1-N1-P2 complex, is already systematically altered by speech quality degradations.

APPROACH: Electrophysiological data from two studies were analyzed to examine effects of speech transmission quality (high-quality, noisy, bandpass-filtered) for spoken words on amplitude and latency parameters of individual P1, N1 and P2 components.

MAIN RESULTS: In the resultant ERP waveforms, an initial P1-N1-P2 manifested at stimulus onset, while a second N1-P2 occurred within the ongoing stimulus. Bandpass-filtered versus high-quality word stimuli evoked a faster and larger initial N1 as well as a reduced initial P2, hence exhibiting effects as early as the sensory stage of auditory information processing.

SIGNIFICANCE: The results corroborate the existence of systematic quality-related modulations in the initial N1-P2, which may potentially have carried over into P3 modulations demonstrated by previous studies. In future psychophysiological speech quality assessments, rigorous control procedures are needed to ensure the validity of P3-based indication of speech transmission quality. An alternative CAEP-based assessment approach is discussed, which promises to be more efficient and less constrained than the established approach based on P3.

Subjects

Speech Perception, Speech, Acoustic Stimulation, Cognition, Electroencephalography, Auditory Evoked Potentials, Humans

Attention driven foveated video quality assessment.

You, Junyong; Ebrahimi, Touradj; Perkis, Andrew.

IEEE Trans Image Process ; 23(1): 200-13, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24184726

ABSTRACT

Contrast sensitivity of the human visual system to visual stimuli can be significantly affected by several mechanisms, e.g., vision foveation and attention. Existing studies on foveation-based video quality assessment only take into account the static foveation mechanism. This paper first proposes an advanced foveal imaging model to generate the perceived representation of video by integrating visual attention into the foveation mechanism. To accurately simulate the dynamic foveation mechanism, a novel approach to predict video fixations is proposed by mimicking the essential functionality of eye movement. Consequently, an advanced contrast sensitivity function, derived from the attention-driven foveation mechanism, is modeled and then integrated into a wavelet-based distortion visibility measure to build a full-reference attention-driven foveated video quality (AFViQ) metric. AFViQ adequately exploits perceptual visual mechanisms in video quality assessment. Extensive evaluation results on several publicly available eye-tracking and video quality databases demonstrate promising performance of the proposed video attention model, fixation prediction approach, and quality metric.

Subjects

Algorithms, Attention/physiology, Biomimetics/methods, Computer-Assisted Image Interpretation/methods, Automated Pattern Recognition/methods, Visual Pattern Recognition/physiology, Video Recording/methods, Humans, Image Enhancement/methods, Reproducibility of Results, Sensitivity and Specificity
