ABSTRACT
The need for instrumental, objective assessment of voice quality is reflected in the growing number of acoustic analysis methods developed for clinical diagnosis and research. In the work reported here, acoustic analyses were performed using two different programs: PRAAT and ANAGRAF. Both are computer programs commonly used in Latin America, in clinical and research settings, to detect and characterize speech, voice, and voice disorders. The purpose was to compare the results obtained with a set of acoustic measurements, many of which are defined similarly in both programs, and to analyze whether normal and pathological voices can be distinguished clinically across levels of severity. A total of 776 voice samples, corresponding to 4 repetitions of the vowel /a/ by 194 Spanish speakers in Buenos Aires, were measured using the available parameters: fundamental frequency, jitter, shimmer, and harmonic-to-noise ratio. The results show similar fundamental frequency (F0) values for both programs. However, jitter, shimmer, and harmonic-to-noise ratio (HNR) values were significantly lower when measured with PRAAT and higher when measured with ANAGRAF. The reliability of the values obtained with both programs decreased significantly as irregularities in the signal increased. It therefore seems important to establish norms for normal and pathological voices in order to guide, or take a further step toward, the validity and reliability of professional practice.
The need for instrumental, objective assessment of voice quality is reflected in the increasing number of acoustic analysis methods developed for clinical diagnosis and research. Acoustic measures of vocal production have received much attention in the literature, and a variety of commercial packages are available. These packages are presented as objective tools with apparently standardized, well-designed measurement protocols and an acceptably low incidence of technical problems. Because different programs use the same labels for similar measurement outputs, such as mean jitter or mean shimmer, users are led to assume that their results are comparable. However, the methodology is not standardized, and there is considerable variability regarding which acoustic parameters should be measured. Furthermore, product documentation often makes it difficult to know how a particular system actually produces its measurements. Little formal information is available about the actual comparability of measures from different analysis packages. In this study, acoustic analysis was performed using two different programs: PRAAT and ANAGRAF. Both are computer programs commonly used in Latin America, in clinical and research settings, to detect and characterize speech and voice disorders. PRAAT was designed by Boersma and Weenink (2009), and ANAGRAF is a locally developed program designed by Gurlekian (1997). The purpose of this work was to compare the results obtained for a set of acoustic parameters, many of which are defined similarly in both programs, and to analyze whether they can distinguish clinically between normal and pathological voices at different severity levels. A total of 776 voice samples, corresponding to 4 repetitions of the vowel /a/ by 194 Spanish speakers in Buenos Aires, were measured using the available parameters: fundamental frequency, jitter, shimmer, and harmonic-to-noise ratio.
The Lilliefors test, with a significance level of 5%, was used to verify the normal distribution of the results of each measurement. For parameters with a normal distribution, the means were compared with the standard values proposed by each program using the t test (significance level of 5%). General results separated by sex are reported. The findings for the analyzed voice samples are presented as the mean, standard deviation, and normal threshold for each parameter, which helps the clinician to immediately assess the findings for a particular patient. Test-retest reliability was calculated for each pair of measures. The results show similar values of fundamental frequency (F0) for both programs. However, relative to the default values proposed by each system, the values of jitter, shimmer, and harmonic-to-noise ratio (HNR) were significantly lower when measured with PRAAT and higher when measured with ANAGRAF. The empirical evidence shows that, if the default values and thresholds of each system are followed, diagnostic accuracy may be compromised by false positives or false negatives. The reliability of the values obtained with both programs was significantly reduced as irregularities in the signal increased. Parameters related to shimmer were more reliable than parameters related to jitter. For the normal data, Pearson r correlations ranged from .72 (ANAGRAF) to .87 (PRAAT) for measures of jitter, with lower correlations for measures of shimmer, .27 (ANAGRAF) to .80 (PRAAT), and for noise measures, .55 (ANAGRAF) to .87 (PRAAT). The large differences found between the measurements from the two systems imply that the accuracy of the measurements is questionable, especially for severely pathological samples. Therefore, it seems important to establish norms for normal and pathological voices for Spanish speakers in Buenos Aires, as a step toward the validity and reliability of professional practice.
Future research should aim to establish differences between vowels, in addition to differences by sex and by system used.
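
As an illustration of the test-retest reliability step, the following is a minimal sketch of a Pearson r computed between two repetitions of the same measure. It assumes that reliability is taken as the correlation between paired repetitions; the abstract does not specify how the pairs were formed. The function name `pearson_r` and the jitter values are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each observation from its own mean.
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical jitter values (%) for five speakers, NOT data from the study:
# one value per speaker from a first and a second repetition of /a/.
first_repetition = [0.31, 0.42, 0.28, 0.55, 0.37]
second_repetition = [0.33, 0.40, 0.30, 0.52, 0.41]

r = pearson_r(first_repetition, second_repetition)
print(f"test-retest r = {r:.2f}")
```

A value of r close to 1 would indicate that a program returns consistent values across repetitions. Running the same computation separately for each program and each parameter would give figures comparable in kind to the ranges reported above.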