Coluna/Columna ; 19(1): 22-25, Jan.-Mar. 2020. tab, graf
ABSTRACT Objective The objective of this study was to analyze the intraobserver and interobserver reliability of the Lenke classification among spine surgeons from the city of Salvador, Bahia. Methods Preoperative imaging examinations (anteroposterior, lateral, and side-bending views) of 20 patients followed at the Spine Outpatient Clinic of the Orthopedic Department of Santa Izabel Hospital, Salvador, Bahia, who had been diagnosed with adolescent idiopathic scoliosis, were selected to be evaluated by 15 spine surgeons on two occasions 30 days apart, for analysis of the intraobserver and interobserver reliability of the Lenke classification. The project was first submitted for ethical analysis to the Institutional Review Board of the Santa Izabel Hospital - Santa Casa de Misericórdia da Bahia / Prof. Dr. Celso Figueirôa and approved under voucher number 002650/2019. All the participants signed the Informed Consent Form (ICF). Results Analyzing agreement with the Kappa index, interobserver reliabilities of 0.755, 0.525 and 0.840 were obtained for the curve type, the lumbar modifier and the sagittal modifier, respectively, while the intraobserver reliabilities for the same parameters were 0.921, 0.370 and 0.929. Conclusion For the study population, the reliability of the Lenke classification was moderate to almost perfect. Level of evidence III; Interobserver and intraobserver reliability.

RESUMO Objetivo O objetivo do presente trabalho consiste em analisar a confiabilidade intraobservador e interobservador da classificação de Lenke entre cirurgiões de coluna da cidade de Salvador, Bahia. Métodos Foram selecionados exames de imagem pré-operatórios (frente, perfil e inclinações laterais) de 20 pacientes acompanhados no Ambulatório de Coluna do Departamento de Ortopedia do Hospital Santa Izabel, Salvador, Bahia, com diagnóstico de escoliose idiopática do adolescente, para serem avaliados por 15 cirurgiões de coluna, em dois momentos, com intervalo de 30 dias, para análise da confiabilidade intraobservador e interobservador da Classificação de Lenke. O projeto foi, antes de tudo, submetido a análise de ética no CEP Hospital Santa Izabel - Santa Casa de Misericórdia da Bahia/Prof. Dr. Celso Figueirôa e aprovado com número de comprovante 002650/2019. Todos os participantes assinaram o Termo de Livre Consentimento Esclarecido (TCLE). Resultados Analisando-se a concordância por meio do índice Kappa, obteve-se uma reprodutibilidade interobservador de 0,755, 0,525 e 0,840, respectivamente, para o tipo de curva, modificador lombar e sagital, já a confiabilidade intraobservador é de 0,921, 0,370 e 0,929, respectivamente para o tipo de curva, modificador lombar e modificador sagital. Conclusão Para a população em estudo, a confiabilidade da classificação de Lenke é de moderada a quase perfeita. Nível de evidência III; Reprodutibilidade interobservador e intraobservador.

RESUMEN Objetivo El objetivo del presente trabajo consiste en analizar la confiabilidad intraobservador e interobservador de la clasificación de Lenke entre cirujanos de columna de la ciudad de Salvador, Bahia. Métodos Fueron seleccionados exámenes de imagen preoperatorios (frente, perfil e inclinaciones laterales) de 20 pacientes acompañados en el Ambulatorio de Columna del Departamento de Ortopedia del Hospital Santa Izabel, Salvador, Bahia, con diagnóstico de escoliosis idiopática del adolescente, para ser evaluados por 15 cirujanos de columna, en dos momentos, con intervalo de 30 días, para análisis de la confiabilidad intraobservador e interobservador de la Clasificación de Lenke. El proyecto fue, antes que nada, sometido a análisis de ética en el CEP Hospital Santa Izabel - Santa Casa de Misericordia de Bahia/Prof. Dr. Celso Figueirôa y aprobado con número de comprobante 002650/2019. Todos los participantes firmaron el Término de Libre Consentimiento Esclarecido (TCLE). Resultados Analizándose la concordancia por medio del índice Kappa, se obtuvo una reproductibilidad interobservador de 0,755, 0,525 e 0,840, respectivamente, para el tipo de curva, modificador lumbar y sagital, ya la confiabilidad intraobservador es de 0,921, 0,370 e 0,929, respectivamente para el tipo de curva, modificador lumbar y modificador sagital. Conclusión Para la población en estudio, la confiabilidad de la clasificación de Lenke es de moderada a casi perfecta. Nivel de evidencia III; Reproductibilidad interobservador e intraobservador.

Although microscopic analysis of tissue slides has been the basis for disease diagnosis for decades, intra- and inter-observer variability remains an unresolved issue. The recent introduction of digital scanners has made many whole slide images (WSIs) accessible to researchers, which allows deep learning to be used in the analysis of tissue images. In the present study, we investigated the feasibility of a deep learning-based, fully automated, computer-aided diagnosis system using WSIs from a stomach adenocarcinoma dataset. Three different convolutional neural network architectures were tested to determine the best architecture for the tissue classifier. Each network was trained to classify small tissue patches as normal or tumor. Based on the patch-level classification, tumor probability heatmaps can be overlaid on tissue images. We observed three different tissue patterns: clear normal, clear tumor and ambiguous cases. We suggest that pre-evaluating the status of the WSIs allows longer inspection time to be assigned to ambiguous cases than to clear normal cases, increasing the accuracy and efficiency of histopathologic diagnosis. When the classifier was tested with a completely different WSI dataset, the performance was not optimal because of the different tissue preparation quality. Including a small amount of data from the new dataset in training greatly improved performance on the new dataset. These results indicate that a WSI dataset should include tissues prepared under many different conditions in order to construct a generalized tissue classifier. Thus, multi-national/multi-center datasets should be built for the application of deep learning in real-world medical practice.

Trends psychiatry psychother. (Impr.) ; 41(3): 218-226, July-Sept. 2019. tab, graf
Abstract Objectives: To translate and back-translate the Autism Diagnostic Observation Schedule (ADOS) into Brazilian Portuguese, to assess its cross-cultural semantic equivalence, and to verify indicators of quality of the final version by analyzing the inter-rater reliability of the ADOS scores. Methods: This study had three stages: 1) translation and back-translation; 2) semantic equivalence analysis; and 3) pre-test to verify the agreement between mental health specialists and an ADOS senior examiner regarding the scoring procedure. Authorization to translate and carry out the cultural adaptation of the instrument was first obtained from Western Psychological Services, publisher of the instrument. Results: The main preliminary results pointed to good equivalence between the original English version and the Brazilian version following the cultural adaptation process. Some semantic differences were found between the original version and the back-translation into English, but they did not interfere with the first translation into Portuguese or with the final version. One of the limitations of the study was the small sample size; for that reason, the inter-rater reliability of the ADOS scores between the specialists and the senior examiner, measured with the kappa coefficient, was adequate for 7 out of 10 areas. Conclusions: We conclude that the creation of an official Brazilian version of the ADOS will help to strengthen clinical and scientific research into ASD, and deter the use of other unauthorized versions of the ADOS in the country.

Resumo Objetivos: Traduzir e retrotraduzir a Autism Diagnostic Observation Schedule (ADOS) para a língua portuguesa do Brasil, verificar sua equivalência semântica transcultural e verificar indicadores de qualidade da versão final analisando a confiabilidade interavaliadores na pontuação da ADOS. Métodos: O estudo teve três etapas: 1) tradução e retrotradução; 2) análise de equivalência semântica; e 3) pré-teste para verificar a concordância entre especialistas em saúde mental e um examinador sênior em relação ao procedimento de pontuação. A realização do estudo foi feita com a autorização da Western Psychological Services, distribuidor oficial do instrumento. Resultados: Os principais resultados preliminares indicaram uma boa equivalência entre a versão original em inglês e a versão brasileira após o processo de adaptação cultural. Algumas diferenças semânticas foram encontradas entre a versão original e a retrotradução, mas que não interferiram na primeira tradução para o português nem na versão final. Uma das limitações do estudo foi o tamanho amostral pequeno; em razão disso, a confiabilidade interavaliadores entre as pontuações da ADOS dadas pelos especialistas e pelo examinador sênior utilizando o coeficiente kappa foi adequada para 7 das 10 áreas. Conclusão: Conclui-se que, com a versão brasileira da ADOS, oficializa-se uma versão única da escala em português, fortalecendo os campos clínicos e científicos de pesquisa em TEA e impedindo que no país sejam utilizadas outras versões não autorizadas da ADOS.

Rev. bras. ter. intensiva ; 31(3): 354-360, jul.-set. 2019. tab, graf
RESUMO Objetivo: Avaliar a concordância entre médicos intensivistas que receberam treinamento semelhante para utilização do ultrassom pulmonar à beira do leito, na identificação das linhas B pulmonares visualizadas em tempo real, a fim de verificar a reprodutibilidade do método. Métodos: Foram analisados 67 pacientes que apresentaram alguma piora ventilatória identificada nas últimas 12 horas da realização do ultrassom pulmonar, no período de novembro de 2016 a março de 2017, estando todos internados em um centro de terapia intensiva de um hospital privado de Belo Horizonte (MG). Os ultrassons pulmonares foram realizados por três profissionais diferentes, denominados A, B e C, sendo o intervalo de tempo entre cada ultrassom pulmonar menor que 3 horas. As zonas torácicas visualizadas foram apenas as anteriores e laterais, sendo definidas como zonas anteriores (1) direita e esquerda (Z1D e Z1E, respectivamente), delimitadas pela clavícula, esterno, linha horizontal perpendicular ao processo xifoide e linha axilar anterior; e zonas laterais (2) direita e esquerda (Z2D e Z2E, respectivamente), abrangendo a área entre linha axilar anterior e posterior lateralmente, tendo como limite inferior a mesma linha horizontal correspondente à altura do processo xifoide. Uma zona pulmonar era considerada positiva para linhas B, quando houvesse visualização de três ou mais dessas linhas, caracterizando possível síndrome interstício-alveolar. Por meio do valor Kappa, avaliamos a concordância dentre as quatro zonas, conforme execução de cada dupla de profissional (AB, AC e BC). Resultados: Cerca de 80% das áreas visualizadas tiveram concordância classificada como moderada a substancial, com Kappa variando de 0,41 - 0,79 (p < 0,05; IC95%). Os maiores graus de concordância ocorreram nas zonas superiores Z1D e Z1E entre os subgrupos AC e BC, com Kappa em torno de 0,65 (p < 0,001). Já a Z2E apresentou uma das menores concordâncias, com Kappa de 0,36.
Conclusão: A possível limitação do ultrassom pulmonar quanto ao efeito examinador-dependente não se mostrou presente neste trabalho, sugerindo boa reprodutibilidade dessa modalidade diagnóstica à beira do leito.

ABSTRACT Objective: To evaluate the agreement between intensive care physicians with similar training in the use of bedside lung ultrasonography in identifying pulmonary B lines, visualized in real time, to verify the reproducibility of the method. Methods: A total of 67 patients in whom some ventilatory deterioration had been identified in the 12 hours preceding lung ultrasonography were analyzed in the period from November 2016 to March 2017; all were admitted to an intensive care unit of a private hospital in Belo Horizonte, Minas Gerais. The lung ultrasonographies were performed by three different professionals, termed A, B and C, and the time interval between each lung ultrasonography was less than 3 hours. Only the anterior and lateral chest zones were visualized. The right and left anterior (1) zones (Z1R and Z1L, respectively) were delimited by the clavicle, the sternum, the horizontal line perpendicular to the xiphoid process and the anterior axillary line. The right and left lateral (2) zones (Z2R and Z2L, respectively) covered the lateral area between the anterior and posterior axillary lines, with the lower limit being the same horizontal line at the height of the xiphoid process. A lung zone was considered positive for B lines upon visualization of three or more of these lines, suggesting the presence of alveolar-interstitial syndrome. Using the Kappa value, we evaluated the agreement in each of the four zones for each pair of professionals (AB, AC and BC). Results: Approximately 80% of the areas that were visualized showed moderate to substantial agreement, with Kappa values ranging from 0.41 to 0.79 (p < 0.05; 95% CI). The highest levels of agreement occurred in the upper zones Z1R and Z1L between subgroups AC and BC, with a Kappa of approximately 0.65 (p < 0.001). In turn, Z2L showed one of the lowest agreements, with a Kappa of 0.36.
Conclusion: The possible limitation of an examiner-dependent effect on lung ultrasounds was not found in this study, suggesting the good reproducibility of this diagnostic modality at the bedside.
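
The verbal labels attached to Kappa values in these abstracts ("moderate", "substantial", "almost perfect") match the commonly used Landis and Koch bands. The sketch below assumes those bands; the abstracts do not state which scale the authors applied, and the function name `landis_koch_label` is only illustrative.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to its Landis and Koch agreement label.

    Assumption: the Landis and Koch bands are used; other scales
    exist and would give different labels near the boundaries.
    """
    if kappa > 1:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Kappa values quoted in the lung ultrasonography abstract above.
for value in (0.41, 0.65, 0.36):
    print(value, landis_koch_label(value))
```

Under these bands, 0.41 and 0.65 fall in the moderate and substantial ranges, while 0.36 falls in the fair range.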

Rev. Paul. Pediatr. (Ed. Port., Online) ; 37(3): 325-331, July-Sept. 2019. tab
ABSTRACT Objective: To translate the Early Clinical Assessment of Balance (ECAB), an assessment scale developed specifically for children and adolescents with cerebral palsy, into Brazilian Portuguese; to evaluate semantic, idiomatic, experiential and conceptual equivalences; and to examine the face validity and the intra- and inter-examiner reliability of the Brazilian version. Methods: The study comprised the following steps: translation by two independent translators; synthesis of translations; back-translation into English; analysis of back-translations by a multidisciplinary committee and the author of the test to develop the final version of the test; test application training; administration of the translated version of the ECAB (videotaped) to 60 children and adolescents with cerebral palsy; and intra- and inter-examiner reliability assessment. Reliability was assessed by the intraclass correlation coefficient (ICC). Results: The discrepancies found were related mainly to semantic equivalence and, therefore, there was no need to make cultural adaptations in any of the 13 items on the scale. The rate of agreement was greater than 90% and the reliability of the ECAB-Portuguese total score was excellent both for the intra-rater test (ICC=1.00) and for the inter-rater test (ICC=0.998). Likewise, the reliability of each of the scale items was also excellent. Conclusions: The Portuguese translation of the ECAB provides a tool for the specific evaluation of balance in children and adolescents with cerebral palsy with different levels of functioning.

RESUMO Objetivo: Traduzir a Early Clinical Assessment of Balance (ECAB), escala de avaliação do equilíbrio desenvolvida especificamente para crianças e adolescentes com paralisia cerebral (PC) para a língua portuguesa do Brasil; avaliar as equivalências semânticas, idiomáticas, experiencial e conceituais; e examinar a validade de face e a confiabilidade intra e interavaliadores da versão brasileira. Métodos: Estudo envolveu tradução do instrumento por dois tradutores independentes; síntese das traduções; retrotradução para o inglês; análise das retrotraduções por um comitê multidisciplinar; treinamento; administração da versão traduzida da ECAB (gravadas em vídeo) em 60 crianças e adolescentes com PC; e avaliação da confiabilidade intra e interexaminadores. A confiabilidade foi avaliada por meio do coeficiente de correlação intraclasse (CCI). Resultados: As discrepâncias encontradas foram referentes principalmente à equivalência semântica e, portanto, não houve necessidade da realização de adaptações culturais em nenhum dos 13 itens da escala. A taxa de concordância foi maior que 90%, e a confiabilidade do escore total da ECAB-português foi excelente tanto para o teste intra-avaliador (CCI=1,00) quanto para interavaliadores (CCI=0,998). Da mesma forma, a avaliação da confiabilidade de cada um dos itens da escala também foi excelente. Conclusões: A versão traduzida da ECAB para o português disponibiliza, para os profissionais da reabilitação infantil, um instrumento confiável de avaliação do equilíbrio específico para crianças e adolescentes com PC com diferentes níveis de funcionalidade.

J. pediatr. (Rio J.) ; 95(3): 321-327, May-June 2019. tab, graf
Abstract Objective: To translate and culturally adapt the modified Bristol Stool Form Scale for children into Brazilian Portuguese, and to evaluate the reproducibility of the translated version. Methods: The stage of translation and cross-cultural adaptation was performed according to an internationally accepted methodology, including the translation, back-translation, and pretest application of the translated version to a sample of 74 children to evaluate the degree of understanding. The reproducibility of the translated scale was assessed by applying the final version of the Brazilian Portuguese modified Bristol Stool Form Scale for children to a sample of 64 children and 25 healthcare professionals, who were asked to match a randomly selected description from the translated scale with the corresponding representative illustration of the stool type. Results: The final version of the Brazilian Portuguese modified Bristol Stool Form Scale for children was shown to be reproducible, since almost complete agreement (κ > 0.8) was obtained between the translated descriptions and illustrations of the stool types, both among the children and among the group of specialists. The Brazilian Portuguese modified Bristol Stool Form Scale for children was shown to be reliable in providing very similar results for the same respondents at different times and for different examiners. Conclusion: The Brazilian Portuguese modified Bristol Stool Form Scale for children is reproducible; it can be applied in clinical practice and in scientific research in Brazil.

Resumo Objetivo: Traduzir e adaptar culturalmente a Escala de Bristol para Consistência de Fezes modificada para crianças para o português (Brasil) e avaliar a reprodutibilidade da versão traduzida. Métodos: O estágio de tradução e adaptação intercultural foi feito de acordo com uma metodologia internacionalmente aceita, incluiu a tradução, retrotradução e aplicação de pré-teste da versão traduzida a uma amostra de 74 crianças para avaliar o nível de entendimento. A avaliação da reprodutibilidade da escala traduzida foi feita com a aplicação da versão final da Escala de Bristol para Consistência de Fezes modificada em português (Brasil) para crianças a uma amostra de 64 crianças e 25 profissionais de saúde, que tiveram de correlacionar uma descrição aleatoriamente selecionada da escala traduzida com a ilustração representativa correspondente do tipo de fezes. Resultados: A versão final da Escala de Bristol para Consistência de Fezes modificada para crianças em português (Brasil) foi comprovadamente reproduzível, pois foi obtida quase uma concordância total (k > 0,8) entre as descrições e ilustrações traduzidas dos tipos de fezes, entre as crianças e o grupo de especialistas. A Escala de Bristol para Consistência de Fezes modificada para crianças em português (Brasil) mostrou-se confiável em proporcionar resultados muito semelhantes para os mesmos entrevistados em diferentes momentos e para diferentes examinadores. Conclusão: A Escala de Bristol para Consistência de Fezes modificada para crianças em português (Brasil) é reproduzível e pode ser aplicada na prática clínica e em pesquisa científica no Brasil.

J. bras. pneumol ; 45(5): e20180032, 2019. tab, graf
ABSTRACT Objective: To investigate the accuracy of chest auscultation in detecting abnormal respiratory mechanics. Methods: We evaluated 200 mechanically ventilated patients in the immediate postoperative period after cardiac surgery. We assessed respiratory system mechanics - static compliance of the respiratory system (Cst,rs) and respiratory system resistance (R,rs) - after which two independent examiners, blinded to the respiratory system mechanics data, performed chest auscultation. Results: Neither decreased/abolished breath sounds nor crackles were associated with decreased Cst,rs (≤ 60 mL/cmH2O), regardless of the examiner. The overall accuracy of chest auscultation was 34.0% and 42.0% for examiners A and B, respectively. The sensitivity and specificity of chest auscultation for detecting decreased/abolished breath sounds or crackles were 25.1% and 68.3%, respectively, for examiner A, versus 36.4% and 63.4%, respectively, for examiner B. Based on the judgments made by examiner A, there was a weak association between increased R,rs (≥ 15 cmH2O/L/s) and rhonchi or wheezing (ϕ = 0.31, p < 0.01). The overall accuracy for detecting rhonchi or wheezing was 89.5% and 85.0% for examiners A and B, respectively. The sensitivity and specificity for detecting rhonchi or wheezing were 30.0% and 96.1%, respectively, for examiner A, versus 10.0% and 93.3%, respectively, for examiner B. Conclusions: Chest auscultation does not appear to be an accurate diagnostic method for detecting abnormal respiratory mechanics in mechanically ventilated patients in the immediate postoperative period after cardiac surgery.

RESUMO Objetivo: Investigar a acurácia da ausculta torácica na detecção de mecânica respiratória anormal. Métodos: Foram avaliados 200 pacientes sob ventilação mecânica no pós-operatório imediato de cirurgia cardíaca. Foi avaliada a mecânica do sistema respiratório - complacência estática do sistema respiratório (Cest,sr) e resistência do sistema respiratório (R,sr) - e, em seguida, dois examinadores independentes, que desconheciam os dados referentes à mecânica do sistema respiratório, realizaram a ausculta torácica. Resultados: Nem murmúrio vesicular diminuído/abolido nem crepitações foram associados à Cest,sr reduzida (≤ 60 ml/cmH2O), independentemente do examinador. A acurácia global da ausculta torácica foi de 34,0% e 42,0% para os examinadores A e B, respectivamente. A sensibilidade e a especificidade da ausculta torácica para a detecção de murmúrio vesicular diminuído/abolido e/ou crepitações foi de 25,1% e 68,3%, respectivamente, para o examinador A, versus 36,4% e 63,4%, respectivamente, para o examinador B. Com base nos julgamentos feitos pelo examinador A, houve uma fraca associação entre R,sr aumentada (≥ 15 cmH2O/l/s) e roncos e/ou sibilos (ϕ = 0,31, p < 0,01). A acurácia global para a detecção de roncos e/ou sibilos foi de 89,5% e 85,0% para os examinadores A e B, respectivamente. A sensibilidade e a especificidade para a detecção de roncos e/ou sibilos foi de 30,0% e 96,1%, respectivamente, para o examinador A, versus 10,0% e 93,3%, respectivamente, para o examinador B. Conclusões: A ausculta torácica não parece ser um método diagnóstico acurado para a detecção de mecânica respiratória anormal em pacientes sob ventilação mecânica no pós-operatório imediato de cirurgia cardíaca.
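
The accuracy, sensitivity and specificity figures in this abstract all derive from a 2x2 table that crosses the auscultation finding with the reference measurement. The sketch below shows the arithmetic under stated assumptions: the counts are hypothetical (the abstract does not report the underlying tables), and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of a diagnostic 2x2 table (test result vs. reference)."""
    tp: int  # finding present, reference abnormal
    fp: int  # finding present, reference normal
    fn: int  # finding absent, reference abnormal
    tn: int  # finding absent, reference normal


def sensitivity(t: TwoByTwo) -> float:
    """Proportion of reference-abnormal cases that the test flags."""
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    """Proportion of reference-normal cases that the test clears."""
    return t.tn / (t.tn + t.fp)


def accuracy(t: TwoByTwo) -> float:
    """Proportion of all cases classified correctly."""
    return (t.tp + t.tn) / (t.tp + t.fp + t.fn + t.tn)


# Hypothetical counts for 200 patients, NOT taken from the study.
table = TwoByTwo(tp=30, fp=10, fn=20, tn=140)
print(f"sensitivity={sensitivity(table):.3f}")
print(f"specificity={specificity(table):.3f}")
print(f"accuracy={accuracy(table):.3f}")
```

Overall accuracy is a prevalence-weighted mix of sensitivity and specificity, so it can be high when the abnormal finding is rare and specificity is high, even if sensitivity is low.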

Although papillary thyroid carcinoma (PTC)–type nuclear changes are the most reliable morphological feature in the diagnosis of PTC, the nuclear assessment used to identify these changes is highly subjective. Here, we report a noninvasive encapsulated thyroid tumor with a papillary growth pattern measuring 23 mm at its largest diameter with a nuclear score of 2 in a 26-year-old man. After undergoing left lobectomy, the patient was diagnosed with an encapsulated PTC. However, a second opinion consultation suggested an alternative diagnosis of follicular adenoma with papillary hyperplasia. When providing a third opinion, we identified a low MIB-1 labeling index and a heterozygous point mutation in the KRAS gene but not the BRAF gene. We speculated that this case is an example of a novel borderline tumor with a papillary structure. Introduction of the new terminology “noninvasive encapsulated papillary RAS-like thyroid tumor (NEPRAS)” without the word “cancer” might relieve the psychological burden of patients in a way similar to the phrase “noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP).”

Article in Korean | WPRIM (Western Pacific) | ID: wprim-758477


OBJECTIVE: The Korean Triage and Acuity Scale (KTAS) has been used in all emergency departments (EDs) since 2016. Medical personnel assign treatment priority based on KTAS levels. Although most triage assignments in Korea are performed by nurses, inter-rater agreement with KTAS has not been reported. This study aimed to verify the agreement of KTAS triage levels between emergency physicians (EPs) and nurses. METHODS: This was a prospective, single-center study at an academic tertiary medical center. When a patient visited the ED, the triage nurse and the EP saw the patient together. The nurse took the history, performed the physical examination including vital sign measurements, and then recorded the KTAS level. The EP did not interfere with the nurse's decision and also assigned a KTAS level. The designated codes and levels were compared. When there was a discrepancy, the EP recorded the detailed reasons for the disagreement. RESULTS: Comparisons were performed for 928 patients. The number of patients in each KTAS level was 95 (10.2%) in level I, 263 (28.3%) in level II, 348 (37.5%) in level III, 144 (15.5%) in level IV, and 78 (8.4%) in level V. The overall agreement was 761 (82%), and the Kappa coefficient was 0.691. Among the disagreements, errors in history taking were the most frequent (131, 78.4%). Insufficient understanding of disease pathophysiology, inaccurate neurological examinations, and failure to consider vital signs other than blood pressure were each encountered in 12 (7.2%). CONCLUSION: The agreement rate between EPs and nurses using KTAS was high (κ=0.691, substantial agreement).
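
The abstract above reports both a raw agreement rate (761 of 928) and a Kappa coefficient, which corrects the raw rate for agreement expected by chance. As a minimal sketch of how unweighted Cohen's kappa is computed for two raters, assuming paired categorical ratings (the ratings below are hypothetical, and the abstract does not say whether a weighted variant was used):

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases with identical ratings.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical triage levels for ten patients, NOT data from the study.
nurse = [1, 2, 2, 3, 3, 3, 4, 5, 3, 2]
physician = [1, 2, 3, 3, 3, 3, 4, 5, 2, 2]
print(round(cohen_kappa(nurse, physician), 3))
```

In this toy example the raters agree on 8 of 10 cases, but because some of that agreement is expected by chance, kappa is lower than 0.8.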

Intestinal Research ; : 387-397, 2019.
Article in English | WPRIM (Western Pacific) | ID: wprim-764152


BACKGROUND/AIMS: The existing histological classifications for the interpretation of small intestinal biopsies are based on qualitative parameters with high intraobserver and interobserver variations. We have developed and propose a quantitative histological classification system for the assessment of intestinal mucosal biopsies. METHODS: We performed a computer-assisted quantitative histological assessment of digital images of duodenal biopsies from 137 controls and 124 patients with celiac disease (CeD) (derivation cohort). From the receiver operating characteristic curve analysis, followed by multivariate and logistic regression analyses, we identified parameters for differentiating control biopsies from those of the patients with CeD. We repeated the quantitative histological analysis in a validation cohort (105 controls and 120 patients with CeD). On the basis of the results, we propose a quantitative histological classification system. The new classification was compared with the existing histological classifications for interobserver and intraobserver agreements by a group of qualified pathologists. RESULTS: Among the histological parameters, intraepithelial lymphocyte count of ≥25/100 epithelial cells, adjusted villous height fold change of ≤0.7, and crypt depth-to-villous height ratio of ≥0.5 showed good discriminative power between the mucosal biopsies from the patients with CeD and those from the controls, with 90.3% sensitivity, 93.5% specificity, and an area under the curve of 96.2%. Compared with the existing histological classifications, our quantitative histological classification showed the highest intraobserver (69.7%–85.03%) and interobserver (24.6%–71.5%) agreements. CONCLUSIONS: Quantitative assessment increases the reliability of the histological assessment of mucosal biopsies in patients with CeD. Such a classification system may be used for clinical trials in patients with CeD.

Clinical Endoscopy ; : 129-136, 2019.
Article in English | WPRIM (Western Pacific) | ID: wprim-763417


Inflammatory bowel disease (IBD) is considered a chronic condition characterized by mucosal or transmural inflammation in the gastrointestinal tract. Endoscopic diagnosis and surveillance in patients with IBD have become crucial. In addition, endoscopy is a useful modality in estimation and evaluation of the disease, treatment results, and efficacy of treatment delivery and surveillance. In relation to these aspects, endoscopic disease activity has been commonly estimated in clinical practices and trials. At present, many endoscopic indices of ulcerative colitis have been introduced, including the Truelove and Witts Endoscopy Index, Baron Index, Powell-Tuck Index, Sutherland Index, Mayo Clinic Endoscopic Sub-Score, Rachmilewitz Index, Modified Baron Index, Endoscopic Activity Index, Ulcerative Colitis Endoscopic Index of Severity, Ulcerative Colitis Colonoscopic Index of Severity, and Modified Mayo Endoscopic Score. Endoscopic indices have been also suggested for Crohn's disease, such as the Crohn's Disease Endoscopic Index of Severity, Simple Endoscopic Score for Crohn's Disease, and Rutgeerts Postoperative Endoscopic Index. However, most endoscopic indices have not been validated owing to the complexity of their parameters and inter-observer variations. Therefore, a chronological approach for understanding the various endoscopic indices relating to IBD is needed to improve the management.

Article in English | WPRIM (Western Pacific) | ID: wprim-786131


BACKGROUND: Assessment of programmed cell death-ligand 1 (PD-L1) immunohistochemical staining is used for treatment decisions in non-small cell lung cancer (NSCLC) regarding use of PD-L1/programmed cell death protein 1 (PD-1) immunotherapy. The reliability of the PD-L1 22C3 pharmDx assay is critical in guiding clinical practice. The Cardiopulmonary Pathology Study Group of the Korean Society of Pathologists investigated the interobserver reproducibility of PD-L1 staining with 22C3 pharmDx in NSCLC samples. METHODS: Twenty-seven pathologists individually assessed the tumor proportion score (TPS) for 107 NSCLC samples. Each case was divided into three levels based on TPS: <1%, 1%–49%, and ≥50%. RESULTS: The intraclass correlation coefficient for TPS was 0.902±0.058. Weighted κ coefficient for 3-step assessment was 0.748±0.093. The κ coefficients for 1% and 50% cut-offs were 0.633 and 0.834, respectively. There was a significant association between interobserver reproducibility and experience (formal PD-L1 training, more experience for PD-L1 assessment, and longer practice duration on surgical pathology), histologic subtype, and specimen type. CONCLUSIONS: Our results indicate that PD-L1 immunohistochemical staining provides a reproducible basis for decisions on anti–PD-1 therapy in NSCLC.

Ultrasonography ; : 374-376, 2019.
Article in English | WPRIM (Western Pacific) | ID: wprim-761988


Ultrasonography ; : 172-180, 2019.
Article in English | WPRIM (Western Pacific) | ID: wprim-761969


PURPOSE: The purpose of this study was to record and evaluate interobserver agreement as quality control for the modified categorization of screening breast ultrasound developed by the Alliance for Breast Cancer Screening in Korea (ABCS-K) for the Mammography and Ultrasonography Study for Breast Cancer Screening Effectiveness (MUST-BE) trial. METHODS: Eight breast radiologists with 4-16 years of experience participated in 2 rounds of quality control testing for the MUST-BE trial. Two investigators randomly selected 125 and 100 cases of breast lesions with different ratios of malignant and benign lesions. Two versions of the modified categorization were tested. The initially modified classification was developed after the first quality control workshop, and the re-modified classification was developed after the second workshop. The re-modified categorization established by ABCS-K added size criteria and the anterior-posterior ratio compared with the initially modified classification. After a brief lecture on the modified categorization system prior to each quality control test, the eight radiologists independently categorized the lesions using the modified categorization. Interobserver agreement was measured using kappa statistics. RESULTS: The overall kappa values for the modified categorizations indicated moderate to substantial degrees of agreement (initially modified categorization and re-modified categorization: κ=0.52 and κ=0.63, respectively). The kappa values for the subcategories of category 4 were 0.37 (95% confidence interval [CI], 0.24 to 0.52) and 0.39 (95% CI, 0.31 to 0.49), respectively. The overall kappa values for both the initially modified categorization and the re-modified categorization indicated a substantial degree of agreement when dichotomizing the interpretation as benign or suspicious. CONCLUSION: The preliminary results demonstrated acceptable interobserver agreement for the modified categorization.

Article in English | WPRIM (Western Pacific) | ID: wprim-741842


PURPOSE: To evaluate intra- and inter-observer variability and guideline adherence amongst pediatricians in treating children aged between 4 and 18 years referred with recurrent abdominal pain (RAP) without red flags. METHODS: The first part of the study is a retrospective single-center cohort study. The diagnostic work-ups of eight pediatricians were compared to the national guidelines. Intra- and inter-observer variability were examined by Cramer's V test. Intra-observer variability was defined as the amount of variation within a pediatrician and inter-observer variability as the amount of variation between pediatricians in the application of diagnostic work-up in children with RAP. Prospectively, the same pediatricians were requested to provide a report on their management strategy for a fictitious case, to check for similarities with the retrospective diagnostic work-up. RESULTS: A total of 10 patients per pediatrician were analyzed. Retrospectively, a (very) weak association between pediatricians' diagnostic work-ups was found (0.22), which implies high inter-observer variability. The intra-observer association in diagnostic work-up was moderate (range, 0.35–0.46). The Cramer's V of 0.60 in diagnostic work-up between pediatricians in the fictitious case implied the presence of a moderately strong association and lower inter-observer variability than in the retrospective study. Adherence to the guideline was 66.8%. CONCLUSION: We found high intra- and inter-observer variability and moderate guideline adherence in daily clinical practice amongst pediatricians in treating children with RAP in a teaching hospital.
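As an illustration of the Cramer's V statistic named in the abstract above, the following is a minimal sketch of the textbook formula, V = sqrt(chi2 / (n * (min(rows, cols) - 1))). The function name and the example tables are hypothetical and do not come from the study; the sketch also assumes that every row and column total is non-zero.

```python
import math


def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows of counts.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by the sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical tables: rows could be two pediatricians, columns two work-up choices.
perfect = cramers_v([[10, 0], [0, 10]])   # completely different choices
independent = cramers_v([[5, 5], [5, 5]])  # identical distribution of choices
```

V ranges from 0 (no association) to 1 (complete association), which is how the values of 0.22 and 0.60 reported above are read.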

Article in English | WPRIM (Western Pacific) | ID: wprim-741405


OBJECTIVE: To evaluate the interpretive performance and inter-observer agreement on digital mammograms among radiologists and to investigate whether radiologist characteristics affect performance and agreement. MATERIALS AND METHODS: The test sets consisted of full-field digital mammograms and contained 12 cancer cases among 1000 total cases. Twelve radiologists independently interpreted all mammograms. Performance indicators included the recall rate, cancer detection rate (CDR), positive predictive value (PPV), sensitivity, specificity, false positive rate (FPR), and area under the receiver operating characteristic curve (AUC). Inter-radiologist agreement was measured. The radiologist characteristics recorded included number of years of experience interpreting mammography, fellowship training in breast imaging, and annual volume of mammography interpretation. RESULTS: The mean and range of interpretive performance were as follows: recall rate, 7.5% (3.3–10.2%); CDR, 10.6 per 1000 examinations (8.0–12.0); PPV, 15.9% (8.8–33.3%); sensitivity, 88.2% (66.7–100%); specificity, 93.5% (90.6–97.8%); FPR, 6.5% (2.2–9.4%); and AUC, 0.93 (0.82–0.99). Radiologists who annually interpreted more than 3000 screening mammograms tended to exhibit higher CDRs and sensitivities than those who interpreted fewer than 3000 mammograms (p = 0.064). The inter-radiologist agreement showed a percent agreement of 77.2–88.8% and a kappa value of 0.27–0.34. Radiologist characteristics did not affect agreement. CONCLUSION: The interpretive performance of the radiologists fulfilled the mammography screening goal of the American College of Radiology, although there was inter-observer variability. Radiologists who interpreted more than 3000 screening mammograms annually tended to perform better than radiologists who did not.
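The performance indicators listed in the abstract above can all be derived from one reader's 2x2 table of recalls against cancer status. The sketch below uses the usual screening definitions, which are assumed here rather than taken from the paper (in particular, PPV is computed over recalled cases). The class name, function name and reader counts are hypothetical; only the 12 cancers among 1000 cases come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class ReaderCounts:
    tp: int  # cancers recalled
    fp: int  # non-cancers recalled
    fn: int  # cancers not recalled
    tn: int  # non-cancers not recalled


def screening_metrics(c):
    """Return the per-reader indicators as fractions (CDR is per 1000 cases)."""
    total = c.tp + c.fp + c.fn + c.tn
    recalled = c.tp + c.fp
    return {
        "recall_rate": recalled / total,
        "cdr_per_1000": 1000 * c.tp / total,
        "ppv": c.tp / recalled,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "fpr": c.fp / (c.fp + c.tn),
    }


# Hypothetical reader on a 1000-case set with 12 cancers: 11 found, 64 false recalls.
example = screening_metrics(ReaderCounts(tp=11, fp=64, fn=1, tn=924))
```

Specificity and FPR always sum to 1, which is consistent with the paired means of 93.5% and 6.5% reported above. AUC is omitted because it needs graded confidence ratings rather than a single 2x2 table.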

An. bras. dermatol ; 93(6): 852-858, Nov.-Dec. 2018. tab
Article in English | LILACS (Americas) | ID: biblio-973629


Abstract: Background: Dermoscopy is a noninvasive complementary diagnostic method widely used in dermatology. Feasibility, accuracy, and reproducibility are key elements for a diagnostic method to be useful, hence the importance of the terminology used to describe dermoscopic criteria. Objective: To evaluate the reproducibility, in Brazilian Portuguese, of the English descriptive terminology proposed for dermoscopic criteria at the 3rd Consensus Meeting of the International Dermoscopy Society. Methods: Nine Brazilian dermatologists independently analyzed the translation of sixty dermoscopic descriptive terms proposed at the 3rd Consensus Conference of the International Society of Dermoscopy. Interobserver agreement was analyzed using Fleiss' kappa. Results: The interobserver agreement of the descriptive terminology in Brazilian Portuguese was considered weak (κ = 0.373; p < 0.05). The interobserver agreement of the descriptive terminology used to describe morphology and arrangement of vascular structures was considered moderate (κ = 0.43; p < 0.05). Study limitations: Our study limitations include the small number of participants and limited regional representation (only 2 out of 5 Brazilian regions were represented). Conclusions: The descriptive English terminology proposed at the 3rd Consensus Conference of the International Dermoscopy Society showed weak reproducibility in Brazilian Portuguese, while the terminology for morphology and arrangement of vascular structures showed moderate reproducibility. Despite small regional differences, metaphoric terminology in dermoscopy seems to be the most useful and reproducible system to be adopted in Brazilian Portuguese.
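For readers unfamiliar with Fleiss' kappa, the statistic named in the abstract above, here is a minimal sketch of the standard formula for a fixed number of raters per item. The input layout (one row per item, one column per category, each cell holding the number of raters who chose that category) and the toy data are assumptions for illustration; the abstract does not describe how the ratings were coded.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a list of per-item category counts.

    Every row must sum to the same number of raters (at least 2), and the
    ratings must not all fall in a single category (chance agreement < 1).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of ratings in each category.
    shares = [sum(col) / (n_items * n_raters) for col in zip(*counts)]
    p_chance = sum(s * s for s in shares)

    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy data: 2 items, 3 raters, 2 categories.
unanimous = fleiss_kappa([[3, 0], [0, 3]])
split = fleiss_kappa([[2, 1], [1, 2]])
```

Values near 1 indicate agreement well beyond chance, and values at or below 0 indicate agreement no better than chance.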

Int. j. morphol ; 36(4): 1298-1304, Dec. 2018. tab, graf
Article in Spanish | LILACS (Americas) | ID: biblio-975699


El objetivo de la presente investigación fue determinar el nivel de confiabilidad de las mediciones antropométricas relacionadas al tórax, realizadas por un estudiante de licenciatura en kinesiología (ELK) comparandolas con las de un antropometrista experto (AE). El proceso consistió en el escalonamiento de competencias dividido en tres etapas: i) desarrollo de competencias teóricas y prácticas, ii) Determinar idea de investigación y construcción de su marco teórico y iii) adquisición de confiabilidad, el ELK realizó este ejercicio en relación al AE International Society for the Advancement of Kinanthropometry (ISAK) II. Para este punto, se reclutaron seis participantes, los cuales fueron aleatorizados y medidos por ambos evaluadores. Se registraron los diámetros, perímetros y pliegues relacionados al tórax utilizando los protocolos de la ISAK, para posteriormente calcular el error técnico de la medición (ETM) y el coeficiente de correlación de concordancia de Lin (CCC). El ETM fue el aceptado para diámetros, perímetros y pliegues, salvo en el valor del pliegue bicipital, el que mostró un 5,60 %. La fuerza de la relación fue moderada a sustancial en perímetros y pliegues y pobre en los diámetros entre el ELK y el AE. En conclusión, el ETM se encontró dentro de los rangos permitidos salvo el pliegue bicipital. Sin embargo, ELK mostró un nivel de concordancia moderado a sustancial en perímetros y pliegues y pobre en diámetros de tórax en relación al AE.

The aim of the present investigation was to determine the reliability of thorax-related anthropometric measurements performed by an undergraduate kinesiology student (ELK) by comparing them with those of an expert anthropometrist (AE). The process consisted of a stepwise development of competencies divided into three stages: i) development of theoretical and practical skills, ii) definition of the research idea and construction of its theoretical framework, and iii) acquisition of reliability, in which the ELK carried out the exercise relative to the AE (International Society for the Advancement of Kinanthropometry [ISAK] II). For this stage, five participants were recruited, randomized, and measured by both evaluators. The diameters, perimeters and skinfolds related to the thorax were recorded using the ISAK protocols, and the technical error of measurement (ETM) and Lin's concordance correlation coefficient (CCC) were then calculated. The ETM was acceptable for diameters, perimeters and skinfolds, except for the biceps skinfold, which showed a value of 5.60%. The strength of the relationship between the ELK and the AE was moderate to substantial for perimeters and skinfolds and poor for diameters. In conclusion, the ETM was within the permitted ranges except for the biceps skinfold. However, the ELK showed a moderate to substantial level of agreement with the AE for perimeters and skinfolds and a poor level for thorax diameters.
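The two statistics named in this abstract, the ETM and Lin's CCC, can be sketched as follows for paired measurements from two evaluators. The formulas are the commonly used ones (inter-evaluator ETM as the square root of the sum of squared differences over 2n, expressed as a percentage of the grand mean; CCC with 1/n variances) and are assumed here, since the abstract does not state the exact variants applied. Function names and the sample values are hypothetical.

```python
import math
from statistics import fmean


def technical_error(a, b):
    """Absolute and relative (%) technical error of measurement for paired data."""
    n = len(a)
    absolute = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * n))
    grand_mean = fmean(list(a) + list(b))
    return absolute, 100 * absolute / grand_mean


def lin_ccc(a, b):
    """Lin's concordance correlation coefficient using 1/n variances."""
    n = len(a)
    mean_a, mean_b = fmean(a), fmean(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    # Penalises both scatter and systematic offset between the evaluators.
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)


# Hypothetical readings (mm) where the second evaluator reads 1 mm higher each time.
student = [1.0, 2.0, 3.0]
expert = [2.0, 3.0, 4.0]
etm_abs, etm_pct = technical_error(student, expert)
ccc = lin_ccc(student, expert)
```

In this toy case the two series are perfectly correlated, yet the CCC is well below 1 because of the constant offset, which is the reason a concordance coefficient is used instead of a plain correlation when comparing evaluators.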

An. bras. dermatol ; 93(5): 752-754, Sept.-Oct. 2018. tab, graf
Article in English | LILACS (Americas) | ID: biblio-1038278


Abstract: Melanoma Guidelines of the Brazilian Dermatology Society recommend histologic review by pathologists trained in melanocytic lesions whenever possible. Out of 145 melanoma cases identified at a private clinic in São Paulo, Brazil, 31 that had been submitted to histologic review were studied to evaluate whether the review had led to a change in therapeutic approach. Differences between the original and reviewed reports were found in 58.1% (n=18) of the reports, leading to changes in therapeutic approach in 41.9% (n=13). A change in diagnosis was observed in 6 out of 31 (19.3%) cases. These findings suggest that a second opinion by pathologists trained in melanocytic lesions is likely to show significant differences from the original report.

Rev. bras. ortop ; 53(5): 521-526, Sept.-Oct. 2018. tab
Article in English | LILACS (Americas) | ID: biblio-977895


ABSTRACT Objective: To evaluate the inter- and intraobserver agreement of the Magerl AO and AOSpine thoracolumbar fracture classification systems. Methods: The participants were divided into two groups, the first composed of six spinal surgeons and the other composed of 18 orthopedic residents. On two different occasions, separated by an interval of one month, the participants analyzed and classified 25 radiographs with thoracolumbar fractures using both thoracolumbar fracture classification systems, Magerl AO and AOSpine. The results were analyzed for classification reliability using the Kappa coefficient (k). Results: The Magerl AO classification system showed fair interobserver agreement (k = 0.32), considering the fracture type and subtype, whereas the AOSpine classification system showed moderate interobserver agreement (k = 0.59). The Magerl AO classification showed fair intraobserver agreement for both residents and specialists (k = 0.21 and 0.38, respectively), while the AOSpine showed substantial agreement among residents (k = 0.62) and moderate agreement among specialists (k = 0.53). Conclusions: When evaluating fracture morphology, the AOSpine thoracolumbar fracture classification system presented better reliability and reproducibility than the Magerl AO classification system.
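The verbal labels in this abstract (fair, moderate, substantial) match the widely used Landis and Koch bands for kappa, although the abstract does not name the scale, so that mapping is an assumption. The sketch below computes Cohen's kappa for one observer's two reading sessions, a common way to express intraobserver agreement, and attaches the label. Function names and the sample classifications are hypothetical, and the study's own pooling across 24 participants is not reproduced here.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of category labels.

    Assumes the two sessions are not both constant on the same category.
    """
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_1, counts_2 = Counter(first), Counter(second)
    # Chance agreement from the product of the marginal proportions.
    expected = sum(counts_1[k] * counts_2[k] for k in counts_1) / (n * n)
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical fracture types assigned by one observer one month apart.
session_1 = ["A", "A", "B", "B"]
session_2 = ["A", "A", "B", "B"]
label = landis_koch(cohen_kappa(session_1, session_2))
```

Applied to the values reported above, the function gives "fair" for 0.21, 0.32 and 0.38, "moderate" for 0.53 and 0.59, and "substantial" for 0.62.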

RESUMO Objetivo: Avaliar a concordância inter e intraobservadores dos sistemas de classificação Magerl AO e AOSpine para fraturas toracolombares. Métodos: Os participantes foram divididos em dois grupos, um com seis médicos ortopedistas especialistas em coluna e o outro com 18 médicos residentes em ortopedia. Os participantes analisaram 25 radiografias com fraturas toracolombares em duas oportunidades, com um mês de intervalo entre elas, e classificaram com o uso dos dois sistemas de classificação de fratura toracolombar, Magerl AO e AOSpine. Os dados de concordância foram analisados pelo método do coeficiente kappa. Resultados: A classificação de Magerl AO apresentou uma concordância interobservadores leve (k = 0,32), considerando o tipo e o subtipo das fraturas, enquanto a classificação AOSpine obteve uma concordância interobservadores moderada (k = 0,59). A classificação de Magerl AO apresentou uma concordância intraobservadores leve entre médicos residentes e médicos especialistas (k = 0,21 e 0,38, respectivamente), enquanto a classificação AOSpine apresentou uma boa concordância intraobservadores entre médicos residentes (k = 0,62) e moderada entre médicos especialistas (k = 0,53). Conclusão: O sistema de classificação da AOSpine para fraturas toracolombares apresentou uma melhor confiabilidade e reprodutibilidade comparado com o sistema de classificação Magerl AO, em relação à morfologia da fratura.

Humans, Male, Female, Spinal Injuries, Observer Variation