1.
Stud Health Technol Inform ; 205: 1070-4, 2014.
Article in English | MEDLINE | ID: mdl-25160353

ABSTRACT

Internet health forums are a rich textual resource with content generated through free exchanges among patients and, in certain cases, health professionals. We tackle the problem of retrieving clinically relevant information from such forums, with relevant topics being defined from clinical self-administered questionnaires. Texts in forums are largely unstructured and noisy, calling for adapted preprocessing and query methods. We minimize the number of false negatives in queries by using a synonym tool to achieve query expansion of initial topic keywords. To avoid false positives, we propose a new measure based on a statistical comparison of frequent co-occurrences in a large reference corpus (the Web) to keep only relevant expansions. Our work is motivated by a study of breast cancer patients' health-related quality of life (QoL). We consider topics defined from a breast-cancer-specific QoL questionnaire. We quantify and structure occurrences in posts of a specialized French forum and outline important future developments.
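
The expand-then-filter idea described in this abstract can be sketched as follows. The abstract does not specify the authors' measure, so this sketch swaps in pointwise mutual information (PMI) over co-occurrence counts as a stand-in; the function names, the threshold, and the toy counts are all hypothetical and chosen only for illustration.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information (base 2) from raw document counts."""
    if count_xy == 0 or count_x == 0 or count_y == 0:
        return float("-inf")  # never co-occur: treat as maximally unrelated
    return math.log2((count_xy * total) / (count_x * count_y))


def filter_expansions(keyword, candidates, doc_freq, co_freq, total, threshold=1.0):
    """Keep only candidate synonyms whose PMI with the keyword meets the threshold.

    doc_freq maps a term to the number of reference documents containing it;
    co_freq maps a frozenset of two terms to the number containing both.
    """
    kept = []
    for cand in candidates:
        pair = frozenset((keyword, cand))
        score = pmi(
            co_freq.get(pair, 0),
            doc_freq.get(keyword, 0),
            doc_freq.get(cand, 0),
            total,
        )
        if score >= threshold:
            kept.append(cand)
    return kept


# Toy counts, invented for illustration only (not from the paper).
doc_freq = {"fatigue": 1000, "tiredness": 800, "exhaust": 500}
co_freq = {
    frozenset(("fatigue", "tiredness")): 400,
    frozenset(("fatigue", "exhaust")): 2,
}
total = 100000

expanded = filter_expansions(
    "fatigue", ["tiredness", "exhaust"], doc_freq, co_freq, total
)
```

With these toy counts, "tiredness" co-occurs with "fatigue" far more often than chance would predict and is kept, while "exhaust" (which a synonym tool might propose in its automotive sense) falls below the threshold and is dropped.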


Subject(s)
Breast Neoplasms/epidemiology , Breast Neoplasms/psychology , Data Mining/methods , Health Information Exchange/statistics & numerical data , Quality of Life/psychology , Social Media/statistics & numerical data , Vocabulary, Controlled , Artificial Intelligence , Female , Humans , Natural Language Processing , Surveys and Questionnaires
2.
Proc Natl Acad Sci U S A ; 107(49): 20899-904, 2010 Dec 07.
Article in English | MEDLINE | ID: mdl-21078953

ABSTRACT

PNAS article classification is rooted in long-standing disciplinary divisions that do not necessarily reflect the structure of modern scientific research. We reevaluate that structure using latent pattern models from statistical machine learning, also known as mixed-membership models, that identify semantic structure in co-occurrence of words in the abstracts and references. Our findings suggest that the latent dimensionality of patterns underlying PNAS research articles in the Biological Sciences is only slightly larger than the number of categories currently in use, but it differs substantially in the content of the categories. Further, the number of articles that are listed under multiple categories is only a small fraction of what it should be. These findings together with the sensitivity analyses suggest ways to reconceptualize the organization of papers published in PNAS.


Subject(s)
Periodicals as Topic/classification , Publications/classification , Classification , Methods , National Academy of Sciences, U.S. , Statistics as Topic , United States
3.
Ann Appl Stat ; 1(2): 346-384, 2007.
Article in English | MEDLINE | ID: mdl-21687832

ABSTRACT

Data on functional disability are of widespread policy interest in the United States, especially with respect to planning for Medicare and Social Security for a growing population of elderly adults. We consider an extract of functional disability data from the National Long Term Care Survey (NLTCS) and attempt to develop disability profiles using variations of the Grade of Membership (GoM) model. We first describe GoM as an individual-level mixture model that allows individuals to have partial membership in several mixture components simultaneously. We then prove the equivalence between individual-level and population-level mixture models, and use this property to develop a Markov Chain Monte Carlo algorithm for Bayesian estimation of the model. We use our approach to analyze functional disability data from the NLTCS.
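
For orientation, the individual-level formulation mentioned in this abstract is usually written as below. The notation is an assumption made here, not taken from the abstract: $g_{ik}$ is individual $i$'s membership in component $k$, and $\lambda_{kjl}$ is the probability that a full member of component $k$ gives response $l$ to item $j$.

```latex
% Partial membership: each individual's scores lie on the simplex
g_{ik} \ge 0, \qquad \sum_{k=1}^{K} g_{ik} = 1

% Response probability for item j is a membership-weighted mixture
\Pr(x_{ij} = l \mid g_i) = \sum_{k=1}^{K} g_{ik}\, \lambda_{kjl}

% Items are conditionally independent given the membership vector
\Pr(x_{i1}, \dots, x_{iJ} \mid g_i)
  = \prod_{j=1}^{J} \sum_{k=1}^{K} g_{ik}\, \lambda_{kj x_{ij}}
```

An individual with $g_i = (1, 0, \dots, 0)$ reduces to an ordinary member of component 1, while intermediate values of $g_{ik}$ express the partial membership the abstract describes.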
