1.
Npj Ment Health Res ; 1(1): 3, 2022 Jun 01.
Article in English | MEDLINE | ID: mdl-38609492

ABSTRACT

Suicide is a growing public health concern in the United States. A detailed understanding and prediction of suicide patterns can significantly boost targeted suicide control and prevention efforts. In this article we look at the suicide trends and geographical distribution of suicides and then develop a machine learning based US county-level suicide prediction model, using publicly available data for the 10-year period from 2010 to 2019. Analysis of the trends and geographical distribution of suicides revealed that nearly 25% of the total counties experienced at least a 10% increase in suicides from 2010 to 2019, with about 12% of total counties exhibiting an increase of at least 50%. An eXtreme Gradient Boosting (XGBoost) based machine learning model was used with 17 unique features for each of the 3140 counties in the US to predict suicides with an R² value of 0.98. Using SHapley Additive exPlanations (SHAP) values, the importance of each of the 17 features used in the prediction model training set was identified. County-level features, namely Total Population, % African American Population, % White Population, Median Age and % Female Population, were found to be the top 5 important features that significantly affected prediction results. The top five important features based on SHAP values were then used to create a Suicide Vulnerability Index (SVI) for US counties. This newly developed SVI has the potential to detect US counties vulnerable to high suicide rates and can aid targeted suicide control and prevention efforts, thereby making it a valuable tool in an informed decision-making process.

2.
Article in English | MEDLINE | ID: mdl-38881882

ABSTRACT

Often in manufacturing systems, scenarios arise where the demand for maintenance exceeds the capacity of maintenance resources. This results in the problem of allocating the limited resources among machines competing for them. This maintenance scheduling problem can be formulated as a Markov decision process (MDP) with the goal of finding the optimal dynamic maintenance action given the current system state. However, as the system becomes more complex, solving an MDP suffers from the curse of dimensionality. To overcome this issue, we propose a two-stage approach that first optimizes a static condition-based maintenance (CBM) policy using a genetic algorithm (GA) and then improves the policy online via Monte Carlo tree search (MCTS). The static policy significantly reduces the state space of the online problem by allowing us to ignore machines that are not sufficiently degraded. Furthermore, we formulate MCTS to seek a maintenance schedule that maximizes the long-term production volume of the system to reconcile the conflict between maintenance and production objectives. We demonstrate that the resulting online policy is an improvement over the static CBM policy found by GA.

Note to Practitioners: This article proposes a method of scheduling maintenance in complex manufacturing systems in scenarios where there is frequent competition for maintenance resources. We use a condition-based maintenance policy that prescribes maintenance actions based on a machine's current health. However, when several machines are due for maintenance, a maintenance technician must choose between multiple competing jobs. While a common approach is to establish rules that dictate how maintenance jobs should be prioritized, such as the first-in, first-out rule, the goal of this work is to improve upon static policies in real time. We do this by strategically evaluating sequences of maintenance actions and playing out many "what-if" scenarios to see how the system will behave in the future. Implementation of the proposed method relies on the construction of a simulation model of the target system. This model is capable of retrieving the current state of the physical system, including the degradation state of machines, the availability of maintenance resources, and the distribution of parts throughout buffers in the system. We present several simulation experiments that demonstrate the improvement in system performance that our approach provides. Future work will aim to improve the efficiency of maintenance prioritization through online learning and to more accurately identify manufacturing system configurations that will yield the greatest benefit from these methods.

3.
Article in English | MEDLINE | ID: mdl-34248180

ABSTRACT

Quality is a key determinant in deploying new processes, products, or services and influences the adoption of emerging manufacturing technologies. The advent of additive manufacturing (AM) as a manufacturing process has the potential to revolutionize a host of enterprise-related functions from production to the supply chain. The unprecedented level of design flexibility and expanded functionality offered by AM, coupled with greatly reduced lead times, can potentially pave the way for mass customization. However, widespread application of AM is currently hampered by technical challenges in process repeatability and quality management. The breakthrough effect of six sigma (6S) has been demonstrated in traditional manufacturing industries (e.g., semiconductor and automotive industries) in the context of quality planning, control, and improvement through the intensive use of data, statistics, and optimization. 6S entails a data-driven DMAIC methodology of five steps: define, measure, analyze, improve, and control. Notwithstanding the sustained successes of the 6S knowledge body in a variety of established industries ranging from manufacturing to healthcare, logistics, and beyond, there is a dearth of concentrated application of 6S quality management approaches in the context of AM. In this article, we propose to design, develop, and implement a new DMAIC methodology for the 6S quality management of AM. First, we define the specific quality challenges arising from AM layerwise fabrication and mass customization (even one-of-a-kind production). Second, we present a review of AM metrology and sensing techniques, from materials through design, process, and environment, to postbuild inspection. Third, we contextualize a framework for realizing the full potential of data from AM systems and emphasize the need for analytical methods and tools. We propose and delineate the utility of new data-driven analytical methods, including deep learning, machine learning, and network science, to characterize and model the interrelationships between engineering design, machine setting, process variability, and final build quality. Fourth, we present the methodologies of ontology analytics, design of experiments (DOE), and simulation analysis for AM system improvements. In closing, new process control approaches are discussed to optimize the action plans, once an anomaly is detected, with specific consideration of lead time and energy consumption. We posit that this work will catalyze more in-depth investigations and multidisciplinary research efforts to accelerate the application of 6S quality management in AM.

4.
IEEE J Biomed Health Inform ; 25(6): 2215-2226, 2021 06.
Article in English | MEDLINE | ID: mdl-33196445

ABSTRACT

Patient satisfaction is a key performance indicator of patient-centered care and hospital reimbursement. Discovering the major factors that affect patient experiences is considered an effective way to formulate corrective actions. During the healthcare journey, a patient interacts with multiple health professionals across different service units. The health-related data generated at each step of the journey is a valuable resource for extracting actionable insights. In particular, self-reported satisfaction surveys and the associated patient electronic health records play an important role in the hospital-patient interaction analysis. In this paper, we propose an interpretable machine learning framework to formulate the patient satisfaction problem as a supervised learning task and utilize a mixed-integer programming model to identify the most influential factors. The proposed framework transforms heterogeneous data into human-understandable features and integrates feature transformation, variable selection, and coefficient learning into the optimization process. Therefore, it can achieve desirable model performance while maintaining excellent model interpretability, which paves the way for successful real-world applications.


Subjects
Machine Learning, Patient Satisfaction, Electronic Health Records, Female, Health Personnel, Humans, Male, Surveys and Questionnaires
5.
J Med Internet Res ; 22(8): e17239, 2020 08 25.
Article in English | MEDLINE | ID: mdl-32840485

ABSTRACT

BACKGROUND: Online pharmacies have grown significantly in recent years, from US $29.35 billion in 2014 to an expected US $128 billion in 2023 worldwide. Although legitimate online pharmacies (LOPs) provide a channel of convenience and potentially lower costs for patients, illicit online pharmacies (IOPs) open the doors to unfettered access to prescription drugs, controlled substances (eg, opioids), and potentially counterfeits, posing a dramatic risk to the drug supply chain and the health of the patient. Unfortunately, we know little about IOPs, and even identifying and monitoring IOPs is challenging because of the large number of online pharmacies (at least 30,000-35,000) and the dynamic nature of the online channel (online pharmacies open and shut down easily). OBJECTIVE: This study aims to increase our understanding of IOPs through web data traffic analysis and propose a novel framework using referral links to predict and identify IOPs, the first step in fighting IOPs. METHODS: We first collected web traffic and engagement data to study and compare how consumers access and engage with LOPs and IOPs. We then proposed a simple but novel framework for predicting the status of online pharmacies (legitimate or illicit) through the referral links between websites. Under this framework, we developed 2 prediction models, the reference rating prediction method (RRPM) and the reference-based K-nearest neighbor (R2NN). RESULTS: We found that direct (typing URL), search, and referral are the 3 major traffic sources, representing more than 95% of traffic to both LOPs and IOPs. It is alarming to see that direct represents the second-highest traffic source (34.32%) to IOPs. When tested on a data set with 763 online pharmacies, both RRPM and R2NN performed well, achieving an accuracy above 95% in their predictions of the status for the online pharmacies. R2NN outperformed RRPM in full performance metrics (accuracy, kappa, specificity, and sensitivity). On implementing the 2 models on Google search results for popular drugs (Xanax [alprazolam], OxyContin, and opioids), they produced an error rate of only 7.96% (R2NN) and 6.20% (RRPM). CONCLUSIONS: Our prediction models use what we know (referral links) to tackle the many unknown aspects of IOPs. They have many potential applications for patients, search engines, social media, payment companies, policy makers or government agencies, and drug manufacturers to help fight IOPs. With scarce work in this area, we hope to help address the current opioid crisis from this perspective and inspire future research in the critical area of drug safety.


Subjects
Internet/legislation & jurisprudence, Online Pharmaceutical Services/legislation & jurisprudence, Humans
6.
JMIR Public Health Surveill ; 6(3): e19446, 2020 09 11.
Article in English | MEDLINE | ID: mdl-32784193

ABSTRACT

BACKGROUND: The rapid spread of COVID-19 means that government and health services providers have little time to plan and design effective response policies. It is therefore important to quickly provide accurate predictions of how vulnerable geographic regions such as counties are to the spread of this virus. OBJECTIVE: The aim of this study is to develop county-level prediction around near future disease movement for COVID-19 occurrences using publicly available data. METHODS: We estimated county-level COVID-19 occurrences for the period March 14 to 31, 2020, based on data fused from multiple publicly available sources inclusive of health statistics, demographics, and geographical features. We developed a three-stage model using XGBoost, a machine learning algorithm, to quantify the probability of COVID-19 occurrence and estimate the number of potential occurrences for unaffected counties. Finally, these results were combined to predict the county-level risk. This risk was then used as an estimated after-five-day-vulnerability of the county. RESULTS: The model predictions showed a sensitivity over 71% and specificity over 94% for models built using data from March 14 to 31, 2020. We found that population, population density, percentage of people aged >70 years, and prevalence of comorbidities play an important role in predicting COVID-19 occurrences. We observed a positive association at the county level between urbanicity and vulnerability to COVID-19. CONCLUSIONS: The developed model can be used for identification of vulnerable counties and potential data discrepancies. Limited testing facilities and delayed results introduce significant variation in reported cases, which produces a bias in the model.


Subjects
Coronavirus Infections/epidemiology, Statistical Models, Pandemics, Viral Pneumonia/epidemiology, Population Surveillance/methods, Aged, Algorithms, Betacoronavirus, COVID-19, Comorbidity, Coronavirus Infections/prevention & control, Coronavirus Infections/virology, Humans, Machine Learning, Pandemics/prevention & control, Viral Pneumonia/prevention & control, Viral Pneumonia/virology, Population Density, Risk Assessment, SARS-CoV-2, United States, Urban Population
7.
Chaos ; 30(1): 013119, 2020 Jan.
Article in English | MEDLINE | ID: mdl-32013465

ABSTRACT

Nonlinear dynamical systems often generate significant amounts of observational data such as time series, as well as high-dimensional spatial data. To delineate recurrence dynamics in the spatial data, prior efforts either extended the recurrence plot, which is a widely used tool for time series, to a four-dimensional hyperspace or utilized the network approach for recurrence analysis. However, very little has been done to differentiate heterogeneous types of recurrences in the spatial data (e.g., recurrence variations of state transitions in the spatial domain). Therefore, we propose a novel heterogeneous recurrence approach for spatial data analysis. First, spatial data are traversed with the Hilbert Space-Filling Curve to transform the variations of recurrence patterns from the spatial domain to the state-space domain. Second, we design an Iterated Function System to derive the fractal representation for the state-space trajectory of spatial data. Such a fractal representation effectively captures self-similar behaviors of recurrence variations and multi-state transitions in the spatial data. Third, we develop the Heterogeneous Recurrence Quantification Analysis of spatial data. Experimental results in both simulation and real-world case studies show that the proposed approach yields superior performance in the extraction of salient features to characterize and quantify heterogeneous recurrence dynamics in spatial data.
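
The first step described above, traversing spatial data with a Hilbert space-filling curve, can be illustrated with a minimal sketch. It assumes a square 2D grid whose side is a power of two and uses the textbook index-to-coordinate conversion; the function names `hilbert_d2xy` and `hilbert_flatten` are hypothetical and are not taken from the paper, whose exact traversal may differ.

```python
def hilbert_d2xy(n, d):
    """Map position d along the Hilbert curve to (x, y) on an n-by-n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the sub-square so consecutive cells stay adjacent.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_flatten(grid):
    """Read a square grid (indexed grid[y][x]) in Hilbert-curve order."""
    n = len(grid)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in grid):
        raise ValueError("grid must be square with a power-of-two side")
    return [grid[y][x] for x, y in (hilbert_d2xy(n, d) for d in range(n * n))]


if __name__ == "__main__":
    image = [[1, 2], [3, 4]]          # toy 2x2 "spatial data"
    print(hilbert_flatten(image))     # 1-D sequence that keeps neighbors close
```

Because consecutive positions on the curve are always neighboring cells, the resulting one-dimensional sequence preserves much of the spatial locality, which is what makes time-series style recurrence tools applicable afterwards.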

8.
Health Informatics J ; 26(2): 999-1016, 2020 06.
Article in English | MEDLINE | ID: mdl-31266390

ABSTRACT

This study aims at developing SuperOrder, an order recommendation system for outpatient clinics. Using the electronic health record data available at midnight, SuperOrder predicts the order contents for each upcoming appointment on a daily basis. A two-level prediction framework is proposed. At the base-level, the predictions are produced by aggregating three machine learning methods. The meta-level predictions are generated by integrating the base-level predictions with the order co-occurrence network. We used the retrospective data between 1 April 2014 and 31 March 2015 in pulmonary clinics from five hospital sites within a large rural health care facility in Pennsylvania to test the feasibility. With a decrease of 6 per cent in the precision, the improvement of the recall at the meta-level is approximately 20 per cent from the base-level. This demonstrates that the proposed order co-occurrence network helps in increasing the performance of order predictions. The implementation will bring a more effective and efficient way to place outpatient orders.


Subjects
Ambulatory Care Facilities, Machine Learning, Medical Order Entry Systems, Ambulatory Care Facilities/statistics & numerical data, Electronic Health Records, Forecasting, Humans, Pennsylvania, Retrospective Studies
9.
IEEE J Biomed Health Inform ; 24(1): 57-68, 2020 01.
Article in English | MEDLINE | ID: mdl-31395567

ABSTRACT

Identifying drug-drug interactions (DDIs) is a critical enabler for reducing adverse drug events and improving patient safety. Generating proper DDI alerts during the prescribing workflow has the potential to prevent DDI-related adverse events. However, the implementation of DDI alerting systems remains a challenge, as users experience alert overload, which causes alert fatigue. One strategy to optimize the current system is to establish a list of high-priority DDIs for alerting purposes, though it is a resource-intensive task. In this study, we propose a machine learning framework to extract useful features from the FDA adverse event reports and then identify potential high-priority DDIs using an autoencoder-based semi-supervised learning algorithm. The experimental results demonstrate the effectiveness of using adverse event feature representations in differentiating high- and low-priority DDIs. Additionally, the proposed algorithm utilizes stacked autoencoders and a weighted support vector machine for boosting classification performance, and it outperforms other competing methods in terms of F-measure and AUC score. This framework integrates multiple information sources, leverages domain knowledge and clinical evidence, and provides a practical approach for pre-screening high-priority DDI candidates for medication alerts.


Subjects
Adverse Drug Reaction Reporting Systems, Algorithms, Clinical Decision Support Systems, Drug-Related Side Effects and Adverse Reactions, Supervised Machine Learning, Factual Databases, Drug-Related Side Effects and Adverse Reactions/diagnosis, Drug-Related Side Effects and Adverse Reactions/epidemiology, Drug-Related Side Effects and Adverse Reactions/prevention & control, Humans
10.
Chaos ; 28(8): 085714, 2018 Aug.
Article in English | MEDLINE | ID: mdl-30180605

ABSTRACT

Nonlinear dynamical systems exhibit complex recurrence behaviors. The recurrence plot is widely used to graphically represent patterns of recurrence dynamics and further facilitates their quantification, namely recurrence quantification analysis. However, traditional recurrence methods tend to be limited in their ability to handle spatial data due to high dimensionality and geometric characteristics. Prior efforts have been made to generalize the recurrence plot to a four-dimensional space for spatial data analysis, but this framework can only provide graphical visualization of recurrence patterns in the projected reduced-dimension space (i.e., two or three dimensions). In this paper, we propose a new weighted recurrence network approach for spatial data analysis. A weighted network model is introduced to represent the recurrence patterns in spatial data, which accounts for both pixel intensities and spatial distance simultaneously. Note that each network node represents a location in the high-dimensional spatial data. Network edges and weights preserve complex spatial structures and recurrence patterns. Network representation is shown to be an effective means to provide a complete picture of recurrence patterns in the spatial data. Furthermore, we leverage network statistics to characterize and quantify recurrence properties and features in the spatial data. Experimental results in both simulation and real-world case studies show that the generalized recurrence network approach yields superior performance in the visualization of recurrence patterns in spatial data and in the extraction of salient features to characterize recurrence dynamics in spatial systems.
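
For readers unfamiliar with the classical recurrence plot that this abstract takes as its starting point, the following minimal sketch builds one for a scalar time series: entry (i, j) is 1 when states i and j lie within a threshold of each other. It assumes no state-space embedding and an absolute-difference distance; the names `recurrence_matrix`, `recurrence_rate`, and `eps` are illustrative. It does not reproduce the weighted recurrence network proposed in the paper, whose edge weights are not specified in the abstract.

```python
def recurrence_matrix(series, eps):
    """Binary recurrence matrix: R[i][j] = 1 if |x_i - x_j| <= eps, else 0."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(matrix):
    """Fraction of recurrent points, one of the simplest quantification measures."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.05, 1.02]   # toy series alternating between two states
    R = recurrence_matrix(signal, eps=0.1)
    for row in R:
        print(row)
    print("recurrence rate:", recurrence_rate(R))
```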

11.
Smart Sustain Manuf Syst ; 1(1): 52-74, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28785744

ABSTRACT

This paper proposes a classification scheme for performance metrics for smart manufacturing systems. The discussion focuses on three such metrics: agility, asset utilization, and sustainability. For each of these metrics, we discuss classification themes, which we then use to develop a generalized classification scheme. In addition to the themes, we discuss a conceptual model that may form the basis for the information necessary for performance evaluations. Finally, we present future challenges in developing robust, performance-measurement systems for real-time, data-intensive enterprises.

12.
J Biomed Inform ; 68: 1-19, 2017 04.
Article in English | MEDLINE | ID: mdl-28213145

ABSTRACT

It is believed that anomalous mental states such as stress and anxiety not only cause suffering for the individuals, but also lead to tragedies in some extreme cases. The ability to predict the mental state of an individual at both current and future time periods could prove critical to healthcare practitioners. Currently, the practical way to predict an individual's mental state is through mental examinations that involve psychological experts performing the evaluations. However, such methods can be time and resource consuming, limiting their applicability to a wide population. Furthermore, some individuals may also be unaware of their mental states or may feel uncomfortable expressing themselves during the evaluations. Hence, their anomalous mental states could remain undetected for a prolonged period of time. The objective of this work is to demonstrate the ability of advanced machine learning based approaches to generate mathematical models that predict current and future mental states of an individual. The problem of mental state prediction is transformed into a time series forecasting problem, where an individual is represented as a multivariate time series stream of monitored physical and behavioral attributes. A personalized mathematical model is then automatically generated to capture the dependencies among these attributes, which is used for prediction of mental states for each individual. In particular, we first illustrate the drawbacks of traditional multivariate time series forecasting methodologies such as vector autoregression. Then, we show that such issues could be mitigated by using machine learning regression techniques which are modified for capturing temporal dependencies in time series data. A case study using the data from 150 human participants illustrates that the proposed machine learning based forecasting methods are more suitable for high-dimensional psychological data than the traditional vector autoregressive model in terms of both magnitude of error and directional accuracy. These results not only present a successful usage of machine learning techniques in psychological studies, but also serve as a building block for multiple medical applications that could rely on an automated system to gauge individuals' mental states.


Subjects
Emotions, Machine Learning, Mental Health, Forecasting, Humans, Theoretical Models
13.
J Biomed Inform ; 66: 82-94, 2017 02.
Article in English | MEDLINE | ID: mdl-28034788

ABSTRACT

INTRODUCTION: The authors of this work propose an unsupervised machine learning model that has the ability to identify real-world latent infectious diseases by mining social media data. In this study, a latent infectious disease is defined as a communicable disease that has not yet been formalized by national public health institutes and explicitly communicated to the general public. Most existing approaches to modeling infectious-disease-related knowledge discovery through social media networks are top-down approaches that are based on already known information, such as the names of diseases and their symptoms. In existing top-down approaches, necessary but unknown information, such as disease names and symptoms, is mostly unidentified in social media data until national public health institutes have formalized that disease. Most of the formalizing processes for latent infectious diseases are time consuming. Therefore, this study presents a bottom-up approach for latent infectious disease discovery in a given location without prior information, such as disease names and related symptoms. METHODS: Social media messages with user and temporal information are extracted during the data preprocessing stage. An unsupervised sentiment analysis model is then presented. Users' expressions about symptoms, body parts, and pain locations are also identified from social media data. Then, symptom weighting vectors for each individual and time period are created, based on their sentiment and social media expressions. Finally, latent-infectious-disease-related information is retrieved from individuals' symptom weighting vectors. DATASETS AND RESULTS: Twitter data from August 2012 to May 2013 are used to validate this study. Real electronic medical records for 104 individuals, who were diagnosed with influenza in the same period, are used to serve as ground truth validation. 
The results are promising, with the highest precision, recall, and F1 score values of 0.773, 0.680, and 0.724, respectively. CONCLUSION: This work uses individuals' social media messages to identify latent infectious diseases, without prior information, quicker than when the disease(s) is formalized by national public health institutes. In particular, the unsupervised machine learning model using user, textual, and temporal information in social media data, along with sentiment analysis, identifies latent infectious diseases in a given location.
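
As a quick consistency check on the figures reported above, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it from the two reported values. It assumes all three "highest" values come from the same model configuration, which the abstract does not state, and the inputs are themselves rounded to three decimals.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Values as reported in the abstract (rounded to three decimals there).
    f1 = f1_score(0.773, 0.680)
    print(round(f1, 3))   # compare with the reported F1 of 0.724
```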


Subjects
Communicable Diseases, Data Mining, Social Media/statistics & numerical data, Unsupervised Machine Learning, Humans, Public Health, Social Networking
14.
Med Care ; 54(11): 1017-1023, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27213544

ABSTRACT

BACKGROUND: Transitional care interventions can be utilized to reduce post-hospital discharge adverse events (AEs). However, no methodology exists to effectively identify high-risk patients of any disease across multiple hospital sites and patient populations for short-term postdischarge AEs. OBJECTIVES: To develop and validate a 3-day (72 h) AEs prediction model using electronic health records data available at the time of an indexed discharge. RESEARCH DESIGN: Retrospective cohort study of admissions between June 2012 and June 2014. SUBJECTS: All adult inpatient admissions (excluding in-hospital deaths) from a large multicenter hospital system. MEASURES: All-cause 3-day unplanned readmissions, emergency department (ED) visits, and deaths (REDD). The REDD model was developed using clinical, administrative, and socioeconomic data, with data preprocessing steps and stacked classification. Patients were divided randomly into training (66.7%) and testing (33.3%) cohorts to avoid overfitting. RESULTS: The derivation cohort comprised 64,252 admissions, of which 2782 (4.3%) admissions resulted in 3-day AEs and 13,372 (20.8%) in 30-day AEs. The c-statistic (also known as area under the receiver operating characteristic curve) of the 3-day REDD model was 0.671 and 0.664 for the derivation and validation cohorts, respectively. The c-statistic of the 30-day REDD model was 0.713 and 0.711 for the derivation and validation cohorts, respectively. CONCLUSIONS: The 3-day REDD model predicts high-risk patients with fair discriminative power. The discriminative power of the 30-day REDD model is also better than that of previously reported models under similar settings. The 3-day REDD model has been implemented and is being used to identify patients at risk for AEs.
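
The c-statistic quoted above can be read as the probability that a randomly chosen patient who had an adverse event received a higher predicted risk than a randomly chosen patient who did not. A minimal sketch of that pairwise definition follows. The function name `c_statistic` and the toy data are hypothetical and unrelated to the study's data, and the quadratic loop is meant for illustration, not for cohorts of tens of thousands of admissions.

```python
def c_statistic(scores, outcomes):
    """Concordance (area under the ROC curve) from predicted risks and 0/1 outcomes.

    Counts, over all (event, non-event) pairs, how often the event case has the
    higher score; tied scores count as half.
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


if __name__ == "__main__":
    risks = [0.9, 0.8, 0.3, 0.2]   # toy predicted risks
    events = [1, 0, 1, 0]          # toy observed outcomes
    print(c_statistic(risks, events))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.664 to 0.713 should be read.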


Subjects
Hospital Emergency Service/statistics & numerical data, Mortality, Patient Readmission/statistics & numerical data, Female, Humans, Length of Stay/statistics & numerical data, Male, Middle Aged, Statistical Models, Patient Discharge/statistics & numerical data, Pennsylvania/epidemiology, Retrospective Studies, Risk Factors, Socioeconomic Factors
15.
Proc Winter Simul Conf ; 2015: 2100-2111, 2015 12.
Article in English | MEDLINE | ID: mdl-28690363

ABSTRACT

Modern manufacturing systems are installed with smart devices such as sensors that monitor system performance and collect data to manage uncertainties in their operations. However, multiple parameters and variables affect system performance, making it impossible for a human to make informed decisions without systematic methodologies and tools. Further, the large volume and variety of streaming data collected is beyond simulation analysis alone. Simulation models are run with well-prepared data. Novel approaches, combining different methods, are needed to use this data for making guided decisions. This paper proposes a methodology whereby parameters that most affect system performance are extracted from the data using data analytics methods. These parameters are used to develop scenarios for simulation inputs; system optimizations are performed on simulation data outputs. A case study of a machine shop demonstrates the proposed methodology. This paper also reviews candidate standards for data collection, simulation, and systems interfaces.

16.
Phys Rev E Stat Nonlin Soft Matter Phys ; 86(1 Pt 2): 016111, 2012 Jul.
Article in English | MEDLINE | ID: mdl-23005495

ABSTRACT

The problem of graph clustering or community detection has enjoyed a lot of attention in the complex networks literature. A quality function, modularity, quantifies the strength of clustering and on maximization yields sensible partitions. However, in most real-world networks, there is an exponentially large number of near-optimal partitions, some of which are very different from each other. Therefore, picking an optimal clustering among the alternatives does not provide complete information about network topology. To tackle this problem, we propose a graph perturbation scheme which can be used to identify an ensemble of near-optimal and diverse clusterings. We establish analytical properties of the modularity function under the perturbation which ensure diversity. Our approach is algorithm independent and therefore can leverage any of the existing modularity maximizing algorithms. We numerically show that our methodology can systematically identify very different partitions on several existing data sets. The knowledge of diverse partitions sheds more light on the topological organization and helps gain a more complete understanding of the underlying complex network.
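
To make the quality function concrete, the sketch below evaluates modularity for a given partition, assuming the standard Newman-Girvan definition for an undirected, unweighted graph: for each community, the fraction of edges inside it minus the squared fraction of edge endpoints attached to it. The function name and the toy graph are illustrative, and the perturbation scheme proposed in the paper is not reproduced here.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of a partition.

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    community: dict mapping every node to its community id.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    intra = defaultdict(int)    # edges with both endpoints in the community
    degree = defaultdict(int)   # total degree of the community's nodes
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (toy example).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    merged = {node: "a" for node in range(6)}
    print(modularity(edges, split))    # two-community partition: 5/14
    print(modularity(edges, merged))   # single community: 0
```

Any modularity-maximizing algorithm can be scored with such a function, which is consistent with the abstract's point that the approach is algorithm independent.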


Subjects
Algorithms, Theoretical Models, Computer Simulation
17.
Phys Rev E Stat Nonlin Soft Matter Phys ; 76(3 Pt 2): 036106, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17930305

ABSTRACT

Community detection and analysis is an important methodology for understanding the organization of various real-world networks and has applications in problems as diverse as consensus formation in social communities or the identification of functional modules in biochemical networks. Currently used algorithms that identify the community structures in large-scale real-world networks require a priori information such as the number and sizes of communities or are computationally expensive. In this paper we investigate a simple label propagation algorithm that uses the network structure alone as its guide and requires neither optimization of a predefined objective function nor prior information about the communities. In our algorithm every node is initialized with a unique label and at every step each node adopts the label that most of its neighbors currently have. In this iterative process densely connected groups of nodes form a consensus on a unique label to form communities. We validate the algorithm by applying it to networks whose community structures are known. We also demonstrate that the algorithm takes an almost linear time and hence it is computationally less expensive than what was possible so far.
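
The update rule described in the abstract (unique initial labels, then each node repeatedly adopts the label held by most of its neighbors) can be sketched as follows. Several details are assumptions not stated in the abstract: nodes are updated one at a time in random order, ties are broken at random, a node keeps its label when that label is already among the most frequent, and iteration stops once a full sweep changes nothing. The function name and parameters are illustrative.

```python
import random
from collections import Counter


def label_propagation(adjacency, seed=0, max_sweeps=100):
    """Label propagation on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbors.
    Returns a dict mapping each node to its final community label.
    """
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}   # every node starts with a unique label
    nodes = list(adjacency)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)                        # assumed: random update order
        changed = False
        for node in nodes:
            neighbors = adjacency[node]
            if not neighbors:
                continue                          # isolated nodes keep their label
            counts = Counter(labels[n] for n in neighbors)
            top = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == top]
            if labels[node] in candidates:
                continue                          # already holds a majority label
            labels[node] = rng.choice(candidates) # assumed: random tie-breaking
            changed = True
        if not changed:
            break                                 # consensus reached
    return labels


if __name__ == "__main__":
    # Toy graph: two separate triangles, so two communities are expected.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
    print(label_propagation(graph))
```

Each sweep touches every edge a constant number of times, which is in line with the almost linear running time mentioned in the abstract, provided the number of sweeps stays small.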

...