Results 1 - 20 of 27
1.
Biomedicines ; 11(10)2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37892978

ABSTRACT

This research aims to enhance the classification and prediction of ischemic heart diseases using machine learning techniques, with a focus on resource efficiency and clinical applicability. Specifically, we introduce novel non-invasive indicators known as Campello de Souza features, which require only a tensiometer and a clock for data collection. These features were evaluated using a comprehensive dataset of heart disease cases from a machine learning data repository. Our findings highlight the ability of machine learning algorithms to not only streamline diagnostic procedures but also reduce diagnostic errors and the dependency on extensive clinical testing. Three key features (mean arterial pressure, pulsatile blood pressure index, and resistance-compliance indicator) were found to significantly improve the accuracy of machine learning algorithms in binary heart disease classification. Logistic regression achieved the highest average accuracy among the examined classifiers when utilizing these features. While such novel indicators contribute substantially to the classification process, they should be integrated into a broader diagnostic framework that includes comprehensive patient evaluations and medical expertise. Therefore, the present study offers valuable insights for leveraging data science techniques in the diagnosis and management of cardiovascular diseases.
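As a rough illustration of how indicators of this kind can be derived from a tensiometer reading alone, the sketch below computes pulse pressure and mean arterial pressure with the common textbook approximation (diastolic pressure plus one third of the pulse pressure). The function names and the sample reading are hypothetical, and the formula is a standard approximation rather than the definition used in the article. The pulsatile blood pressure index and the resistance-compliance indicator are not reproduced because the abstract does not define them.

```python
def pulse_pressure(systolic: float, diastolic: float) -> float:
    """Pulse pressure in mmHg: systolic minus diastolic."""
    return systolic - diastolic


def mean_arterial_pressure(systolic: float, diastolic: float) -> float:
    """Textbook approximation: MAP = DBP + (SBP - DBP) / 3, in mmHg."""
    return diastolic + pulse_pressure(systolic, diastolic) / 3.0


# Hypothetical reading of 120/80 mmHg.
print(pulse_pressure(120, 80))                     # 40
print(round(mean_arterial_pressure(120, 80), 2))   # 93.33
```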

2.
Biology (Basel) ; 12(7)2023 Jul 04.
Article in English | MEDLINE | ID: mdl-37508389

ABSTRACT

Predictive models based on empirical similarity are instrumental in biology and data science, where the premise is to measure the likeness of one observation with others in the same dataset. Biological datasets often encompass data that can be categorized. When using empirical similarity-based predictive models, two strategies for handling categorical covariates exist. The first strategy retains categorical covariates in their original form, applying distance measures and allocating weights to each covariate. In contrast, the second strategy creates binary variables, representing each variable level independently, and computes similarity measures solely through the Euclidean distance. This study performs a sensitivity analysis of these two strategies using computational simulations, and applies the results to a biological context. We use a linear regression model as a reference point, and consider two methods for estimating the model parameters, alongside exponential and fractional inverse similarity functions. The sensitivity is evaluated by determining the coefficient of variation of the parameter estimators across the three models as a measure of relative variability. Our results suggest that the first strategy excels over the second one in effectively dealing with categorical variables, and offers greater parsimony due to the use of fewer parameters.
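The idea behind an empirical similarity prediction can be sketched as a similarity-weighted average of the observed responses. The code below is a minimal sketch, assuming a weighted Euclidean distance and the two similarity functions named in the abstract (exponential and fractional inverse); the function names and the toy data are hypothetical, and the article's exact treatment of categorical covariates is not shown in the abstract.

```python
import math


def weighted_distance(x, z, weights):
    """Weighted Euclidean distance between two covariate vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z)))


def exponential_similarity(d):
    """Exponential similarity: s = exp(-d)."""
    return math.exp(-d)


def fractional_inverse_similarity(d):
    """Fractional inverse similarity: s = 1 / (1 + d)."""
    return 1.0 / (1.0 + d)


def predict(x_new, X, y, weights, similarity):
    """Similarity-weighted average of the observed responses."""
    sims = [similarity(weighted_distance(x_new, x, weights)) for x in X]
    return sum(s * yi for s, yi in zip(sims, y)) / sum(sims)


# Toy data: one numeric covariate and one binary (dummy-coded) covariate.
X = [[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]]
y = [1.0, 3.0, 5.0]
weights = [1.0, 1.0]
print(predict([1.0, 0.0], X, y, weights, exponential_similarity))
print(predict([1.0, 0.0], X, y, weights, fractional_inverse_similarity))
```

Because the prediction is a convex combination of the observed responses, it always lies between the smallest and largest value of y; the weights control how much each covariate contributes to the distance.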

3.
Biology (Basel) ; 12(6)2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37372171

ABSTRACT

This research provides a detailed analysis of the COVID-19 spread across 14 Latin American countries. Using time-series analysis and epidemic models, we identify diverse outbreak patterns, which seem not to be influenced by geographical location or country size, suggesting the influence of other determining factors. Our study uncovers significant discrepancies between the number of recorded COVID-19 cases and the real epidemiological situation, emphasizing the crucial need for accurate data handling and continuous surveillance in managing epidemics. The absence of a clear correlation between the country size and the confirmed cases, as well as with the fatalities, further underscores the multifaceted influences on COVID-19 impact beyond population size. Despite the decreased real-time reproduction number indicating quarantine effectiveness in most countries, we note a resurgence in infection rates upon resumption of daily activities. These insights spotlight the challenge of balancing public health measures with economic and social activities. Our core findings provide novel insights, applicable to guiding epidemic control strategies and informing decision-making processes in combatting the pandemic.

4.
Sustain Cities Soc ; 96: 104712, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37313370

ABSTRACT

Most crowding measures in public transportation are usually aggregated at a service level. This type of aggregation does not help to analyze microscopic behavior such as exposure risk to viruses. To bridge such a gap, our paper proposes four novel crowding measures that might be well suited to proxy virus exposure risk in public transport. In addition, we conduct a case study in Santiago, Chile, using smart card data of the bus system to compute the proposed measures for three different and relevant periods of the COVID-19 pandemic: before, during, and after Santiago's lockdown. We find that the governmental policies diminished public transport crowding considerably for the lockdown phase. The average exposure time when social distancing is not possible falls from 6.39 min before lockdown to 0.03 min during the lockdown, while the average number of encountered persons falls from 43.33 to 5.89. We shed light on how the pandemic impacts differ across various population groups in society. Our findings suggest that poorer municipalities returned faster to crowding levels similar to those before the pandemic.

5.
Math Biosci Eng ; 20(4): 6110-6133, 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-37161100

ABSTRACT

Vision-related quality of life (QoL) analyzes the visual function concerning individual well-being based on activity and social participation. Because QoL is a multivariate construct, a multivariate statistical method must be used to analyze this construct. In this paper, we present a methodology based on STATIS multivariate three-way methods to assess the real change in vision-related QoL for myopic patients by comparing their conditions before and after corneal surgery. We conduct a case study in Costa Rica to detect the outcomes of patients referred for myopia that underwent refractive surgery. We consider a descriptive, observational and prospective study. We utilize the NEI VFQ-25 instrument to measure the vision-related QoL in five different stages over three months. After applying this instrument/questionnaire, a statistically significant difference was detected between the perceived QoL levels. In addition, strong correlations were identified with highly similar structures ranging from 0.857 to 0.940. The application of the dual STATIS method found the non-existence of reconceptualization in myopic patients, but a statistically significant recalibration was identified. Furthermore, a real change was observed in all patients after surgery. This finding has not been stated previously due to the limitations of the existing statistical tools. We demonstrated that dual STATIS is a multivariate method capable of evaluating vision-related QoL data and detecting changes in recalibration and reconceptualization.


Subject(s)
Quality of Life, Humans, Costa Rica, Prospective Studies
6.
Biology (Basel) ; 12(3)2023 Mar 13.
Article in English | MEDLINE | ID: mdl-36979135

ABSTRACT

In this article, we propose a comparative study between two models that can be used by researchers for the analysis of survival data: (i) the Weibull regression model and (ii) the random survival forest (RSF) model. The models are compared considering the error rate, the performance of the model through the Harrell C-index, and the identification of the relevant variables for survival prediction. A statistical analysis of a data set from the Heart Institute of the University of São Paulo, Brazil, has been carried out. In the study, the length of stay of patients undergoing cardiac surgery, within the operating room, was used as the response variable. The obtained results show that the RSF model has a lower error rate for the training and testing data sets, at 23.55% and 20.31%, respectively, than the Weibull model, which has an error rate of 23.82%. Regarding the Harrell C-index, we obtain the values 0.76, 0.79, and 0.76, for the RSF and Weibull models, respectively. After the selection procedure, the Weibull model contains variables associated with the type of protocol and type of patient, which are statistically significant at 5%. The RSF model chooses age, type of patient, and type of protocol as relevant variables for prediction. We employ the randomForestSRC package of the R software to perform our data analysis and computational experiments. The proposal that we present has many applications in biology and medicine, which are discussed in the conclusions of this work.
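For readers unfamiliar with the Harrell C-index, the sketch below shows one common way to compute it for right-censored data: among all comparable pairs, it counts the fraction in which the subject with the shorter observed time was assigned the higher risk. The function name, the tie handling (0.5 for tied scores), and the toy data are assumptions for illustration, not the article's implementation. The randomForestSRC package reports its error rate as one minus this index, which is broadly consistent with the figures quoted in the abstract.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times: observed times; events: 1 if the event was observed, 0 if
    censored; risk_scores: higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the third subject is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [0.9, 0.7, 0.4, 0.2]))  # 1.0
print(harrell_c_index(times, events, [0.2, 0.4, 0.7, 0.9]))  # 0.0
```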

7.
Soft comput ; 27(1): 279-295, 2023.
Article in English | MEDLINE | ID: mdl-35915830

ABSTRACT

In this paper, we propose and derive a new regression model for response variables defined on the open unit interval. By reparameterizing the unit generalized half-normal distribution, we get the interpretation of its location parameter as being a quantile of the distribution. In addition, we can evaluate effects of the explanatory variables in the conditional quantiles of the response variable as an alternative to the Kumaraswamy quantile regression model. The suitability of our proposal is demonstrated with two simulated examples and two real applications. For such data sets, the obtained fits of the proposed regression model are compared with those provided by a Kumaraswamy regression model.

8.
Comput Methods Programs Biomed ; 221: 106816, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35580528

ABSTRACT

Quantile regression allows us to estimate the relationship between covariates and any quantile of the response variable rather than the mean. Recently, several statistical distributions have been considered for quantile modeling. The objective of this study is to provide a new computational package, two biomedical applications, one of them with COVID-19 data, and an up-to-date overview of parametric quantile regression. A fully parametric quantile regression is formulated by first parameterizing the baseline distribution in terms of a quantile. Then, we introduce a regression-based functional form through a link function. The density, distribution, and quantile functions, as well as the main properties of each distribution, are presented. We consider 18 distributions related to normal and non-normal settings for quantile modeling of continuous responses on the unit interval, four distributions for continuous response, and one distribution for discrete response. We implement an R package that includes estimation and model checking, density, distribution, and quantile functions, as well as random number generators, for distributions using quantile regression in both location and shape parameters. In summary, a number of studies have recently appeared applying parametric quantile regression as an alternative to the distribution-free quantile regression proposed in the literature. We have reviewed a wide body of parametric quantile regression models, developed an R package which allows us, in a simple way, to fit a variety of distributions, and applied these models to two examples with biomedical real-world data from Brazil and COVID-19 data from US for illustrative purposes. Parametric and non-parametric quantile regressions are compared with these two data sets.
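The two steps described above (parameterizing the baseline distribution by a quantile, then linking that quantile to covariates) can be sketched with the Kumaraswamy distribution, chosen here only because its distribution function has a closed form. Solving F(q) = tau for the second shape parameter gives b = log(1 - tau) / log(1 - q^a). The coefficients, the covariate value, and the function names are hypothetical, and this is not a description of how the R package is implemented.

```python
import math


def kumaraswamy_cdf(y, a, b):
    """CDF of the Kumaraswamy(a, b) distribution on (0, 1)."""
    return 1.0 - (1.0 - y ** a) ** b


def shape_from_quantile(q, a, tau):
    """Solve F(q) = tau for b, so that q is the tau-th quantile."""
    return math.log(1.0 - tau) / math.log(1.0 - q ** a)


def logit_link_inverse(eta):
    """Inverse logit link mapping a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients and covariate value.
beta0, beta1, x = -0.5, 1.2, 0.8
tau, a = 0.5, 2.0
q = logit_link_inverse(beta0 + beta1 * x)   # conditional median given x
b = shape_from_quantile(q, a, tau)
print(abs(kumaraswamy_cdf(q, a, b) - tau) < 1e-12)  # True
```

With this parameterization the regression coefficients act directly on the chosen quantile, so changing tau refits the same functional form to a different part of the response distribution.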


Subject(s)
COVID-19, Models, Statistical, Brazil, COVID-19/epidemiology, Humans
9.
Sensors (Basel) ; 22(10)2022 May 14.
Article in English | MEDLINE | ID: mdl-35632152

ABSTRACT

In this paper, we propose a new privatization mechanism based on a naive theory of a perturbation on a probability using wavelets, much as noise perturbs the signal of a digital image sensor. Wavelets are employed to extract information from a wide range of types of data, including audio signals and images often related to sensors, as unstructured data. Specifically, the cumulative wavelet integral function is defined to build the perturbation on a probability with the help of this function. We show that an arbitrary distribution function additively perturbed is still a distribution function, which can be seen as a privatized distribution, with the privatization mechanism being a wavelet function. Thus, we offer a mathematical method for choosing a suitable probability distribution for data by starting from some guessed initial distribution. Examples of the proposed method are discussed. Computational experiments were carried out using a database-sensor and two related algorithms. Several knowledge areas can benefit from the new approach proposed in this investigation. The areas of artificial intelligence, machine learning, and deep learning constantly need techniques for data fitting, and these areas are closely related to sensors. Therefore, we believe that the proposed privatization mechanism is an important contribution to increasing the spectrum of existing techniques.


Subject(s)
Artificial Intelligence, Privatization, Algorithms, Machine Learning, Probability
10.
Trop Med Infect Dis ; 8(1)2022 Dec 30.
Article in English | MEDLINE | ID: mdl-36668937

ABSTRACT

Dengue is a disease of high interest for public health in the affected localities. Dengue virus is transmitted by Aedes species and presents hyperendemic behaviors in tropical and subtropical regions. Colombia is one of the countries most affected by the dengue virus in the Americas. Its central-west region is a hot spot in dengue transmission, especially the Department of Antioquia, which has suffered from multiple dengue outbreaks in recent years (2015-2016 and 2019-2020). In this article, we perform a retrospective analysis of the confirmed dengue cases in Antioquia, discriminating by both subregions and dengue severity from 2015 to 2020. First, we conduct an exploratory analysis of the epidemic data, and then a statistical survival analysis is carried out using a Cox regression model. Our findings allow the identification of the hazard and socio-demographic patterns of dengue infections in the Colombian subtropical region of Antioquia from 2015 to 2020.

11.
Sensors (Basel) ; 21(19)2021 Sep 29.
Article in English | MEDLINE | ID: mdl-34640834

ABSTRACT

Environmental agencies are interested in relating mortality to pollutants and possible environmental contributors such as temperature. The Gaussianity assumption is often violated when modeling this relationship due to asymmetry and then other regression models should be considered. The class of Birnbaum-Saunders models, especially their regression formulations, has received considerable attention in the statistical literature. These models have been applied successfully in different areas with an emphasis on engineering, environment, and medicine. A common simplification of these models is that statistical dependence is often not considered. In this paper, we propose and derive a time-dependent model based on a reparameterized Birnbaum-Saunders (RBS) asymmetric distribution that allows us to analyze data in terms of a time-varying conditional mean. In particular, it is a dynamic class of autoregressive moving average (ARMA) models with regressors and a conditional RBS distribution (RBSARMAX). By means of a Monte Carlo simulation study, the statistical performance of the new methodology is assessed, showing good results. The asymmetric RBSARMAX structure is applied to the modeling of mortality as a function of pollution and temperature over time with sensor-related data. This modeling provides strong evidence that the new ARMA formulation is a good alternative for dealing with temporal data, particularly related to mortality with regressors of environmental temperature and pollution.


Subject(s)
Environmental Pollution, Computer Simulation, Monte Carlo Method, Temperature
12.
Sensors (Basel) ; 21(16)2021 Aug 09.
Article in English | MEDLINE | ID: mdl-34450794

ABSTRACT

Healthcare service centers must be sited in strategic locations that meet the immediate needs of patients. The current situation due to the COVID-19 pandemic makes this problem particularly relevant. Assume that each center corresponds to an assigned place for vaccination and that each center uses one or more vaccine brands/laboratories. Then, each patient could choose a center instead of another, because she/he may prefer the vaccine from a more reliable laboratory. This defines an order of preference that might depend on each patient who may not want to be vaccinated in a center where there are only her/his non-preferred vaccine brands. In countries where the vaccination process is considered successful, the order assigned by each patient to the vaccination centers is defined by incentives that local governments give to their population. These same incentives for foreign citizens are seen as a strategic decision to generate income from tourism. The simple plant/center location problem (SPLP) is a combinatorial approach that has been extensively studied. However, a less-known natural extension of it with order (SPLPO) has not been explored in the same depth. In this case, the size of the instances that can be solved is limited. The SPLPO considers an order of preference that patients have over a set of facilities to meet their demands. This order adds a new set of constraints in its formulation that increases the complexity of the problem to obtain an optimal solution. In this paper, we propose a new two-stage stochastic formulation for the SPLPO (2S-SPLPO) that mimics the mentioned pandemic situation, where the order of preference is treated as a random vector. We carry out computational experiments on simulated 2S-SPLPO instances to evaluate the performance of the new proposal. We apply an algorithm based on Lagrangian relaxation that has been shown to be efficient for large instances of the SPLPO. 
A potential application of this new algorithm to COVID-19 vaccination is discussed and explored based on sensor-related data. Two further algorithms are proposed to store the patient's records in a data warehouse and generate 2S-SPLPO instances using sensors.
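To make the role of the preference order concrete, the sketch below evaluates the cost of a given set of open centers when each patient is served by the most preferred open center rather than the cheapest one. It is a minimal deterministic sketch with hypothetical names and toy numbers; it does not implement the two-stage stochastic formulation or the Lagrangian relaxation algorithm of the article.

```python
def splpo_cost(open_facilities, fixed_costs, service_costs, preferences):
    """Total cost of a set of open facilities under customer preferences.

    preferences[c] lists facility indices from most to least preferred;
    each customer is served by the most preferred open facility.
    """
    if not open_facilities:
        raise ValueError("at least one facility must be open")
    total = sum(fixed_costs[f] for f in open_facilities)
    for customer, ranking in enumerate(preferences):
        chosen = next(f for f in ranking if f in open_facilities)
        total += service_costs[customer][chosen]
    return total


# Toy instance: 3 candidate centers, 3 patients.
fixed_costs = [4, 3, 5]
service_costs = [[1, 5, 9], [6, 2, 4], [7, 3, 1]]
preferences = [[0, 1, 2], [2, 1, 0], [1, 2, 0]]
print(splpo_cost({1}, fixed_costs, service_costs, preferences))     # 13
print(splpo_cost({1, 2}, fixed_costs, service_costs, preferences))  # 20
```

In the second call, the second patient is served by center 2 at cost 4 even though center 1 would cost 2, because center 2 ranks higher in that patient's preference order. This is the extra constraint that separates the SPLPO from the plain SPLP.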


Subject(s)
COVID-19 Vaccines, COVID-19, Algorithms, Female, Humans, Male, Pandemics, SARS-CoV-2, Vaccination
13.
Sensors (Basel) ; 21(15)2021 Jul 31.
Article in English | MEDLINE | ID: mdl-34372434

ABSTRACT

Governments have been challenged to provide timely medical care to face the COVID-19 pandemic. Under this pandemic, the demand for pharmaceutical products has changed significantly. Some of these products are in high demand, while, for others, their demand falls sharply. These changes in the random demand patterns are connected with changes in the skewness (asymmetry) and kurtosis of their data distribution. Such changes are critical to determining optimal lots and inventory costs. The lot-size model helps to make decisions based on probabilistic demand when calculating the optimal costs of supply using two-stage stochastic programming. The objective of this study is to evaluate how the skewness and kurtosis of the distribution of demand data, collected through sensors, affect the modeling of inventories of hospital pharmacy products helpful to treat COVID-19. The use of stochastic programming allows us to obtain results under demand uncertainty that are closer to reality. We carry out a simulation study to evaluate the performance of our methodology under different demand scenarios with diverse degrees of skewness and kurtosis. A case study in the field of hospital pharmacy with sensor-related COVID-19 data is also provided. An algorithm that permits us to use sensors when submitting requests for supplying pharmaceutical products in the hospital treatment of COVID-19 is designed. We show that the coefficients of skewness and kurtosis impact the total costs of inventory that involve order, purchase, holding, and shortage. We conclude that the asymmetry and kurtosis of the demand statistical distribution do not seem to affect the first-stage lot-size decisions. However, demand patterns with high positive skewness are related to significant increases in expected inventories on hand and shortage, increasing the costs of second-stage decisions. 
Thus, demand distributions that are highly asymmetrical to the right and leptokurtic favor high total costs in probabilistic lot-size systems.
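The skewness and kurtosis coefficients referred to above can be computed from central sample moments, as in the following sketch. The moment-based definitions used here (skewness m3 / m2^1.5 and kurtosis m4 / m2^2, with 3 as the normal reference value) are one standard convention and may differ from the estimators used in the article; the demand values are invented.

```python
def sample_moments(data):
    """Return (skewness, kurtosis) using moment-based coefficients.

    skewness = m3 / m2**1.5 and kurtosis = m4 / m2**2, where m_k is the
    k-th central sample moment (kurtosis is 3 for a normal distribution).
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Invented demand samples: one symmetric, one skewed to the right.
symmetric_demand = [1, 2, 3, 4, 5]
right_skewed_demand = [1, 1, 1, 1, 10]
print(sample_moments(symmetric_demand))
print(sample_moments(right_skewed_demand))
```

The second sample has positive skewness and kurtosis above 3, which is the right-skewed, leptokurtic pattern the abstract associates with higher total inventory costs.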


Subject(s)
COVID-19, Pharmacy Service, Hospital, Humans, Pandemics, SARS-CoV-2, Uncertainty
14.
Sensors (Basel) ; 21(12)2021 Jun 14.
Article in English | MEDLINE | ID: mdl-34198627

ABSTRACT

In this paper, we group South American countries based on the number of infected cases and deaths due to COVID-19. The countries considered are: Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Peru, Paraguay, Uruguay, and Venezuela. The data used are collected from a database of Johns Hopkins University, an institution that is dedicated to sensing and monitoring the evolution of the COVID-19 pandemic. A statistical analysis, based on principal components with modern and recent techniques, is conducted. Initially, utilizing the correlation matrix, standard components and varimax rotations are calculated. Then, by using disjoint components and functional components, the countries are grouped. An algorithm that allows us to keep the principal component analysis updated with a sensor in the data warehouse is designed. As reported in the conclusions, this grouping changes depending on the number of components considered, the type of principal component (standard, disjoint or functional) and the variable to be considered (infected cases or deaths). The results obtained are compared to the k-means technique. The COVID-19 cases and their deaths vary in the different countries due to diverse reasons, as reported in the conclusions.


Subject(s)
COVID-19, Pandemics, Argentina, Brazil, Chile, Colombia, Ecuador, Humans, Peru, Principal Component Analysis, SARS-CoV-2, Uruguay, Venezuela
15.
Entropy (Basel) ; 23(4)2021 Apr 20.
Article in English | MEDLINE | ID: mdl-33923879

ABSTRACT

Data mining is employed to extract useful information and to detect patterns from often large data sets, closely related to knowledge discovery in databases and data science. In this investigation, we formulate models based on machine learning algorithms to extract relevant information predicting student retention at various levels, using higher education data and specifying the relevant variables involved in the modeling. Then, we utilize this information to help the process of knowledge discovery. We predict student retention at each of three levels during their first, second, and third years of study, obtaining models with an accuracy that exceeds 80% in all scenarios. These models allow us to adequately predict the level when dropout occurs. Among the machine learning algorithms used in this work are: decision trees, k-nearest neighbors, logistic regression, naive Bayes, random forest, and support vector machines, of which the random forest technique performs the best. We detect that secondary educational score and the community poverty index are important predictive variables, which have not been previously reported in educational studies of this type. The dropout assessment at various levels reported here is valid for higher education institutions around the world with similar conditions to the Chilean case, where dropout rates affect the efficiency of such institutions. Having the ability to predict dropout based on student's data enables these institutions to take preventative measures, avoiding the dropouts. In the case study, balancing the majority and minority classes improves the performance of the algorithms.

16.
Entropy (Basel) ; 23(1)2021 Jan 12.
Article in English | MEDLINE | ID: mdl-33445659

ABSTRACT

In this research, statistical models are formulated to study the effect of the health crisis arising from COVID-19 in global markets. Breakpoints in the price series of stock indexes are considered. Such indexes are used as an approximation of the stock markets in different countries, taking into account that they are indicative of these markets because of their composition. The main results obtained in this investigation highlight that countries with better institutional and economic conditions are less affected by the pandemic. In addition, the effect of the health index in the models is associated with their non-significant parameters. This is because the health index used in the modeling would not determine the different capacities of the countries analyzed to respond efficiently to the pandemic effect. Therefore, the contagion is the preponderant factor when analyzing the structural breakdown that occurred in the world economy.

17.
Rev Environ Contam Toxicol ; 250: 45-67, 2020.
Article in English | MEDLINE | ID: mdl-32318823

ABSTRACT

Atmospheric pollution derives mainly from anthropogenic activities that use combustion and may lead to adverse effects in exposed populations. It is generally accepted that air contamination causes cardiovascular and pulmonary morbidity in addition to increased mortality after exposure, but other epidemiological associations have also been described, including cancer as well as reproductive and immunological toxicity. Thus the concentration of chemicals in the air must be controlled. We propose that monitoring of air quality may be achieved by employing data analytics to generate information within the context of data-driven decision making to prevent and/or adequately alert the population about possible critical episodes of air contamination. In this paper, we propose a methodology for monitoring particulate matter pollution in Santiago, Chile, which is based on bivariate control charts with heavy-tailed asymmetric distributions. This methodology is useful for monitoring environmental risk when the particulate matter concentrations follow bivariate Birnbaum-Saunders or Birnbaum-Saunders-t-Student distributions. A case study with real particulate matter pollution data from Santiago is provided, which shows that the methodology is suitable for giving early alerts of episodes of extreme air pollution. The results are in agreement with the critical episodes reported with the current model used by the Chilean health authority.


Subject(s)
Air Pollutants, Air Pollution/adverse effects, Environmental Monitoring/methods, Particulate Matter, Air Pollutants/analysis, Air Pollutants/toxicity, Chile, Decision Making, Humans, Particulate Matter/toxicity
18.
J Appl Stat ; 47(13-15): 2690-2710, 2020.
Article in English | MEDLINE | ID: mdl-35707422

ABSTRACT

The Birnbaum-Saunders distribution is a widely studied model with diverse applications. Its origins are in the modeling of lifetimes associated with material fatigue. By using a motivating example, we show that, even when lifetime data related to fatigue are modeled, the Birnbaum-Saunders distribution can be unsuitable to fit these data in the distribution tails. Based on the nice properties of the Birnbaum-Saunders model, in this work, we use a modified skew-normal distribution to construct such a model. This allows us to obtain flexibility in skewness and kurtosis, which is controlled by a shape parameter. We provide a mathematical characterization of this new type of Birnbaum-Saunders distribution and then its statistical characterization is derived by using the maximum-likelihood method, including the associated information matrices. In order to improve the inferential performance, we correct the bias of the corresponding estimators, which is supported by a simulation study. To conclude our investigation, we retake the motivating example based on fatigue life data to show the good agreement between the new type of Birnbaum-Saunders distribution proposed in this work and the data, reporting its potential applications.

19.
Rev Assoc Med Bras (1992) ; 65(3): 394-403, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30994839

ABSTRACT

OBJECTIVE: To propose a program of physical-cognitive dual task and to measure its impact in Chilean institutionalized elderly adults. METHOD: Experimental design study with pre and post-intervention evaluations, measuring the cognitive and depressive levels by means of the Pfeiffer test and the Yesavage scale, respectively. The program was applied for 12 weeks to adults between 68 and 90 years old. The statistical analysis was based on the nonparametric Wilcoxon test for paired samples and was contrasted with its parametric version. The statistical software R was used. RESULTS: Statistically significant differences were obtained in the cognitive level (p-value < 0.05) and highly significant (p-value < 0.001) in the level of depression with both tests (parametric and nonparametric). CONCLUSION: Due to the almost null evidence of scientific interventions of programs that integrate physical activity and cognitive tasks together in Chilean elderly adults, a program of physical-cognitive dual task was proposed as a non-pharmacological treatment, easy to apply and of low cost to benefit their integral health, which improves significantly the cognitive and depressive levels of institutionalized elderly adults.


Subject(s)
Cognitive Behavioral Therapy/methods, Cognitive Dysfunction/therapy, Depressive Disorder/therapy, Exercise Therapy/methods, Mental Health, Program Evaluation, Aged, Aged, 80 and over, Aging/psychology, Chile, Cognitive Dysfunction/physiopathology, Depressive Disorder/physiopathology, Female, Housing for the Elderly, Humans, Institutionalization, Male, Psychiatric Status Rating Scales, Severity of Illness Index, Statistics, Nonparametric, Surveys and Questionnaires, Time Factors, Treatment Outcome
20.
PLoS One ; 14(3): e0212768, 2019.
Article in English | MEDLINE | ID: mdl-30822320

ABSTRACT

The objective of this paper is to propose a lot-sizing methodology for an inventory system that faces time-dependent random demands and that seeks to minimize total cost as a function of order, purchase, holding and shortage costs. A two-stage stochastic programming framework is derived to optimize lot-sizing decisions over a time horizon. To this end, we simulate a demand time-series by using a generalized autoregressive moving average structure. The modeling includes covariates of the demand, which are used as predictors of this. We describe an algorithm that summarizes the methodology and we discuss its computational framework. A case study with unpublished real-world data is presented to illustrate the potential of this methodology. We report that the accuracy of the demand variance estimator improves when a temporal structure is considered, instead of assuming time-independent demand. The methodology is useful in decisions related to inventory logistics management when the demand shows patterns of temporal dependence.
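The cost structure named in the abstract (order, purchase, holding, and shortage costs) can be illustrated with a single-period, scenario-based sketch: the lot size is chosen before demand is known, and holding or shortage costs are incurred once a demand scenario is revealed. This is a deliberate simplification with hypothetical names and invented numbers; the article works over a time horizon with demand simulated from a generalized autoregressive moving average structure, which is not reproduced here.

```python
def expected_total_cost(lot_size, demand_scenarios, order_cost,
                        unit_cost, holding_cost, shortage_cost):
    """Expected single-period cost over equally likely demand scenarios.

    First stage: the lot size is fixed before demand is known (order and
    purchase costs). Second stage: holding or shortage cost once the
    demand scenario is revealed.
    """
    first_stage = (order_cost if lot_size > 0 else 0.0) + unit_cost * lot_size
    second_stage = 0.0
    for demand in demand_scenarios:
        leftover = max(lot_size - demand, 0)
        unmet = max(demand - lot_size, 0)
        second_stage += holding_cost * leftover + shortage_cost * unmet
    return first_stage + second_stage / len(demand_scenarios)


def best_lot_size(candidates, **cost_args):
    """Pick the candidate lot size with the lowest expected total cost."""
    return min(candidates, key=lambda q: expected_total_cost(q, **cost_args))


# Invented scenarios and unit costs.
args = dict(demand_scenarios=[8, 10, 12, 20], order_cost=5,
            unit_cost=2, holding_cost=1, shortage_cost=6)
print(expected_total_cost(10, **args))    # 43.5
print(best_lot_size(range(0, 25), **args))  # 12
```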


Subject(s)
Models, Theoretical, Pharmaceutical Preparations/supply & distribution, Chile, Humans