Results 1 - 8 of 8
1.
Water Sci Technol ; 71(1): 89-96, 2015.
Article in English | MEDLINE | ID: mdl-25607674

ABSTRACT

A predictive modelling technique was employed to estimate wastewater temperatures in sewer pipes. The simplicity of abductive predictive models attracts large numbers of users due to their minimal computation time and limited number of measurable input parameters. Data measured from five sewer pipes over a period of 12 months provided 33,900 training entries and 39,000 evaluation entries to support the models' development. Two simple predictive models for urban upstream combined sewers and large downstream collector sewers were developed. They delivered good correlation between measured and predicted wastewater temperatures, as shown by R² values of up to 0.98 and a root mean square error (RMSE) of the temperature change along the sewer pipe ranging from 0.15 °C to 0.33 °C. Analysis of a number of potential input parameters indicated that upstream wastewater temperature and downstream in-sewer air temperature were the only input parameters needed in the developed models to deliver this level of accuracy.
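The two accuracy figures quoted in this abstract can be computed as in the minimal sketch below. It assumes the common definition of R² as one minus the ratio of residual to total sum of squares; the abstract does not say which definition the authors used, and the temperature values are invented for illustration.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def r_squared(measured, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical wastewater temperatures in degrees C (not from the paper).
measured = [12.0, 13.0, 14.0, 15.0]
predicted = [12.5, 12.5, 14.5, 14.5]

model_rmse = rmse(measured, predicted)
model_r2 = r_squared(measured, predicted)
```

With these invented values every prediction is off by 0.5 °C, so the RMSE is 0.5 °C and R² is 0.8.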


Subject(s)
Models, Theoretical , Sewage/analysis , Temperature , Waste Disposal, Fluid , Wastewater/analysis , Belgium , Cities
2.
J Biomed Inform ; 38(6): 456-68, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16337569

ABSTRACT

Medical applications are often characterized by a large number of disease markers and a relatively small number of data records. We demonstrate that complete feature ranking followed by selection can lead to appreciable reductions in data dimensionality, with significant improvements in the implementation and performance of classifiers for medical diagnosis. We describe a novel approach for ranking all features according to their predictive quality using properties unique to learning algorithms based on the group method of data handling (GMDH). An abductive network training algorithm is repeatedly used to select groups of optimum predictors from the feature set at gradually increasing levels of model complexity specified by the user. Groups selected earlier are better predictors. The process is then repeated to rank features within individual groups. The resulting full feature ranking can be used to determine the optimum feature subset by starting at the top of the list and progressively including more features until the classification error rate on an out-of-sample evaluation set starts to increase due to overfitting. The approach is demonstrated on two medical diagnosis datasets (breast cancer and heart disease) and comparisons are made with other feature ranking and selection methods. Receiver operating characteristics (ROC) analysis is used to compare classifier performance. At default model complexity, dimensionality reduction of 22 and 54% could be achieved for the breast cancer and heart disease data, respectively, leading to improvements in the overall classification performance. For both datasets, considerable dimensionality reduction introduced no significant reduction in the area under the ROC curve. GMDH-based feature selection results have also proved effective with neural network classifiers.
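The subset-selection step described in this abstract (walk down the ranked list and stop when the evaluation error starts to rise) might be sketched as follows. The function `select_subset`, the feature names, and the error table are hypothetical; in practice `eval_error` would train a classifier on the candidate features and score it on the out-of-sample evaluation set.

```python
def select_subset(ranked_features, eval_error):
    """Add features in rank order and stop when evaluation error first increases.

    ranked_features: feature names, best predictor first.
    eval_error: callable mapping a list of features to an out-of-sample
                error rate (hypothetical stand-in for train-and-evaluate).
    """
    best_subset = []
    best_error = float("inf")
    for k in range(1, len(ranked_features) + 1):
        candidate = ranked_features[:k]
        err = eval_error(candidate)
        if err > best_error:
            break  # error started to increase: keep the previous subset
        best_subset, best_error = candidate, err
    return best_subset, best_error

# Invented error rates indexed by subset size, for illustration only.
toy_errors = {1: 0.30, 2: 0.22, 3: 0.18, 4: 0.21, 5: 0.25}
ranked = ["f1", "f2", "f3", "f4", "f5"]

chosen, chosen_error = select_subset(ranked, lambda feats: toy_errors[len(feats)])
```

Here the error falls until three features are included and rises at the fourth, so the sketch keeps the top three.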


Subject(s)
Disease/classification , Electronic Data Processing/methods , Breast Neoplasms/classification , Female , Heart Diseases/classification , Humans , Medical Informatics , Nerve Net , Reproducibility of Results
3.
Comput Methods Programs Biomed ; 80(2): 141-53, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16169631

ABSTRACT

This paper demonstrates the use of abductive network classifier committees trained on different features for improving classification accuracy in medical diagnosis. In an earlier publication, committee members were trained on different subsets of the training set to ensure enough diversity for improved committee performance. In situations characterized by high data dimensionality, i.e. a large number of features and relatively few training examples, it may be more advantageous to split the feature set rather than the training set. We describe a novel approach for tentatively ranking the features and forming subsets of uniform predictive quality for training individual members. The abductive network training algorithm is used to select optimum predictors from the feature set at various levels of model complexity specified by the user. Using the resulting tentative ranking, the features are grouped into mutually exclusive subsets of approximately equal predictive power for training the members. The approach is demonstrated on three standard medical diagnosis datasets (breast cancer, heart disease, and diabetes). Three-member committees trained on different feature subsets and using simple output combination methods reduce classification errors by up to 20% compared to the best single model developed with the full feature set. Results are compared with those reported previously with members trained through splitting the training set. Training abductive committee members on feature subsets of approximately equal predictive power achieves both diversity and quality for improved committee performance. Ensemble feature subset selection can be performed using GMDH-based learning algorithms. The approach should be advantageous in situations characterized by high data dimensionality.
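One simple way to turn a ranked feature list into mutually exclusive subsets of roughly equal predictive power is to deal the features out round-robin, as in the sketch below. This is an assumption made for illustration: the abstract does not state how the authors formed the groups, and the feature names are placeholders.

```python
def split_ranked_features(ranked_features, n_members):
    """Deal ranked features round-robin into n_members disjoint subsets.

    Each subset receives one feature from every successive band of the
    ranking, so strong and weak predictors are spread across members.
    """
    return [ranked_features[i::n_members] for i in range(n_members)]

# Placeholder feature names, best predictor first.
ranked = ["f1", "f2", "f3", "f4", "f5", "f6", "f7"]
subsets = split_ranked_features(ranked, 3)
# subsets == [["f1", "f4", "f7"], ["f2", "f5"], ["f3", "f6"]]
```

Each committee member would then be trained on one subset only, which is what provides the diversity the abstract refers to.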


Subject(s)
Disease/classification , Algorithms , Breast Neoplasms/diagnosis , Diabetes Mellitus/diagnosis , Female , Heart Diseases/diagnosis , Humans , Male , Neural Networks, Computer
4.
Methods Inf Med ; 43(2): 192-201, 2004.
Article in English | MEDLINE | ID: mdl-15136869

ABSTRACT

OBJECTIVES: To introduce abductive network classifier committees as an ensemble method for improving classification accuracy in medical diagnosis. While neural networks allow many ways to introduce enough diversity among member models to improve performance when forming a committee, the self-organizing, automatic-stopping nature of abductive networks and their learning approach are not very conducive to this purpose. We explore ways of overcoming this limitation and demonstrate improved classification on three standard medical datasets. METHODS: Two standard 2-class medical datasets (Pima Indians Diabetes and Heart Disease) and a 6-class dataset (Dermatology) were used to investigate ways of training abductive networks with adequate independence, as well as methods of combining their outputs to form a network that improves performance beyond that of single models. RESULTS: Two- or three-member committees of models trained on completely or partially different subsets of training data and using simple output combination methods achieve improvements between 2 and 5 percentage points in the classification accuracy over the best single model developed using the full training set. CONCLUSIONS: Varying model complexity alone gives abductive network models that are too correlated to ensure enough diversity for forming a useful committee. Diversity achieved through training member networks on independent subsets of the training data outweighs limitations of the smaller training set for each, resulting in net gain in committee performance. As such models train faster and can be trained in parallel, this can also speed up classifier development.
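The abstract mentions "simple output combination methods" without naming them. Majority voting is one common choice and is sketched below purely as an assumption; the member predictions are invented.

```python
from collections import Counter

def majority_vote(member_predictions):
    """Combine class labels from several committee members by majority vote.

    member_predictions: list of per-member label lists, all the same length.
    Ties go to the label seen first among the members' votes.
    """
    combined = []
    for votes in zip(*member_predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Invented predictions from three members on four cases (1 = disease, 0 = healthy).
members = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
committee_output = majority_vote(members)
```

In this toy case each member disagrees with the others on exactly one case, yet the committee output is [1, 0, 1, 1], which illustrates why members with uncorrelated errors can outperform any single model.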


Subject(s)
Disease/classification , Medical Informatics , Data Collection , Humans , Saudi Arabia
5.
Comput Methods Programs Biomed ; 58(1): 69-81, 1999 Jan.
Article in English | MEDLINE | ID: mdl-10195648

ABSTRACT

The cluster analysis technique is considered for classifying kidney stones based on data for nine chemical analysis parameters. A set of 214 stones is used, which has been previously classified using empirical classification rules into three stone types using the percentage concentrations of the urate, oxalate, and phosphate radicals. We investigate whether cluster analysis utilising data on all parameters leads to different classifications and explore the possibility of other effective classifiers. We also compare the performance of various clustering techniques, distance and similarity measures and data standardisation methods. Results indicate that inclusion of the additional six parameters does not improve the classification accuracy. Best matching with the empirical classification (6% error) is achieved using the average linkage (between groups) clustering method and the squared Euclidean distance measure without data standardisation. Excluding these three main radicals causes a 63% matching error. Cluster analysis results suggest that carbon ions alone provide a single classifier for the three stone types, giving a matching error of approximately 10% with the empirical classification.
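Average linkage (between groups) clustering with the squared Euclidean distance, the best-performing combination in this abstract, can be illustrated with the naive agglomerative sketch below. It uses invented two-dimensional points rather than the nine chemical parameters of the study, and it is written for clarity, not speed.

```python
from itertools import combinations

def sq_euclidean(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def average_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Cluster distance is the mean pairwise distance between members of
    the two clusters (average linkage between groups).
    """
    clusters = [[i] for i in range(len(points))]

    def dist(c1, c2):
        total = sum(sq_euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: dist(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]

# Invented points forming three obvious groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
groups = average_linkage(points, 3)
```

The result lists point indices per cluster; for these points the sketch groups indices 0 and 1, indices 2 and 3, and leaves index 4 on its own.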


Subject(s)
Kidney Calculi/chemistry , Kidney Calculi/classification , Cluster Analysis , Humans , Ions , Oxalates/analysis , Phosphates/analysis , Uric Acid/analysis
6.
Comput Methods Programs Biomed ; 56(3): 235-47, 1998 Jun.
Article in English | MEDLINE | ID: mdl-9725649

ABSTRACT

Two univariate time-series analysis methods have been used to model and forecast the monthly patient volume at the family and community medicine primary health care clinic of King Faisal University, Al-Khobar, Saudi Arabia. Models were based on nine years of data and forecasts made for 2 years. The optimum ARIMA model selected is an autoregressive model of the fourth order operating on the data after differencing twice at the nonseasonal level and once at the seasonal level. It gives mean and maximum absolute percentage errors of 1.86 and 4.23%, respectively, over the forecasting interval. A much simpler method based on extrapolating the growth curve of the annual means of the patient volume using a polynomial fit gives the better figures of 0.55 and 1.17%, respectively. This is due to the fairly regular nature of the data and the lack of strong random components that require ARIMA processes for modeling.
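The mean and maximum absolute percentage errors used to compare the two forecasting methods can be computed as in this minimal sketch. The patient-volume figures are invented and the function name is a placeholder.

```python
def absolute_percentage_errors(actual, forecast):
    """Per-period absolute percentage error, in percent."""
    return [abs(a - f) / abs(a) * 100.0 for a, f in zip(actual, forecast)]

# Invented monthly patient volumes and forecasts, for illustration only.
actual = [1000, 2000, 4000]
forecast = [990, 2040, 4000]

errors = absolute_percentage_errors(actual, forecast)
mean_ape = sum(errors) / len(errors)  # mean absolute percentage error
max_ape = max(errors)                 # maximum absolute percentage error
```

For these invented values the per-period errors are 1%, 2% and 0%, giving a mean of 1% and a maximum of 2%.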


Subject(s)
Ambulatory Care , Computer Simulation , Health Services Needs and Demand , Mathematical Computing , Analysis of Variance , Forecasting , Patient Admission/statistics & numerical data , Primary Health Care/statistics & numerical data , Saudi Arabia
7.
Comput Biomed Res ; 30(6): 451-71, 1997 Dec.
Article in English | MEDLINE | ID: mdl-9466835

ABSTRACT

This paper investigates the use of abductive-network machine learning for modeling and predicting outcome parameters in terms of input parameters in medical survey data. Here we consider modeling obesity as represented by the waist-to-hip ratio (WHR) risk factor to investigate the influence of various parameters. The same approach would be useful in predicting values of clinical parameters that are difficult or expensive to measure from others that are more readily available. The AIM abductive network machine learning tool was used to model the WHR from 13 other health parameters. Survey data were collected for a randomly selected sample of 1100 persons aged 20 yr and over attending nine primary health care centers at Al-Khobar, Saudi Arabia. Models were synthesized by training on a randomly selected set of 800 cases, using both continuous and categorical representations of the parameters, and evaluated by predicting the WHR value for the remaining 300 cases. Models for WHR as a continuous variable predict the actual values within an error of 7.5% at the 90% confidence limits. Categorical models predict the correct logical value of WHR with an error in only 2 of the 300 evaluation cases. Analytical relationships derived from simple categorical models explain global observations on the total survey population to an accuracy as high as 99%. Simple continuous models represented as analytical functions highlight global relationships and trends. Results confirm the strong correlation between WHR and diastolic blood pressure, cholesterol level, and family history of obesity. Compared to other statistical and neural network approaches, AIM abductive networks provide faster and more automated model synthesis. A review is given of other areas where the proposed modeling approach can be useful in clinical practice.


Subject(s)
Models, Biological , Neural Networks, Computer , Obesity/etiology , Adult , Algorithms , Blood Glucose/analysis , Blood Pressure , Body Constitution , Body Mass Index , Body Weight , Cholesterol/blood , Confidence Intervals , Diastole , Female , Forecasting , Humans , Male , Nonlinear Dynamics , Obesity/genetics , Outcome Assessment, Health Care , Risk Factors , Saudi Arabia , Triglycerides/blood
8.
Methods Inf Med ; 35(3): 265-71, 1996 Sep.
Article in English | MEDLINE | ID: mdl-8952313

ABSTRACT

The use of modern abductive machine learning techniques is described for modeling and predicting outcome parameters in terms of input parameters in medical survey data. The AIM (Abductory Induction Mechanism) abductive network machine-learning tool is used to model the educational score in a health survey of 2,720 Albanian primary school children. Data included the child's age, gender, vision, nourishment, parasite infection, family size, parents' education, and educational score. Models synthesized by training on just 100 cases predict the educational score output for the remaining 2,620 cases with 100% accuracy. Simple models represented as analytical functions highlight global relationships and trends in the survey population. Models generated are quite robust, with no change in the basic model structure for a 10-fold increase in the size of the training set. Compared to other statistical and neural network approaches, AIM provides faster and highly automated model synthesis, requiring little or no user intervention.


Subject(s)
Achievement , Education , Health Surveys , Child , Computer Simulation , Female , Humans , Male , Neural Networks, Computer , School Health Services