Search | VHL Regional Portal

A discussion of calibration techniques for evaluating binary and categorical predictive models.

Fenlon, Caroline; O'Grady, Luke; Doherty, Michael L; Dunnion, John.

Prev Vet Med ; 149: 107-114, 2018 Jan 01.

Article in English | MEDLINE | ID: mdl-29290291

ABSTRACT

Modelling of binary and categorical events is a commonly used tool to simulate epidemiological processes in veterinary research. Logistic and multinomial regression, naïve Bayes, decision trees and support vector machines are popular data mining techniques used to predict the probabilities of events with two or more outcomes. Thorough evaluation of a predictive model is important to validate its ability for use in decision-support or broader simulation modelling. Measures of discrimination, such as sensitivity, specificity and receiver operating characteristics, are commonly used to evaluate how well the model can distinguish between the possible outcomes. However, these discrimination tests cannot confirm that the predicted probabilities are accurate and without bias. This paper describes a range of calibration tests, which typically measure the accuracy of predicted probabilities by comparing them to mean event occurrence rates within groups of similar test records. These include overall goodness-of-fit statistics in the form of the Hosmer-Lemeshow and Brier tests. Visual assessment of prediction accuracy is carried out using plots of calibration and deviance (the difference between the outcome and its predicted probability). The slope and intercept of the calibration plot are compared to the perfect diagonal using the unreliability test. Mean absolute calibration error provides an estimate of the level of predictive error. This paper uses sample predictions from a binary logistic regression model to illustrate the use of calibration techniques. Code is provided to perform the tests in the R statistical programming language. The benefits and disadvantages of each test are described. Discrimination tests are useful for establishing a model's diagnostic abilities, but may not suitably assess the model's usefulness for other predictive applications, such as stochastic simulation. 
Calibration tests may be more informative than discrimination tests for evaluating models with a narrow range of predicted probabilities or overall prevalence close to 50%, which are common in epidemiological applications. Using a suite of calibration tests alongside discrimination tests allows model builders to thoroughly measure their model's predictive capabilities.
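
As an illustration of two calibration measures named in this abstract, the Python sketch below computes a Brier score and a binned mean absolute calibration error. It is not the R code that accompanies the paper. The function names, the equal-sized grouping of records sorted by predicted probability, and the default of 10 groups are assumptions made for this example; the paper's exact definitions may differ.

```python
from statistics import mean


def brier_score(probs, outcomes):
    """Mean squared difference between each predicted probability and its 0/1 outcome."""
    return mean((p - y) ** 2 for p, y in zip(probs, outcomes))


def mean_absolute_calibration_error(probs, outcomes, n_groups=10):
    """Average |mean predicted probability - observed event rate| over groups.

    Records are sorted by predicted probability and split into roughly
    equal-sized groups (an assumed grouping scheme for this sketch).
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    errors = []
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue  # more groups than records: skip empty slices
        mean_pred = mean(p for p, _ in group)
        observed_rate = mean(y for _, y in group)
        errors.append(abs(mean_pred - observed_rate))
    return mean(errors)


# Hypothetical predictions and outcomes, for illustration only.
example_probs = [0.2, 0.2, 0.8, 0.8]
example_outcomes = [0, 0, 1, 1]
example_brier = brier_score(example_probs, example_outcomes)
example_mace = mean_absolute_calibration_error(example_probs, example_outcomes, n_groups=2)
```

Both measures are zero for perfect predictions. The Brier score is computed per record, whereas the calibration error depends on how the records are grouped.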

Subject(s)

Logistic Models , Models, Biological , Veterinary Medicine/methods , Calibration , Predictive Value of Tests , ROC Curve

The creation and evaluation of a model to simulate the probability of conception in seasonal-calving pasture-based dairy heifers.

Fenlon, Caroline; O'Grady, Luke; Butler, Stephen; Doherty, Michael L; Dunnion, John.

Ir Vet J ; 70: 32, 2017.

Article in English | MEDLINE | ID: mdl-29201347

ABSTRACT

BACKGROUND: Herd fertility in pasture-based dairy farms is a key driver of farm economics. Models for predicting nulliparous reproductive outcomes are rare, but age, genetics, weight, and BCS have been identified as factors influencing heifer conception. The aim of this study was to create a simulation model of heifer conception to service with thorough evaluation. METHODS: Artificial Insemination service records from two research herds and ten commercial herds were provided to build and evaluate the models. All were managed as spring-calving pasture-based systems. The factors studied were related to age, genetics, and time of service. The data were split into training and testing sets and bootstrapping was used to train the models. Logistic regression (with and without random effects) and generalised additive modelling were selected as the model-building techniques. Two types of evaluation were used to test the predictive ability of the models: discrimination and calibration. Discrimination, which includes sensitivity, specificity, accuracy and ROC analysis, measures a model's ability to distinguish between positive and negative outcomes. Calibration measures the accuracy of the predicted probabilities with the Hosmer-Lemeshow goodness-of-fit, calibration plot and calibration error. RESULTS: After data cleaning and the removal of services with missing values, 1396 services remained to train the models and 597 were left for testing. Age, breed, genetic predicted transmitting ability for calving interval, month and year were significant in the multivariate models. The regression models also included an interaction between age and month. Year within herd was a random effect in the mixed regression model. Overall prediction accuracy was between 77.1% and 78.9%. All three models had very high sensitivity, but low specificity. The two regression models were very well-calibrated. The mean absolute calibration errors were all below 4%. 
CONCLUSION: Because the models were not adept at identifying unsuccessful services, they are not suggested for use in predicting the outcome of individual heifer services. Instead, they are useful for the comparison of services with different covariate values or as sub-models in whole-farm simulations. The mixed regression model was identified as the best model for prediction, as the random effects can be ignored and the other variables can be easily obtained or simulated.
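
The discrimination measures listed in this abstract (sensitivity, specificity and accuracy) can be sketched in Python as follows. The function name, the 0.5 classification threshold and the sample data are assumptions for illustration and are not taken from the study.

```python
def discrimination_summary(probs, outcomes, threshold=0.5):
    """Classify each record as positive when its probability >= threshold,
    then report sensitivity, specificity and accuracy against 0/1 outcomes."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, outcomes):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    nan = float("nan")
    return {
        # Proportion of true positives that were classified positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Proportion of true negatives that were classified negative.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Proportion of all records classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical data: most predictions sit above the threshold, so almost
# every record is classified positive.
summary = discrimination_summary([0.9, 0.8, 0.7, 0.6, 0.4], [1, 1, 1, 0, 0])
```

When nearly all predicted probabilities exceed the threshold, sensitivity is high and specificity is low, which is the pattern the abstract reports for all three models.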

A comparison of 4 predictive models of calving assistance and difficulty in dairy heifers and cows.

Fenlon, Caroline; O'Grady, Luke; Mee, John F; Butler, Stephen T; Doherty, Michael L; Dunnion, John.

J Dairy Sci ; 100(12): 9746-9758, 2017 Dec.

Article in English | MEDLINE | ID: mdl-28941818

ABSTRACT

The aim of this study was to build and compare predictive models of calving difficulty in dairy heifers and cows for the purpose of decision support and simulation modeling. Models to predict 3 levels of calving difficulty (unassisted, slight assistance, and considerable or veterinary assistance) were created using 4 machine learning techniques: multinomial regression, decision trees, random forests, and neural networks. The data used were sourced from 2,076 calving records in 10 Irish dairy herds. In total, 19.9 and 5.9% of calving events required slight assistance and considerable or veterinary assistance, respectively. Variables related to parity, genetics, BCS, breed, previous calving, and reproductive events and the calf were included in the analysis. Based on a stepwise regression modeling process, the variables included in the models were the dam's direct and maternal calving difficulty predicted transmitting abilities (PTA), BCS at calving, parity; calving assistance or difficulty at the previous calving; proportion of Holstein breed; sire breed; sire direct calving difficulty PTA; twinning; and 2-way interactions between calving BCS and previous calving difficulty and the direct calving difficulty PTA of dam and sire. The models were built using bootstrapping procedures on 70% of the data set. The held-back 30% of the data was used to evaluate the predictive performance of the models in terms of discrimination and calibration. The decision tree and random forest models omitted the effect of twinning and included only subsets of sire breeds. Only multinomial regression and neural networks explicitly included the modeled interactions. Calving BCS, calving difficulty PTA, and previous calving assistance ranked as highly important variables for all 4 models. The area under the receiver operating characteristic curve (ranging from 0.64 to 0.79) indicates that all of the models had good overall discriminatory power. 
The neural network and multinomial regression models performed best, correctly classifying 75% of calving cases and showing superior calibration, with an average error in predicted probability of 3.7 and 4.5%, respectively. The neural network and multinomial regression models developed are both suitable for use in decision-support and simulation modeling.

Subject(s)

Cattle Diseases/epidemiology , Cattle/physiology , Dairying/methods , Dystocia/veterinary , Models, Theoretical , Parturition , Animals , Cattle Diseases/physiopathology , Decision Support Techniques , Dystocia/epidemiology , Dystocia/physiopathology , Female , Incidence , Ireland/epidemiology , Machine Learning , Pregnancy

The creation and evaluation of a model predicting the probability of conception in seasonal-calving, pasture-based dairy cows.

Fenlon, Caroline; O'Grady, Luke; Doherty, Michael L; Dunnion, John; Shalloo, Laurence; Butler, Stephen T.

J Dairy Sci ; 100(7): 5550-5563, 2017 Jul.

Article in English | MEDLINE | ID: mdl-28477998

ABSTRACT

Reproductive performance in pasture-based production systems has a fundamentally important effect on economic efficiency. The individual factors affecting the probability of submission and conception are multifaceted and have been extensively researched. The present study analyzed some of these factors in relation to service-level probability of conception in seasonal-calving pasture-based dairy cows to develop a predictive model of conception. Data relating to 2,966 services from 737 cows on 2 research farms were used for model development and data from 9 commercial dairy farms were used for model testing, comprising 4,212 services from 1,471 cows. The data spanned a 15-yr period and originated from seasonal-calving pasture-based dairy herds in Ireland. The calving season for the study herds extended from January to June, with peak calving in February and March. A base mixed-effects logistic regression model was created using a stepwise model-building strategy and incorporated parity, days in milk, interservice interval, calving difficulty, and predicted transmitting abilities for calving interval and milk production traits. To attempt to further improve the predictive capability of the model, the addition of effects that were not statistically significant was considered, resulting in a final model composed of the base model with the inclusion of BCS at service. The models' predictions were evaluated using discrimination to measure their ability to correctly classify positive and negative cases. Precision, recall, F-score, and area under the receiver operating characteristic curve (AUC) were calculated. Calibration tests measured the accuracy of the predicted probabilities. These included tests of overall goodness-of-fit, bias, and calibration error. Both models performed better than using the population average probability of conception. 
Neither of the models showed high levels of discrimination (base model AUC 0.61, final model AUC 0.62), possibly because of the narrow central range of conception rates in the study herds. The final model was found to reliably predict the probability of conception without bias when evaluated against the full external data set, with a mean absolute calibration error of 2.4%. The chosen model could be used to support a farmer's decision-making and in stochastic simulation of fertility in seasonal-calving pasture-based dairy cows.
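
A minimal Python sketch of the discrimination measures named in this abstract (precision, recall, F-score and AUC) follows. The AUC is computed as the proportion of positive-negative pairs in which the positive record receives the higher predicted probability, with ties counted as one half. The function names, the 0.5 threshold and the sample data are assumptions for this example, not the authors' implementation.

```python
def auc(probs, outcomes):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [p for p, y in zip(probs, outcomes) if y == 1]
    negatives = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = 0.0
    for pp in positives:
        for pn in negatives:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def precision_recall_f1(probs, outcomes, threshold=0.5):
    """Precision, recall and F-score (F1) at an assumed classification threshold."""
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, outcomes) if p < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical predictions, for illustration only.
example_probs = [0.9, 0.8, 0.3, 0.2]
example_outcomes = [1, 0, 1, 0]
example_auc = auc(example_probs, example_outcomes)
example_prf = precision_recall_f1(example_probs, example_outcomes)
```

Because the AUC depends only on how records are ranked, a model whose predictions fall in a narrow range can score a low AUC while its probabilities remain well calibrated, which matches the explanation the abstract offers.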

Subject(s)

Fertilization/physiology , Models, Statistical , Probability , Seasons , Animals , Breeding/statistics & numerical data , Cattle , Dairying , Female , Ireland , Lactation , Milk , Poaceae , Pregnancy

The impact of removal of the seasonality formula on the eligibility of Irish herds to supply raw milk for processing of dairy products.

Fenlon, Caroline; O'Grady, Luke; McCoy, Finola; Houtsma, Erik; More, Simon J.

Ir Vet J ; 70: 9, 2017.

Article in English | MEDLINE | ID: mdl-28250916

ABSTRACT

BACKGROUND: The dairy industry in Ireland is expanding rapidly, with a focus on the production of high quality milk. Somatic cell counts (SCC) are an important indicator both of udder health and milk quality. Milk sold by Irish farmers for manufacture must comply with EU regulations. Irish SCC data is also subject to a monthly seasonal adjustment, for four months from November to February, on account of the seasonality of milk production in Ireland. In a recent study, however, there was no evidence of a dilution effect on SCC with increasing milk yield in Irish dairy cattle. The aim of this paper is to estimate the impact of removal of the seasonality formula on the eligibility of Irish herds to supply raw milk for processing of dairy products. METHODS: Bulk tank SCC data from 2013 were collected from 14 cooperatives in Ireland. The geometric mean of SCC test results was calculated for each calendar month. We then calculated the number of herds and volume of milk supplied falling in three SCC categories (<200,000, 200,000-400,000, >400,000 cells/mL) in Ireland during 2013 based on their geometric mean SCC every month. Each herd was assigned an 'eligibility to supply' status (always compliant, under warning (first warning, second warning, third warning) and liable for suspension) each month based on their 3-month rolling geometric mean, using methods as outlined in EU and Irish legislation. Two methods were used to calculate the 3-month rolling geometric mean. We then determined the number of herds and volume of milk supplied by 'eligibility to supply' status in Ireland during 2013. All calculations were conducted with and without the seasonality adjustment. RESULTS: The analyses were performed on 2,124,864 records, including 1,571,363 SCC test results from 16,740 herds. With the seasonality adjustment in place, 860 (5.1%) or 854 (5.1%) of herds should have been liable for suspension during 2013 if calculation method 1 or 2, respectively, had been used. 
If the seasonality adjustment were removed, it is estimated that the number of herds liable for suspension would increase from 860 to 974 (13.2% increase) using calculation method 1, or from 854 to 964 (12.9% increase) using calculation method 2. CONCLUSIONS: The modelled impact of such removal would be relatively minor, based on available data, regardless of the method used to calculate the 3-month rolling geometric mean. The focus of the current study was quite narrow, effectively from July to December 2013. Therefore, the results are an underestimate of the total number of herds liable for suspension during 2013. They may also underestimate the true percentage change in herds liable for suspension, with the removal of the seasonality formula. A national herd identifier was lacking from a sizeable percentage of the 2013 bulk tank SCC data, but will be needed if these data are to be meaningfully used for this or other purposes.
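
The 3-month rolling geometric mean described in the methods can be sketched in Python as below. The abstract does not say how its two calculation methods differ, so this shows only one plausible reading: the geometric mean of three consecutive monthly geometric means. The function names and sample values are hypothetical, and treating 400,000 cells/mL (the upper SCC category in the abstract) as the limit is an assumption of this sketch.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values, computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def rolling_geometric_mean(monthly_values, window=3):
    """Geometric mean over each run of `window` consecutive months.

    The first result corresponds to the month at index window - 1.
    """
    return [
        geometric_mean(monthly_values[i - window + 1:i + 1])
        for i in range(window - 1, len(monthly_values))
    ]


def exceeds_limit(rolling_values, limit=400_000):
    """Flag each rolling mean above the assumed limit (cells/mL)."""
    return [value > limit for value in rolling_values]


# Hypothetical monthly geometric mean SCC values for one herd (cells/mL).
monthly_scc = [250_000, 380_000, 520_000, 610_000]
rolling = rolling_geometric_mean(monthly_scc)
flags = exceeds_limit(rolling)
```

The geometric mean damps the influence of a single very high test result compared with the arithmetic mean, which suits the skewed distribution of SCC data.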
