ABSTRACT
Background: Sleep stage identification is critical in multiple areas (e.g. medicine or psychology) to diagnose sleep-related disorders. Previous studies have reported that the performance of machine learning algorithms in sleep stage classification can vary depending on the biosignals and feature-extraction processes used. Methods: To compare as many conditions as possible, 414 experimental conditions were applied, considering combinations of different biosignals, biosignal lengths, and window lengths. Five biosignals in polysomnography (i.e. electrocardiogram (ECG), electroencephalogram (EEG), electromyogram (EMG), electrooculogram left, and electrooculogram right) were used to identify optimal signal combinations for classification. In addition, three different signal-length conditions and six different window-length conditions were applied. The validity of each condition was examined via the classification performance of XGBoost classifiers trained using 10-fold cross-validation. Furthermore, feature importance results were examined to validate the experimental results in terms of model explanation. Results: The combination of EEG + EMG + ECG with a 40 s window and 120 s signal length resulted in the best classification performance (precision: 0.853, recall: 0.855, F1-score: 0.853, and accuracy: 0.853). Across conditions and in the feature importance results, EEG signals showed relatively higher importance for classification in the present study. Conclusion: We determined the optimal biosignal and window conditions for the feature-extraction process in machine learning algorithm-based sleep stage classification. Our experimental results can inform researchers conducting related studies in the future. To generalize our results, more diverse methodologies and conditions should be applied in future studies.
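As a rough illustration of the signal-length and window-length conditions described above, the following sketch splits one signal segment into non-overlapping windows and computes simple per-window statistics. The sampling rate, the use of mean and standard deviation as features, and the name window_features are assumptions made for illustration; the abstract does not state which features were extracted or whether windows overlapped.

```python
from statistics import mean, pstdev


def window_features(signal, fs_hz, window_s):
    """Split `signal` into non-overlapping windows; return (mean, std) per window."""
    size = int(fs_hz * window_s)  # samples per window
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    feats = []
    # Step by one full window; a trailing partial window is dropped.
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        feats.append((mean(window), pstdev(window)))
    return feats


# Hypothetical 120 s segment sampled at 1 Hz, split into 40 s windows.
segment = [float(i % 10) for i in range(120)]
features = window_features(segment, fs_hz=1, window_s=40)
```

With these toy settings a 120 s segment yields three 40 s windows, so each segment contributes three feature tuples per biosignal.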
ABSTRACT
Background: Despite multimodal assessment (clinical examination, biology, brain MRI, electroencephalography, somatosensory evoked potentials, mismatch negativity at auditory evoked potentials), coma prognostic evaluation remains challenging. Methods: We present here a method to predict the return to consciousness and good neurological outcome based on classification of auditory evoked potentials obtained during an oddball paradigm. Data from event-related potentials (ERPs) were recorded noninvasively using four surface electroencephalography (EEG) electrodes in a cohort of 29 post-cardiac arrest comatose patients (between day 3 and day 6 following admission). We retrospectively extracted several EEG features (standard deviation and similarity for standard auditory stimulations, and number of extrema and oscillations for deviant auditory stimulations) from the time responses in a window of a few hundred milliseconds. The responses to the standard and the deviant auditory stimulations were thus considered independently. By combining these features, based on machine learning, we built a two-dimensional map to evaluate possible group clustering. Results: Analysis in two dimensions of the present data revealed two separated clusters of patients with good versus bad neurological outcome. When favoring the highest specificity of our mathematical algorithms (0.91), we found a sensitivity of 0.83 and an accuracy of 0.90, maintained when calculation was performed using data from only one central electrode. Using Gaussian, K-neighborhood and SVM classifiers, we could predict the neurological outcome of post-anoxic comatose patients, the validity of the method being tested by a cross-validation procedure. Moreover, the same results were obtained with one single electrode (Cz).
Conclusion: Statistics of standard and deviant responses considered separately provide complementary and confirmatory predictions of the outcome of anoxic comatose patients, better assessed when combining these features on a two-dimensional statistical map. The benefit of this method compared to classical EEG and ERP predictors should be tested in a large prospective cohort. If validated, this method could provide an alternative tool to intensivists, to better evaluate neurological outcome and improve patient management, without neurophysiologist assistance.
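For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from the four cells of a binary confusion matrix. The counts in the example are hypothetical and are not the study's data; which outcome is treated as the positive class is also an assumption left to the caller.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positives among all actual positives
    specificity = tn / (tn + fp)  # true negatives among all actual negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only.
sens, spec, acc = binary_metrics(tp=8, fn=2, tn=9, fp=1)
```

Favoring specificity, as the abstract describes, corresponds to choosing a decision threshold that keeps the false-positive count low, usually at some cost in sensitivity.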
ABSTRACT
The integration of Micro-Electro-Mechanical Systems (MEMS) sensor technology in smartphones has greatly improved the capability for Human Activity Recognition (HAR). By utilizing Machine Learning (ML) techniques and data from these sensors, various human motion activities can be classified. This study performed experiments and compiled a large dataset of nine daily activities: Laying Down, Stationary, Walking, Brisk Walking, Running, Stairs-Up, Stairs-Down, Squatting, and Cycling. Several ML models, such as Decision Tree Classifier, Random Forest Classifier, K Neighbors Classifier, Multinomial Logistic Regression, Gaussian Naive Bayes, and Support Vector Machine, were trained on sensor data collected from the accelerometer, gyroscope, and magnetometer embedded in smartphones and wearable devices. The highest test accuracy among these models, 95%, was achieved using the random forest algorithm. Additionally, a custom-built Bidirectional Long Short-Term Memory (Bi-LSTM) model, a type of Recurrent Neural Network (RNN), was proposed and yielded an improved test accuracy of 98.1%. This approach differs from the traditional algorithm-based human activity detection used in current wearable technologies, resulting in improved accuracy.
Subject(s)
Micro-Electrical-Mechanical Systems , Wearable Electronic Devices , Humans , Artificial Intelligence , Bayes Theorem , Human Activities
ABSTRACT
PURPOSE: Screening programs use mammography as a diagnostic tool for the early detection of breast cancer. Mammogram enhancement is used to increase the local contrast of the mammogram so that lesions are more visible in the enhanced image. For accurate diagnosis in the early stage of breast cancer, the appearance of masses and microcalcifications on the mammographic image are two important indicators. The objective of this study was to evaluate the feasibility and accuracy of automatically separating breast tissue microcalcifications in images. METHODS: Two techniques were used to enhance images and highlight breast tissue microcalcifications in the desired areas: region of interest (ROI) selection based on a fuzzy system, and Gabor filtering. After the clusters of breast tissue microcalcifications are determined, they are classified using the decision tree classification algorithm. Then, for segmentation, samples suspected of microcalcification are highlighted and masked, and in the last stage, tissue characteristics are extracted. Subsequently, an artificial neural network (ANN) is used to classify the segmented ROI clusters as benign or malignant. The proposed system is trained with the Digital Database for Screening Mammography (DDSM) developed by the University of South Florida, USA; the simulations are performed in MATLAB, and the results are compared with previous work. RESULTS: The results show an accuracy of 93% and an improvement of sensitivity above 95%. CONCLUSION: The results indicate that the proposed approach can be applied to support breast cancer diagnosis.
ABSTRACT
The SARS-CoV-2 pandemic is a current global threat with an enormous number of deaths. As most countries have constraints on resources, particularly for intensive care and oxygen, severity prediction with high accuracy is crucial. This prediction will help the medical community select the patients who need these constrained resources. The literature shows that using clinical data for this prediction is the common trend, and molecular data are rarely utilized. As molecular data carry more disease-related information, in this study three different types of RNA molecules (lncRNA, miRNA, and mRNA) from SARS-CoV-2 patients are used to predict the severity stage and treatment stage of those patients. Experiments with seven different machine learning algorithms and several feature selection techniques show that, for both phenotypes, features selected by feature importance provide the best accuracy in combination with a random forest classifier. Furthermore, in severity stage prediction miRNA and lncRNA give the best performance, and lncRNA data give the best performance in treatment stage prediction. As most studies of molecular data use mRNA data, this is an interesting finding.
Subject(s)
COVID-19 , MicroRNAs , RNA, Long Noncoding , Humans , SARS-CoV-2/genetics , RNA, Long Noncoding/genetics , Algorithms , MicroRNAs/genetics , RNA, Messenger/genetics
ABSTRACT
Rice must be stored to ensure year-round consumption. As storage time increases, the taste quality and commercial value of rice gradually decrease. The accurate determination of the freshness of rice is critical to the rice trade. However, it is difficult to distinguish aged rice from fresh rice, so a quick and simple method is needed to identify the freshness of rice. In this study, near-infrared spectroscopy (NIR) was combined with various algorithms, such as partial least squares discriminant analysis (PLS-DA), support vector machines (SVM), and classification and regression trees (CART), to differentiate the freshness of rice. PLS-DA and SVM demonstrated excellent classification ability in identifying the freshness of rice, with sensitivity and specificity of 1. Using the original spectra, 100% accuracy was achieved on the test set. As a result, PLS-DA and SVM can be used to determine the freshness of rice.
Subject(s)
Oryza , Spectroscopy, Near-Infrared , Spectroscopy, Near-Infrared/methods , Oryza/chemistry , Discriminant Analysis , Algorithms , Least-Squares Analysis , Support Vector Machine
ABSTRACT
In recent years, detecting fraudulent credit card transactions has been a difficult task due to the high dimensionality and class imbalance of the datasets. Selecting a subset of important features from a high-dimensional dataset has proven to be the most prominent approach for solving high-dimensional dataset issues, and the selection of features is critical for improving classification performance in tasks such as fraud transaction identification. To contribute to the field, this paper proposes a novel feature selection (FS) approach based on a metaheuristic algorithm called Rock Hyrax Swarm Optimization Feature Selection (RHSOFS), inspired by the actions of rock hyrax swarms in nature, and implements supervised machine learning techniques to improve credit card fraud transaction identification approaches. This approach is used to select a subset of optimal relevant features from a high-dimensional dataset. In a comparative efficiency analysis, RHSOFS is compared with Differential Evolutionary Feature Selection (DEFS), Genetic Algorithm Feature Selection (GAFS), Particle Swarm Optimization Feature Selection (PSOFS), and Ant Colony Optimization Feature Selection (ACOFS). According to the experimental results, the proposed RHSOFS outperforms these existing approaches. Various statistical tests have been used to validate the statistical significance of the proposed model.
Subject(s)
Algorithms , Machine Learning
ABSTRACT
PURPOSE: This study investigated brain microstructural changes in patients with amnestic mild cognitive impairment (aMCI) by retrospectively analyzing neurite orientation dispersion and density imaging (NODDI) data with machine learning algorithms. METHODS: A total of 26 aMCI patients and 24 healthy controls (HC) underwent NODDI magnetic resonance imaging (MRI) examinations. The NODDI parameters, including neurite density index (NDI), orientation dispersion index (ODI), and volume fraction of isotropic water molecules (Viso), were estimated. Machine learning algorithms such as K-nearest neighbor (KNN), logistic regression (LR), random forest (RF), and support vector machine (SVM) were used to evaluate the diagnostic efficacy of NODDI parameters in predicting aMCI. The differences in the NODDI parameter values between the aMCI and HC groups were analyzed using the independent-samples t-test. False discovery rate (FDR) correction was used for multiple testing. After adjusting for age, sex, and educational years, partial correlation analysis was used to evaluate the relationship between NODDI parameters and the clinical cognitive status of aMCI patients. RESULTS: The NDI, ODI, and Viso values of white matter (WM) and gray matter (GM) structure templates combined with the KNN, LR, RF, and SVM machine learning algorithms discriminated between the aMCI and HC groups. The NDI and ODI values decreased (p value range, <0.001-0.042) and Viso values increased (p value range, <0.001-0.043) in the aMCI group compared to the HCs. The NDI, ODI, and Viso values of the WM and GM structure templates with significant differences were significantly correlated with mini-mental state examination (MMSE) and Montreal cognitive assessment (MoCA) scores. CONCLUSION: NODDI combined with machine learning algorithms is a promising strategy for early diagnosis of aMCI. Moreover, NODDI parameters correlated with the clinical cognitive status of aMCI patients.
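The abstract does not name the FDR procedure used. A common choice is the Benjamini-Hochberg step-up method, and the sketch below assumes that method purely for illustration. The p-values in the example are hypothetical, not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four region-wise group comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

A comparison is then declared significant when its adjusted p-value falls below the chosen FDR level, for example 0.05.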
ABSTRACT
How to recruit, test, and train the users of an intelligent recommendation system, and how to assign archive translation tasks to those users according to intelligent matching principles, remain problems to be solved. With the help of proper names and terms in China's Imperial Maritime Customs archives, this manuscript aims to solve these problems. When the corresponding translation, domain, or attributes of a proper name or term are known, some archive translation tasks become easier to complete, and the adaptive archive intelligent recommendation system can also improve the efficiency and quality of its recommendations of archive translation tasks. These related domains or attributes are different labels of these archives. Put simply, multi-label classification means that the same instance can have multiple labels or be assigned to multiple categories. With multi-label classification, archives can be classified into different categories, such as trade archives, preventive archives, personnel archives, etc. The system users are divided into different professional domains by tests, for instance, users who are strong in economic knowledge and users who have higher language skills. With these labels, the intelligent recommendation system can match archives with system users intelligently, so as to improve the efficiency and quality of intelligent archive translation tasks. In this manuscript, through multi-label classification, the intelligent recommendation system realizes the intelligent allocation of archive translation tasks to the system users. The intelligent allocation is realized through the construction of an intelligent control model, and we verify that the intelligent recommendation system can improve the performance of task allocation over time without the participation of task issuers.
ABSTRACT
Geological characteristics (GC) are among the most essential factors influencing the setting of earth pressure balance (EPB) shield parameters and cutterhead wear. Identification of GC is of crucial significance to shield tunnelling efficiency and safety. The stacking classification algorithm (SCA) is widely applied in engineering for identification and classification. Grid search (GS) is designed to tune hyper-parameters and optimize non-linear problems with K-fold cross-validation (K-CV), which is commonly used to vary the validation set within the training set. The performance of SCA can be improved by GS and K-CV. The types of GC during shield advance can be identified by integrating K-means++ with the silhouette coefficient (Si) and the elbow method (EM). The results of K-means++ and the shield parameters served as a database for SCA. The approach was applied in Guangzhou mixed ground. The results showed that the proposed framework could predict the geological characteristics well. This method article is a companion paper to the original article [1]. The proposed method enables: • Developed approach merges SCA and GS method. • Application of SCA-GS method in geological characteristics classification. • It can increase the reliability of classification results.
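The interplay of grid search and K-fold cross-validation described above can be sketched generically as follows. This is a minimal illustration, not the authors' implementation: the parameter names, the contiguous fold layout, and the toy scoring function are all hypothetical, and a real score_fn would train the stacking classifier on the training indices and score it on the validation indices.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


def grid_search(param_grid, n_samples, k, score_fn):
    """Return (best_params, best_mean_score) over every grid combination."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        # Average the validation score over the k folds.
        scores = [score_fn(params, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_score(params, train_idx, val_idx):
    """Hypothetical stand-in for 'train on train_idx, score on val_idx'."""
    return -(params["depth"] - 3) ** 2 - abs(params["lr"] - 0.1)


grid = {"depth": [2, 3, 4], "lr": [0.01, 0.1, 1.0]}
best_params, best_score = grid_search(grid, n_samples=10, k=5, score_fn=toy_score)
```

Each candidate setting is scored on every fold, so the selected hyper-parameters depend less on one particular train/validation split.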
ABSTRACT
CD19-targeted CAR T cell immunotherapy has exceptional efficacy for the treatment of B-cell malignancies. B-cell acute lymphocytic leukemia and non-Hodgkin's lymphoma are two common B-cell malignancies that have high recurrence rates and are difficult to cure. Although CAR T-cell immunotherapy overcomes the limitations of conventional treatments for such malignancies, failure of treatment and tumor recurrence remain common. In this study, we searched for important methylation signatures to differentiate CAR-transduced and untransduced T cells from patients with acute lymphoblastic leukemia and non-Hodgkin's lymphoma. First, we used three feature ranking methods, namely, Monte Carlo feature selection, light gradient boosting machine, and least absolute shrinkage and selection operator, to rank all methylation features in order of their importance. Then, the incremental feature selection method was adopted to construct efficient classifiers and filter the optimal feature subsets. Some important methylated genes, namely, SERPINB6, ANK1, PDCD5, DAPK2, and DNAJB6, were identified. Furthermore, classification rules for distinguishing the different classes were established, which can precisely describe the role of methylation features in the classification. Overall, we applied advanced machine learning approaches to high-throughput data, investigating the mechanism of CAR T cells to establish a theoretical foundation for modifying CAR T cells.
ABSTRACT
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has a tight relationship with the immune system. Human resistance to COVID-19 infection comprises two stages. The first stage is immune defense, while the second stage is extensive inflammation. The immune defense phase is further divided into innate and adaptive immunity. These two stages involve various immune cells, including CD4+ T cells, CD8+ T cells, monocytes, dendritic cells, B cells, and natural killer cells. Together, these cells make up the complex and unique immune system response to COVID-19, providing characteristics that set it apart from other respiratory infectious diseases. In the present study, we identified cell markers for differentiating COVID-19 from common inflammatory responses, non-COVID-19 severe respiratory diseases, and healthy populations based on single-cell profiling of the gene expression of six immune cell types by using the Boruta and mRMR feature selection methods. Some features, such as IFI44L in B cells, S100A8 in monocytes, and NCR2 in natural killer cells, are involved in the innate immune response of COVID-19. Other features, such as ZFP36L2 in CD4+ T cells, can regulate the inflammatory process of COVID-19. Subsequently, the incremental feature selection (IFS) method was used to determine the best feature subsets and classifiers in the six immune cell types for two classification algorithms. Furthermore, we established quantitative rules to distinguish the disease status. The results of this study can provide theoretical support for a more in-depth investigation of COVID-19 pathogenesis and intervention strategies.
ABSTRACT
Evidence from observational studies has become increasingly important for supporting healthcare policy making via cost-effectiveness analyses. As in comparative effectiveness studies, health economic evaluations that consider subject-level heterogeneity produce individualized treatment rules that are often more cost-effective than one-size-fits-all treatment. Thus, it is of great interest to develop statistical tools for learning such a cost-effective individualized treatment rule under a causal inference framework that allows proper handling of potential confounding and can be applied to both trials and observational studies. In this paper, we use the concept of net-monetary-benefit to assess the trade-off between health benefits and related costs. We estimate the cost-effective individualized treatment rule as a function of patients' characteristics that, when implemented, optimizes the allocation of limited healthcare resources by maximizing health gains while minimizing treatment-related costs. We employ the conditional random forest approach and identify the optimal cost-effective individualized treatment rule using net-monetary-benefit-based classification algorithms, where two partitioned estimators are proposed for the subject-specific weights to effectively incorporate information from censored individuals. We conduct simulation studies to evaluate the performance of our proposals. We apply our top-performing algorithm to the NIH-funded Systolic Blood Pressure Intervention Trial to illustrate the cost-effectiveness gains of assigning customized intensive blood pressure therapy.
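The net-monetary-benefit idea can be illustrated with its usual definition, NMB = willingness-to-pay x health effect - cost, where the treatment with the larger NMB is preferred. The sketch below assumes that standard definition and uses hypothetical effects, costs, and thresholds; it does not reproduce the paper's estimators, which additionally handle confounding and censoring.

```python
def net_monetary_benefit(effect, cost, willingness_to_pay):
    """NMB = willingness_to_pay * effect - cost (standard definition, assumed here)."""
    return willingness_to_pay * effect - cost


def preferred_treatment(options, willingness_to_pay):
    """Return the name of the option with the highest net monetary benefit."""
    return max(
        options,
        key=lambda name: net_monetary_benefit(
            options[name]["effect"], options[name]["cost"], willingness_to_pay
        ),
    )


# Hypothetical expected effects (health units) and costs for one patient.
patient_options = {
    "intensive": {"effect": 2.0, "cost": 30000.0},
    "standard": {"effect": 1.5, "cost": 10000.0},
}
choice_high_wtp = preferred_treatment(patient_options, willingness_to_pay=50000.0)
choice_low_wtp = preferred_treatment(patient_options, willingness_to_pay=20000.0)
```

In this toy case the preferred treatment flips as the willingness-to-pay threshold changes, which is why a cost-effective rule can differ from a rule that maximizes health benefit alone.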
Subject(s)
Algorithms , Research Design , Humans , Cost-Benefit Analysis , Treatment Outcome , Computer Simulation
ABSTRACT
BACKGROUND: COVID-19 displays an increased mortality rate and higher risk of severe symptoms with increasing age, which is thought to be a result of the compromised immunity of elderly patients. However, the underlying mechanisms of aging-associated immunodeficiency against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) remain unclear. Epigenetic modifications show considerable changes with age, causing altered gene regulation and cell functions during the aging process. The DNA methylation patterns among patients of different ages with coronavirus disease 2019 (COVID-19) were compared to explore the effect of aging-associated methylation modifications in SARS-CoV-2 infection. METHODS: Patients with COVID-19 were divided into three groups according to age. Boruta was used on the DNA methylation profiles of the patients to remove irrelevant features and retain essential signature sites to identify substantial aging-associated DNA methylation changes in COVID-19. Next, these features were ranked using the minimum redundancy maximum relevance (mRMR) method, and the feature list generated by mRMR was fed into the incremental feature selection method with decision tree (DT), random forest, k-nearest neighbor, and support vector machine to obtain the key methylation sites, optimal classifier, and decision rules. RESULTS: Several key methylation sites that showed distinct patterns among the patients of different ages with COVID-19 were identified, and these methylation modifications may play crucial roles in regulating immune cell functions. An optimal classifier was built based on selected methylation signatures, which can be useful to predict the aging-associated disease risk of COVID-19.
CONCLUSIONS: Existing works and our predictions suggest that the methylation modifications of genes, such as NHLH2, ZEB2, NWD1, ELOVL2, FGGY, and FHL2, are closely associated with age in patients with COVID-19, and the 39 decision rules extracted with the optimal DT classifier provide quantitative context to the methylation modifications in elderly patients with COVID-19. Our findings contribute to the understanding of the epigenetic regulation of aging-associated COVID-19 symptoms and provide potential methylation targets for intervention strategies in elderly patients.
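The incremental feature selection step mentioned in the methods can be sketched as a loop over growing prefixes of the ranked feature list, keeping the prefix on which the classifier scores best. The feature names, the scores, and the evaluate callback below are hypothetical. In practice evaluate would cross-validate one of the listed classifiers on the chosen features.

```python
def incremental_feature_selection(ranked_features, evaluate, step=1):
    """Score top-k prefixes of a ranked feature list; return (best_prefix, best_score)."""
    best_subset, best_score = [], float("-inf")
    for k in range(step, len(ranked_features) + 1, step):
        subset = ranked_features[:k]  # the k highest-ranked features
        score = evaluate(subset)
        if score > best_score:  # ties keep the smaller subset
            best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical ranked methylation sites and toy scores indexed by subset size.
ranked = ["site_a", "site_b", "site_c", "site_d"]
toy_scores = {1: 0.70, 2: 0.82, 3: 0.81, 4: 0.79}
best_subset, best_score = incremental_feature_selection(
    ranked, evaluate=lambda subset: toy_scores[len(subset)]
)
```

Because ties keep the earlier, smaller prefix, the loop favors compact signatures when larger subsets bring no gain.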
Subject(s)
COVID-19 , SARS-CoV-2 , Aged , COVID-19/genetics , DNA Methylation , Humans , Protein Processing, Post-Translational , SARS-CoV-2/genetics , Support Vector Machine
ABSTRACT
The lung is the most important organ in the human respiratory system, and its normal function is essential for human beings. Under certain pathological conditions, normal lung function can no longer be maintained, and lung transplantation is generally applied to ease patients' breathing and prolong their lives. However, several risk factors exist during and after lung transplantation, including bleeding, infection, and transplant rejection. In particular, transplant rejection is difficult to predict or prevent, leading to the most dangerous complications and severe status in patients undergoing lung transplantation. Given that the most common monitoring and validation methods for lung transplant rejection may take a long time and have low reproducibility, new technologies and methods are required to improve the efficacy and accuracy of rejection monitoring after lung transplantation. Recently, a previous study established the gene expression profiles of patients who underwent lung transplantation. However, it did not provide a tool to predict lung transplantation responses. Here, a further in-depth investigation was conducted on these profiling data. A computational framework incorporating several machine learning algorithms, such as feature selection methods and classification algorithms, was built to establish an effective prediction model that classifies patients into different clinical subgroups, corresponding to different rejection responses after lung transplantation. Furthermore, the framework also screened essential genes with functional enrichments and created quantitative rules for distinguishing patients with different rejection responses to lung transplantation. The outcome of this contribution could provide guidelines for the clinical treatment of each rejection subtype and help reveal the complicated rejection mechanisms of lung transplantation.
Subject(s)
Lung Transplantation , Graft Rejection , Humans , Lung , Reproducibility of Results , Transcriptome
ABSTRACT
Neurodegenerative diseases, including Alzheimer's disease (AD), Parkinson's disease, and many other disease types, cause cognitive dysfunctions such as dementia via the progressive loss of structure or function of the body's neurons. However, the etiology of these diseases remains unknown, and diagnosing less common cognitive disorders such as vascular dementia (VaD) remains a challenge. In this work, we developed a machine-learning-based technique to distinguish between normal control (NC), AD, VaD, dementia with Lewy bodies, and mild cognitive impairment at the microRNA (miRNA) expression level. First, unnecessary miRNA features in the miRNA expression profiles were removed using the Boruta feature selection method, and the retained feature sets were sorted using minimum redundancy maximum relevance and Monte Carlo feature selection to provide two ranked feature lists. The incremental feature selection method was used to construct a series of feature subsets from these feature lists, and the random forest and PART classifiers were trained on the sample data consisting of these feature subsets. On the basis of the performance of these classifiers with different numbers of features, the best feature subsets and classifiers were identified, and the classification rules were retrieved from the optimal PART classifiers. Finally, the link between candidate miRNA features, including hsa-miR-3184-5p, hsa-miR-6088, and hsa-miR-4649, and neurodegenerative diseases was confirmed using recently published research, laying the groundwork for more research on miRNAs in neurodegenerative diseases for the diagnosis of cognitive impairment and the understanding of potential pathogenic mechanisms.
ABSTRACT
There is an increasing demand for automatic classification of standard 12-lead electrocardiogram signals in the medical field. Considering that different channels and temporal segments of a feature map extracted from the 12-lead electrocardiogram record contribute differently to cardiac arrhythmia detection, and to the classification performance, we propose a 12-lead electrocardiogram signal automatic classification model based on model fusion (CBi-DF-XGBoost) to focus on representative features along both the spatial and temporal axes. The algorithm extracts local features through a convolutional neural network and then extracts temporal features through bi-directional long short-term memory. Finally, eXtreme Gradient Boosting (XGBoost) is used to fuse the 12-lead models and domain-specific features to obtain the classification results. The 5-fold cross-validation results show that in classifying nine categories of electrocardiogram signals, the macro-average accuracy of the fusion model is 0.968, the macro-average recall rate is 0.814, the macro-average precision is 0.857, the macro-average F1 score is 0.825, and the micro-average area under the curve is 0.919. Similar experiments with some common network structures and other advanced electrocardiogram classification algorithms show that the proposed model performs favourably against other counterparts in F1 score. We also conducted ablation studies to verify the effect of the complementary information from the 12 leads and the auxiliary information of domain-specific features on the classification performance of the model. We demonstrated the feasibility and effectiveness of the XGBoost-based fusion model to classify 12-lead electrocardiogram records into nine common heart rhythms. These findings may have clinical importance for the early diagnosis of arrhythmia and incite further research. 
In addition, the proposed multichannel feature fusion algorithm can be applied to other similar physiological signal analyses and processing.
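Since the results are reported as macro-averages over nine classes, a short sketch of macro-averaging may help: each metric is computed per class and then averaged with equal weight per class. The three rhythm labels and the predictions below are hypothetical, and the macro F1 here is taken as the mean of per-class F1 scores, which is one common convention and an assumption about how the reported value was computed.

```python
def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical labels for six records and three rhythm classes.
y_true = ["N", "N", "AF", "AF", "PVC", "PVC"]
y_pred = ["N", "N", "AF", "N", "PVC", "AF"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

Under this convention the macro F1 generally differs from the harmonic mean of macro precision and macro recall, because the averaging happens after the per-class F1 is formed.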
ABSTRACT
Atopic dermatitis and psoriasis are members of a family of inflammatory skin disorders. Cellular immune responses in skin tissues contribute to the development of these diseases. However, their underlying immune mechanisms remain to be fully elucidated. We developed a computational pipeline for analyzing the single-cell RNA-sequencing profiles of the Human Cell Atlas skin dataset to investigate the pathological mechanisms of skin diseases. First, we applied the maximum relevance criterion and the Boruta feature selection method to exclude irrelevant gene features from the single-cell gene expression profiles of inflammatory skin disease samples and healthy controls. The retained gene features were ranked by using the Monte Carlo feature selection method on the basis of their importance, and a feature list was compiled. This list was then introduced into the incremental feature selection method that combined the decision tree and random forest algorithms to extract important cell markers and thus build excellent classifiers and decision rules. These cell markers and their expression patterns have been analyzed and validated in recent studies and are potential therapeutic and diagnostic targets for skin diseases because their expression affects the pathogenesis of inflammatory skin diseases.
ABSTRACT
Electronic point scoring systems (PSS) for vests are heavily relied upon in taekwondo. However, no classification and assessment of legal and illegal taekwondo techniques exist. This is also referred to as hit-validation, and the objective of this research is to create an electronic helmet (eHelmet) for hit-validation. Three main studies were performed to achieve this objective: Robustness Testing, Sensor Placement, and Classification of Impacts to the head. The first two studies are preliminary to the main Classification of Impacts study. They are needed because no data sets using an inertial measurement unit (IMU) are currently available for taekwondo. Robustness Testing: showed that an IMU can in fact be used in the inherently harsh environment of taekwondo with a linear response. The calculated response for the IMU is f(x) = mx + b, where m is 0.2947 and b is 1.499 for the accelerometer, and m is 28.33 and b is 84.8 for the gyroscope. Sensor Placement: qualitatively and quantitatively concluded that the ideal location for the sensor and electronics is the back of the head, based on durability, cost, human factors, and signal quality. Classification of Impacts: the IMU classified real-world impacts with 90% accuracy. The two classes were roundhouse kick (legal) and punch (illegal). An eHelmet using an IMU is capable of classifying impacts with high accuracy. The benefits of our system include low cost, a lightweight algorithm for on-device computing (edge computing), and real-time classification. Furthermore, it meets all the safety requirements of current protective headgear.
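The linear responses quoted above can be written as a small helper, together with an ordinary least-squares fit of the kind that could produce such slope and intercept values. The abstract does not state what x and f(x) represent or how the coefficients were estimated, so the fitting routine and the sample points are assumptions for illustration. Only the four coefficients come from the text.

```python
# Coefficients (m, b) as reported in the abstract.
ACCEL_RESPONSE = (0.2947, 1.499)
GYRO_RESPONSE = (28.33, 84.8)


def linear_response(x, coefficients):
    """Evaluate f(x) = m * x + b for a (m, b) pair."""
    m, b = coefficients
    return m * x + b


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical calibration points lying exactly on y = 2x + 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
accel_at_10 = linear_response(10.0, ACCEL_RESPONSE)
```

A fit like this, repeated over the tested input range, is one way a linear response could be checked before using the sensor for impact classification.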