ABSTRACT
Program code is increasingly used as a data source by data scientists. Its uses range from the semantic classification of code to the automatic generation of programs. However, the applicability of machine learning models is limited when the code snippets are not annotated. To address the lack of annotated datasets, we present the Code4ML corpus. It contains code snippets, task summaries, competitions, and dataset descriptions publicly available from Kaggle, the leading platform for hosting data science competitions. The corpus consists of ~2.5 million snippets of ML code collected from ~100 thousand Jupyter notebooks. A representative fraction of the snippets is annotated by human assessors through a user-friendly interface specially designed for that purpose. The Code4ML dataset can help address a number of software engineering and data science challenges through a data-driven approach. For example, it can support semantic code classification, code auto-completion, and code generation for an ML task specified in natural language.
ABSTRACT
Natural killer (NK) cells are capable of lysing their target cells with the help of perforin. The application of these cells for immunotherapy requires an estimate of their potency for validation and batch-to-batch comparison. Cytotoxicity measurements have been carried out at only a few effector-to-target ratios and therefore allow only semiquantitative assessment at best. Using a novel approach that varies the effector-to-target ratio continuously, together with careful analysis of the experimental data after the reactions, we have achieved the precision necessary for constructing a mathematical model of the cytotoxic reaction. Curve-fitting to experimental data indicates that NK cell cytotoxicity follows the law of mass action and fits the model of a single ligand-receptor interaction. The method allows the value of half-maximal lysis to be used to describe the potency of cytotoxic NK cells numerically.
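As an illustration of the curve-fitting idea, the sketch below fits a hyperbolic single-site binding curve, lysis = max_lysis * ratio / (half_max_ratio + ratio), to lysis measured at several effector-to-target ratios. The abstract does not state the equation, so this functional form is an assumption based on the standard single ligand-receptor model. All function names are hypothetical, and the data points are synthetic values generated from the model itself, not measurements from the study.

```python
import math

def hyperbolic_lysis(ratio, max_lysis, half_max_ratio):
    """Assumed single-site curve: max_lysis * ratio / (half_max_ratio + ratio)."""
    return max_lysis * ratio / (half_max_ratio + ratio)

def best_max_lysis(ratios, lysis, half_max_ratio):
    # For a fixed half_max_ratio the model is linear in max_lysis,
    # so its least-squares value has a closed form.
    shape = [r / (half_max_ratio + r) for r in ratios]
    return sum(y * s for y, s in zip(lysis, shape)) / sum(s * s for s in shape)

def squared_error(ratios, lysis, half_max_ratio):
    m = best_max_lysis(ratios, lysis, half_max_ratio)
    return sum((y - hyperbolic_lysis(r, m, half_max_ratio)) ** 2
               for r, y in zip(ratios, lysis))

def fit_half_max(ratios, lysis, lo=1e-3, hi=1e3, iterations=100):
    """Golden-section search over log(half_max_ratio); returns (max_lysis, half_max_ratio).

    Assumes the error has a single minimum between lo and hi.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(lo), math.log(hi)
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if squared_error(ratios, lysis, math.exp(c)) < squared_error(ratios, lysis, math.exp(d)):
            b = d
        else:
            a = c
    k = math.exp((a + b) / 2)
    return best_max_lysis(ratios, lysis, k), k

# Synthetic, noise-free example (illustrative values only).
ratios = [0.25, 0.5, 1, 2, 4, 8, 16, 32]
lysis = [hyperbolic_lysis(r, 80.0, 2.5) for r in ratios]
max_lysis, half_max_ratio = fit_half_max(ratios, lysis)
print(f"max lysis ~ {max_lysis:.2f} %, half-maximal lysis at ratio ~ {half_max_ratio:.2f}")
```

Under this assumed model, the fitted half_max_ratio is the effector-to-target ratio at which lysis reaches half of its maximum. It plays the role of the single potency figure that the abstract describes. With real, noisy data a library nonlinear least-squares routine would likely be preferable to this hand-rolled search.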