ABSTRACT
High-throughput biological assays such as microarrays and mass spectrometry (MS) have emerged as potential clinical tools for disease detection. With these technologies, multiple potential biomarkers can be evaluated rapidly and cheaply for a large number of patients. Typical research and evaluation studies in these fields have focused primarily on data generated from samples in a single data-generation session. In the clinical setting, however, new patients screened by the technology will arrive at different times, and data will unavoidably come from multiple data-generation sessions. Understanding and assessing multi-session effects on the data generated by the technology is therefore critical for its application to clinical practice. This paper proposes a methodology for measuring and testing the reproducibility of various aspects of high-throughput data across multiple data-generation sessions. We test and demonstrate the framework on mass spectrometry data obtained from four different data-generation sessions for the same set of samples.