Transcriptome analysis has been widely used to develop biomarker panels for
diagnosing cancers. In breast cancer, patient age is known to be associated
with clinical features. As clinical
transcriptome data have accumulated significantly, we classified all
human genes based on age-specific differential expression between normal and
breast cancer cells using public data. We retrieved gene expression levels
for breast cancer and matched normal cells from The Cancer Genome Atlas. We divided
genes into two classes by a paired t-test, without considering age, in the first
classification. We carried out a
secondary classification that divided the genes in each class into eight
groups, based on the patterns of the p-values calculated for each of the three
age groups we defined. Through this two-step classification,
genes were ultimately grouped into 16 classes. By comparing the performance of
prediction models built with different combinations of genes, we showed that
this classification method can be used to establish a more accurate prediction
model for diagnosing breast cancer. We expect that our classification scheme
could be applied to data from other types of cancer.
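
As an illustration of the two-step scheme, the following Python sketch assigns one of 16 labels to a gene from precomputed p-values. It is not the authors' implementation. It assumes that the first step splits genes by whether the age-agnostic paired t-test is significant, that each of the eight secondary groups corresponds to one of the 2^3 significant/non-significant patterns across the three age groups, and that significance means p < 0.05. The function names, the threshold, the label numbering, and the gene names are all hypothetical. The p-values themselves would come from a paired t-test (for example `scipy.stats.ttest_rel`) run on matched tumor and normal samples.

```python
from typing import Sequence

# Assumed significance threshold; the source text does not state one.
ALPHA = 0.05


def first_class(p_all: float, alpha: float = ALPHA) -> int:
    """Step 1 (assumed rule): 0 if the age-agnostic paired test is
    significant, 1 otherwise."""
    return 0 if p_all < alpha else 1


def age_pattern(p_by_age: Sequence[float], alpha: float = ALPHA) -> int:
    """Step 2 (assumed rule): encode the three per-age-group p-values as a
    3-bit pattern in 0..7, one bit per age group (1 = significant)."""
    if len(p_by_age) != 3:
        raise ValueError("expected one p-value per age group (three in total)")
    pattern = 0
    for p in p_by_age:
        pattern = (pattern << 1) | (1 if p < alpha else 0)
    return pattern


def two_step_class(p_all: float, p_by_age: Sequence[float],
                   alpha: float = ALPHA) -> int:
    """Combine both steps into a single label in 1..16 (2 classes x 8 groups)."""
    return first_class(p_all, alpha) * 8 + age_pattern(p_by_age, alpha) + 1


if __name__ == "__main__":
    # Hypothetical genes: (overall p-value, p-values for the three age groups).
    genes = {
        "GENE_A": (0.001, (0.01, 0.20, 0.03)),
        "GENE_B": (0.30, (0.60, 0.04, 0.70)),
    }
    for name, (p_all, p_by_age) in genes.items():
        print(name, two_step_class(p_all, p_by_age))
```

Encoding the age-group pattern as bits keeps the mapping from p-values to labels explicit, so a different threshold or a different definition of the first-step classes can be swapped in without changing the rest of the sketch.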