Background: It is not possible to search public gene expression repositories (PGER) using clinically relevant terms such as receptor status in breast cancer. However, previous research at the University of Limerick has enabled searches of these repositories based on clinical parameters. The aim of this study is to use the data available within PGER to develop a gene signature that differentiates oestrogen receptor positive breast cancers from oestrogen receptor negative breast cancers.
Methods: Gene Expression Omnibus (GEO) and ArrayExpress were searched using the terms “breast” and “homosapien”. Each dataset was reviewed, and those obtaining a 5× rating for minimum information about a microarray experiment (MIAME) compliance within ArrayExpress were retained for further analysis.
Results: This search yielded 29 datasets with a 5× MIAME compliance score within ArrayExpress. Twelve of these datasets contained information on oestrogen receptor status and were further analyzed to generate a consensus profile. This profile contains 14 genes dysregulated between oestrogen receptor positive and oestrogen receptor negative cancers: AGR2, TFF1, TFF3, CA12, CALML5, ELF5, FABP7, SCGB1D2, CPB1, CYP2B7P1, ESR, GABRP, NAT1 and SCGB2A1. The consensus profile was validated in an additional dataset, with a sensitivity and specificity of 81% and 76%, respectively.
Conclusions: This demonstrates that it is possible to use microarray expression data available in PGER to generate consensus profiles, which can potentially provide information that aids diagnosis and prognosis.
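
Illustrative note: the sensitivity and specificity reported in the Results are the standard measures derived from a two-class confusion table. The sketch below shows how such figures can be computed when a signature's predicted receptor status is compared with the recorded status. It is a minimal illustration only, not the study's validation code; the function name and the toy labels are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    actual    -- recorded receptor status per sample (True = receptor positive)
    predicted -- status predicted by a signature (True = receptor positive)
    Both names are hypothetical; they are not taken from the study.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # positives called positive
    fn = sum(1 for a, p in pairs if a and not p)      # positives called negative
    tn = sum(1 for a, p in pairs if not a and not p)  # negatives called negative
    fp = sum(1 for a, p in pairs if not a and p)      # negatives called positive

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")

    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives detected
    return sensitivity, specificity


# Toy labels for illustration only (not data from the study).
toy_actual = [True, True, True, True, False, False, False, False]
toy_predicted = [True, True, True, False, False, False, False, True]
toy_sens, toy_spec = sensitivity_specificity(toy_actual, toy_predicted)
```

In this toy case three of the four receptor positive samples and three of the four receptor negative samples are classified correctly, so both measures equal 0.75.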