JECCR: Microsoft Excel-based algorithm predicts cancer prognosis
Using readily available computer programs such as Microsoft Excel, researchers can quickly generate valuable gene signatures for predicting breast cancer outcomes without specialized software or extensive bioinformatics training, according to an article published online Sept. 1 in the Journal of Experimental & Clinical Cancer Research.
Robin Hallett, a graduate student working under the supervision of John Hassell, PhD, and colleagues from McMaster University in Ontario, Canada, used Microsoft Excel to develop an algorithm that generates gene signatures with predictive potential for the classification of breast cancer.
Human breast cancer patients display significant diversity in terms of survival, recurrence, metastasis and response to treatment, according to the authors. “These patient outcomes can be predicted by the transcriptional programs of their individual breast tumors. Predictive gene signatures allow us to correctly classify human breast tumors into various risk groups as well as to more accurately target therapy to ensure more durable cancer treatment.”
“Until now, constructing such a signature required the use of various clustering and classification algorithms, which in turn require specialized software and bioinformatics training,” wrote Hallett. “We completed all steps of our algorithm using Microsoft Excel 2007. This software is widely, if not universally, accessible to the biological research community, suggesting that implementation of this technique will not be hampered by lack of software or training.”
The method first classifies the expression intensity of each gene, as determined by global gene expression profiling, as low, average or high. The matrix containing the classified data is then used to score each gene based on its individual ability to predict the patient characteristic of interest. Finally, all examined genes are ranked by predictive ability, and the most highly ranked genes are included in the master gene signature, which is then ready for use as a predictor.
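The three steps described above (classify, score, rank) can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' Excel workbook: the article does not give the cut-offs or the scoring formula, so the half-standard-deviation thresholds, the +1/-1 agreement score, and all function and gene names below are assumptions.

```python
from statistics import mean, pstdev


def classify(values, width=0.5):
    """Label each value -1 (low), 0 (average) or 1 (high).

    Assumption: "low" and "high" mean more than `width` standard
    deviations below or above the gene's own mean across patients.
    """
    mu, sigma = mean(values), pstdev(values)
    low_cut, high_cut = mu - width * sigma, mu + width * sigma
    return [-1 if v < low_cut else 1 if v > high_cut else 0 for v in values]


def score_gene(levels, poor_outcome):
    """Assumed scoring rule: +1 when high expression coincides with a poor
    outcome (or low with a good one), -1 for the opposite, 0 for average."""
    return sum(lvl * (1 if poor else -1) for lvl, poor in zip(levels, poor_outcome))


def rank_genes(expression, poor_outcome):
    """Return (gene, score) pairs sorted from most poor-prognosis-associated
    to most good-prognosis-associated."""
    scores = {
        gene: score_gene(classify(values), poor_outcome)
        for gene, values in expression.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: three hypothetical genes measured in six hypothetical patients.
toy_expression = {
    "GENE_A": [2, 3, 2, 8, 9, 8],
    "GENE_B": [8, 9, 8, 2, 3, 2],
    "GENE_C": [5, 5, 5, 5, 5, 5],
}
toy_poor_outcome = [False, False, False, True, True, True]

ranking = rank_genes(toy_expression, toy_poor_outcome)
```

In this toy example the gene whose expression rises with poor outcome ranks first and the gene whose expression falls ranks last; each step maps onto an ordinary spreadsheet formula, which is the accessibility point the authors make.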
The researchers used data from a group of 144 patients to train the algorithm to identify genes whose expression levels correlated with patient survival. The 10 genes most highly predictive of poor prognosis and the 10 genes most highly predictive of good prognosis together formed a 20-gene expression-based predictor, which was found to perform as well as two other models in the validation group.
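To make the idea of a two-sided signature concrete, the following Python sketch takes the top and bottom entries of a ranked gene list and uses them to assign a patient to a risk group. The article does not describe how the 20-gene predictor turns expression levels into a prediction, so the risk score used here (sum of poor-prognosis gene levels minus sum of good-prognosis gene levels, with a threshold of zero) is an assumption, as are all names and the toy data.

```python
def build_signature(ranked, n_per_side=10):
    """Take the n top-ranked (poor-prognosis) and n bottom-ranked
    (good-prognosis) genes from a list of (gene, score) pairs that is
    sorted from highest to lowest score."""
    if len(ranked) < 2 * n_per_side:
        raise ValueError("not enough ranked genes for the requested signature")
    return {
        "poor": [gene for gene, _ in ranked[:n_per_side]],
        "good": [gene for gene, _ in ranked[-n_per_side:]],
    }


def risk_score(patient_levels, signature):
    """Assumed rule: classified levels (-1, 0, 1) of poor-prognosis genes
    add to the score, those of good-prognosis genes subtract from it.
    Genes missing from the patient profile count as average (0)."""
    poor_sum = sum(patient_levels.get(g, 0) for g in signature["poor"])
    good_sum = sum(patient_levels.get(g, 0) for g in signature["good"])
    return poor_sum - good_sum


def predict(patient_levels, signature, threshold=0):
    """Assign 'poor' prognosis when the risk score exceeds the threshold."""
    return "poor" if risk_score(patient_levels, signature) > threshold else "good"


# Toy ranking of five hypothetical genes; with n_per_side=10 and a real
# ranking this would yield a 20-gene signature like the one in the article.
toy_ranked = [("G1", 5), ("G2", 3), ("G3", 0), ("G4", -2), ("G5", -4)]
toy_signature = build_signature(toy_ranked, n_per_side=2)

patient_a = {"G1": 1, "G2": 0, "G3": 1, "G4": -1, "G5": -1}
patient_b = {"G1": -1, "G2": -1, "G4": 1, "G5": 0}

prediction_a = predict(patient_a, toy_signature)
prediction_b = predict(patient_b, toy_signature)
```

Any real use would need the threshold and weighting to be chosen on the training cohort and checked on an independent validation group, as the researchers did.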
The method accurately predicted survival outcomes in a cohort of human breast cancer patients, the authors stated.
“Our algorithm produces prediction models with comparable accuracy to other feature selection techniques while having generally better accessibility and usability for biological research scientists,” concluded Hallett. “We've begun using our algorithm to generate gene expression based prediction models of breast cancer cell sensitivity to commonly used anti-cancer therapies. [We anticipate our methods] will prove useful to the molecular biological research community.”
The article is freely available on the journal’s web site.