Multiple Machine Learnings Revealed Similar Predictive Accuracy for Prognosis of PNETs from the Surveillance, Epidemiology, and End Result Database

Song, Yiyan; Gao, Shaowei; Tan, Wulin; Qiu, Zeting; Zhou, Huaqiang; Zhao, Yue

doi:10.7150/jca.26649




J Cancer 2018; 9(21):3971-3978. doi:10.7150/jca.26649

Research Paper


Yiyan Song^1,2,#, Shaowei Gao^2,#, Wulin Tan^2, Zeting Qiu^3, Huaqiang Zhou^3, Yue Zhao^1,✉

1. Department of General Surgery, Guangdong Second Provincial General Hospital, Guangzhou, China
2. Department of Anesthesia, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
3. Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
^# Song and Gao contributed equally to this work and should be regarded as co-first authors.

Citation:

Song Y, Gao S, Tan W, Qiu Z, Zhou H, Zhao Y. Multiple Machine Learnings Revealed Similar Predictive Accuracy for Prognosis of PNETs from the Surveillance, Epidemiology, and End Result Database. J Cancer 2018; 9(21):3971-3978. doi:10.7150/jca.26649. https://www.jcancer.org/v09p3971.htm


Abstract

Background: Prognosis prediction is indispensable in clinical practice, and machine learning has proved helpful for it. We aimed to predict the survival of patients with pancreatic neuroendocrine tumors (PNETs) using machine learning and to compare the resulting model with the American Joint Committee on Cancer (AJCC) staging system.

Methods: Data on PNET cases were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Descriptive statistics, multivariate survival analysis, and preprocessing were performed before machine learning. Four algorithms were used to train the models: logistic regression (LR), support vector machines (SVM), random forest (RF), and deep learning (DL). Missing data in the database were managed by imputation, and a sensitivity analysis was performed to evaluate the imputation. The model with the best predictive accuracy was compared with the AJCC staging system using the SEER cases.
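
The imputation and sensitivity-analysis step can be illustrated with a minimal sketch. The abstract does not state which imputation method was used, so most-frequent-value (mode) imputation is an assumption here, as are the field names `grade` and `died` and the toy records. The sketch compares a summary statistic between the complete cases and the imputed data, which is one common form of sensitivity analysis and not necessarily the one the authors performed.

```python
from collections import Counter

def impute_mode(records, field):
    """Return copies of the records with missing `field` values (None)
    replaced by the most frequent observed value. Assumed method."""
    observed = [r[field] for r in records if r[field] is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [dict(r, **{field: mode}) if r[field] is None else dict(r)
            for r in records]

def death_rate_by_grade(records, grade):
    """Fraction of records with the given grade whose `died` flag is 1."""
    subset = [r for r in records if r["grade"] == grade]
    return sum(r["died"] for r in subset) / len(subset)

# Hypothetical toy records; None marks a missing tumor grade.
records = [
    {"grade": "G1", "died": 0},
    {"grade": "G1", "died": 0},
    {"grade": "G2", "died": 1},
    {"grade": None, "died": 1},
    {"grade": "G1", "died": 0},
    {"grade": None, "died": 0},
]

# Sensitivity check: does the statistic shift after imputation?
complete_cases = [r for r in records if r["grade"] is not None]
imputed = impute_mode(records, "grade")

rate_complete = death_rate_by_grade(complete_cases, "G1")
rate_imputed = death_rate_by_grade(imputed, "G1")
print(f"G1 death rate, complete cases: {rate_complete:.2f}")
print(f"G1 death rate, after imputation: {rate_imputed:.2f}")
```

A large gap between the two rates would suggest that the imputation is distorting the data; a small gap is consistent with the imputation being acceptable.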

Results: The four models had similar predictive accuracy, with no significant difference among them (p = 0.664). The DL model showed slightly better predictive accuracy than the others (81.6% ± 1.9%) and was therefore used for further comparison with the AJCC staging system. For PNET cases in the SEER database, the DL model performed better than AJCC staging (area under the receiver operating characteristic curve: 0.87 vs 0.76). The sensitivity analysis supported the validity of the missing-data imputation.
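
The area under the receiver operating characteristic curve (AUC) used in this comparison can be computed as in the following minimal sketch. It uses the pairwise (Mann-Whitney) formulation: the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are hypothetical toy values, not SEER data, and treating the ordinal stage as a risk score is an assumption made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties contribute 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical toy data: 1 = event (death), 0 = no event.
labels = [1, 1, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]  # predicted risk from a model
stage_scores = [3, 2, 1, 2, 2, 1]              # ordinal stage used as a score

print(f"model AUC: {roc_auc(labels, model_scores):.3f}")
print(f"stage AUC: {roc_auc(labels, stage_scores):.3f}")
```

Because a staging system takes only a few discrete values, many pairs tie, which is why the tie-handling rule matters when a continuous model output is compared with an ordinal stage.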

Conclusions: The models developed with machine learning performed well in predicting the survival of patients with PNETs, and the DL model had better accuracy and specificity than the AJCC staging system on SEER data. The DL model has potential for clinical application, but external validation is needed.

Keywords: machine learning, pancreatic neuroendocrine tumor, prognostic prediction, SEER database


This is an open access article distributed under the terms of the Creative Commons Attribution (CC BY-NC) license (https://creativecommons.org/licenses/by-nc/4.0/). See http://ivyspring.com/terms for full terms and conditions.