Contents lists available at ScienceDirect
journal homepage: www.elsevier.com/locate/ins
A tree ensemble-based two-stage model for advanced-stage colorectal cancer survival prediction
Yuyan Wang a, Dujuan Wang b,∗, Xin Ye a, Yanzhang Wang b, Yunqiang Yin c, Yaochu Jin a,d,∗
a School of Management Science and Engineering, Dalian University of Technology, Dalian 116023, PR China
b Business School, Sichuan University, Chengdu 610064, PR China
c School of Management and Economics, University of Electronic Science and Technology of China, Chengdu 611731, China
d Department of Computer Science, University of Surrey, Guildford, Surrey GU2 7XH, United Kingdom
Semi-random regression tree
Classification techniques have been widely applied to cancer survival prediction to predict whether a patient will survive or die. However, little attention has been paid to patients who are predicted to die. In this work, we treat survival prediction as a two-stage task: the first stage predicts whether the outcome is survival or death, and the second stage predicts the remaining lifespan of patients whose predicted outcome is death. To this end, we propose a two-stage machine learning model to enhance cancer survival prediction. At the first stage, a tree-based imbalanced ensemble classification method is proposed to classify the survivability of advanced-stage cancer patients. At the second stage, a selective ensemble regression method is proposed for survival time prediction, where a priori knowledge is adopted for feature selection and the mean proportion of error interval is proposed for selecting base learners. Extensive computational studies performed on colorectal cancer data from the SEER database demonstrate that the proposed two-stage model achieves more accurate predictions than the one-stage regression model. The results show that the proposed classification approach can effectively handle the imbalanced survivability data, and that the proposed regression method outperforms several state-of-the-art regression models.
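The two-stage flow described above can be illustrated with a minimal sketch. It shows only the routing between the stages: a classifier decides survival or death, and a regressor is consulted only for patients predicted to die. The names `two_stage_predict`, `toy_classifier` and `toy_regressor`, the two features (age, tumor size) and all numeric coefficients are hypothetical stand-ins chosen for illustration; they are not the tree ensemble classifier or the selective ensemble regressor proposed in this work.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Features = Sequence[float]


@dataclass
class TwoStagePrediction:
    survives: bool
    # Predicted remaining lifespan in months; None when survival is predicted.
    months_remaining: Optional[float]


def two_stage_predict(
    x: Features,
    classify: Callable[[Features], bool],
    regress: Callable[[Features], float],
) -> TwoStagePrediction:
    """Stage 1 predicts survival; stage 2 runs only for predicted deaths."""
    if classify(x):
        return TwoStagePrediction(survives=True, months_remaining=None)
    # Clamp at zero so the sketch never reports a negative lifespan.
    return TwoStagePrediction(survives=False, months_remaining=max(0.0, regress(x)))


# Hypothetical stand-in models; x = (age in years, tumor size in mm).
def toy_classifier(x: Features) -> bool:
    age, tumor_mm = x
    return age < 60 and tumor_mm < 30


def toy_regressor(x: Features) -> float:
    age, tumor_mm = x
    return 60.0 - 0.4 * age - 0.3 * tumor_mm


patients = [(45.0, 12.0), (72.0, 48.0)]
results = [two_stage_predict(p, toy_classifier, toy_regressor) for p in patients]
for patient, result in zip(patients, results):
    print(patient, result)
```

Passing the two models as callables keeps the routing logic independent of how each stage is trained, so any classifier or regressor with the same call shape could be substituted.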
Colorectal cancer is a malignant tumor that has become one of the deadliest diseases in the world. According to the global cancer statistics of 2012, the incidence of colorectal cancer ranked third for men and second for women among common malignant tumors, and the mortality rate is as high as 49%. With its high incidence and mortality, colorectal cancer accounts for a large share of medical care expenditures and places a heavy burden on families and communities. It is therefore of great importance to predict the survivability of cancer patients more accurately and to make better clinical decisions on diagnosis and treatment, including the choice of treatment methods and the timing of treatment and subsequent visits, which can make a big difference to treatment costs and therapeutic results.
∗ Corresponding authors.