Introduction
Background
Ovarian cancer is one of the most common cancers worldwide [], and this gynecological cancer is characterized by poor prognosis and high mortality []. It is estimated that epithelial ovarian cancer (EOC) represents 90% of the ovarian cancer cases [], with serous carcinoma being the most common pathological type []. Because of the absence of cancer-specific symptoms and effective screening techniques, EOC is frequently diagnosed at a late stage [,]. Despite undergoing relevant treatments, patients with ovarian cancer still face high rates of recurrence and mortality, with a 5-year survival rate of <30% []. According to GLOBOCAN 2020, the number of new ovarian cancer cases in low and high Human Development Index countries will increase by approximately 96% and 19%, respectively, by 2040 [].
Currently, the National Comprehensive Cancer Network Guidelines (2023 Edition) recommend paclitaxel plus carboplatin, administered every 3 weeks, as the first-line treatment for stage 2 to 4 EOC []. Although platinum-based chemotherapy is effective in most patients with ovarian cancer, some patients develop resistance []. In addition, a patient's response to platinum treatment is unknown until chemotherapy is completed. The platinum-free interval is a reliable indicator for predicting treatment efficacy and patient prognosis because it reflects both whether patients with ovarian cancer respond to platinum drugs and when their disease recurs [,]. The Gynecologic Cancer Group divides responses to platinum chemotherapy into 4 categories based on the duration of the platinum-free interval (platinum refractory: <1 mo, platinum-resistant: 1-6 mo, partial platinum response: 6-12 mo, and platinum response: >12 mo) []. The chemoresistance of ovarian cancer may be related to genome expression [,], tumor heterogeneity, intestinal microbiota, DNA repair [], DNA methylation [,], and mitochondrial function [,] related to immunoediting. Approximately 70% of patients relapse within 2 years []. Therefore, predicting the response to platinum-based chemotherapy in patients with ovarian cancer is critical. Despite the emergence of multiple approaches, including mutational signatures, transcriptomic signatures, tumor mutation burden, and functional biomarkers, there is no conclusive evidence to guide the precise treatment of patients with ovarian cancer [].
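The 4 response categories described above can be expressed as a small lookup. The following is a minimal sketch, not part of the original study: the function name is hypothetical, and because the text does not say how intervals of exactly 6 or 12 months are assigned, this sketch assumes that both fall into the partial platinum response category.

```python
def classify_platinum_response(pfi_months: float) -> str:
    """Map a platinum-free interval (in months) to 1 of the 4 response categories.

    Assumption (not stated in the text): intervals of exactly 6 and exactly
    12 months are both assigned to "partial platinum response".
    """
    if pfi_months < 0:
        raise ValueError("platinum-free interval cannot be negative")
    if pfi_months < 1:
        return "platinum refractory"
    if pfi_months < 6:
        return "platinum resistant"
    if pfi_months <= 12:
        return "partial platinum response"
    return "platinum response"


# Hypothetical intervals, one per category
for months in (0.5, 3, 9, 18):
    print(months, "->", classify_platinum_response(months))
```

A different boundary convention would only require changing the comparison operators.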
Objectives
In recent years, with the increasing availability of medical data and the continuous improvement in computer analysis capabilities, machine learning has been increasingly used in the medical field [,]. Machine learning uses algorithms and data to enable computers to learn and improve automatically. It excels in handling large amounts of complex and multidimensional information, thereby improving the accuracy and efficiency of decision-making [-]. In various domains of oncology, machine learning has been used to identify cancer imaging features [], predict the risk of cancer recurrence [], screen cancer drug targets [], and optimize cancer treatment options []. Some researchers have explored machine learning–based methods to construct prediction models for platinum response in ovarian cancer. However, these studies use a diverse range of methods and variables, and the predictive performance of these methods remains controversial. Currently, a comprehensive, evidence-based summary of the predictive performance of machine learning is lacking. Therefore, we conducted this systematic review and meta-analysis to evaluate machine learning models for early risk stratification of the response to platinum-based chemotherapy in patients with ovarian cancer. Our aim was to enhance chemotherapy management in patients with ovarian cancer.
Methods
This study was carried out following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 guidelines (Table S1 in ) and registered on PROSPERO (CRD42022340928).
Data Sources and Searches
Relevant studies published before March 15, 2022, were thoroughly searched in the PubMed, Web of Science, Embase, and Cochrane databases. Search terms included subject headings (Medical Subject Headings in PubMed and Emtree in Embase) and free words, such as “Ovarian Neoplasms,” “machine learning,” “prediction model,” and “Platinum-Based Chemotherapy.” The specific search strategy is presented in Table S1 in . To prevent the risk of missing new literature, we conducted a supplementary search of each database until April 26, 2023.
Inclusion and Exclusion Criteria
The inclusion criteria were as follows:
- Patients diagnosed with ovarian cancer
- The research types are case-control, cohort, nested case-control, and case-cohort studies
- A completely constructed predictive model for platinum chemotherapy (platinum-sensitive or platinum-resistant) response in patients with ovarian cancer
- Studies without external validation
- Different machine learning studies published on the same data set
- The literature written in English
Meanwhile, the following studies were excluded:
- The research type was meta-analysis, review, guideline, expert opinion, etc
- Only a risk factor analysis was carried out, but no complete machine learning prediction model was developed
- None of the following outcome measures were reported: receiver operating characteristic curve, concordance index (C-index), sensitivity, specificity, accuracy, recovery rate, precision rate, confusion matrix, diagnostic fourfold (2×2) table, F1-score, and calibration curve (the original study had to report at least one of these indicators)
- Studies with small sample sizes (<50 cases)
- Research on the accuracy of single-factor prediction
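Several of the outcome measures named in the exclusion criteria (sensitivity, specificity, accuracy, precision, and F1-score) can all be derived from a single diagnostic fourfold table. The sketch below illustrates these standard definitions; the class name and the example counts are hypothetical and are not taken from any included study, and no guard against empty rows or columns is included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourfoldTable:
    """Counts from a 2x2 diagnostic table (hypothetical helper)."""

    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def sensitivity(self) -> float:
        # Proportion of actual positives that were predicted positive
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of actual negatives that were predicted negative
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        # Proportion of predicted positives that were actually positive
        return self.tp / (self.tp + self.fp)

    def accuracy(self) -> float:
        # Proportion of all predictions that were correct
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def f1_score(self) -> float:
        # Harmonic mean of precision and sensitivity
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# Hypothetical counts for illustration only
example = FourfoldTable(tp=40, fp=5, fn=10, tn=45)
print(example.sensitivity(), example.specificity(), example.accuracy())
```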
Literature Screening and Data Extraction
The retrieved studies were imported into EndNote (Clarivate Plc) to remove duplicate publications automatically and manually. Subsequently, we reviewed the titles or abstracts of the remaining studies to exclude original studies that did not align with the topic. We proceeded to read the full texts of the remaining studies to screen the original studies that ultimately met the criteria.
The following information was collected from each eligible study: first author, year of publication, location, research duration, population characteristics, background, number of hospitals, study design (prospective or retrospective), number of patients, age, histological classification of enrolled patients, presence of tumor deposit after treatment, treatment methods, prediction objects, chemotherapy methods, number of positive samples, number of training set samples, total number of samples, follow-up time, variable selection method, characteristics of the machine learning approach (specific algorithm or type), validation method (cross-validation, retention method, external validation, or none), number of model variables, included model variables, efficacy evaluation indicators, sample size of the validation set, and prediction results.
The literature screening and data extraction were independently conducted by 2 researchers (QW and ZC). Following completion, crosschecks were performed. In the event of any disputes, a third researcher (XL) was consulted to assist in resolution.
Quality Assessment
The Prediction Model Risk of Bias Assessment Tool (PROBAST), a tool for assessing the risk of bias and applicability of prediction model studies, was used to assess the predictive models reported in eligible studies [,]. This tool consists of 4 domains (participants, predictors, outcomes, and statistical analysis) and yields judgments of the overall risk of bias and applicability. The 4 domains have 2, 3, 6, and 9 signaling questions, respectively, each with 3 possible answers: yes or probably yes, no or probably no, and no information. A domain is deemed high risk if at least 1 question is answered no or probably no, whereas a domain for which all questions are answered yes or probably yes is considered low risk. When all domains are classified as low risk, the overall risk of bias is graded as low; when at least one domain is deemed high risk, the overall risk of bias is regarded as high. The assessment of the risk of bias was independently conducted by 2 researchers (YW and CF). Upon completion, a crosscheck was performed. In case of any dispute, a third researcher (YP) was consulted to assist in the decision-making process.
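The domain-level and overall judgment rules described above can be written as two short functions. This is an illustrative sketch with hypothetical names and hypothetical answers. The "unclear" outcome (no negative answers, but at least one "no information") is an assumption based on common PROBAST usage, because the paragraph above defines only the low and high cases.

```python
from typing import Dict, List

YES, NO, NO_INFO = "Y/PY", "N/PN", "NI"

# Number of signaling questions per domain, as stated in the text
QUESTIONS_PER_DOMAIN = {
    "participants": 2,
    "predictors": 3,
    "outcomes": 6,
    "statistical analysis": 9,
}


def domain_risk(answers: List[str]) -> str:
    """High if any answer is no/probably no; low if all are yes/probably yes."""
    if any(a == NO for a in answers):
        return "high"
    if all(a == YES for a in answers):
        return "low"
    return "unclear"  # assumption: remaining case, with some "no information"


def overall_risk(domains: Dict[str, List[str]]) -> str:
    """Overall risk is high if any domain is high and low only if all are low."""
    for name, answers in domains.items():
        if len(answers) != QUESTIONS_PER_DOMAIN[name]:
            raise ValueError(f"unexpected number of answers for {name}")
    risks = [domain_risk(answers) for answers in domains.values()]
    if "high" in risks:
        return "high"
    if all(r == "low" for r in risks):
        return "low"
    return "unclear"


# Hypothetical model: a single negative answer in statistical analysis
hypothetical_model = {
    "participants": [YES] * 2,
    "predictors": [YES] * 3,
    "outcomes": [YES] * 6,
    "statistical analysis": [YES] * 8 + [NO],
}
print(overall_risk(hypothetical_model))
```

Under these rules, one negative answer in any domain is enough to rate the whole model as high risk.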
Data Analysis
If the C-index lacked a 95% CI and SE, we estimated its SE following the approach described by Debray et al []. Because machine learning encompasses a wide range of mathematical models and predictive factors, heterogeneity among studies was expected to be high. Hence, a random effects model was used for the meta-analysis. In addition, we used a bivariate mixed effects model, which is also a random effects model, to perform the meta-analysis of sensitivity and specificity. Heterogeneity was quantified using the heterogeneity index (I²). P<.05 indicated a statistically significant difference. Moreover, subgroup analyses were conducted to increase the robustness of the results and explore sources of heterogeneity between studies, according to the type of prediction model and possible influencing factors, for instance, whether the cancer was high-grade serous ovarian cancer and whether there was a tumor deposit (residual tumor).
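The pooling steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the random effects estimator shown is the DerSimonian-Laird method (the text does not name the estimator used), the missing-SE approximation shown is the Hanley-McNeil formula (which may differ from the exact approach of Debray et al), and pooling is done on the raw C-index scale for simplicity.

```python
import math
from typing import Dict, List


def se_from_ci(lower: float, upper: float, z: float = 1.959964) -> float:
    """Recover an SE from a reported 95% CI, assuming a symmetric normal CI."""
    return (upper - lower) / (2 * z)


def hanley_mcneil_se(auc: float, n_pos: int, n_neg: int) -> float:
    """Approximate SE of a C-index from the numbers of events and nonevents."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)


def random_effects_pool(estimates: List[float], ses: List[float]) -> Dict[str, float]:
    """DerSimonian-Laird random effects pooling with Cochran Q and I-squared."""
    w = [1 / se ** 2 for se in ses]  # inverse-variance (fixed effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (se ** 2 + tau2) for se in ses]  # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci_lower": pooled - 1.959964 * se_pooled,
        "ci_upper": pooled + 1.959964 * se_pooled,
        "tau2": tau2,
        "i2_percent": i2,
    }


# Hypothetical C-indices and SEs for 4 models (not data from the included studies)
result = random_effects_pool([0.72, 0.81, 0.90, 0.78], [0.03, 0.02, 0.025, 0.04])
print(result)
```

A logit transformation of the C-index before pooling is a common alternative that keeps the CI within the 0 to 1 range.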
Ethical Considerations
All analyses were based on previously published studies; therefore, ethics approval and patient consent were not required.
Results
Search Strategy
A total of 1749 articles were obtained from the PubMed, Web of Science, Embase, and Cochrane databases. After removing 752 duplicates, we screened the titles and abstracts and identified 261 potentially eligible articles. On the basis of a full-text review, 242 studies were excluded, with 234 (96.7%) studies deleted for inappropriate outcomes, 6 (2.5%) studies deleted for inadequate data, and 2 (0.8%) studies deleted for no access to the full text. Finally, this study included 19 articles. shows the study search strategy.
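The counts reported in this paragraph can be checked with simple arithmetic. The sketch below uses only the numbers stated above; the variable names are hypothetical, and the number of records left after duplicate removal is derived rather than reported in the text.

```python
records_identified = 1749
duplicates_removed = 752
records_screened = records_identified - duplicates_removed  # derived, not reported

full_text_assessed = 261
full_text_excluded = {
    "inappropriate outcomes": 234,
    "inadequate data": 6,
    "no access to the full text": 2,
}
total_excluded = sum(full_text_excluded.values())
studies_included = full_text_assessed - total_excluded

# Share of each exclusion reason among the full-text exclusions
exclusion_percentages = {
    reason: round(100 * n / total_excluded, 1)
    for reason, n in full_text_excluded.items()
}

print(records_screened, total_excluded, studies_included)
print(exclusion_percentages)
```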
Characteristics of Included Studies
The basic characteristics of the 19 eligible articles [-] are presented in . Of the 19 studies, only one consisted of patients who had recurrent ovarian cancer, whereas the remaining 18 studies involved patients who had primary ovarian cancer. There were 3 multicenter studies, 5 single-center studies, and 11 database-based studies. In total, 15 studies were externally validated. The 19 eligible studies involved 7137 patients, and the number of patients included ranged from 58 to 1002. These eligible studies contained 39 predictive models, of which 22 reported sensitivity and specificity. The most widely used modeling methods in the training cohort were logistic regression (LR; 16/39, 41%), Extreme Gradient Boosting (XGBoost; 4/39, 10%), and support vector machines (SVMs; 4/39, 10%), whereas the common modeling method in the validation cohort was SVM (4/12, 33%).
Study, year | Country | Sample source | Chemotherapy | Positive samples, n/N (%) | Training set samples, n/N (%) | Overall sample size, n | Variable selection method | Type of model | External validation | External validation sample size |
Shannon et al [], 2021 | Singapore | GDSCa, TCGAb, and GEOc | Carboplatin | 39/50 (78) | 50/60 (83) | 60 | —d | SVMe, LRf, KNNg, DTh, AdaBoosti, GBMj, and XGBoostk | 1 | 10 |
Hwangbo et al [], 2021 | Korea | The Seoul National University Hospital, Asan Medical Center, and Severance Hospital | Platinum-based combination chemotherapy | 779/1002 (77.7) | 1002/1002 (100) | 1002 | Univariate and multivariate analysis | LR, RFl, SVM, and DNNm | 0 | — |
Zhao et al [], 2019 | China | GEO and TCGA | Platinum-based combination chemotherapy | 42/129 (32) | 129/707 (18.2) | 707 | Univariate and multivariate analysis | LR, COXn, SVM, and ANNo | 1 | 454 |
Paik et al [], 2017 | Korea | Samsung Medical Center | Platinum-based combination chemotherapy | 616/757 (81.4) | 757/757 (100) | 757 | Univariate and multivariate analysis | LR | 0 | — |
Han et al [], 2012 | China | TCGA and GEO | Platinum or paclitaxel-based treatment | 177/200 (88.5) | 200/322 (62.1) | 322 | Principal components method | SPCp | 1 | 122 |
Lan et al [], 2019 | China | Sun Yat-Sen University Cancer Center | Platinum or paclitaxel-based treatment | 22/71 (31) | 71/91 (78) | 91 | Univariate and multivariate analysis | LR | 0 | — |
Zheng et al [], 2021 | China | Beijing Cancer Hospital, Peking Union Medical College and TCGA | Taxol plus platinum-based chemotherapy | 44/60 (73) | 60/448 (13) | 448 | Univariate and multivariate analysis | COX | 1 | 388 |
Yi et al [], 2021 | China | Hunan Cancer Hospital | Platinum-based combination chemotherapy | 26/71 (36) | 71/102 (69) | 102 | LASSOq | RF and SVM | 1 | 31 |
Yu et al [], 2020 | America | TCGA and CPTACr | Platinum-based combination chemotherapy | — | 587/587 (100) | — | — | AlexNets, GoogLeNett, VGGNetu, SVM, modern deep convolutional neural networks, and multilayer neural networks | 1 | — |
Yu et al [], 2016 | America | TCGA and CPTAC | Platinum-based combination chemotherapy | 35/130 (26.9) | 130/130 (100) | 130 | LASSO | RF, SVM, NBv, and COX | 1 | — |
Sun et al [], 2016 | China | Tongji Hospital and Hubei Cancer Hospital | Platinum or taxane-based chemotherapy | 43/100 (43) | 100/251 (39.8) | 251 | Univariate analysis | SVM | 2 | 151 |
Chen et al [], 2022 | China | TCGA or GEO | Platinum-based combination chemotherapy | 161/230 (70) | 230/305 (75.4) | 305 | Univariate analysis | RF and COX | 1 | 75 |
Li et al [], 2022 | China | TCGA or GEO | Platinum-based combination chemotherapy | 287/489 (58.7) | 489/797 (61.4) | 797 | LASSO | LR | 1 | 308 |
Zhao et al [], 2021 | China | TCGA or GEO | Platinum-based combination chemotherapy | — | 146/483 (30.2) | 483 | — | LR | 1 | 337 |
Buttarelli et al [], 2022 | Italy | TCGA or GEO | Platinum-based combination chemotherapy | 7/14 (50) | 14/58 (24) | 58 | — | RF | 1 | 44 |
Gonzalez Bosquet et al [], 2016 | America | NCIw and NHGRIx | Platinum-based combination chemotherapy | 292/450 (64.9) | 450/450 (100) | — | Multivariate analysis | RF, Elastic Nety, PAMz, Diagonal Discriminant Analysis, partial least squares–LR, penalized LR, partial least squares, and partial least squares–RF | 1 | — |
Gong et al [], 2021 | China | Shengjing Hospital of China Medical University | Platinum or paclitaxel-based treatment | 77/174 (44) | 174/174 (100) | — | — | NB, generalized linear model, LR, Fast Large Margin, deep learning, DT, RF, Gradient Boosting Tree, and SVM | 1 | — |
Sun and Yang [], 2020 | China | TCGA | Platinum-based combination chemotherapy | 95/320 (29) | 320/320 (100) | — | Univariate and multivariate analysis | LR | — | — |
Lei et al [], 2022 | China | The Sun Yat-sen Memorial Hospital | Platinum-based combination chemotherapy | 50/62 (80) | 62/93 (66) | 93 | — | Convolutional neural network, principal component analysis, and SVM | 1 | 31 |
aGDSC: Genomics of Drug Sensitivity in Cancer.
bTCGA: The Cancer Genome Atlas.
cGEO: Gene Expression Omnibus.
dMissing data or not applicable.
eSVM: support vector machine.
fLR: logistic regression.
gKNN: k-nearest neighbor.
hDT: decision tree.
iAdaBoost: Adaptive Boosting.
jGBM: Gradient Boosting Machine.
kXGBoost: Extreme Gradient Boosting.
lRF: random forest.
mDNN: deep neural network.
nCOX: Cox Proportional Hazards Regression and Log-Rank Tests.
oANN: artificial neural network.
pSPC: supervised principal component.
qLASSO: Least Absolute Shrinkage and Selection Operator.
rCPTAC: Clinical Proteomic Tumor Analysis Consortium.
sAlexNet: a convolutional neural network architecture named after Alex Krizhevsky.
tGoogLeNet: Google’s Network.
uVGGNet: Visual Geometry Group Network.
vNB: naive Bayes.
wNCI: National Cancer Institute.
xNHGRI: National Human Genome Research Institute.
yElastic Net: Elastic Net Regularization.
zPAM: Prediction Analysis of Microarrays.
Quality Assessment of Included Studies Using PROBAST
PROBAST was used to evaluate the risk of bias in eligible articles that developed or externally validated predictive models. summarizes the risk of bias in the 39 predictive models. Overall, 2 models had a low risk of bias in research participants, 2 models had a low risk of bias in predictors, 4 models had a low risk of bias in outcomes, and none had a low risk of bias in statistical analysis (). Some models were at a high risk of bias, suggesting that their real predictive performance may be worse than that previously reported. Therefore, we are reasonably concerned that these models may be unreliable when used by others.
Model Performance Evaluation
Discrimination and calibration are the most commonly used indicators for assessing the performance of a prediction model. Discrimination is usually evaluated by the area under the receiver operating characteristic curve, namely the C-index, which typically ranges from 0.5 (no better than chance) to 1 (perfect discrimination). A higher C-index indicates better discrimination in the prediction model. A random effects model was used for the meta-analysis of the C-index in 39 predictive models (XGBoost, LR, Least Absolute Shrinkage and Selection Operator [LASSO], SVM, random forest, convolutional neural networks, artificial neural networks, Prediction Analysis of Microarrays, Diagonal Discriminant Analysis, Penalized Logistic Regression, partial least squares, and supervised principal components method). The C-index was reported for 39 predictive models in the training cohort, with a pooled value of 0.806 (95% CI 0.767-0.846), and for 12 predictive models in the validation cohort, with a pooled value of 0.831 (95% CI 0.768-0.895). We conducted subgroup analyses according to the machine learning model type, histological type of ovarian cancer, and whether there was residual tumor after surgery. In the subgroup analysis of model types, the C-indices in the training set were as follows: XGBoost: 0.861 (95% CI 0.808-0.914), LR: 0.816 (95% CI 0.775-0.858), SVM: 0.942 (95% CI 0.897-0.988), and ANN: 0.705 (95% CI 0.615-0.796); the C-indices in the test set were as follows: LR: 0.821 (95% CI 0.767-0.876), LASSO: 0.808 (95% CI 0.723-0.893), SVM: 0.879 (95% CI 0.808-0.949), and random forest: 0.909 (95% CI 0.868-0.950). In the subgroup analysis of pathological types, the C-index in the training cohort was 0.751 (95% CI 0.682-0.820) for serous adenocarcinoma, 0.837 (95% CI 0.780-0.894) for high-grade serous ovarian cancers, and 0.800 (95% CI 0.749-0.852) for unclear types; the C-index in the test set was 0.786 (95% CI 0.679-0.893) for high-grade serous ovarian cancers and 0.916 (95% CI 0.875-0.958) for unclear types.
Meanwhile, in the subgroup analysis of residual tumor, the C-index for residual tumor in the training cohort was 0.767 (95% CI 0.732-0.803) and the C-index for nonresidual tumor was 0.811 (95% CI 0.770-0.852). In the test set, the C-index for residual tumor was 0.719 (95% CI 0.638-0.801) and the C-index for nonresidual tumor was 0.889 (95% CI 0.835-0.943). The forest plot for the subgroup analysis is shown in . presents the meta-analysis results of the C-index. High heterogeneity was identified among these studies (I²=97.3%; P≤.001), probably because of the varied machine learning methods and variables used in these studies. Furthermore, a meta-analysis of the sensitivity and specificity of the 22 predictive models was performed. The pooled sensitivity was 0.890 (95% CI 0.800-0.950) and the pooled specificity was 0.790 (95% CI 0.720-0.840) in the training set () [-,,,,,,,]. In the test set, the pooled sensitivity was 0.920 (95% CI 0.810-0.970) and the pooled specificity was 0.910 (95% CI 0.760-0.970; ) [,-,,].
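For a binary outcome such as platinum response, the C-index discussed above is the proportion of responder and nonresponder pairs in which the responder receives the higher predicted score, with ties counted as one half. The following sketch computes it by direct pairwise comparison; the function name, scores, and labels are hypothetical and serve only to illustrate the definition.

```python
from typing import Sequence


def c_index(scores: Sequence[float], labels: Sequence[int]) -> float:
    """C-index (area under the ROC curve) for a binary outcome.

    labels: 1 for the event of interest and 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0  # correctly ordered pair
            elif p == n:
                concordant += 0.5  # tied pair counts as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical predicted probabilities and observed outcomes
print(c_index([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))  # 3 of 4 pairs ordered correctly
```

This pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of the sizes reported here.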
Subgroup | Training set: number (n=39), n (%) | Training set: C-index (95% CI) | Test set: number (n=12), n (%) | Test set: C-index (95% CI) |
Model | | | | |
XGBoosta | 4 (10) | 0.861 (0.808-0.914) | —b | — |
LRc | 16 (41) | 0.816 (0.775-0.858) | 2 (17) | 0.821 (0.767-0.876) |
LASSOd | 2 (5) | 0.734 (0.476-0.993) | 1 (8) | 0.808 (0.723-0.893) |
SVMe | 4 (10) | 0.942 (0.897-0.988) | 4 (33) | 0.879 (0.808-0.949) |
RFf | 2 (5) | 0.740 (0.721-0.759) | 1 (8) | 0.909 (0.868-0.950) |
CNNg | 2 (5) | 0.935 (0.849-1.000) | 2 (17) | 0.914 (0.752-1.000) |
ANNh | 3 (8) | 0.705 (0.615-0.796) | — | — |
PAMi | 1 (3) | 0.640 (0.580-0.700) | — | — |
DDAj | 1 (3) | 0.740 (0.705-0.775) | — | — |
PLRk | 1 (3) | 0.790 (0.710-0.870) | — | — |
PLSl | 1 (3) | 0.710 (0.655-0.765) | — | — |
SPCm | 2 (5) | 0.802 (0.752-0.852) | 2 (17) | 0.659 (0.573-0.745) |
Histological type | | | | |
Serous | 4 (10) | 0.751 (0.682-0.820) | — | — |
HGSOCn | 10 (26) | 0.837 (0.780-0.894) | 8 (67) | 0.786 (0.679-0.893) |
Unclear | 25 (64) | 0.800 (0.749-0.852) | 4 (33) | 0.916 (0.875-0.958) |
Residual tumor | | | | |
Yes | 4 (10) | 0.767 (0.732-0.803) | 4 (33) | 0.719 (0.638-0.801) |
No | 35 (90) | 0.811 (0.770-0.852) | 8 (67) | 0.889 (0.835-0.943) |
Overall | 39 (100) | 0.806 (0.767-0.846) | 12 (100) | 0.831 (0.768-0.895) |
aXGBoost: Extreme Gradient Boosting.
bMissing data.
cLR: logistic regression.
dLASSO: Least Absolute Shrinkage and Selection Operator.
eSVM: support vector machine.
fRF: random forest.
gCNN: convolutional neural network.
hANN: artificial neural network.
iPAM: Prediction Analysis of Microarrays.
jDDA: Diagonal Discriminant Analysis.
kPLR: penalized logistic regression.
lPLS: partial least squares.
mSPC: supervised principal component.
nHGSOC: high-grade serous ovarian cancer.
Discussion
Principal Findings
This study conducted a meta-analysis of machine learning models for predicting responses to platinum chemotherapy in patients with ovarian cancer and examined the performance, reliability, and influencing factors of these models. To our knowledge, this is the first systematic review and meta-analysis on the application of machine learning in predicting responses to platinum-based chemotherapy in patients with ovarian cancer. The search initially yielded 1749 studies, and after applying inclusion criteria, 19 studies (accounting for 1.09% of the total) were ultimately included. This research encompasses 12 machine learning models, such as XGBoost, LR, LASSO, and SVM, built based on various hospital or genomics data sources. The analysis results indicated that these models performed effectively in distinguishing patients’ responses to platinum chemotherapy, achieving C-indices of 0.806 and 0.831 in the training and validation sets, respectively. The models demonstrated high pooled sensitivity and specificity, supporting their accuracy and reliability in predicting platinum drug response in ovarian cancer. Subgroup analysis revealed the influence of model type, pathology type, and residual tumor on the prediction performance. SVM performed well on both the training and validation sets, consistent with reports that it outperforms other machine learning methods in terms of accuracy and relative error rate measures [] and can identify subtle patterns in complex data sets []. LR was the most commonly used modeling method because it can handle binary outcomes while accommodating continuous or categorical predictor variables. This approach considers the impact of multiple factors on the results, effectively controls potential confounding factors, and reduces bias []. As a result, LR is widely used in machine learning modeling within various fields.
The analysis of residual tumor revealed that the models performed differently in patients with and without residual tumor. Discrimination was lower for patients with residual tumor than for those without (C-index 0.767 vs 0.811 in the training set and 0.719 vs 0.889 in the test set), suggesting that residual tumor may be a crucial factor influencing the response of patients with ovarian cancer to platinum therapy.
Most published meta-analyses on the application of machine learning in ovarian cancer focus on the diagnosis and prediction of ovarian cancer; however, there are some differences in specific research methods, evaluation tools, and presentation of results. Huang et al [] reviewed the application of computed tomography and magnetic resonance imaging radiomics in ovarian cancer, achieving promising results in differential diagnosis and prognosis prediction. Other studies [,] have summarized artificial intelligence methods for gynecological malignant tumors, emphasizing that variable selection, machine learning methods, and end point selection can all influence model performance. Xu et al [] systematically reviewed studies that applied artificial intelligence to diagnose ovarian cancer based on medical images and highlighted the good performance of artificial intelligence algorithms in ovarian cancer diagnosis. Koch et al [] evaluated the accuracy of computer-aided diagnosis, encompassing computer-aided diagnosis for ultrasound, computed tomography, and magnetic resonance imaging, to predict the likelihood of malignancy in ovarian tumors. Given that it is challenging to predict the response of patients with ovarian cancer to platinum therapy before the completion of chemotherapy, accurate prediction of this response is crucial for devising effective treatment plans. This review focuses on the performance of machine learning in predicting responses to platinum-based chemotherapy in patients with ovarian cancer. This not only provides valuable information for clinical prediction but also addresses a long-standing challenge in the development of noninvasive methods for predicting chemotherapy response in patients with ovarian cancer. Feature selection emerges as a critical aspect influencing model performance in this context. 
Previous studies [,] have reported that next-generation sequencing technology can be used to explore correlations between intrinsic genomic features and the response to platinum-based chemotherapy. Radiomics is another approach. A recent study demonstrated that a predictive model based on the combination of radiomics with single nucleotide polymorphisms of Human Sulfatase 1 could predict platinum resistance in ovarian cancer treatment []. Previous research [] has shown that combining whole-slide histopathology scanners and high-throughput omics analysis with cutting-edge machine learning algorithms can help reveal correlations between microscopic tumor cell morphology and molecular pathways. Machine learning models have shown great promise in linking histopathological patterns to patient diagnosis and prognosis. Another study [] used tumor proteomic features to predict the clinical response to platinum-based chemotherapy in patients with ovarian cancer. The findings revealed a close association between tissue expression levels of 24 proteins and the response to platinum-based chemotherapy. The variables selected in the 19 included articles spanned from molecular-level factors to clinical characteristics, medical imaging, and the microbiome, reflecting the prevailing trend of considering multiple levels and aspects in cancer research. This comprehensive approach facilitates a more in-depth understanding of cancer pathogenesis and predictive factors.
Strengths and Limitations
Strengths
The most noteworthy aspect of our analysis is that it provides a comprehensive map of research on prognostic prediction models for patients with ovarian cancer. We gathered all available predictive models for potential clinical outcomes of platinum chemotherapy responses in patients with ovarian cancer. The characteristics of these models were elucidated in detail. Furthermore, this study critically evaluated the predictive models for platinum chemotherapy response in patients with ovarian cancer using the PROBAST tool. Moreover, a meta-analysis of the C-index using multiple externally verified predictive models was performed. There is currently no meta-analysis that summarizes research on machine learning prediction models for platinum chemotherapy response in ovarian cancer. Hence, this study aimed to explore its performance in prediction. It is critical to systematically review published studies on machine learning and provide guidance for future research. This helps to establish personalized treatment protocols and estimate prognosis by elucidating intrinsic tumor features such as platinum sensitivity in the initial therapy.
Limitations
However, several limitations of the current investigation must be considered. First, the meta-analysis of the C-index had a high degree of heterogeneity, probably because of the various machine learning methods, predictors, and parameters used in model construction, as well as differences in clinical settings, patient characteristics, and study periods. Second, PROBAST is a rigorous tool for appraising the construction of original models, and most of the original studies were rated as having a high risk of bias when assessed with it. The included studies had several methodological problems in model development, which were reflected in the risk of bias: the PROBAST assessment suggested that some studies had a high or unclear risk of bias in 4 domains (participants, predictors, outcomes, and statistical analysis). Furthermore, the predictive value of machine learning for different diseases may vary. The performance of machine learning depends on efficient predictors: when the same machine learning model includes more efficient predictors, its predictive value improves, which may result in heterogeneity between models. When constructing machine learning models, especially for rare events, some studies face challenges in acquiring large data sets, making it difficult to establish an independent validation set. Cross-validation may be used during training, although it cannot fully replace an independent validation set. When conducting a meta-analysis, it is essential to consider whether a model is overfitted, which requires attention to the results of the training set. Consequently, our meta-analysis included studies without independent validation cohorts for a comprehensive evaluation. Finally, the most important limitation is the lack of original research with large multicenter samples in the modeling process.
Therefore, more high-quality, multicenter, large-scale studies are required. Despite these limitations, we have compiled a comprehensive summary of the current models to provide a reference for the development of more broadly applicable clinical tools in the future, and in this respect, a meta-analysis was warranted. Although studies frequently disagree about predictive value, this partly depends on the choice of predictive model, which is the most influential factor affecting predictive performance.
Conclusions
Machine learning has excellent predictive performance in predicting response to platinum chemotherapy in patients with ovarian cancer. At the same time, we found that SVM has the best prediction performance among the existing prediction models. Machine learning can be used as a prediction tool for platinum response in ovarian cancer. On the basis of this research, a large-scale, multicenter, and multiethnic prediction tool can be developed in the future for predicting platinum-based chemotherapy response in patients with ovarian cancer to advance precision chemotherapy for ovarian cancer management.
This study was supported by the National Nature Science Foundation of China (grants 81973894 and 82174421).
The authors would like to thank the researchers and study participants for their contributions.
The original contributions presented in this study are included in the paper or multimedia appendices, and further inquiries can be directed to the corresponding author.
QW was involved in study conceptualization and investigation and wrote the original draft. ZC participated in study visualization, supervised the study, and reviewed and edited the draft. XL participated in the investigation and prepared the original draft. YW and CF participated in the investigation and prepared the original draft. YP participated in the investigation and gathered resources. XF was involved in project administration.
Conflicts of Interest: None declared.
Edited by G Tsafnat; submitted 26.04.23; peer-reviewed by J Zhu, Y Li; comments to author 10.11.23; revised version received 23.11.23; accepted 24.11.23; published 22.01.24
©Qingyi Wang, Zhuo Chang, Xiaofang Liu, Yunrui Wang, Chuwen Feng, Yunlu Ping, Xiaoling Feng. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 22.01.2024.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.