Medicine & Science in Sports & Exercise


Cardiopulmonary exercise testing (CPET) has become an essential diagnostic and prognostic tool in clinical practice (1,2). The prognostic value of CPET is established by the inverse relationship of peak oxygen uptake (V̇O2peak) with all-cause mortality (3), as well as with the risk for various noncommunicable diseases (e.g., coronary artery disease, type 2 diabetes) (3,4). V̇O2peak reflects the upper ceiling of oxygen supply and utilization and depends on the interplay of 1) pulmonary–vascular, 2) mechanical–ventilatory, 3) cardiocirculatory, and 4) muscular systems (5). CPET is thus useful to determine the status quo of an individual’s cardiorespiratory fitness, detect physiological factors limiting V̇O2peak, and quantify changes in these factors (e.g., due to disease or interventions) (1).

The correct interpretation of the multitude of CPET data and especially their interplay requires extensive training (6). There is currently a lack of sufficiently trained staff, particularly in smaller institutions (7). CPET is thus underused (7). Several attempts have been made to simplify the interpretation of CPET, such as the nine-panel Wasserman plot or numerous decision trees (1,8,9). Although these tools might simplify the interpretation of CPET data by visualizing large amounts of data and decision processes, a substantial understanding of various physiological mechanisms is still required. In addition, this complexity in the interpretation and differences between decision trees as well as thresholds used to signal abnormal responses to exercise can lead to interobserver differences. To exploit the full potential of this clinically highly relevant tool, CPET interpretation needs to be further simplified and standardized to enable broader dissemination. As seen recently, machine learning (ML) seems to be a promising approach to achieve this (10,11).

Inbar et al. (10) developed an algorithm for CPET data able to discriminate between individuals with chronic heart failure, individuals with chronic obstructive pulmonary disease, and healthy counterparts. As the authors pointed out, the algorithm may be valuable to identify the investigated chronic conditions in clinical practice (10). Only recently, another important step forward was taken. Portella et al. (11) used ML to classify individuals according to their primary exercise limitation. Extending their approach by categorizing the severity of exercise limitations may be valuable considering that patients often present with a combination of exercise limitations (12). This would be particularly relevant in a patient population that is representative of clinical practice.

Thus, we aimed to provide a proof-of-concept and 1) determine the most important CPET parameters to identify pulmonary–vascular, mechanical–ventilatory, cardiocirculatory, and muscular exercise limitations; 2) create models that can automate the identification of exercise limitations and categorization of their severity; and 3) compare the accuracy of these models to expert consensus in a real-life scenario using CPET data of patients presenting at a pulmonary clinic.

METHODS

Study Design

This cohort study included 200 valid, historical CPET data sets from patients presenting at the Lung Centre (Bogenhausen-Harlaching), München Klinik, Germany. The CPETs were conducted between June 25, 2018, and February 15, 2019. To reflect a real-life situation, no preselection was made based on the diagnosis, indication for the exercise test, or sex. The cohort thus included a larger fraction of patients with lung diseases than with other diagnoses. However, this may still closely reflect the distribution of disease in patients referred for CPET in clinical practice. Although an ML algorithm should ideally apply to a wide range of individuals, we focused on patients presenting at a lung clinic to provide a proof-of-concept of our method before moving on to larger and more diverse populations. This ensures that a diagnosis is available for each patient and allows a more controlled development process, with consistent data collection and devices. The cohort was randomly split into a training group (n = 100) and a confirmation group (n = 100) for the analyses. This was done to verify that the models not only perform well on the training data but also generalize effectively to the unseen data set. The study was approved by the Ethics Committee of the Technical University Munich (165/19 S-SR), and all procedures followed the Declaration of Helsinki.

Study Participants

CPET data were eligible if the following criteria were fulfilled: technically correct and valid data recording, complete CPET including inspiratory capacity maneuver (described elsewhere (13)), blood gas analysis and clinical report, sufficient patient compliance during testing (i.e., ability to tolerate physical effort and to obtain the target cadence on the cycle ergometer, and sufficient knowledge of the German language), and age >40 yr. The age limit was applied to reduce heterogeneity in the sample and ensure that patients and reasons for referral are typical for clinical practice. CPET data were excluded in the case of premature, nonclinically justified test termination (i.e., no apparent symptom or organ limitation). These strict eligibility criteria are unlikely to be met in clinical practice. However, they were chosen to achieve a high number of parameters with valid data that can be included in the ML models. This allows key parameters relevant for identifying organ limitations to be detected from a wide range of parameters. If a subject was excluded, the subsequent subject (i.e., 201st subject) was included until 200 eligible data sets were obtained. More details on the number of CPET data that were excluded are available in the Results section.

Study Procedures

Expert rating of CPET data

Each of the 200 data sets was independently rated by two experts regarding pulmonary–vascular (related to the lungs and blood vessels that supply them), mechanical–ventilatory (related to mechanical aspects of breathing such as airway resistance and respiratory muscle function), cardiocirculatory (related to the cardiovascular system, including the heart, blood vessels, and circulatory function), and muscular limitations (related to muscle pathologies, e.g., metabolic, mitochondrial origin, not deconditioning) based on their experience. Experts perform >1000 CPETs yearly and publish regularly in this research area. The severity of the limitation was graded on a visual analog scale from 0 (no impairment) to 6 (severe impairment) for each category (Supplemental Fig. 1, Supplemental Digital Content, Visual analog scale used by experts to rate limitations in the respective organ limitations, https://links.lww.com/MSS/C911). The 0–6 rating was based on the German school rating system. Rating combined limitations was possible. The mean of the two ratings was calculated for deviations ≤2 points. Substantial deviations of expert ratings (≥3 points) in a patient were resolved by involving a third expert.
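The consensus rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `consensus_rating` is hypothetical, and because the text does not describe how the third expert resolved a disagreement, the sketch only flags such cases (any deviation above 2 points is treated as needing adjudication, which is an assumption).

```python
from typing import Optional


def consensus_rating(expert_a: float, expert_b: float) -> Optional[float]:
    """Return the mean of two expert ratings on the 0-6 scale.

    Returns None when the ratings deviate by more than 2 points, i.e. when a
    third expert would have to be involved (assumption: the resolution
    procedure itself is not described in the text, so it is not modeled).
    """
    for rating in (expert_a, expert_b):
        if not 0 <= rating <= 6:
            raise ValueError("ratings must lie on the 0-6 scale")
    deviation = abs(expert_a - expert_b)
    if deviation <= 2:
        return (expert_a + expert_b) / 2
    return None  # flag for adjudication by a third expert


if __name__ == "__main__":
    print(consensus_rating(2, 4))  # ratings close enough: mean is used
    print(consensus_rating(1, 5))  # large deviation: needs a third expert
```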

Spirometry and CPET

Lung function was assessed via body plethysmography (MasterScreen Body; CareFusion Germany 234 GmbH, Höchberg, Germany). The procedures were performed according to respective guidelines (13–15). Forced expiratory volume in 1 s (FEV1) and forced vital capacity (FVC) were expressed as % of predicted (16).

CPET was performed on an electronically braked cycle ergometer (ER900; ergoline GmbH, Bitz, Germany). The breath-by-breath measurement of gas and airflow parameters was done using the MasterScreen CPX (CareFusion Germany 234 GmbH). Data were stored as 10-s means. Gas and ambient air calibrations were performed in standard fashion daily before the first test. Furthermore, the volume sensor was calibrated before each test. Cardiac function and heart rate were monitored by a 12-lead ECG. Peripheral oxygen saturation was recorded on the finger with a pulse oximeter (Vyaire Medical GmbH, Hoechberg, Germany). No measures of blood lactate concentration were done.

CPET started with a resting phase (2:30 min) used to assess the plausibility of gas and volume parameters while the patient was at rest (1). This phase was followed by 30 s of unloaded cycling. Subsequently, a ramp protocol with constantly increasing workload was initiated aiming at reaching maximum voluntary exertion of the patient within 8–12 min. Depending on the patient’s predicted maximum power, one of six ramp protocols was chosen with increments of 5, 10, 15, 20, 25, or 30 W·min−1. The test ended with a recovery phase consisting of cycling at a low workload. The protocol was chosen based on the physical activity history and subjective evaluation of the patient. Inspiratory capacity maneuvers were performed at rest and every 2 min during the ramp protocol as described elsewhere (17). V̇O2peak was recorded as the mean of the three highest consecutive 10-s intervals of V̇O2 values during the test (30-s mean). V̇O2peak as % of predicted was calculated according to the formulas in Wasserman et al. (18).
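As an illustration of the V̇O2peak definition above, the following sketch takes a series of 10-s V̇O2 means and returns the highest mean over three consecutive intervals (a 30-s rolling mean). The function name `vo2_peak` and the example values are hypothetical and not taken from the study data.

```python
from typing import Sequence


def vo2_peak(vo2_10s: Sequence[float]) -> float:
    """Highest mean of three consecutive 10-s V'O2 intervals (30-s mean)."""
    if len(vo2_10s) < 3:
        raise ValueError("at least three 10-s intervals are required")
    # Slide a window of three consecutive intervals over the test.
    window_means = (
        sum(vo2_10s[i:i + 3]) / 3 for i in range(len(vo2_10s) - 2)
    )
    return max(window_means)


if __name__ == "__main__":
    # Hypothetical 10-s means in L/min near the end of a ramp test.
    example = [1.00, 1.10, 1.20, 1.30, 1.25, 1.10]
    print(round(vo2_peak(example), 3))
```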

Maximum heart rate and respiratory exchange ratio were reported as the highest 10-s interval before the recovery phase of the test. Test duration refers to the time from the end of the unloaded phase to the start of the recovery phase. Ventilatory efficiency (V̇E/V̇CO2) refers to the lowest ratio of ventilation and carbon dioxide production (nadir V̇E/V̇CO2), whereas V̇E/V̇CO2 slope refers to the slope of these two parameters from the beginning of the ramp protocol to the second ventilatory threshold (19). V̇O2/WR slope, the slope of V̇O2 in relation to work rate (in watts), was calculated from the start of the ramp protocol to the start of the recovery phase. If a change in this slope was detected during visual inspection of panel 3 in the nine-panel Wasserman graph, the V̇O2/WR slope up to this point was determined (V̇O2/WR slope 1). Furthermore, V̇O2/WR slope from this point onward was determined (V̇O2/WR slope 2). The percentage change was then calculated based on these two slopes. If the change in slope occurred during the last 30 s before the recovery phase, this was excluded and not considered abnormal (20). Breathing reserve was calculated as 100% − (V̇Epeak [in L·min−1]/(FEV1 [in L] × 35)) × 100 (21).
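The breathing reserve formula given above can be written out as a short worked example. The function name `breathing_reserve_percent` and the input values are hypothetical; only the formula itself is taken from the text.

```python
def breathing_reserve_percent(ve_peak_l_min: float, fev1_l: float) -> float:
    """Breathing reserve in percent: 100 - (VEpeak / (FEV1 * 35)) * 100.

    FEV1 * 35 serves as the estimate of maximal voluntary ventilation.
    """
    if fev1_l <= 0:
        raise ValueError("FEV1 must be positive")
    estimated_mvv = fev1_l * 35
    return 100 - (ve_peak_l_min / estimated_mvv) * 100


if __name__ == "__main__":
    # Hypothetical patient: peak ventilation 56 L/min, FEV1 2.0 L.
    # Estimated maximal ventilation is 70 L/min, so 80% is used.
    print(round(breathing_reserve_percent(56.0, 2.0), 1))
```

A value near zero or below indicates that peak ventilation approached or exceeded the estimated ventilatory capacity.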

Statistical Analysis

To provide a proof of concept, a sample size of N = 200 in the present study was chosen based on a prior study including N = 199 examinations to develop a standardized procedure for the interpretation of CPET (9).

Analyses and figures were done in R version 4.1.3 (22). Data are presented as mean (SD) unless stated otherwise. Decision trees are a well-known regression method for building an interpretable predictive model (23). They represent a function that takes all measured variables (Table 1) as an input vector of attribute values and returns a “decision”—a single output value as a predicted limitation score (24).


TABLE 1 - Full list of all variables included in the analyses.

| Variables | Abbreviations in Figures |
|---|---|
| Sex | |
| Age | |
| Heart rate–reducing drugs (yes/no) | HR_drugs |
| Breathing frequency at rest | BF_rest |
| Oxygen uptake at rest | V̇O2_rest |
| Respiratory exchange ratio at rest | RER_rest |
| Respiratory exchange ratio variation at rest in percent | RER_rest_deltapercent |
| Minute ventilation volume at rest | VE_rest |
| Reached maximal oxygen consumption in percent of predicted | V̇O2peak_percent |
| Maximal reached respiratory exchange ratio during exercise | RER_exercise |
| Ratings of perceived exertion during exercise (scale 0–10) | RPE_exercise |
| Maximum heart rate during exercise | HR_exercise_is |
| Lowest V̇E/V̇CO2 ratio during the CPET | nadir_VEVCO2 |
| Duration of CPET | duration_exercise |
| Exercise oscillatory ventilation (present/absent) | EOV |
| Slope of V̇E/V̇CO2 from the beginning of the ramp protocol to the second ventilatory threshold | VE/VCO2slope |
| Breathing reserve in percent | BRR |
| Difference in arterial oxygen saturation between rest and exercise cessation | DeltaSaO2 |
| Slope of increase in V̇O2 to work rate | VO2WRslope1 |
| Inflection of slope of V̇O2/work rate in percent | inflectionpercent |
| Systolic blood pressure during exercise | RRsys_exercise |
| FVC in percent | FVCpercent |
| FEV1 in percent | FEV1percent |
| Ratio of FEV1 and FVC in percent | FEV1FVCpercent |
| Heart rate reserve in percent during exercise | HRRpercent |
| Exercise-induced hypertension (yes/no) | Exercise_hypertension |
| Exercise-induced hypotension (yes/no) | Exercise_Hypotension |
| Dynamic hyperinflation (present/absent) | Dyn_Hyper |
| Exhaustion criteria combination (yes/no) | exhaustion_DT |
| Clinical exhaustion criteria (yes/no) | exhaustion_clinic |

Exhaustion criteria combination: fulfillment of a combination of cardiometabolic exhaustion criteria. Clinical exhaustion criteria: fulfillment of ≥1 cardiometabolic exhaustion criterion.

The R package “caret” (version 6.0-91) was used to generate regression-based decision trees (25). For each limitation category, the most accurate decision tree, defined as the one with the smallest root mean square error (RMSE) among 500 randomly generated decision trees, was presented graphically to show the corresponding thresholds of the parameters included. These decision trees may therefore be useful to identify pulmonary–vascular, mechanical–ventilatory, cardiocirculatory, or muscular limitations and help quantify their severity. Assessed limitations were combined through conditional inference trees using “partykit” (version 1.2-15) (26). Random forests combine several randomized decision trees and aggregate their predictions by averaging (27). Therefore, the feature importance (FI) of all variables can be analyzed for each limitation. We used “randomForest” (version 4.7-1) with default ntree = 500 to train a random forest based on ntree randomized decision trees for imputation of the data, as well as for the random forest analysis (28). All models were trained using data of the training group and tested based on data of the confirmation group. Errors of random forests and decision trees are presented as mean absolute error (MAE) and/or RMSE. MAE_0 and RMSE_0 are error measures of the null models that use only the mean of the confirmation group. MAE and RMSE are error measures of the trained models for the confirmation group. Both metrics reflect how far the predicted scores and the actual scores (mean expert rating) lie apart. To examine the utility of ML in a real-life scenario, we further generated a combined decision tree that is capable of identifying the presence of multiple limitations within the same patient as well as rating their severity.
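To make the error measures concrete, the sketch below computes MAE and RMSE for a set of predictions, the corresponding null-model errors (MAE_0 and RMSE_0, where every prediction is the group mean), and the relative error reduction. It is written in plain Python for illustration, whereas the study itself used R; all function names and the example ratings are hypothetical.

```python
import math
from typing import Sequence, Tuple


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error between predicted and actual scores."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between predicted and actual scores."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def null_model_errors(actual: Sequence[float]) -> Tuple[float, float]:
    """MAE_0 and RMSE_0: errors when always predicting the group mean."""
    mean_score = sum(actual) / len(actual)
    constant = [mean_score] * len(actual)
    return mae(constant, actual), rmse(constant, actual)


def relative_reduction_percent(error_null: float, error_model: float) -> float:
    """Percentage by which the trained model lowers the null-model error."""
    return (error_null - error_model) / error_null * 100


if __name__ == "__main__":
    # Hypothetical mean expert ratings and model predictions (0-6 scale).
    expert = [0.0, 2.0, 4.0]
    model = [1.0, 2.0, 3.0]
    mae_0, rmse_0 = null_model_errors(expert)
    print(round(mae(model, expert), 3), round(rmse(model, expert), 3))
    print(round(relative_reduction_percent(rmse_0, rmse(model, expert)), 1))
```

A trained model is only informative if its MAE and RMSE fall clearly below MAE_0 and RMSE_0.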

RESULTS

Twenty-nine CPETs were excluded because of incompleteness (missing inspiratory capacity maneuver (n = 19), missing blood gas analysis (n = 2), incomplete examination report (n = 8)), 32 because of a lack of compliance during CPET, and 21 because of invalid/implausible data (calibration error (n = 4), implausible heart rate data (n = 3), implausible CPET trajectory (n = 10), technical artifacts (n = 4)) until the target number of valid CPET data sets was achieved. Reasons for referral were preoperative examination (i.e., patients planned for lung resection due to lung carcinoma), evaluation of therapy progress, and unexplained dyspnea. The cohort characteristics are presented in Table 2 and medical diagnoses in Supplemental Table 1 (Supplemental Digital Content, https://links.lww.com/MSS/C911).


TABLE 2 - Characteristics of the full cohort (N = 200) and stratified by group.

| Variable | Total | Training Group (n = 100) | Confirmation Group (n = 100) |
|---|---|---|---|
| Female sex, n (%) | 97 (48.5) | 52 (52.0) | 45 (45.0) |
| Age, Mdn (IQR) [min, max], yr | 69 (61–75) [41, 94] | 68 (61–74) [42, 85] | 71 (62–77) [41, 94] |
| Body mass index, Mdn (IQR), kg·m−2 | 25.3 (22.3–30.7) | 25.7 (22.3–30.8) | 24.3 (22.3–30.3) |
| FEV1, Mdn (IQR), % of pred. | 80.8 (61.4–94.8) | 80.7 (67.7–94.5) | 81.0 (57.6–94.9) |
| FVC, Mdn (IQR), % of pred. | 90.6 (73.7–100.6) | 91.2 (78.4–101.3) | 86.6 (72.1–99.9) |
| FEV1/FVC, Mdn (IQR), % of pred. | 94.3 (79.7–105.1) | 92.5 (83.2–103.9) | 95.1 (76.9–105.8) |
| V̇E/V̇CO2 slope, Mdn (IQR) | 35.0 (31.0–42.0) | 35.0 (31.1–41.6) | 35.0 (30.6–43.3) |
| V̇O2/WR slope | 9.5 (1.6) | 9.5 (1.5) | 9.5 (1.6) |
| Pmax, Mdn (IQR), W | 89.0 (67.0–119.0) | 92.5 (70.5–122.8) | 88.0 (60.5–113.5) |
| V̇O2peak, Mdn (IQR), L·min−1 | 1.20 (0.97–1.45) | 1.23 (1.03–1.46) | 1.17 (0.95–1.44) |
| V̇O2peak, Mdn (IQR), mL·min−1·kg−1 | 15.8 (12.6–19.7) | 16.2 (12.6–20.3) | 15.6 (12.3–19.2) |
| V̇O2peak, % of pred. | 78.5 (22.6) | 79.7 (22.7) | 77.3 (22.5) |
| RER | 1.1 (0.1) | 1.12 (0.1) | 1.10 (0.1) |
| HRmax, % of pred. | 95.9 (14.3) | 95.6 (14.0) | 96.2 (14.6) |
| Medication | | | |
| β-Blockers, n (%) | 61 (30.5) | 28 | 33 |
| Antiarrhythmics, n (%) | 93 (46.5) | 43 | 50 |
| Lipid-lowering drugs, n (%) | 66 (33.0) | 32 | 34 |
| Antidiabetics, n (%) | 28 (14.0) | 16 | 12 |
| Diuretics, n (%) | 60 (30.0) | 26 | 34 |
| Antihypertensives, n (%) | 91 (45.5) | 48 | 43 |
| PDE5-antagonists, n (%) | 6 (3.0) | 4 | 2 |
| Endothelin-receptor antagonists, n (%) | 4 (2.0) | 2 | 2 |
| Guanylate-cyclase stimulators, n (%) | 1 (0.5) | 1 | 0 |
| Neprilysin inhibitors, n (%) | 1 (0.5) | 0 | 1 |

All data are presented as mean (SD) unless stated otherwise.

Expert Review of Exercise Limitations

Descriptive statistics of expert ratings are displayed in Table 3. Differences between the expert rating and the decision tree-based rating of organ limitations are shown in Supplemental Figure 2, Supplemental Digital Content, https://links.lww.com/MSS/C911.


TABLE 3 - Descriptive statistics of expert ratings as limitation points stratified by limitation category for the full cohort (N = 200).

| Limitation Category | Mean (SD) [Min, Max] | Sum of Limitation Points | Mean (SD) Deviation between Experts |
|---|---|---|---|
| Pulmonary–vascular | 1.8 (1.7) [0, 6] | 357.5 | 1.1 (1.2) |
| Mechanical–ventilatory | 1.8 (1.8) [0, 6] | 353.5 | 1.0 (1.2) |
| Cardiocirculatory | 1.0 (1.3) [0, 6] | 205.5 | 1.1 (1.2) |
| Muscular | 0.1 (0.3) [0, 2.5] | 14.5 | 0.4 (0.9) |

Limitation-Specific Decision Trees

Pulmonary–vascular decision tree

Our trained random forest analysis yielded nadir V̇E/V̇CO2 (FI 0.91) and V̇E/V̇CO2 slope (FI 0.77) as the most relevant parameters to detect pulmonary–vascular limitations (Fig. 1A). The null model with only the mean of the confirmation group had an RMSE_0 of 1.78. After training, the error decreased by 50% to an RMSE of 0.89. The decision tree with the lowest RMSE in this category (RMSE = 0.93) is displayed in Figure 1B.

FIGURE 1:

Random forest analysis (A) and most accurate decision tree (B) of the confirmation group for the pulmonary–vascular limitation category. Dark green to dark red coloring reflects the severity of limitation (mild to severe, respectively). The first number in each box is the mean limitation score of each category. MAE, mean absolute error of the trained model in the confirmation group; MAE_0, MAE of the null model in the confirmation group; RMSE, root mean square error of the trained model in the confirmation group; RMSE_0, RMSE of the null model in the confirmation group.

Mechanical–ventilatory decision tree

To detect mechanical–ventilatory limitations, breathing reserve (FI 1.05), FEV1 (FI 1.00), and FVC (FI 0.44) were the most important parameters (Fig. 2A). RMSE_0 was 1.76 and RMSE was 1.03 (41% decrease). The RMSE of the most accurate decision tree was 1.28 (Fig. 2B).

FIGURE 2:

Random forest analysis (A) and most accurate decision tree (B) of the confirmation group for the mechanical–ventilatory limitation category. Dark green to dark red coloring reflects the severity of limitation (mild to severe, respectively). The first number in each box is the mean limitation score of each category. MAE, mean absolute error of the trained model in the confirmation group; MAE_0, MAE of the null model in the confirmation group; RMSE, root mean square error of the trained model in the confirmation group; RMSE_0, RMSE of the null model in the confirmation group.

Cardiocirculatory decision tree

V̇O2peak (FI 0.56), percentage change of V̇O2/WR slope (FI 0.37), and V̇O2/WR slope (FI 0.24) had the highest FI to detect cardiocirculatory limitations (Fig. 3A). RMSE_0 and RMSE were 1.24 and 0.99, respectively (20% decrease). The most accurate decision tree in this category had an RMSE of 1.26 (Fig. 3B).

FIGURE 3:

Random forest analysis (A) and most accurate decision tree (B) of the confirmation group for the cardiocirculatory limitation category. Dark green to dark red coloring reflects the severity of limitation (mild to severe, respectively). The first number in each box is the mean limitation score of each category. MAE, mean absolute error of the trained model in the confirmation group; MAE_0, MAE of the null model in the confirmation group; RMSE, root mean square error of the trained model in the confirmation group; RMSE_0, RMSE of the null model in the confirmation group.

Muscular limitations decision tree

It was not possible to train a model as indicated by a larger RMSE after training. Muscular limitations were largely absent in our cohort as evident by the expert rating (Table 3 and Supplemental Fig. 2, Supplemental Digital Content, Expert ratings for each patient in the training group compared with the rating of the decision tree for all four limitation categories, https://links.lww.com/MSS/C911).

Explorative Analyses

Repeating the previously presented analyses with an 80:20 split (80% of the cohort in the training data set and 20% in the confirmation data set) and fivefold cross-validation yielded a similar RMSE for the pulmonary–vascular (1.00 ± 0.21 vs 0.93), mechanical–ventilatory (1.31 ± 0.20 vs 1.28), and cardiocirculatory decision trees (1.31 ± 0.15 vs 1.26). Detailed results are presented in Supplemental Figures 3–8, Supplemental Digital Content, Random forests and most accurate decision trees of the pulmonary–vascular, mechanical–ventilatory, and cardiocirculatory limitation categories, https://links.lww.com/MSS/C911.
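The resampling scheme can be illustrated with a minimal sketch of fivefold cross-validation, in which each fold serves once as the 20% confirmation set while the remaining 80% are used for training. The function name `five_fold_splits`, the fixed seed, and the shuffling strategy are assumptions for illustration; the study's actual R code is not shown in the text.

```python
import random
from typing import List, Tuple


def five_fold_splits(
    n_samples: int, n_folds: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return (training indices, confirmation indices) for each fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    # Deal the shuffled indices into n_folds groups of near-equal size.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        confirmation = folds[k]
        training = [
            idx for j, fold in enumerate(folds) if j != k for idx in fold
        ]
        splits.append((training, confirmation))
    return splits


if __name__ == "__main__":
    for training, confirmation in five_fold_splits(200):
        print(len(training), len(confirmation))  # 80:20 split of 200 sets
```

The RMSE would then be computed once per fold and summarized as mean ± SD across the five folds.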

Combined Decision Tree

Figure 4 shows the combined decision tree based on all 200 CPET data sets. The analyses yielded nine categories of limitations (displayed as boxplot columns; see Supplemental Table 2, Supplemental Digital Content, Limitations categories of the combined decision tree based on 200 CPET data sets, https://links.lww.com/MSS/C911).

FIGURE 4:

Combined decision tree allowing for cross-category quantification of organ limitations. Boxplot columns represent the nine limitation categories described in the text. Boxplot rows represent cardiocirculatory (CL), mechanical–ventilatory (MechL), and pulmonary–vascular limitations (PVL), respectively. Each boxplot shows the limitation score on the y axis (0 (no impairment) to 6 (severe impairment)), median (horizontal line), first and third quartiles (lower and upper box limits), minimum and maximum (whiskers), and outliers (dots).

DISCUSSION

The major novel findings were as follows: 1) we found good trainability of random forests and decision trees for detecting specific limitation patterns with accuracies comparable to expert consensus for pulmonary–vascular, mechanical–ventilatory, and cardiocirculatory but not muscular limitations; 2) nadir ventilatory efficiency for CO2 and ventilatory efficiency slope for CO2 were central to identify pulmonary–vascular limitations; breathing reserve, FEV1, and FVC for mechanical–ventilatory limitations; and V̇O2peak, O2 uptake/work rate slope, and % change of the latter for cardiocirculatory limitations; 3) decision trees yielded parameter thresholds for the interpretation of organ-specific limitations and their severity; and 4) we demonstrated the feasibility/ability of ML techniques to create algorithms performing a comprehensive cross-category quantification of organ limitations reflecting a real-life situation.

Expert Review of Exercise Limitations

Although all reviewers were experts in CPET, the data indicate that opinions may diverge when it comes to rating the degree of limitation. This again highlights the difficulty of CPET interpretation and the need for simplifying this process.

Limitation-Specific Decision Trees

We defined a total of eight crucial parameters to identify pulmonary–vascular, mechanical–ventilatory, and cardiocirculatory but not muscular limitations. The RMSE of the best decision tree of each of the aforementioned limitation categories ranged from 0.93 to 1.28 points indicating comparable accuracy of ML approaches to expert ratings (mean difference ranging from 1.0 to 1.1; SD, 1.2 points).

Pulmonary–vascular decision tree

The pulmonary–vascular model showed the best trainability. Nadir V̇E/V̇CO2 and V̇E/V̇CO2 slope were the most important parameters to predict these limitations. In the most accurate decision tree, the most severe pulmonary–vascular limitations were seen in patients with both high nadir V̇E/V̇CO2 and steeper V̇E/V̇CO2 slope. This is in line with the current scientific consensus (30). Ventilatory efficiency is recognized as a central predictor of pulmonary vascular conditions, that is, pulmonary arterial hypertension or chronic thromboembolic pulmonary hypertension, and its use is widely established (30). However, different ways of expressing ventilatory efficiency exist (described elsewhere (30–32)). V̇E/V̇CO2 slope from the onset of incremental exercise to the second ventilatory threshold was used in this article. For this form, consensus regarding thresholds used to indicate pathological ventilatory inefficiency is lacking (30). Our results suggest the use of 34 for nadir V̇E/V̇CO2 and 42 for V̇E/V̇CO2 slope allowing for differentiation between patients with pulmonary–vascular limitation scores of 0.75 and 3.1 as well as 2.3 and 3.9, respectively (no to mild limitation vs moderate to severe limitation). To our knowledge, these are the first data-based cutoffs for a cohort resembling real-life patients. Most available studies focused on specific diseases. For instance, Dumitrescu et al. (33) found that a threshold for nadir V̇E/V̇CO2 of 35.5 had the highest sensitivity and specificity for detecting pulmonary arterial hypertension in patients with systemic sclerosis (N = 173). We aimed at detecting pulmonary–vascular limitations in general and not specific conditions. In this case, a nadir V̇E/V̇CO2 of 34 might be most suitable. 
Considering also that in Figure 1B a V̇E/V̇CO2 slope of 42 distinguished individuals with the most severe limitation in this category, a higher threshold than, for example, the value of 34 used to predict cardiac-related mortality in heart failure (34) seems justified. Importantly, these two parameters are not specific to pulmonary–vascular limitations but may also be abnormal in cardiac limitations (e.g., heart failure (34)) or in patients presenting with combined limitations (e.g., chronic obstructive pulmonary disease with pulmonary hypertension (35)). In such cases, parameters specific to other organ systems may help rule out other limitations or identify combined organ limitations as they often occur in clinical practice (discussed later on).
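For illustration only, the two reported cutoffs can be expressed as a simple rule. This sketch assumes that the tree first splits on nadir V̇E/V̇CO2 at 34 (mean scores of 0.75 vs 3.1) and that the higher branch is then split on V̇E/V̇CO2 slope at 42 (mean scores of 2.3 vs 3.9); the actual tree in Figure 1B may contain further nodes, and the direction and inclusiveness of the comparisons are assumptions. The function name `pulmonary_vascular_score` is hypothetical.

```python
def pulmonary_vascular_score(nadir_ve_vco2: float, ve_vco2_slope: float) -> float:
    """Approximate mean limitation score (0-6) from two reported cutoffs.

    Assumed structure (not confirmed by the text): split on nadir VE/VCO2 at
    34 first, then on VE/VCO2 slope at 42 within the higher-nadir branch.
    """
    if nadir_ve_vco2 < 34:
        return 0.75  # no to mild limitation
    if ve_vco2_slope < 42:
        return 2.3   # moderate limitation
    return 3.9       # most severe group in this category


if __name__ == "__main__":
    print(pulmonary_vascular_score(30, 31))  # low nadir
    print(pulmonary_vascular_score(38, 36))  # high nadir, moderate slope
    print(pulmonary_vascular_score(45, 50))  # high nadir, steep slope
```

Such a rule is not a diagnostic tool; it only shows how tree thresholds translate into a severity estimate.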

Mechanical–ventilatory decision tree

The parameters with the highest FI in the random forest of the mechanical–ventilatory category are also in line with the current scientific consensus (14). FEV1 and FVC are key parameters of spirometry and routinely used to diagnose ventilatory dysfunction at rest (14). Breathing reserve provides additional insight into ventilatory function during exercise, with lower values indicating a ventilatory limitation in clinical populations (1). In the most accurate decision tree, FEV1 identified patients with the most severe mechanical–ventilatory limitation (severity score of 4.1). Breathing reserve was also a node in this tree and located below FEV1. This makes sense from a clinical standpoint because patients with substantial ventilatory dysfunction at rest (as evident by subnormal FEV1) would be expected to have the most severe limitation in this category. Normal lung function at rest does not, however, rule out a ventilatory limitation during intense exercise. Here, breathing reserve comes into play. Thus, for patients with moderate mechanical–ventilatory limitation (limitation score of 2.8), a low breathing reserve as well as V̇O2peak below 96% of predicted were critical. CPET is therefore most valuable for identifying organ limitations when combined with spirometry at rest (1).

Cardiocirculatory decision tree

Cardiocirculatory limitations were best detected through V̇O2peak, inflection of V̇O2/WR slope in percent, and V̇O2/WR slope. Although low V̇O2peak is not specific to cardiocirculatory limitations, the latter two parameters are (36). Wasserman already highlighted the failure of V̇O2 to increase linearly with WR as a sign of cardiocirculatory limitation (5). Percentage change of V̇O2/WR slope was also an important node in a decision tree by Schmid et al. (9). In addition to these parameters, heart rate and oxygen pulse responses during exercise are attenuated (5,36). Moreover, the oxygen uptake efficiency slope (slope of regression line between log10 V̇E and V̇O2) is flattened, and the ventilatory threshold 1 (defined by V-slope method) is reached at a lower relative exercise intensity in patients with this limitation (36). To differentiate between specific conditions, parameters other than the inflection of V̇O2/WR slope and V̇O2/WR slope may be more adequate (36). However, if the aim is to identify underlying organ limitations, for example, in an individual with unexplained dyspnea, the parameters proposed in this study (i.e., V̇O2peak, inflection of V̇O2/WR slope in percent, and V̇O2/WR slope) may be preferred. Interestingly, V̇O2peak followed by FEV1 were the only two parameters included in the most accurate decision tree to categorize patients according to the severity of their cardiocirculatory limitation. Patients with V̇O2peak <63% of predicted but without apparent mechanical–ventilatory limitation (in the present study: FEV1 > 67% of predicted) were the most limited in this category (limitation score of 3.9).

Muscular limitations decision tree

No decision trees could be trained for muscular limitations. This may be explained by the low number of patients with muscular limitations as well as their mild severity, evident by a mean limitation score of 0.1 (0.3) on a scale from 0 to 6 (see Table 3 and Supplemental Fig. 2, Supplemental Digital Content, Expert ratings for each patient in the training group compared with the rating of the decision tree for all four limitation categories, https://links.lww.com/MSS/C911). To tackle this in future studies, it may be advisable to specifically include patients with muscular pathologies, that is, mitochondrial myopathies or McArdle disease (37), and a wide range of severity.

Another explanation, although less likely in this case, might be that the parameters included in the models are too unspecific to identify muscular limitations. Although this is not done routinely, including measurements of blood lactate concentration might be helpful to identify muscular limitations using ML. Drawing blood lactate samples at set time points throughout CPET might reveal an altered anaerobic metabolism and might therefore be useful to rule out other organ limitations (37). To assess general muscular deconditioning, measuring handgrip strength might help to detect strength deficits and is time-saving and inexpensive (29,38). However, it needs to be investigated whether these measurements improve the models in detecting myopathies and/or deconditioning.

Explorative Analyses

Including more patients in the training group (training-confirmation ratio, 80:20) yielded similar RMSEs compared with the 50:50 split. This indicates that larger sample sizes alone may not improve the precision of these models. Only a few patients were severely limited in one or several of the organ systems (Table 3). It thus seems more important to also increase the variance of limitations in the cohort. A greater number of expert ratings for each patient would likewise be valuable considering the deviation between experts (mean discrepancy ranged between 1.0 and 1.1 points).

Combined Decision Tree

In clinical practice, patients commonly present with a combination of exercise limitations; for example, patients with chronic obstructive pulmonary disease may develop peripheral limitations secondary to respiratory limitations (12). The patient’s medical condition may thus not be described by a single limitation category but a combination of several. The combined decision tree in Figure 4 allows the classification of exercise intolerance into three categories: pulmonary–vascular, ventilatory–mechanical, and cardiocirculatory limitations. Furthermore, a subclassification is done, making it possible to reflect combinations of limitation categories. These results are promising as they indicate that a real-life classification using ML approaches is possible.

Ending this section, we want to point out another particular strength of a combined algorithm. Individual parameters, for example, ventilatory efficiency, not only reflect pulmonary–vascular limitations but could also be altered through processes induced by healthy aging (39). Moreover, depending on the context, individual CPET parameters may connote abnormality, whereas, in fact, the subject is limitation-free and vice versa. For instance, breathing reserve less than 30% is not rare in athletes and suggests great effort as well as motivation rather than ventilatory–mechanical limitations (40). On the other hand, breathing reserve was weakly associated with dyspnea burden in patients with chronic obstructive pulmonary disease (41). Our combined algorithm incorporates numerous CPET parameters (Table 1) in the identification of specific limitation patterns. Although a single parameter may not be sufficient to rule in a specific limitation category, additional parameters may be used to rule out other categories. This may be an advantage over traditional decision trees and correspond to the expert approach.

Synthesis of Our Findings with Available ML Algorithms

As mentioned in the Introduction section, two research groups developed similar algorithms. Inbar et al. (10) provided a proof of concept that ML may be used to successfully discriminate between healthy individuals, patients with heart failure, and those with chronic obstructive pulmonary disease. However, they did not report key features of their models. Portella et al. (11) used ML to classify individuals according to their main exercise limitation into four categories (cardiac, pulmonary, others, and normal response). Their method is different from ours because they analyzed the average data from 30-s intervals of selected CPET parameters (11). In contrast, we used precalculated parameters that are commonly used to detect exercise limitations, which hampers the direct comparison of results. Our key features of the pulmonary–vascular limitations model are partly in line with theirs. Although parameters describing the trajectories of V̇O2 and V̇CO2 were also important in our model, they additionally found % of O2 pulse predicted and maximum respiratory exchange ratio to be relevant for detecting limitations in this category (11). However, it should be noted that Portella et al. (11) did not differentiate between pulmonary–vascular and mechanical–ventilatory limitations, as was done in our study. Similarly, the key features of the cardiocirculatory limitations identified in our study are somewhat consistent with those in Portella et al. (11). V̇O2peak is included in both studies, whereas V̇O2/WR slope was only relevant in our model and V̇E slope as well as heart rate slope only in theirs.
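To make the methodological contrast concrete, averaging a CPET signal over 30-s intervals can be sketched as below. This is only an illustration of interval averaging in general, not the implementation of Portella et al. (11); the function name, the sample times, and the V̇O2 values are assumptions.

```python
def interval_means(times_s, values, width_s=30.0):
    """Average samples that fall into consecutive fixed-width time bins.

    Bins without any sample are skipped rather than filled.
    """
    bins = {}
    for t, v in zip(times_s, values):
        bins.setdefault(int(t // width_s), []).append(v)
    return [sum(vs) / len(vs) for _, vs in sorted(bins.items())]


# Hypothetical oxygen uptake samples (L/min) at irregular breath times (s).
times = [5, 15, 25, 35, 50, 65]
vo2 = [0.5, 0.7, 0.9, 1.0, 1.2, 1.5]
print(interval_means(times, vo2))
```

Interval averages preserve the time course of a signal, whereas precalculated parameters such as slopes or peak values condense the whole test into single numbers, which is why the two feature sets are not directly comparable.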

Taken together, Inbar et al. (10) and Portella et al. (11) demonstrated the potential of ML algorithms to differentiate not only between selected conditions but also between primary organ limitations, mirroring the work of experts in clinical practice. These are the first crucial layers in the process of developing an accurate and implementable algorithm to detect exercise limitations using CPET data.

In line with this, our findings add another layer showcasing the potential of ML to identify and rate the severity of pulmonary–vascular, mechanical–ventilatory, or cardiocirculatory exercise limitations. Moreover, our research shows the capability of ML to detect and quantify even combined exercise limitations within a patient.

Although all three studies highlight the promising prospects of ML together with CPET data in clinical practice, there is also consensus that a heterogeneous patient population encompassing exercise limitations in all four categories and various degrees of severity may be helpful in the advancement of such ML algorithms. Moreover, standardized expert ratings of limitation and severity would advance the development process.

Limitations

First, 29 data sets were excluded because of incompleteness before obtaining 200 valid data sets. This may suggest limited applicability in clinical practice. However, our analyses reduced the multitude of parameters to only a few, meaning that fewer patients will be excluded in the future, for example, because of a missing inspiratory capacity maneuver, as this is not part of our key parameters. On the other hand, 32 tests were excluded because of lack of compliance. Although insufficient compliance prevents the adequate determination of V̇O2peak, it is to be investigated whether our algorithm may also work with such data. Second, a larger sample size of patients and expert ratings will be beneficial to improve these models. This would yield a greater variance in the type and severity of limitations. Implementing an international database would facilitate this and may ultimately make the ML-based interpretation of CPET data ready for use in clinical practice. Finally, patients with unexplained dyspnea have been shown to be referred for CPET more often than patients with other symptoms, for example, fatigue or chest pain (42). This could have induced bias.

CONCLUSIONS

This study identified a few robust parameters crucial for identifying exercise limitations that are also considered important in clinical practice. Furthermore, we defined data-based cutoffs for relevant parameters that proved helpful for ML-based CPET interpretation. Combining these two aspects in the presented ML models allows for an automated interpretation, categorizing patients by the degree of impairment in the respective limitation category and facilitating clinical interpretation of CPET and decision making. These key aspects set our method apart from available decision trees and recommendations for CPET interpretation that are mostly based on expert experience. Finally, cross-category decision trees may be possible, improving real-life classification of patients. These findings may be generalizable to patients presenting at lung clinics in real-life practice, provided that CPET data are complete and valid. This study may help CPET become an even more frequently used instrument for assessing CRF and organ limitations in patients with cardiovascular or pulmonary entities.

We thank Dr. Johann-Jakob Schmid and Mirko Gadza of Schiller AG, Baar, for their support in soliciting and implementing the Innosuisse project.

This study was funded by Innosuisse – Swiss Innovation Agency (Project No. 28081.1). R. K. was funded by the Swiss National Science Foundation (Grant P2BSP3_191755). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

The authors declare no conflict of interest. The results of the study are presented clearly, honestly, and without fabrication, falsification, or inappropriate data manipulation. The results of the present study do not constitute endorsement by the American College of Sports Medicine.

F. S. and A.-K. B. wrote the original draft of the manuscript. A.-K. B. was responsible for data curation and screened the CPET data. M. N.-H. performed the statistical analyses, created the figures, and contributed to writing, review, and editing. V. R. planned and supervised the statistical analyses. A. S.-T., F. J. M., and A. H. were responsible for the conceptualization, supervision, rating of limitation categories, review, and editing. R. K. contributed to conceptualization, review, and editing. D. D. contributed the scale for the rating of organ limitations, review, and editing. A. S.-T. was responsible for funding acquisition. All authors read and approved the final version of the manuscript.

Data availability: The data underlying this article will be shared on reasonable request to the corresponding author.

REFERENCES

1. Radtke T, Crook S, Kaltsakas G, et al. ERS statement on standardisation of cardiopulmonary exercise testing in chronic lung diseases. Eur Respir Rev. 2019;28(154):180101.

2. Balady GJ, Arena R, Sietsema K, et al. Clinician’s guide to cardiopulmonary exercise testing in adults. Circulation. 2010;122(2):191–225.

3. Kodama S, Saito K, Tanaka S, et al. Cardiorespiratory fitness as a quantitative predictor of all-cause mortality and cardiovascular events in healthy men and women: a meta-analysis. JAMA. 2009;301(19):2024–35.

4. Korpelainen R, Lamsa J, Kaikkonen KM, et al. Exercise capacity and mortality—a follow-up study of 3033 subjects referred to clinical exercise testing. Ann Med. 2016;48(5):359–66.

5. Wasserman K. The Dickinson W Richards lecture. New concepts in assessing cardiovascular function. Circulation. 1988;78(4):1060–71.

6. Guazzi M, Adams V, Conraads V, et al. EACPR/AHA Scientific Statement. Clinical recommendations for cardiopulmonary exercise testing data assessment in specific patient populations. Circulation. 2012;126(18):2261–74.

7. Andonian BJ, Hardy N, Bendelac A, Polys N, Kraus WE. Making cardiopulmonary exercise testing interpretable for clinicians. Curr Sports Med Rep. 2021;20(10):545–52.

8. Wasserman K. Diagnosing cardiovascular and lung pathophysiology from exercise gas exchange. Chest. 1997;112(4):1091–101.

9. Schmid A, Schilter D, Fengels I, et al. Design and validation of an interpretative strategy for cardiopulmonary exercise tests. Respirology. 2007;12(6):916–23.

10. Inbar O, Inbar O, Reuveny R, Segel MJ, Greenspan H, Scheinowitz M. A machine learning approach to the interpretation of cardiopulmonary exercise tests: development and validation. Pulm Med. 2021;2021:5516248.

11. Portella JJ, Andonian BJ, Brown D, et al. Using machine learning to identify organ system specific limitations to exercise via cardiopulmonary exercise testing. IEEE J Biomed Health Inform. 2022;8:4228–37.

12. Bjørgen S, Hoff J, Husby VS, et al. Aerobic high intensity one and two legs interval cycling in chronic obstructive pulmonary disease: the sum of the parts is greater than the whole. Eur J Appl Physiol. 2009;106(4):501–7.

13. Meyer FJ, Borst MM, Buschmann HC, et al. Exercise testing in respiratory medicine—DGP recommendations. Pneumologie. 2018;72(10):687–731.

14. Miller MR, Hankinson J, Brusasco V, et al. Standardisation of spirometry. Eur Respir J. 2005;26(2):319–38.

15. Criée CP, Baur X, Berdel D, et al. Leitlinie zur Spirometrie [Standardization of spirometry]. Pneumologie. 2015;69(03):147–64.

16. European Respiratory Society. Standardized lung function testing. Official statement of the European Respiratory Society. Eur Respir J Suppl. 1993;16:1–100.

17. Guenette JA, Chin RC, Cory JM, Webb KA, O’Donnell DE. Inspiratory capacity during exercise: measurement, analysis, and interpretation. Pulm Med. 2013;2013:956081–13.

18. Wasserman K, Hansen JE, Sue DY, Stringer W, Whipp BJ. Principles of Exercise Testing and Interpretation. 4th ed. Philadelphia (PA): Lippincott Williams and Wilkins; 2005.

19. Wagner J, Agostoni P, Arena R, et al. The role of gas exchange variables in cardiopulmonary exercise testing for risk stratification and management of heart failure with reduced ejection fraction. Am Heart J. 2018;202:116–26.

20. Belardinelli R, Lacalaprice F, Carle F, et al. Exercise-induced myocardial ischaemia detected by cardiopulmonary exercise testing. Eur Heart J. 2003;24(14):1304–13.

21. American Thoracic Society, American College of Chest Physicians. ATS/ACCP statement on cardiopulmonary exercise testing. Am J Respir Crit Care Med. 2003;167(2):211–77.

22. R Core Team. R: A Language and Environment for Statistical Computing. Vienna (Austria): R Foundation for Statistical Computing; 2021. Available from: https://www.R-project.org/.

23. Breiman L. Classification and Regression Trees. Belmont (CA): Wadsworth International Group; 1984.

24. Russell SJ. Artificial Intelligence: A Modern Approach. Cranbury (NJ): Pearson Education, Inc; 2010.

25. Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28(5):1–26.

26. Hothorn T, Hornik K, Zeileis A. Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat. 2006;15(3):651–74.

27. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.

28. Biau G, Scornet E. A random forest guided tour. Test. 2016;25(2):197–227.

29. Li R, Xia J, Zhang XI, et al. Associations of muscle mass and strength with all-cause mortality among US older adults. Med Sci Sports Exerc. 2018;50(3):458–67.

30. Weatherald J, Philipenko B, Montani D, Laveneziana P. Ventilatory efficiency in pulmonary vascular diseases. Eur Respir Rev. 2021;30(161):200214.

31. Sun XG, Hansen JE, Garatachea N, Storer TW, Wasserman K. Ventilatory efficiency during exercise in healthy subjects. Am J Respir Crit Care Med. 2002;166(11):1443–8.

32. Bard RL, Gillespie BW, Clarke NS, Egan TG, Nicklas JM. Determining the best ventilatory efficiency measure to predict mortality in patients with heart failure. J Heart Lung Transplant. 2006;25(5):589–95.

33. Dumitrescu D, Nagel C, Kovacs G, et al. Cardiopulmonary exercise testing for detecting pulmonary arterial hypertension in systemic sclerosis. Heart. 2017;103(10):774–82.

34. Arena R, Myers J, Aslam SS, Varughese EB, Peberdy MA. Peak VO2 and VE/VCO2 slope in patients with heart failure: a prognostic comparison. Am Heart J. 2004;147(2):354–60.

35. Chaouat A, Naeije R, Weitzenblum E. Pulmonary hypertension in COPD. Eur Respir J. 2008;32(5):1371–85.

36. Barron A, Francis DP, Mayet J, et al. Oxygen uptake efficiency slope and breathing reserve, not anaerobic threshold, discriminate between patients with cardiovascular disease over chronic obstructive pulmonary disease. JACC Heart Fail. 2016;4(4):252–61.

37. Laveneziana P, Di Paolo M, Palange P. The clinical value of cardiopulmonary exercise testing in the modern era. Eur Respir Rev. 2021;30(159):200187.

38. Bohannon RW. Grip strength: an indispensable biomarker for older adults. Clin Interv Aging. 2019;14:1681–91.

39. Phillips DB, Collins SÉ, Stickland MK. Measurement and interpretation of exercise ventilatory efficiency. Front Physiol. 2020;11:659.

40. Petek BJ, Tso JV, Churchill TW, et al. Normative cardiopulmonary exercise data for endurance athletes: the Cardiopulmonary Health and Endurance Exercise Registry (CHEER). Eur J Prev Cardiol. 2021;29(3):536–44.

41. Neder JA, Berton DC, Marillier M, Bernard AC, O’Donnell DE. Inspiratory constraints and ventilatory inefficiency are superior to breathing reserve in the assessment of exertional dyspnea in COPD. COPD. 2019;16(2):174–81.

42. Waraich S, Sietsema KE. Clinical cardiopulmonary exercise testing: patient and referral characteristics. J Cardiopulm Rehabil Prev. 2007;27(6):400–6.


