1. Introduction
Heart failure (HF) is a heterogeneous and life-threatening clinical syndrome encompassing symptoms such as breathlessness, excessive fatigue, and swollen ankles [
1]. HF has been considered a global pandemic estimated to affect more than 64 million individuals worldwide. Its prevalence has continuously increased over the past decades, posing a great threat to public health [
2,
3]. Traditionally, HF was defined as a pathophysiological condition in which cardiac structure or function abnormality results in increased intra-cardiac pressures or reduced cardiac output at rest or during activities [
4]. However, the traditional definition only mentioned the non-specific symptoms or signs in clinical practice but did not include objective indicators to support HF diagnosis [
4,
5]. In 2021, a universal consensus on the definition of HF was proposed, namely a clinical syndrome with symptoms and/or signs caused by structural and/or functional cardiac abnormalities and corroborated by elevated natriuretic peptide levels and/or objective evidence of pulmonary or systemic congestion by diagnostic modalities [
4], which was conceptually comprehensive and clinically practical with at least one objective detection indicator included. Notably, the elevated levels of natriuretic peptides, including the B-type natriuretic peptide (BNP) and N-terminal pro-B-type natriuretic peptide (NT-proBNP), were enrolled in this newly proposed definition of HF, which suggested the pivotal role of objective detective signs in HF recognition.
Biomarker-guided diagnosis and management have gained popularity in clinical application [
5], and the natriuretic peptide has become a golden standard widely used in HF diagnosis or prognostic evaluation [
6]. Nevertheless, the elevated concentrations of natriuretic peptides do not always meet the commands of diagnosing HF in clinics because it might be affected by numerous factors, such as renal failure, pulmonary embolism, and obesity [
7]. Previously, evidence has shown that a deterioration in renal function significantly elevates the concentrations and associations of natriuretic peptides, indicating that the clinical interpretation of increased natriuretic peptides should consider renal function [
8]. In the condition of acute pulmonary embolism, the concentrations of natriuretic peptides were also identified to increase as well as predict adverse prognostic outcomes [
9]. Additionally, it has been demonstrated that obese patients with HF show a state of decreased levels of natriuretic peptides. Thus, lower thresholds of natriuretic peptide levels should be made to meet the diagnosis of obese individuals with HF [
10]. Therefore, the application of natriuretic peptides remains limited in HF because of its vulnerability to other pathophysiologic conditions, indicating that exploring novel biomarkers that could provide additional diagnostic utility is urgently needed [
11]. Given the complex pathophysiology of HF, which ranges from cardiac dysfunction to extra-cardiac alteration, emerging HF biomarkers that tackle different pathophysiological processes could significantly benefit the risk stratification and management of patients with HF [
12]. In the past decades, two novel identified HF biomarkers, Galectin-3 (Gal-3) and tumorigenesis-2 (sST2), have yet to prove their utility in the clinics [
13]. Moreover, growing evidence suggested that the multi-biomarker panel strategy, combining novel biomarkers with well-established ones, showed both strong diagnostic and prognostic values compared to a single-biomarker approach [
5,
14]. Hence, we hypothesized investigating newer specific diagnostic biomarkers together with elucidating their regulated pathogenic pathways involved in HF, which could provide candidates for the multi-biomarker strategy, thus deepening our understanding of molecular alterations in the progression of HF.
Given the explosion of biomedical big data, there is growing research enthusiasm in utilizing bioinformatics approaches for mining high-throughput RNA sequencing (RNA-seq) or single-cell RNA-seq (scRNA-seq) data, which make us better understand the altered molecular features in HF at tissue or cell levels [
15]. With these rapidly developed sequencing techniques and computational analysis approaches, novel biomarkers and mechanisms could be rapidly identified rather than speckled through traditional hypothetical-driven research [
5,
15], bringing new insight into understanding the physiopathology of HF and developing the diagnosis or clinical management. Although several bioinformatics studies have already tackled exploring the underlying biomarkers and molecular mechanisms of HF [
16,
17,
18,
19], apparent limitations existed mainly due to the lack of comprehensive multi-datasets analysis from both tissue and single-cell views, rational speculation of the biomarkers-regulated pathways, and external experimental validation of biomarkers on testing patients’ samples. Considering these shortcomings in previous research, we intended to mine novel candidate biomarkers of HF as well as explicitly interpret their mechanisms (including upstream and downstream ways) that participated in HF via the in silico analysis on multiple RNA-seq datasets (including bulk RNA-seq and scRNA-seq) and in vitro validation on plasma collected from HF patients and control cases.
In this study, multiple transcriptional datasets derived from patients with HF and control, including four expression profiles (GSE141910, GSE57338, GSE42955, GSE135055) and one sc-RNA seq dataset (GSE121893), were acquired from the Gene Expression Omnibus (GEO) database. Training the cohort (GSE141910) for major analysis and test cohorts (GSE57338 and GSE42955) for validation were then processed and obtained, respectively. GSE135055 served as an external validation cohort. First, the differential expression genes (DEGs) between the HF and control were identified. Second, weighted gene co-expression network analysis (WGCNA) and multiscale embedded gene co-expression network analysis (MEGENA) complementing each other via different algorithms were performed on the DEG expression profiles to filter key gene modules. Subsequently, Disease Ontology (DO), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted on the overlapping genes between WGNCA-identified and MEGNA-identified modules, and genes significantly enriched in HF-related terms were retained. Fourth, the retained genes were subjected to a machine learning-based integration workflow and protein–protein interaction (PPI) network selection. Fifth, hub genes performed with robust diagnostic in the training and test cohorts were generated. Hub genes’ expression and correlation patterns were evaluated in the training and test cohorts. Sixth, sc-RNA seq analysis was conducted on GSE121893 to uncover the cell-specific expression patterns of hub genes. Seventh, together with gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), transcription factor (TF) prediction, and single-sample gene set enrichment analysis (ssGSEA), we systemically identified the pathways co-regulated by hub genes involved in HF. Then, we used the ssGSEA approach to quantify the infiltration levels of immune cells in HF to screen crucial immune cell types highly correlated to the hub genes. Eighth, the therapeutic drugs targeting hub genes and TFs were verified using molecular docking. Finally, quantitative reverse transcription PCR (RT-qPCR) was utilized to explore the relative mRNA expression levels of hub genes in plasma collected from patients with HF, thereby developing plasma biomarkers with HF. Based on large-scale genome-wide association studies (GWASs), a two-sample Mendelian randomization (MR) analysis was further performed to investigate the causal effect of hub genes on the risk of HF. The overall study flow chart is shown in
Figure S1.
4. Discussion
HF remains high in incidence and mortality worldwide, and a novel approach is urgently needed to improve this situation. In the past few decades, HF has been proven to be a heterogeneous and progressive clinical syndrome that is reflected in its complex pathophysiology [
6].
Herein, we performed a comprehensive bioinformatics analysis on bulk RNA-seq and sc-RNA seq datasets from HF patients and non-HF controls. We briefly outlined our main findings shown in a schematic diagram (
Figure 11). First, we identified 926 distinct DEGs (HF versus control), including 648 up-regulated genes and 278 down-regulated genes in HF. The KEGG pathway enrichment analysis then showed that DEGs were mainly involved in ECM and immune-related pathways (cytokine–cytokine receptor interaction, cell adhesion molecules, the T cell receptor signaling pathway, ECM–receptor interaction, complement and coagulation cascades). Impaired signaling through corresponding cytokine receptors accelerates myocardial apoptosis and tissue damage after acute cardiac stress, suggesting an adverse role of cytokine–cytokine receptor interactions in HF [
70]. Furthermore, cell adhesion molecules play a crucial role in modulating cardiac inflammation and pathological cardiac remodeling by facilitating the recruitment of T-cells to the left ventricle, ultimately contributing to cardiac dysfunction of HF [
71]. Excessive extracellular matrix remodeling has extensively been known to precipitate HF. Coagulation proteases orchestrate clotting-independent activation of protease-activated receptors (PARs) to modulate TGF-β1 signaling, influencing cardiac fibrosis in HF [
72]. Notably, a large body of evidence has demonstrated the abnormal ECM and immune-related functions or mechanisms altered in HF development and progression.
Gene co-expression analysis is commonly used to identify gene modules linked to disease traits [
73]. Next, we performed WGCNA and MEGENA gene co-expression analyses on DEGs. Leveraging the gene–pairwise correlation and agglomerative clustering, WGCNA allows the assignment of each gene to certain modules, which helps recognize the modules highly correlated to disease traits. We initially identified two crucial modules from WGCNA, including ME turquoise and ME blue modules. Of note, these two modules all displayed significant strong correlations to HF (R > 0.7,
p 74]. Therefore, more compact and functionally coherent modules could be screened out through MEGENA, and we recognized two large gene modules from MEGENA, including c1_2 and c1_6 modules. WGCNA and MEGENA could complement each other [
74] and comprehensively capture gene modules strongly related to HF rather than rely on a single method. Four gene modules were obtained, including two WGCNA-identified gene modules (ME turquoise and ME blue modules) and two MEGENA-identified gene modules (c1_2 and c1_6 modules). More interestingly, we obtained two large intersections between the four identified gene modules, which were 248 genes shared by the ME turquoise and c1_6 modules and 234 genes common to the ME blue and c1_2 modules. Although WGCNA and MEGENA successfully seized these two preserved co-expression gene patterns associated with HF, their biological significance of HF remained uncertain. Therefore, we implemented enrichment analysis to annotate human disease, gene functions, and pathways to determine whether the HF-related terms are over-enriched or under-enriched in the two overlapping gene sets. Here, we selected gene sets with more HF-representative terms to guarantee the algorithmic and biological capture of HF characteristics. The DO, GO, and KEGG enrichment analyses demonstrated that 248 genes were mainly involved in diverse cardiovascular disorders and HF-related biological functions or pathways. In comparison, fewer HF-related enriched terms were shown in 234 genes. Consequently, we retained 248 genes and performed feature selection via a machine learning-based integration pipeline under the LOOCV framework. A total of 134 feature genes were selected using the Enet (alpha = 0.1) model with the highest AUC values in the training and test cohorts. Together with the PPI network establishment and five ranking algorithms of Degree, MNC, MCC, EPC, and DMNC, we further obtained 10 pivotal genes, including
KIT,
SLC6A1,
SLC6A4,
COL14A1,
MME,
OGN,
SFRP4,
CD1C,
MFAP4,
HTR2B. Most of these pivotal genes’ activations have been reported to contribute to HF, as summarized in
Table S15. ROC curve analysis then identified four key genes (
COL14A1,
OGN,
SFRP4,
MFAP4) performed with robust AUC scores (AUC > 0.85) in both the training and test cohorts. The up-regulation patterns of four key genes in HF and their positive correlation patterns were observed within the training and test cohorts. Leveraging sc-RNA seq analysis, we uncovered the cellular composition of failed hearts, mainly including cardiomyocytes, endothelial cells, fibroblasts, smooth muscle cells, and macrophages. Moreover, distinct cell-specific expression patterns were revealed in which
OGN and
SFRP4 were mainly expressed in cardiac fibroblasts,
COL14A1 was mainly expressed in cardiac endothelial cells, and
MFAP4 was expressed in cardiomyocytes.
Of these four key genes, Collagen Type XIV Alpha 1 Chain (
COL14A1) has previously been reported to affect arterial remodeling [
75] and showed up-regulation mRNA and protein levels in the ischemic heart of HF patients [
76].
Col14a1 deficiency would contribute to defective ventricular morphogenesis and chaotic collagen fibril organization in mice, thereby indicating its maintenance of structural integrity during early heart development [
77]. Given the specific expression pattern of
COL14A1 in endothelial cells uncovered by sc-RNA analysis, we speculate that the close association of
COL14A1-activated endothelial cells and dysfunction in collagen constituents contributes to myocardial hypertrophy of HF progression. Osteoglycin (
OGN), which encoded a class III small leucine-rich proteoglycan (SLRP) member, was first discovered to act as a key regulator in rat left ventricular mass (LVM) through modulating the TGF-β pathway [
78]. It was also proposed that
OGN might be involved in the regulation of cardiac dysfunction and adverse remodeling after myocardial infarction in HF [
79]. Moreover,
OGN was reported to modulate fibrosis and inflammation, resulting in diastolic dysfunction shown in hypertensive heart disease [
80]. More recently, Fang et al. have revealed the specific mechanism of
Ogn in myocarditis mice by which its silencing suppressed the Wnt signaling pathway, thus inhibiting myocardial fibrosis proliferation [
81]. The close linkage between activated cardiac fibroblast and adverse outcomes following HF (fibrosis, remodeling, and dysfunction) has been extensively expounded [
82]. Therefore, it is reasonable to deduce the expression of
OGN active in cardiac fibroblasts in the context of HF, leading to myocardial fibrosis and remodeling post-HF. Microfibril-associated glycoprotein 4 (
MFAP4) has been proven to regulate calcium-dependent cell adhesion or intercellular interactions, leading to inflammation and fibrosis, which was previously verified to engage in remodeling-related diseases such as liver fibrosis, atherosclerosis, and arterial injury stimulated remodeling [
83,
84]. Apart from the close association between
MFAP4 and remodeling-related disorders, it was also shown that
MFAP4 could also serve as a candidate biomarker for cardiovascular diseases such as HF [
84]. For instance, clinical cohort-based research demonstrated that the protein levels of
MFAP4 were significantly increased in serum derived from HF patients compared to control cases [
85]. Mechanistically,
Mfap4 deletion could attenuate cardiac fibrosis and ventricular arrhythmias, suggesting its potential as a therapeutic target in HF prevention [
86]. Additionally, Dorn et al. showed that
Mfap4 knockout affected pressure overload-induced cardiac remodeling, leading to elevated cardiac hypertrophy and exacerbating cardiac function [
87]. According to our previous analysis,
MFAP4 was identified to be specifically expressed in cardiomyocytes, which largely implies its significant activation in directly exacerbating hypertrophy of cardiomyocytes in the condition of HF. Secreted frizzled-related protein 4 (
SFRP4) has been reported to be expressed in human ventricular myocardium and correlates with apoptosis-related gene expression [
88]. Zeng et al. showed that the increased
Sfrp4 in mice myocardial infarction model and knockout of
Sfrp4 could protect against myocardial ischemia and reperfusion injury through attenuating apoptosis of cardiomyocytes [
89]. Additionally, evidence from the rat heart ischemic model showed that
Sfrp4 recombinant protein injection could exert a cardio-protective effect and improve the cardiac function of the ischemic heart, which suggested the therapeutic potential of
SFRP4 in humans [
90]. Regarding the evident expression of
SFRP4 in cardiac fibroblasts of failed hearts, we mechanically speculated that elevated
SFRP4 affects myocardial fibrosis and remodeling post-HF. Ji et al. found that plasma
SFRP4 concentrations were increased in patients with coronary artery disease (CAD) and acted as an independent factor [
91]. Altogether, findings from prior experimental and clinical evidence have indicated the great potential of four hub genes in developing ECM fibrotic, immune, or cardiac remodeling-related biomarkers and therapeutic molecular targets of HF.
Subsequently, we integrated GSVA and GSEA algorithms to explore the shared pathways regulated by the four hub genes in HF. Three significantly activated pathways (including cell adhesion molecules, the chemokine signaling pathway, and cytokine–cytokine receptor interaction) were co-identified through the combination of GSVA and GSEA. In contrast, oxidative phosphorylation was identified as a common suppressed pathway. Notably, the induction of ECM remodeling, inflammatory cytokines, and chemokines have been proven to contribute to the pathogenesis of HF, such as adverse remodeling, myocardial injury, myocardial fibrosis, and dysfunction [
70,
71,
92,
93]. Moreover, it was shown that impaired mitochondrial respiratory and decreased oxidative phosphorylation were detected in the hearts of HF patients [
94]. More strikingly, we observed the same phenomenon through GSVA and GSEA algorithms: cell adhesion molecules, chemokine signaling pathway, and cytokine–cytokine receptor interaction were active, whereas oxidative phosphorylation was inhibited in HF compared to control. We noticed that four hub genes participated in diverse pathogenic pathways involving ECM, immune factors, and oxidative phosphorylation, suggesting the pathophysiologic heterogeneity of HF. Given the close interplays between these pathogenic pathways in HF progression, we speculated that the continuous induction of chemokines or cytokines stimulated the formation of inflammatory conditions in HF that promoted ECM deposition and impaired oxidative phosphorylation capability, thereby exerting an adverse effect on cardiac function (
Figure 11). Regarding the upstream regulatory mechanism that could activate the transcription of hub genes, the transcription factors were predicted using the TRRUST v2 database. After systemically exploring the correlation patterns between the up-regulated TFs in HF and the hub gene-regulated pathogenic pathways, we identified
BNC2 and
MEOX2 as upstream regulators to activate the transcription of hub genes and drive the pathogenic pathways. Basonuclin 2 (
BNC2), a zinc finger TF, was recently identified as a core TF essential for myofibroblastic activation in fibrosis, causing ECM deposition during fibrogenesis [
95]. Mesenchyme Homeobox 2 (
MEOX2) has previously been reported to regulate the proliferation, differentiation, and migration of vascular endothelial cells and cardiomyocytes [
96,
97].
BNC2 and
MEOX2 are closely linked with cardiac functions and may contribute to HF-related pathogenic pathways, which could be confident upstream regulators during HF progression. Taken together, we uncovered a comprehensive mechanism of hub genes involved in HF, which was a compact regulatory interaction of TFs–hub genes–pathways: The up-regulations of TFs (
BNC2 and
MEOX2) activate the transcriptions of four hub genes (
COL14A1,
OGN,
MFAP4,
SFRP4) and subsequently drive the activation or suppression of downstream signaling pathways (
Figure 11). The interplays between activated pathways related to immune factors and ECM and the suppressed oxidative phosphorylation pathway may further cause abnormal cardiac functions, thereby promoting the development and progression of HF.
Given that the immunity-related pathways involving chemokine and cytokine were observed in hub genes, we wondered which main type of immune cell altered in HF could regulate or participate in these pathways. Leveraging the ssGSEA algorithm, we quantified the infiltration levels of 28 types of immune cells based on the gene expression profiles. Combining with Lasso and RF models, we then obtained a total of 14 key immune cell types with HF (activated CD4+ T cell, activated CD8+ T cell, activated dendritic cell, CD56dim natural killer cell, MDSC, macrophage, natural killer T cell, natural killer cell, neutrophil, Type 17 T helper cell, Type 2 T helper cell, effector memory CD4+ T cell, memory B cell, and central memory CD8+ T cell). Together with expression and correlation analyses on the training and test cohorts, we finally suggested effector memory CD4+ T cell, which demonstrated both increased infiltrations in HF and positive relationships with hub genes, as an essential immune cell type that participated in the chemokine signaling pathway and cytokine–cytokine receptor interaction with hub genes. Preliminary evidence has shown that the infiltrations of effector memory CD4+ T cells caused cardiac fibroblast activation and subsequent fibrosis, influencing pressure-overload-induced HF [
98]. This result motivated us to consider that the elevated infiltrations of effector memory CD4+ T cells could activate the chemokine signaling pathway and cytokine–cytokine receptor interaction—potentially by the recruitment of chemokines and cytokines.
Considering that four hub genes (
COL14A1,
OGN,
MFAP4, and
SFRP4) and two key TFs (
BNC2 and
MEOX2) may serve as potential therapeutic targets of HF, several small-molecule agents targeting hub genes were predicted. Then, molecular docking showed that the close binding patterns between ligands (small-molecule agents) and receptors (hub genes or TFs), including captopril-targeting
BNC2, aldosterone-targeting
MEOX2, cyclopenthiazide-targeting
MEOX2, estradiol-targeting
COL14A1, tolazoline-targeting
COL14A1, and genistein-targeting
SFRP4. We noticed that the estimated RMSE from the docked model was lower than 2 Å, indicating the high credibility of molecular docking. Furthermore, we collected the high-quality structures of receptors and ligands, which largely guarantee the accuracy of molecular docking. Using the molecular docking technique, we illustrated the treatment potential of these TFs and genes. Although the effectiveness of these molecules as drug targets in curing HF was demonstrated from an in-silico perspective, further animal studies and clinical trials are needed. Notably, these small-molecule agents were previously reported to cure or alleviate HF symptoms [
64,
65,
66,
67,
68,
69]. Therefore, the identified small-molecule agents may reverse the over-expression levels of two hub genes (
COL14A1,
SFRP4) and two TFs (
BNC2,
MEOX2) in HF, thus inhibiting the regulatory network of TFs–hub genes–pathways and then alleviating HF progression.
In the previous analysis, we illustrated the great potential of four hub genes as molecular diagnostic biomarkers and therapeutic targets for HF. To further investigate whether the four hub genes could serve as extra plasma biomarkers of HF for clinic applications, we collected six whole blood samples derived from three HF patients and three control cases. Then, the qRT-PCR results showed that
OGN was highly expressed in the plasma of HF patients compared to the control cases. Additionally,
OGN demonstrated robust diagnostic accuracy in HF recognition and a significantly positive correlation to NT-proBNP concentration. The elevated pattern and robust diagnostic power of plasma
OGN were subsequently verified in an external cohort, including 21 HF and nine control cases. Some clinical factors, particularly gender and age, have been reported to cause differences in the detection of HF biomarkers [
99]. For instance, gender-related differences in Cardiac Troponin (cTn) values have been evident in patients with HF [
100] with higher values commonly shown in males. However, most clinicians barely consider these factors when using these biomarkers [
100]. Thus, more effort should be put into investigating the interference on diagnostic values of biomarkers, considering the underlying differences brought by the confounding factors. Herein, we deduced that the successful clinically used biomarker should be stably detectable for reflecting disease rather than being disturbed via different characteristics of patients. Accordingly, we wondered whether the expression levels of
OGN varied in different clinical characteristics, mainly including HF type, gender, and age. Notably, no significant expressions of
OGN differed in groups of these clinical factors. This result suggests that
OGN may serve as a stable biomarker for HF detection with limited affection brought by the confounding factors such as gender and age. Given the promising application of plasma OGN in discerning HF, we also studied the causal association of plasma
OGN and the risk of HF from the genetic insight using GWAS datasets. We used the SNPs as IVs for a two-sample MR study based on the exposure (plasma
OGN) and outcome (HF) GWAS data. As a result, we found that plasma
OGN was significantly causally related to the increased risk of HF with an overall OR estimated larger than one measured by five MR approaches. Strikingly, the genetic variant in
OGN was proven to promote HF occurrence on the large-scale GWAS data, which largely sustained the great potential of
OGN as an HF biomarker. Thus, these results demonstrated that
OGN could serve as a plasma detective biomarker that is more indicative of HF patients and HF progression.
Nevertheless, some limitations remain in our present study. First, these findings were mainly generated from an in silico analysis of transcriptional datasets and a need for in-vitro confirmative studies. We are pushing ahead with a larger-scale investigation by integrating more in silico and in vitro datasets related to HF. However, it remained an effective means to rapidly explore the candidate diagnostic biomarkers and underlying pathogenic mechanisms of HF, thus providing meaningful and constructive references for understanding HF. Second, we detected the up-regulated OGN expressions in the plasma of three HF patients compared to three control cases. The sample sizes were relatively limited. Further validation should be conducted within a prospective and larger multi-center collaboration, which guarantees the credibility of OGN in diagnostic applications. Given our understanding of the biomarkers-regulated pathogenic mechanisms in HF, more in-depth molecular studies, such as genetic knockout and pharmacological exploration on cell or animal models, are needed to substantiate our work.