Background
Asthma is one of the most common chronic respiratory diseases, affecting 300 million people worldwide and imposing a significant global socioeconomic burden [1]. Yet ‘asthma’ is a vague term that describes a collection of clinical symptoms with reversible airflow limitation or bronchial hyperresponsiveness [2]. It is currently considered an umbrella diagnosis for several diseases, encompassing multiple subgroups with distinct mechanisms, now termed phenotypes [3]. The heterogeneity among phenotypes is reflected in patient-specific differences in natural history, risk factors, disease severity and response to therapy [4]. Several important asthma phenotypes based on combinations of clinical characteristics have been proposed, such as allergic asthma, early-onset asthma, elderly asthma, obese asthma, occupational asthma, aspirin-sensitive asthma and neuropsychological asthma [5, 6]. These clinical phenotype classifications are a first step toward understanding the heterogeneity of asthma and have significant implications for clinical practice.
However, phenotypes are based on observable characteristics, which are the downstream results of genetics and environment. They do not necessarily reflect a unified molecular and cellular mechanism of the underlying disease [7]. Endotypes, on the other hand, are subgroups defined by distinct pathophysiological mechanisms. Knowing the endotype allows treatment to be targeted at the specific pathways that may be disrupted within a given subgroup. This is especially important because asthma responds to drugs with varying efficacy owing to differing underlying mechanisms [7]. The shift from phenotype to endotype is an advance from a clinical to a molecular approach, reflecting a deeper understanding of the heterogeneity of asthma. In addition, the theoretical basis of endotyping corresponds to the current concept of individualized precision therapy [8], which will promote the development of personalized treatment for asthma.
With the development of microarrays, high-throughput sequencing and other omics approaches, a great opportunity has emerged to further understand the molecular subgroups (endotypes) of asthma. Woodruff et al. identified two asthma phenotypes based on the expression of Th2-related genes in bronchial epithelial brushings using microarrays [9]. Baines et al. defined three transcriptional asthma phenotypes using unbiased hierarchical clustering [10]. Furthermore, Fitzpatrick et al. reported classifications of severe asthma in children using protein arrays [11]. Similarly, Hastie et al. analyzed asthma severity phenotypes based on proteomic profiles of induced sputum [12]. These studies demonstrated the vital role of omics approaches in the study of disease heterogeneity and mechanisms. They have significantly enriched and expanded the study of asthma heterogeneity, with important implications for the future clinical practice of asthma.
Although several subgroups or gene signatures of asthma have been identified with omics approaches, few studies have used unsupervised consensus clustering to identify asthma subgroups based on the transcriptional profile of induced sputum (currently the best available noninvasive sample for assessing asthma airway inflammation). In the present study, we hypothesized that molecular subgroups of asthmatics could be defined according to the gene expression patterns of induced sputum samples. We therefore categorized the asthmatics into subgroups based on differences (or similarities) in their transcriptional profiles. We then characterized the candidate subgroups by analyzing their clinical features, biological functions, immune status and hub genes, aiming to identify molecular subgroups related to endogenous mechanisms and to inform the individualized management of asthma.
Discussion
In the present study, we analyzed transcriptional profiles and classified the asthma cases into two molecular subgroups using unsupervised consensus clustering, and validated the classification in an independent dataset. The transcriptional classification revealed subgroup-specific clinical characteristics, biological functions and immune status, and the two molecular subgroups were significantly associated with asthma airway inflammation. Furthermore, WGCNA was applied to determine the key gene modules and hub genes of the identified clusters. ROC analysis showed that the hub genes can effectively distinguish the two identified clusters. This study highlights the heterogeneity of asthma at the transcriptional level and has implications for mechanistic research and disease management.
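As an illustration of what the ROC analysis measures, the area under the ROC curve (AUC) for a single gene equals the probability that a randomly chosen sample from one cluster has a higher expression value than a randomly chosen sample from the other. The sketch below computes it by pairwise comparison. The expression values are hypothetical and are not taken from the study, and the function name is ours.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical expression values of one hub gene (illustrative only).
cluster_1 = [7.1, 6.8, 7.5, 6.2]
cluster_2 = [5.9, 6.0, 6.5, 5.5]

auc = roc_auc(cluster_1, cluster_2)
print(auc)  # 15 of the 16 pairs are ordered correctly -> 0.9375
```

An AUC of 0.5 would mean the gene carries no information about cluster membership, while values close to 1 indicate near-perfect separation.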
Differential gene expression analysis identified 148 up-regulated genes and 14 down-regulated genes in Cluster I compared with Cluster II, indicating different gene expression patterns between the two subgroups. In GO enrichment analysis, the DEGs were mainly enriched in terms related to immune response regulation and signal transduction, suggesting that these biological processes may differ between the two identified clusters. The subsequent ssGSEA confirmed the results of the biological function analyses. Compared with Cluster II, Cluster I had higher levels of immune infiltration, including eosinophils, Th2 cells and mast cells. These cells are major effector cells in Th2 or eosinophilic inflammation [25]. The difference in immune infiltration could explain the tight association between Cluster I and eosinophilic inflammation. Apart from immune cells related to eosinophilic inflammation, Cluster II also showed a low degree of infiltration of other immune cells, such as activated dendritic cells, natural killer T cells and immature B cells. As for immune processes, Cluster II showed decreased scores in several of them, such as APC, type II IFN response and CCR. Overall, the ssGSEA scores for immune cell infiltration and immune processes tended to be lower in Cluster II, indicating that the immunoreactivity of Cluster II may not be as high as that of Cluster I. In our study, the proportion of PGA was higher in Cluster II, and a significant association between the two was detected. Previous studies have indicated that PGA is most likely to represent a “benign” phenotype of asthma [26]. It may display low-grade airway and systemic inflammation [27]. The “benign” traits and low degree of inflammation of PGA may partly explain the low immune scores of Cluster II.
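To make the GO enrichment step mentioned above more concrete, over-representation of a gene set among DEGs is commonly assessed with an upper-tail hypergeometric test. The sketch below is a generic illustration of that test and is not necessarily the procedure or software used in this study. Only the DEG count (148 + 14 = 162) comes from the text; the background size, term size and overlap are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background universe
    n_annotated:  background genes annotated to the term
    n_selected:   genes in the DEG list
    n_overlap:    DEGs annotated to the term
    """
    total = comb(n_background, n_selected)
    upper = min(n_annotated, n_selected)
    tail = sum(
        comb(n_annotated, x) * comb(n_background - n_annotated, n_selected - x)
        for x in range(n_overlap, upper + 1)
    )
    return tail / total


# Small case that can be checked by hand: (6*6 + 4*1) / 120 = 1/3.
p_small = enrichment_p_value(10, 4, 3, 2)

# 162 DEGs as reported above; the other three numbers are hypothetical.
p_term = enrichment_p_value(20000, 500, 162, 20)
print(p_small, p_term)
```

In practice the p-values for all tested terms would also be corrected for multiple testing before any term is called enriched.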
In our study, THBS1, CCL22 and CCR7 were identified as hub genes based on the combined analyses of WGCNA, PPI and gene expression analysis. THBS1 is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. It plays an important role in thrombus formation. It is secreted by platelets, macrophages, mononuclear cells, vascular muscle cells, fibroblasts and endothelial cells following the onset of inflammation [28]. Previous studies have found that platelet activation is a significant determinant of the severity of allergic asthma and is positively associated with eosinophil activation [29]. Activated platelets can induce pulmonary inflammation and enhance the Th2 immune response by releasing the platelet δ, α and λ granules [30], and THBS1 has been shown to be a vital marker of platelet activation. Therefore, the association between Th2 inflammation in asthma and THBS1 may be mediated by platelet activation. In addition, THBS1 can induce macrophage chemotaxis and a proinflammatory response [31]. Its direct role in the inflammatory response remains to be clarified.
CCL22 is a chemokine for several immune cells, including monocytes, dendritic cells, natural killer cells and activated T lymphocytes. It plays a role in the trafficking of activated T lymphocytes to inflammatory sites [32]. Yamamoto et al. reported that CCL22 could induce selective migration of Th2 but not Th1 cells by binding to the chemokine receptor CCR4, which is preferentially expressed by Th2 cells [33]. This mechanism could explain why Th2 cells migrate to asthmatic airways: T cells in the bronchial mucosa or bronchoalveolar lavage fluid (BALF) of patients with allergic asthma express CCR4, and the levels of CCL22, but not of Th1-selective chemokines, in BALF increase upon allergen challenge to the lung [34–37]. Apart from migration, Hirata et al. found that CCL22 could accelerate the differentiation of helper T cells into Th2 cells and augment the proliferation of differentiating Th2 cells, which may potentiate the Th2 immune response and contribute to eosinophilic airway inflammation [37].
CCR7 is one of the most important chemokine receptors for adaptive immune cell migration. It is mainly expressed in lymphoid tissues and several immune cells. CCR7 and its ligands CCL19 and CCL21 regulate the migration of T cells and DCs to areas of lymph nodes where T cell priming and initiation of the adaptive immune response occur [38–40]. Wang et al. found that the binding of CCR7 expressed on eosinophils to CCL19 was an important chemotactic signal that triggers the trafficking of airway eosinophils from the airway lumen into lung-draining paratracheal lymph nodes in a mouse model of allergic asthma. Another inflammatory mediator, leukotriene C4, was highly involved in the process [41]. Mozza et al. found that, compared with non-allergic asthmatic patients, the percentages of CCR7+ memory CD4+ T cells were significantly higher in allergic asthma, which is characterized by elevated levels of Th2 cytokines and eosinophilic inflammation [42]. Moreover, the proportion of CCR7+ memory CD4+ T cells was negatively correlated with pulmonary function test results and significantly associated with disease severity scores and IgE levels, indicating significant clinical implications in asthma and eosinophilic inflammation [43].
Compared with the original study of GSE45111 [13], our study differed in many respects. Firstly, the purpose of the original study was to identify gene signatures or biomarkers that could discriminate asthma inflammatory phenotypes to assist asthma management. Its authors focused mainly on the genes that were differentially expressed between the three asthma inflammatory phenotypes and on their diagnostic value for discriminating those phenotypes. In contrast, we aimed to investigate the heterogeneity of asthma at the gene expression level and to identify molecular subtypes of asthma based on transcriptional profiles. The clinical features, biological functions, immune status and hub genes of the molecular subtypes were also investigated. Moreover, the methods differed between the two studies. The original study used logistic regression and ROC analysis to test and evaluate the performance of the gene biomarkers, and did not perform bioinformatic analysis of the gene expression profiles. In our study, comprehensive bioinformatic analyses, such as enrichment analysis, WGCNA and ssGSEA, were used to characterize the identified clusters. Therefore, the two studies differ substantially in both aim and method.
It should be noted that our study is a re-analysis of the GSE45111 dataset. Although Baines et al. also performed clustering analysis on this dataset [10], our study differed from theirs. Firstly, the clustering methods were different. Baines et al. used hierarchical clustering to analyze the microarray data, whereas we used consensus clustering. Microarray datasets usually combine a relatively small sample size with high-dimensional gene expression data, which makes clustering results especially sensitive to noise and susceptible to over-fitting [43]. Hierarchical clustering on its own copes poorly with the noise and high dimensionality of microarray gene expression data. Compared with hierarchical clustering, consensus clustering improves the robustness and quality of clustering analysis of gene expression datasets [44]. In addition, the identified subgroups were not validated in the study by Baines et al. In our study, we applied different methods, including PCA and t-SNE, to test the stability of the identified subgroups. We further used another dataset to validate the clustering results. Therefore, from a methodological perspective, our results are stable. Furthermore, Baines et al. focused on the clinical features of the identified clusters and provided more clinical information, while our study went into more depth on bioinformatic analysis. For example, we performed KEGG and ssGSEA analyses to characterize the biological function and immune status of the identified subgroups. WGCNA and PPI were used to identify gene modules and hub genes associated with airway inflammation types. The results support the molecular heterogeneity of asthma and provide potential targets and a framework for investigating the molecular mechanisms of asthma. These analyses were not performed in the study by Baines et al. In short, the two studies focused on different aspects and provided different implications for future research.
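The resampling idea behind consensus clustering, discussed above, can be sketched as follows: the samples are repeatedly subsampled, each subsample is clustered, and for every pair of samples we record how often the two fall into the same cluster when both are drawn. This is a minimal illustration under stated assumptions, not the pipeline used in this study. The base clusterer (a small k-means with farthest-first initialisation), the toy two-dimensional data and all function names are ours.

```python
import random
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_labels(points, k, iters=20):
    """Toy k-means; farthest-first initialisation keeps it deterministic."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k, n_resamples=50, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of samples co-clusters."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), int(frac * n))
        labels = kmeans_labels([points[i] for i in idx], k)
        for (a, la), (b, lb) in combinations(list(zip(idx, labels)), 2):
            sampled[a][b] += 1
            sampled[b][a] += 1
            if la == lb:
                together[a][b] += 1
                together[b][a] += 1
    # Pairs never drawn together (and the diagonal) are reported as 0.0.
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


# Hypothetical, well-separated toy data: samples 0-4 versus samples 5-9.
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.2),
       (5.0, 5.1), (5.2, 4.9), (5.1, 5.3), (4.8, 5.0), (5.3, 5.2)]
consensus = consensus_matrix(toy, k=2)
print(consensus[0][1], consensus[0][5])
```

Consensus values close to 0 or 1 for most pairs indicate a partition that is stable under resampling, whereas many intermediate values suggest that the chosen number of clusters is not well supported by the data.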
The present study had several limitations. Firstly, we identified the subgroups of asthma based upon the gene expression profiles in stable adult asthmatics. Therefore, whether it could be applied to patients with exacerbation or children still requires further investigation. Secondly, more important clinical characteristics of the asthma molecular subgroups, such as treatment response or exacerbation risk, could not be investigated due to data limitation. Thirdly, the different expression patterns in our subgroups still need to be prospectively validated in other populations.
In summary, the two clusters identified from the transcriptional profiles showed different clinical characteristics, gene expression patterns, biological functions and immune status. One of them (Cluster I) showed a tight association with EA, which may have significant implications for individualized asthma management. The three hub genes, THBS1, CCL22 and CCR7, are likely to play an essential role in Cluster I and may prove to be potential therapeutic targets. Our study supports the molecular heterogeneity of asthma and may provide a framework for more in-depth study of its mechanisms.