Introduction
Community-acquired pneumonia (CAP) is the leading cause of death among children under five years of age globally, with 16.4 million hospitalizations every year [1, 2]. In China, a total of 1.42 million cases were reported as having one or more episodes of CAP, resulting in 1.48 million CAP episodes [3]. Approximately 8–20% of children hospitalized with CAP progress to severe disease, and many of these, especially infants, require admission to the pediatric intensive care unit (PICU) [1]. These severe cases require advanced interventions, such as invasive and non-invasive mechanical support, to reduce mortality.
Diagnosis of pediatric CAP is often difficult because of the poor quality of the clinical evidence, such as atypical imaging findings, complex clinical indicators, and poor prognostic signs [4, 5]. Failure to provide timely diagnosis and treatment may lead to acid-base imbalance, which can cause multiple organ failure and even septic shock in critically ill children. Thus, it is essential to develop new methods for the early identification of cases that are likely to become clinically severe. In addition, disease progression of CAP is a complex, multi-system process, and its underlying molecular mechanisms remain unclear. Changes in systemic responses may be caused by a complex set of factors including pathogens, genetic predisposition, and immune response. As a result, these factors may alter proteins and the downstream metabolites involved in disease progression [6]. Therefore, it is important to determine whether host-derived proteins and metabolites in the circulation are connected to the pathogenesis and progression of severe CAP.
Recent multi-omics studies have aimed to identify biomarkers and understand the complex systemic changes that contribute to pathogenesis. Serum is a major carrier of small molecules whose relative abundances can provide valuable insights into disease pathogenesis [7, 8]. Previous studies have used serum proteins and/or metabolites to distinguish patients with infectious diseases from healthy controls. For example, one study identified a set of proteins able to accurately distinguish and predict COVID-19 outcomes [9], while in another study, metabolomics was combined with a random forest-based classification model to identify potential biomarkers for diagnosis of Mycoplasma pneumoniae pneumonia [10]. For CAP, metabolomics has been used to distinguish CAP patients from healthy individuals and to identify metabolite signatures that correlate with disease severity [11]. Moreover, plasma lipidomics was also found to be useful in predicting 90-day mortality in bacterial CAP [12]. Currently, it is unclear which protein or metabolic pathways are involved in CAP progression or what their combined roles are, especially in children. Thus, an integrated analysis of the proteome and metabolome may provide new avenues for understanding severe CAP.
Here, we used proteomics and metabolomics to profile the host response in CAP serum samples in a training cohort containing severe CAPs (S-CAPs), non-severe CAPs (NS-CAPs) and healthy controls (CONs). Our study uncovered several host proteins and metabolites that were altered in CAP. To identify potential biomarkers, we developed a machine learning-based pipeline that identified a combination of biomarkers that could accurately distinguish S-CAPs from controls. These selected biomarkers and combinations were then validated using enzyme-linked immunosorbent assay (ELISA) and metabolomics in a second validation cohort. Finally, the proteomics and metabolomics data generated in this study provided a global overview of the molecular changes, which may provide useful insight into the development of new therapeutics for treatment of CAP.
Material and methods
Ethical approval
The studies involving human participants were reviewed and approved by the Ethical Committee of the Capital Institute of Pediatrics (Ethical approval number: SHERLLM2019001). Written informed consent to participate in this study was provided by the participants’ legal guardian/next of kin.
Patient enrollment
S-CAP patients were recruited from the PICU department of the Capital Institute of Pediatrics between 26 December 2021 and 8 March 2022. NS-CAP cases were enrolled from the respiratory department during the same period. CONs were recruited from children who underwent a health checkup at the Capital Institute of Pediatrics. This study was approved by the Capital Institute of Pediatrics Ethics Committee.
Diagnosis of pediatric CAP was performed in accordance with the Chinese Medical Association guidelines as follows: age younger than 18 years; symptom onset in the community; clinical signs of pneumonia (fever; tachypnea; increased respiratory work during examination; or auscultatory findings consistent with CAP); and pulmonary infiltration on the chest radiograph [13]. Cases were classified as severe if they required ICU treatment and positive pressure ventilation [14]. Among them, one patient had septic shock requiring vasopressors. Patient characteristics and pathogen types are provided in Additional file 1.
Evaluation of clinical characteristics and markers
Clinical information was retrospectively collected from the medical records of patients. This included the proportions of blood cells [neutrophils (Neu), lymphocytes (Lym), monocytes (Mon)], white blood cells (WBC), procalcitonin (PCT), prothrombin time (PT), international normalized ratio (INR), activated partial thromboplastin time (APTT), fibrinogen (FIB), fibrinogen degradation product (FDP) and thrombin time (TT). Non-invasive ventilation, invasive ventilation, days of hospitalization, ICU admission, and pediatric critical illness score (PCIS) were also assessed at hospital discharge.
Proteomic analysis
Serum samples from cohort 1 were used for proteomics analysis (Additional file 1) as previously described [8, 15]. Briefly, each sample was lysed with 100 μL lysis buffer (8 M urea in 100 mM triethylammonium bicarbonate, TEAB) at 25 °C for 30 min. The lysates were reduced with 5 mM tris(2-carboxyethyl)phosphine (Pierce, Rockford, IL, USA) and incubated at 37 °C for 30 min with shaking (300 rpm). Next, samples were alkylated with 15 mM iodoacetamide (Sigma-Aldrich, St. Louis, MO, USA) and digested overnight at 37 °C with mass spectrometry-grade Trypsin Gold (Promega, Madison, WI, USA) at an enzyme-to-protein ratio of 1:50. The dried peptides were dissolved in 20 μL loading buffer (1% formic acid, FA; 1% acetonitrile, ACN). A 10 μL aliquot of each sample was analyzed by LC–MS/MS on an Orbitrap Fusion Lumos coupled to an Ultimate 3000 system (Thermo Fisher Scientific, Waltham, MA, USA) in data-dependent acquisition (DDA) mode. The samples were loaded and separated on a C18 trap column (3 μm, 0.10 × 20 mm).
For MS detection, the following parameters were used: full MS survey scans were performed in the ultra-high-field Orbitrap analyzer at a resolution of 120,000 and trap size of 500,000 ions over a mass range from 300 to 1400 m/z. MS/MS scans were detected in the ion trap, and the 20 most intense peptide ions with charge states 2 to 7 were subjected to fragmentation via higher-energy collisional dissociation (5 × 10³ AGC target, 35 ms maximum ion time). The resultant mass spectrometry data were analyzed using MaxQuant (version 2.1.0.0) and searched against the Homo sapiens FASTA database downloaded from UniProtKB (UP000005640.fasta). The following search parameters were used for MaxQuant: precursor ion mass tolerance was set at 20 ppm; full cleavage by trypsin was selected; a maximum of two missed cleavages was allowed; the static modification was carbamidomethylation of cysteine, and the variable modifications were oxidation of methionine and acetylation of peptide N-termini. The remaining parameters followed the default MaxQuant setup. For protein identification, the following criteria were used: (1) peptide length ≥ 6 amino acids; (2) FDR ≤ 1% at the PSM, peptide and protein levels. Peptides were quantified using the peak area derived from their MS1 intensity and analyzed in Perseus.
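The precursor tolerance and identification criteria above can be illustrated with a minimal Python sketch. It is not part of the MaxQuant workflow, which applies these filters internally; the function names and the `q_value` argument are hypothetical and are introduced here only for illustration.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the precursor mass error is within the stated 20 ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def passes_identification(peptide, q_value, min_length=6, max_fdr=0.01):
    """Apply the two stated criteria: length >= 6 residues and FDR <= 1%.

    `q_value` is a hypothetical per-identification FDR estimate.
    """
    return len(peptide) >= min_length and q_value <= max_fdr
```

For a theoretical m/z of 500.0, a 20 ppm window corresponds to roughly ±0.01 m/z, so an observed value of 500.005 would be accepted and 500.02 rejected.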
Enzyme-linked immunosorbent assay (ELISA)
ELISA was used to quantify the concentrations of selected serum proteins. Samples from cohort 2 were used for ELISA verification. Adiponectin (ADIPOQ), apolipoprotein C-I (APOC1), vitamin K-dependent protein C (PROC), angiotensinogen (AGT), fibronectin (FN1), histidine-rich glycoprotein (HRG), albumin (ALB), C-reactive protein (CRP), and lipopolysaccharide-binding protein (LBP) ELISA kits (Inselisa) were used to measure protein changes in serum from participants in the training (cohort 1) and testing (cohort 2) datasets. ELISAs were performed according to each kit’s instructions.
Metabolomic analysis
All serum samples (Additional file 1) were used for metabolomics analysis as described previously [8, 15]. Quality control (QC) samples were prepared by mixing equal amounts of all samples to ensure data quality for metabolic profiling. Samples (100 μL) were extracted with 400 μL of MeOH/ACN (1:1, v/v) solvent mixture, and then incubated and centrifuged for 10 min at 13,500 × g at 4 °C. Next, the supernatant was divided into three fractions: two for reverse-phase/ultra-performance liquid chromatography (RP/UPLC)-MS/MS methods with positive-ion mode electrospray ionization (ESI) and negative-ion mode ESI, and one for hydrophilic interaction liquid chromatography (HILIC)/UPLC-MS/MS with positive-ion mode ESI.
All UPLC-MS/MS methods used the ACQUITY 2D UPLC system (Waters, Milford, MA, USA) with a Q-Exactive Quadrupole-Orbitrap (QE; Thermo Fisher Scientific, San Jose, USA) and a TripleTOF 5600+ (AB SCIEX, MA, USA), each equipped with an ESI source and mass analyzer. In the reverse-phase positive-ion method, the QE was operated under positive ESI mode coupled with a C18 column (UPLC BEH C18, 2.1 × 100 mm, 1.7 μm; Waters). The mobile phases used in the gradient elution were water and methanol containing 0.1% FA. When the QE was operated under negative ESI mode, the UPLC method used a C18 column eluted with mobile phases containing methanol and water in 6.5 mM ammonium bicarbonate at pH 8. The UPLC column used in the hydrophilic interaction method was a HILIC column (UPLC BEH Amide, 2.1 × 150 mm, 1.7 μm; Waters), and the mobile phases consisted of water and acetonitrile with 9 mM ammonium formate at pH 8.0; the TripleTOF 5600+ was operated under positive ESI mode. The mass spectrometry analysis alternated between MS and data-dependent MS2 scans. After raw data pre-processing, peak finding/alignment, and peak annotation in MS-DIAL, metabolites were identified by matching retention time, accurate mass, and MS/MS fragmentation data to the MS-DIAL database and online MS/MS libraries (Human Metabolome Database).
Statistical analysis
Statistical analysis of clinical data
Data were analyzed using SPSS 16.0 and expressed as mean ± SD. Differences between two groups were analyzed using Student’s t-test. Categorical data were analyzed using the chi-square test. The significance level was set at p < 0.05.
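As an illustration of the categorical comparison described above, the following minimal Python sketch computes a Pearson chi-square test for a 2 × 2 table. It is an assumption-laden sketch rather than a reproduction of the SPSS analysis: no continuity correction is applied, the function name is hypothetical, and the closed-form p-value used here is valid only for one degree of freedom.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). For 1 degree of freedom, the chi-square
    survival function equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

With hypothetical counts of 20 and 5 in one group versus 5 and 20 in the other, the statistic is 18.0 and the p-value falls well below the 0.05 significance level.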
Statistical analysis of multi-omics data
For each group pairing, the fold-change (FC) was calculated as the ratio of the group means (e.g., mean of S-CAP vs mean of CON). A two-sided unpaired Welch’s t test was used to identify significant differences between groups. Statistically significant differentially abundant proteins (DAPs) and differentially abundant metabolites (DAMs) were identified using the following criteria: FC > 1.5 or FC < 0.67, and p < 0.05. P-values were adjusted for false discovery rate (FDR) using the Benjamini–Hochberg procedure. Partial least squares-discriminant analysis (PLS-DA) was conducted using MetaboAnalyst 4.0 with unit variance scaling and validated by tenfold cross-validation.
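The differential abundance criteria above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, and the sketch stops at the Welch t statistic and its degrees of freedom, since the two-sided p-value requires a t-distribution routine from a statistics library (for example, `scipy.stats.ttest_ind` with `equal_var=False`).

```python
from statistics import mean, variance


def fold_change(case, control):
    """Ratio of group means (e.g. mean of S-CAP / mean of CON)."""
    return mean(case) / mean(control)


def welch_t(case, control):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(case), len(control)
    v1, v2 = variance(case) / n1, variance(control) / n2
    t = (mean(case) - mean(control)) / (v1 + v2) ** 0.5
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_differential(fc, p, fc_up=1.5, fc_down=0.67, alpha=0.05):
    """Apply the stated thresholds: FC > 1.5 or FC < 0.67, and p < 0.05."""
    return (fc > fc_up or fc < fc_down) and p < alpha
```

A feature with FC = 2.0 and p = 0.01 would be flagged as differentially abundant, whereas one with FC = 1.2 would not, regardless of its p-value.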
Volcano plots were created based on FC and t tests, and the intensity data of the features in these regions were used for GraphPad analysis and hierarchical clustering analysis. Cluster trend maps were generated with the Mfuzz R package (v.2.46.0) [16], which groups proteins with similar abundance patterns and thereby helps reveal the dynamic patterns of protein change. Bar plots for Gene Ontology (GO) enrichment were created in R 4.2.1. Heatmaps and signaling pathway analysis were performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, Small Molecule Pathway Database (SMPDB) and MetaboAnalyst 5.0. Connected networks were then visualized with STRING, a plug-in for Cytoscape (v.3.2.1).
Selection of biomarker candidates
For biomarker selection and verification, a receiver operating characteristic (ROC) analysis was performed and the predictive power of each protein and metabolite was ranked according to the ROC area under the curve (AUC) value. Next, five machine learning classifiers (logistic regression, random forest, linear support vector machine, K-nearest neighbor, and decision tree) were compared to determine the best diagnostic model, with tenfold cross-validation used to evaluate their accuracy and error rate. ROC curves were then applied to evaluate the accuracy of biomarker candidates in the validation set. Diagnostic parameters, including sensitivity and specificity, were also calculated.
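The ranking and evaluation steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: all function names and the example marker names are hypothetical, the classifiers themselves are omitted, and the fold generator assumes that samples have already been shuffled.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a case (label 1) scores above a control
    (label 0); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_features(table, labels):
    """Return (feature, AUC) pairs sorted by AUC, highest first."""
    aucs = {name: roc_auc(values, labels) for name, values in table.items()}
    return sorted(aucs.items(), key=lambda item: item[1], reverse=True)


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called cases."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def kfold_indices(n, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size
```

Note that ranking by raw AUC favors markers that are elevated in cases; a marker that is decreased in cases yields an AUC below 0.5 and would need its direction reversed before ranking.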
Discussion
Severe pediatric CAP is a critical threat to children’s health. Although bacterial and viral infections may lead to different results, both present with symptoms of pneumonia. Healthcare-associated pneumonia is no longer recognized as a distinct entity, but as a form of CAP, and there is increasing evidence of both bacteria and viruses as etiological agents of CAP. Due to the complexity and heterogeneity of the disease, diagnosis of CAP, especially severe CAP, remains a clinical challenge. Therefore, it is important to identify early biomarkers that can detect the severity of CAP. For this purpose, we applied proteomics and metabolomics to examine the serum protein and metabolite changes associated with severe CAP. To our knowledge, this is the first study to combine proteomic and metabolomic data obtained from children with CAP of differing disease severity. Our study identified 2 proteins (CRP, LBP) and 3 metabolites [Fasciculol C, PE (14:0/16:1(19Z)), PS (20:0/22:6(4Z, 7Z, 10Z, 13Z, 16Z, 19Z))], which are good candidates to distinguish severe CAP cases from non-severe CAP cases and controls. These candidates were further validated in an independent cohort.
In this study, the proteomics and metabolomics data generated also enabled a systematic analysis of the molecular pathology in CAP. Lung function is still developing in children, so age is likely to be an important factor affecting metabolism and morbidity. Therefore, we age-matched the cases and controls to minimize the influence of age on protein and metabolite abundance in each group. The significant DAPs were involved in essential biological processes such as cell death, the complement system, coagulation cascades, platelet function and metabolic dysregulation. Our results are consistent with previous findings that severe CAP cases are frequently associated with acute respiratory distress syndrome, sepsis, and multi-organ injury [34], which are pathophysiologically associated with activation of cell death pathways, intravascular coagulation and microthrombosis [34]. Our data revealed the molecular changes in CAP sera, which could potentially reflect the occurrence of cell damage in CAP. Here, we observed that severe CAP is often accompanied by tissue damage and inflammation. Higher expression of lysosome-related proteins, cytotoxicity-related proteins and phagosome-related proteins was observed in S-CAPs, suggesting that various cell death pathways contribute to the development of severe pneumonia. Lysosomes, which are found in pre-existing endolysosomes or autolysosomes, act as an important bridge between autophagy and endocytosis [35]. Thus, as important regulators of cell death, lysosomes, cytotoxicity proteins and the phagosome may be involved in exacerbating CAP, leading to the development of severe disease.
Our data also showed activation of the complement and inflammatory systems in CAPs. Here, multiple acute phase proteins such as CRP and complement-related proteins were upregulated in CAPs. It has been reported that CRP assists in activation of the complement system [36]. This induces the production of cytokines and chemokines, potentially resulting in a “cytokine storm” [36], and also recruits macrophages from the peripheral blood, which may lead to acute lung injury. Since ~50% of platelets are produced in the lungs [37], these platelets may help to aggravate lung injury and further induce cytokine storm. For example, C4BPB [18] and F11 [19], which are regulators of the complement system, were significantly decreased in S-CAP cases. PROC, which interacts with C4BP [20], was also downregulated in S-CAPs. Moreover, CFHR3 [21], CFHR4, CFHR5 [22] and CR2 were also decreased in S-CAP patients compared to CONs. Complement and coagulation, together with platelet dysfunction, act as the linchpin in events leading to thromboinflammation [17]. Declining platelet count has also been associated with poor outcomes in CAP patients [23]. Two of the most intriguing proteins downregulated in severe patients were vasodilator-stimulated phosphoprotein (VASP) and integrin alpha-IIb (ITGA2B). VASP is an actin regulatory protein implicated in platelet adhesion [24], while ITGA2B encodes αIIb and is an important gene associated with COVID-19-related stroke [25]. In addition, the expression of most complement proteins, coagulation cascade proteins and platelet-related proteins was negatively associated with disease severity. Interestingly, the levels of platelet-related proteins, such as collagen alpha-1(I) (COL1A1), ITGA2B, fermitin family homolog 3 (FERMT3), talin-1 (TLN1) and VASP, were positively correlated with TT levels and negatively correlated with FIB levels, both of which are essential clinical indexes. Additionally, levels of fibrinogen alpha (FGA) and fibrinogen beta (FGB) were positively correlated with FDP levels (clinical index). Increasing evidence indicates a potential cross-talk between complement factors and platelet activation, contributing to the pathophysiology of diseases and subsequent tissue remodeling processes [17]. Therefore, activation of the cell death pathway, the inflammatory response and a dysregulated complement, coagulation cascade and platelet function are predicted to cause tissue damage in children with CAP.
Cross-talk between glucose metabolism and nucleotide metabolism was observed in CAP cases. Nucleotides are the building blocks for DNA and RNA synthesis. Glucose metabolic pathways such as the pentose phosphate pathway (PPP) and tricarboxylic acid (TCA) cycle promote nucleotide formation by increasing the supply of glutamate and/or phosphoribosyl pyrophosphate (PRPP) [29, 38]. In this study, the levels of PRPP and glutamine were significantly upregulated in NS-CAPs and S-CAPs. Moreover, the nucleotide CMP and most deoxynucleotides (dAMP, dGMP, dCMP, dUMP) were also elevated in NS-CAPs and/or S-CAPs. One explanation for this “cross-talk” is increased DNA and RNA synthesis in CAP patients due to proliferation of immune cells, as nucleotides are required for replication [29]. Modulating nucleotide metabolism may also increase the host immune response against pathogen attack [29, 39]. Furthermore, increased nucleotides and deoxynucleotides in the serum suggest higher RNA turnover and DNA degradation, possibly due to apoptosis of host cells or immune cells. This is consistent with previous reports that RNA turnover and DNA degradation are increased in inflammatory diseases [40, 41]. The role of increased (deoxy)nucleotides in the pathogenesis of pneumonia requires further research; however, it is possible that higher levels of nucleotides lead to unbalanced deoxyribonucleotide pools which, in turn, contribute to the progression to severe CAP.
In addition to our findings of altered glucose and nucleotide metabolism in CAPs, we also uncovered dysregulated metabolites of lipid metabolism, which are important for regulation of signal transduction and immune activation processes. Previously, Ning et al. [11] suggested that sphingolipid metabolism was significantly affected in CAPs, and that lipid dysfunction was one of the potential pathological mechanisms. In another study on serum metabolites and lipid alterations in CAPs, sphingolipids were strongly correlated with respiratory function, the cardiovascular system and liver function [42]. Similarly, our data showed lower sphingolipid levels in both NS-CAP and S-CAP patients. In addition, dysregulated expression of APOM was reported to be associated with virus infection [7]. This is consistent with our finding that the levels of apolipoproteins, which are involved in the transport and redistribution of lipids, were significantly dysregulated in both NS-CAP and S-CAP patients. Moreover, it is known that pulmonary surfactant is a protein-lipid mixture secreted by type-II alveolar epithelial cells. Impaired surfactant function in the lung is thought to be an essential mechanism for pneumonia after pathogen infection. Thus, altered lipid metabolism in this study might have also been induced by surfactant metabolism dysfunction after pathogen infection. Furthermore, it has been reported that impaired pulmonary diffusing capacity in CAP affects oxygen transport and mitochondrial β-oxidation in children, especially young children. A previous study also reported that lipid catabolism can be improved by enhanced lipolytic and fatty acid β-oxidation pathways [43]. Thus, we hypothesize that in CAP, impaired pulmonary diffusing capacity and the resulting lack of adequate oxygen alter lipid metabolism and anaerobic pathways, as well as mitochondrial β-oxidation pathways. Together, these data indicate that dysregulated lipid metabolism is involved in the pathological mechanism of CAP disease progression.
There are still some limitations to this study that need to be considered. Although our samples were age-matched, there may still be other genetic, clinical or environmental confounding factors, such as pathogen type, that were not detected or controlled for. Furthermore, although our results were verified using an independent cohort, further verification using larger sample sizes is still needed.
In conclusion, this study provides a systematic proteomic and metabolomic investigation of serum samples taken from severe and mild CAP patients as well as control groups. We demonstrated the potential of a panel of serum proteins and metabolites that can identify CAP cases which may progress into severe pneumonia. Although we successfully validated our serum biomarker panel in an independent testing cohort, both cohorts are small, and larger sample sizes are needed to confirm our findings. Our data also laid out the molecular profile of serum changes in pediatric CAP, which may provide additional useful diagnostic markers and information for the development of therapeutic interventions in children who develop severe pneumonia.