Discussion
We evaluated associations between inductively created clusters of metabolites identified with NMR metabolomics and both a priori diet pattern scores and indices and inductively created dietary intake clusters. Participants represented a large population-based cohort, and the dietary information was collected with a validated FFQ reflecting habitual intake. NMR metabolomics models were not able to predict dietary intake clusters, and they showed poor association with HDS and hPDI. Somewhat better model fits were obtained for rMDS, PDI and uPDI, although model qualities were not impressive.
Accurate measurement of habitual diet is challenging and there is a need for validated objective methods. Blood metabolite patterns reflect direct or enzymatically diet-induced metabolites and thus may capture immediate responses to exposures, in contrast to the field of genomics. Hence, there is today great interest in evaluating agreement between blood metabolite patterns and habitual food intake patterns. Still, for metabolomics to be useful in large epidemiological studies, putative biomarkers have to reliably reflect habitual intake also when only one biological sample per individual is available. Previous research has shown this to be the case: Floegel and colleagues [28] used repeated fasting serum samples collected 4 months apart and demonstrated that reliability for most of the 163 metabolites evaluated was good. The authors concluded that for most metabolites a single measurement is sufficient to assess long-term exposure in large epidemiological studies. Finally, urine samples have higher concentrations and a wider range of food-derived compounds, except most lipid-soluble compounds, than blood, which is under homeostatic control. Hence, urine may be preferred for identifying biomarkers of food intake. Even so, in many large epidemiological studies blood samples and not urine samples are available.
A recent review summarized biomarkers of diet patterns evaluated in smaller controlled intervention studies [6]. Most of the 30 identified studies used MS techniques, but a handful used NMR, as in our study. Many studies applied targeted metabolomics in search of known biomarkers, e.g., n-3 index, 24-h urinary electrolytes and carotenoids. Some studies were exploratory, and the most commonly discovered biomarkers were those associated with intake of fish, protein and lipids, but also meat, vegetables, fruit, dairy, chocolate, vitamins, whole grains and legumes. The review concluded that most biomarkers were associated with specific foods or nutritional aspects of the diet but, because these foods appear in many diet patterns, the biomarkers lacked specificity for a particular dietary pattern. The review also pointed out the challenge of comparing results across studies that use different analytical platforms; when metabolites were investigated within the same study with both MS and NMR techniques, only one overlapping metabolite was identified. Hence, comparisons of our results with those from studies using other metabolomics platforms, and urine instead of serum as biofluid, should be made with caution [6]. Another recent review of metabolomic biomarkers of healthy dietary patterns reported that metabolites associated with vegetarian diets were amino acids (emphasized in NMR metabolomics), whereas metabolites associated with the Mediterranean diet were lipids (emphasized in MS metabolomics) [9]. The authors likewise caution against comparing studies that use different metabolomics platforms.
Only a few studies have evaluated habitual dietary patterns in larger cohorts, like ours. O’Sullivan and colleagues used NMR metabolomics but applied it to urine samples [29]. Metabolites responsible for separation of clusters included TMAO, glycine, O-acetylcarnitine and phenylacetylglutamine, thus mainly reflecting red meat and vegetable intake. A study using NMR metabolomics on a smaller sample compared intake data from repeated 24-h recalls with metabolites in urine samples [30]. Here, metabolomics models were able to predict adherence to healthy diets as captured by the Nutrient Rich Food index, the DASH diet and OMNIHEART. Perhaps associations between metabolomics patterns and dietary intake patterns are stronger when metabolites are compared with indices based on nutrient content rather than on food content, because of the heterogeneous content of macro- and micronutrients in foods. Several researchers have used serum samples, as in our study, but applied MS metabolomics [10, 31]. Here, associations have been found between dietary indices and scores and metabolites such as fatty acid profiles and amino acids. With respect to food intake patterns, most metabolites have reflected intake of fish, fruits, vegetables, alcohol and whole grains; i.e., as for the evaluations of smaller controlled intervention studies, metabolite patterns were specific for certain foods but not for dietary patterns per se.
We have previously shown that NMR metabolomics has the ability to distinguish between habitual meat and nonmeat consumers (97.5% correctly classified using serum samples and 91% correctly classified using urine samples), but lower ability to distinguish between habitual vegans and nonvegans (92.5% correctly classified using serum samples and 75% correctly classified using urine samples) [32, 33]. Here, most of the discriminating metabolites were related to amino acids. This likely explains the poorer ability of NMR metabolomics than of MS metabolomics to separate dietary intake patterns beyond meat vs. no meat, at least for dietary intake patterns based on food content rather than on nutrient content.
Plant-based dietary patterns have been associated with lower risk of cardiovascular diseases [34] and it is therefore important to identify these dietary patterns in research on diet and health. Recent comparisons of the PDI indices and metabolites in plasma using MS among Danish [35] and American cohorts [36] found a minor set of metabolites that were specific for each index. In our analyses using NMR metabolomics, glutamine was one of the discriminating metabolites that in the PDI model was higher in Q4 than in Q1. Glutamine has been found to be higher in individuals whose diets exclude meat and other animal-based foods [32] and who thus have a higher intake of plant foods. 2-hydroxyvalerate, a metabolite found in meat and produced endogenously, was lower in Q4 than in Q1 for PDI, which may indicate a lower intake of meat. 3-hydroxybutyrate, a ketone body and metabolite from branched-chain amino acids, was also lower in Q4 than in Q1 for PDI. The strongest OPLS-DA models were obtained when comparing Q1 and Q4 for rMDS and uPDI.
For rMDS, Q4 was associated with a lower concentration of 1,5-anhydrosorbitol than Q1. This metabolite is a validated marker of short-term glycemic control. In addition, the lactate concentration was low in Q4, and high concentrations have previously been reported in metabolically impaired subjects [37]. The opposite was seen for uPDI, which is an index constructed so that a higher score results from consumption of unhealthful plant-based foods such as fruit juice, refined grains, sugar-sweetened soda, potatoes, desserts, and sweets. Lastly, for uPDI, acetate, a short-chain fatty acid, was lower in Q4 than in Q1. Acetate can be produced by gut bacteria, but the evidence on whether serum acetate increases after increased dietary fiber intake is inconclusive [38]. Studies have reported that acetate is higher in type 2 diabetes patients than in healthy subjects [33]. Reduction in weight has also been associated with increased serum acetate [39]. In sum, the metabolites discriminating between uPDI Q1 and Q4 do not seem to be markers of certain foods, but rather markers of consequences of unhealthy eating.
Compared with the more sensitive mass spectrometry (MS)-based metabolomics, NMR is not able to detect low-concentration metabolites and thus has poorer ability to capture compounds such as lipids, fibers and vitamins. This may explain some of the poor associations between our metabolomics patterns and healthy vs. unhealthy dietary intake patterns. However, reasons for using NMR metabolomics in dietary studies are minimal sample preparation, rapid analysis with high reproducibility, reliable metabolite identification, ability to quantify metabolites and low cost [8]. It is therefore important to evaluate the ability of NMR metabolomics to serve as a biomarker of habitual diet for use in large epidemiological studies. Further, for personalized nutrition strategies, NMR has been pointed out as the optimal technical platform because of its technical reliability and affordability [40]. A healthy diet usually refers to low intakes of red and processed meat, trans fatty acids and sodium, and high intakes of fruit, vegetables, legumes, whole grains, and nuts and seeds [1]. How intakes of dairy, potatoes, plant oils like palm oil and alcohol should be classified is debated and varies between different definitions and indicators, as illustrated by the indices used in this project. This may further explain different results in different studies.
The scores and indices evaluated in this study capture healthy diets in slightly different ways. In the rMDS, higher scores are assigned to high intakes of vegetables (excluding potatoes), fruit, legumes, fish, olive oil, cereals and moderate alcohol intake. Lower scores are assigned to high intakes of total meat and dairy products. Intakes of all components are energy adjusted before individuals are ranked into tertiles. HDS does not include potatoes, juices, legumes or alcohol among beneficial foods, and it does not include dairy or poultry among unfavorable foods. Also, there is no energy adjustment before ranking individuals on intake. Lastly, the PDI simply divides food intake into those of vegetable origin and those of animal origin, regardless of associations with health outcomes. Hence, refined grains, sodas and sweets and desserts receive positive scores and only foods of animal origin receive reverse scores. hPDI distinguishes between health aspects and only assigns positive scores to healthful foods of vegetable origin and reverse scores to unhealthful foods of vegetable origin as well as to animal foods. Finally, uPDI is an anomaly that assigns positive scores to the unhealthful foods of vegetable origin and reverse scores to all other foods. No adjustment for energy intake is made when creating the PDI indices. Negative correlations between uPDI and the other scores and indices, and the positive correlations among rMDS, HDS and hPDI, are therefore expected. Further, each score and index represents a different combination of amino acids, lipids and carbohydrates, and these are detected by NMR to different extents. Hence, it is no surprise that a comparison between each score or index and detected metabolites yields somewhat different results.
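The difference in scoring direction among PDI, hPDI and uPDI can be illustrated with a minimal sketch. It is not the scoring code used in this study: the food groups, their classification, the example participant and the use of quintile ranks from 1 to 5 are assumptions made for illustration only. A positive score equals the intake rank, and a reverse score equals the mirrored rank.

```python
# Hypothetical food groups and categories; illustrative, not the study's FFQ items.
FOOD_CATEGORY = {
    "whole_grains": "healthy_plant",
    "vegetables": "healthy_plant",
    "refined_grains": "unhealthy_plant",
    "sweets": "unhealthy_plant",
    "meat": "animal",
    "dairy": "animal",
}

# Categories that receive positive scores in each index, as described in the text.
POSITIVE = {
    "PDI": {"healthy_plant", "unhealthy_plant"},
    "hPDI": {"healthy_plant"},
    "uPDI": {"unhealthy_plant"},
}

N_RANKS = 5  # assumption: intake is ranked into quintiles, 1 = lowest intake


def index_score(ranks, index):
    """Sum positive (rank) and reverse (N_RANKS + 1 - rank) scores over food groups."""
    total = 0
    for food, rank in ranks.items():
        if FOOD_CATEGORY[food] in POSITIVE[index]:
            total += rank                  # positive score
        else:
            total += N_RANKS + 1 - rank    # reverse score
    return total


# Hypothetical participant with high intake of healthful plant foods.
plant_heavy = {
    "whole_grains": 5, "vegetables": 5,
    "refined_grains": 2, "sweets": 1,
    "meat": 1, "dairy": 2,
}

scores = {name: index_score(plant_heavy, name) for name in POSITIVE}
print(scores)
```

In this toy example the same intake profile scores high on hPDI and low on uPDI, because the two indices mirror each other for plant foods while treating animal foods identically. This is consistent with the negative correlation between uPDI and the other indices noted above.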
Menopausal status was not measured, and it may have affected metabolite patterns among women in the older age group in ways that we were unable to explain. We used fasting blood samples and not postprandial blood samples; the former are more influenced by background characteristics as serum concentrations are controlled by homeostasis and reflect exogenous as well as endogenous processes, whereas the latter show stronger traces of food metabolites. Hence, weaker associations between diet intake data and circulating metabolites are expected from epidemiological studies than from intervention studies [7]. Still, biomarkers that have been identified in cohort and case-referent studies have proven to be more sensitive and robust, perhaps because they are detectable in spite of metabolite degradation during storage [7]. Regardless of which study design is optimal for comparison with metabolomics data, the aim of the current project was to identify objective biomarkers of habitual dietary intake.
The FFQ consisted of 64 food items, decided upon in 1985 and unchanged since to maintain continuity in data collection. Hence, it is a somewhat crude estimate of the diet diversity of inhabitants in Västerbotten due to the limited number of food items included and because it lacks modern products such as vegan alternatives. Nuts and seeds are not captured, and fruits and vegetables are only captured by a few questions. During 2020–2022 a new updated and extended digital version has been developed that addresses these issues, but the dietary data used for the current analyses suffer from the limitations of the older version. Also, the original version contained 84 food items that were later reduced to 64 items, mostly by combining similar foodstuffs. Most validations were carried out on the 84-item version, but there is little reason to believe that these results do not also apply to the 64-item version.
Strengths of the presented analyses include the large sample size for a metabolomics project and that subjects originated from a large, population-based cohort that has been well characterized over time. We hence believe that the results can be generalized to populations with similar Western diets. Diet scores, indices and clusters were created within the entire NSHDS database with over 120 000 participants, yielding robust estimates of these variables as they reflect relative positions within the sample in which they were created. The FFQ has been validated, blood was donated concurrently with questionnaire data and blood samples have been stored at -80 °C. Limitations include that the FFQ only included 64 food items, the measurement bias inherent in all subjective dietary intake tools, and that NMR metabolomics only detects somewhat larger metabolites and thus has poorer ability to capture lipids, fibers and vitamins in the diet.