Background
Plasmodium falciparum malaria remains an important cause of global morbidity and mortality, accounting for an estimated 200 million annual cases and 600,000 deaths in Africa alone [1]. While significant progress against malaria has been made, largely due to the widespread distribution and use of long-lasting insecticidal nets (LLINs), there is increasing evidence that progress has stalled in many of the highest burden settings [1]. Heterogeneity in bloodmeal-seeking behaviours (i.e., location and timing of feeding) may place an upper bound on the effectiveness of LLINs [2–4] and result in proportionally more bites occurring outdoors following LLIN deployment [5]. This has major implications for predicting residual malaria risk following LLIN deployment.
Therefore, an understanding of the factors beyond LLIN availability and use that are associated with malaria transmission remains necessary to target control measures effectively [6, 7]. Uganda has been a leader in the effort to achieve universal coverage of LLINs and is therefore an interesting setting to examine the factors associated with residual malaria risk [8]. The country conducted its first mass distribution campaign in 2013 [9], followed by similar campaigns every three years, including in 2017–18 and most recently in 2020–21 [10]. Remarkably, the proportion of households reporting at least one LLIN increased from 16% in the 2006 Demographic and Health Survey (DHS) to more than 80% in the 2018 Malaria Indicator Survey (MIS), while over the same period the proportion of households with at least one LLIN for every two people increased from 5% to 54% [11]. Despite this progress, malaria transmission persists, with more than 12 million cases reported in 2020 [1].
Malaria transmission intensity varies across Uganda, but individual and household risk may differ substantially even within a relatively small geographic area. Risk is affected by numerous demographic, occupational, behavioural, and geographic factors acting on different spatial and temporal scales, which makes risk prediction difficult, especially at fine-scale resolution [12]. Yet understanding this fine-scale spatial heterogeneity, which may be best explained by environmental conditions in the immediate peri-domestic space as well as household socio-economic factors, is critically important. How to most effectively identify and incorporate these variables into predictive models is not well defined.
Therefore, the goal of this study was to compare the ability of statistical models to predict malaria risk at the household level using either (i) remotely-sensed data or (ii) results from a household survey. These two methods of collection differ in both (1) the resources required to obtain the information and (2) the scale over which the predictors act. For example, information about home construction is costly to obtain and likely impacts risk within the home but not for neighbours. On the other hand, the presence of flooded areas is easily detected and may impact risk for a large area but is unlikely to explain differences in risk between neighbours. Given the level of detail, it was hypothesized that the inclusion of information collected in household surveys would result in higher predictive ability compared with only using remotely-sensed environmental data.
Discussion
Using the results of a household malaria survey performed across three villages of differing terrain and malaria transmission in rural Uganda, the predictive ability of models for malaria risk was compared. The findings show that the environmental dataset outperforms the household dataset at predicting OOS malaria risk based on both uRDT result (mean AUC of 0.736 compared to 0.667) and inpatient admissions (mean AUC of 0.672 compared to 0.653). While this is not a large difference, the substantially higher cost of collecting the household dataset would heavily favour using the environmental dataset. In addition, while the inclusion of the household dataset with the environmental dataset (i.e., combined dataset) improved the models’ ability to predict OOS inpatient admissions (mean AUC of 0.683 compared to 0.672), it actually decreased their ability to predict OOS uRDT results (mean AUC of 0.671 compared to 0.736). Importantly, no model outperformed a random classifier when predicting OOV risk (Additional file 1), highlighting the difficulty of extrapolating results to new regions, even in close proximity.
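To make the AUC comparisons above concrete, the following is a minimal sketch of how AUC can be computed from held-out outcomes and model scores, and why a random or uninformative classifier sits at 0.5. The function name `auc`, the toy outcomes, and the scores are all hypothetical illustrations; they are not the study's data, models, or code.

```python
from itertools import product


def auc(labels, scores):
    """Probability that a random positive case is scored above a random negative case.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical held-out outcomes (1 = positive test) and two sets of scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
informative = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]  # mostly ranks positives first
uninformative = [0.5] * len(y_true)                      # same score for everyone

auc_informative = auc(y_true, informative)
auc_uninformative = auc(y_true, uninformative)
```

In this toy case the informative scores misrank only one positive-negative pair out of fifteen, whereas the constant scores give an AUC of exactly 0.5, the benchmark that the OOV predictions did not exceed.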
The datasets used here differed not just in the variables they contained, but in the costs associated with obtaining them. The environmental dataset contains variables that would be expected to predict the presence of vector habitat, such as standing water. This dataset is easily obtained from publicly available online tools (e.g., USGS EarthExplorer) and would be expected to best predict malaria risk if transmission primarily occurred outside the home, since it does not account for physical barriers (e.g., window screening, LLINs) limiting vector access to individuals inside the home. On the other hand, the household dataset is much more logistically difficult to obtain, requiring a detailed survey of households, and would be expected to best predict malaria risk if transmission primarily occurred inside the home (e.g., while individuals slept). In reality, malaria risk is expected to depend on a complex interaction between these variables, with their relative importance being location dependent. Therefore, it is important for policy-makers to understand the circumstances within their region, which requires at least a preliminary examination of all possible risk factors. Risk mapping is a valuable tool for malaria control, as it can identify high-risk areas and guide surveillance, prevention and treatment activities, and resource allocation [30].
The low impact of several household variables is partly due to the lack of variation between individuals tested. For example, of 608 children surveyed, 568 (93.4%) reported sleeping under an LLIN the previous night, while 567 (93.3%) lived in households with toilets on the property. While we did not measure entomological indices, one possible explanation for the low predictive power of household variables compared to environmental variables is that the high proportion of children sleeping under bed nets could result in a shift in where malaria transmission occurs, from within the house to outside [2–5], lessening the ability of household variables to predict residual malaria risk. However, it has also been suggested that sufficient biting still occurs late at night within households with LLINs for transmission to occur [4, 5]. The high prevalence of LLIN use during this study was almost certainly the result of a national LLIN mass distribution campaign in 2020–21 [40] and was significantly higher than observed in the region in January–March 2020, when coverage was found to be 64.7% [17]. In addition, utilization of protective measures within the household (e.g., LLIN use and installation of screening) may reflect both the actual risk and the risk perceived by the homeowner. Homeowners may install protective measures in response to either a perceived high malaria risk (e.g., living near the spillway) or an actual risk (e.g., seeing mosquitoes in their homes). Previous work has also found household variables to have counter-intuitive relationships with malaria risk in the presence of LLINs [31], including a decreased risk of malaria associated with windows, tied to cooler indoor temperatures and improved LLIN compliance. This association would become stronger if transmission occurs outside the household, where they are no longer protective.
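The point about limited variation can be illustrated with a small worked calculation. A binary variable with proportion p has variance p(1 − p), which peaks at 0.25 when p = 0.5 and shrinks as p approaches 0 or 1. The sketch below applies this to the counts reported above; the helper name `bernoulli_variance` is hypothetical, and the calculation is only an illustration, not part of the study's analysis.

```python
def bernoulli_variance(successes, total):
    """Return the proportion p and the variance p * (1 - p) of a binary variable."""
    p = successes / total
    return p, p * (1 - p)


n_children = 608  # children surveyed, as reported in the text

# Counts reported in the text for LLIN use and toilets on the property.
llin_p, llin_var = bernoulli_variance(568, n_children)
toilet_p, toilet_var = bernoulli_variance(567, n_children)

# Maximum possible variance of a binary variable, reached at p = 0.5.
max_var = 0.25

print(f"LLIN use: p = {llin_p:.3f}, variance = {llin_var:.3f} (max {max_var})")
print(f"Toilet:   p = {toilet_p:.3f}, variance = {toilet_var:.3f} (max {max_var})")
```

Both variables carry roughly a quarter of the variance that an evenly split binary predictor would have, which leaves a model little contrast to learn from.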
The models found that slope and flow direction were significant predictors of both measures of malaria risk. Slope steepness and flow direction affect water accumulation, which is necessary for larval development and has previously been shown to correlate with higher malaria risk [32–34]. In addition, larval habitat is known to be more common in locations closer to streams and rivers [32], and proximity to water has been shown to influence malaria risk, both within Uganda [35, 36] and in other regions [37, 38], even after accounting for household construction [39]. Elevation has long been established as a predictor of malaria risk, with risk decreasing at higher elevations [33, 40, 41]. While this work finds no association between elevation and malaria risk, previous work in the region has shown that low-elevation villages have a higher prevalence of malaria infection [17, 42] and lower levels of multiplicity of infection [42] than high-elevation villages, both of which are measures of malaria transmission intensity. Finally, it is well established that distance to a health facility is a determinant of healthcare utilization in rural settings [43], resulting in individuals delaying or refusing to seek care [44], self-medicating [45], or seeking care outside the formal healthcare system [43]. While such an association was seen when using the environmental dataset, distance to the nearest level 3 health facility was not significant when household variables were included.
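As an illustration of the slope and flow direction predictors discussed above, the following sketch derives both from a small elevation grid using the common D8 convention, in which water is assumed to flow to the steepest downhill neighbour among the eight surrounding cells. The text does not state how these variables were derived in the study, so the D8 choice, the function name `d8_flow`, the 30 m cell size, and the elevation values are all assumptions made for illustration.

```python
import math

# Offsets (row, col) to the eight neighbouring cells used by the D8 scheme.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow(dem, row, col, cell_size):
    """Return (offset, slope_degrees) for the steepest downhill neighbour.

    Returns (None, 0.0) when no neighbour is lower (a pit or flat cell).
    """
    best_offset, best_drop = None, 0.0
    for dr, dc in NEIGHBOURS:
        r, c = row + dr, col + dc
        if 0 <= r < len(dem) and 0 <= c < len(dem[0]):
            # Diagonal neighbours are sqrt(2) cell widths away.
            distance = cell_size * math.hypot(dr, dc)
            drop = (dem[row][col] - dem[r][c]) / distance
            if drop > best_drop:
                best_offset, best_drop = (dr, dc), drop
    return best_offset, math.degrees(math.atan(best_drop))


# Hypothetical 3 x 3 elevation grid in metres, sloping down towards the lower right.
dem = [
    [1210.0, 1205.0, 1200.0],
    [1208.0, 1203.0, 1197.0],
    [1206.0, 1200.0, 1190.0],
]

direction, slope_deg = d8_flow(dem, 1, 1, cell_size=30.0)
pit_direction, pit_slope = d8_flow(dem, 2, 2, cell_size=30.0)
```

Cells that receive flow from many upslope neighbours and have low slope are where water would be expected to accumulate, which is the mechanism linking these terrain variables to larval habitat.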
Several others have attempted to predict malaria risk using environmental and/or individual- and household-level variables across a number of settings [33, 46–49]. These models typically had similar levels of predictive power (AUC = 0.7–0.9). Despite this, there are key differences between the data used to produce these models and the data used in this study. Several studies similarly use remotely-sensed data to predict risk based on environmental variables [12, 33, 46–49], but few combine these data with individual- and household-level information [46, 47]. Of those that do use individual- and household-level data, few include house construction information [47]. Another key difference is that others have relied on aggregated malaria prevalence data [46], while our analyses used individual household-level malaria prevalence information. Thus, this study offers a unique set of variables for predicting risk. Additionally, few studies have compared the predictive ability of three subsets of environmental and individual- and household-level variables, which this study has done.
While this study provides a unique dataset with which to compare the predictive ability of several factors, there are several important limitations. First, the dataset represents a single household survey conducted in November 2020. This precludes examining the effect of seasonality or short-term weather conditions. Second, a single NDVI estimate, derived from December 2020 data, was used for fitting the environmental models. NDVI varies over the year, driven by a bi-annual rainy season. This variation was not captured in the analysis. Third, this study uses uRDT test results and previous inpatient admission as outcomes. Given the persistence of HRP2, it is possible that infection could have occurred at any time in the 6 weeks prior to the uRDT test [50]. Similarly, inpatient admission was assessed over the previous year. Thus, the risk factors present at the time of the study may not be representative of those present at the time the infection occurred. Finally, travel history was not collected as part of the household survey. Varying sizes of buffer regions around the households were included to account for areas individuals may visit, but it is not possible to adjust for individual-level variation in movement without a travel history.
Conclusion
Accurate fine-scale prediction of malaria risk is essential, especially in regions where malaria persists despite high LLIN uptake. Many of these regions have limited resources that need to be proactively targeted towards areas of the greatest need. There is a growing body of work looking at the determinants of malaria risk at a household level, but building accurate models still proves difficult. Further developing these models requires not only technical advancements in modelling (e.g., machine learning) but also an understanding of the scales, implications, and costs of different predictive datasets. To this end, the use of easily obtainable remotely-sensed environmental data has been compared to a dataset collected as part of a highly detailed household survey when predicting two indicators of malaria risk. It was found that environmental data were better able to predict OOS uRDT positivity and inpatient admission across three villages in Uganda and that the addition of household-level data provided marginal, if any, benefit. This has important implications for developing predictive models in the current environment, as it suggests that the use of remotely-sensed data may be sufficient and that the added benefit of household surveys may not justify their costs. However, in areas with low LLIN coverage, or with limited environmental variation, household surveys are likely still necessary to understand variation in malaria risk.
Acknowledgements
We would like to thank Dana Giandomenico and Paul Delamater for their feedback on the design and presentation of this manuscript. In addition, we would like to thank our Research Assistants, Wesuta Andrew, Bitamazire Aprunalis, Nyangoma Grace, Masika Sarah, Katosi Ronald, Mbusa Jackson, Kabugho Jackie, Baguma Stephen, Bagyenyi Michael, and Mbusa Rapheal.