Evaluation of Internet-Based Dengue Query Data: Google Dengue Trends
Abstract
Dengue is a common and growing problem worldwide, with an estimated 70–140 million cases per year. Traditional, healthcare-based, government-implemented dengue surveillance is resource intensive and slow. As global Internet use has increased, novel, Internet-based disease monitoring tools have emerged. Google Dengue Trends (GDT) uses near real-time search query data to create an index of dengue incidence that is a linear proxy for traditional surveillance. Studies have shown that GDT correlates highly with dengue incidence in multiple countries on a large spatial scale. This study addresses the heterogeneity of GDT at smaller spatial scales, assessing its accuracy at the state-level in Mexico and identifying factors that are associated with its accuracy. We used Pearson correlation to estimate the association between GDT and traditional dengue surveillance data for Mexico at the national level and for 17 Mexican states. Nationally, GDT captured approximately 83% of the variability in reported cases over the 9 study years. The correlation between GDT and reported cases varied from state to state, capturing anywhere from 1% of the variability in Baja California to 88% in Chiapas, with higher accuracy in states with higher dengue average annual incidence. A model including annual average maximum temperature, precipitation, and their interaction accounted for 81% of the variability in GDT accuracy between states. This climate model was the best indicator of GDT accuracy, suggesting that GDT works best in areas with intense transmission, particularly where local climate is well suited for transmission. Internet accessibility (average ∼36%) did not appear to affect GDT accuracy. While GDT seems to be a less robust indicator of local transmission in areas of low incidence and unfavorable climate, it may indicate cases among travelers in those areas. Identifying the strengths and limitations of novel surveillance is critical for these types of data to be used to make public health decisions and forecasting models.
Problem melden