Using publicly available data on the number of new hospitalisations, we apply a newly developed statistical model to produce a phase portrait for monitoring the epidemic, allowing one to assess whether or not intervention measures are needed to keep hospital capacity under control. The phase portrait is called a cliquets' diagram, referring to the discrete alarm phases it points to. Using this cliquets' diagram, we show that intervention measures were associated with an effective mitigation of a summer resurgence, but that too little was done too late to prevent a large autumn wave in Belgium. ; European Union's Horizon 2020 research and innovation programme - project EpiPose [101003688]; European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme [682540]
Small area estimation is an important tool to provide area-specific estimates of population characteristics for governmental organisations in the context of education, public health and care. However, many demographic and health surveys are unrepresentative at a small geographical level, as often areas at a lower level are not included in the sample due to financial or logistical reasons. In this paper, we investigated (1) the effect of these unsampled areas on a variety of design-based and hierarchical model-based estimates and (2) the benefits of using auxiliary information in the estimation process by means of an extensive simulation study. The results showed the benefits of hierarchical spatial smoothing models towards obtaining more reliable estimates for areas at the lowest geographical level in case a spatial trend is present in the data. Furthermore, the importance of auxiliary information was highlighted, especially for geographical areas that were not included in the sample. Methods are illustrated on the 2008 Mozambique Poverty and Social Impact Analysis survey, with interest in the district-specific prevalence of school attendance. ; This research received no external funding
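As a pointer to what a hierarchical spatial smoothing model for such small area data typically looks like, a common specification (a generic sketch in our own notation, not necessarily the exact model used in the paper) is a binomial logit model with structured and unstructured area effects:

\[
y_i \mid p_i \sim \mathrm{Binomial}(n_i, p_i), \qquad
\mathrm{logit}(p_i) = \mathbf{x}_i^\top \boldsymbol{\beta} + u_i + v_i,
\]
\[
v_i \sim N(0, \sigma_v^2), \qquad
u_i \mid u_{j \neq i} \sim N\!\left(\tfrac{1}{m_i}\textstyle\sum_{j \in \partial_i} u_j,\; \tfrac{\sigma_u^2}{m_i}\right),
\]

where \(\partial_i\) denotes the neighbours of area \(i\) and \(m_i\) their number. The intrinsic CAR term \(u_i\) borrows strength from neighbouring areas, which is what allows prediction for areas not included in the sample, while the covariates \(\mathbf{x}_i\) carry the auxiliary information.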
In geostatistics, both kriging and smoothing splines are commonly used to generate an interpolated map of a quantity of interest. The geoadditive model proposed by Kammann and Wand (J R Stat Soc: Ser C (Appl Stat) 52(1):1–18, 2003) represents a fusion of kriging and penalized spline additive models. Complex data issues, including non-linear covariate trends, multiple measurements at a location and clustered observations, are easily handled using the geoadditive model. We propose a likelihood-based estimation procedure that enables the estimation of the range (spatial decay) parameter associated with the penalized splines of the spatial component in the geoadditive model. We show how the spatial covariance structure (covariogram) can be derived from the geoadditive model. In a simulation study, we show that the underlying spatial process and prediction of the spatial map are estimated well using the proposed likelihood-based estimation procedure. We present several applications of the proposed methods to real-life data examples. ; Support from a doctoral Grant of Hasselt University is acknowledged (BOF11D04 FAEC to YV). Support from the National Institutes of Health is acknowledged [award number R01CA172805 to CF]. Support from the University of Antwerp scientific chair in Evidence-Based Vaccinology, financed in 2009–2014 by a gift from Pfizer, is acknowledged [to NH]. Support from the IAP Research Network P7/06 of the Belgian State (Belgian Science Policy) is gratefully acknowledged. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government – department EWI.
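For orientation, a geoadditive model of the Kammann and Wand type can be written generically (the basis, knot placement and correlation function below are illustrative assumptions, not quoted from the paper) as

\[
y_i = \beta_0 + f(x_i) + S(\mathbf{s}_i) + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma_\varepsilon^2),
\]

with \(f\) a penalized spline in the covariate \(x_i\) and \(S\) a low-rank kriging component in the spatial location \(\mathbf{s}_i\), for example

\[
S(\mathbf{s}) = \sum_{k=1}^{K} u_k\, C\!\left(\lVert \mathbf{s} - \boldsymbol{\kappa}_k \rVert / \rho\right), \qquad u_k \sim N(0, \sigma_u^2),
\]

where the \(\boldsymbol{\kappa}_k\) are spatial knots, \(C\) is a correlation function and \(\rho\) is the range (spatial decay) parameter whose likelihood-based estimation is the focus of the paper; the covariogram follows from the implied covariance of \(S(\mathbf{s})\).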
Geographical homogamy of spouses married in the Netherlands in the period 1812-1922. People tend to choose a partner originating from their own region. However, in the course of time the geographical horizon widened. The mean geographical distance between the regions of origin of spouses in the Netherlands strongly increased during the nineteenth and twentieth century, particularly from the 1880s onwards. Regional differences can also be observed. This study aims to explain the characteristics of the spatio-temporal changes in the geographical distance between the places of birth of spouses in the Netherlands in the period 1812-1922. By using spatio-temporal regression models, both the effects of individual characteristics (such as age and social class) and of the regional socio-economic context can be explored.
The alleviation of food‐borne diseases caused by microbial pathogens remains a great concern in order to ensure the well‐being of the general public. The relation between the ingested dose of organisms and the associated infection risk can be studied using dose‐response models. Traditionally, a model selected according to a goodness‐of‐fit criterion has been used for making inferences. In this article, we propose a modified set of fractional polynomials as competitive dose‐response models in risk assessment. The article not only shows instances where it is not obvious to single out one best model, but also illustrates that model averaging can best circumvent this dilemma. The set of candidate models is chosen based on biological plausibility and rationale, and the risk at a dose common to all these models is estimated using the selected models and by averaging over all models using Akaike's weights. In addition to including parameter estimation inaccuracy, as in the case of a single selected model, model averaging accounts for the uncertainty arising from other competitive models. This leads to a better and more honest estimation of standard errors and construction of confidence intervals for risk estimates. The approach is illustrated for risk estimation at low dose levels based on Salmonella typhi and Campylobacter jejuni data sets in humans. Simulation studies indicate that model averaging has reduced bias, better precision, and also attains coverage probabilities that are closer to the 95% nominal level compared to best‐fitting models according to the Akaike information criterion.
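For reference, model averaging with Akaike's weights as invoked above takes the following standard form (generic notation, not taken from the article):

\[
w_m = \frac{\exp(-\Delta_m/2)}{\sum_{l=1}^{M}\exp(-\Delta_l/2)}, \qquad \Delta_m = \mathrm{AIC}_m - \min_{l}\mathrm{AIC}_l,
\]
\[
\hat{r}_{\mathrm{MA}}(d) = \sum_{m=1}^{M} w_m\, \hat{r}_m(d),
\]

where \(\hat{r}_m(d)\) is the infection risk at dose \(d\) estimated under candidate model \(m\) and \(\hat{r}_{\mathrm{MA}}(d)\) is the model-averaged estimate; models that fit almost as well as the best one retain non-negligible weight instead of being discarded.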
Quantitative risk assessment involves the determination of a safe level of exposure. Recent techniques use the estimated dose‐response curve to estimate such a safe dose level. Although such methods have attractive features, a low‐dose extrapolation is highly dependent on the model choice. Fractional polynomials,(1) essentially a set of (generalized) linear models, are a convenient extension of classical polynomials, providing the necessary flexibility to estimate the dose‐response curve. Typically, one selects the best‐fitting model in this set of polynomials and proceeds as if no model selection were carried out. We show that model averaging using a set of fractional polynomials reduces bias and has better precision in estimating a safe level of exposure (say, the benchmark dose), as compared to an estimator from the selected best model. To estimate a lower limit of this benchmark dose, an approximation of the variance of the model‐averaged estimator, as proposed by Burnham and Anderson,(2) can be used. However, this is a conservative method, often resulting in unrealistically low safe doses. Therefore, a bootstrap‐based method to more accurately estimate the variance of the model‐averaged parameter is proposed.
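The Burnham-and-Anderson-type variance approximation referred to above has, in standard notation (an assumption on our part rather than a quotation), the form

\[
\widehat{\mathrm{Var}}\big(\hat{\theta}_{\mathrm{MA}}\big) =
\left[\sum_{m=1}^{M} w_m \sqrt{\widehat{\mathrm{Var}}\big(\hat{\theta}_m \mid g_m\big) + \big(\hat{\theta}_m - \hat{\theta}_{\mathrm{MA}}\big)^2}\right]^2,
\]

where \(\hat{\theta}_m\) is the benchmark dose estimated under model \(g_m\) and \(w_m\) its Akaike weight. The second term under the square root accounts for between-model variability; a Wald-type lower limit based on this variance tends to be conservative, which is why the bootstrap alternative is proposed.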
Since the seminal work of Cook and Weisberg (Residuals and Influence in Regression, Chapman & Hall, London, 1982), local influence, next to case deletion, has gained popularity as a tool to detect influential subjects and measurements for a variety of statistical models. For the linear mixed model the approach leads to easily interpretable and computationally convenient expressions, not only highlighting influential subjects, but also which aspect of their profile leads to undue influence on the model's fit (Lesaffre and Verbeke, Local influence in linear mixed models, Biometrics 54 (1998), pp. 570–582). Ouwens, Tan, and Berger (Local influence to detect influential data structures for generalized linear mixed models, Biometrics 57 (2001), pp. 1166–1172) applied the method to the Poisson-normal generalized linear mixed model (GLMM). Given the model's nonlinear structure, these authors did not derive interpretable components but rather focused on a graphical depiction of influence. In this paper, we consider GLMMs for binary, count, and time-to-event data, with the additional feature of accommodating overdispersion whenever necessary. For each situation, three approaches are considered, based on: (1) purely numerical derivations; (2) using a closed-form expression of the marginal likelihood function; and (3) using an integral representation of this likelihood. Unlike when case deletion is used, this leads to interpretable components, allowing not only to identify influential subjects, but also to study the cause thereof. The methodology is illustrated in case studies that range over the three data types mentioned. ; Financial support from the IAP research network #P7/06 of the Belgian Government (Belgian Science Policy) is gratefully acknowledged.
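As a reminder of the machinery involved (standard local-influence notation, not specific to this paper), a perturbation vector \(\boldsymbol{\omega}\) enters the log-likelihood and its impact is measured through the likelihood displacement and the normal curvature in direction \(\mathbf{h}\):

\[
LD(\boldsymbol{\omega}) = 2\left[\ell(\hat{\boldsymbol{\theta}}) - \ell(\hat{\boldsymbol{\theta}}_{\boldsymbol{\omega}})\right], \qquad
C_{\mathbf{h}} = 2\left|\mathbf{h}^\top \boldsymbol{\Delta}^\top \ddot{L}^{-1} \boldsymbol{\Delta}\,\mathbf{h}\right|,
\]

where \(\ddot{L}\) is the Hessian of the log-likelihood at the maximum-likelihood estimate \(\hat{\boldsymbol{\theta}}\) and \(\boldsymbol{\Delta}\) collects the mixed derivatives with respect to parameters and perturbations; directions \(\mathbf{h}\) with large curvature single out influential subjects, and decomposing \(\boldsymbol{\Delta}\) yields the interpretable components referred to above.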
Many applications in public health, medical and biomedical or other studies demand modeling of two or more longitudinal outcomes jointly to get better insight into their joint evolution. In this regard, a joint model for a longitudinal continuous and a count sequence, the latter possibly overdispersed and zero-inflated, is specified that assembles aspects coming from each one of them into one single model. Further, a subject-specific random effect is included to account for the correlation in the continuous outcome. For the count outcome, clustering and overdispersion are accommodated through two distinct sets of random effects in a generalized linear model as proposed by Molenberghs et al. (2010); one is normally distributed, the other conjugate to the outcome distribution. The association among the two sequences is captured by correlating the normal random effects describing the continuous and count outcome sequences, respectively. An excessive number of zero counts is often accounted for by using a so-called zero-inflated or hurdle model. Zero-inflated models combine either a Poisson or negative-binomial model with an atom at zero as a mixture, while the hurdle model separately handles the zero observations and the positive counts. This paper proposes a general joint modeling framework in which all these features can appear together. ; The authors are grateful to M. Assefa and F. Tessema for the permission to use the data. Financial support from the Institutional University Cooperation of the Council of Flemish Universities (VLIR-IUC) is gratefully acknowledged. The authors gratefully acknowledge support from IAP research Network P7/06 of the Belgian Government (Belgian Science Policy).
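Schematically, and in notation introduced here purely for illustration (the paper's exact parameterization may differ), such a joint model combines

\[
Y_{1ij}\mid b_{1i} \sim N\!\big(\mathbf{x}_{1ij}^\top\boldsymbol{\beta}_1 + b_{1i},\,\sigma^2\big), \qquad
Y_{2ij}\mid b_{2i},\theta_{ij} \sim \mathrm{Poisson}\!\big(\theta_{ij}\,e^{\mathbf{x}_{2ij}^\top\boldsymbol{\beta}_2 + b_{2i}}\big),
\]
\[
\theta_{ij} \sim \mathrm{Gamma}(\alpha,\beta), \qquad
(b_{1i}, b_{2i})^\top \sim N(\mathbf{0}, \mathbf{D}),
\]

where the conjugate gamma effects \(\theta_{ij}\) capture overdispersion in the counts and the off-diagonal element of \(\mathbf{D}\) captures the association between the continuous and count sequences; an atom at zero can be added to the count component to obtain the zero-inflated or hurdle versions discussed in the paper.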
Count data are most commonly modeled using the Poisson model, or by one of its many extensions. Such extensions are needed for a variety of reasons: (1) a hierarchical structure in the data, e.g., due to clustering, the collection of repeated measurements of the outcome, etc.; (2) the occurrence of overdispersion (or underdispersion), meaning that the variability encountered in the data is not equal to the mean, as prescribed by the Poisson distribution; and (3) the occurrence of extra zeros beyond what a Poisson model allows. The first issue is often accommodated through the inclusion of random subject-specific effects. Though not always, one conventionally assumes such random effects to be normally distributed. Overdispersion is often dealt with through a model developed for this purpose, such as, for example, the negative-binomial model for count data. This can be conceived through a random Poisson parameter. Excess zeros are regularly accounted for using so-called zero-inflated models, which combine either a Poisson or negative-binomial model with an atom at zero. The novelty of this paper is that it combines all these features. The work builds upon the modeling framework defined by Molenberghs et al. (2010), in which clustering and overdispersion are accommodated for through two separate sets of random effects in a generalized linear model. ; The authors are grateful to M. Assefa and F. Tessema for the permission to use the data. Financial support from the Institutional University Cooperation of the Council of Flemish Universities (VLIR-IUC) is gratefully acknowledged. The authors gratefully acknowledge support from IAP research Network P7/06 of the Belgian Government (Belgian Science Policy).
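In illustrative notation (ours, not the paper's), the zero-inflated member of this combined family can be written as

\[
P(Y_{ij}=y \mid \mathbf{b}_i, \theta_{ij}) = \pi_{ij}\,\mathbb{1}\{y=0\} + (1-\pi_{ij})\,
\frac{\lambda_{ij}^{\,y}\, e^{-\lambda_{ij}}}{y!}, \qquad
\lambda_{ij} = \theta_{ij}\, e^{\mathbf{x}_{ij}^\top\boldsymbol{\beta} + \mathbf{z}_{ij}^\top\mathbf{b}_i},
\]
\[
\theta_{ij} \sim \mathrm{Gamma}(\alpha,\beta), \qquad \mathbf{b}_i \sim N(\mathbf{0}, \mathbf{D}),
\]

so that the excess zeros (\(\pi_{ij}\)), the conjugate overdispersion effect (\(\theta_{ij}\)) and the normal clustering effects (\(\mathbf{b}_i\)) all appear in one model; integrating out \(\theta_{ij}\) gives a zero-inflated negative-binomial kernel as a special case.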
This paper proposes a flexible modeling approach for so-called comet assay data regularly encountered in pre-clinical research. While such data consist of non-Gaussian outcomes in a multi-level hierarchical structure, traditional analyses typically completely or partly ignore this hierarchical nature by summarizing measurements within a cluster. Non-Gaussian outcomes are often modeled using exponential family models. This is true not only for binary and count data, but also for, e.g., time-to-event outcomes. Two important reasons for extending this family are: (1) the possible occurrence of overdispersion, meaning that the variability in the data may not be adequately described by the models, which often exhibit a prescribed mean-variance link, and (2) the accommodation of a hierarchical structure in the data, owing to clustering in the data. The first issue is dealt with through so-called overdispersion models. Clustering is often accommodated through the inclusion of random subject-specific effects. Though not always, one conventionally assumes such random effects to be normally distributed. In the case of time-to-event data, one encounters, for example, the gamma frailty model (Duchateau and Janssen 2007). While both of these issues may occur simultaneously, models combining both are uncommon. Molenberghs et al. (2010) proposed a broad class of generalized linear models accommodating overdispersion and clustering through two separate sets of random effects. Here, we use this method to model data from a comet assay with a three-level hierarchical structure. Whereas a conjugate gamma random effect is used for the overdispersion random effect, both gamma and normal random effects are considered for the hierarchical random effect. Apart from model formulation, we place emphasis on Bayesian estimation. Our proposed method has the upper hand over the traditional analysis in that it: (1) uses the appropriate distribution stipulated in the literature; (2) deals with the complete hierarchical nature; and (3) uses all information instead of summary measures. The fit of the model to the comet assay data is compared against the background of more conventional model fits. Results indicate the toxicity of 1,2-Dimethylhydrazine dihydrochloride at different dose levels (low, medium, and high). ; IAP research network of the Belgian Government (Belgian Science Policy) (P6/03) - Fund for Scientific Research-Flanders (F.W.O.)
The quantification of the relationship between the amount of microbial organisms ingested and a specific outcome such as infection, illness, or mortality is a key aspect of quantitative risk assessment. A main problem in determining such dose-response models is the availability of appropriate data. Human feeding trials have been criticized because only young healthy volunteers are selected to participate and low doses, as often occurring in real life, are typically not considered. Epidemiological outbreak data are considered to be more valuable, but are more subject to data uncertainty. In this article, we model the dose-illness relationship based on data of 20 Salmonella outbreaks, as discussed by the World Health Organization. In particular, we model the dose-illness relationship using generalized linear mixed models and fractional polynomials of dose. The fractional polynomial models are modified to satisfy the properties of different types of dose-illness models as proposed by Teunis et al. Within these models, differences in host susceptibility (susceptible versus normal population) are modeled as fixed effects whereas differences in serovar type and food matrix are modeled as random effects. In addition, two bootstrap procedures are presented. A first procedure accounts for stochastic variability whereas a second procedure accounts for both stochastic variability and data uncertainty. The analyses indicate that the susceptible population has a higher probability of illness at low dose levels when the combination pathogen-food matrix is extremely virulent and at high dose levels when the combination is less virulent. Furthermore, the analyses suggest that immunity exists in the normal population but not in the susceptible population. ; This study has been carried out with the financial support of the Belgian Federal Public Service of Health, Food Chain Safety, and Environment research programme (R-04/003-Metzoon) "Development of a methodology for quantitative assessment of zoonotic risks in Belgium applied to the 'Salmonella in pork' model." The authors gratefully acknowledge the financial support from the IAP Research Network P6/03 of the Belgian government (Belgian Science Policy). Marc Aerts acknowledges support from FWO-Vlaanderen Research Project G.0151.05.
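In generic form (notation introduced here only for illustration), a fractional-polynomial dose-illness GLMM of the type described above is

\[
\mathrm{logit}\,P(\text{illness} \mid d, i) = \beta_0 + \beta_1\, d^{(p_1)} + \beta_2\, d^{(p_2)}
+ \gamma\, S_i + b_{\mathrm{serovar}(i)} + b_{\mathrm{matrix}(i)},
\]

where \(d^{(p)} = d^{p}\) for \(p \neq 0\) and \(\log d\) for \(p = 0\), the powers \((p_1, p_2)\) are selected from the usual fractional-polynomial grid \(\{-2, -1, -0.5, 0, 0.5, 1, 2, 3\}\), \(S_i\) is a fixed effect indicating the susceptible population, and the serovar and food-matrix terms are normal random effects; constraints on the coefficients are one way to enforce the shape properties of the Teunis-type dose-illness models.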
In repeated dose-toxicity studies, many outcomes are repeatedly measured on the same animal to study the toxicity of a compound of interest. This is only one example in which one is confronted with the analysis of many outcomes, possibly of a different type. Probably the most common situation is that of an amalgamation of continuous and categorical outcomes. A possible approach towards the joint analysis of two longitudinal outcomes of a different nature is the use of random-effects models (Models for Discrete Longitudinal Data. Springer Series in Statistics. Springer: New York, 2005). Although a random-effects model can easily be extended to jointly model many outcomes of a different nature, computational problems arise as the number of outcomes increases. To avoid maximization of the full likelihood expression, Fieuws and Verbeke (Biometrics 2006; 62:424-431) proposed a pairwise modeling strategy in which all possible pairs are modeled separately, using a mixed model, yielding several different estimates for the same parameters. These latter estimates are then combined into a single set of estimates. Also inference, based on pseudo-likelihood principles, is indirectly derived from the separate analyses. In this paper, we extend the approach of Fieuws and Verbeke (Biometrics 2006; 62:424-431) in two ways: the method is applied to different types of outcomes and the full pseudo-likelihood expression is maximized at once, leading directly to unique estimates as well as direct application of pseudo-likelihood inference. This is very appealing when one is interested in hypothesis testing. The method is applied to data from a repeated dose-toxicity study designed for the evaluation of the neurofunctional effects of a psychotropic drug. The relative merits of both methods are discussed. Copyright (c) 2008 John Wiley & Sons, Ltd. ; We gratefully acknowledge the support from the Institute for the Promotion of Innovation by Science and Technology (IWT) in Flanders, Belgium, and from the IAP Research Network no. P5/24 of the Belgian Government (Belgian Science Policy).
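For orientation, the pairwise pseudo-likelihood idea can be summarized as follows (generic notation, assumed here rather than quoted): for outcomes \(Y_1, \dots, Y_K\) the full joint likelihood is replaced by

\[
p\ell(\boldsymbol{\Theta}) = \sum_{k < l} \log L_{kl}\big(\boldsymbol{\theta}_{kl}; \mathbf{y}_k, \mathbf{y}_l\big),
\]

where each \(L_{kl}\) is the likelihood of a bivariate mixed model for the pair \((Y_k, Y_l)\). Fieuws and Verbeke fit each pair separately and combine the resulting estimates afterwards, whereas the extension described above maximizes \(p\ell(\boldsymbol{\Theta})\) at once, so that every parameter receives a unique estimate and pseudo-likelihood inference (e.g., sandwich-type standard errors) applies directly, which is convenient for hypothesis testing.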
Author summary The generation and serial intervals are epidemiological quantities used to describe and predict an ongoing epidemic outbreak. These quantities are related to the contact pattern of individuals, since infection events can take place if infectious and susceptible individuals have a contact. Therefore, intervention measures that reduce the interactions between members of the population are expected to affect both the realized generation and serial intervals. For the current COVID-19 pandemic unprecedented interventions have been adopted worldwide, e.g., strict lockdown, isolation and quarantine, which influence the realized value of generation and serial intervals. The extent of this effect depends on the efficacy of the control measure in place, on the relationship between symptom onset and infectiousness, and on the proportion of infectious individuals that can be detected. To get more insight into this, we present an investigation that highlights the effect of quarantine and isolation on realized generation and serial intervals. In particular, we show that not only their variances but also their mean values can differ, suggesting that the use of the mean serial interval as a proxy for the mean generation time can lead to biased estimates of epidemiological quantities. The SARS-CoV-2 pathogen is currently spreading worldwide and its propensity for presymptomatic and asymptomatic transmission makes it difficult to control. The control measures adopted in several countries aim at isolating individuals once diagnosed, limiting their social interactions and consequently their transmission probability. These interventions, which have a strong impact on the disease dynamics, can affect the inference of the epidemiological quantities. We first present a theoretical explanation of the effect caused by non-pharmaceutical intervention measures on the mean serial and generation intervals. Then, in a simulation study, we vary the assumed efficacy of control measures and quantify the effect on the mean and variance of realized generation and serial intervals. The simulation results show that the realized serial and generation intervals both depend on control measures and their values contract according to the efficacy of the intervention strategies. Interestingly, the mean serial interval differs from the mean generation time. The deviation between these two values depends on two factors. First, the number of undiagnosed infectious individuals. Second, the relationship between infectiousness, symptom onset and timing of isolation. Similarly, the standard deviations of realized serial and generation intervals do not coincide, with the former shorter than the latter on average. The findings of this study are directly relevant to estimates performed for the current COVID-19 pandemic. In particular, the effective reproduction number is often inferred using both daily incidence data and the generation interval. Failing to account for either contraction or mis-specification by using the serial interval could lead to biased estimates of the effective reproduction number. Consequently, this might affect the choices made by decision makers when deciding which control measures to apply based on the value of this quantity. ; A.T. acknowledges support from the special research fund of the University of Antwerp.
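A compact way to see why the two intervals can differ (standard notation, not specific to this paper): if \(G\) is the generation time between an infector and an infectee and \(I_1\), \(I_2\) are their respective incubation periods, the serial interval is

\[
S = G + I_2 - I_1,
\]

so \(E[S] = E[G]\) holds when the incubation periods of infector and infectee have the same distribution and are independent of \(G\). Isolation and quarantine triggered by symptom onset truncate long generation times and make \(G\) dependent on \(I_1\), so the realized means and standard deviations of \(S\) and \(G\) contract and drift apart, which is the effect quantified in the simulation study.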
This work also received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (NH, AT: grant number 682540 - TransMID project; NH, PL: grant number 101003688 - EpiPose project). P.L. gratefully acknowledges support from the Fonds voor Wetenschappelijk Onderzoek (FWO) via postdoctoral fellowship 1242021N. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.