Relational event models (REMs) are the primary choice for the analysis of relational-event network data. However, the standard REM assumes static parameters, which hinders the modeling of time-varying dynamics. This assumption may be too restrictive in real-life scenarios, making a model that allows for time-varying parameters more valuable. We introduce a state-space extension of the relational event model to tackle this problem. The model has three main attributes. First, it provides a statistical framework for the temporal change of the parameters. Second, it enables forecasting of future parameter values, which can be used to simulate new networks that account for temporal dynamics in out-of-sample predictions. Third, it requires smaller data structures to be loaded into computer memory than the standard REM, which makes the model easily scalable to large networks. We conduct empirical analyses of bike-sharing data, corporate communications, and interactions among socio-political actors to illustrate the model's usage and applicability.
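As a rough sketch of the idea (an illustrative parameterization, not necessarily the authors' exact specification), a state-space REM lets the effects in the log-linear event rate evolve as a latent random walk:

```latex
% Observation: log-linear event rate for dyad (i,j) at time t
\lambda_{ij}(t) = \exp\{ x_{ij}(t)^{\top} \beta_t \},
% State: time-varying effects follow a random walk
\beta_t = \beta_{t-1} + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \Sigma_\eta).
```

Forecasts of future parameter values then follow from iterating the state equation forward (e.g., with a Kalman filter or smoother), which is what enables out-of-sample network simulation.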
Large relational-event history data stemming from large networks are becoming increasingly available due to recent technological developments (e.g., digital communication, online databases). This opens many new doors to learning about complex interaction behavior between actors in temporal social networks. The relational event model has become the gold standard for relational event history analysis. Currently, however, the main bottleneck in fitting relational event models is computational, in the form of memory storage limitations and computational complexity. Relational event models are therefore mainly used for relatively small data sets, while larger, more interesting data sets, including multilevel data structures and relational event data streams, cannot be analyzed on standard desktop computers. This paper addresses this problem by developing approximation algorithms based on meta-analysis methods that can fit relational event models significantly faster while avoiding these computational issues. In particular, meta-analytic approximations are proposed for analyzing streams of relational event data, multilevel relational event data, and potentially combinations thereof. The accuracy and statistical properties of the methods are assessed using numerical simulations. Furthermore, real-world data are used to illustrate the potential of the methodology to study social interaction behavior in an organizational network and interaction behavior among political actors. The algorithms are implemented in the publicly available R package 'remx'.
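To make the pooling step concrete, a standard fixed-effect meta-analytic approximation (one plausible instance of the approach described here) combines per-batch REM estimates by inverse-variance weighting, so that only one batch of events needs to sit in memory at a time:

```latex
\hat{\beta}_{\mathrm{pooled}}
  = \Big( \sum_{k=1}^{K} \hat{V}_k^{-1} \Big)^{-1}
    \sum_{k=1}^{K} \hat{V}_k^{-1} \hat{\beta}_k,
\qquad
\widehat{\mathrm{Var}}\big(\hat{\beta}_{\mathrm{pooled}}\big)
  = \Big( \sum_{k=1}^{K} \hat{V}_k^{-1} \Big)^{-1},
```

where \hat{\beta}_k and \hat{V}_k are the estimate and covariance matrix from the REM fitted to the k-th batch of the data stream (or the k-th network in a multilevel design).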
In: European journal of work and organizational psychology: the official journal of The European Association of Work and Organizational Psychology, Vol. 26, Issue 3, pp. 337-355
Relational event network data are becoming increasingly available. Consequently, statistical models for such data have also surfaced. These models mainly focus on the analysis of single networks, while in many applications multiple independent event sequences are observed, which are likely to display similar social interaction dynamics. Furthermore, statistical methods for testing hypotheses about social interaction behavior are underdeveloped. Therefore, the contribution of the current paper is twofold. First, we present a multilevel extension of the dynamic actor-oriented model, which allows researchers to model sender and receiver processes separately. The multilevel formulation enables principled probabilistic borrowing of information across networks to accurately estimate the drivers of social dynamics. Second, a flexible methodology is proposed to test hypotheses about common and heterogeneous social interaction drivers across relational event sequences. Social interaction data between children and teachers in classrooms are used to showcase the methodology.
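As an illustrative sketch (notation assumed here, following the usual actor-oriented decomposition into a rate model and a conditional choice model), the sender and receiver sub-models for sequence k might be written as:

```latex
\lambda_i^{(k)}(t) = \exp\{ s_i(t)^{\top} \gamma_k \},
\qquad
\Pr(j \mid i, t) =
  \frac{\exp\{ r_{ij}(t)^{\top} \beta_k \}}
       {\sum_{h \neq i} \exp\{ r_{ih}(t)^{\top} \beta_k \}},
\qquad
\gamma_k \sim \mathcal{N}(\mu_\gamma, \Sigma_\gamma), \;
\beta_k \sim \mathcal{N}(\mu_\beta, \Sigma_\beta),
```

where the population distributions on the network-specific parameters \gamma_k and \beta_k are what allow information to be borrowed across the independent event sequences.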
In relational event networks, endogenous statistics are used to summarize the past activity between actors. Typically, it is assumed that past events carry equal weight in the social interaction rate in the (near) future, regardless of the time that has elapsed since they were observed. It is generally unrealistic, however, to assume that recent events affect the current event rate to the same degree as long-past events. Alternatively, one may consider using a prespecified decay function with a prespecified rate of decay. A problem then is that the chosen decay function could be misspecified, yielding biased results and incorrect conclusions. In this paper, we introduce three parametric weight decay functions (exponential, linear, and one-step) that can be embedded in a relational event model. A statistical method is presented to decide which memory decay function and memory parameter best fit the observed sequence of events. We present simulation studies showing that bias arises in the estimated effects of the statistics whenever the decay function and the memory parameter are not properly estimated, and demonstrating the ability to test different memory models against each other using the Bayes factor. Finally, we apply the methodology to two empirical case studies.
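For concreteness, one plausible parameterization of the three decay families (the authors' exact functional forms may differ) weights an event observed at time s when computing a statistic at time t > s as:

```latex
w_{\mathrm{exp}}(t-s) = e^{-\theta (t-s)}, \qquad
w_{\mathrm{lin}}(t-s) = \max\{ 1 - \theta (t-s),\, 0 \}, \qquad
w_{\mathrm{step}}(t-s) = \mathbf{1}\{ t-s \le \theta \},
```

where \theta is the memory parameter that is estimated from the data rather than fixed a priori.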
In relational event networks, the tendency for actors to interact with each other depends greatly on the past interactions between the actors in a social network. Both the volume of past interactions and the time that has elapsed since those interactions affect the actors' decisions to interact with other actors in the network. Recently occurred events may have a stronger influence on current interaction behavior than events that occurred long ago, a phenomenon known as "memory decay". Previous studies either predefined a short-run and a long-run memory or fixed a parametric exponential memory decay using a predefined half-life period. In real-life relational event networks, however, it is generally unknown how the influence of past events fades as time goes by. For this reason, it is not advisable to fix the memory decay in an ad hoc manner; instead, the shape of the memory decay should be learned from the observed data. In this paper, a novel semi-parametric approach based on Bayesian Model Averaging is proposed for learning the shape of the memory decay without requiring any parametric assumptions. The method is applied to relational event history data among socio-political actors in India, and a comparison with other relational event models based on predefined memory decays is provided.
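Schematically (notation assumed), Bayesian Model Averaging replaces a single prespecified decay by a posterior-weighted mixture over a set of candidate memory models M_1, ..., M_M:

```latex
\bar{w}(t-s) = \sum_{m=1}^{M} p(M_m \mid \mathcal{D})\, w_m(t-s),
```

so the effective shape of the memory decay is inferred from the observed event history \mathcal{D} instead of being imposed in advance.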
The network autocorrelation model has been the workhorse for estimating and testing the strength of theories of social influence in a network. In many network studies, different types of social influence are present simultaneously and can be modeled using various connectivity matrices. Often, researchers have expectations about the order of strength of these different influence mechanisms. However, currently available methods cannot be applied to test a specific order of social influence in a network. In this article, the authors first present flexible Bayesian techniques for estimating network autocorrelation models with multiple network autocorrelation parameters. Second, they develop new Bayes factors that allow researchers to test hypotheses with order constraints on the network autocorrelation parameters in a direct manner. Concomitantly, the authors give efficient algorithms for sampling from the posterior distributions and for computing the Bayes factors. Simulation results suggest that frequentist properties of Bayesian estimators on the basis of noninformative priors for the network autocorrelation parameters are overall slightly superior to those based on maximum likelihood estimation. Furthermore, when testing statistical hypotheses, the Bayes factors show consistent behavior with evidence for a true data-generating hypothesis increasing with the sample size. Finally, the authors illustrate their methods using a data set from economic growth theory.
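The model under discussion is the network autocorrelation model with multiple connectivity matrices, commonly written as:

```latex
\mathbf{y} = \sum_{r=1}^{R} \rho_r W_r \mathbf{y} + X \boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2 I),
```

where each W_r encodes one influence mechanism and an order-constrained hypothesis takes a form such as \rho_1 > \rho_2 > \rho_3 > 0.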
Currently available (classical) testing procedures for the network autocorrelation can only be used to falsify a precise null hypothesis of no network effect. Classical methods can be used neither for quantifying evidence for the null nor for testing multiple hypotheses simultaneously. This article presents flexible Bayes factor testing procedures that do not have these limitations. First, we propose Bayes factors based on an empirical prior and on a uniform prior for the network effect, respectively. Next, we develop a fractional Bayes factor in which a default prior is automatically constructed. Simulation results suggest that the first two Bayes factors show superior performance, and these are the Bayes factors we recommend. We apply the recommended Bayes factors to three data sets from the literature and compare the results to those from classical analyses using p values. R code for efficient computation of the Bayes factors is provided.
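For reference, the Bayes factor for the precise null of no network effect against an alternative with prior \pi on \rho is the ratio of marginal likelihoods:

```latex
BF_{01} = \frac{p(\mathbf{y} \mid H_0)}{p(\mathbf{y} \mid H_1)}
        = \frac{p(\mathbf{y} \mid \rho = 0)}
               {\int p(\mathbf{y} \mid \rho)\, \pi(\rho)\, d\rho},
```

so, unlike a p value, BF_{01} can quantify evidence in favor of the null, and the choice of \pi (empirical, uniform, or automatically constructed default) is exactly what distinguishes the procedures compared above.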
The application of latent class (LC) analysis involves evaluating the LC model using goodness-of-fit statistics. To assess the misfit of a specified model, say with the Pearson chi-squared statistic, a p-value can be obtained using an asymptotic reference distribution. However, asymptotic p-values are not valid when the sample size is not large and/or the analyzed contingency table is sparse. Another problem is that for various other conceivable global and local fit measures, asymptotic distributions are not readily available. An alternative way to obtain the p-value for the statistic of interest is to construct its empirical reference distribution using resampling techniques such as the parametric bootstrap or the posterior predictive check (PPC). In the current paper, we show how to apply the parametric bootstrap and two versions of the PPC to obtain empirical p-values for a number of commonly used global and local fit statistics within the context of LC analysis. The main difference between the PPC using test statistics and the parametric bootstrap is that the former takes parameter uncertainty into account. The PPC using discrepancies has the advantage of being computationally much less intensive than the other two resampling methods. In a Monte Carlo study we evaluated the Type I error rates and power of these resampling methods when used for global and local goodness-of-fit testing in LC analysis. Results show that both the bootstrap and the PPC using test statistics are generally good alternatives to asymptotic p-values and can also be used when (asymptotic) distributions are not known. Nominal Type I error rates were not met when the sample size was small and the contingency table had many cells. Overall, the PPC using test statistics was somewhat more conservative than the parametric bootstrap. We also replicated previous research suggesting that the Pearson χ² statistic should in many cases be preferred over the likelihood-ratio G² statistic. Power to reject a model for which the number of LCs was one less than in the population was very high, unless the sample size was small. When the contingency tables are very sparse, the total bivariate residual (TBVR) statistic, which is based on bivariate relationships, still had very high power, signifying its usefulness in assessing model fit.
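A minimal R sketch of the parametric-bootstrap p-value, under stated assumptions: `fit_lc()`, `simulate_lc()`, and `pearson_x2()` are hypothetical placeholders for an LC estimation routine, its model-based simulator, and the Pearson chi-squared computation, respectively.

```r
## Parametric bootstrap p-value for an LC goodness-of-fit statistic.
## fit_lc(), simulate_lc(), and pearson_x2() are hypothetical helpers,
## standing in for whatever LC software is used.
bootstrap_p <- function(data, n_classes, B = 500) {
  fit   <- fit_lc(data, n_classes)              # fit LC model to observed data
  t_obs <- pearson_x2(fit, data)                # observed fit statistic
  t_boot <- replicate(B, {
    y_rep   <- simulate_lc(fit, n = nrow(data)) # simulate data from the fitted model
    fit_rep <- fit_lc(y_rep, n_classes)         # refit the model to the replicate
    pearson_x2(fit_rep, y_rep)                  # recompute the statistic
  })
  mean(t_boot >= t_obs)                         # empirical p-value
}
```

The PPC using test statistics differs only in that each replicate is simulated from parameter values drawn from the posterior rather than from the point estimate, which is how parameter uncertainty enters.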
The current article reports on the second large-scale prevalence study of transgressive behavior in sport in the Netherlands and is a follow-up of an earlier, comparable prevalence study from 2015. Using a dedicated and customized online questionnaire, approximately 4000 adults who met the inclusion criteria (18 to 50 years old and having played sports in an organized context during childhood in the Netherlands) were surveyed about their experiences of childhood psychological, physical, and sexual transgressive behavior while playing sports. The survey showed that 71.7% experienced some form of transgressive behavior as a child, and 48.6% of these events also made an impact (in other words, were significant at the time they took place). The degree of impact an event made is also related to its severity. Severe emotional transgression events occurred for 22% of the youth athletes, severe physical assault events for 12.7%, and severe sexual assault events for 6.9%. Disabled athletes, and those competing at national and international levels, reported more experiences of transgressive behavior in sport. The results are consistent with former research and indicate the need for structural attention to creating a safe sports climate.
This Teacher's Corner paper introduces Bayesian evaluation of informative hypotheses for structural equation models, using the free open-source R packages bain, for Bayesian informative hypothesis testing, and lavaan, a widely used SEM package. The introduction provides a brief non-technical explanation of informative hypotheses, the statistical underpinnings of Bayesian hypothesis evaluation, and the bain algorithm. Three tutorial examples demonstrate informative hypothesis evaluation in the context of common types of structural equation models: 1) confirmatory factor analysis, 2) latent variable regression, and 3) multiple group analysis. We discuss hypothesis formulation, the interpretation of Bayes factors and posterior model probabilities, and sensitivity analysis.
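A minimal usage sketch (the model, data set, and hypothesis are chosen for illustration; argument details may vary across bain versions) of combining lavaan and bain might look like this:

```r
## Illustrative only: a CFA in lavaan followed by an informative-hypothesis
## test in bain; the hypothesis string is an example assumption.
library(lavaan)
library(bain)

model <- 'visual  =~ x1 + x2 + x3
          textual =~ x4 + x5 + x6'
fit <- cfa(model, data = HolzingerSwineford1939)

# Test an order hypothesis on (standardized) loadings against its complement
results <- bain(fit, "visual=~x2 > visual=~x3", standardize = TRUE)
print(results)  # Bayes factors and posterior model probabilities
```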