Aufsatz(elektronisch)Januar 2019

When 4 ≈ 10,000: The Power of Social Science Knowledge in Predictive Performance

In: Socius: sociological research for a dynamic world, Band 5

Verfügbarkeit an Ihrem Standort wird überprüft

Abstract

Computer science has devised leading methods for predicting variables; can social science compete? The author sets out a social scientific approach to the Fragile Families Challenge. Key insights included new variables constructed according to theory (e.g., a measure of shame relating to hardship), lagged values of the target variables, using predicted values of certain outcomes to inform others, and validated scales rather than individual variables. The models were competitive: a four-variable logistic regression model was placed second for predicting layoffs, narrowly beaten by a model using all the available variables (>10,000) and an ensemble of algorithms. Similarly, a relatively small random forest model (25 variables) was ranked seventh in predicting material hardship. However, a similar approach overfitted the prediction of grit. Machine learning approaches proved superior to linear regression for modeling the continuous outcomes. Overall, social scientists can contribute to predictive performance while benefiting from learning more about data science methods.

Sprachen

Englisch

Verlag

SAGE Publications

ISSN: 2378-0231

DOI

10.1177/2378023118811774

Problem melden

Wenn Sie Probleme mit dem Zugriff auf einen gefundenen Titel haben, können Sie sich über dieses Formular gern an uns wenden. Schreiben Sie uns hierüber auch gern, wenn Ihnen Fehler in der Titelanzeige aufgefallen sind.