Kernel classification with missing data and the choice of smoothing parameters
In: Statistical Papers, Volume 60, Issue 5, pp. 1487-1513
ISSN: 1613-9798
We present a novel method for the estimation of variance parameters in generalised linear mixed models. The method has its roots in the work of Harville (J Am Stat Assoc 72(358):320-338, 1977), but it can deal with models in which the precision matrix of the random effect vector is linear in the inverse of the variance parameters (i.e., the precision parameters). We call the method SOP (separation of overlapping precision matrices). SOP is based on applying the method of successive approximations to easy-to-compute estimate updates of the variance parameters. These estimate updates have an appealing form: they are the ratio of a (weighted) sum of squares to a quantity related to effective degrees of freedom. We provide necessary and sufficient conditions for these estimates to be strictly positive. An important application field of SOP is penalised regression estimation of models where multiple quadratic penalties act on the same regression coefficients. We discuss in detail two such models: penalised splines for locally adaptive smoothness and for hierarchical curve data. Several data examples in these settings are presented.
This research was supported by the Basque Government through the BERC 2018-2021 program and by the Spanish Ministry of Economy and Competitiveness MINECO through BCAM Severo Ochoa excellence accreditation SEV-2013-0323 and through projects MTM2017-82379-R funded by (AEI/FEDER, UE) with acronym "AFTERAM", MTM2014-52184-P and MTM2014-55966-P. The MRI/DTI data were collected at Johns Hopkins University and the Kennedy-Krieger Institute. We are grateful to Pedro Caro and Iain Currie for useful discussions, to Martin Boer and Cajo ter Braak for their detailed reading of the paper and their many suggestions, and to Bas Engel for sharing his knowledge with us. We are also grateful to the two peer referees for their constructive comments on the paper.
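As a rough illustration of the kind of update described in the abstract, the following Python sketch applies successive approximations to a single-penalty ridge model, where each variance estimate is a sum of squares divided by an effective-degrees-of-freedom term. The real SOP method handles several overlapping precision matrices, so the model, function names and simulated data below are illustrative assumptions only.

import numpy as np

def fit_ridge_variance_components(X, y, n_iter=50, tol=1e-8):
    """Toy successive-approximation scheme for a single variance parameter.

    Sketches updates of the form (sum of squares) / (effective degrees of
    freedom); the actual SOP method covers multiple overlapping precision
    matrices acting on the same coefficients.
    """
    n, p = X.shape
    lam = 1.0                                     # ratio sigma2_e / sigma2_b
    XtX, Xty = X.T @ X, X.T @ y
    for _ in range(n_iter):
        b = np.linalg.solve(XtX + lam * np.eye(p), Xty)
        ed = np.trace(np.linalg.solve(XtX + lam * np.eye(p), XtX))  # effective df
        sigma2_b = (b @ b) / ed                   # sum of squares / effective df
        sigma2_e = np.sum((y - X @ b) ** 2) / (n - ed)
        lam_new = sigma2_e / sigma2_b
        if abs(lam_new - lam) < tol * lam:
            lam = lam_new
            break
        lam = lam_new
    return b, sigma2_b, sigma2_e

# Example usage with simulated data (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(scale=0.5, size=10) + rng.normal(size=200)
b_hat, s2b, s2e = fit_ridge_variance_components(X, y)
print(f"sigma2_b = {s2b:.3f}, sigma2_e = {s2e:.3f}")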
BASE
We use smoothing splines to introduce prior information in nonparametric models. The type of information we consider is based on the belief that the regression curve is similar in shape to a parametric model. The resulting estimator is a convex sum of a fit to data and the parametric model, and it can be seen as shrinkage of the smoothing spline toward the parametric model. We analyze its rates of convergence and we provide some asymptotic distribution theory. Because the asymptotic distribution is intractable, we propose to carry out inference with the estimator by using the method proposed by Politis and Romano (1994, Annals of Statistics 22, 2031–2050). We also propose a data-driven technique to compute the smoothing parameters that provides asymptotically optimal estimates. Finally, we apply our results to the estimation of a model of investment behavior of the U.S. telephone industry and we present some Monte Carlo results.
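A minimal sketch of the shrinkage idea described above, assuming a smoothing spline from SciPy, a straight line as the parametric model and a fixed shrinkage weight; in the paper the smoothing parameters are chosen by a data-driven rule, so every concrete choice below is an illustrative assumption.

import numpy as np
from scipy.interpolate import UnivariateSpline

# Simulated data: roughly linear curve with a small bump (illustrative only).
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 150))
y = 2 * x + 0.3 * np.exp(-((x - 0.5) / 0.1) ** 2) + rng.normal(scale=0.2, size=x.size)

# Nonparametric fit (smoothing spline) and parametric "prior" fit (straight line).
spline = UnivariateSpline(x, y, s=6.0)       # smoothing level ~ n * noise variance
beta = np.polyfit(x, y, deg=1)               # parametric model: a + b*x
parametric = np.polyval(beta, x)

# Convex combination: shrink the spline fit toward the parametric model.
w = 0.4                                      # 0 = pure spline, 1 = pure parametric; fixed here for illustration
estimate = (1 - w) * spline(x) + w * parametric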
BASE
In: NYU Stern School of Business
SSRN
Working paper
The joint modelling of mortality rates for multiple populations has gained increasing popularity in areas such as government planning and insurance pricing. Sub-groups of a population often share similar mortality features, with short-term deviations from the common trend. Recent studies indicate that the exponential smoothing state space (ETS) model can produce outstanding prediction performance, but it fails to guarantee consistency across neighbouring ages. In addition, single-population models such as the well-known Lee-Carter (LC) model may produce divergent forecasts for different populations in the long run and thus lack the so-called coherence property. This study extends the original ETS model to a two-population version (2-ETS) and imposes a smoothing penalisation scheme to reduce inconsistency of forecasts across adjacent ages. The exponential smoothing parameters in the 2-ETS model are fitted by a Fourier functional form to reduce dimensionality and thus improve estimation efficiency. We evaluate the performance of the proposed model in an empirical study using Australian female and male population data. Our results demonstrate the superiority of the 2-ETS model over the LC and ETS models, as well as over two multi-population methods, the augmented common factor model (LL) and the coherent functional data model (CFDM), in terms of forecast accuracy and coherence.
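The following sketch only illustrates how a Fourier functional form can tie age-specific smoothing parameters together, using plain simple exponential smoothing on simulated log mortality rates; it is not the 2-ETS model itself, and the basis, logistic link and data below are assumptions made for the example.

import numpy as np

def fourier_alpha(ages, coefs):
    """Map ages to smoothing parameters via a low-order Fourier expansion,
    squashed into (0, 1) with a logistic link (an illustrative choice)."""
    t = (ages - ages.min()) / (ages.max() - ages.min())
    basis = np.column_stack([np.ones_like(t),
                             np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
    return 1.0 / (1.0 + np.exp(-(basis @ coefs)))

def ses_forecast(series, alpha):
    """Simple exponential smoothing of one age-specific series; returns the
    one-step-ahead forecast (the smoothed level)."""
    level = series[0]
    for obs in series[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level

# Illustrative mortality-rate surface: 100 ages x 30 years, simulated.
rng = np.random.default_rng(2)
ages = np.arange(0, 100)
rates = np.exp(-8 + 0.08 * ages[:, None] + 0.01 * np.arange(30)[None, :]
               + rng.normal(scale=0.02, size=(100, 30)))

# Three Fourier coefficients determine all 100 age-specific smoothing parameters.
alphas = fourier_alpha(ages, coefs=np.array([0.5, 0.3, -0.2]))
forecasts = np.array([ses_forecast(np.log(rates[i]), alphas[i]) for i in range(len(ages))])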
BASE
In this paper, we investigate the effect of pre-smoothing on model selection. Christobal et al. showed the beneficial effect of pre-smoothing on estimating the parameters in a linear regression model. Here, in a regression setting, we show that smoothing the response data prior to model selection by Akaike's information criterion can lead to an improved selection procedure. The bootstrap is used to control the magnitude of the random error structure in the smoothed data. The effect of pre-smoothing on model selection is shown in simulations. The method is illustrated in a variety of settings, including the selection of the best fractional polynomial in a generalized linear model.
We also gratefully acknowledge the support from the IAP research network nr P5/24 of the Belgian Government (Belgian Science Policy). The research of Niel Hens has been financially supported by the Fund of Scientific Research (FWO, Research Grant # G039304) of Flanders, Belgium.
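A minimal sketch of the pre-smoothing idea, assuming a smoothing spline for the pre-smoothing step and polynomial regressions as the candidate models; the bootstrap correction for the error structure used in the paper is omitted, and all names and tuning values below are illustrative.

import numpy as np
from scipy.interpolate import UnivariateSpline

def aic_gaussian(y, fitted, n_params):
    """Gaussian AIC with the error variance profiled out."""
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n) + 2 * n_params

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(-1, 1, 120))
y = 1 + 2 * x - 1.5 * x ** 3 + rng.normal(scale=0.4, size=x.size)

# Pre-smooth the responses before model selection.
y_smooth = UnivariateSpline(x, y, s=len(x) * 0.16)(x)   # smoothing level ~ n * noise variance

# Compare polynomial degrees by AIC on the raw and on the pre-smoothed responses.
for label, target in [("raw", y), ("pre-smoothed", y_smooth)]:
    aics = []
    for degree in range(1, 6):
        fitted = np.polyval(np.polyfit(x, target, degree), x)
        aics.append(aic_gaussian(target, fitted, degree + 2))  # coefficients + variance
    print(label, "selected degree:", int(np.argmin(aics)) + 1)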
BASE
SSRN
Working paper
In: NBER working paper series 16957
"This paper studies the dynamics of portfolio rebalancing and consumption smoothing in the presence of non-convex portfolio adjustment costs. The goal is to understand a household's response to income and return shocks. The model includes the choice of two assets: one riskless without adjustment costs and a second risky asset with adjustment costs. With these multiple assets, a household can buffer some income fluctuations through the asset without adjustment costs and engage in costly portfolio rebalancing less frequently. We estimate both preference parameters and portfolio adjustment costs. The estimates are used for evaluating consumption smoothing and portfolio adjustment in the face of income and return shocks"--National Bureau of Economic Research web site
In: Financial Mathematics and Fintech
This open access book covers the most cutting-edge research topics in post-quantum cryptography. Its main purpose is to focus on the computational complexity theory of lattice ciphers, especially Ajtai's reduction principle, in order to fill a gap: work on post-quantum ciphers concentrates on the implementation of encryption and decryption algorithms, while the theoretical proofs remain insufficient. In Chapters 3, 4 and 6, the author introduces the theory and techniques of the LWE distribution, LWE ciphers and homomorphic encryption in detail. When random-analysis tools are used, there is a problem of ambiguity in both definitions and algorithms. The book's distinguishing feature is its use of probability distributions to give rigorous mathematical definitions and demonstrations of various unclear or imprecise expressions, turning the material into a rigorous theoretical system for classroom teaching and dissemination. Chapters 5 and 7 further expand and refine the theory of cyclic lattices, ideal lattices and generalized NTRU cryptography. The book is intended as a professional text for graduate students majoring in mathematics and cryptography, and as a reference for scientific and technical personnel engaged in cryptography research.
In: Sociological Methods and Research, Volume 33, Issue 1, pp. 88-117
ISSN: 1552-8294
When the sample size is small compared to the number of cells in a contingency table, maximum likelihood estimates of logit parameters and their associated standard errors may not exist or may be biased. This problem is usually solved by "smoothing" the estimates, assuming a certain prior distribution for the parameters. This article investigates the performance of point and interval estimates obtained by assuming various prior distributions. The authors focus on two logit parameters of a 2 × 2 × 2 table: the interaction effect of two predictors on a response variable and the main effect of one of two predictors on a response variable, under the assumption that the interaction effect is zero. The results indicate the superiority of the posterior mode to the posterior mean.
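As a hedged illustration of why the posterior mode and posterior mean can differ in sparse tables, the sketch below evaluates both for a single logit effect with a normal prior on a grid; the paper works with 2 × 2 × 2 tables and other priors, so the data, prior standard deviation and grid used here are assumptions for the example only.

import numpy as np

# Sparse 2x2 slice with one empty cell: the MLE of the log-odds effect is infinite,
# while a prior keeps the estimate finite.
successes = np.array([1, 4])     # "yes" responses for x = 0 and x = 1
failures = np.array([5, 0])      # "no"  responses for x = 0 and x = 1
x = np.array([0.0, 1.0])
prior_sd = 2.0                   # standard deviation of the normal prior on each logit parameter

# Evaluate the posterior of (intercept, effect) on a grid.
b0 = np.linspace(-6.0, 6.0, 241)
b1 = np.linspace(-6.0, 10.0, 321)
B0, B1 = np.meshgrid(b0, b1, indexing="ij")
eta = B0[..., None] + B1[..., None] * x                      # linear predictor per cell
loglik = np.sum(successes * eta - (successes + failures) * np.log1p(np.exp(eta)), axis=-1)
logprior = -(B0 ** 2 + B1 ** 2) / (2 * prior_sd ** 2)
post = np.exp(loglik + logprior - (loglik + logprior).max())
post /= post.sum()

mode_effect = B1.flat[np.argmax(post)]     # posterior mode of the effect
mean_effect = np.sum(post * B1)            # posterior mean of the effect
print(f"posterior mode: {mode_effect:.2f}, posterior mean: {mean_effect:.2f}")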
The purpose of this study is to predict the arrival of foreign tourists on the island of Bali using the Exponential Smoothing method. The research is quantitative and descriptive. The data used are foreign tourist arrivals by air and sea, taken from the Central Statistics Agency, and are managed in Microsoft Excel. The Solver tool in Microsoft Excel is used to find the smoothing parameters that give the lowest RMSE. The data show both trend and seasonal patterns, so the most suitable Exponential Smoothing method is Triple Exponential Smoothing. The results indicate that foreign tourist arrivals on the island of Bali are predicted to increase in 2020, although not substantially. These results are expected to help the government of the island of Bali and related agencies in planning and decision making to overcome the crisis on the island of Bali caused by the tourism sector.
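A sketch of triple (Holt-Winters) exponential smoothing in Python, with the smoothing parameters chosen by minimising the in-sample error, which plays the same role as the Excel Solver in the study; the monthly series below is simulated, since the official arrivals data are not reproduced here.

import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Simulated monthly arrivals with trend and yearly seasonality (illustrative only).
rng = np.random.default_rng(4)
months = pd.date_range("2015-01", periods=60, freq="MS")
arrivals = (400_000 + 2_000 * np.arange(60)
            + 50_000 * np.sin(2 * np.pi * np.arange(60) / 12)
            + rng.normal(scale=10_000, size=60))
series = pd.Series(arrivals, index=months)

# Triple (Holt-Winters) exponential smoothing: level, trend and seasonal components.
model = ExponentialSmoothing(series, trend="add", seasonal="add", seasonal_periods=12)
fit = model.fit(optimized=True)          # smoothing parameters chosen by minimising the in-sample SSE

forecast = fit.forecast(12)              # one-year-ahead forecast
rmse = np.sqrt(np.mean(fit.resid ** 2))  # in-sample RMSE, analogous to the Solver target
print(f"in-sample RMSE: {rmse:,.0f}")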
BASE
In: Natural Hazards and Earth System Sciences: NHESS, Volume 23, Issue 1, pp. 91-106
ISSN: 1684-9981
Abstract. Spatial mapping is one of the most useful methods to display information about the seismic parameters of a given area. As with b-value time series, there is a certain arbitrariness in the function selected as the smoothing kernel (which plays the same role as the window size in time series). We propose a new method for the calculation of the smoothing kernel as well as its parameters. Instead of using the spatial cell-event distance, we study the distance between events (the event-event distance) in order to calculate the smoothing function, as this distance distribution carries information about the event distribution and the seismic sources. We examine three different scenarios: two shallow seismicity settings and one deep seismicity catalog. The first one, Italy, allows calibration and showcasing of the method. The other two catalogs, the Lorca region (Spain) and Vrancea County (Romania), are examples of different function fits and data treatment. For these two scenarios, the b-value maps computed before and after the main earthquake depict tectonic stress changes related to the seismic settings (stress relief in Lorca and shifting of the stress build-up zone in Vrancea). This technique could enable operational earthquake forecasting (OEF) and tectonic source profiling given enough data in the time span considered.
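The sketch below shows the general mechanics of a kernel-smoothed b-value map: Gaussian weights on node-event distances feed a weighted Aki-Utsu estimate at each grid node. The paper instead calibrates the kernel and its parameters from the event-event distance distribution, so the fixed bandwidth, synthetic catalogue and completeness magnitude here are purely illustrative assumptions.

import numpy as np

def weighted_b_value(mags, weights, completeness_mag):
    """Aki-Utsu maximum-likelihood b-value with kernel weights applied to events
    (magnitude-binning corrections are ignored for simplicity)."""
    mean_mag = np.sum(weights * mags) / np.sum(weights)
    return np.log10(np.e) / (mean_mag - completeness_mag)

def b_value_map(event_xy, mags, grid_xy, bandwidth, completeness_mag=2.0):
    """Smoothed b-value at each grid node using a Gaussian kernel on node-event
    distances; the paper derives the kernel from event-event distances instead
    of fixing the bandwidth."""
    b_map = np.empty(len(grid_xy))
    for i, node in enumerate(grid_xy):
        d = np.linalg.norm(event_xy - node, axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        b_map[i] = weighted_b_value(mags, w, completeness_mag)
    return b_map

# Illustrative synthetic catalogue (Gutenberg-Richter magnitudes above Mc = 2, b ~ 1) and a coarse grid.
rng = np.random.default_rng(5)
event_xy = rng.uniform(0, 100, size=(500, 2))                          # epicentres in km
mags = 2.0 + rng.exponential(scale=1 / np.log(10), size=500)
grid = np.array([[gx, gy] for gx in np.arange(0, 101, 20) for gy in np.arange(0, 101, 20)])
b_values = b_value_map(event_xy, mags, grid, bandwidth=15.0)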
This paper addresses the selection of smoothing parameters for estimating the average treatment effect on the treated using matching methods. Because precise estimation of the expected counterfactual is particularly important in regions containing the mass of the treated units, we define and implement weighted cross-validation approaches that improve over conventional methods by considering the location of the treated units in the selection of the smoothing parameters. We also implement a locally varying bandwidth method that uses larger bandwidths in areas where the mass of the treated units is located. A Monte Carlo study compares our proposed methods to the conventional unweighted method and to a related method inspired by Bergemann et al. (2005). The Monte Carlo analysis indicates efficiency gains from all methods that take account of the location of the treated units. We also apply all five methods to bandwidth selection in the context of the data from LaLonde's (1986) study of the performance of non-experimental estimators using the experimental data from the National Supported Work (NSW) Demonstration program as a benchmark. Overall, both the Monte Carlo analysis and the empirical application show feasible precision gains for the weighted cross-validation and the locally varying bandwidth approaches.
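A minimal sketch of the weighted cross-validation idea, assuming one-dimensional propensity scores, a Nadaraya-Watson matching estimator and a kernel density estimate of the treated units as the weight; the estimators, bandwidth grid and simulated data are illustrative assumptions rather than the procedures evaluated in the paper.

import numpy as np

def nw_predict(x0, x, y, h):
    """Nadaraya-Watson prediction at x0 with a Gaussian kernel and bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

def weighted_cv_bandwidth(x_control, y_control, x_treated, bandwidths):
    """Leave-one-out CV over control units; each squared error is weighted by a
    kernel density estimate of the treated units at that control's score, so
    accuracy matters most where the mass of the treated units is located."""
    h_dens = 1.06 * np.std(x_treated) * len(x_treated) ** (-1 / 5)   # Silverman's rule
    dens = np.array([np.mean(np.exp(-0.5 * ((x_treated - xc) / h_dens) ** 2))
                     for xc in x_control]) / (h_dens * np.sqrt(2 * np.pi))
    scores = []
    for h in bandwidths:
        errs = []
        for i in range(len(x_control)):
            mask = np.arange(len(x_control)) != i
            pred = nw_predict(x_control[i], x_control[mask], y_control[mask], h)
            errs.append(dens[i] * (y_control[i] - pred) ** 2)
        scores.append(np.mean(errs))
    return bandwidths[int(np.argmin(scores))]

# Illustrative data: propensity scores and outcomes for controls, treated mass near 1.
rng = np.random.default_rng(6)
x_control = rng.uniform(0, 1, 300)
y_control = np.sin(3 * x_control) + rng.normal(scale=0.3, size=300)
x_treated = rng.beta(4, 2, 150)
h_star = weighted_cv_bandwidth(x_control, y_control, x_treated, np.linspace(0.02, 0.3, 15))
print(f"selected bandwidth: {h_star:.3f}")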
BASE
Dynamic factor models have been the main "big data" tool used by empirical macroeconomists during the last 30 years. In this context, Kalman filter and smoothing (KFS) procedures can cope with missing data, mixed-frequency data, time-varying parameters, non-linearities, non-stationarity, and many other characteristics often observed in real systems of economic variables. The main contribution of this paper is to provide a comprehensive, updated summary of the literature on latent common factors extracted using KFS procedures in the context of dynamic factor models, pointing out their potential limitations. Signal extraction and parameter estimation issues are analysed separately. Identification issues are also tackled in both stationary and non-stationary models. Finally, empirical applications are surveyed in both cases. This survey is relevant to researchers and practitioners interested not only in the theory of KFS procedures for factor extraction in dynamic factor models but also in their empirical application in macroeconomics and finance.
Financial support from the Spanish Government Project PID2019-108079GB-C22/AEI/10.13039/501100011033 (MINECO/FEDER) is gratefully acknowledged by Pilar Poncela. Esther Ruiz and Karen Miranda acknowledge financial support from project PID2019-108079GB-C21 (MINECO/FEDER).
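As a small illustration of factor extraction by Kalman filtering and smoothing, the sketch below fits a one-factor dynamic factor model with statsmodels to a simulated panel that includes a block of missing observations; the data, dimensions and parameter choices are assumptions made for the example.

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.dynamic_factor import DynamicFactor

# Simulated panel: one latent AR(1) factor driving four observed series,
# with some missing values (handled automatically by the Kalman filter).
rng = np.random.default_rng(7)
n_obs = 200
factor = np.zeros(n_obs)
for t in range(1, n_obs):
    factor[t] = 0.8 * factor[t - 1] + rng.normal(scale=0.5)
loadings = np.array([1.0, 0.7, -0.5, 0.9])
panel = factor[:, None] * loadings + rng.normal(scale=0.3, size=(n_obs, 4))
panel[10:20, 2] = np.nan                     # a block of missing observations

data = pd.DataFrame(panel, columns=["y1", "y2", "y3", "y4"])
model = DynamicFactor(data, k_factors=1, factor_order=1)
res = model.fit(disp=False)                  # ML estimation of loadings and AR parameters

smoothed_factor = res.smoothed_state[0]      # Kalman-smoothed estimate of the common factor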
BASE