NEW CRITERIA FOR THE CHOICE OF TRAINING SAMPLE SIZE FOR MODEL SELECTION AND PREDICTION: THE CUBIC ROOT RULE

Bibliographic Details
Main Authors: ISRAEL ALMODOVAR, RAÚL PERICCHI
Format: Article
Language: Spanish
Published: Universidad Nacional de Colombia, sede Medellín 2012-01-01
Series: Revista de la Facultad de Ciencias
Online Access: https://revistas.unal.edu.co/index.php/rfc/article/view/48975
Description
Summary: The size of the training sample in Objective Bayesian Testing and Model Selection is a central problem in both theory and practice. We concentrate here on simulated training samples and on simple hypotheses. The striking result is that, even in the simplest of situations, the optimal training sample size M can be minimal (for identification of the sampling model) or maximal (for optimal prediction of future data). We suggest a compromise that seems to work well whatever the purpose of the analysis, the 5% cubic root rule: $M = \min[0.05\,n, \sqrt[3]{n}]$. We then define a comprehensive loss function that combines identification errors and prediction errors, appropriately standardized. We find that the very simple cubic root rule is extremely close to an overall optimum for a wide selection of sample sizes and of cutting points that define the decision rules. The cubic root rule was first proposed in Pericchi (2010). This article proposes to generalize the rule and to take full statistical advantage of it in realistic situations. The rule can also be viewed as a synthesis of the rationales that justify both AIC and BIC.
ISSN: 0121-747X, 2357-5549
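The 5% cubic root rule stated in the summary is simple enough to evaluate directly. The following is a minimal sketch, assuming n denotes the total sample size and that M may be left as a real number; the summary does not say how M should be rounded, and the function name `cubic_root_rule` is a hypothetical label, not something taken from the article.

```python
def cubic_root_rule(n: int, fraction: float = 0.05) -> float:
    """Return M = min(fraction * n, n ** (1/3)).

    Assumptions (not stated in the summary): n is the total sample size,
    and M is returned unrounded.
    """
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    linear_term = fraction * n          # the 5% component of the rule
    cube_root_term = n ** (1.0 / 3.0)   # the cubic root component
    return min(linear_term, cube_root_term)


if __name__ == "__main__":
    # Illustrative sample sizes only; these are not values from the article.
    for n in (20, 50, 100, 1000, 10000):
        print(n, round(cubic_root_rule(n), 3))
```

Under these assumptions the two terms cross where 0.05 n = n^(1/3), that is at n = 20^(3/2), roughly 89. Below that point the 5% term determines M, and above it the cubic root does.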