A research paper by Amsterdam School of Economics (ASE) researchers Ramon de Punder, Cees Diks and Roger Laeven (all with the Quantitative Economics section), and Dick van Dijk (Econometric Institute, Erasmus University Rotterdam) was recently accepted for publication in the ‘Journal of the American Statistical Association’ (JASA).
Ramon de Punder

JASA is considered a leading journal in this field.

De Punder initiated the study during his Research Master programme at the Tinbergen Institute. According to the Institute’s director, it is the first time since its founding in 1987 that research originating from a Research Master’s thesis has been published in the Journal of the American Statistical Association.

The ASE researcher will also be interviewed about his research on 21 March in the NPO Radio 1 programme Dr. Kelder en Co, where he will be featured in the Jonge Doctors segment.

About the study

If you want to know whether you can sit outside on a terrace tomorrow, an extremely precise estimate of the probability of temperatures between 10 and 12 degrees is of little relevance. What matters primarily is the probability of temperatures above 18 degrees. But how should two predicted probability distributions be compared with the true distribution when only certain outcomes truly matter? That is the question De Punder's article 'Localizing Strictly Proper Scoring Rules' explores.

If all outcomes are weighted equally in the evaluation, accurate predictions of irrelevant temperature ranges may mask poor predictions for temperatures above 18 degrees.
If everything below 18 degrees is discarded entirely, a model may even appear 'better than reality'. This creates a serious problem for forecast evaluation, because an incorrect model may then be preferred over a model that coincides exactly with the true distribution.

The solution is a principled middle ground, based on the classical idea of censoring. Irrelevant outcomes are not discarded but grouped into a single category, while the remainder of the distribution is left fully intact. Mathematically, this construction yields, for every (generalized) distance, a new local distance that evaluates forecasts fairly when outcomes are not equally important.
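The censoring idea can be illustrated with a small sketch. The snippet below implements a censored log score in the spirit described above (the paper's general construction covers arbitrary scoring rules; this specific form, the threshold of 18 degrees, and the normal forecast distribution are illustrative assumptions, not the authors' exact setup). Outcomes in the relevant region are scored with the full forecast density, while all irrelevant outcomes are lumped into one category scored with their total probability.

```python
import math

def censored_log_score(density, cdf, y, threshold):
    """Censored log score focusing on outcomes above `threshold`.

    Outcomes below the threshold are not discarded but collapsed into
    a single category with total probability cdf(threshold); above the
    threshold, the forecast density is used in full detail.
    """
    if y > threshold:
        return math.log(density(y))    # relevant region: full density
    return math.log(cdf(threshold))    # irrelevant region: one lumped category

# Illustrative forecast: a normal distribution N(16, 3^2) for tomorrow's
# temperature, evaluated against a threshold of 18 degrees.
mu, sigma = 16.0, 3.0
pdf = lambda x: math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
cdf = lambda x: 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

print(censored_log_score(pdf, cdf, 20.0, 18.0))  # outcome in the relevant region
print(censored_log_score(pdf, cdf, 12.0, 18.0))  # outcome censored into one category
```

Note that any two outcomes below the threshold receive the same score: the forecast is only rewarded for how much probability it assigns to the lumped category, never for fine detail in the irrelevant region.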

Relevance

The localised distances obtained through this method are also relevant for statistical testing. The authors show that, for certain hypotheses, the procedure yields the most powerful test. This is not only of theoretical interest: in applications to finance, inflation, and climate, focusing on the relevant regions of the outcome space also leads empirically to higher statistical power.

Taken together, these results provide a coherent and theoretically grounded framework for forecast evaluation in settings where only specific regions of the outcome space are substantively relevant.

Publication details

Ramon F. A. de Punder, Cees G. H. Diks, Roger J. A. Laeven and Dick J. C. van Dijk (2026). 'Localizing Strictly Proper Scoring Rules', Journal of the American Statistical Association, 1–13.