Identifying outliers in a data set is conceptually cut and dried. Extremes, by definition, should be conspicuous. But in practice, labeling outliers is a mix of art and science, in part because expectations vary from investor to investor, as do sensitivities to “risk.” The analysis can also be muddied by the technique used to crunch the numbers.
In previous articles in this series (see list below) we used several applications that rely on a mix of observation and quantification for the test data (rolling 1-year returns for the S&P 500 Index). Sorting performances by way of interquartile range, for example, clearly indicates outliers, albeit on a relative basis. But this approach to outlier search is destined to find extremes, even if they don’t exist.
Imagine that we’re looking at a set of returns that we know does not contain outliers, based on some pre-determined rule that we agree on. Nonetheless, graphing the numbers via a histogram or boxplot will likely turn up outliers because the analysis is relative: the most extreme observations in a given data set tend to be flagged, even if they don’t pass muster in an absolute sense.
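To see the point in miniature, here is a minimal sketch that applies the usual boxplot fences (1.5 times the interquartile range beyond the quartiles) to simulated data. The numbers are hypothetical: 5,000 draws from a normal distribution with an assumed 8% mean and 16% standard deviation, not actual S&P 500 returns, and the function name is ours.

```python
import random
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values outside the boxplot fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical data: normally distributed "returns" with no contamination.
rng = random.Random(42)
simulated_returns = [rng.gauss(0.08, 0.16) for _ in range(5000)]

flagged = iqr_outliers(simulated_returns)
print(f"{len(flagged)} of {len(simulated_returns)} simulated returns fall outside the fences")
```

Even though every observation comes from the same well-behaved distribution, a small share of the sample lands outside the fences simply because the rule is defined relative to the data.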
That creates a challenge, particularly if you and I disagree on defining outliers. There’s also a process issue. Visually inspecting histograms and boxplots is tedious and time-consuming if we’re analyzing data periodically for multiple markets and time windows. Yes, we can shift to a parametric approach – z-scores, for example. But we can still disagree over which z-scores are relevant or if the underlying analysis (standard deviation in this case) is reliable.
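The disagreement over cutoffs is easy to illustrate. The sketch below computes z-scores for a short list of made-up returns (not S&P 500 data) and flags observations under two commonly cited thresholds. The function names and the thresholds of 2 and 3 are our assumptions for illustration.

```python
import statistics


def z_scores(values):
    """Standardize each value using the sample mean and sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def flag_by_z(values, threshold):
    """Return the values whose absolute z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical one-year returns, expressed as decimals.
returns = [0.05, 0.12, -0.03, 0.08, 0.15, -0.10, 0.02, 0.07, 0.45, 0.09, 0.04, -0.01]

for threshold in (2.0, 3.0):
    print(f"|z| > {threshold}: {flag_by_z(returns, threshold)}")
```

In this toy sample the 45% return has a z-score of roughly 2.7, so it counts as an outlier under the looser cutoff but not under the stricter one. The data are the same; only the rule changed.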
As a solution, we can formalize the analysis with statistical tests, several of which have been proposed. Once again, perfection is elusive, but a higher level of objectivity is available (assuming you buy into the underlying assumptions). Let’s look at one of the options: the Grubbs test. The goal here is to decide if a single extreme value is truly an outlier by way of a formal statistical test. Once again we’ll use the rolling 1-year returns for the S&P 500.
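For reference, the test in its standard textbook (two-sided) form can be written as follows. The notation is ours rather than anything specific to this series: N is the number of observations, x-bar and s are the sample mean and standard deviation, and t is the upper critical value of the t distribution with N - 2 degrees of freedom at significance level alpha/(2N). The test assumes the data are approximately normally distributed.

```latex
G = \frac{\max_{i} \lvert x_i - \bar{x} \rvert}{s},
\qquad
\text{reject } H_0 \text{ (no outliers) at level } \alpha \text{ if } \quad
G > \frac{N-1}{\sqrt{N}}
\sqrt{\frac{t^{2}_{\alpha/(2N),\,N-2}}{N - 2 + t^{2}_{\alpha/(2N),\,N-2}}}
```

For a one-sided test that looks only at the maximum (or only at the minimum), the numerator becomes the distance between that extreme and the mean, and alpha/(2N) is replaced by alpha/N.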
As outlined previously, there was a case for seeing some of the extreme returns as outliers. But those conclusions rely on a degree of subjectivity. The Grubbs test, by comparison, leaves little room for debate once a significance level is chosen. That alone doesn’t mean it’s the last word on defining outliers, but at least it provides a benchmark that’s immune to the behavioral factors that can otherwise complicate the analysis.
As an example, running a Grubbs test on the 1-year returns in the chart above focuses on the highest positive return, a nearly 75% gain. The resulting test statistic (4.15) and p-value (0.25) indicate that this number isn’t an outlier. The p-value is far above 0.05, the conventional cutoff, so we can’t reject the null hypothesis, which in this case is that the 75% gain isn’t an outlier.
Running the Grubbs test on the opposite extreme return – a near 49% loss – also finds no support for labeling the number an outlier.
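The mechanics behind both runs can be sketched in a few lines. This is a rough illustration, not the code used for the results above: the returns are hypothetical, the function names are ours, and the p-value is approximated by simulating normally distributed samples rather than by the t-distribution formula that standard Grubbs implementations rely on.

```python
import random
import statistics


def grubbs_statistic(values, side="max"):
    """Distance of the chosen extreme from the mean, in sample standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    extreme = max(values) if side == "max" else min(values)
    return abs(extreme - mean) / sd


def simulated_p_value(values, side="max", n_sims=2000, seed=0):
    """Approximate the one-sided p-value by Monte Carlo under a normal null.

    The statistic does not depend on the mean or scale of the data,
    so standard normal samples of the same size are sufficient.
    """
    rng = random.Random(seed)
    observed = grubbs_statistic(values, side)
    n = len(values)
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if grubbs_statistic(sample, side) >= observed:
            hits += 1
    return hits / n_sims


# Hypothetical one-year returns, expressed as decimals (not S&P 500 data).
returns = [0.05, 0.12, -0.03, 0.08, 0.15, -0.10, 0.02, 0.07, 0.45, 0.09, 0.04, -0.01]

for side in ("max", "min"):
    g = grubbs_statistic(returns, side)
    p = simulated_p_value(returns, side)
    print(f"{side}: G = {g:.2f}, approximate p-value = {p:.3f}")
```

With these made-up numbers, the 45% gain should come out with a small p-value, while the 10% loss should not. As in the S&P 500 example, a p-value above the chosen cutoff means only that the test fails to flag the extreme as an outlier.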
This result conflicts with some of the previous analytics discussed in this series. The logic appears to be that, in the context of the full data set, neither a 75% gain nor a 49% one-year loss for the S&P 500 constitutes an outlier event, at least within the statistical paradigm set up by a Grubbs test.
But that’s not the end of the story, even if we limit the number crunching to formal statistical tests. As we’ll see in the next installment, the Grubbs test isn’t the only game in town when it comes to formal outlier tests.
Previous articles in this series:
Outlier Risk, Part I: boxplot and interquartile range
Outlier Risk, Part II: Z-score
Outlier Risk, Part III: Hampel filter/median absolute deviation
By James Picerno