Forecasting the volatility of the Australian dollar using high-frequency data: Does estimator accuracy improve forecast evaluation?

Authors: George Bailey, James M. Steeley
DOI: https://doi.org/10.1002/ijfe.1723
Published: 01 July 2019
RESEARCH ARTICLE
George Bailey | James M. Steeley
School of Management, Keele University,
Keele ST5 5BG, UK
Correspondence
James M. Steeley, School of Management,
Keele University, Keele ST5 5BG, UK.
Email: j.steeley@keele.ac.uk
Abstract
We compare forecasts of the volatility of the Australian dollar exchange rate to
alternative measures of ex post volatility. We develop and apply a simple test
for the improvement in the ability of loss functions to distinguish between fore-
casts when the quality of a volatility estimator is increased. We find that both
realized variance and the daily high-low range provide a significant improvement in loss function convergence relative to squared returns. We find that a model of stochastic volatility provides the best forecasts for models that use daily data, and the GARCH(1,1) model provides the best forecast using high-frequency data.
KEYWORDS
Australian dollar, exchange rate, high-low range, realized variance, stochastic volatility, volatility
forecasting
1 | INTRODUCTION
A key purpose of volatility models is to produce forecasts
of volatility that are useful for the pricing of financial
assets that depend heavily on the evolution of volatility,
such as derivatives, and also for input into economic policy analysis and forecasting.¹ Because true volatility is
unobservable, volatility estimators are required when
one wishes to undertake a comparison of forecasted
values. However, the early studies of volatility forecast
evaluation² found that the forecasts performed poorly
when compared with the standard volatility estimator,
squared daily returns. Andersen and Bollerslev (1998)
demonstrated that more sophisticated estimators would
increase the explanatory power offered by the volatility
models and recommended the use of volatility measured
from high-frequency (intraday) returns, which has
become known as realized variance.
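To make the estimator concrete (this sketch is ours, not the paper's code): realized variance for a trading day is conventionally computed as the sum of squared intraday log returns. A minimal Python illustration, assuming a vector of intraday prices:

```python
import numpy as np

def realized_variance(intraday_prices):
    """Daily realized variance: the sum of squared intraday log returns."""
    log_prices = np.log(np.asarray(intraday_prices, dtype=float))
    intraday_returns = np.diff(log_prices)       # log return over each interval
    return float(np.sum(intraday_returns ** 2))  # realized variance for the day

# Example with a short, hypothetical AUD/USD price path
rv = realized_variance([0.7400, 0.7411, 0.7395, 0.7402])
```

In practice the sampling frequency (e.g., 5-minute returns) is chosen to balance precision against microstructure noise.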
A key consideration in the evaluation of forecasts,
more generally, is the selection of the loss function.
Hansen and Lunde (2005) applied 330 ARCH-related models to an exchange rate and a stock return series and found that different loss functions selected different models as providing the best volatility forecasts.³
Combining the two issues, Hansen and Lunde (2006)
and Patton (2011) explore the theoretical and practical
interactions between the selected loss function and the
quality of the volatility estimator. Both papers identify
those loss functions that are less sensitive to the noise
in volatility proxies and show that more sophisticated
measures of volatility are better able to distinguish
between forecasts.
In this paper, we also focus on the interaction between
the choice of volatility estimator and the selection of loss
functions in the evaluation of volatility forecasts but seek
to answer the following specific question. Does the use of
more sophisticated volatility proxies generate greater con-
vergence amongst the forecast rankings across different
loss functions? This is useful knowledge in forecast evaluation because, if the answer is yes, it means that the choice of loss function matters much less than would otherwise be the case.
Received: 30 November 2017 | Revised: 21 December 2018 | Accepted: 30 December 2018
Int J Fin Econ. 2019;24:1355-1389. © 2019 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/ijfe
We contribute and innovate in a number of important
ways. Both Hansen and Lunde (2006) and Patton (2011)
examine the volatility of IBM stock returns, over similar
time frames, ending in 2003. We examine an exchange
rate, specifically the Australian dollar/U.S. dollar (AUD/
USD) rate, and over recent years. Although Hansen and
Lunde (2005) did examine an exchange rate in their com-
parison of forecasts, they used only high-frequency measures of volatility and, like many volatility forecasting studies, used the German Mark (or Euro)/USD rate. The
AUD/USD is the fourth highest traded currency pair
(BIS, 2013) and has approximately double the trading
activity of the next two highest currency pairs.⁴ The AUD
itself is the fifth highest traded currency, after the USD,
Euro, British Pound, and Japanese Yen. Using this cur-
rency pair, which is still highly traded, avoids the pitfalls
from overworking a dataset and has greater potential to
provide new empirical findings. However, we recognize
that such a narrow empirical application runs the risk that
our findings may be sample specific. So, as a control, we
additionally examine the AUD/USD at a different point
in time (2 years apart), the EUR/USD exchange rate (as a
different asset in the same asset class), and the S&P 500
index (as an asset in a different asset class).
Whereas Patton (2011) examined forecasts from past squared returns, using either a rolling window average or a partial adjustment mechanism, and Hansen and Lunde (2006) examined forecasts from a small number of GARCH models, we consider a model of stochastic volatility (Harvey, Ruiz, & Shephard, 1994) as well as a selection of GARCH models, also relaxing the distributional
assumptions in Hansen and Lunde (2006). Harvey et al.
(1994) showed that the stochastic volatility model fitted
well to four USD exchange rates, and comparisons to
GARCH models for exchange rate data have been con-
ducted by Heynen and Kat (1994), Dunis, Laws, and
Chauvin (2003), and Lopez (2001). However, all of these
studies focus on the cross rates of the USD with the Ger-
man Mark (or Euro), the Japanese Yen, the British
Pound, the Canadian Dollar, and the Swiss Franc, and
none use high-frequency data. By contrast, Chortareas, Jiang, and Nankervis (2011) do use high-frequency data, but only high-frequency data, in their comparison of sto-
chastic volatility to (symmetric) GARCH models for
exchange rate data, and also for a subset of the aforemen-
tioned currencies. Moreover, the balance of evidence
from these studies is unable to clearly distinguish
between the forecasts from these two classes of model.
So our study will compare the stochastic volatility model
to a selection of GARCH models using a fresh data set, of
comparable trading activity levels, that includes both low-frequency and high-frequency data. To our knowledge,
our study is the first to examine forecasts of the volatility
of the AUD/USD using the GARCH model applied
directly to high-frequency data and the first to compare this to forecasts derived from low-frequency data.
We use the set of loss functions examined in Hansen
and Lunde (2005), which is broader than, but encom-
passes, those used by Patton (2011) and Hansen and
Lunde (2006). Our selection of volatility estimators is sim-
ilar to Patton (2011). As well as examining squared returns
and realized variance from high-frequency returns, he also considers the range-based variance estimator of Parkinson
(1980). Although he calibrates the efficiency losses of this
volatility measure relative to those of squared returns and
realized variance measures, like Hansen and Lunde
(2006), he only uses realized variance and squared returns
in his empirical application. We believe our study to be
the first to include all three measures in the empirical
application. Moreover, we focus on a different asset class
and a much more recent period of time.
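For reference (our illustration, not drawn from the paper), the Parkinson (1980) estimator uses only the day's high and low prices: the daily variance estimate is (ln(H/L))² / (4 ln 2). A sketch with hypothetical AUD/USD quotes:

```python
import math

def parkinson_variance(high, low):
    """Parkinson (1980) range-based daily variance estimator:
    (ln(high/low))^2 / (4 * ln 2)."""
    return (math.log(high / low) ** 2) / (4.0 * math.log(2.0))

# Hypothetical daily high and low for AUD/USD
pk = parkinson_variance(high=0.7450, low=0.7380)
```

The appeal of the range estimator is that it needs only daily high-low data, which are widely available when intraday data are not.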
The question of whether there is increased conver-
gence across loss functions of relative forecasting perfor-
mance when higher quality volatility estimators are
used is examined only indirectly by Hansen and Lunde
(2006) and Patton (2011). Hansen and Lunde (2006) pro-
vide comparative performance tables for the forecasts
from different models across different loss functions
using either squared returns or realized variance, whereas
Patton (2011) formally tests differences between such
forecasts using the tests proposed by Diebold and
Mariano (1995) and West (1996). In this study, we address
the question directly, by conducting paired t-tests of the
mean (across forecasts/models) of the standard deviation
of the loss function rankings (across loss functions). A
smaller value of this mean indicates a greater conver-
gence of the rankings across loss functions, because con-
verged rankings would generate a zero standard deviation
across the loss functions for each model. We supplement this with a related, similarly constructed procedure that
tests whether the ability of loss functions to distinguish
between the first and second best models is enhanced
by using more sophisticated variance estimators.
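The convergence test described above can be sketched as follows (our reconstruction of the idea, using hypothetical ranking data; SciPy's ttest_rel supplies the paired t-test):

```python
import numpy as np
from scipy.stats import ttest_rel

def ranking_convergence_test(ranks_a, ranks_b):
    """Paired t-test on per-model dispersion of loss-function rankings.
    Rows index models, columns index loss functions; each entry is the
    rank a loss function assigns to a model under a given volatility
    proxy. A significant positive t-statistic indicates rankings are
    more converged (less dispersed) under proxy B than under proxy A."""
    disp_a = np.std(ranks_a, axis=1, ddof=1)  # ranking dispersion per model
    disp_b = np.std(ranks_b, axis=1, ddof=1)
    return ttest_rel(disp_a, disp_b)

# Hypothetical rankings: under proxy B every loss function agrees
ranks_sq = np.array([[1, 2, 3], [2, 1, 3], [3, 3, 1], [1, 3, 2]])  # e.g. squared returns
ranks_rv = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])  # e.g. realized variance
t_stat, p_value = ranking_convergence_test(ranks_sq, ranks_rv)
```

Fully converged rankings give a zero standard deviation across loss functions for each model, so a smaller mean dispersion signals greater agreement among the loss functions.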
Our key findings are as follows. Using our test for
ranking convergence, we find that using either a range-based estimator (Parkinson, 1980) or a realized variance
estimator rather than squared returns as the measure of
ex post volatility results in a significant increase in the
convergence of loss function rankings. With better quality
measures of volatility, loss functions converge more
strongly on the preferred forecast. This is the case both
for loss functions classified as robust by Patton (2011)
and those not regarded as robust. Moreover, we also find
a significant increase in the ability of loss functions to
distinguish between the best and the second best models
when either the range measure or the realized variance
is used. Thus, the margin by which the best model is
chosen is significantly increased by using the higher qual-
ity volatility measures.
We find that there are significant differential gains in
terms of loss function ranking convergence by using the
realized variance measure, such that forecast comparison
tests are more able to facilitate the rejection of one or
more competing forecasts against a benchmark forecast.
If squared returns are used as the measure of ex post vol-
atility, the forecast comparison tests are less able to dis-
tinguish between any of the competing forecasts. Taken
together, these results also suggest that although the
range measure may provide an improvement upon the
use of squared returns, and so be a valuable substitute
for realized variance when high-frequency data is
unavailable or too costly to collect, it is still not going to
be as reliable as using realized variance.
We find that the regression-based tests are not
enhanced by the use of realized variance for forecasts
derived from daily data, which is in contrast to early stud-
ies, such as Andersen and Bollerslev (1998) but is in line
with more recent studies, such as McMillan and Speight
(2012) and Chortareas et al. (2011). This is the case both
in regard to the regression R² and tests of forecast
unbiasedness.
Regarding the specific characteristics of the AUD/USD
exchange rate, we find among the models applied to daily
data that the stochastic volatility model is ranked the
highest by the loss functions and is significantly better
than the next best model when realized variance is used
as the ex post volatility measure. The best performing
GARCH model out of sample is the GJR-GARCH model,
which reflects significant asymmetries in the volatility in sample, but this result is not stable, as in the earlier sample of the AUD/USD the baseline GARCH(1,1) model is preferred, despite there also being asymmetries in the volatility in sample. These asymmetries imply that
an unanticipated depreciation of the AUD will have a
greater impact on the volatility of the exchange rate than
an unanticipated appreciation, relative to the USD. In
contrast to earlier studies, we find little evidence that
power ARCH models add value to the modelling or fore-
casting of the exchange rate, although again this result
appears to have some sample specificity. We also find
no differences between forecasts generated assuming nor-
mally distributed errors as opposed to those from
t-distributed errors, although the latter provides a signifi-
cantly better fit in sample.
We find that a GARCH model of daily volatility that is
generated from intraday data provides forecasts that are
superior to the stochastic volatility model from daily data
for most, but not all, loss functions and variance estima-
tors. This is not only the case for the AUD/USD, in both
sample periods, but also for our two control samples. This
result indicates that, as has been found in studies of other
exchange rates and for other asset classes, intraday data
are the most valuable for both evaluating forecasts and
also generating forecasts of daily volatility.
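As a reminder of the forecasting recursion involved (a generic sketch with made-up parameter values, not the paper's estimates), the GARCH(1,1) one-step-ahead variance forecast is sigma²(t+1|t) = omega + alpha * eps²(t) + beta * sigma²(t):

```python
def garch11_forecast(omega, alpha, beta, last_resid_sq, last_var):
    """One-step-ahead GARCH(1,1) variance forecast:
    sigma^2_{t+1} = omega + alpha * eps_t^2 + beta * sigma_t^2."""
    return omega + alpha * last_resid_sq + beta * last_var

# Illustrative (not estimated) parameter values and state
f = garch11_forecast(omega=1e-6, alpha=0.05, beta=0.90,
                     last_resid_sq=4e-4, last_var=3e-4)
```

Applying the same recursion to a daily variance series built from intraday returns is what distinguishes the high-frequency GARCH forecasts from those fitted to daily returns alone.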
The remainder of the paper proceeds as follows.
Section 2 reviews related literature to provide some con-
text for our work and enable comparisons to our findings
to be drawn. Section 3 describes the stochastic volatility
model and the GARCH models that will be used to fore-
cast the exchange rate, describes the computation of the
ex post volatility measures, and explains the forecast eval-
uation testing procedures. In Section 4, we describe the
data and report the results of the model estimation in
sample and the out of sample forecast evaluations.
Section 5 summarizes and concludes.
2 | RELATED LITERATURE
The literature on volatility forecasting and the evaluation of these forecasts is extensive and has grown alongside the development of models of volatility.⁵
However, the literature on evaluating forecasts of
exchange rate volatility is relatively small and so pre-
sents further opportunities for discovery.⁶ Our work con-
tributes to a number of literatures including, but not
limited to, the literature on forecasting exchange rates,
both with low-frequency and high-frequency data, forecast comparisons using high-frequency data, and to stud-
ies of specific financial markets, in this case the
AUD/USD exchange rate.
Evaluating forecasts of the volatility of exchange rates
was first undertaken by Taylor (1986) who compared fore-
casts from his AMARCH model with the exponentially
weighted moving average model in Taylor and Kingsman
(1979), using daily data for the £/$ between 1974 and
1982, and marginally favouring the exponentially
weighted moving average model. Forecast comparison
has thereafter broadened out to evaluate forecasts from a
variety of models in the GARCH class, stochastic volatility
models, autoregressive models of squared and absolute
returns, and models including implied volatility from
options. Bera and Higgins (1997), who examined the
weekly $/£ rate between 1985 and 1991, favoured a
GARCH model over a bilinear model, whereas Cumby
et al. (1993), who examined the weekly Yen/$ between
1977 and 1990, favoured an EGARCH model over histori-
cal measures of volatility. By contrast, Figlewski (1997), who examined the daily DM/$ rate over the period 1971 to
1995, favoured a historical measure over a GARCH model.
Jorion (1995), who examined both the DM/$ and the Yen/$