This document studies the power of the CBOE Volatility Index (VIX) to predict the stock market returns of the Standard & Poor’s 500 (S&P 500), a good proxy for the USA economy. VIX measures the market’s expectation of S&P 500 volatility; however, many authors call it the fear index because its value spikes in moments of heavy selling in financial markets. Even though VIX is a common indicator in technical analysis and market sentiment (Baker & Wurgler, 2007), this study fits better within fundamental analysis of the USA economy as a whole.
The download includes the complete master’s thesis in docx and pdf formats. The documents are well structured and correctly formatted. Additionally, it includes the complete project in the R programming language, which is used to gather the data from the internet, process the data, create all regression models and plot the images present in the management project.
- Date: 2019-02-24
- Author: Raúl Bartolomé Castro
- Tutor: Dr. Steve Wu.
- Program: Distance Learning MBA
- Module: Management Project
Table of Contents
List of Figures
Figure 1. Main events for GSPC
Figure 2. GSPC: S&P 500 index
Figure 3. Main events for VIX
Figure 4. VIX: CBOE volatility index
Figure 5. DGS3MO and DGS10: USA treasuries
Figure 6. DAAA and DBAA: Moody’s seasoned corporate bond yield
Figure 7. CPIAUCSL: consumer price index for all urban consumers
Figure 8. SPY: SPDR S&P 500 ETF
Figure 9. SPYdiv: dividend of SPY
Figure 10. Lagged GSPC
Figure 11. Histograms of GSPCret_excp
Figure 12. Histograms of GSPCret_exca
Figure 13. Histograms of VIX
Figure 14. CPI time series
Figure 15. Histograms of TB3M
Figure 16. Histograms of CS
Figure 17. Histograms of TS
Figure 18. SPYdivttm time series
Figure 19. SPYDY time series
Figure 20. Histograms of SPYDY
Figure 21. Model 238. GSPCret_excp against VIX
Figure 22. Model 77. GSPCret_excp against TB3Mretp
Figure 23. Model 133. GSPCret_excp against TB3M
Figure 24. Model 122
Figure 25. Model 165
List of Tables
Table 1. Literature review summary
Table 2. Input variables
Table 3. Regression variables observations and data range
Table 4. Variables for regressions models
Table 5. Statistics of GSPCret_excp
Table 6. Statistics of GSPCret_exca
Table 7. Statistics of VIX
Table 8. Statistics of TB3M
Table 9. Statistics of CS
Table 10. Statistics of TS
Table 11. Statistics of SPYDY
Table 12. Regression models
Table 13. Per-se evaluation summary
Table 14. Relative evaluation, daily observations
Table 15. Relative evaluation, monthly observations
Table 16. Relative evaluation, quarterly observations
Table 17. Multivariate models, daily observations
Table 18. Multivariate models, monthly observations
Table 19. Multivariate models, quarterly observations
The principle of this quantitative analysis is to evaluate the fit of hundreds of regression models in order to understand the long-term relationship between macroeconomic variables. The S&P 500, lagged relative to the explanatory variables, forms the response variable. VIX, USA treasuries, Moody’s seasoned corporate bond yield, the consumer price index (CPI) and the S&P 500 dividend yield constitute the sources for the explanatory variables. Many other relevant macroeconomic variables, such as gross domestic product (GDP), oil prices or price-to-earnings ratios, are excluded to avoid an excess of explanatory variables that would lead to regression model fits with low statistical significance.
This dissertation articulates a solid empirical evaluation of VIX as a predictor of stock market returns, paying special attention to the method used to calculate the variables for the regression models. The approach bridges the articles focused on volatility and generic market returns, where the variables are calculated as annualized returns (McMillan, 2016; Bekaert & Hoerova, 2014), and capital asset pricing model (CAPM) studies, where the variables are calculated as periorized returns, that is, returns expressed per observation period (Daróczi et al., 2013, p. 44; Kempthorne et al., 2013).
The conclusions of this paper bring clarity to the past relationship between the S&P 500, VIX and other macroeconomic variables, which investors and managers may use to make better financial decisions in the future. Nevertheless, assuming that asset prices reflect a fair value at any given time, as captured in the weak form of the efficient market hypothesis (EMH), investors and managers should be cautious when making decisions based on historical data.
This work is organized as follows:
- Literature review,
- hypothesis definition,
- explanation and acquisition of the input variables,
- explanation and calculation of the variables used for the regression models,
- list of regression models and
- summary and conclusions.
A sensible expansion of this study is a microeconomic analysis of those companies with a volatility index provided by CBOE, such as the VIX on Apple (VXAPL). It is advisable to use CAPM theory as the framework and 3-month USA treasuries as the risk-free asset, and to confirm the predictive power of VXAPL by comparing the fit of the regression models.
There is an extensive body of literature on market volatility and market returns. Table 1 summarizes some of the most important articles in which market volatility or VIX plays a central role in forecasting stock market returns.
The study of volatility and how it might affect stock market returns has attracted considerable interest in the financial literature. As early as 1976, Fischer Black addressed the subject in his article “Studies of stock price volatility changes”. French et al. (1987) expand the study and find evidence of a positive relationship between market returns in excess of Treasury bills and volatility, something that is also observed in this document. Their paper models volatility using autoregressive conditional heteroskedasticity (ARCH) and autoregressive integrated moving average (ARIMA) methods; VIX was not an option because it only appears in 1990. The authors calculate market returns as annual continuously compounded (CC) returns, the standard method in articles where volatility and stock market returns coexist.
There is a solid literature that explains how to calculate VIX (Carr & Madan, 1998; Demeterfi et al., 1999; Britten‐Jones & Neuberger, 2000; Bollerslev et al., 2009), and the CBOE also provides a very comprehensive paper (CBOE, 2018). VIX is a model-free measure, as opposed to Black-Scholes-Merton implied volatility, that calculates the expected volatility of the S&P 500 from options with an expiration horizon of 30 days. Nevertheless, the calculation method brings little value to this study because we are more interested in the use of VIX as a predictor variable than in the calculation itself.
Bollerslev et al. (2009, 2013) and Bekaert & Hoerova (2014) are pioneers in recognizing that VIX captures both implied volatility (IV) and a risk premium, and they explore the decomposition of the variable. The risk premium is what gives VIX the nickname of fear index (Whaley, 2000; Fernandes et al., 2013): it reflects the willingness of traders and investors to purchase options in declining markets to hedge long positions. Bekaert & Hoerova (2014) split VIX in an elegant equation into the sum of the variance premium (VP) and the expected realized variance (RV):
$$\mathrm{VIX}_t^2 = \mathrm{VP}_t + E_t\left[\mathrm{RV}_{t+1}^{(22)}\right]$$
That is, the square of VIX is equal to VP plus the expectation of RV over the next 22 days. The authors conduct a thorough evaluation of models to forecast RV, concluding that the best estimators combine VIX with the continuous and jump components of RV as defined by Corsi et al. (2010).
One of the conclusions of Bollerslev et al. (2009) and Bekaert & Hoerova (2014) is that VP is a better indicator of market returns than VIX. This opens an additional research line for this investigation: evaluating VP calculated as a periorized simple return. However, this is excluded because it would bend this research excessively towards volatility modeling rather than stock market return prediction.
Academics and professionals have developed a rich palette of articles on volatility modeling, some of which include VIX. Kozyreva (2007), using the Nordic market index, compares the VIX calculation method with generalized ARCH (GARCH), concluding that GARCH captures volatility more accurately. She acknowledges the presence of significant differences during disruptive events in the market (“jumps”). Fernandes et al. (2013) perform a thorough statistical examination of VIX using heterogeneous autoregressive (HAR) processes for modeling and forecasting purposes. They conclude that VIX does not depend on other macroeconomic variables such as oil futures or interest rates. Chow et al. (2014) argue that VIX does not truly capture volatility and define a generalized VIX method that is more effective. Thorlie et al. (2015) and Seul-Ki et al. (2017) are among the authors who extend the state-of-the-art methodologies to model volatility using GARCH, asymmetric power ARCH and HAR for the S&P 500 and the Korea Stock Price Index (KOSPI). Some of these studies of volatility per se deviate from the main goal of this dissertation; consequently, they do not play a relevant role here. They nevertheless provide a good foundation for a holistic study of volatility and market returns that could expand this paper.
Comparing the performance of VIX with other well-known predictor variables is a sensible method to judge to what extent VIX is effective in its predictions. Consequently, part of the literature research consists of identifying articles on market return studies with diverse variables. Anilowski et al. (2007) investigate the influence of earnings and earnings announcements on the performance of the stock market. Bollerslev et al. (2009) include in their article macroeconomic variables such as the price-to-earnings ratio (P/E), price-to-dividends ratio (P/D), consumption-wealth ratio (CAY), term spread (TS), default spread (DFSP) and risk-free rate (RREL). Ferreira & Santa-Clara (2011), using the sum-of-the-parts (SOP) method for forecasting stock markets, utilize three variables: P/E ratio, P/D ratio and earnings ratio growth. Bekaert et al. (2013) and Bekaert & Hoerova (2014) are influential articles for this dissertation; they consider VIX, TS, credit spread (CS), consumer price index (CPI), industrial price index (IPI), P/D ratio and other macroeconomic variables. McMillan (2016) conducts a very extensive evaluation of tens of variables, among them P/E ratio, P/D ratio, earnings ratio, GDP, CPI, IPI, TSMP, market capitalization, etc. The benchmarking variables for this work are real 3-month treasury bills (TB3M), CS, TS and SPY dividend yield (SPYDY).
The hypothesis definition pivots around model testing to confirm whether VIX is an effective indicator of stock market returns. Consequently, the null hypothesis (H0) assumes that VIX is not an effective indicator of S&P 500 returns and the alternative hypothesis (H1) assumes that VIX is an effective indicator of S&P 500 returns. We evaluate the hypothesis from three different angles:
The first approach consists of building linear regression models with VIX and the response variable to obtain the coefficient values (β), and fitting the model to the sample data to obtain the adjusted R2. The t-statistic tests the null hypothesis: the alternative hypothesis is accepted if β is different from 0 and the p-value of the t-test, that is, the probability of obtaining an estimate at least as extreme if the null hypothesis were true, is lower than 1% (t-statistic significance). Adjusted R2 estimates the proportion of the variance of the response variable captured by the explanatory variable. Its value cannot exceed 1 and can fall slightly below 0 for a model with no explanatory power; a value close to unity means a highly effective indicator.
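As an illustration of the per-se evaluation described above, the following Python sketch fits a one-variable OLS regression and reports β, its t-statistic and the adjusted R2. It is not the thesis code (the project itself is written in R); the function name `simple_ols` and the sample numbers are hypothetical.

```python
import math

def simple_ols(x, y):
    """Fit y = alpha + beta * x by ordinary least squares (one regressor)."""
    n = len(x)
    k = 1  # number of explanatory variables
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    se_beta = math.sqrt(ss_res / (n - k - 1) / sxx)  # standard error of beta
    return {"alpha": alpha, "beta": beta, "t_beta": beta / se_beta,
            "r2": r2, "adj_r2": adj_r2}

# Hypothetical sample, for illustration only (not thesis data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x, y)
print(fit)
```

The p-value follows from comparing `t_beta` with a Student t distribution with n - k - 1 degrees of freedom; for large samples the two-sided 1% critical value approaches 2.576.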
The previous technique provides a good evaluation per se; however, it does not indicate how good VIX is compared with other indicators. To fill this gap, the second evaluation of the hypothesis is done in relative terms, comparing the effectiveness of VIX with other well-established indicators such as TB3M, CS, TS and SPYDY. Adjusted R2 is the main parameter for the comparison, but the t-statistic significance is also taken into account.
Finally, the hypothesis is tested by evaluating how effectively VIX improves the predictability of stock market returns in multivariate regression models. This is captured by comparing models with four explanatory variables against models with the same four variables plus VIX.
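The comparison of a four-variable model against the same model plus VIX can be sketched with the standard adjusted R2 formula, which penalizes each extra regressor. The R2 values below are invented for illustration and are not thesis results; only the 104 quarterly observations come from the text.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R2 from plain R2, sample size and number of regressors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)

n_obs = 104  # quarterly observations mentioned in the text

# Hypothetical R2 values, for illustration only.
base = adjusted_r2(0.100, n_obs, 4)            # four benchmark variables
with_vix_small = adjusted_r2(0.102, n_obs, 5)  # VIX adds almost nothing
with_vix_large = adjusted_r2(0.150, n_obs, 5)  # VIX adds real explanatory power

print(base, with_vix_small, with_vix_large)
```

In this sketch a tiny gain in R2 lowers the adjusted R2, so VIX only counts as an improvement when its contribution outweighs the penalty for the additional variable.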
The foundation of this dissertation is the input data used for the statistical inference. The nine variables considered are:
- GSPC: daily closing value of S&P 500 index.
- VIX: daily closing value of CBOE volatility index.
- DGS3MO: USA 3-month treasury bill.
- DGS10: USA 10-year treasury note.
- DAAA: Moody’s seasoned Aaa corporate bond yield.
- DBAA: Moody’s seasoned Baa corporate bond yield.
- SPY: daily closing value of SPDR S&P 500 ETF.
- SPYdiv: dividend of SPDR S&P 500 ETF.
- CPIAUCSL: monthly consumer price index for all urban consumers.
Table 2 provides the key quantitative characteristics of the data. All variables are secondary data obtained from reputable sources. The variables GSPC, VIX, SPY and SPYdiv are downloaded from Yahoo! Finance. DGS3MO, DGS10, DAAA, DBAA and CPIAUCSL are obtained from Federal Reserve Economic Data (FRED), maintained by the Federal Reserve Bank of St. Louis.
In order to achieve the highest statistical significance, the time series should cover the longest possible period. The starting date of the data set is defined by SPY, the most recently created variable, which was launched by State Street Global Advisors (SSgA) in January 1993. The ending date is defined by the last data acquisition for this study and is limited by the variable with the earliest final observation, in this case CPIAUCSL in October 2018.
It is common practice in market return studies to consider various observation periods (Bollerslev et al., 2009; Bekaert & Hoerova, 2014); we target three: daily, monthly and quarterly. For the variables GSPC, VIX, DGS3MO, DGS10, DAAA, DBAA and SPY, this is done by downloading data with daily frequency and building the monthly and quarterly data sets through a decimation process. Numerically, this translates to 6720, 311 and 104 observations respectively.
The decimation process is not applied to CPIAUCSL and SPYdiv because they participate in intermediate steps to create the regression variables. SPYdiv is the dividend distribution of SPY, which is paid every quarter and yields 104 observations. CPIAUCSL is monthly information provided by FRED; it has been issued since 1947, which implies 861 observations.
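The decimation of daily series into monthly and quarterly series could look like the Python sketch below. The text does not say which observation is kept per period, so keeping the last available one is an assumption, and the function name `decimate` and the sample values are hypothetical.

```python
from datetime import date

def decimate(daily, months_per_period):
    """Keep the last observation of each period.

    `daily` is a list of (date, value) pairs sorted by date;
    months_per_period is 1 for monthly and 3 for quarterly data.
    """
    last_in_period = {}
    for day, value in daily:
        period = (day.year, (day.month - 1) // months_per_period)
        last_in_period[period] = (day, value)  # later dates overwrite earlier ones
    return [last_in_period[key] for key in sorted(last_in_period)]

# Hypothetical daily values, for illustration only.
daily = [
    (date(2018, 1, 30), 1.0),
    (date(2018, 1, 31), 2.0),
    (date(2018, 2, 28), 3.0),
    (date(2018, 3, 29), 4.0),
    (date(2018, 4, 2), 5.0),
]
monthly = decimate(daily, 1)
quarterly = decimate(daily, 3)
print(monthly)
print(quarterly)
```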
The S&P 500 index, with ticker symbols ^GSPC, INX and $SPX, is a stock market index that tracks 500 large USA companies listed on the New York Stock Exchange (NYSE) and the Nasdaq Stock Market (NASDAQ). The index is built by assigning a weight to each company based on its market capitalization, calculated as the market value of its outstanding shares. The S&P 500 captures many industries and covers approximately 80% of the available market capitalization (Standard & Poor’s, 2018); consequently, it is a good proxy for USA stock market analysis.
Figure 1 shows the most distinctive traits of GSPC: the dot-com bubble in 2001, the start of the financial crisis and the Lehman bankruptcy in 2008, and the longest bull market, a period of 9 years starting in 2009, present a clear pattern. Other events, such as the ruble devaluation in 1998 or the World Trade Center attack in 2001, are less significant.
GSPC is not used directly as the response variable for the regression models; instead, GSPC in excess of the 3-month treasury bill is used. Likewise, since this is a forecasting exercise, the response variable is lagged against the explanatory variable for the regression fit testing; more details are given in subsequent chapters. Figure 2 depicts GSPC with three observation periods; the decimation process does a fair job of holding the majority of the relevant information.
VIX is a stock market index created by the Chicago Board Options Exchange (CBOE) that measures the market’s expectation of S&P 500 volatility over a 30-day horizon, expressed as an annualized percentage. VIX is calculated using S&P 500 option prices with expiration dates more than 23 days and less than 37 days away, together with risk-free interest rates (CBOE, 2018). VIX captures the implied volatility without assuming the Black-Scholes-Merton model; it is rather a “model-free” estimator (Britten‐Jones & Neuberger, 2000; Jiang & Tian, 2005). Another approach to calculating the implied volatility is to solve the Black-Scholes-Merton option pricing model, whose parameters are the price of the derivative, the strike price, the drift rate, the time to expiration, the risk-free interest rate and the standard deviation of the underlying security’s returns, also called implied volatility (Hull, 2018, pp. 343-369). However, this study focuses on VIX rather than on the Black-Scholes-Merton derivation.
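Because VIX is quoted as an annualized percentage, a reading can be translated into an approximate 30-day volatility. The sketch below assumes the usual square-root-of-time scaling with a 365-day year; the function name is hypothetical and the level of 20 is only an example.

```python
import math

def expected_30day_volatility(vix_level):
    """Convert an annualized VIX level (percent) into a 30-day volatility (per one)."""
    return vix_level / 100.0 * math.sqrt(30.0 / 365.0)

# Example level, for illustration only.
move = expected_30day_volatility(20.0)
print(round(move, 4))
```

Under these assumptions, a VIX level of 20 corresponds to an expected one-standard-deviation move of roughly 5.7% in the S&P 500 over the next 30 days.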
VIX receives its name of fear index because it spikes in periods of great uncertainty. This is well represented in Figure 3, where the top three events are the global financial crisis in 2008, the US and Europe debt downgrade in 2011, and the Russian crisis and default of Long-Term Capital Management in 1998, although other well-known black swan events such as the dot-com bubble in 2000 are hidden in the background noise.
VIX is the explanatory variable that captures most of the attention in this study; both the hypothesis testing and the regression modeling strategy pivot around it. Figure 4 represents the variable with three observation periods. It is worth mentioning that the quarterly data filters the information dramatically: the most significant event, the global financial crisis in 2008, scores below the US and Europe debt downgrade in 2011.
USA Treasuries are government debt issued by the USA Treasury Department to finance the government’s activities. FRED provides data on 16 treasuries with constant maturity; Figure 5 presents DGS3MO and DGS10. USA Treasuries occupy a preeminent position in macroeconomic models because they affect the flow of money in the economy. It is no surprise that they are key contributors to this document, especially DGS3MO.
DGS3MO is used on three occasions for the regression variables: once to calculate the response variable as the excess return of GSPC, and twice for explanatory variables, as the main constituent of the real 3-month treasury bill and as the subtracted term of TS. DGS10 is used only once, as the positive term of TS. Figure 5 plots DGS3MO in green and DGS10 in blue; it is easy to notice the long period of low interest rates from 2009 to 2016, intended to stimulate the USA economy after the global financial crisis in 2008.
FRED provides a multitude of corporate bond quotes. This investigation considers DAAA, the yield of bonds rated Aaa, a rating granted to companies with an extremely strong capacity to fulfill their financial commitments, and DBAA, the yield of bonds rated Baa, which reflects that the company has an adequate capacity to fulfill its financial obligations, though it might face challenges in adverse economic conditions.
We create the explanatory variable CS by subtracting DAAA from DBAA; it is representative of the ability of corporations to fulfill their debt obligations. Figure 6 presents DAAA in red and DBAA in green. The time series are noticeably similar to those of USA Treasuries, which is no surprise since corporate debt depends on national interest rates.
The CPIAUCSL is a measure of the average monthly change in the price for goods and services paid by urban consumers between any two time periods. It can also represent the buying habits of urban consumers. This particular index includes roughly 88 percent of the total population, accounting for wage earners, clerical workers, technical workers, self-employed, short-term workers, unemployed, retirees, and those not in the labor force (FRED, 2018).
Subtracting CPIAUCSL from DGS3MO provides the real part of DGS3MO, an important explanatory variable. For the sake of unit coherence, the variable needs to be converted from an index to an annualized percentage; the operations are explained in detail in chapter 5.
It is easy to observe in the figure that the indicator grows steadily; a noticeable acceleration peak appears only during the global financial crisis in 2008.
SPDR S&P 500 is an exchange-traded fund (ETF) that trades under the symbol SPY. Yahoo! Finance provides the share price (SPY) and dividends (SPYdiv), which are used to calculate the explanatory variable SPY dividend yield (SPYDY). Figure 8 shows SPY in three observation periods. It is easy to appreciate that SPY tracks GSPC with high accuracy.
GSPC cannot be used for the dividend yield calculation because, as an index, it does not pay dividends. Fortunately, SPY pays dividends four times per annum and tracks GSPC accurately. Figure 9 plots SPYdiv: since its inception in 1993, the fund has provided a fast-growing stream of dividends, with a peak just before the global financial crisis in 2008.
This chapter describes the 21 variables used to build the regression models, all of them derived from the input variables. Six are response variables, grouped in two types according to whether DGS3MO is periorized or annualized. The remaining 15 are explanatory variables, grouped into VIX, TB3M, CS, TS and SPYDY. In more detail:
- GSPCret_excp: CC return of GSPC in excess of CC return of periorized DGS3MO. With lags of 1, 3 and 12 months.
- GSPCret_exca: CC return of GSPC in excess of CC return of DGS3MO. With lags of 1, 3 and 12 months.
- VIX: CBOE Volatility Index.
- VIXretp: CC return of periorized VIX.
- VIXreta: CC return of VIX.
- Real 3-month rate (TB3M): difference between DGS3MO and CPI.
- TB3Mretp: CC return of periorized TB3M minus CC return of periorized CPI.
- TB3Mreta: CC return of TB3M minus CC return of CPI.
- Credit spread (CS): difference between DBAA and DAAA.
- CSretp: CC return of periorized DBAA minus CC return of periorized DAAA.
- CSreta: CC return of DBAA minus CC return of DAAA.
- Term spread (TS): difference between DGS10 and DGS3MO.
- TSretp: CC return of periorized DGS10 minus CC return of periorized DGS3MO.
- TSreta: CC return of DGS10 minus CC return of DGS3MO.
- SPYDY: SPY dividend yield.
- SPYDYretp: CC return of periorized SPYDY.
- SPYDYreta: CC return of SPYDY.
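The three forms of each spread variable listed above (level, annualized CC return and periorized CC return) can be sketched in Python as follows. The exact formulas of the thesis are not reproduced in the text, so this sketch assumes that the annualized CC return of a rate r in percent is ln(1 + r/100) and that the periorized version divides it by the number of periods per year. All function names and sample rates are hypothetical.

```python
import math

def cc_return_annualized(rate_percent):
    """Continuously compounded equivalent of an annual percentage rate."""
    return math.log(1.0 + rate_percent / 100.0)

def cc_return_periorized(rate_percent, periods_per_year):
    """Annualized CC return scaled down to one observation period (assumption)."""
    return cc_return_annualized(rate_percent) / periods_per_year

def spread_variants(high_pct, low_pct, periods_per_year):
    """Level, annualized CC and periorized CC versions of a spread."""
    return {
        "level": high_pct - low_pct,
        "reta": cc_return_annualized(high_pct) - cc_return_annualized(low_pct),
        "retp": (cc_return_periorized(high_pct, periods_per_year)
                 - cc_return_periorized(low_pct, periods_per_year)),
    }

# Hypothetical yields: DBAA = 5%, DAAA = 4%, monthly observations.
cs = spread_variants(5.0, 4.0, 12)
print(cs)
```

The same helper would apply to TS (DGS10 minus DGS3MO) and, with CPI as the subtracted term, to TB3M.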
This investigation evaluates three different forecasting horizons (1, 3 and 12 months); consequently, the response variables are lagged 1, 3 and 12 months with respect to the explanatory variables. When the explanatory and response variables are consolidated, the lagging operation trims one more year from the data set. Table 3 summarizes the number of observations and the data range.
Table 4 lists the 21 regression variables. The column with the variable name gives a distinctive abbreviation of the variable. The majority of the variables are expressed on a per-unit basis (per one), except VIX, TB3M, CS, TS and SPYDY, which are expressed on a per-hundred basis (percent). The table also captures a simplified view of the calculation method for each case. The next chapters provide a detailed explanation of each case.
To test the forecasting effectiveness of the regression models, the response variable must be lagged with respect to the explanatory variable. The reverse is equally valid; the key requirement is that the variables are lagged relative to each other. In this case, GSPC is lagged 1, 3 and 12 months with respect to the explanatory variables.
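The lagging operation can be pictured as pairing each explanatory observation with the response observed some periods later. The following Python sketch is a minimal illustration with a hypothetical helper `lag_pairs` and made-up monthly values; it is not the thesis code.

```python
def lag_pairs(explanatory, response, lag):
    """Pair x[t] with y[t + lag]; the last `lag` x values have no partner and drop out."""
    if lag < 1:
        raise ValueError("lag must be at least 1 period")
    return list(zip(explanatory[:-lag], response[lag:]))

# Hypothetical monthly series, for illustration only.
vix = [10, 11, 12, 13, 14]
ret = [0.1, 0.2, 0.3, 0.4, 0.5]

pairs_3m = lag_pairs(vix, ret, 3)
print(pairs_3m)
```

Each additional period of lag removes one usable pair, which is why the longest horizon shortens the consolidated data set the most.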
Figure 10 plots three lags for daily and monthly data and two lags for quarterly data. Note that the one-month lagged data (1m) seems to overlap with the non-lagged data (0m) due to the resolution of the chart.
The previous image is intended to illustrate the lagging concept; in computational terms, it is not GSPC that is lagged but GSPC in excess of DGS3MO, either periorized or annualized.
The next three equations represent the calculation method for the three time series of GSPCret_excp. Each one includes the lag period and the CC return of GSPC minus the periorized CC return of DGS3MO.
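One possible reading of this calculation, on monthly data, is sketched below in Python. It assumes that the CC return of GSPC over a horizon of h months is ln(GSPC[t+h] / GSPC[t]) and that the periorized CC return of DGS3MO is ln(1 + DGS3MO[t]/100) scaled by h/12. Both formulas, the function name and the sample values are assumptions for illustration, not the equations of the thesis.

```python
import math

def excess_cc_return(prices, tbill_pct, t, horizon_months):
    """CC index return over the horizon minus the risk-free CC return for the same span."""
    index_ret = math.log(prices[t + horizon_months] / prices[t])
    riskfree_ret = math.log(1.0 + tbill_pct[t] / 100.0) * horizon_months / 12.0
    return index_ret - riskfree_ret

# Hypothetical monthly index levels and 3-month T-bill yields (annual percent).
prices = [100.0, 101.0, 102.0, 103.0, 105.0]
tbill = [2.0, 2.0, 2.0, 2.0, 2.0]

excess_3m = excess_cc_return(prices, tbill, 0, 3)
print(excess_3m)
```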
The following table and illustrations provide the statistical information and graphical representation of the variables. The daily observations present the shape of a normal distribution with very high kurtosis. Monthly data is closer to a normal distribution but with many observations far from the mean. The quarterly observations do not follow a normal distribution, with a strong presence of observations in the extremes, indicative of fat tails.
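Skewness and kurtosis, the statistics used to describe these shapes, can be computed from central moments. The Python sketch below uses the population (biased) estimators and reports excess kurtosis, so a normal distribution scores 0; the thesis tables may use a different convention. The sample is made up.

```python
def skewness_and_excess_kurtosis(values):
    """Population skewness and excess kurtosis from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis

# Hypothetical symmetric sample with a few extreme values.
sample = [-3.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 3.0]
skew, kurt = skewness_and_excess_kurtosis(sample)
print(skew, kurt)
```

A positive excess kurtosis, as in this sample, signals heavier tails than a normal distribution with the same variance.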
The equations for GSPCret_exca are very similar to those for GSPCret_excp; the difference resides in the method used to calculate the excess return. In this case the excess is calculated on an annualized rather than a periorized basis.
The statistics and charts present a distinctive distribution with extremely high kurtosis, observations concentrated around the mean, fat tails in both directions and some negative skewness for the monthly and quarterly time series.
This study uses three expressions of VIX as explanatory variables. The first one is the input variable VIX and the other two are CC returns calculated as annualized returns and periorized returns. Mathematically:
Table 7 and Figure 13 depict VIX with the distinctive shape of a log-normal distribution with high positive kurtosis, though a high standard deviation flattens the curve. VIXretp shows a clear log-normal distribution, although the quarterly data is questionable. VIXreta is a variable with a fairly normal distribution and high kurtosis.
In economics, the short-term obligations of the USA Treasury Department are a meaningful variable because they affect the flow of money in the economy. In this investigation we use the real 3-month treasury bill, that is, the difference between the 3-month treasury bill and CPI inflation.
The first step is to calculate the CPI, since it is not a direct input variable. CPI is calculated as the average over the last 12 months of CPIAUCSL, expressed as a simple return over the last 12 months.
The time series of CPI daily data uses linear interpolation for missing observations between months. The monthly data is derived from previous equation and the quarterly data ignores the observations that are not quarterly.
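The two operations involved, turning the index into a 12-month simple return and filling daily gaps by linear interpolation, can be sketched in Python as follows. The sketch covers only the simple return over 12 months and leaves out any additional averaging; the function names and index levels are hypothetical.

```python
def yoy_simple_return(index_levels):
    """Simple return (percent) of each monthly level versus the level 12 months earlier."""
    return [100.0 * (index_levels[i] / index_levels[i - 12] - 1.0)
            for i in range(12, len(index_levels))]

def interpolate_linear(start_value, end_value, n_steps):
    """Evenly spaced values from start to end, both included (n_steps intervals)."""
    return [start_value + (end_value - start_value) * step / n_steps
            for step in range(n_steps + 1)]

# Hypothetical index: flat for a year, then 2% higher.
levels = [100.0] * 12 + [102.0]
inflation = yoy_simple_return(levels)

# Fill the days between two monthly inflation readings of 2% and 3%.
filled = interpolate_linear(2.0, 3.0, 4)
print(inflation, filled)
```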
Three forms of TB3M are the input variables for the regression variables. TB3M is the difference between DGS3MO and CPI, both expressed in annual percentages. TB3Mretp is the difference between DGS3MO and CPI, both expressed as periorized CC returns. Finally, TB3Mreta is DGS3MO in excess of CPI, both expressed as annualized CC returns. Mathematically, this is represented by the following equations:
The next panels depict TB3M and TB3Mretp with fairly similar frequency distributions; the difference is that the standard deviation of the second is much lower than that of the first, which raises the bell of the normal distribution. TB3Mreta shows observations highly concentrated around the mean, which implies high kurtosis.
Credit spread can have different definitions; in our case it is the difference between two Moody’s seasoned corporate bond yields of different credit ratings. Since they are related to companies, investors might interpret a positive value as a sign of a positive economic outlook, and a negative or narrow difference as a sign of a turbulent and uncertain economic future. We use three expressions as explanatory variables. CS is the difference between DBAA and DAAA, both expressed in annual percentages. CSretp is the difference between DBAA and DAAA, both expressed as periorized CC returns. Finally, CSreta is DBAA in excess of DAAA, both expressed as annualized CC returns. Mathematically:
The next representations show CS and CSretp with a familiar log-normal distribution, consistent with high skewness; in the first case the standard deviation is too high to lift the curve, and in the second case it is the other way around. The chart depicts a fairly normal distribution for CSreta, with high kurtosis for daily observations.
Term spread, or the slope of the yield curve, is the difference between the yields of 10-year treasury notes and 3-month treasury bills. It is interpreted similarly to CS: a positive value of TS is a sign of a stable economic future, and a narrow difference or an inversion is a sign of worsening economic conditions. In our study, three forms of TS are the input variables for the regressions. TS is the difference between DGS10 and DGS3MO, both expressed in annual percentages. TSretp is the difference between DGS10 and DGS3MO, both expressed as periorized CC returns. Finally, TSreta is DGS10 in excess of DGS3MO, both expressed as annualized CC returns. Mathematically, this is represented by the following equations:
Table 10 and Figure 17 show TS and TSretp with a distinctive random distribution; the exception is TSretp with daily observations, which matches a log-normal shape. TSreta presents an evident normal distribution with extreme kurtosis.
The dividends paid by the S&P 500 companies and the dividend yield, that is, the dividend per annum divided by the share price, are common indicators of the USA economy. High dividends indicate that companies are earning high profits, and low dividends indicate the opposite.
SPYDY is calculated as the sum of the past 12 consecutive months of SPYdiv (trailing 12 months, TTM) divided by SPY. GSPC is not used for the dividend yield calculation because indices do not pay dividends; however, SPDR S&P 500, an ETF that trades under the symbol SPY, pays dividends and tracks the GSPC index with high accuracy.
The first step is to calculate the TTM of SPYdiv by summing the dividend distributions of the last year, a total of four per annum. The following expression shows the calculation, where q indexes the quarterly distributions:
$$\mathrm{SPYdivttm}_q = \sum_{i=0}^{3} \mathrm{SPYdiv}_{q-i}$$
Figure 18 plots the variable SPYdivttm; it is similar to SPYdiv but without spikes.
Finally, SPYDY is the ratio between SPYdivttm and SPY expressed as a percentage:
$$\mathrm{SPYDY}_t = 100 \times \frac{\mathrm{SPYdivttm}_t}{\mathrm{SPY}_t}$$
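The two steps of the dividend yield calculation, the trailing sum of four quarterly dividends and the ratio to the share price, can be sketched in Python. The helper names, the dividends and the price are hypothetical and serve only as an illustration.

```python
def trailing_sum(values, window=4):
    """Sum of the last `window` values at each position where a full window exists."""
    return [sum(values[i - window + 1:i + 1])
            for i in range(window - 1, len(values))]

def dividend_yield_pct(dividend_ttm, price):
    """Trailing-twelve-month dividend divided by price, in percent."""
    return 100.0 * dividend_ttm / price

# Hypothetical quarterly dividends per share.
spy_div = [1.0, 1.1, 1.2, 1.3, 1.4]
spy_div_ttm = trailing_sum(spy_div)

# Hypothetical share price on the date of the latest TTM value.
spydy = dividend_yield_pct(spy_div_ttm[-1], 250.0)
print(spy_div_ttm, spydy)
```

The first three quarters produce no TTM value because a full year of distributions is not yet available.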
Figure 19 shows the SPYDY time series. It presents a clear peak around the global financial crisis in 2008, which can be interpreted as GSPC prices having crashed while the dividends were still being distributed. This is characteristic of a lagging indicator rather than of a leading indicator, which is what we are trying to identify.
The last set of explanatory variables for this work is based on SPYDY. Following the systematic pattern of this research, the first one is SPYDY in percentage terms and the other two are CC returns of the periorized and annualized variable.
The next information presents similarities to some of the previous cases. SPYDY and SPYDYretp follow a log-normal distribution with two different standard deviations, though the quarterly data embodies an imperfect normal distribution. SPYDYreta presents a more distinctive normal distribution with high kurtosis in the daily data.
This research contemplates 336 regression models structured by response and explanatory variables, observation periods, forecasting horizons and model type. In more detail:
- 2 sets of 168 regressions, one per response variable: GSPCret_excp and GSPCret_exca.
- 3 observation periods: daily, monthly and quarterly.
- 3 time horizons: 1, 3 and 12 months, except for quarterly data, which has only 2 horizons of 3 and 12 months.
- The first 15 model types are intended to study regression fits with one explanatory variable. The models with VIX are used for hypothesis testing in absolute terms or per-se.
- Model types 16, 17 and 18 evaluate the predictive power of well-known multivariate models. They act as a benchmarking reference.
- Model type 19, 20 and 21 study the additive contribution of VIX to previous models. They are used for hypothesis testing in relative terms.
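The count of 336 models follows directly from the structure listed above: 2 response variables, 8 combinations of observation period and horizon, and 21 model types. The short Python sketch below enumerates that grid; the variable names are illustrative.

```python
# Horizons (in months) available for each observation period.
horizons_by_period = {
    "daily": [1, 3, 12],
    "monthly": [1, 3, 12],
    "quarterly": [3, 12],
}
response_variables = ["GSPCret_excp", "GSPCret_exca"]
model_types = range(1, 22)  # model types 1 to 21

grid = [
    (response, period, horizon, model_type)
    for response in response_variables
    for period, horizons in horizons_by_period.items()
    for horizon in horizons
    for model_type in model_types
]
per_response = len(grid) // len(response_variables)
print(len(grid), per_response)
```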
The next table summarizes all regression models:
The discussion of the regression results follows a repetitive pattern along three chapters. Firstly, we identify the regression models with the best linear fit using Ordinary Least Squares (OLS). The main criterion is adjusted R2, an indicator that captures how well the explanatory variables explain the variance of the response variable. Adjusted R2 is at most 100% and can turn slightly negative when the explanatory variables explain less than would be expected by chance; whether the relationship is proportional or inversely proportional is given by the sign of the coefficient. A regression model with high predictive power should score a high adjusted R2.
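To make the criterion concrete, here is a minimal sketch of a single-variable OLS fit and its adjusted R2, computed as 1 - (1 - R2)(n - 1)/(n - p - 1) with n observations and p explanatory variables. The function names and the data are hypothetical; the thesis project itself relies on R for its regressions.

```python
def ols_simple(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares (one explanatory variable)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def adjusted_r2(x, y, b0, b1, p=1):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical data: a noisy, downward-sloping relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [5.1, 4.2, 2.8, 2.1, 0.9, 0.2]
b0, b1 = ols_simple(x, y)
adj = adjusted_r2(x, y, b0, b1)
```

In this example the slope is negative while the adjusted R2 is positive and close to 1, which illustrates that the direction of the relationship is read from the coefficient and the quality of the fit from adjusted R2.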
Secondly, we discuss the significance of the coefficients. To reject the null hypothesis, the coefficient of the explanatory variable must be different from 0; the larger the coefficient in absolute terms relative to its standard error, the stronger the evidence against the null hypothesis. We conduct a t-test to obtain the probability of observing a coefficient at least this extreme if the null hypothesis were true. This is the significance level, and it is represented by asterisks beside the coefficient: 1% with ***, 5% with ** and 10% with *. A solid argument to reject the null hypothesis requires a coefficient clearly different from 0 at a low significance level.
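The asterisk notation can be sketched as a small mapping from a p-value to a label. The function names are hypothetical, and the use of strict inequalities at the 1%, 5% and 10% thresholds is an assumption; the numbers passed in would come from the regression output.

```python
def t_statistic(coefficient, standard_error):
    """t-statistic for the null hypothesis that the coefficient equals zero."""
    return coefficient / standard_error


def significance_stars(p_value):
    """Map a p-value to the asterisk notation used beside the coefficients."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""
```

A coefficient that is three standard errors away from zero, for example, has a t-statistic of 3; in a large sample this corresponds to a two-sided p-value below 1%, so it would carry three asterisks.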
Thirdly, for those regression models with high predictive power and high relevance for the purpose of this study, we conduct a thorough evaluation of the regression results with a set of graphical representations.
- Regression model fit is a scatter plot of the response variable against the explanatory variable, with the linear regression fit as a green line. The chart depicts the observations with high leverage in red and marks with a blue cross the values with a Cook’s distance higher than 1.
- Leverage of observations plots the influence of each observation on the predicted value. The leverage lies between 0, meaning low influence, and 1, indicating a highly influential case.
- Residuals vs. fitted is a scatter plot that helps to evaluate the assumption of randomness of the residuals. A linear regression fit using OLS should present homoscedasticity, meaning that for each level of the explanatory variable the variance of the residual term should be constant. For homoscedasticity, the chart should show residuals evenly distributed against the fitted values. A funnel pattern reflects heteroscedasticity, meaning that the variances are unequal. Other patterns reflect non-linearity in the model being built.
- The normal Q-Q chart plots the standardized residuals vs. theoretical quantiles. It helps to check the assumption that the errors are normally distributed; if that is the case, the residuals are plotted along the dashed line in the chart.
- A perhaps more comprehensive representation of the normality check is the histogram of standardized residuals. It represents the frequency of the standardized residuals with the normal distribution overlaid on it. The chart helps to evaluate the skewness, kurtosis and fat tails of the residuals.
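The leverage and Cook's distance used in these charts can be sketched for a single explanatory variable with the standard textbook formulas: leverage h_i = 1/n + (x_i - mean(x))^2 / Sxx and Cook's distance D_i = e_i^2 / (p s^2) * h_i / (1 - h_i)^2, where p = 2 is the number of fitted parameters. The function name and the data are hypothetical, and this is not claimed to be the code used in the thesis.

```python
def leverage_and_cooks(x, y):
    """Leverage h_i and Cook's distance D_i for the OLS fit y = b0 + b1 * x."""
    n, p = len(x), 2  # p counts the fitted parameters: intercept and slope
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate
    leverage = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    cooks = [
        (e ** 2 / (p * s2)) * (h / (1.0 - h) ** 2)
        for e, h in zip(residuals, leverage)
    ]
    return leverage, cooks


# Hypothetical data: the last point lies far from the others along x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 15.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]
leverage, cooks = leverage_and_cooks(x, y)
flagged = [i for i, d in enumerate(cooks) if d > 1.0]  # Cook's distance above 1
```

In this example only the last observation is flagged, because it combines a leverage close to 1 with a non-zero residual.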
This chapter evaluates the per-se predictive power of VIX. The next table presents all single-variable regression models with VIX. It is organized in three big blocks, one per observation period: daily, monthly and quarterly data. The columns of every block indicate the forecasting horizons of 1, 3 and 12 months. The rows are divided into two big groups, one per response variable. The remaining information is the name of the regression, the adjusted R2, the adjusted R2 rank, which indicates which adjusted R2 is larger in absolute terms, the intercept and the name of the regression variable. For each coefficient, the standard error is provided in brackets.
The regression models with daily observations provide a very small adjusted R2; the highest value is 0.051% from model 175. VIX is definitely not a good predictor with daily data.
Model 238, from the monthly observations, gives an adjusted R2 of 9.71% with a coefficient of 0.756 and significance at the 0.01-level. The model is built with GSPCret_exca and VIXreta with a time horizon of one month. VIX captures the expected volatility of the S&P 500 over a one-month period, so it is understandable that the variable captures some of the future behavior of GSPC at a one-month horizon, as the results demonstrate. The coefficient is positive; consequently, positive returns of VIX translate into positive returns of GSPC at a one-month horizon. This case is sufficient to reject the null hypothesis H0 and accept the alternative hypothesis H1.
The quarterly observation data provides better results than the daily data but worse than the monthly data. The best model is number 128 with an adjusted R2 of 3.47%; however, the significance level of the coefficient worsens to the 0.05-level. The empirical data shows that VIX is not a good predictor of future returns with quarterly data.
The next images present relevant information on model 238. The regression model fit is built with GSPCret_exca vs. VIXreta, with the regression fit in green, the observations with high leverage in red and one observation with a high Cook’s distance. The leverage chart highlights the influential observations, where the China RMB devaluation in 2015 and the financial crisis in 2008 are the top influential events. The residuals vs. fitted chart shows that the residuals are concentrated, a symptom of non-linearity. The normal Q-Q chart shows that the residuals do not follow a normal distribution, which is confirmed by the histogram of standardized residuals, showing high kurtosis and fat tails.
This section compares the VIX regression fits with other well-known predictors of stock market returns. The structure of the tables is similar to the previous section but with more variables.
Once again, the daily observation data provides very low predictive power in all cases, as observed in Table 14. The best case is model 202 with an adjusted R2 of 0.38% and coefficient significance at the 0.01-level with a 1-month horizon using GSPCret_exca and TSreta. Daily stock market prices are highly volatile, so expecting the return of one day to influence a 1, 3 or 12 months horizon seems very improbable, as the empirical results demonstrate. The low adjusted R2 implies that this case is not good for comparison purposes.
The monthly observations from Table 15 are a good arena for the comparison due to the high degree of statistical significance of the model fit. With the GSPCret_excp response variable, many explanatory variables provide a very high adjusted R2. Models with TB3M provide the highest adjusted R2, specifically:
- Models 73, 74 and 75, using TB3M expressed as a percentage, yield 41.6%, 43% and 41.5% at horizons of 1, 3 and 12 months, with negative coefficients of -0.0229, -0.0233 and -0.0226 and significance at the 0.01-level.
- Models 76, 77 and 78, using TB3Mretp, yield 41.1%, 43.5% and 41.8% at horizons of 1, 3 and 12 months, with negative coefficients of -0.961, -0.988 and -0.957 and significance at the 0.01-level.
Other cases with good forecasting power are TSretp, from 23.6% to 28.1%, and TS, from 21% to 25.2%, as well as CS, CSretp, SPYDY and SPYDYretp at around 10%. However, models with VIX do not exceed 2.3%; consequently, we can conclude that VIX has limited predictive power compared with the other variables.
When the response variable is GSPCret_exca, model 238, which is based on VIXreta, shows the highest performance, with an adjusted R2 of 9.71%. This case was discussed in the previous chapter. The closest models are number 265, with 5.85% using TSreta at a one-month horizon, and number 247, with 3.75% using TB3Mreta at a one-month horizon.
In relative terms, VIX presents significant predictive power with GSPCret_exca; however, it falls well behind with GSPCret_excp as the response variable.
It is worth checking in detail the best result, model 77, with an adjusted R2 of 43.5%. The regression model fit does not present observations with high leverage or a Cook’s distance higher than 1. The top influencers are the global financial crisis in 2008 and the US debt downgrade in 2011. The residuals vs. fitted chart shows a fairly evenly distributed pattern, indicating homoscedasticity and linearity. The normal Q-Q chart and the histogram of standardized residuals depict a quasi-normal distribution with a negative fat tail and some positive kurtosis.
The models with quarterly data in Table 16 deliver the highest adjusted R2 for a single explanatory variable. As in the previous analysis, models with the GSPCret_excp response variable dominate over GSPCret_exca. In this data set, models with TB3M score the highest:
- Models 133 and 134, using TB3M expressed as a percentage, yield 72.9% and 64.9% at horizons of 3 and 12 months, with negative coefficients of -0.147 and -0.135 and significance at the 0.01-level.
- Models 135 and 136, using TB3Mretp, produce similar results but with higher coefficients.
Models 147 and 148, using TSretp, also score very high, with 67.5% and 54.7%. In third place are models 141 and 142, with 37.6% and 42.8% using CSretp.
Models using VIX fall well behind, with an adjusted R2 of 3.47% in model 128. It is easy to conclude that VIX is a poor indicator of market returns for quarterly observation periods.
A closer look at model 133 provides similar results to model 77. The regression model fit does not present any highly influential observation. The top leverage observations are again the global financial crisis and the US debt downgrade. The residuals vs. fitted are not evenly distributed, with clusters on the left-hand side and right-hand side. The normal Q-Q chart depicts a good fit to the normal distribution line; however, the histogram of standardized residuals shows a questionable match to the normal distribution.
This chapter evaluates multivariate models and discusses the contribution of the VIX variable to the models. The model types under discussion are numbers 16 to 21. Model types 16 to 18 include TB3M, CS, TS and SPYDY, and model types 19 to 21 add VIX.
The first panel presents the daily observation data set. The results show a slight increase in adjusted R2 compared with the single-variable results; however, the overall statistical significance is poor. The response variable GSPCret_exca produces better results, up to 0.667%, compared with GSPCret_excp with only 0.119%. Comparing the top-scoring models, with a single variable model 3 yields 0.14% and with multiple variables model 229 tops out at 0.667%. Once more, the analysis shows that daily observations are not a good indicator of market returns.
With monthly observations, the multivariate models provide a small improvement over the single-variable models. The best single-variable model is number 77 with 43.5% and the best multivariate model is number 122 with 44.9%, a positive difference of 1.4 percentage points.
The response variable GSPCret_excp presents superior results, with 44.9% compared to 17.2% with GSPCret_exca. Again, periorized returns are more effective than annualized returns.
Comparing the three best model types with five variables versus four variables gives:
- Model 122 increases by 0.1 percentage points (from 44.8% to 44.9%) when VIXretp is added, with no significance level.
- Model 118 increases by 0.8 percentage points (from 43.8% to 44.6%) when VIX is added, with significance at the 0.05-level.
- Model 119 increases by 0.2 percentage points (from 44.3% to 44.5%) when VIX is added, with no significance level.
The data is not conclusive for the top multivariate models: VIX slightly increases adjusted R2, but the coefficients and statistical significance are very low. The only clear contribution is model 292, where VIXreta adds 9.24 percentage points (from 7.96% to 17.2%).
Figure 24 depicts nine charts for model 122. The first five charts plot the regression model fit for all explanatory variables; some observations are highly influential, but none of them has a high Cook’s distance. TB3Mretp is the top contributor, with a coefficient of -0.89 and significance at the 0.01-level. Once again, the leverage chart shows the global financial crisis and the US debt downgrade as the top contributors. The residuals vs. fitted chart presents fairly randomly distributed residuals, although some clusters of residuals reflect the non-linearities of the model, as with CSretp or SPYDYretp. Finally, the standardized residuals follow a normal distribution to a certain degree, with negative fat tails.
The multivariate models with quarterly data from Table 19 provide the highest forecasting power of this research. Model 159 yields an adjusted R2 of up to 80.4%, which, compared with the highest single-variable value of 72.9% in model 133, gives a difference of 7.5 percentage points.
The results with GSPCret_excp are clearly superior to GSPCret_exca, with 80.4% compared to 4.56%. This is a recurrent result throughout this study.
The contribution of VIX to the winning models is:
- Model 165 decreases by 0.2 percentage points (from 80.4% to 80.2%) when VIXretp is added, with no significance level.
- Model 163 increases by 0.9 percentage points (from 75% to 75.9%) when VIX is added, with significance at the 0.05-level.
- Model 166 decreases by 0.2 percentage points (from 71.6% to 71.4%) when VIXretp is added, with no significance level.
VIX barely improves the multivariate models, and it does so with very small coefficients and low statistical significance; consequently, it is questionable whether VIX contributes positively to these multivariate models.
The last panel shows nine charts of the OLS fit for model 165. The first five plots are the regression fits for the variables; TB3Mretp and TSretp hold good significance at the 0.01-level with coefficients of -0.544 and 0.758 respectively, which appears as a quasi-linear distribution in the respective charts. None of the observations has a high Cook’s distance and only a few have high leverage, such as the global financial crisis and the US debt downgrade. The residuals vs. fitted chart presents a fairly random pattern, a requirement for linear regression. The normal Q-Q chart and the histogram of standardized residuals show a questionable normal distribution.
This study conducts an empirical evaluation to understand the effectiveness of VIX as an indicator of stock market returns. The rationale is that VIX, as a technical and sentiment indicator, may encapsulate some forecasting power for stock market returns.
There is extensive literature on volatility modeling of the S&P 500 and its predictive power; however, it is all based on the annualized CC return of GSPC. This piece of work complements the existing body of literature by including the periorized CC return as well.
The hypothesis testing goes beyond the conventional single-variable per-se evaluation. We also include a relative comparison to well-recognized variables such as TB3M, CS, TS and SPYDY, and the contribution of VIX to multivariate models.
The foundation of any empirical study is the input variables. We acquire the variables GSPC, SPY, VIX, DGS3MO, DGS10, DAAA, DBAA, CPIAUCSL and SPYdiv from trusted sources such as Yahoo! Finance and FRED. We perform the necessary data processing, such as decimation and interpolation. To confirm the consistency of the data, we plot all time series and discuss their most representative traits.
The variables of the regression models are derived from the input variables and cover observation periods of days, months and quarters. There are six response variables, composed of GSPC in excess of periorized DGS3MO and GSPC in excess of DGS3MO, lagged 1 month, 3 months and 12 months. There are 15 explanatory variables, calculated as the percentage, periorized CC return and annualized CC return of VIX, TB3M, CS, TS and SPYDY. Following a similar approach to the input variables, we plot all time series to confirm the quality of the data, and additionally we calculate and discuss basic statistical characteristics such as mean, standard deviation, kurtosis and skewness.
We create 336 linear regression models using OLS as the estimation criterion. The models are structured to support the hypothesis testing: one block for VIX and the relative comparison to other indicators, a second block for multivariate models without VIX and a third block for multivariate models including VIX.
The regression results and discussion are structured in three chapters. In the per-se evaluation we obtain strong evidence that VIX is a good indicator of stock market returns: model 238 gives an adjusted R2 of 9.71% with a coefficient of 0.756 and significance at the 0.01-level for a one-month horizon using monthly observations and annualized returns of GSPC. A detailed evaluation shows that the model presents outliers; the China RMB devaluation and the global financial crisis in 2008 are the top contributors to the model, and the residual analysis shows that the model is non-linear and that the residuals do not follow a normal distribution. The results are enough to reject the null hypothesis, but we expand the study to understand to what extent VIX is a good predictor of stock market returns.
The relative evaluation demonstrates that, in general, VIX is a mediocre indicator compared with TB3M and TS, but only slightly worse than CS and SPYDY. These indicators hold strong predictive power at all three forecasting horizons, especially with quarterly and monthly observation periods. As an example, at a three-month horizon, model 77 with monthly observations using TB3Mretp and GSPCret_excp gives an adjusted R2 of 43.5% with a coefficient of -0.988 and significance at the 0.01-level. Model 133, with quarterly observations built with TB3M and GSPCret_excp, gives an adjusted R2 of 72.9% with a coefficient of -0.147 and significance at the 0.01-level.
We check the contribution of VIX in multivariate regressions. We conclude that VIX does not improve the multivariate linear regression fit. For monthly observations, the improvements move in a marginal range from 0.1 to 0.8 percentage points, and for quarterly data the contribution is neutral. The multivariate models provide a modest increase in the regression fit compared with single-variable models: for monthly data the improvement is up to 1.4 percentage points and for quarterly data up to 7.5 percentage points. The best model with monthly data is number 122, with an adjusted R2 of 44.9% and a coefficient of -0.89 at the 0.01-level, with the other coefficients at weaker significance levels. The best model with quarterly data is number 165, with an impressive adjusted R2 of 80.2% and two coefficients at the 0.01-level, TB3Mretp with -0.544 and TSretp with 0.758.
The regression results clearly demonstrate that observations with a daily period do not provide a statistically significant result. In single-variable regression fits, model 35 scores the highest with a humble 0.14% of adjusted R2. With multivariate regression the adjusted R2 gets slightly better, with model 229 being the best option at only 0.667%. This study confirms the belief that daily closing prices are a bad indicator of future returns at 1, 3 or 12 months due to the volatile nature of the stock market.
Finance managers or investors may take advantage of the results of this work. Historical data provides strong evidence of positive and negative correlations. In order of significance: a very strong negative relationship with TB3M, a strong positive relationship with TS and a moderate positive relationship with CS, SPYDY and VIX. However, if we assume the weak form of the efficient market hypothesis or the random walk theory, historical data is not representative of future returns; consequently, investors should be cautious when making decisions based on historical data.
This study suggests at least two new research lines. One is to explore the decomposition of VIX into variance risk premium and realized variance following methods such as GARCH, HAR, etc., and to complement the existing literature with periorized CC returns rather than annualized CC returns only. The second direction could be a microeconomic study using single-company volatility indexes such as VXIBM or VXAPL, representative of IBM and Apple respectively.
Carr, P. & Madan, D., 1998. Towards a Theory of Volatility Trading. In: Volatility: New Estimation Techniques for Pricing Derivatives. London: Risk Books, pp. 417-427.
Corsi, F., Pirino, D. & Reno, R., 2010. Threshold bipower variation and the impact of jumps on volatility forecasting. Journal of Econometrics, 159(2), pp. 276-288.
Chow, V., Jiang, W. & Li, J., 2014. Does VIX Truly Measure Return Volatility?. SSRN.
Curme, C., Preis, T., Stanley, E. H. & Susannah Moat, H., 2014. Quantifying the semantics of search behavior before stock market moves. PNAS, 111(32), pp. 11600-11605.
Ahern, K. R. & Sosyura, D., 2015. Rumor Has It: Sensationalism in Financial Media. The Review of Financial Studies, 28(7), pp. 2050-2093.
Anilowski, C., Feng, M. & Skinner, D., 2007. Does earnings guidance affect market returns? The nature and information content of aggregate earnings guidance. Journal of Accounting and Economics, 44(1-2), pp. 36-63.
Black, F., 1976. Proceedings of the 1976 Meetings of the American Statistical Association. s.l., Business and Economics Section.
Baker, M. & Wurgler, J., 2007. Investor Sentiment in the Stock Market. Journal of Economic Perspectives, 21(2), pp. 129-152.
Bekaert, G., Hoerova, M. & Lo Duca, M., 2013. Risk, uncertainty and monetary policy. Journal of Monetary Economics, 60(7), pp. 771-788.
Bekaert, G. & Hoerova, M., 2014. The VIX, the variance premium and stock market volatility. Journal of Econometrics, 183(2), pp. 181-192.
Bollerslev, T., Tauchen, G. & Zhou, H., 2009. Expected Stock Returns and Variance Risk Premia. The Review of Financial Studies, 22(11), pp. 4463-4492.
Britten‐Jones, M. & Neuberger, A., 2000. Option Prices, Implied Price Processes, and Stochastic Volatility. The Journal of Finance, 55(2), pp. 839-866.
Daróczi, G. et al., 2013. Introduction to R for Quantitative Finance. 2nd Edition ed. s.l.:Packt Publishing.
Da, Z., Engelberg, J. & Gao, P., 2015. The Sum of All FEARS Investor Sentiment and Asset Prices. The Review of Financial Studies, 28(1), pp. 1-32.
Demeterfi, K., Derman, E., Kamal, M. & Zou, J., 1999. A Guide to Volatility and Variance Swaps. The Journal of Derivatives, 6(4), pp. 9-32.
CBOE, 2018. The CBOE Volatility Index – VIX®, Chicago: CBOE.
Fernandes, M., Medeiros, M. C. & Scharth, M., 2013. Modeling and predicting the CBOE market volatility index. Journal of Banking & Finance, 40(1), pp. 1-10.
Ferreira, M. & Santa-Clara, P., 2011. Forecasting stock market returns: The sum of the parts is more than the whole. Journal of Financial Economics, 100(3), pp. 514-537.
FRED, 2018. Consumer Price Index for All Urban Consumers: All Items. [Online]
Available at: https://fred.stlouisfed.org/series/CPIAUCSL
[Accessed 30 October 2018].
French, K. R., Schwert, W. G. & Stambaugh, R. F., 1987. Expected stock returns and volatility. Journal of Financial Economics, 19(1), pp. 3-29.
Hull, J. C., 2018. Options, Futures and Other Derivatives. 9th Edition ed. Harlow: Pearson.
Jiang, G. J. & Tian, Y. S., 2005. The Model-Free Implied Volatility and Its Information Content. The Review of Financial Studies, 18(4), pp. 1305-1342.
Kaplanski, G. & Levy, H., 2010. Sentiment and stock prices: The case of aviation disasters. Journal of Financial Economics, 95(2), pp. 174-201.
Kempthorne, P., Lee, C., Strela, V. & Xia, J., Fall 2013. 18.S096 Topics in Mathematics with Applications in Finance. s.l., Massachusetts Institute of Technology: MIT OpenCourseWare.
Kirkpatrick II, C. D. & Dahlquist, J. A., 2010. Technical Analysis: The Complete Resource for Financial Market Technicians. 2nd Edition ed. New Jersey: FT Press.
Kozyreva, M., 2007. How reliable is implied volatility? A comparison between implied and actual volatility on an index at the Nordic Market, Halmstad: Halmstad University.
McMillan, D., 2016. Which Variables Predict and Forecast Stock Market Returns?. SSRN.
McMillan, D. G., 2016. Which Variables Predict and Forecast Stock Market Returns?. Palgrave Pivot.
Natenberg, S., 2015. Options Volatility & Pricing. Advanced trading strategies and techniques. 2nd Edition ed. New York: McGraw-Hill Education.
Pike, R., Neale, B. & Linsley, P., 2015. Corporate Finance and Investment: Decisions and Strategies. 8th Edition ed. Edinburgh: Pearson.
Preis, T., Susannah Moat, H. & Stanley, E. H., 2013. Quantifying Trading Behavior in Financial Markets Using Google Trends. Scientific Reports, Volume 1684.
Seul-Ki, P., Ji-Eun, C. & Dong Wan, S., 2017. Value at risk forecasting for volatility index. Applied Economics Letters, 24(21), pp. 1613-1620.
Standard & Poor’s, 2018. S&P 500 factsheet, s.l.: Standard & Poor’s.
The R Foundation, 2018. What is R?. [Online]
Available at: https://www.r-project.org/about.html
[Accessed 31 October 2018].
Thorlie, M. A., Song, L., Amin, M. & Wang, X., 2015. Modeling and forecasting of stock index volatility with APARCH models under ordered restriction. Statistica Neerlandica, 69(3), pp. 329-356.
University of Michigan, 2018. University of Michigan: Consumer Sentiment. [Online]
Available at: https://fred.stlouisfed.org/series/UMCSENT
[Accessed 15 July 2018].
Whaley, R. E., 2000. The Investor Fear Gauge. The Journal of Portfolio Management, 26(3), pp. 12-17.
Zheng, L., Yuan, K. & Zhu, Q., 2001. Are Investors Moonstruck? – Lunar Phases and Stock Returns. SSRN.
ARCH. Autoregressive Conditional Heteroskedasticity
ARIMA. Autoregressive Integrated Moving Average
CAPM. Capital Asset Pricing Model
CAY. Consumption-wealth ratio
CBOE. Chicago Board Options Exchange
CC. Continuously Compounded
CPI. Consumer Price Index
CPIAUCSL. Monthly consumer price index for all urban consumers
CS. Credit Spread: difference between DBAA and DAAA
CSreta. CC return of DBAA minus CC return of DAAA
CSretp. CC return of periorized DBAA minus CC return of periorized DAAA
DAAA. Moody’s seasoned Aaa corporate bond yield
DBAA. Moody’s seasoned Baa corporate bond yield
DGS10. USA 10-year treasury note
DGS3MO. USA 3-month treasury bill
EMH. Efficient Market Hypothesis
FDSP. Default Spread
FRED. Federal Reserve Bank of St. Louis
GDP. Gross Domestic Product
GSPC. Daily closing value of S&P 500 index
GSPCret_exca. CC return of GSPC in excess of CC return of DGS3MO. With lags of 1, 3 and 12 months
GSPCret_excp. CC return of GSPC in excess of CC return of periorized DGS3MO. With lags of 1, 3 and 12 months
HAR. Heterogeneous Autoregressive
IPI. Industrial Price Index
IV. Implied Volatility
KOSPI. Korea Stock Price Index
NASDAQ. Nasdaq Stock Market
NYSE. New York Stock Exchange
OLS. Ordinary Least Squares
P/D. Price to Dividends
P/E. Price to Earnings
RREL. Risk-free rate
RV. Realized Variance
S&P 500. Standard & Poor’s 500
SPY. Daily closing value of SPDR S&P 500 ETF
SPYdiv. Dividend of SPDR S&P 500 ETF
SPYDY. SPY dividend yield
SPYDYreta. CC return of SPYDY
SPYDYretp. CC return of periorized SPYDY
SSgA. State Street Global Advisors
TB3M. 3-Month Treasury Bills: difference between DGS3MO and CPI
TB3Mreta. CC return of TB3M minus CC return of CPI
TB3Mretp. CC return of periorized TB3M minus CC return of periorized CPI
TMSP. Term Spread
TS. Term Spread: difference between DGS10 and DGS3MO
TSreta. CC return of DGS10 minus CC return of DGS3MO
TSretp. CC return of periorized DGS10 minus CC return of periorized DGS3MO
VIX. Volatility Index: daily closing value of CBOE volatility index
VIXreta. CC return of VIX
VIXretp. CC return of periorized VIX
VP. Variance Premium
The prices of securities are determined by an equilibrium between buyers and sellers in stock markets. Buyers and sellers are the market constituents, formed by individual investors, institutional investors and governments. Market agents make trading and investment decisions based on fundamental analysis, technical analysis, market sentiment or a combination of all of them.
Fundamental analysis is a technique that determines the intrinsic value and future performance of a company based on financial and accounting data provided by the company, industry analysis and macroeconomic data (Pike, et al., 2015, p. 35). The financials cover an extensive variety of indicators such as price to earnings ratio, dividend yield, payout ratio, earnings per share, quick ratio or net profit margin. The industry analysis makes it possible to compare the financials of the firm with those of its peers, providing a relative valuation rather than a per-se one. Fundamental valuation models may include macroeconomic data such as GDP, unemployment rate and interest rates that help to identify long trends and business cycles.
Technical analysis is a methodology based on financial time series charts that identifies trading opportunities and trends in the data (Pike, et al., 2015, p. 36). Unlike fundamental analysis, which is concerned with the inherent value of the company, technical analysis is based on the price and trade volume of the security (Kirkpatrick II & Dahlquist, 2010). Technicians search for patterns in charts such as head and shoulders, double bottom or double top that signal trading opportunities. Additionally, they build indicators based on the time series, such as the relative strength index or the moving average, to expand the technical analysis.
Market sentiment stands for the prevailing attitude of market participants towards a particular security, sector or the economy as a whole. It is described as bullish when the security price has an upward trend and bearish in the opposite case. Market sentiment indicators are built from fundamental, technical and psychological factors. The psychological factors are deduced from diverse sources, among them surveys such as the University of Michigan Consumer Sentiment Index (University of Michigan, 2018), data mining of social media, newspapers and blogs as studied by Ahern & Sosyura (2015), household behavior during internet searches as demonstrated by Da, et al. (2015), Preis, et al. (2013) and Curme, et al. (2014), and non-economic data ranging from aviation disasters (Kaplanski & Levy, 2010) to the effect of the lunar phase on investors (Zheng, et al., 2001).
Return is a percentage defined as the change in price relative to the initial price. It is an important concept because successful investors are those who obtain high returns. Asset returns have more attractive statistical properties than asset prices themselves; consequently, it makes more sense to conduct the analysis using returns rather than asset prices.
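For an asset that does not pay dividends, the simple return at time $t$ is defined as:

$$R_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1$$

where $P_t$ is the price at time $t$ and $P_{t-1}$ is the price at time $t-1$. In percent terms, $R_t(\%) = 100 \cdot R_t$.
If the asset is held for $k$ periods, the k-period return is defined as the product of the $k$ one-period gross returns minus one:

$$R_t(k) = \prod_{j=0}^{k-1} (1 + R_{t-j}) - 1$$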
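A short numerical sketch of both definitions follows; the prices are hypothetical and the function names are illustrative only.

```python
def simple_return(p_prev, p_now):
    """One-period simple return: (P_t - P_{t-1}) / P_{t-1}."""
    return (p_now - p_prev) / p_prev


def k_period_simple_return(one_period_returns):
    """k-period simple return: product of (1 + R) over the k periods, minus one."""
    gross = 1.0
    for r in one_period_returns:
        gross *= 1.0 + r
    return gross - 1.0


# Hypothetical prices over three periods.
prices = [100.0, 110.0, 99.0, 108.9]
returns = [simple_return(a, b) for a, b in zip(prices, prices[1:])]
total = k_period_simple_return(returns)
```

The compounded result matches the return computed directly from the first and last price, 108.9 / 100 - 1 = 8.9%, whereas simply adding the three one-period returns (10%, -10% and 10%) would give 10%.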
In economics, an annualized percentage denotes a variable with units in percent referring to changes over a one-year period. This is the case for variables such as VIX, DGS3MO or DAAA.
An annualized percentage variable can be quoted every day; consequently, the period of the percentage differs from the observation period.
In an analysis focused on the return between observations, it might be useful to transform the annualized percentage variable into a periorized percentage variable. In that case, the following formula converts the annualized percentage to a periorized percentage at observation time $t$:
Being and the difference in time between two consecutive observations and divided by number of observations per year.
Let $R_t$ denote the simple return of one period; the CC return $r_t$ is defined as:

$$r_t = \log_e(1 + R_t)$$

Note that the logarithm with base e is the natural logarithm. Financial literature usually uses the term log for the natural logarithm; this convention is also used in this study.
Multiperiod CC returns are simpler than multiperiod simple returns: if the asset is held for $k$ periods, the k-period return is defined as the sum of the $k$ one-period returns:

$$r_t(k) = \sum_{j=0}^{k-1} r_{t-j}$$

The simplicity of the multiperiod CC return formula is a factor that favors analysis with CC returns rather than simple returns.
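The additivity of CC returns can be verified numerically. The sketch below uses hypothetical prices and an illustrative function name; it computes the CC return as the log of the price ratio, which is the same as log(1 + R) for the simple return R.

```python
import math


def cc_return(p_prev, p_now):
    """One-period continuously compounded return: log(P_t / P_{t-1})."""
    return math.log(p_now / p_prev)


# Hypothetical prices over three periods.
prices = [100.0, 110.0, 99.0, 108.9]
cc = [cc_return(a, b) for a, b in zip(prices, prices[1:])]
k_period_cc = sum(cc)                       # multiperiod CC return by addition
implied_simple = math.exp(k_period_cc) - 1  # back to a simple return
```

The sum of the one-period CC returns equals the log of the ratio between the last and the first price, and exponentiating it recovers the multiperiod simple return.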
Linear regression consists of modeling the relationship between the response and the explanatory variables as a linear relationship.
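Given $n$ observations (in our study: 6091, 282 and 94 for the daily, monthly and quarterly periods) in each data set, each observation $i$ has a response value $y_i$ (GSPCret_excp or GSPCret_exca) and a column vector of $p$ explanatory variables $x_i$ (VIX, TB3M, etc.):

$$y_i = \beta_0 x_{i,0} + \beta_1 x_{i,1} + \dots + \beta_p x_{i,p} + \varepsilon_i, \qquad i = 1, \dots, n$$

where $\beta_0$ is called the intercept and $x_{i,0} = 1$.
The previous linear system usually has no exact solution; one common estimation criterion is OLS, which finds $\hat{\beta}$ by minimizing $Q(\beta)$, the sum of squared residuals:

$$\hat{\beta} = \arg\min_{\beta} Q(\beta), \qquad Q(\beta) = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{p} \beta_j x_{i,j} \right)^2$$

where $\hat{\beta}_j$ are the coefficient values of the regression. The fitted values from the regression are:

$$\hat{y}_i = \sum_{j=0}^{p} \hat{\beta}_j x_{i,j}$$

and the residuals from the regression are calculated as the difference between $y_i$ and $\hat{y}_i$:

$$\hat{\varepsilon}_i = y_i - \hat{y}_i$$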
The linear regression model using OLS is the backbone of this research.
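For reference, the OLS minimization has a well-known closed-form solution in matrix notation. The symbols here are introduced only for this sketch: $y$ is the vector of the $n$ responses and $X$ is the matrix with one row per observation, formed by a leading 1 followed by the explanatory variables. The result assumes that $X^{\top}X$ is invertible.

```latex
\hat{\beta} = \arg\min_{\beta} \, (y - X\beta)^{\top}(y - X\beta)
            = (X^{\top}X)^{-1} X^{\top} y,
\qquad
\hat{y} = X\hat{\beta},
\qquad
\hat{\varepsilon} = y - \hat{y}.
```

Statistical software typically solves this system through a matrix decomposition rather than by inverting $X^{\top}X$ explicitly.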
This is because in a falling market, traders are willing to pay higher prices for protective options regardless of the realized volatility (Whaley, 2000; Fernandes et al., 2013; Natenberg, 2015, p. 524).
The details of the CC return calculation are presented in the appendix.
Notice that the ending date of the data range is one year earlier.