RDP 2001-09: What do Sentiment Surveys Measure?
2. Previous Research
November 2001
The extent to which sentiment indicators can forecast economic activity has been a recurrent topic in economic research ever since the Index of Consumer Sentiment (ICS) was introduced in 1952 by George Katona and his colleagues at the University of Michigan. In the United States, research on consumer sentiment commenced only a decade after the Michigan index was introduced and, after flourishing briefly in the 1970s, re-emerged in the early 1990s owing to renewed interest in the power of surveys to predict recessions.
Surveys of business conditions have a longer history. The National Association of Purchasing Managers (NAPM) survey of manufacturers dates back to 1931; the German Ifo and French INSEE business surveys were initiated in 1949 and 1951, respectively. To date, however, business surveys have received comparatively little attention, despite some evidence (both academic and anecdotal) that they outperform consumer confidence surveys.
Consumer confidence surveys have generally been conducted with the intention of producing a leading indicator to forecast consumer expenditure. One early interpretation of survey-based indicators originates with Katona (1951, 1975). Katona argues that discretionary spending is postponable and, therefore, likely to be related to consumers' psychological ‘willingness to buy’ as well as their ‘ability to buy’. While the latter is founded on tangible considerations (e.g., the state of household balance sheets), willingness to buy is, in this view, better captured by survey-based sentiment indicators than by more conventional aggregate data. Many researchers, taking this view as their starting point, have focused on the ability of sentiment to predict household spending (especially on durables, which are considered particularly ‘discretionary’ because their purchase can be postponed).
In one of the first attempts to assess the forecasting performance of the Michigan consumer confidence survey, Mueller (1963) found that lagged confidence variables were significant predictors of durable and non-durable household expenditures. Only slightly later, Friend and Adams (1964) found that the ICS was useful for forecasting motor vehicle expenditures; however, they also found that stock prices were a reliable substitute for the survey measure.
Hymans (1970) treated the ICS both as a dependent and an independent variable in his regressions. He found that lags of household disposable income, the consumption price deflator and the stock market predicted the ICS. He then used the ICS as a predictor in a forecasting equation for automobile spending, with significant results.[2] Later studies (Fair 1971; Juster and Wachtel 1972a, 1972b) supported Mueller's claim that sentiment could predict other durables as well.
Mishkin (1978) argued that the ICS could be interpreted as measuring consumers' subjective assessment of the probability of financial distress, and used a significant relationship between the ICS and household assets and liabilities to support this hypothesis. He argued that the ICS should be a significant predictor of consumer durables expenditure, since durables are illiquid and hence less likely to be purchased by consumers facing financial difficulties.[3] He found that this was the case when financial variables were not taken into account, but that once they were included, the sentiment variable became largely redundant.
Interest in consumer sentiment indices jumped again in the early 1990s after a large decline in consumer sentiment appeared to coincide with the onset of the 1990–1991 recession in the US.[4] As soon as policy-makers (including Alan Greenspan) announced that the recession was probably over, confidence appeared to bounce back.[5] These events were widely interpreted as evidence that sentiment could play an independent role in driving recessions and subsequent recoveries. To assess the empirical relevance of this proposition, Carroll, Fuhrer and Wilcox (1991, 1994) first estimated simple forecasting equations and found that lags of the ICS contributed marginally to the prediction of household spending after controlling for other variables (including lags of the dependent variable and real labour income growth). They then estimated a consumption function in the style of Campbell and Mankiw (1989) and found that lags of the Michigan ICS were jointly significant when included in the estimation.
In contrast, Throop (1992) estimated a five-variable vector error-correction model (VECM) with the changes in the ICS, durables spending, non-durables and services spending, permanent income, and the 6-month commercial paper rate as endogenous variables. He found that changes in sentiment caused changes in durables spending (but not in non-durables and services), whereas durables spending did not cause changes in sentiment. When he replaced the ICS with economic variables that he found predicted sentiment (unemployment and inflation), forecast errors were usually lower than in regressions where the ICS (or its current financial conditions component)[6] was used. However, over the period of the Gulf War (and coincident recession) forecasts were more accurate if the ICS was used. Throop concluded that sentiment ordinarily has little complementary value in forecasting durables spending, but that when an unusual event occurs the ICS is likely to improve forecasts.
Similarly, Leeper (1992) used a vector autoregression (VAR) framework to assess the relationship between consumer sentiment and activity. His results echoed Mishkin's: sentiment innovations improved the VAR's predictions of industrial production and unemployment only when financial variables (again, stock prices and T-bill rates) were excluded from the analysis. Later work by Matsusaka and Sbordone (1995) also used a VAR framework, but found that consumer sentiment explained a large proportion of the innovation variance of GNP, after controlling for the Index of Leading Indicators and a measure of default risk.
Estrella and Mishkin (1998) used a simple probit analysis including financial variables to assess the usefulness of survey measures for predicting recessions. They found that the NAPM survey composite index by itself had some predictive power for US recessions (as dated by the National Bureau of Economic Research) up to four quarters ahead (although the fit was relatively poor), and that the Michigan ICS was rather less useful. When a successful financial indicator (a Treasury bond–bill spread) was included in the regressions, the NAPM index became practically redundant at horizons greater than one quarter, and the ICS became insignificant at all horizons.
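As a rough sketch of the kind of probit specification described above, the probability of a recession k quarters ahead can be written as follows. The notation is assumed here for illustration and is not taken from Estrella and Mishkin (1998).

```latex
% Assumed notation (illustrative only):
%   R_{t+k} = 1 if quarter t+k falls in an NBER-dated recession, 0 otherwise
%   x_t     = vector of predictors observed at time t (a survey index and,
%             where included, a financial indicator such as the bond-bill spread)
%   \Phi    = standard normal cumulative distribution function
\Pr\left(R_{t+k} = 1 \mid x_t\right) = \Phi\left(\alpha + \beta' x_t\right),
\qquad k = 1, \dots, 4
```

In this framing, a survey index is "redundant" at horizon k if its coefficient in β is statistically insignificant once the financial indicator is also included in x_t.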
There is little published econometric work on business confidence indices, but the work of Stock and Watson (1993) and Estrella and Mishkin (1998) suggests these may have some potential as leading indicators. Indeed, Santero and Westerlund (1996) found (using simple graphical methods, correlations and Granger causality tests) that in OECD countries business confidence measures displayed a much stronger relationship with activity than consumer confidence indices.
The American literature offers clear evidence of some kind of bivariate association between sentiment and economic activity (generally proxied by GDP or components of household expenditure). The extent of this association, the reasons for it and the direction of causality are less clear, however. There is a suggestion that sentiment variables become redundant when the researcher controls for financial variables, but this finding is by no means consistent across the board.[7] The likelihood of endogeneity between sentiment and activity has long been recognised, and in recent times typically addressed in a VAR/VECM framework, but the strength of the correlation appears to be sensitive to the choice of variables included. The early work of Hymans and Mishkin tends to favour the interpretation that sentiment indicators summarise prior (or contemporaneous) economic information, a finding echoed by Throop (1992) and Lovell and Tien (2000).[8] While the leading indicators literature reports significant results for some sentiment variables, other indicators are often preferred for forecasting purposes.
In Australia there has been little investigation of sentiment indicators outside of the Melbourne Institute of Applied Economic and Social Research (IAESR), which began conducting a consumer confidence survey (modelled on the Michigan survey) in 1973. Boehm and McDonnell (1993) of the IAESR, building on earlier work by Defris and McDonnell (1976), argued that the consumer sentiment index performed well as a leading indicator of retail trade, consumer durables and new passenger vehicle registrations. They found that including variables generated from an application of principal components analysis to the ICS improved the fit of regression equations modelling consumption.[9] Loundes and Scutella (2000) applied the method of Carroll et al (1994) to Australia and found that including lagged values of the ICS in simple forecasting equations and Campbell-Mankiw equations in some cases improved both models' explanations of consumption. They also report a bivariate causality decomposition, suggesting that consumption has a major effect on sentiment with a lag of five quarters, while the ICS takes twice as long to have any appreciable effect on consumption. The apparent endogeneity of sentiment is correctly taken to indicate a relatively ‘complex’ causal relationship. We believe this relationship deserves further study, and in later sections offer an approach that we hope will shed some light on the issue.
A characteristic of much of the literature on consumer confidence indicators is that it takes for granted that ‘confidence’ is actually captured and quantified by a specific index, which is itself a somewhat arbitrary construction. While we address this issue below, it is worth noting here that the five component indices that are averaged to obtain the Michigan ICS (and its descendants, such as the Melbourne Institute ICS) have rarely been subjected to the same precision of analysis that the aggregate index routinely receives. Attempts have been made to ‘re-weight’ sentiment indices (for example, using principal components), but there is little evidence that the revised indices perform a great deal better than existing (unweighted) indices. While Throop (1992) and Bram and Ludvigson (1998) find that specific components of the Michigan and Conference Board indices improve forecasts of aggregate consumption and durables spending, we have yet to see a similar exercise conducted using Australian data.
Footnotes
Hymans also estimated automobile expenditure by splitting the ICS into its predicted values and the residuals from the first regression. He found that there was little change in the coefficient estimates, although the standard errors on the residual were much larger, reducing its statistical significance. [2]
Mishkin maintains that Katona's approach is unconvincing when it comes to producing a rigorous definition of an item's postponability. In a comment on Hymans (1970), FT Juster argued that a durable item may be postponable but not discretionary, and thus may not be explained by sentiment. [3]
The decline in sentiment was widely attributed to Iraq's invasion of Kuwait and the Allied military response (e.g., Leeper (1992); Throop (1992)). [4]
Greenspan made the announcement in his statement before the Subcommittee on Domestic Monetary Policy of the Committee on Banking, Finance and Urban Affairs of the US House of Representatives, 16 July 1991 (Leeper 1992). [5]
That is, the index corresponding to Question 1 in the Michigan survey (see Section 3). [6]
Since the financial variables employed differ across studies, this is hardly surprising. Given the plurality of selected variables, estimation periods and econometric methods, no given result surveyed here can easily be compared to any other. [7]
Lovell and Tien find that the change in the unemployment rate, the stock market and real GDP explain much of the variation in the Michigan ICS. [8]
An earlier example of the application of principal components analysis to sentiment indicators is Adams (1964). The technique offers an alternative to using unweighted averages of component indices (e.g., based on net balance responses to specific questions) and involves constructing mutually orthogonal linear combinations (weighted averages) of component variables that account for the maximum possible variance in the original (standardised) components. [9]
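The principal components construction described in footnote 9 can be illustrated with a minimal sketch. It standardises a few component series, forms their correlation matrix, and extracts the first principal component (the unit-length weighted combination with the largest variance). The data are hypothetical, the function names are invented for illustration, and power iteration is used here only to keep the example dependency-free; it is not the method of any study cited above.

```python
import math

def standardise(columns):
    """Rescale each series to mean 0 and (population) standard deviation 1."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        out.append([(x - mean) / sd for x in col])
    return out

def correlation_matrix(z):
    """Correlation matrix of already-standardised series."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def first_component(corr, iterations=500):
    """Leading eigenvector (weights) and eigenvalue via power iteration."""
    k = len(corr)
    w = [1.0 / math.sqrt(k)] * k  # start from equal weights
    for _ in range(iterations):
        v = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    # Rayleigh quotient w'Rw gives the variance captured by the weights
    eigenvalue = sum(
        w[i] * sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)
    )
    return w, eigenvalue

# Hypothetical component indices (e.g. net balances on three survey questions)
components = [
    [95.0, 98.0, 102.0, 110.0, 104.0, 99.0, 93.0, 97.0],
    [90.0, 96.0, 101.0, 108.0, 105.0, 97.0, 92.0, 94.0],
    [100.0, 99.0, 103.0, 107.0, 101.0, 102.0, 96.0, 98.0],
]

z = standardise(components)
corr = correlation_matrix(z)
weights, eigenvalue = first_component(corr)

# Share of total standardised variance explained by the first component
share = eigenvalue / len(components)

# The re-weighted index: a weighted combination of the standardised components
scores = [sum(w * zi[t] for w, zi in zip(weights, z)) for t in range(len(z[0]))]
```

An unweighted index corresponds to equal weights on every component; comparing `weights` with equal weights shows how far the principal components weighting departs from a simple average. Further components would be obtained by repeating the procedure on the part of the correlation matrix not explained by earlier components.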