RDP 2022-06: Do Australian Households Borrow to Keep up with the Joneses?
3. Data and Descriptive Statistics
November 2022
The dataset employed in this study is the Household, Income and Labour Dynamics in Australia (HILDA) Survey Release 19.0, an annual longitudinal survey of approximately 8,000 households that is designed to be representative of the Australian population. Every four years since 2002, the survey has asked detailed questions about household finances, such as household net worth and the debt portfolio. The unrestricted version of the survey also includes household location information at the Statistical Area Level 3 (SA3) level, which I use to match households to SA3-level income inequality and other macroeconomic measures.
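The matching step described above can be sketched as a keyed lookup on (SA3 code, survey year). This is only an illustration under stated assumptions: the field names (`hh`, `sa3`, `year`, `gini`), the placeholder area codes and the example values are hypothetical and are not taken from the HILDA Survey or the ABS data.

```python
def attach_sa3_measures(households, sa3_measures):
    """Attach area-level measures to household rows by (SA3 code, year).

    households: list of dicts with at least 'sa3' and 'year' keys (hypothetical names).
    sa3_measures: dict mapping (sa3, year) -> dict of area-level variables.
    Rows with no matching area record are returned separately rather than dropped silently.
    """
    matched, unmatched = [], []
    for row in households:
        area = sa3_measures.get((row["sa3"], row["year"]))
        if area is None:
            unmatched.append(row)
        else:
            # Copy the household row and add the area-level variables to it.
            matched.append({**row, **area})
    return matched, unmatched


# Hypothetical example: two households, only one of which has an area record.
households = [
    {"hh": 1, "sa3": "A", "year": 2010},
    {"hh": 2, "sa3": "B", "year": 2010},
]
sa3_measures = {("A", 2010): {"gini": 0.45}}
matched, unmatched = attach_sa3_measures(households, sa3_measures)
```

Returning the unmatched rows makes it easy to see how many households fall outside the areas covered by the area-level data.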
The detailed wealth modules in the HILDA Survey allow me to construct household total debt and examine its composition. I categorise debt into three types: mortgage debt; consumer debt (non-mortgage debt, including credit card debt and personal debt such as hire purchase loans, car loans and other personal loans); and investment debt (e.g. bonds, shares and currencies). While questions about total debt and home debt were asked in all five wealth modules, those concerning consumer debt and investment debt were only asked from 2006 onwards.
The measure of local income inequality comes from ABS estimates of personal income in small areas. The ABS data allow for estimation of the Gini coefficient and the top 10 per cent share of total personal income at the SA3 level. Using this external measure of income inequality, rather than constructing one from the survey microdata, helps limit direct feedback within the regression, which could be driven, for example, by outlier observations.
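For readers unfamiliar with the two inequality measures, the sketch below computes a Gini coefficient and a top-share statistic from a list of individual incomes. It uses the standard rank-weighted formula for the Gini coefficient; the text does not describe how the ABS derives its published estimates, so this should not be read as the ABS method. The function names and the example incomes are illustrative assumptions.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 means perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with incomes sorted in ascending order and ranks i = 1..n.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(incomes, fraction=0.10):
    """Share of total income received by the top `fraction` of earners."""
    xs = sorted(incomes, reverse=True)
    k = max(1, round(len(xs) * fraction))  # number of earners in the top group
    return sum(xs[:k]) / sum(xs)


# Hypothetical area with nine people earning 10 and one person earning 110.
example = [10] * 9 + [110]
example_gini = gini(example)
example_top10 = top_share(example)
```

With equal incomes the Gini coefficient is zero, and it approaches one as income concentrates in a single person.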
The choice of the SA3 level stems from a number of considerations. First, SA3s align closely with Local Government Areas, whose average income and income inequality are most regularly reported by the media, and so they may be the most relevant metric when households compare themselves to others. Second, a number of relevant macroeconomic controls are available at the SA3 level. Third, SA3s appear to strike the right balance between having a large number of individual areas, with good local control variables, and retaining enough households in each area for robust analysis.
I also restrict the sample to the post-GFC period (2010 to 2018) for three reasons: 1) to control for potential structural breaks in the distribution of income inequality and in the financial landscape due to the GFC; 2) because the ABS estimates of personal income in small areas are not comparable before and after 2007 (ABS 2022); and 3) to use a single sample across different types of debt (consumer debt and investment debt are only available from 2006), which allows a more direct comparison of the estimates.
Moreover, I keep only households that appear in at least two consecutive wealth modules and do not move out of their initial SA3 region. I focus on this subset because households may choose to move closer to their ‘reference group’, which would create some direct feedback between (changes in) local inequality and debt for these households (Guerrieri, Hartley and Hurst 2013; Coibion et al 2020). To follow as many households as possible over time, I use the procedure of Price, Beckers and La Cava (2019) to merge households and determine household heads.[3] This gives me an unbalanced panel of 5,483 unique households (12,605 observations) over three survey waves, from 2010 to 2018.
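The two sample restrictions can be sketched as a simple filter. This is one possible reading, not the author's code: it assumes that a household observed outside its initial SA3 in any wealth module is dropped entirely, and that 'consecutive' means adjacent wealth-module years (2010 and 2014, or 2014 and 2018). The record layout and all identifiers are hypothetical.

```python
from collections import defaultdict

WEALTH_YEARS = [2010, 2014, 2018]  # wealth-module years in the estimation sample


def select_sample(records):
    """Filter (household_id, year, sa3_code) tuples by the two restrictions."""
    by_household = defaultdict(dict)
    for hh, year, sa3 in records:
        if year in WEALTH_YEARS:
            by_household[hh][year] = sa3

    kept = []
    for hh, obs in by_household.items():
        years = sorted(obs)
        first_sa3 = obs[years[0]]
        # Restriction 1 (assumed reading): never observed outside the initial SA3.
        if any(sa3 != first_sa3 for sa3 in obs.values()):
            continue
        # Restriction 2: observed in at least two adjacent wealth modules.
        consecutive = any(
            a in obs and b in obs
            for a, b in zip(WEALTH_YEARS, WEALTH_YEARS[1:])
        )
        if consecutive:
            kept.extend((hh, y, obs[y]) for y in years)
    return kept


# Hypothetical records: only households "A" and "D" satisfy both restrictions.
records = [
    ("A", 2010, "s1"), ("A", 2014, "s1"), ("A", 2018, "s1"),
    ("B", 2010, "s1"), ("B", 2014, "s2"),   # moves to another SA3
    ("C", 2010, "s1"), ("C", 2018, "s1"),   # skips the 2014 module
    ("D", 2014, "s3"), ("D", 2018, "s3"),
    ("E", 2018, "s4"),                      # observed only once
]
sample = select_sample(records)
```

Because households can enter in 2014 or leave after 2014, a filter of this kind naturally produces an unbalanced panel.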
Table 1 presents household demographic and financial statistics of the sample over the three survey years. The household demographics remain stable over the waves, with approximately 60 per cent of household heads being male and 60 per cent married. The mean age of household heads is roughly 52 years and households with children have around two children on average. Around 30 per cent of household heads have a university degree and 60 per cent are employed. Turning to household financial positions, while the mean gross income grew steadily at around 2 to 3 per cent per annum, growth in housing wealth and household debt fluctuated over time. Post-GFC self-reported home values declined before picking up significantly between 2014 and 2018, while mortgage debt kept increasing over the period. On the other hand, both consumer debt and investment debt have declined since 2010. Overall, the full, unrestricted sample appears similar (see Table A1).
| | 2010 | 2014 | 2018 |
|---|---|---|---|
| Household variables (from HILDA Survey Release 19.0) | | | |
| Age (years) | 52.6 | 52.4 | 51.9 |
| Male (%) | 59.3 | 59.3 | 57.6 |
| Married (%) | 61.1 | 60.3 | 57.0 |
| Number of children (if having children) | 1.8 | 1.8 | 1.8 |
| University graduate (%) | 28.2 | 34.9 | 36.3 |
| Labour force (%) | 64.3 | 65.3 | 64.9 |
| Employed (%) | 62.2 | 63.3 | 62.6 |
| Home owner (%) | 76.4 | 71.1 | 67.2 |
| Gross income ($'000) | 87.4 | 90.1 | 88.9 |
| Net worth ($'000) | 877.9 | 847.1 | 942.6 |
| Financial wealth ($'000) | 315.9 | 349.8 | 378.2 |
| Total debt ($'000) | 164.2 | 169.7 | 169.5 |
| Mortgage debt ($'000) | 129.7 | 140.5 | 142.7 |
| Home debt ($'000) | 96.2 | 101.4 | 102.5 |
| Other property debt ($'000) | 33.5 | 39.1 | 40.2 |
| Non-mortgage debt ($'000) | 34.5 | 29.3 | 26.8 |
| Credit card debt ($'000) | 2.2 | 1.6 | 1.4 |
| Hire purchase debt ($'000) | 0.3 | 0.2 | 0.2 |
| Car debt ($'000) | 1.6 | 1.8 | 1.9 |
| Business debt ($'000) | 7.9 | 9.5 | 7.3 |
| Investment debt ($'000) | 7.8 | 4.5 | 3.8 |
| Number of observations | 4,197 | 5,790 | 6,123 |
| SA3 macroeconomic variables (from ABS and CoreLogic) | | | |
| Gini coefficient (index point) | 0.466 | 0.458 | 0.456 |
| Median house price ($'000) | 397 | 435 | 533 |
| Unemployment rate (at SA4 level, %) | 4.8 | 6.6 | 5.0 |

Sources: ABS; Author's calculations; CoreLogic; HILDA Survey Release 19.0
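As a reading aid for the debt rows in Table 1, the short check below tests two accounting identities that the reported means appear to satisfy: total debt equals mortgage debt plus non-mortgage debt, and mortgage debt equals home debt plus other property debt. These identities are inferred from the figures rather than stated in the text, and the tolerance is an assumption to allow for rounding to one decimal place.

```python
# Mean values transcribed from Table 1 ($'000), by survey year.
TABLE_1_DEBT = {
    2010: {"total": 164.2, "mortgage": 129.7, "home": 96.2,
           "other_property": 33.5, "non_mortgage": 34.5},
    2014: {"total": 169.7, "mortgage": 140.5, "home": 101.4,
           "other_property": 39.1, "non_mortgage": 29.3},
    2018: {"total": 169.5, "mortgage": 142.7, "home": 102.5,
           "other_property": 40.2, "non_mortgage": 26.8},
}


def identities_hold(row, tol=0.15):
    """True if both assumed identities hold within `tol` (rounding allowance)."""
    gap_total = row["total"] - (row["mortgage"] + row["non_mortgage"])
    gap_mortgage = row["mortgage"] - (row["home"] + row["other_property"])
    return abs(gap_total) <= tol and abs(gap_mortgage) <= tol


results = {year: identities_hold(row) for year, row in TABLE_1_DEBT.items()}
```

The listed non-mortgage components (credit card, hire purchase, car, business and investment debt) sum to less than the non-mortgage total, so that row presumably includes items not shown separately.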
Footnote
[3] First, I determine household heads in the first year, and assign the household head ID and a unique household ID to all members of the household. The household head is identified following the standard tiebreaking procedure. For each following year, I identify household splits and mergers, and the source of the change. I then carry household IDs forward for as many households as possible, depending on the change identified. For more details on the procedure, see Price et al (2019).