August 12, 2019

2018 Census - Issues with the Census dataset

As evidenced by the almost 12-month delay in the first release of data from the 2018 Census, Statistics New Zealand has experienced issues in producing a high-quality dataset that provides full coverage of the New Zealand population as at 6 March 2018.  At their core, these issues have arisen from a low response rate to the first online 2018 Census questionnaire.

Despite the issues with the 2018 Census, the final census dataset will contain vital information that policy makers will use to answer a wide range of questions.  To maximise the value of the final census dataset, users should ensure they are fully aware of the data quality issues and of the approaches Statistics New Zealand has employed to mitigate as many of these issues as possible.

The known issues with the 2018 Census collection and dataset are noted below, along with some proposed approaches to effectively use the final census dataset.

On 17 July 2019, Statistics New Zealand released its interim coverage and response rates for the 2018 Census.  These coverage and response rates are considered interim until Statistics New Zealand has completed its analysis of the post-enumeration survey.  This survey of 15,000 households is undertaken to double check the accuracy of the coverage of the Census.

Interim coverage and response rates revealed that the national collection response rate was 87.5 percent (that is, 87.5 percent of the expected New Zealand population had two data variables captured in the dwelling or individual questionnaire).  This interim response rate indicated that around 480,000 people did not respond to the 2018 Census.  In addition, Statistics New Zealand revealed in early April that 220,000 people provided only a partial response to the 2018 Census.

A breakdown of the 87.5 percent figure shows that the response rate for Māori was just 74.3 percent, while for Pacific People it was 73.5 percent.  These interim rates highlight the increased data gap for Māori and Pacific People arising out of the Census, with around a quarter of the respective population records either missing or derived from available administrative records.  This calls into question the accuracy of any Māori and Pacific People data, given that around one in four people in these groups did not even provide a partial response to the 2018 Census.

To patch the data gaps and bolster the Census coverage to its reported 98.6 percent, Statistics New Zealand will have to make extensive use of individual administrative records (held by Statistics New Zealand and other central government agencies).  In total, records for around 526,000 people were created by Statistics New Zealand purely from administrative records.  Of these, only 165,000 could be placed in a household, meaning the remaining 361,000 people could only be placed into a geographic area.  As a result, these 361,000 people cannot be included in the family, household and dwelling Census datasets.
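The arithmetic behind these figures can be checked with a short sketch.  The counts are the ones quoted above; the variable names are illustrative only and do not come from Statistics New Zealand.

```python
# Figures quoted in the text above (people created purely from administrative records).
admin_total = 526_000          # all administrative-only records
placed_in_household = 165_000  # those that could be placed in a household

# The remainder could only be placed into a geographic area.
area_only = admin_total - placed_in_household

# Share of administrative-only records that cannot appear in
# family, household and dwelling datasets.
area_only_share = area_only / admin_total

print(f"Placed in a geographic area only: {area_only:,}")
print(f"Share of administrative-only records: {area_only_share:.1%}")
```

On these figures, a little over two-thirds of the administrative-only records fall outside the family, household and dwelling datasets.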

Of course, many of the data variables collected in the Census are gathered only through the Census questionnaire.  For these questions, no administrative data exists that would enable Statistics New Zealand to fill in the gaps, so the low response rate means these data variables will have data quality issues.  Statistics New Zealand has already acknowledged this by announcing that iwi affiliation data will not be officially released.  Other affected data variables include disability, unpaid activities, and quality of homes.

Given the known data issues for the 2018 Census, there are a number of approaches that users can take when working with 2018 Census data to ensure they are using the data appropriately and minimising possible data issues.  These approaches will be used by BERL and include:

  • Assessing the data quality of each 2018 Census variable and population group.  Users will need to ask themselves what level of imputation, or what number of administrative people, is included in these variables and population groups, especially when dealing with small population groups.
  • Reviewing similar datasets from other sources (including other Statistics New Zealand data) to double check that we are happy with what the 2018 Census data is telling us.
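
The first approach above could be sketched roughly as follows.  This is a minimal illustration only: the class, the function names, the 20 percent threshold and the example counts are all hypothetical placeholders, not Statistics New Zealand definitions or real Census figures.

```python
from dataclasses import dataclass


@dataclass
class VariableCounts:
    """Record counts for one Census variable or population group (hypothetical structure)."""
    name: str
    from_census_forms: int   # records answered on a Census questionnaire
    admin_or_imputed: int    # records filled from administrative data or imputation


def admin_or_imputed_share(v: VariableCounts) -> float:
    """Share of records that did not come from a Census form."""
    total = v.from_census_forms + v.admin_or_imputed
    return v.admin_or_imputed / total if total else 0.0


def flag_for_review(variables, threshold=0.20):
    """Return names whose non-form share exceeds the (assumed) review threshold."""
    return [v.name for v in variables if admin_or_imputed_share(v) > threshold]


# Placeholder counts for illustration only; these are not real Census data.
example = [
    VariableCounts("group_a", from_census_forms=900, admin_or_imputed=100),
    VariableCounts("group_b", from_census_forms=700, admin_or_imputed=300),
]
print(flag_for_review(example))
```

Any threshold of this kind is a judgement call, and a lower one may be appropriate for small population groups.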

Lastly, sense checking all of the data being used will be crucial when working with 2018 Census data.