Gridded Population Data for India: Accuracy Assessment and a New Benchmark
Pratyush Tripath, Krishnachandran Balakrishnan | 10 December 2024
The accuracy and quality of global gridded population datasets vary over space, but a systematic assessment of these datasets for India, the most populous country in the world, is largely missing. This study contributes to the literature in two ways. First, using census population figures for nearly 600,000 towns and villages in India, it presents a comprehensive accuracy assessment of publicly available gridded population datasets. The global gridded population datasets evaluated for India include Gridded Population of the World (GPW), WorldPop, High Resolution Settlement Layer (HRSL), and Global Human Settlement Layer Population (GHS-POP). The comparison shows that GHS-POP, which uses a built-volume layer and a non-residential built-up layer as inputs, largely outperforms all the other gridded datasets for towns, for villages, and for both combined, in almost all states of India. Second, using only a modified version of the GHSL built-up layer, we show that it is possible to improve on the GHS-POP method and achieve higher accuracy with a simple regression model. The GHS-POP method, together with our improvements, shows that, given high-quality built-up data, simple explainable models can outperform machine learning models in population mapping. Our work contributes to global efforts towards generating high-accuracy gridded population data while relying on simple, explainable, and easily replicable methods.
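To make the idea of a simple, explainable model concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-variable ordinary least squares fit of census population on built-up area per settlement, scored with root mean squared error. The function names, the choice of RMSE, and the toy numbers are all hypothetical; the abstract does not specify the model form or the accuracy metric.

```python
import numpy as np

def fit_population_model(built_up, census_pop):
    """Fit census_pop ~ slope * built_up + intercept by ordinary least squares."""
    built_up = np.asarray(built_up, dtype=float)
    census_pop = np.asarray(census_pop, dtype=float)
    # Design matrix with one predictor column and one constant column.
    design = np.column_stack([built_up, np.ones_like(built_up)])
    (slope, intercept), *_ = np.linalg.lstsq(design, census_pop, rcond=None)
    return float(slope), float(intercept)

def rmse(predicted, observed):
    """Root mean squared error between predicted and census populations."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

# Toy, made-up settlements: built-up area (m^2) and census population.
built_up_m2 = np.array([1000.0, 2500.0, 4000.0, 8000.0])
census = np.array([50.0, 125.0, 200.0, 400.0])

slope, intercept = fit_population_model(built_up_m2, census)
predicted = slope * built_up_m2 + intercept
error = rmse(predicted, census)
```

In a real assessment the same error function could be applied to each gridded dataset after summing its cells within every town or village boundary, which puts all products on the same census-based footing.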