Image-Based Surrogates of Socio-Economic Status in Urban Neighborhoods Using Deep Multiple Instance Learning

(1) Background: Evidence-based policymaking requires data about the local population’s socioeconomic status (SES) at a detailed geographical level; however, such information is often unavailable or too expensive to acquire. Researchers have proposed solutions to estimate SES indicators...

Bibliographic Details
Main Authors: Christos Diou, Pantelis Lelekas, Anastasios Delopoulos
Format: Article
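Language: English
Published: MDPI AG 2018-10-01
Series: Journal of Imaging
Subjects: deep learning; multiple instance learning; weakly supervised learning; demography; socioeconomic analysis; Google Street View
Online Access: https://www.mdpi.com/2313-433X/4/11/125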
id doaj-7f2c1bacb44a42c182c68fc8598222b4
record_format Article
spelling doaj-7f2c1bacb44a42c182c68fc8598222b4 | 2020-11-24T20:44:55Z | eng | MDPI AG | Journal of Imaging | 2313-433X | 2018-10-01 | vol. 4, no. 11, art. 125 | 10.3390/jimaging4110125 | jimaging4110125 | Image-Based Surrogates of Socio-Economic Status in Urban Neighborhoods Using Deep Multiple Instance Learning | Christos Diou; Pantelis Lelekas; Anastasios Delopoulos | Multimedia Understanding Group, Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
collection DOAJ
language English
format Article
sources DOAJ
author Christos Diou
Pantelis Lelekas
Anastasios Delopoulos
spellingShingle Christos Diou
Pantelis Lelekas
Anastasios Delopoulos
Image-Based Surrogates of Socio-Economic Status in Urban Neighborhoods Using Deep Multiple Instance Learning
Journal of Imaging
deep learning
multiple instance learning
weakly supervised learning
demography
socioeconomic analysis
Google Street View
author_facet Christos Diou
Pantelis Lelekas
Anastasios Delopoulos
author_sort Christos Diou
title Image-Based Surrogates of Socio-Economic Status in Urban Neighborhoods Using Deep Multiple Instance Learning
title_short Image-Based Surrogates of Socio-Economic Status in Urban Neighborhoods Using Deep Multiple Instance Learning
title_full Image-Based Surrogates of Socio-Economic Status in Urban Neighborhoods Using Deep Multiple Instance Learning
title_fullStr Image-Based Surrogates of Socio-Economic Status in Urban Neighborhoods Using Deep Multiple Instance Learning
title_full_unstemmed Image-Based Surrogates of Socio-Economic Status in Urban Neighborhoods Using Deep Multiple Instance Learning
title_sort image-based surrogates of socio-economic status in urban neighborhoods using deep multiple instance learning
publisher MDPI AG
series Journal of Imaging
issn 2313-433X
publishDate 2018-10-01
description (1) Background: Evidence-based policymaking requires data about the local population’s socioeconomic status (SES) at a detailed geographical level; however, such information is often unavailable or too expensive to acquire. Researchers have proposed solutions to estimate SES indicators by analyzing Google Street View images; however, these methods are also resource-intensive, since they require large volumes of manually labeled training data. (2) Methods: We propose a methodology for automatically computing surrogate variables of SES indicators using street images of parked cars and deep multiple instance learning. Our approach does not require any manually created labels, apart from data already available from statistical authorities, while the entire pipeline for image acquisition, parked car detection, car classification, and surrogate variable computation is fully automated. The proposed surrogate variables are then used in linear regression models to estimate the target SES indicators. (3) Results: We implement and evaluate a model based on the proposed surrogate variable at 30 municipalities of varying SES in Greece. Our model has R^2 = 0.76 and a correlation coefficient of 0.874 with the true unemployment rate, while it achieves a mean absolute percentage error of 0.089 and a mean absolute error of 1.87 on a held-out test set. Similar results are also obtained for other socioeconomic indicators related to education level and occupational prestige. (4) Conclusions: The proposed methodology can be used to estimate SES indicators at the local level automatically, using images of parked cars detected via Google Street View, without the need for any manual labeling effort.
topic deep learning
multiple instance learning
weakly supervised learning
demography
socioeconomic analysis
Google Street View
url https://www.mdpi.com/2313-433X/4/11/125
work_keys_str_mv AT christosdiou imagebasedsurrogatesofsocioeconomicstatusinurbanneighborhoodsusingdeepmultipleinstancelearning
AT pantelislelekas imagebasedsurrogatesofsocioeconomicstatusinurbanneighborhoodsusingdeepmultipleinstancelearning
AT anastasiosdelopoulos imagebasedsurrogatesofsocioeconomicstatusinurbanneighborhoodsusingdeepmultipleinstancelearning
_version_ 1716816154890076160
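
The abstract above reports R^2, a correlation coefficient, mean absolute percentage error (MAPE), and mean absolute error (MAE) for a linear regression on a surrogate variable. The following is a minimal sketch of how such figures could be computed, not the authors' code: the function names `fit_line` and `evaluate` are hypothetical, the input numbers are made up for illustration, and MAPE is assumed to be expressed as a fraction rather than a percentage.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def evaluate(y_true, y_pred):
    """Return MAE, MAPE (as a fraction), R^2 and Pearson correlation."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "mae": float(np.mean(np.abs(err))),
        "mape": float(np.mean(np.abs(err) / np.abs(y_true))),
        "r2": float(1.0 - ss_res / ss_tot),
        "pearson_r": float(np.corrcoef(y_true, y_pred)[0, 1]),
    }


# Made-up illustrative values, NOT data from the paper:
# one surrogate value and one unemployment rate (%) per municipality.
surrogate = np.array([0.10, 0.25, 0.40, 0.55, 0.70, 0.85])
unemployment = np.array([8.0, 11.0, 14.0, 17.0, 20.0, 23.0])

# Fit on the first four municipalities, evaluate on the held-out last two.
slope, intercept = fit_line(surrogate[:4], unemployment[:4])
predicted = slope * surrogate[4:] + intercept
metrics = evaluate(unemployment[4:], predicted)
print(slope, intercept, metrics)
```

The train/test split here is arbitrary; the record does not state how the held-out set was chosen, so that detail is an assumption of the sketch.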