Summary: | Stratifying behaviors by demographics and socioeconomic status is crucial for political and economic planning. Traditional methods of gathering income and demographic information, such as national censuses, rely on large-scale surveys that are costly in both the financial and the organizational resources they require. In this study, we use social media data to show how behavioral patterns that differ across socioeconomic groups can be used to infer an individual's income. In particular, we examine how people explore cities and which topics they discuss online as signals of individual socioeconomic status. Privacy is preserved by using anonymized data and by abstracting human mobility and online conversation topics as aggregated high-dimensional vectors. We show that mobility and hashtag activity are good predictors of income, and that the highest and lowest socioeconomic quantiles exhibit the most differentiated behavior of all groups.
|