Survey vs Scraped Data: Comparing Time Series Properties of Web and Survey Vacancy Data
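This paper studies the relationship between a vacancy population obtained from web crawling and vacancies in the economy inferred by a National Statistics Office (NSO) using a traditional method. We compare the time series properties of samples obtained between 2007 and 2014 by Statistics Netherlands and by a web scraping company. We find that the web and NSO vacancy data present similar time series properties, suggesting that both time series are generated by the same underlying phenomenon: the real number of new vacancies in the economy. We conclude that, in our case study, web-sourced data are able to capture aggregate economic activity in the labor market.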
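Main Authors: | Pedraza Pablo de, Visintin Stefano, Tijdens Kea, Kismihók Gábor |
---|---|
Format: | Article |
Language: | English |
Published: | Sciendo, 2019-09-01 |
Series: | IZA Journal of Labor Economics |
Subjects: | web crawling, statistical inference, time series, vacancies, labor demand, data collection, j23, j63, c22, c80 |
Online Access: | https://doi.org/10.2478/izajole-2019-0004 |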
id |
doaj-88e9610acd094b8a805d8c778e041fd8 |
---|---|
record_format |
Article |
spelling |
doaj-88e9610acd094b8a805d8c778e041fd8
2021-09-05T21:02:07Z
eng
Sciendo
IZA Journal of Labor Economics
2193-8997
2019-09-01
8(1): 103-116
10.2478/izajole-2019-0004
izajole-2019-0004
Survey vs Scraped Data: Comparing Time Series Properties of Web and Survey Vacancy Data
Pedraza Pablo de (0), Visintin Stefano (1), Tijdens Kea (2), Kismihók Gábor (3)
(0) University of Amsterdam and European Commission, Joint Research Centre (JRC), Unit I.1, Modelling, Indicators & Impact Evaluation, Via E. Fermi 2749, TP 361, Ispra (VA), I-21027, Italy
(1) University of Amsterdam/AIAS and Universidad Camilo José Cela, Facultad de Tecnología y Ciencia, Urb. Villafranca del Castillo, Calle Castillo de Alarcón, 49, 28692, Villanueva de la Cañada, Madrid, Spain
(2) University of Amsterdam/AIAS, Postbus 94025, 1090 GA Amsterdam, The Netherlands
(3) Leibniz Information Centre for Science and Technology, Welfengarten 1 B, 30167 Hannover, Germany
This paper studies the relationship between a vacancy population obtained from web crawling and vacancies in the economy inferred by a National Statistics Office (NSO) using a traditional method. We compare the time series properties of samples obtained between 2007 and 2014 by Statistics Netherlands and by a web scraping company. We find that the web and NSO vacancy data present similar time series properties, suggesting that both time series are generated by the same underlying phenomenon: the real number of new vacancies in the economy. We conclude that, in our case study, web-sourced data are able to capture aggregate economic activity in the labor market.
https://doi.org/10.2478/izajole-2019-0004
web crawling, statistical inference, time series, vacancies, labor demand, data collection, j23, j63, c22, c80 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Pedraza Pablo de, Visintin Stefano, Tijdens Kea, Kismihók Gábor |
title |
Survey vs Scraped Data: Comparing Time Series Properties of Web and Survey Vacancy Data |
publisher |
Sciendo |
series |
IZA Journal of Labor Economics |
issn |
2193-8997 |
publishDate |
2019-09-01 |
description |
This paper studies the relationship between a vacancy population obtained from web crawling and vacancies in the economy inferred by a National Statistics Office (NSO) using a traditional method. We compare the time series properties of samples obtained between 2007 and 2014 by Statistics Netherlands and by a web scraping company. We find that the web and NSO vacancy data present similar time series properties, suggesting that both time series are generated by the same underlying phenomenon: the real number of new vacancies in the economy. We conclude that, in our case study, web-sourced data are able to capture aggregate economic activity in the labor market. |
topic |
web crawling, statistical inference, time series, vacancies, labor demand, data collection, j23, j63, c22, c80 |
url |
https://doi.org/10.2478/izajole-2019-0004 |