Technical Challenges in Developing Software to Collect Twitter Data

Bibliographic Details
Main Authors: Daniel Chudnov, Daniel Kerchner, Ankushi Sharma, Laura Wrubel
Format: Article
Language: English
Published: Code4Lib 2014-10-01
Series: Code4Lib Journal
Online Access: http://journal.code4lib.org/articles/10097
Description
Summary: Over the past two years, George Washington University Libraries developed Social Feed Manager (SFM), a Python and Django-based application for collecting social media data from Twitter. Expanding the project from a research prototype to a more widely useful application has presented a number of technical challenges: changes in the Twitter API; supervision of simultaneous streaming processes; management, storage, and organization of collected data; meeting researcher needs for groups or sets of data; and improving documentation to facilitate other institutions’ installation and use of SFM. This article describes how the Social Feed Manager project addressed these issues, including the use of supervisord to manage processes, and covers other technical decisions made in the course of the project through late summer 2014. It is aimed at librarians and archivists who are interested in building collections around web archives and social media data, and who have a particular interest in the technical work involved in applying software to the problem of building a sustainable collection management program around these sources.
ISSN: 1940-5758