Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food

In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range...

Bibliographic Details
Main Authors: William Roy, Chris Gray
Format: Article
Language: English
Published: Code4Lib 2018-11-01
Series: Code4Lib Journal
Online Access: https://journal.code4lib.org/articles/13895
id doaj-cd45cf354dc74308b96aa83d103bf0d4
record_format Article
spelling doaj-cd45cf354dc74308b96aa83d103bf0d4 | 2020-11-25T03:26:10Z | eng | Code4Lib | Code4Lib Journal | 1940-5758 | 2018-11-01 | 42 | 13895 | Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food | William Roy | Chris Gray | In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range of methods have been proposed for harvesting publications metadata en masse, but many technological solutions can easily become detached from a workflow that is both reproducible for support staff and applicable to a range of situations. Many repositories offer the capacity for batch upload via CSV, so our method provides a template Python script that leverages the Habanero library for populating CSV files with existing metadata retrieved from the CrossRef API. In our case, we have combined this with useful metadata contained in a TSV file downloaded from Web of Science in order to enrich our metadata as well. The appeal of this ‘low-maintenance’ method is that it provides more robust options for gathering metadata semi-automatically, and only requires the user’s ability to access Web of Science and the Python program, while still remaining flexible enough for local customizations. | https://journal.code4lib.org/articles/13895
collection DOAJ
language English
format Article
sources DOAJ
author William Roy
Chris Gray
title Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food
publisher Code4Lib
series Code4Lib Journal
issn 1940-5758
publishDate 2018-11-01
description In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range of methods have been proposed for harvesting publications metadata en masse, but many technological solutions can easily become detached from a workflow that is both reproducible for support staff and applicable to a range of situations. Many repositories offer the capacity for batch upload via CSV, so our method provides a template Python script that leverages the Habanero library for populating CSV files with existing metadata retrieved from the CrossRef API. In our case, we have combined this with useful metadata contained in a TSV file downloaded from Web of Science in order to enrich our metadata as well. The appeal of this ‘low-maintenance’ method is that it provides more robust options for gathering metadata semi-automatically, and only requires the user’s ability to access Web of Science and the Python program, while still remaining flexible enough for local customizations.
url https://journal.code4lib.org/articles/13895
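
The workflow summarized in the description (read a Web of Science TSV export, look up each DOI in CrossRef through Habanero, write one CSV row per work) might be sketched roughly as follows. This is not the authors' script. It assumes the Web of Science export uses the two-letter column tags DI (DOI) and AB (abstract), and the output column names, the `||` multi-value separator, and the helper names `fetch_crossref`, `build_row`, and `enrich` are illustrative choices that a real repository import would need to adapt.

```python
import csv

# Hypothetical output columns; adjust to the local repository's import schema.
FIELDS = [
    "dc.identifier.doi",
    "dc.title",
    "dc.contributor.author",
    "dc.date.issued",
    "dc.description.abstract",
]


def fetch_crossref(doi):
    """Return the CrossRef metadata record for one DOI, using Habanero."""
    from habanero import Crossref  # third-party; imported only when needed
    return Crossref().works(ids=doi)["message"]


def build_row(doi, wos_row, work):
    """Combine one CrossRef record with one Web of Science row."""
    # CrossRef authors are dicts with optional "family" and "given" keys.
    authors = "||".join(
        ", ".join(p for p in (a.get("family"), a.get("given")) if p)
        for a in work.get("author", [])
    )
    # "issued" holds {"date-parts": [[year, month, day]]}; parts may be absent.
    date_parts = work.get("issued", {}).get("date-parts") or [[]]
    year = date_parts[0][0] if date_parts[0] else None
    return {
        "dc.identifier.doi": doi,
        "dc.title": (work.get("title") or [""])[0],
        "dc.contributor.author": authors,
        "dc.date.issued": str(year) if year else "",
        # Assumption: the abstract is taken from the WoS "AB" column.
        "dc.description.abstract": wos_row.get("AB") or "",
    }


def enrich(wos_tsv, out_csv, lookup=fetch_crossref):
    """Read a WoS TSV file object, write a CSV file object, return row count."""
    reader = csv.DictReader(wos_tsv, delimiter="\t")
    writer = csv.DictWriter(out_csv, fieldnames=FIELDS)
    writer.writeheader()
    written = 0
    for wos_row in reader:
        doi = (wos_row.get("DI") or "").strip()
        if not doi:
            continue  # rows without a DOI cannot be looked up in CrossRef
        writer.writerow(build_row(doi, wos_row, lookup(doi)))
        written += 1
    return written
```

To run it against real files, open the export and the output with `newline=""` and pass both file objects to `enrich`. The `lookup` parameter exists so the CrossRef call can be swapped for a cached or stubbed function while testing, which also keeps repeated runs from hitting the API unnecessarily.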