Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food

In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range...


Bibliographic Details
Main Authors:	William Roy, Chris Gray
Format:	Article
Language:	English
Published:	Code4Lib 2018-11-01
Series:	Code4Lib Journal
Online Access:	https://journal.code4lib.org/articles/13895

id	doaj-cd45cf354dc74308b96aa83d103bf0d4
record_format	Article
spelling	doaj-cd45cf354dc74308b96aa83d103bf0d4 2020-11-25T03:26:10Z eng Code4Lib Code4Lib Journal 1940-5758 2018-11-01 42 13895 Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food William Roy Chris Gray In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range of methods have been proposed for harvesting publications metadata en masse, but many technological solutions can easily become detached from a workflow that is both reproducible for support staff and applicable to a range of situations. Many repositories offer the capacity for batch upload via CSV, so our method provides a template Python script that leverages the Habanero library for populating CSV files with existing metadata retrieved from the CrossRef API. In our case, we have combined this with useful metadata contained in a TSV file downloaded from Web of Science in order to enrich our metadata as well. The appeal of this ‘low-maintenance’ method is that it provides more robust options for gathering metadata semi-automatically, and only requires the user’s ability to access Web of Science and the Python program, while still remaining flexible enough for local customizations. https://journal.code4lib.org/articles/13895
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	William Roy Chris Gray
spellingShingle	William Roy Chris Gray Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food Code4Lib Journal
author_facet	William Roy Chris Gray
author_sort	William Roy
title	Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food
title_short	Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food
title_full	Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food
title_fullStr	Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food
title_full_unstemmed	Preparing Existing Metadata for Repository Batch Import: A Recipe for a Fickle Food
title_sort	preparing existing metadata for repository batch import: a recipe for a fickle food
publisher	Code4Lib
series	Code4Lib Journal
issn	1940-5758
publishDate	2018-11-01
description	In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range of methods have been proposed for harvesting publications metadata en masse, but many technological solutions can easily become detached from a workflow that is both reproducible for support staff and applicable to a range of situations. Many repositories offer the capacity for batch upload via CSV, so our method provides a template Python script that leverages the Habanero library for populating CSV files with existing metadata retrieved from the CrossRef API. In our case, we have combined this with useful metadata contained in a TSV file downloaded from Web of Science in order to enrich our metadata as well. The appeal of this ‘low-maintenance’ method is that it provides more robust options for gathering metadata semi-automatically, and only requires the user’s ability to access Web of Science and the Python program, while still remaining flexible enough for local customizations.
url	https://journal.code4lib.org/articles/13895
work_keys_str_mv	AT williamroy preparingexistingmetadataforrepositorybatchimportarecipeforaficklefood AT chrisgray preparingexistingmetadataforrepositorybatchimportarecipeforaficklefood
_version_	1724593598317461504
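
Illustrative sketch (not the authors' script): the description above outlines a workflow in which DOIs are looked up through the Habanero client for the CrossRef API, the results are enriched from a Web of Science TSV export, and the merged rows are written to a CSV for batch upload. The Python below is a minimal sketch of that idea under several assumptions. The `dc.*` column names and the `||` multi-value separator are hypothetical stand-ins for whatever the target repository expects, the Web of Science export is assumed to carry the DOI in a `DI` column and the abstract in an `AB` column, and all function names are invented for illustration.

```python
import csv
import io

# Hypothetical column names; adjust to the target repository's import schema.
COLUMNS = ["dc.identifier.doi", "dc.title", "dc.contributor.author",
           "dc.date.issued", "dc.description.abstract"]


def read_wos_tsv(handle):
    """Index a Web of Science tab-delimited export by lowercased DOI.

    Assumes the export has a 'DI' (DOI) column; rows without a DOI are skipped.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {r["DI"].strip().lower(): r for r in reader if r.get("DI")}


def crossref_to_row(message, wos_record=None):
    """Flatten one CrossRef 'message' dict into a CSV row dict."""
    authors = [
        ", ".join(p for p in (a.get("family"), a.get("given")) if p)
        for a in message.get("author", [])
    ]
    parts = (message.get("issued", {}).get("date-parts") or [[]])[0]
    return {
        "dc.identifier.doi": message.get("DOI", ""),
        "dc.title": (message.get("title") or [""])[0],
        # '||' as a multi-value separator is an assumption about the importer.
        "dc.contributor.author": "||".join(authors),
        "dc.date.issued": "-".join(f"{int(p):02d}" for p in parts if p),
        # Enrichment step: take the abstract from the WoS record if present.
        "dc.description.abstract": (wos_record or {}).get("AB", ""),
    }


def fetch_message(doi):
    """Look up one DOI via Habanero (third-party; needs network access)."""
    from habanero import Crossref  # imported lazily so the rest runs offline
    return Crossref().works(ids=doi)["message"]


def build_csv(dois, wos_index, fetch=fetch_message):
    """Return CSV text with one row per DOI, merged with WoS data."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    for doi in dois:
        message = fetch(doi)
        writer.writerow(crossref_to_row(message, wos_index.get(doi.lower())))
    return out.getvalue()
```

In use, one would open the Web of Science export, pass it to `read_wos_tsv`, and call `build_csv` with the list of DOIs to deposit. The `fetch` parameter exists only so the lookup can be replaced, for example with a cached or offline source, without touching the merging logic.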

Similar Items