DataUp: A tool to help researchers describe and share tabular data [v2; ref status: indexed, http://f1000r.es/48u]

Scientific datasets have immeasurable value, but they lose their value over time without proper documentation, long-term storage, and easy discovery and access. Across disciplines as diverse as astronomy, demography, archeology, and ecology, large numbers of small heterogeneous datasets (i.e., the l...


Bibliographic Details
Main Authors: Carly Strasser, John Kunze, Stephen Abrams, Patricia Cruse
Format: Article
Language: English
Published: F1000 Research Ltd 2014-09-01
Series: F1000Research
Subjects: Data Sharing; Statistical Methodologies & Health Informatics
Online Access: http://f1000research.com/articles/3-6/v2
id doaj-3cb37c829348416bb2a68beff4944cbe
record_format Article
spelling doaj-3cb37c829348416bb2a68beff4944cbe; 2020-11-25T03:25:50Z; eng; F1000 Research Ltd; F1000Research; 2046-1402; 2014-09-01; 3; 10.12688/f1000research.3-6.v2; 5502; Carly Strasser, John Kunze, Stephen Abrams, Patricia Cruse (all: California Digital Library, University of California Office of the President, Oakland, CA 94612, USA)
collection DOAJ
language English
format Article
sources DOAJ
author Carly Strasser
John Kunze
Stephen Abrams
Patricia Cruse
title DataUp: A tool to help researchers describe and share tabular data [v2; ref status: indexed, http://f1000r.es/48u]
publisher F1000 Research Ltd
series F1000Research
issn 2046-1402
publishDate 2014-09-01
description Scientific datasets have immeasurable value, but they lose their value over time without proper documentation, long-term storage, and easy discovery and access. Across disciplines as diverse as astronomy, demography, archeology, and ecology, large numbers of small heterogeneous datasets (i.e., the long tail of data) are especially at risk unless they are properly documented, saved, and shared. One unifying factor for many of these at-risk datasets is that they reside in spreadsheets. In response to this need, the California Digital Library (CDL) partnered with Microsoft Research Connections and the Gordon and Betty Moore Foundation to create the DataUp data management tool for Microsoft Excel. Many researchers creating these small, heterogeneous datasets use Excel at some point in their data collection and analysis workflow, so we were interested in developing a data management tool that fits easily into those workflows and minimizes the learning curve for researchers. The DataUp project began in August 2011. We first formally assessed the needs of researchers by conducting surveys and interviews of our target research groups: earth, environmental, and ecological scientists. We found that, on average, researchers had very poor data management practices, were not aware of data centers or metadata standards, and did not understand the benefits of data management or sharing. Based on our survey results, we composed a list of desirable components and requirements and solicited feedback from the community to prioritize potential features of the DataUp tool. These requirements were then relayed to the software developers, and DataUp was successfully launched in October 2012.
topic Data Sharing
Statistical Methodologies & Health Informatics
url http://f1000research.com/articles/3-6/v2