Summary: | Over the past five years, research on big data analysis has been actively conducted, and many services have been developed to find valuable data. However, low quality of raw data and data loss problem during data analysis make it difficult to perform accurate data analysis. With the enormous generation of both unstructured and structured data, refinement of data is becoming increasingly difficult. As a result, data refinement plays an important role in data analysis. In addition, as part of efforts to ensure research reproducibility, the importance of reuse of researcher data and research methods is increasing; however, the research on systems supporting such roles has not been conducted sufficiently. Therefore, in this paper, we propose a big data analysis system named the unified data analytics suite (UDAS) that focuses on data refinement. UDAS performs data refinement based on the big data platform and ensures the reusability and reproducibility of refinement and analysis through the visual programming language interface. It also recommends open source and visualization libraries to users for statistical analysis. The qualitative evaluation of UDAS using the functional evaluation factor of the big data analysis platform demonstrated that the average satisfaction of the users is significantly high.
|