A comparison of HDFS compact data formats: Avro versus Parquet / HDFS glaustųjų duomenų formatų palyginimas: Avro prieš Parquet

In this paper, file formats like Avro and Parquet are compared with text formats to evaluate the performance of the data queries. Different data query patterns have been evaluated. Cloudera’s open-source Apache Hadoop distribution CDH 5.4 has been chosen for the experiments presented in this articl...

Bibliographic Details
Main Authors: Daiga Plase, Laila Niedrite, Romans Taranovs
Format: Article
Language: English
Published: Vilnius Gediminas Technical University 2017-07-01
Series: Mokslas: Lietuvos Ateitis
Online Access: http://journals.vgtu.lt/index.php/MLA/article/view/500