Summary: | In a blockchain-based system, data and the consensus-based process of recording and updating them over distributed nodes are central to enabling the trustless multi-party transactions. Thus, properly understanding what and how the data are stored and manipulated ultimately determines the degree of utility, performance, and cost of a blockchain-based application. While blockchains enhance the quality of the data by providing a transparent, immutable, and consistent data store, the technology also brings new challenges from a data management perspective. In this paper, we analyse blockchains from the viewpoint of a developer to highlight important concepts and considerations when incorporating a blockchain into a larger software system as a data store. The work aims to increase the level of understanding of blockchain technology as a data store and to promote a methodical approach in applying it to large software systems. First, we identify the common architectural layers of a typical software system with data stores and conceptualise each layer in blockchain terms. Second, we examine the placement and flow of data in blockchain-based applications. Third, we explore data administration aspects for blockchains, especially as a distributed data store. Fourth, we discuss the analytics of blockchain data and trustable data analytics enabled by blockchain. Lastly, we examine the data governance issues in blockchains in terms of privacy and quality assurance.
|