Dealing with Big Data and Network Analysis Using Neo4j

In this lesson we will learn how to use a graph database to store and analyze complex networked information. Networks are all around us. Social scientists use networks to better understand how people are connected. This information can be used to understand how things like rumors or even communicabl...

Bibliographic Details
Main Author: Jon MacKay
Format: Article
Language: English
Published: Editorial Board of the Programming Historian 2018-02-01
Series: The Programming Historian
Online Access: https://programminghistorian.org/lessons/dealing-with-big-data-and-network-analysis-using-neo4j
id doaj-5ac07381b1974f8180b2073953474d2b
record_format Article
spelling doaj-5ac07381b1974f8180b2073953474d2b 2020-11-24T23:24:06Z eng Editorial Board of the Programming Historian The Programming Historian 2397-2068 2397-2068 2018-02-01 Dealing with Big Data and Network Analysis Using Neo4j Jon MacKay 0 University of Waterloo In this lesson we will learn how to use a graph database to store and analyze complex networked information. Networks are all around us. Social scientists use networks to better understand how people are connected. This information can be used to understand how things like rumors or even communicable diseases can spread throughout a community of people. The patterns of relationships that people maintain with others captured in a network can also be used to make inferences about a person’s position in society. For example, a person with many social ties is likely to receive information more quickly than someone who maintains very few connections with others. Using common network terminology, one would say that a person with many ties is more central in a network, and a person with few ties is more peripheral in a network. Having access to more information is generally believed to be advantageous. Similarly, if someone is very well-connected to many other people that are themselves well-connected, then we might infer that these individuals have a higher social status. Network analysis is useful for understanding the implications of ties between organizations as well. Before he was appointed to the Supreme Court of the United States, Louis Brandeis called attention to how anti-competitive activities were often organized through a web of appointments that had directors sitting on the boards of multiple ostensibly competing corporations. Since the 1970s, sociologists have taken a more formal network-based approach to examining the network of so-called corporate interlocks that exist when directors sit on the boards of multiple corporations.
Often these ties are innocent, but in some cases they can be indications of morally or legally questionable activities. The recent release of the Paradise Papers by the International Consortium of Investigative Journalists and the ensuing news scandals throughout the world show how important understanding relationships between people and organizations can be. This tutorial will focus on the Neo4j graph database, and the Cypher query language that comes with it. - Neo4j is a free, open-source graph database written in Java that is available for all major computing platforms. - Cypher is the query language for the Neo4j database that is designed to insert and select information from the database. By the end of this lesson you will be able to construct, analyze and visualize networks based on big (or just inconveniently large) data. The final section of this lesson contains code and data to illustrate the key points of this lesson. https://programminghistorian.org/lessons/dealing-with-big-data-and-network-analysis-using-neo4j
collection DOAJ
language English
format Article
sources DOAJ
author Jon MacKay
spellingShingle Jon MacKay
Dealing with Big Data and Network Analysis Using Neo4j
The Programming Historian
author_facet Jon MacKay
author_sort Jon MacKay
title Dealing with Big Data and Network Analysis Using Neo4j
title_short Dealing with Big Data and Network Analysis Using Neo4j
title_full Dealing with Big Data and Network Analysis Using Neo4j
title_fullStr Dealing with Big Data and Network Analysis Using Neo4j
title_full_unstemmed Dealing with Big Data and Network Analysis Using Neo4j
title_sort dealing with big data and network analysis using neo4j
publisher Editorial Board of the Programming Historian
series The Programming Historian
issn 2397-2068
2397-2068
publishDate 2018-02-01
description In this lesson we will learn how to use a graph database to store and analyze complex networked information. Networks are all around us. Social scientists use networks to better understand how people are connected. This information can be used to understand how things like rumors or even communicable diseases can spread throughout a community of people. The patterns of relationships that people maintain with others captured in a network can also be used to make inferences about a person’s position in society. For example, a person with many social ties is likely to receive information more quickly than someone who maintains very few connections with others. Using common network terminology, one would say that a person with many ties is more central in a network, and a person with few ties is more peripheral in a network. Having access to more information is generally believed to be advantageous. Similarly, if someone is very well-connected to many other people that are themselves well-connected, then we might infer that these individuals have a higher social status. Network analysis is useful for understanding the implications of ties between organizations as well. Before he was appointed to the Supreme Court of the United States, Louis Brandeis called attention to how anti-competitive activities were often organized through a web of appointments that had directors sitting on the boards of multiple ostensibly competing corporations. Since the 1970s, sociologists have taken a more formal network-based approach to examining the network of so-called corporate interlocks that exist when directors sit on the boards of multiple corporations. Often these ties are innocent, but in some cases they can be indications of morally or legally questionable activities.
The recent release of the Paradise Papers by the International Consortium of Investigative Journalists and the ensuing news scandals throughout the world show how important understanding relationships between people and organizations can be. This tutorial will focus on the Neo4j graph database, and the Cypher query language that comes with it. - Neo4j is a free, open-source graph database written in Java that is available for all major computing platforms. - Cypher is the query language for the Neo4j database that is designed to insert and select information from the database. By the end of this lesson you will be able to construct, analyze and visualize networks based on big (or just inconveniently large) data. The final section of this lesson contains code and data to illustrate the key points of this lesson.
url https://programminghistorian.org/lessons/dealing-with-big-data-and-network-analysis-using-neo4j
work_keys_str_mv AT jonmackay dealingwithbigdataandnetworkanalysisusingneo4j
_version_ 1725561773234323456