Predictability and inequality in complex social systems

Efficient access to Big Data and the development of related technologies have driven the study of complex social systems. In this thesis, I focus on two aspects of the study of complex social systems: predictability and inequality. I first demonstrate the predictability aspect of complex social syst...


Bibliographic Details
Published:
Online Access: http://hdl.handle.net/2047/D20413922
id ndltd-NEU--neu-bz613976r
record_format oai_dc
spelling ndltd-NEU--neu-bz613976r 2021-08-20T05:11:13Z
collection NDLTD
sources NDLTD
description Efficient access to Big Data and the development of related technologies have driven the study of complex social systems. In this thesis, I focus on two aspects of the study of complex social systems: predictability and inequality. I first demonstrate the predictability aspect of complex social systems in the publishing industry. I begin with a Big Data analysis of New York Times bestsellers and bestselling authors, including the genre composition, longevity, and sales of bestsellers, as well as the gender composition and career characteristics of bestselling authors. Next, I examine weekly book sales curves and discover the universal pattern of "fast rise, slow decline." Though this model enables prediction of total sales by the "peeking strategy", such a prediction requires at least 25 weeks of sales of a given book, a period well beyond the peak of the sales curve. To provide early prediction of book sales, I extract features that are available before publication, relating to the author, the book, and the publisher. Then I develop the "Learning to Place" algorithm, which addresses the problem of imbalance in book sales, i.e. most books have low sales and far fewer books have high sales. This framework also allows us to understand the characteristics that drive book sales, which is very important for understanding complex social systems. The second aspect of complex social systems that I examine in this thesis is inequality. First, I conduct a large-scale investigation of gender underrepresentation in the art world, using various statistical tools and complex networks. I propose two criteria, gender-neutral and gender-balanced, to categorize the representation of women artists in each institution. I find a systematic underrepresentation of women artists in institutions, which may hinder career development and access to auctions for women artists. Finally, I use logistic regression to connect institutional exhibition inequality with auction inequality, and find that institutional exhibition inequality has an effect on artists' auction access. Continuing the theme of inequality, the last project focuses on how information access inequality emerges in a network. One of the most important functions of networks is the dissemination of information, and it is argued that information is the basis for all kinds of inequality. In this project, I measure the information inequality of different models under different processes to understand which properties are related to information inequality. I propose different models with the majority/minority dichotomy, along with mechanisms such as homophily, preferential attachment, and diversity. I simulate different information spreading processes with different settings, from the type of process to the transmission rate to the seeding position. I propose a measure of information access inequality that allows us to examine inequality at different stages of the process. I find that information access equality depends on both the network structure and the spreading process. It is also observed that there may be a trade-off between equality and efficiency in information spreading under certain circumstances. Ultimately, the goal of this thesis is to provide a starting point and inspiration to explore the predictability of complex social systems, especially for other cultural products such as films, music and videos, and to explore inequalities in complex social systems, not only in relation to specific case studies such as gender or racial bias, but also in terms of how inequalities arise and which interventions might promote information equality.--Author's abstract
title Predictability and inequality in complex social systems
publishDate
url http://hdl.handle.net/2047/D20413922
_version_ 1719460731805499392