Evolution of communities of software: using tensor decompositions to compare software ecosystems

Abstract Modern software development is often a collaborative effort involving many authors through the re-use and sharing of code through software libraries. Modern software “ecosystems” are complex socio-technical systems which can be represented as a multilayer dynamic network. Many of these libr...

Bibliographic Details
Main Authors: Oliver A. Blanthorn, Colin M. Caine, Eva M. Navarro-López
Format: Article
Language: English
Published: SpringerOpen 2019-12-01
Series: Applied Network Science
Subjects: Tensor factorisation; Multilayer temporal networks; Software dependency networks
Online Access: https://doi.org/10.1007/s41109-019-0193-5
id doaj-ae5353cb3df54128a37564afcebdacb1
record_format Article
spelling doaj-ae5353cb3df54128a37564afcebdacb1; 2020-12-27T12:09:50Z; eng; SpringerOpen; Applied Network Science; 2364-8228; 2019-12-01; vol. 4, no. 1, pp. 1-22; 10.1007/s41109-019-0193-5; Evolution of communities of software: using tensor decompositions to compare software ecosystems; Oliver A. Blanthorn (0), Colin M. Caine (1), Eva M. Navarro-López (2); (0) School of Computer Science, University of Manchester; (1) School of Geography, University of Leeds; (2) School of Environment, Education and Development, University of Manchester; https://doi.org/10.1007/s41109-019-0193-5; Tensor factorisation; Multilayer temporal networks; Software dependency networks
collection DOAJ
language English
format Article
sources DOAJ
author Oliver A. Blanthorn
Colin M. Caine
Eva M. Navarro-López
title Evolution of communities of software: using tensor decompositions to compare software ecosystems
publisher SpringerOpen
series Applied Network Science
issn 2364-8228
publishDate 2019-12-01
description Abstract Modern software development is often a collaborative effort involving many authors through the re-use and sharing of code through software libraries. Modern software “ecosystems” are complex socio-technical systems which can be represented as a multilayer dynamic network. Many of these libraries and software packages are open-source and developed in the open on sites such as GitHub, so there is a large amount of data available about these networks. Studying these networks could be of interest to anyone choosing or designing a programming language. In this work, we use tensor factorisation to explore the dynamics of communities of software, and then compare these dynamics between languages on a dataset of approximately 1 million software projects. We hope to be able to inform the debate on software dependencies that has been recently re-ignited by the malicious takeover of the npm package event-stream and other incidents through giving a clearer picture of the structure of software dependency networks, and by exploring how the choices of language designers—for example, in the size of standard libraries, or the standards to which packages are held before admission to a language ecosystem is granted—may have shaped their language ecosystems. We establish that adjusted mutual information is a valid metric by which to assess the number of communities in a tensor decomposition and find that there are striking differences between the communities found across different software ecosystems and that communities do experience large and interpretable changes in activity over time. The differences between the elm and R software ecosystems, which see some communities decline over time, and the more conventional software ecosystems of Python, Java and JavaScript, which do not see many declining communities, are particularly marked.
topic Tensor factorisation
Multilayer temporal networks
Software dependency networks
url https://doi.org/10.1007/s41109-019-0193-5
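
The description above names adjusted mutual information (AMI) as the score used to assess the number of communities in a tensor decomposition, but this record does not say how the score is computed or applied. The following is a minimal sketch of AMI between two community labellings of the same packages, assuming the common definition with natural logarithms and the arithmetic mean of the two entropies as the normaliser. The function names and the toy labellings are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import exp, lgamma, log


def entropy(labels):
    """Shannon entropy (in nats) of a labelling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """Mutual information (in nats) between two labellings of the same items."""
    n = len(u)
    rows, cols, joint = Counter(u), Counter(v), Counter(zip(u, v))
    return sum(
        (nij / n) * log(n * nij / (rows[i] * cols[j]))
        for (i, j), nij in joint.items()
    )


def expected_mutual_information(u, v):
    """Expected MI of random labellings that keep the same cluster sizes."""
    n = len(u)

    def log_fact(k):
        return lgamma(k + 1)  # log(k!)

    emi = 0.0
    for a in Counter(u).values():
        for b in Counter(v).values():
            # Sum over every feasible overlap between a cluster of size a
            # and a cluster of size b (hypergeometric model).
            for nij in range(max(1, a + b - n), min(a, b) + 1):
                log_prob = (
                    log_fact(a) + log_fact(b)
                    + log_fact(n - a) + log_fact(n - b)
                    - log_fact(n) - log_fact(nij)
                    - log_fact(a - nij) - log_fact(b - nij)
                    - log_fact(n - a - b + nij)
                )
                emi += (nij / n) * log(n * nij / (a * b)) * exp(log_prob)
    return emi


def adjusted_mutual_information(u, v):
    """AMI = (MI - E[MI]) / (mean(H(u), H(v)) - E[MI])."""
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("labellings must be non-empty and of equal length")
    h_u, h_v = entropy(u), entropy(v)
    if h_u == 0.0 and h_v == 0.0:
        return 1.0  # both labellings put every item in one community
    emi = expected_mutual_information(u, v)
    denominator = 0.5 * (h_u + h_v) - emi
    if denominator == 0.0:
        return 0.0  # degenerate case, treated as "no better than chance" here
    return (mutual_information(u, v) - emi) / denominator


# Hypothetical community labels for nine packages from three runs.
run_a = [0, 0, 0, 1, 1, 1, 2, 2, 2]
run_b = ["x", "x", "x", "y", "y", "y", "z", "z", "z"]  # same split, renamed
run_c = [0, 1, 2, 0, 1, 2, 0, 1, 2]                    # cuts across run_a

print(adjusted_mutual_information(run_a, run_b))
print(adjusted_mutual_information(run_a, run_c))
```

Identical partitions score 1 (up to floating-point rounding) regardless of how the communities are named, while partitions that agree no better than chance score at or below 0. One plausible use, stated here as an assumption, is to repeat a decomposition at each candidate number of communities and prefer the number whose runs agree most strongly under this score.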