Christoph Gote
Christoph Gote is a PhD Candidate at the Chair of Systems Design at ETH Zurich, Switzerland. Specialising in systems theory and control engineering, he received his MSc in Electrical Engineering and Information Technology from the Karlsruhe Institute of Technology in 2016. He further holds an MSc in Investment and Wealth Management from Imperial College Business School in London, UK, awarded in 2017. His current research focuses on the analysis of collaboration structures in software development teams. To this end, he applies data-driven modelling and network analysis to large sets of repositories of Open Source Software (OSS) projects. He further develops tools to facilitate the extraction of information from OSS repositories, as well as methods for temporal and higher-order network analysis.
Publications
Predicting Sequences of Traversed Nodes in Graphs using Network Models with Multiple Higher Orders
|
[2020]
|
Gote, Christoph;
Casiraghi, Giona;
Schweitzer, Frank;
Scholtes, Ingo
|
arXiv preprint arXiv:2007.06662
|
Abstract
We propose a novel sequence prediction method for sequential data capturing node traversals in graphs. Our method builds on a statistical modelling framework that combines multiple higher-order network models into a single multi-order model. We develop a technique to fit such multi-order models in empirical sequential data and to select the optimal maximum order. Our framework facilitates both next-element and full sequence prediction given a sequence-prefix of any length. We evaluate our model based on six empirical data sets containing sequences from website navigation as well as public transport systems. The results show that our method out-performs state-of-the-art algorithms for next-element prediction. We further demonstrate the accuracy of our method during out-of-sample sequence prediction and validate that our method can scale to data sets with millions of sequences.
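The idea of predicting the next node from contexts of several different lengths can be illustrated with a minimal sketch. It is not the multi-order model described in the abstract: it is a plain back-off scheme over Markov contexts, without the statistical model selection the paper develops. The function names `fit_counts` and `predict_next` and the toy paths are hypothetical.

```python
from collections import Counter, defaultdict


def fit_counts(sequences, max_order):
    """Count next-node frequencies for every context of length 1..max_order."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(1, len(seq)):
            for k in range(1, max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context of length k
                context = tuple(seq[i - k:i])
                counts[context][seq[i]] += 1
    return counts


def predict_next(counts, prefix, max_order):
    """Predict the next node from the longest context seen in training.

    Assumption: simple back-off from long to short contexts; returns None
    if no suffix of the prefix was observed.
    """
    for k in range(min(max_order, len(prefix)), 0, -1):
        context = tuple(prefix[-k:])
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return None


# Hypothetical toy paths: the node after "c" depends on the node before "c".
paths = [["a", "c", "d"], ["b", "c", "e"], ["a", "c", "d"], ["b", "c", "e"]]
counts = fit_counts(paths, max_order=2)

print(predict_next(counts, ["a", "c"], max_order=2))
print(predict_next(counts, ["b", "c"], max_order=2))
```

In this toy data a first-order context ("c" alone) cannot tell the two continuations apart, whereas the second-order context can, which is the kind of dependency higher-order models are meant to capture.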
Multi-layer network approach to modelling authorship influence on citation dynamics in physics journals
|
[2020]
|
Schweitzer, Frank;
Nanumyan, Vahan;
Gote, Christoph
|
Physical Review E,
pages: 032303,
volume: 102,
number: 3
|
Abstract
We provide a general framework to model the growth of networks consisting of different coupled layers. Our aim is to estimate the impact of one such layer on the dynamics of the others. As an application, we study a scientometric network, where one layer consists of publications as nodes and citations as links, whereas the second layer represents the authors. This allows to address the question how characteristics of authors, such as their number of publications or number of previous co-authors, impacts the citation dynamics of a new publication. To test different hypotheses about this impact, our model combines citation constituents and social constituents in different ways. We then evaluate their performance in reproducing the citation dynamics in nine different physics journals. For this, we develop a general method for statistical parameter estimation and model selection that is applicable to growing multi-layer networks. It takes both the parameter errors and the model complexity into account and is computationally efficient and scalable to large networks.
Analysing Time-Stamped Co-Editing Networks in Software Development Teams using git2net
|
[2019]
|
Gote, Christoph;
Scholtes, Ingo;
Schweitzer, Frank
|
arXiv:1911.09484
|
Abstract
Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks capturing e.g. collaboration, coordination, or communication from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts. Because this neglects detailed information on code changes and code ownership we introduce git2net, a scalable python software that facilitates the extraction of fine-grained co-editing networks in large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. We apply our tool in two case studies using GitHub repositories of multiple Open Source as well as a commercial software project. Specifically, we use data on more than 1.2 million commits and more than 25'000 developers to test a hypothesis on the relation between developer productivity and co-editing patterns in software teams. We argue that git2net opens up a massive new source of high-resolution data on human collaboration patterns that can be used to advance theory in empirical software engineering, computational social science, and organisational studies.
git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories
|
[2019]
|
Schweitzer, Frank;
Gote, Christoph;
Scholtes, Ingo
|
Proceedings of the 16th International Conference on Mining Software Repositories,
pages: 433-444
|
Abstract
Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks capturing e.g. collaboration, coordination, or communication from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts defined at the level of files, modules, or packages. While this approach has led to insights into the social aspects of software development, it neglects detailed information on code changes and code ownership, e.g. which exact lines of code have been authored by which developers, that is contained in the commit log of software projects.
Addressing this issue, we introduce git2net, a scalable python software that facilitates the extraction of fine-grained co-editing networks in large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. This information allows us to construct directed, weighted, and time-stamped networks, where a link signifies that one developer has edited a block of source code originally written by another developer. Our tool is applied in case studies of an Open Source and a commercial software project. We argue that it opens up a massive new source of high-resolution data on human collaboration patterns.
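The link definition in the abstract above can be illustrated with a small toy sketch. It does not use git2net or its API; the edit records, the function name `coediting_network`, the link direction (editor to original author), and the choice to ignore self-edits are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (editor, original_author, unix_timestamp).
# Each record means the editor changed a block of code first written
# by original_author at the given time.
edits = [
    ("alice", "bob", 1558944000),
    ("alice", "bob", 1559030400),
    ("carol", "alice", 1559116800),
    ("bob", "bob", 1559203200),  # self-edit, skipped below (assumption)
]


def coediting_network(edit_records):
    """Build directed, time-stamped links; the weight is the number of edits."""
    links = defaultdict(list)
    for editor, original_author, timestamp in edit_records:
        if editor != original_author:
            links[(editor, original_author)].append(timestamp)
    return dict(links)


network = coediting_network(edits)
for (editor, author), timestamps in network.items():
    print(f"{editor} -> {author}: weight={len(timestamps)}, times={timestamps}")
```

Keeping the list of timestamps per link, rather than only a count, preserves the temporal information that a time-stamped network analysis would need.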
Talks
Analysing Time-Stamped Co-Editing Networks in Software Development Teams using git2net
[Sept. 23, 2020]
NetSci 2020
A Generative Multi-Order Model to Predict Variable-Length Paths in Networks
[Sept. 23, 2020]
NetSci 2020
AIC Based Model Selection for Generative Multi-Order Models of Paths in Networks
[Sept. 21, 2020]
NetSci 2020
A Generative Multi-Order Model to Predict Variable-Length Paths in Networks
[Sept. 19, 2020]
NetSci 2020 — Machine Learning In Network Science Satellite
git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories
[Sept. 25, 2019]
INFORMATIK 2019 - Best of Data Science made in D/A/CH
Model selection for coupled growth of multi-layer networks
[May 30, 2019]
NetSci 2019
git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories
[May 30, 2019]
NetSci 2019
Variable-order network models based on path data
[May 28, 2019]
NetSci 2019 — Higher-Order Models in Network Science (Invited Talk)
git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories
[May 27, 2019]
MSR 2019
Awards: Special MSR Mention in MSR 2019 FOSS Award
git2net is awarded Special MSR Mention
We are proud to announce that our paper introducing the Open Source Python package git2net received a Special MSR Mention at the 16th International Conference on Mining Software Repositories (MSR) 2019 in Montreal, QC, Canada.
With git2net we introduce an Open Source tool that facilitates the scalable extraction of time-stamped co-editing relationships between developers in large git-based software repositories.
The Special MSR Mention is part of the MSR FOSS award given to papers that show outstanding contributions to the Free Open Source Software (FOSS) community. With this Special MSR Mention, the FOSS community recognises the potential of git2net to allow projects to gain knowledge on their own important aspects. We further expect git2net to be of great value for scientists studying the development of git-based software projects.
You can install and use git2net for your own research today via
pip install git2net
The preprint of our paper is available on arXiv.org. Check out the reproducibility package, which includes a tutorial, to learn how to get started analysing your own repositories. To contribute to the future development of git2net, visit our repository on GitHub.
