Applications in Software Engineering
Systems that are commonly studied from a complex systems perspective are predominantly found among biological and social systems. However, it is important to realise that man-made, engineered systems fall into this category as well. One class of complex engineered systems whose importance for society has increased tremendously over the last few years is software systems. Typical modern software systems consist of hundreds or even thousands of interdependent modules or subsystems, which work together in complex ways and depend on each other's functionality. A proper understanding of the structure of such software systems, and of how this structure evolves as development progresses, is fundamental for the design and management of secure, reliable and maintainable software. Naturally, such systems can be studied from a complex networks perspective, where functions, classes, or packages are represented by nodes and the interdependencies between them are modeled as links.

The application of network-analytic techniques allows us to study the modularity of software systems. Our most recent research in this area has shown that such a network perspective can be used to quantify the evolution of modularity in software systems. It has also proven useful for the development of remodularisation algorithms, which support developers in creating software structures that are easy to understand and thus to maintain. In addition to developing quantitative measures and algorithms for the remodularisation of software, we use agent-based modeling techniques to study the temporal evolution of software systems. This interdisciplinary approach not only helps us to better understand why software systems evolve the way they do, it also allows us to identify which growth processes are sustainable in the sense that they lead to software structures that can easily be maintained by engineers.
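
As a minimal sketch of how modularity can be quantified from a network perspective, the following Python snippet computes the standard Newman modularity Q of a dependency network with respect to a given assignment of classes to packages. The function name, the toy class names and the treatment of dependencies as undirected links are assumptions made for illustration; they are not taken from the publications listed on this page.

```python
from collections import defaultdict


def modularity(edges, package_of):
    """Newman modularity Q of an undirected graph for a given partition.

    edges      -- list of (class_a, class_b) dependency pairs (assumed undirected)
    package_of -- dict mapping each class to the package that contains it
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # links with both ends in the same package
    degree_sum = defaultdict(int)  # summed degree of all classes in a package
    for u, v in edges:
        pu, pv = package_of[u], package_of[v]
        degree_sum[pu] += 1
        degree_sum[pv] += 1
        if pu == pv:
            internal[pu] += 1
    # Q = sum over packages of (fraction of internal links) - (expected fraction)
    return sum(internal[p] / m - (degree_sum[p] / (2 * m)) ** 2 for p in degree_sum)


# Hypothetical toy system: two tightly coupled groups of classes joined by one link.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]
good = {"A": "p1", "B": "p1", "C": "p1", "D": "p2", "E": "p2", "F": "p2"}
bad = {"A": "p1", "B": "p1", "C": "p2", "D": "p2", "E": "p2", "F": "p2"}

q_good = modularity(edges, good)
q_bad = modularity(edges, bad)
```

In this toy example the package structure that matches the two dependency clusters yields Q = 5/14 (about 0.357), while moving class C into the other package lowers it to 6/49 (about 0.122).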

Notably, our research in this area is truly interdisciplinary: it has been published both in interdisciplinary journals like Europhysics Letters and at premium software engineering venues like the International Conference on Modularity.
Selected Publications
Automated Software Remodularization Based on Move Refactoring
|
[2014]
|
Zanetti, Marcelo Serrano;
Tessone, Claudio Juan;
Scholtes, Ingo;
Schweitzer, Frank
|
In Proceedings of the 13th International Conference on Modularity 2014
|
Abstract
Modular design is a desirable characteristic of complex software systems that can significantly improve their comprehensibility, maintainability and thus quality. While many software systems are initially created in a modular way, over time modularity typically degrades as components are reused outside the context where they were created. In this paper, we propose an automated strategy to remodularize software based on move refactoring, i.e. moving classes between packages without changing any other aspect of the source code. Taking a complex systems perspective, our approach is based on complex networks theory applied to the dynamics of software modular structures and its relation to an n-state spin model known as the Potts Model. In our approach, nodes are probabilistically moved between modules with a probability that nonlinearly depends on the number and module membership of their adjacent neighbors. The latter are defined by the underlying network of software dependencies. To validate our method, we apply it to a dataset of 39 Java open source projects in order to optimize their modularity. Comparing the source code generated by the developers with the optimized code resulting from our approach, we find that modularity (i.e. quantified in terms of a standard measure from the study of complex networks) improves on average by 166 ± 77 percent. In order to facilitate the application of our method in practical studies, we provide a freely available Eclipse plug-in.
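
The probabilistic move step described in this abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the exponential (heat-bath style) weighting, the parameter `beta` and all function names are assumptions chosen only to show one way in which a move probability can depend nonlinearly on the number and module membership of a node's neighbours.

```python
import math
import random
from collections import Counter


def move_probabilities(node, neighbours, module_of, beta=1.0):
    """Probability of placing `node` in each candidate module.

    Assumed rule: weight(module) = exp(beta * number of neighbours in module).
    """
    counts = Counter(module_of[n] for n in neighbours[node])
    counts.setdefault(module_of[node], 0)  # staying put is always a candidate
    top = max(counts.values())             # subtract the maximum for numerical stability
    weights = {mod: math.exp(beta * (c - top)) for mod, c in counts.items()}
    total = sum(weights.values())
    return {mod: w / total for mod, w in weights.items()}


def sweep(neighbours, module_of, beta=1.0, rng=None):
    """Visit every node once in random order and resample its module in place."""
    rng = rng or random.Random()
    nodes = list(neighbours)
    rng.shuffle(nodes)
    for node in nodes:
        probs = move_probabilities(node, neighbours, module_of, beta)
        modules = list(probs)
        module_of[node] = rng.choices(modules, weights=[probs[m] for m in modules])[0]
    return module_of


# Hypothetical example: class A sits in package p1, but both of its neighbours are in p2.
neighbours = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
module_of = {"A": "p1", "B": "p2", "C": "p2"}
probs_a = move_probabilities("A", neighbours, module_of, beta=1.0)
```

With these assumed weights, class A is more likely to be moved to p2 than to stay in p1, and larger values of `beta` make the moves increasingly greedy.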
The Link between Dependency and Cochange: Empirical Evidence
|
[2012]
|
Geipel, Markus Michael;
Schweitzer, Frank
|
IEEE Transactions on Software Engineering,
pages: 1432-1444,
volume: 38,
number: 6
|
Abstract
We investigate the relationship between class dependency and change propagation (cochange) in software written in Java. On the one hand, we find a strong correlation between dependency and cochange. Furthermore, we provide empirical evidence for the propagation of change along paths of dependency. These findings support the often alleged role of dependencies as propagators of change. On the other hand, we find that approximately half of all dependencies are never involved in cochanges and that the vast majority of cochanges pertain to only a small percentage of dependencies. This means that inferring the cochange characteristics of a software architecture solely from its dependency structure results in a severely distorted approximation of cochange characteristics. Any metric which uses dependencies alone to pass judgment on the evolvability of a piece of Java software is thus unreliable. As a consequence, we suggest always taking both the change characteristics and the dependency structure into account when evaluating software architecture.
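
To make the kind of measurement discussed in this abstract concrete, the following Python sketch counts how often each dependency is involved in a cochange. It assumes, purely for illustration, that a cochange means both classes of a dependency being modified in the same commit and that dependencies can be treated as unordered pairs; the paper's precise definitions may differ, and all names and data are hypothetical.

```python
from itertools import combinations


def cochange_counts(dependencies, commits):
    """Number of commits in which both classes of each dependency changed."""
    counts = {frozenset(dep): 0 for dep in dependencies}
    for changed in commits:
        for pair in combinations(sorted(set(changed)), 2):
            key = frozenset(pair)
            if key in counts:
                counts[key] += 1
    return counts


def never_cochanged_fraction(counts):
    """Share of dependencies that were never involved in a cochange."""
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 0) / len(counts)


# Hypothetical data: four dependencies and a short commit history.
dependencies = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
commits = [{"A", "B"}, {"A", "B", "C"}, {"A", "B"}, {"D"}]

counts = cochange_counts(dependencies, commits)
unused = never_cochanged_fraction(counts)
```

In this toy history half of the dependencies are never involved in a cochange, and a single dependency (A, B) accounts for three of the four cochange events.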
A network perspective on software modularity
|
[2012]
|
Zanetti, Marcelo Serrano;
Schweitzer, Frank
|
Architecture of Computing Systems (ARCS) Workshops 2012
|
Abstract
Modularity is a desirable characteristic for software systems. In this article we propose to use a quantitative method from complex network sciences to estimate the coherence between the modularity of the dependency network of large open source Java projects and their decomposition in terms of Java packages. The results presented in this article indicate that our methodology offers a promising and reasonable quantitative approach with potential impact on software engineering processes.
Sustainable growth in complex networks
|
[2011]
|
Tessone, Claudio Juan;
Geipel, Markus Michael;
Schweitzer, Frank
|
Europhysics Letters,
pages: 58005,
volume: 96,
number: 5
|
Abstract
Based on the analysis of the dependency network in 18 Java projects, we develop a novel model of network growth which considers both preferential attachment and the addition of new nodes with a heterogeneous distribution of their initial degree, $k_0$. Empirically we find that the cumulative distributions of initial and final degrees in the network follow power law behaviours: $1-P(k_0) \propto k_0^{1-\alpha}$ and $1-P(k) \propto k^{1-\gamma}$, respectively. For the total number of links as a function of the network size, we find empirically $K(N) \propto N^{\beta}$, where $\beta \in [1.25, 2]$ (for small $N$), while converging to $\beta \sim 1$ for large $N$. This indicates a transition from a growth regime with increasing network density towards a sustainable regime, which prevents a collapse due to ever increasing dependencies. Our theoretical framework allows us to predict relations between the exponents $\alpha$, $\beta$, $\gamma$, which also link issues of software engineering and developer activity. These relations are verified by means of computer simulations and empirical investigations. They indicate that the growth of real Open Source Software networks occurs on the edge between two regimes, which are dominated either by the initial degree distribution of added nodes, or by the preferential attachment mechanism. Hence, the heterogeneous degree distribution of newly added nodes, found empirically, is essential to describe the laws of sustainable growth in networks.
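
A scaling relation of the form K(N) ∝ N^β, as reported in this abstract, can be estimated from measured pairs of network size N and link count K by a least-squares fit in log-log space. The Python sketch below shows this generic technique; the function name and the synthetic data are assumptions, and the estimation procedure actually used in the paper may differ.

```python
import math


def fit_power_law_exponent(sizes, links):
    """Least-squares slope of log(K) against log(N), i.e. an estimate of beta."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(k) for k in links]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data that follows K = 3 * N^1.5 exactly (hypothetical, for illustration only).
sizes = [10, 100, 1000, 10000]
links = [3 * n ** 1.5 for n in sizes]
beta_hat = fit_power_law_exponent(sizes, links)
```

Because the exponent reported in the abstract changes with network size, such a fit would have to be carried out separately for small and large N rather than over the whole range at once.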
Software change dynamics: Evidence from 35 Java projects
|
[2009]
|
Geipel, Markus Michael;
Schweitzer, Frank
|
Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
|
Abstract
In this paper we investigate the relationship between class dependency and change propagation in Java software. By analyzing 35 large Open Source Java projects, we find that in the majority of the projects more than half of the dependencies are never involved in change propagation. Furthermore, our analysis shows that only a few dependencies are transmitting the majority of change propagation events. An additional analysis reveals that this concentration cannot be explained by the different ages of the dependencies. The conclusion is that the dependency structure alone is a poor measure for the change dynamics. This contrasts with current literature.
A complementary view on the growth of directory trees
|
[2009]
|
Geipel, Markus Michael;
Tessone, Claudio Juan;
Schweitzer, Frank
|
The European Physical Journal B,
pages: 641-648,
volume: 71,
number: 4
|
Abstract
Trees are a special sub-class of networks with unique properties, such as the level distribution, which has often been overlooked. We analyse a general tree growth model proposed by Klemm et al. [Phys. Rev. Lett. 95, 128701 (2005)] to explain the growth of user-generated directory structures in computers. The model has a single parameter q which interpolates between preferential attachment and random growth. Our analysis results in three contributions: first, we propose a more efficient estimation method for q based on the degree distribution, which is one specific representation of the model. Next, we introduce the concept of a level distribution and analytically solve the model for this representation. This allows for an alternative and independent measure of q. We argue that, to capture real growth processes, the q estimations from the degree and the level distributions should coincide. Thus, we finally apply both representations to validate the model with synthetically generated tree structures, as well as with collected data of user directories. In the case of real directory structures, we show that the values of q measured from the level distribution are incompatible with those measured from the degree distribution. In contrast to this, we find perfect agreement in the case of simulated data. Thus, we conclude that the model is an incomplete description of the growth of real directory structures as it fails to reproduce the level distribution. This insight can be generalised to point out the importance of the level distribution for modeling tree growth.
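
A tree growth process with a single parameter q that interpolates between preferential attachment and random growth can be simulated in a few lines of Python. The sketch below rests on assumptions that are not confirmed by the abstract: q is taken to be the probability of a preferential step, and the preferential weight of a directory is taken to be its number of children plus one. The original model by Klemm et al. may define both differently.

```python
import random
from collections import Counter


def grow_tree(n_nodes, q, seed=None):
    """Grow a tree node by node and return (parent, level) lists.

    With probability q the parent is chosen preferentially (assumed weight:
    number of children + 1), otherwise uniformly at random.
    """
    rng = random.Random(seed)
    parent = [None]  # node 0 is the root directory
    level = [0]      # distance of each node from the root
    weighted = [0]   # node i appears (children of i + 1) times in this list
    for new in range(1, n_nodes):
        if rng.random() < q:
            target = rng.choice(weighted)  # preferential attachment step
        else:
            target = rng.randrange(new)    # random growth step
        parent.append(target)
        level.append(level[target] + 1)
        weighted.append(target)            # target gained one child
        weighted.append(new)               # the new node starts with weight 1
    return parent, level


def level_distribution(level):
    """Fraction of nodes found at each distance from the root."""
    counts = Counter(level)
    total = len(level)
    return {depth: counts[depth] / total for depth in sorted(counts)}


parent, level = grow_tree(200, q=0.5, seed=1)
dist = level_distribution(level)
```

Comparing the level distribution and the degree distribution of such simulated trees for different values of q is the kind of consistency check that the abstract applies to real directory structures.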
A Complex Networks Perspective On Collaborative Software Engineering
|
[2015]
|
Cataldo, Marcelo;
Scholtes, Ingo;
Valetto, Giuseppe
|
ACS - Advances in Complex Systems,
pages: 1430001,
volume: 17,
number: 07n08
|
Abstract
Large collaborative software engineering projects are interesting examples of evolving complex systems. The complexity of these systems unfolds both in evolving software structures and in the social dynamics and organization of development teams. Due to the adoption of Open Source practices and the increasing use of online support infrastructures, large-scale data sets covering both the social and technical dimension of collaborative software engineering processes are increasingly becoming available. In the analysis of these data, a growing number of studies employ a network perspective, using methods and abstractions from network science to generate insights about software engineering processes. Featuring a collection of inspiring works in this area, with this topical issue, we intend to give an overview of state-of-the-art research. We hope that this collection of articles will stimulate downstream applications of network-based data mining techniques in empirical software engineering.