A Network Perspective on Software Modularity

The modularity of a software architecture is considered a key feature contributing to the sustainability of large-scale software projects. Ideally, modularization decouples development efforts, which can then proceed independently once a binding standard interface is established. As the software evolves, modularity can also favor its maintainability and extensibility. For the development of a system to be sustainable, the effort required to modify its architecture must remain compatible with the resources (time, personnel, etc.) available at any given time. Monitoring the modularity of an evolving software system therefore promises to be an important step towards a sustainable software development regime. Such a task, however, would be tedious and slow if performed manually. In this article we propose an efficient, automatic, quantitative approach to estimate the coherence between the modularity of the dependency network of large open source Java projects and their decomposition into Java packages. Figure 1 shows an example visualization of the modular structure of the Java project AspectJ.

Example visualization of the modular structure of the project AspectJ

A measure for modular coherence inspired by complex network science

To this end, we adopt a network metric that was first used to study assortative mixing in networks, i.e. the tendency of network nodes to be connected to other nodes that are like (or unlike) them in some way. The resulting metric Q (details in the paper) measures the fraction of network edges that connect nodes within the same module minus the expected value of the same quantity in a random network with the same node-to-module allocation. If the observed fraction is no better than the random expectation, Q = 0. In general, Q ∈ [−1, 1], and the more modular the network, the closer Q is to 1. Figure 2 provides two examples of networks and their respective Q scores.

Two examples of undirected networks and their respective modular coherence
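
The description of Q above can be made concrete with a small sketch. The exact formula is given in the paper. The version below is an assumption: it uses the normalized assortativity coefficient for discrete node attributes, which matches the stated range [−1, 1] and equals 0 when the observed within-module fraction equals the random expectation. The function name modular_coherence, its parameters, and the convention of returning 0 for a single module are hypothetical choices made for illustration.

```python
from collections import defaultdict


def modular_coherence(edges, module_of, directed=True):
    """Sketch of Q: (observed - expected) / (1 - expected).

    edges     -- iterable of (source, target) node pairs
    module_of -- dict mapping every node to its module label
    Assumption: Q is taken to be the normalized assortativity
    coefficient; see the paper for the authors' exact definition.
    """
    edges = list(edges)
    if not directed:
        # Count every undirected edge once in each direction.
        edges = edges + [(v, u) for u, v in edges]
    m = len(edges)
    if m == 0:
        raise ValueError("network has no edges")

    within = 0                    # edges with both ends in one module
    out_ends = defaultdict(int)   # edges leaving each module
    in_ends = defaultdict(int)    # edges entering each module
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        out_ends[mu] += 1
        in_ends[mv] += 1
        if mu == mv:
            within += 1

    observed = within / m
    # Expected within-module fraction if edge ends were paired at random.
    expected = sum(out_ends[k] * in_ends[k] for k in out_ends) / (m * m)
    if expected >= 1.0:
        # Only one module: Q is undefined; 0.0 is an illustrative convention.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Two triangles joined by a single bridge edge, one module per triangle.
module_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
q_bridged = modular_coherence(triangles + [(2, 3)], module_of, directed=False)
q_separate = modular_coherence(triangles, module_of, directed=False)
q_crossed = modular_coherence([(0, 3), (1, 4)], module_of, directed=False)
```

In this toy example the two disconnected triangles reach the upper bound of Q, the bridge edge lowers the score, and a network whose edges only run between the two modules reaches the lower bound.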

In the analysis of software structures, this metric is useful because in many cases the modules are already defined by programming constructs such as classes, files, namespaces or packages. The Q-metric can thus be used to study how well the cluster structures in the dependency network correspond to the package decomposition. Furthermore, it allows us to study the evolution of the modular coherence of a number of Open Source Software projects (details in the paper). Two examples of projects whose modular coherence decreases and increases over time, respectively, are shown in Figure 6.

Two examples of the evolution of OSS dependency networks
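
To illustrate how such a score could be tracked across releases, the following sketch derives the module of each class from its fully qualified Java name and computes one score per snapshot. It is a minimal illustration under stated assumptions, not the tooling used in the paper: the helper names package_of and coherence, the normalized form of Q, and the two release snapshots are all hypothetical.

```python
from collections import Counter


def package_of(class_name):
    """Package part of a fully qualified class name ('' if none).

    Simplification: a nested class written as Outer.Inner would be
    attributed to a pseudo-package ending in 'Outer'.
    """
    return class_name.rpartition(".")[0]


def coherence(dependencies):
    """Normalized Q for directed (source_class, target_class) pairs.

    Assumption: Q = (observed - expected) / (1 - expected), with
    packages as modules; the paper gives the exact definition.
    """
    pairs = [(package_of(s), package_of(t)) for s, t in dependencies]
    m = len(pairs)
    if m == 0:
        raise ValueError("no dependencies given")
    observed = sum(1 for p, q in pairs if p == q) / m
    out_deg = Counter(p for p, _ in pairs)   # dependencies leaving a package
    in_deg = Counter(q for _, q in pairs)    # dependencies entering a package
    expected = sum(out_deg[k] * in_deg[k] for k in out_deg) / (m * m)
    # A single package leaves Q undefined; 0.0 is an illustrative convention.
    return 0.0 if expected >= 1.0 else (observed - expected) / (1.0 - expected)


# Hypothetical dependency snapshots of one project at two releases.
v1 = [
    ("app.ui.View", "app.ui.Widget"),
    ("app.ui.Widget", "app.ui.View"),
    ("app.core.Engine", "app.core.Model"),
    ("app.core.Model", "app.core.Engine"),
]
# Release 2.0 adds two dependencies that cross package boundaries.
v2 = v1 + [
    ("app.ui.View", "app.core.Model"),
    ("app.core.Engine", "app.ui.Widget"),
]
releases = {"1.0": v1, "2.0": v2}

trend = {version: coherence(deps) for version, deps in releases.items()}
for version, q in trend.items():
    print(f"release {version}: Q = {q:.3f}")
```

In this made-up history the score drops from one release to the next because the added dependencies cross package boundaries, which is the kind of decreasing trend the figure describes.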

Conclusion

The results presented in our paper indicate that the Q-metric known from the analysis of cluster structures in network science is a promising and reasonable approach to quantify the coherence between the package decomposition of large software projects and their dependency structures. As such, it constitutes a macroscopic measure that allows us to monitor and evaluate software engineering processes and reason about the sustainability of software architectures. In particular, it provides a simple mapping from local development activities to their respective impact on the mesoscopic and macroscopic structures of software systems.

Selected Publications

A network perspective on software modularity, 2012

Zanetti, Marcelo Serrano; Schweitzer, Frank