Modularity maximization considered harmful
The most widespread method for community detection is modularity maximization [2], which also happens to be one of the most problematic. This method is based on the modularity function,

$$Q(\boldsymbol A, \boldsymbol b) = \frac{1}{2E}\sum_{ij}\left(A_{ij} - \frac{k_ik_j}{2E}\right)\delta_{b_i,b_j},$$

where $A_{ij}$ are the entries of the adjacency matrix, $k_i=\sum_jA_{ij}$ is the degree of node $i$, $b_i$ is the community membership of node $i$, $E$ is the total number of edges, and $\delta_{b_i,b_j}$ is the Kronecker delta.

The motivation behind the modularity function is that it compares the existence of an edge $(i,j)$ to the probability of it existing according to a null model, $P_{ij}=k_ik_j/2E$, namely that of the configuration model: a random graph that preserves the degree sequence but is otherwise unstructured. A partition with a large value of $Q$ is therefore one where many more edges fall within communities than this null model would predict.
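To make the definition concrete, here is a minimal sketch of how $Q$ can be computed directly from the adjacency matrix (plain NumPy; the example graph and partition are arbitrary choices for illustration):

```python
import numpy as np

def modularity(A, b):
    """Q(A, b) = (1/2E) * sum_ij (A_ij - k_i*k_j/2E) * delta(b_i, b_j)."""
    k = A.sum(axis=1)                  # degrees k_i
    two_E = k.sum()                    # 2E (each edge counted twice)
    delta = b[:, None] == b[None, :]   # Kronecker delta over memberships
    return ((A - np.outer(k, k) / two_E) * delta).sum() / two_E

# Example: two triangles joined by a single edge, split into the two
# obvious communities.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1
b = np.array([0, 0, 0, 1, 1, 1])
print(modularity(A, b))  # ~0.357: more intra-community edges than the null model expects
```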
Despite its widespread adoption, this approach suffers from a variety of serious conceptual and practical flaws, which have been documented extensively [9]. The most problematic one is that it purports to use an inferential criterion (a deviation from a null generative model) but is in fact merely descriptive. As was recognized early on, the method categorically fails at its own stated goal, since it always finds high-scoring partitions in networks sampled from its own null model [5].
The reason for this failure is that the method does not take into account the deviation from the null model in a statistically consistent manner. The modularity function is just a re-scaled version of the assortativity coefficient [10], a correlation measure of the community assignments seen at the endpoints of the edges in the network. We should expect such a correlation value to be close to zero for a partition that is determined before the edges of the network are placed according to the null model, or, equivalently, for a partition chosen at random. However, it is quite a different matter to find a partition that optimizes the value of $Q$ after the edges are known: by searching over the vast space of possible partitions, the method will find spurious groupings that correlate with the particular realization of the edges, even when the network is completely random.
We demonstrate this problem in Figure 1, where we show the distribution of modularity values obtained with samples of a uniform configuration model.
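A minimal sketch of the same kind of experiment, using networkx (the graph size, degree sequence, and greedy optimizer are my own choices, not necessarily those behind the figure):

```python
import networkx as nx
from networkx.algorithms import community

# Sample a configuration model: the very null model assumed by modularity.
deg = [5] * 1000                                    # arbitrary degree sequence
G = nx.Graph(nx.configuration_model(deg, seed=42))  # collapse parallel edges
G.remove_edges_from(nx.selfloop_edges(G))

# A partition fixed *before* looking at the edges scores Q ~ 0, as expected...
fixed_part = [set(range(500)), set(range(500, 1000))]
print(community.modularity(G, fixed_part))          # close to zero

# ...but *optimizing* Q over partitions finds high scores in the same
# (structureless) network: this is the overfitting in action.
opt_part = community.greedy_modularity_communities(G)
print(community.modularity(G, opt_part))            # clearly positive
```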

Somewhat paradoxically, another problem with modularity maximization is that in addition to systematically overfitting, it also systematically underfits. This occurs via the so-called “resolution limit”: in a connected network¹ the method cannot find more than $\sqrt{2E}$ communities, where $E$ is the total number of edges, regardless of how well defined the communities actually are.
¹ Modularity maximization, like many descriptive community detection methods, will always place connected components in different communities. This is another clear distinction from inferential approaches, since fully random models, without any latent community structure, can generate disconnected networks if they are sufficiently sparse. From an inferential point of view, it is therefore incorrect to assume that every connected component must belong to a different community.
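The classic ring-of-cliques construction makes the resolution limit easy to verify directly. The following sketch (networkx, with sizes chosen so that the effect appears) shows modularity preferring merged cliques over the intuitive one-community-per-clique partition:

```python
import math
import networkx as nx
from networkx.algorithms import community

# Ring of 24 cliques of 5 nodes each, adjacent cliques joined by one edge.
# Note 24 > sqrt(2E) ~ 23, so the per-clique partition cannot be optimal.
G = nx.ring_of_cliques(24, 5)
print(G.number_of_edges(), math.sqrt(2 * G.number_of_edges()))  # 264, ~23.0

# Nodes come in consecutive blocks of 5 per clique.
per_clique   = [set(range(5 * i, 5 * i + 5)) for i in range(24)]
merged_pairs = [set(range(10 * i, 10 * i + 10)) for i in range(12)]

print(community.modularity(G, per_clique))    # ~0.867
print(community.modularity(G, merged_pairs))  # ~0.871: merging scores *higher*
```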

These two problems, overfitting and underfitting, can occur in tandem: portions of the network dominated by randomness are spuriously revealed to contain communities, whereas other portions with clear modular structure have it obscured. The result is a very unreliable method for capturing the structure of heterogeneous networks. We demonstrate this in Figure 2 (c) and (d).
In addition to these major problems, modularity maximization also often possesses a degenerate landscape of solutions, with very different partitions having similar values of $Q$, and no obvious way to choose among them.
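This degeneracy is easy to observe with any stochastic modularity heuristic; for instance (networkx's Louvain implementation, on an arbitrary example graph):

```python
import networkx as nx
from networkx.algorithms import community

# Run a stochastic modularity heuristic repeatedly on the same network:
# different seeds often return different partitions with near-identical Q.
G = nx.les_miserables_graph()
for seed in range(5):
    part = community.louvain_communities(G, seed=seed)
    Q = community.modularity(G, part)
    print(f"seed={seed}: {len(part)} communities, Q = {Q:.4f}")
```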
The combined effects of underfitting and overfitting can make the results obtained with the method unreliable and difficult to interpret. As a demonstration of the systematic nature of the problem, in Figure 3 (a) we show the number of communities obtained using modularity maximization for 263 empirical networks of various sizes, belonging to different domains, obtained from the Netzschleuder repository. Since the networks considered are all connected, the values are always below $\sqrt{2E}$, the ceiling imposed by the resolution limit.
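The $\sqrt{2E}$ ceiling can be checked on any connected network; a minimal sketch (again networkx, with an arbitrary example graph standing in for the repository networks):

```python
import math
import networkx as nx
from networkx.algorithms import community

# Number of communities found by modularity maximization vs. the
# sqrt(2E) ceiling from the resolution limit, for a connected network.
G = nx.karate_club_graph()     # stand-in for a repository network
part = community.greedy_modularity_communities(G)
bound = math.sqrt(2 * G.number_of_edges())
print(f"{len(part)} communities found; sqrt(2E) = {bound:.1f}")
# The count never exceeds the bound, as described in the text.
```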

The systematic overfitting of modularity maximization (as well as of other descriptive methods such as Infomap) has also been demonstrated recently in [12], from the point of view of edge prediction, on a separate empirical dataset of 572 networks from various domains.
Although many of the problems with modularity maximization have long been known, for some time there were no principled solutions to them; this is no longer the case. In the table below we summarize some of the main problems with modularity maximization and how they are solved by inferential approaches.
| Problem | Principled solution via inference |
| --- | --- |
| Modularity maximization overfits, and finds modules in fully random networks [5]. | Bayesian inference of the SBM is designed from the ground up to avoid this problem in a principled way, and systematically succeeds [13] (see the sketch after this table). |
| Modularity maximization has a resolution limit, and finds at most $\sqrt{2E}$ groups in a connected network. | Inferential approaches with hierarchical priors [14] [15] or strictly assortative structures [11] do not have any appreciable resolution limit, and can find a maximum number of groups that scales as $O(N/\log N)$. |
| Modularity maximization has a characteristic scale, and tends to find communities of similar size; in particular, with the same sum of degrees. | Hierarchical priors can be specifically chosen to be a priori agnostic about characteristic sizes, densities of groups, and degree sequences [15], such that these are not imposed, but instead obtained from inference, in an unbiased way. |
| Modularity maximization can only find strictly assortative communities. | Inferential approaches can be based on any generative model. The general SBM will find any kind of mixing pattern in an unbiased way, and has no problem identifying modular structure in bipartite networks, core-periphery networks, and any mixture of these or other patterns. There are also specialized versions for bipartite [16], core-periphery [17], and assortative patterns [11], if these are being searched for exclusively. |
| The solution landscape of modularity maximization is often degenerate, with many different partitions having close to the same modularity value [7], and with no clear way to select between them. | Inferential methods are characterized by a posterior distribution of partitions. The consensus or dissensus between the different solutions [18] can be used to determine how many cohesive hypotheses can be extracted from the inference, and to what extent the model being used is a good or poor fit for the network. |
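As an illustration of the first row above, here is a minimal sketch using graph-tool's SBM inference functions (the graph and its parameters are arbitrary choices of mine): on a fully random graph, the inferred number of groups should be close to one, in contrast to the many spurious groups that modularity maximization reports.

```python
import graph_tool.all as gt

# Fit an SBM to a fully random (5-regular) graph: Bayesian inference finds
# no statistical evidence for communities, and returns (close to) one group.
g = gt.random_graph(1000, lambda: 5, directed=False)
state = gt.minimize_blockmodel_dl(g)
print(state.get_nonempty_B())   # typically 1

# The nested variant with hierarchical priors (no resolution limit) agrees:
nstate = gt.minimize_nested_blockmodel_dl(g)
print(nstate.levels[0].get_nonempty_B())
```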
Because of the above problems, the use of modularity maximization should be discouraged, since it is demonstrably not fit for purpose as an inferential method. As a consequence, the use of modularity maximization in any recent network analysis can arguably be considered a “red flag” that strongly indicates methodological carelessness. In the absence of secondary evidence supporting the alleged community structures found, or of extreme care taken to counteract the method's several limitations, the safest assumption is that results obtained with it contain a substantial amount of noise, rendering any inferential conclusion derived from them highly suspect.
As a final note, we focus on modularity here not only because of its widespread adoption but also because of its emblematic character: at a fundamental level, all of its shortcomings are shared with every descriptive method in the literature, to varying but always non-negligible degrees.
References