Networks
form the substrate of a dominating class of large-scale
complex systems, encompassing socio-economical relations, the
human brain, cell metabolism, ecosystems, informational
infrastructure, and many more. Being high-dimensional,
heterogeneously structured, and often large relational
objects, networks require special analytical and
methodological frameworks.
I focus on the development of methods to extract scientific
understanding from network data, as well as the mathematical
modeling of network behavior and evolution. I'm particularly
interested in problems of network inference, where meaningful
structural and functional patterns cannot be obtained by
direct inspection or low-order statistics, and require instead
more sophisticated approaches based on large-scale generative
models and efficient algorithms derived from them. In more
demanding scenarios, the network data are noisy, incomplete,
or even completely hidden, leaving their trace only via an
observed dynamical behavior — in which case the network needs
to be fully reconstructed from indirect information.
A central aim of my research is to obtain robust methods
and theory that are derived from fundamental principles, while
being at the same time algorithmically efficient and
interpretable. This is achieved by merging analytical tools
and concepts from a variety of disciplines,
including Statistical
Mechanics,Information Theory,
and Bayesian Statistics.
Many of the methods developed in my work are made available
as part of
the graph-tool
library, which
is extensively
documented. For a practical introduction to many inference
and reconstruction algorithms, please refer to
the HOWTO.
Research highlights
Inferring modular structures in networks
We develop principled
methods to infer the hierarchical, modular structure of
networks, based on generative models and Bayesian
inference. Our approaches are efficient (scaling up to
huge networks) and robust. In particular they are able to
avoid both overfitting and underfitting the data. See
review [B2] for an
introduction, and
the HOWTO
documentation
for graph-tool. See
also
[24,20,33,23,42,43].
Annotated and attributed networks
Network data are often annotated with weights
or covariates on the edges, or metadata on the nodes. We
develop inference approaches that are able to leverage
this formation to uncover latent, statistically meaningful
network structures. Our perspective is that such
annotations are just more data — not “ground truth”
— and hence are also subject to noise,
incompleteness, irrelevance, etc. See
[37,31].
Dynamical networks
In many instances, networks are dynamical
objects and their structure evolves in time. We develop
inference methods that are able to characterize how the
large-scale structure dynamically changes. Importantly,
instead of imposing a priori characteristic time scales,
we extract the relevant scales from data by formulating
arbitrary-order dynamical models, within a nonparametric
Bayesian inference framework. See
[B1,36,34,28].
Uncertain network reconstruction
As is unavoidable in any empirical setting,
network data are subject to measurement uncertainties and
omissions. However, differently from more established
empirical traditions, network data often do not contain
reported error assessments of any kind. We develop
principled methods of error evaluation and network
reconstruction that are able to function even in the
demanding scenario where only a single network is
observed, and the error magnitudes are unknown. See
[39].
Reconstruction from dynamics
Certain networks are impossible, or
prohibitively expensive, to be measured directly, and we
need to infer their structure from an observed dynamics
that takes place on them. We develop Bayesian methods that
are able to achieve this reconstruction, and demonstrate
how the joint inference of modular network structure with
the network itself can significantly improve the
reconstruction from indirect dynamics as well, specially
when coupled with efficient algorithms. This amounts to a
unification of network reconstruction with community
detection — two central but traditionally isolated
problems in network science, statistics and machine
learning. See [40].
Disentangling edge formation mechanisms
Networks are often the result of a variety of
different and interdependent generative mechanisms that
operate on different scales (e.g. global or local). We are
able to show that it is possible, in key cases, to
decompose the contributions of each mechanism, based only
on the traces they leave behind on the network
structure. In particular we show how homophily and triadic
closure can be disentangled from each other, given only a
single network snapshot. This has important consequences
to the interpretation of community detection methods,
since the effects of both mechanisms are often
conflated. See [50].