SoBigData Event

Evaluating the significance of network observables with a maximum entropy-based approach

Biological networks are powerful resources for discovering and understanding the mechanisms that underlie complex human diseases. A crucial step in the statistical analysis of a network is the design of proper null models, which make it possible to identify significant patterns in the data while accounting for general structural features of the network, such as degree sequence and connectivity. These null models are often defined only implicitly, through numerical randomization of the network edges, which is expensive to compute and can introduce non-trivial biases in the distribution the procedure converges to.
 
Maximum entropy approaches make it possible to obtain, in an unbiased way, closed-form expressions for the null distribution of graphs satisfying a given set of structural constraints. A methodology proposed by Squartini et al. (2011), based on maximum entropy, provides a systematic way to evaluate the significance of any network feature analytically, avoiding expensive sampling procedures.
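
As a rough illustration of the idea, the sketch below fits a maximum entropy null model to a small undirected, unweighted graph and computes an analytical z-score without any sampling. It assumes the simplest case, where the only constraint is the degree sequence, so that each edge is an independent Bernoulli variable with probability p_ij = x_i x_j / (1 + x_i x_j). The toy graph, the function names, and the plain gradient-ascent solver are illustrative choices made here; they are not taken from the seminar or from the package it presents.

```python
import numpy as np


def fit_configuration_model(degrees, tol=1e-10, max_iter=100_000):
    """Fit edge probabilities whose expected degrees match `degrees`.

    Parametrisation (assumed here): p_ij = 1 / (1 + exp(theta_i + theta_j)),
    i.e. x_i = exp(-theta_i). The log-likelihood is concave in theta, so
    plain gradient ascent with a small fixed step is enough for a sketch.
    """
    k = np.asarray(degrees, dtype=float)
    n = k.size
    theta = np.zeros(n)
    step = 2.0 / (n - 1)  # conservative step based on a bound on the curvature
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(theta[:, None] + theta[None, :]))
        np.fill_diagonal(p, 0.0)  # no self-loops
        grad = p.sum(axis=1) - k  # expected degree minus observed degree
        if np.max(np.abs(grad)) < tol:
            break
        theta += step * grad
    return p


def edge_count_zscore(adjacency, p, nodes):
    """Analytical z-score of the number of edges among `nodes`.

    Under the null model the edge count is a sum of independent Bernoulli
    variables, so its mean and variance are available in closed form.
    """
    idx = np.asarray(nodes)
    upper = np.triu_indices(idx.size, k=1)  # each node pair counted once
    a_sub = adjacency[np.ix_(idx, idx)][upper]
    p_sub = p[np.ix_(idx, idx)][upper]
    mean = p_sub.sum()
    std = np.sqrt((p_sub * (1.0 - p_sub)).sum())
    return (a_sub.sum() - mean) / std


# Hypothetical toy graph: a triangle (0, 1, 2) attached to a short path.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5)]
A = np.zeros((6, 6), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

degrees = A.sum(axis=1)
p = fit_configuration_model(degrees)
z = edge_count_zscore(A, p, [0, 1, 2])
print("expected degrees:", np.round(p.sum(axis=1), 6))
print("z-score of edges among nodes 0, 1, 2:", z)
```

The same pattern extends to other observables that can be written in terms of the edge variables: the expected value and variance are computed from the fitted probabilities, and the observed value is compared against them.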
 
In this seminar, Maiorino will introduce the basic concepts of maximum entropy modelling and the proposed methodology. He will also present "claude", the Python package he is developing to implement the methodology, and walk through several use cases and examples.