Schedule for: 21w5108 - Random Graphs and Statistical Inference: New Methods and Applications (Online)
Beginning on Sunday, August 8 and ending Friday, August 13, 2021
All times in Banff, Alberta time, MDT (UTC-6).
Monday, August 9 | |
---|---|
06:55 - 07:05 |
Introduction and Welcome by BIRS Staff ↓ A brief introduction to BIRS with important logistical information, technology instruction, and opportunity for participants to ask questions. (Online) |
07:05 - 07:55 | Allan Sly: Localization for Random Geometric Graphs (Online) |
08:15 - 08:40 |
Charilaos Efthymiou: On sampling symmetric Gibbs distributions on sparse random (hyper)graphs with contiguity ↓ In this paper, we present a polynomial-time algorithm for approximate sampling from symmetric Gibbs distributions on sparse random graphs and hypergraphs. The examples include (but are not restricted to) some important distributions on spin systems and spin glasses. We consider the $q$-state antiferromagnetic Potts model for $q\geq 2$, including (hyper)graph colourings. We also consider the uniform distribution over the Not-All-Equal solutions of a random $k$-SAT instance. Finally, we consider sampling from the {\em spin-glass} distribution called the $k$-spin model. The $k$-spin model is essentially a diluted version of the well-known Sherrington-Kirkpatrick model. To our knowledge, this is the first rigorously analysed efficient algorithm for spin glasses which operates in a non-trivial range of the parameters of the distribution.
We make major progress regarding the impressive predictions of physicists relating phase transitions of Gibbs distributions with the efficiency of the corresponding sampling algorithms. We present what we believe to be an elegant sampling algorithm for symmetric Gibbs distributions. Our algorithm is unique in its approach and does not belong to any of the well-known families of sampling algorithms. We derive this algorithm by investigating the power and the limits of the approach that was introduced in [Efthymiou: SODA 2012].
For the cases we consider here, our algorithm outperforms by far all other sampling algorithms in terms of the permitted range of the parameters of the Gibbs distributions. For the graph case, we show that our algorithm operates in a range of the parameters that coincides with the (in many cases conjectured) tree-uniqueness region, parametrised w.r.t. the {\em expected degree} of the graph. For the hypergraph case, where uniqueness is less restrictive for algorithmic purposes, it operates beyond the uniqueness region.
For a symmetric Gibbs distribution $\mu$ on the random (hyper)graph whose parameters are within an appropriate range, our sampling algorithm has the following properties: with probability $1-o(1)$ over the instances of the input (hyper)graph, it generates a configuration which is distributed within total variation distance $n^{-\Omega(1)}$ from $\mu$. The time complexity is $O((n\log n)^2)$, where $n$ is the size of the input (hyper)graph.
We give new insight into the problem by exploiting phenomena of the Gibbs distributions that were not previously used by sampling algorithms. Our approach allows us to utilise the notion of {\em contiguity} between Gibbs distributions and the so-called {\em teacher-student model}, with the latter distribution also known in various contexts as the {\em planted model}. This brings together tools and notions from sampling and statistical inference algorithms. (Online) |
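The contiguity-based sampler itself is not spelled out in the abstract. As a baseline illustration of the setting only, here is a minimal Python sketch (function names hypothetical) of single-site Glauber dynamics for the $q$-state antiferromagnetic Potts model on a sparse Erdős-Rényi graph; this is a standard MCMC sampler, not the algorithm of the talk.

```python
import math
import random

def sparse_random_graph(n, d, rng):
    """G(n, d/n): each edge is present independently with probability d/n."""
    adj = [[] for _ in range(n)]
    p = d / n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj

def glauber_potts(adj, q, beta, steps, rng):
    """Single-site Glauber dynamics for the q-state antiferromagnetic Potts
    model, P(sigma) proportional to exp(-beta * #monochromatic edges)."""
    n = len(adj)
    sigma = [rng.randrange(q) for _ in range(n)]
    for _ in range(steps):
        v = rng.randrange(n)
        # Resample sigma[v] from its conditional distribution given the neighbours.
        weights = []
        for c in range(q):
            mono = sum(1 for u in adj[v] if sigma[u] == c)
            weights.append(math.exp(-beta * mono))
        sigma[v] = rng.choices(range(q), weights=weights)[0]
    return sigma

rng = random.Random(0)
adj = sparse_random_graph(1000, 3.0, rng)
sample = glauber_potts(adj, q=5, beta=1.0, steps=20000, rng=rng)
```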
08:42 - 08:45 | Group Photo (Online) |
08:45 - 09:10 |
Fiona Skerman: Modularity and edge-sampling ↓ We analyse when community structure of an underlying graph can be determined from an observed subset of the graph.
Modularity is a function on graphs which is used in algorithms for community detection. For a given graph G, each partition of the vertices has a modularity score, with higher values indicating that the partition better captures community structure in G. The (max) modularity q*(G) of the graph G is defined to be the maximum over all vertex partitions of the modularity score, and satisfies 0≤q*(G)≤1.
In a natural model, where we suppose edges of an underlying graph G appear independently with some probability in our observed graph G', we describe how high a sampling probability we need to infer the modularity of the underlying graph from the modularity of the observed graph.
Joint work with Colin McDiarmid. (Online) |
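For concreteness, the modularity score of a fixed partition can be computed directly from its definition; the sketch below (plain Python, hypothetical helper name) does so. Sub-sampling each edge with probability p, as in the talk's model, amounts to filtering the edge list before calling it.

```python
def modularity(edges, partition):
    """Modularity of a vertex partition.

    edges: list of (u, v) pairs; partition: dict vertex -> block id.
    Score = sum over blocks A of e(A)/m - (vol(A)/(2m))^2; the maximum
    over all partitions, q*(G), satisfies 0 <= q*(G) <= 1."""
    m = len(edges)
    internal = {}   # edges with both endpoints inside the block
    volume = {}     # sum of degrees of the block's vertices
    for u, v in edges:
        bu, bv = partition[u], partition[v]
        volume[bu] = volume.get(bu, 0) + 1
        volume[bv] = volume.get(bv, 0) + 1
        if bu == bv:
            internal[bu] = internal.get(bu, 0) + 1
    return sum(internal.get(b, 0) / m - (vol / (2 * m)) ** 2
               for b, vol in volume.items())

# Two triangles joined by one edge; the natural 2-block split scores well.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(modularity(edges, part))  # about 0.357
```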
09:10 - 09:25 | Coffee Break (Online) |
09:25 - 09:50 | Alan Frieze: Spanners in randomly weighted graphs (Online) |
09:55 - 10:20 |
Subhabrata Sen: Large deviations for dense random graphs: beyond mean-field ↓ In a seminal paper, Chatterjee and Varadhan derived an LDP for the dense Erdős-Rényi random graph, viewed as a random graphon. This directly provides LDPs for continuous functionals such as subgraph counts, spectral norms, etc. In contrast, very little is understood about this problem if the underlying random graph is inhomogeneous or constrained.
In this talk, we will explore large deviations for dense random graphs, beyond the “mean-field” setting. In particular, we will study large deviations for uniform random graphs with given degrees, and a family of dense block model random graphs. We will establish the LDP in each case, and identify the rate function. In the block model setting, we will use this LDP to study the upper tail problem for homomorphism densities of regular subgraphs. Our results establish that this problem exhibits a symmetry/symmetry-breaking transition, similar to one observed for Erdős-Rényi random graphs.
Based on joint works with Christian Borgs, Jennifer Chayes, Souvik Dhara, Julia Gaudio and Samantha Petti. (Online) |
Tuesday, August 10 | |
---|---|
07:35 - 08:00 | Po-Ling Loh: Optimal rates for community estimation in the weighted stochastic block model (Online) |
08:05 - 08:30 | Anna Paola Muntoni: Epidemic mitigation by statistical inference from contact tracing data (Online) |
08:30 - 08:45 | Coffee Break (Online) |
08:45 - 09:10 | Markus Heydenreich: Distances in Scale-free percolation (Online) |
09:15 - 09:40 | Mark Sellke: Algorithmic pure states for the negative spherical perceptron (Online) |
Wednesday, August 11 | |
---|---|
09:00 - 09:50 | Jean Barbier: Multioverlap concentration, and performance limits in Bayesian linear regression in a mismatched model (Online) |
10:10 - 10:35 |
Cecilia Holmgren: The asymptotic distribution of cluster sizes for supercritical percolation on random split trees ↓ We consider the model of random trees introduced by Devroye (1998), the so-called random split trees. The model encompasses many important randomized algorithms and data structures. We then perform supercritical Bernoulli bond-percolation on those trees and obtain the asymptotic distribution for the sizes of the largest clusters.
Joint work with Gabriel Berzunza, University of Liverpool. (Online) |
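Random binary search trees are one instance of a random split tree. The toy sketch below (hypothetical names) percolates a random BST and lists the largest cluster sizes; for trees of logarithmic height the interesting supercritical regime has p close to 1 (roughly 1 - Θ(1/log n)), an assumption of this demo rather than a statement of the talk's precise window.

```python
import random

def random_bst_parents(n, rng):
    """Random binary search tree on keys 0..n-1: insert a uniformly shuffled
    permutation; return parent pointers (parent[root] is None)."""
    keys = list(range(n))
    rng.shuffle(keys)
    root = keys[0]
    left, right, parent = {}, {}, {root: None}
    for k in keys[1:]:
        cur = root
        while True:
            nxt = left.get(cur) if k < cur else right.get(cur)
            if nxt is None:
                (left if k < cur else right)[cur] = k
                parent[k] = cur
                break
            cur = nxt
    return parent

def percolation_cluster_sizes(parent, p, rng):
    """Bernoulli bond-percolation: keep each parent edge independently with
    probability p, then return cluster sizes via union-find."""
    n = len(parent)
    uf = list(range(n))
    def find(x):
        while uf[x] != x:
            uf[x] = uf[uf[x]]  # path halving
            x = uf[x]
        return x
    for v, u in parent.items():
        if u is not None and rng.random() < p:
            uf[find(v)] = find(u)
    sizes = {}
    for v in range(n):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return sorted(sizes.values(), reverse=True)

rng = random.Random(1)
parent = random_bst_parents(10000, rng)
print(percolation_cluster_sizes(parent, p=0.9, rng=rng)[:5])  # largest clusters
```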
10:40 - 11:05 |
Joshua Erde: Expansion in the giant component of the percolated hypercube ↓ As in the case of the binomial random graph, it is known that the behaviour of a random subgraph of a d-dimensional hypercube, where we include each edge independently with probability p, undergoes a phase transition when p is around 1/d. More precisely, answering a question of Erdős and Spencer, it was shown by Ajtai, Komlós and Szemerédi that significantly above this value of p, in the supercritical regime, whp this random subgraph has a unique 'giant' component, whose order is linear in the order of the hypercube.
In the binomial random graph much more is known about the complex structure of this giant component, a lot of which can be deduced from more recent results about the likely expansion properties of the giant component. We show that whp the giant component L of a supercritical random subgraph of the d-dimensional hypercube has reasonably good expansion properties, and use this to deduce some structural information about L. In particular this leads to polynomial (in d) bounds on the diameter of L and the mixing times of a random walk on L, answering questions of Pete, and Bollobás, Kohayakawa, and Łuczak.
This is joint work with Mihyun Kang and Michael Krivelevich. (Online) |
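As a quick empirical companion, the sketch below (hypothetical function name) percolates the d-dimensional hypercube edge by edge and reports component sizes; with p = c/d for c > 1 one should see a single component of size linear in 2^d, as in the Ajtai, Komlós and Szemerédi result quoted above.

```python
import random
from collections import deque

def percolated_hypercube_components(d, p, rng):
    """Keep each edge of the d-dimensional hypercube Q_d independently with
    probability p; return component sizes found by BFS over kept edges."""
    n = 1 << d
    kept = set()
    for v in range(n):
        for i in range(d):
            u = v ^ (1 << i)        # flip coordinate i
            if v < u and rng.random() < p:
                kept.add((v, u))    # each undirected edge sampled once
    seen = [False] * n
    sizes = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        size, queue = 1, deque([s])
        while queue:
            v = queue.popleft()
            for i in range(d):
                u = v ^ (1 << i)
                if not seen[u] and (min(u, v), max(u, v)) in kept:
                    seen[u] = True
                    size += 1
                    queue.append(u)
        sizes.append(size)
    return sorted(sizes, reverse=True)

rng = random.Random(0)
d = 12
print(percolated_hypercube_components(d, p=2.0 / d, rng=rng)[:3])
```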
11:05 - 11:20 | Coffee Break (Online) |
11:20 - 11:45 |
Ilias Zadik: The All-or-Nothing Phenomenon: the Case of Sparse Tensor PCA ↓ In this talk, we will discuss a relatively new sharp statistical phase transition, known as the all-or-nothing (AoN) phenomenon. The AoN phenomenon concerns the fact that in certain sparse inference models the asymptotic minimum mean squared error exhibits a sharp phase transition at some critical threshold; below the threshold, it is “trivial” (i.e. not beating the random guess), and above it, it is zero. This is in sharp contrast with multiple well-studied dense inference models, such as Wigner PCA or the stochastic block model, where the transition is either believed or proven to be continuous. In this talk, we will primarily focus on the details of the AoN phenomenon in the specific context of sparse tensor PCA. This is joint work with Jonathan Niles-Weed. (Online) |
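A minimal sketch of the sparse tensor PCA observation model (numpy; hypothetical names, and the sparsity/normalisation conventions are an assumption here since they vary across papers), useful for simulating the estimation problem whose MMSE exhibits the AoN transition:

```python
import numpy as np

def sparse_tensor_pca_sample(n, k_sparsity, order, lam, rng):
    """Observation Y = lam * x⊗...⊗x + Gaussian noise, where x is supported
    on k_sparsity coordinates with +-1/sqrt(k) entries (one common convention)."""
    x = np.zeros(n)
    support = rng.choice(n, size=k_sparsity, replace=False)
    x[support] = rng.choice([-1.0, 1.0], size=k_sparsity) / np.sqrt(k_sparsity)
    signal = x
    for _ in range(order - 1):
        signal = np.multiply.outer(signal, x)   # rank-one order-k tensor
    noise = rng.standard_normal(signal.shape)   # i.i.d. noise; symmetrisation conventions vary
    return lam * signal + noise, x

rng = np.random.default_rng(0)
Y, x = sparse_tensor_pca_sample(n=50, k_sparsity=5, order=3, lam=10.0, rng=rng)
```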
11:50 - 12:15 |
Aaditya Ramdas: Reverse martingales and exchangeable filtrations ↓ This talk will mix a completed work with an open direction. The completed work will focus on sequential estimation of convex divergences and functionals, by proving that convex functionals are reverse submartingales with respect to the "exchangeable filtration" (https://arxiv.org/abs/2103.09267). The open direction is whether such techniques are useful in any way for sequences of random graphs, where the graph structure (forgetting the order of nodes) plays the role of the aforementioned filtration. This is joint work with a PhD student at CMU, Tudor Manole, and the abstract of the above paper is presented below for completeness.
We present a unified technique for sequential estimation of convex divergences between distributions, including integral probability metrics like the kernel maximum mean discrepancy, f-divergences like the Kullback-Leibler divergence, and optimal transport costs, such as powers of Wasserstein distances. The technical underpinnings of our approach lie in the observation that empirical convex divergences are (partially ordered) reverse submartingales with respect to the exchangeable filtration, coupled with maximal inequalities for such processes. These techniques appear to be powerful additions to the existing literature on both confidence sequences and convex divergences. We construct an offline-to-sequential device that converts a wide array of existing offline concentration inequalities into time-uniform confidence sequences that can be continuously monitored, providing valid inference at arbitrary stopping times. The resulting sequential bounds pay only an iterated logarithmic price over the corresponding fixed-time bounds, retaining the same dependence on problem parameters (like dimension or alphabet size if applicable). (Online) |
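As a toy illustration of the objects involved, the sketch below (hypothetical names) tracks the plug-in 1-Wasserstein distance between the empirical measures of two sample streams; such empirical convex divergences are the reverse submartingales, with respect to the exchangeable filtration, that the paper's confidence sequences monitor. This is an illustrative computation only, not the paper's boundary construction.

```python
import numpy as np

def running_w1(xs, ys):
    """Plug-in 1-Wasserstein distance W1(P_n, Q_n) between the empirical
    measures of the first n points of each stream, for every n. In one
    dimension, W1 between two n-point empirical measures is the mean
    absolute difference of order statistics."""
    out = []
    for n in range(1, len(xs) + 1):
        a = np.sort(xs[:n])
        b = np.sort(ys[:n])
        out.append(np.mean(np.abs(a - b)))
    return out

rng = np.random.default_rng(0)
xs = rng.normal(0.0, 1.0, 500)
ys = rng.normal(0.5, 1.0, 500)   # true W1 between the laws is 0.5
print(running_w1(xs, ys)[-1])
```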
Thursday, August 12 | |
---|---|
06:05 - 06:30 | Guangyan Zhou: Hiding solutions in model RB: Forced instances are almost as hard as unforced ones (Online) |
06:35 - 07:00 |
Max Hahn-Klimroth: Almost optimal efficient decoding from sparse pooled data ↓ In the pooled data problem, one tries to recover a signal x ∈ {0, 1, 2, …, d}^n from a measurement matrix A and the results of joint measurements Ax = y. It is therefore a special case of the famous compressed sensing problem with a discrete signal, and a generalisation of the frequently studied quantitative group testing problem (d = 1). The task becomes particularly interesting if the signal is sparse (||x||_0 = k << n). It is a well-known fact that there are exponential-time constructions to recover x from (A, y) with no more than O(k) measurements, but all known efficient (polynomial-time) constructions require at least Ω(k ln(n)) measurements, and it was believed that closing this gap is hard. We show that this additional ln(n) factor is actually artificial by providing a randomised efficient construction A_{SC}, coming with an efficient decoding algorithm, that recovers x from (A_{SC}, y) with high probability with no more than O(k) measurements. This is based on joint work with Noela Müller. (Online) |
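Generating a pooled-data instance is straightforward; the sketch below (numpy, hypothetical names) uses a generic i.i.d. Bernoulli(1/2) pooling matrix as an assumption, whereas the talk's A_{SC} is a specially structured construction not reproduced here.

```python
import numpy as np

def pooled_data_instance(n, k, d, m, rng):
    """A k-sparse signal x in {0,...,d}^n and m pooled measurements y = A x."""
    x = np.zeros(n, dtype=int)
    support = rng.choice(n, size=k, replace=False)
    x[support] = rng.integers(1, d + 1, size=k)   # nonzero entries in {1,...,d}
    A = rng.integers(0, 2, size=(m, n))           # which items join which pool
    return A, x, A @ x

rng = np.random.default_rng(0)
A, x, y = pooled_data_instance(n=1000, k=30, d=5, m=200, rng=rng)
print(y[:10])
```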
07:00 - 07:15 | Coffee Break (Online) |
07:15 - 07:40 |
Oliver Cooley: Subcriticality and the Warning Propagation method ↓ There are by now many proofs of combinatorial results for random discrete structures which, in one form or another, follow a common scheme:
1. Run an appropriate recursive algorithm for a large, bounded number of steps;
2. Show that subsequently few further changes will be made before the algorithm terminates.
Examples include analysis of the k-core of random graphs and hypergraphs, as well as random constraint satisfaction problems. The second step typically involves showing that the propagation of subsequent changes can be modelled by some subcritical process which quickly dies out, and has so far been carried out on a case-by-case basis.
We approach the problem from a very general point of view using the Warning Propagation message passing scheme, and show that under mild assumptions, this behaviour is indeed universal. This work was motivated by our study of the sparse parity matrix, but it generalises and unifies many previous approaches, and we anticipate that it will prove useful as a tool in future applications of the method.
This is based on joint work with Joon Lee and Jean Bernoulli Ravelomanana. (Online) |
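Warning Propagation is easiest to see in its classical k-core incarnation. The sketch below (hypothetical names) runs the standard WP fixed-point iteration on a sparse random graph; the talk's contribution is a general subcriticality analysis of such schemes, not this specific recursion.

```python
import random

def warning_propagation_kcore(adj, k, rounds):
    """WP for the k-core: message m[u->v] = 1 iff at least k-1 of u's other
    neighbours send u a 1, i.e. u would survive even without v. Start from
    all-ones and iterate to a fixed point; v is in the k-core iff it
    receives at least k ones."""
    msgs = {(u, v): 1 for u in range(len(adj)) for v in adj[u]}
    for _ in range(rounds):
        new = {}
        for (u, v) in msgs:
            support = sum(msgs[(w, u)] for w in adj[u] if w != v)
            new[(u, v)] = 1 if support >= k - 1 else 0
        if new == msgs:   # fixed point reached
            break
        msgs = new
    return [v for v in range(len(adj))
            if sum(msgs[(w, v)] for w in adj[v]) >= k]

# Demo on G(n, d/n).
rng = random.Random(0)
n, d = 2000, 5.0
adj = [[] for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        if rng.random() < d / n:
            adj[u].append(v)
            adj[v].append(u)
print(len(warning_propagation_kcore(adj, k=3, rounds=100)))
```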
07:45 - 08:10 |
Alex Wein: Low-Degree Hardness of Maximum Independent Set ↓ In high-dimensional statistical problems (including planted clique, sparse PCA, community detection, etc.), the class of "low-degree polynomial algorithms" captures many leading algorithmic paradigms such as spectral methods, approximate message passing, and local algorithms on sparse graphs. As such, lower bounds against low-degree algorithms constitute concrete evidence for average-case hardness of statistical problems. This method has been widely successful at explaining and predicting statistical-to-computational gaps in these settings.
While prior work has understood the power of low-degree algorithms for problems with a "planted" signal, we consider here the setting of "random optimization problems" (with no planted signal), including the problem of finding a large independent set in a sparse random graph, as well as the problem of optimizing the Hamiltonian of mean-field spin glass models. Focusing on the independent set problem, we show that low-degree algorithms can find independent sets of "half-optimal" size (log d/d)n but no larger, which explains the failure of all known poly-time algorithms beyond this threshold. The proof of the lower bound leverages stability properties of low-degree polynomials along with a variant of the so-called "ensemble multi-overlap gap property", which is a structural property of the solution space.
Based on arXiv:2004.12063 (joint with David Gamarnik and Aukosh Jagannath) and arXiv:2010.06563 (Online) |
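The "half-optimal" threshold has a classical algorithmic witness: plain greedy. The sketch below (hypothetical names) is that baseline, not a low-degree polynomial algorithm; it illustrates the size (log d/d)n that, by the lower bound above, low-degree methods cannot beat.

```python
import math
import random

def greedy_independent_set(adj, rng):
    """Greedy baseline: scan vertices in random order and keep any vertex
    with no neighbour already selected. On G(n, d/n) this is expected to
    reach roughly (log d / d) * n, the 'half-optimal' size in the abstract."""
    order = list(range(len(adj)))
    rng.shuffle(order)
    chosen = set()
    for v in order:
        if all(u not in chosen for u in adj[v]):
            chosen.add(v)
    return chosen

# Demo on a sparse Erdős-Rényi graph G(n, d/n).
rng = random.Random(0)
n, d = 3000, 20.0
adj = [[] for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        if rng.random() < d / n:
            adj[u].append(v)
            adj[v].append(u)
print(len(greedy_independent_set(adj, rng)), round(n * math.log(d) / d))
```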
08:15 - 08:40 | Shuangping Li: Proof of the Contiguity Conjecture and Lognormal Limit for the Symmetric Perceptron (Online) |
Friday, August 13 | |
---|---|
07:00 - 07:50 | Afonso Bandeira: Matrix Concentration and Free Probability (Online) |
08:10 - 08:35 | Matthew Aldridge: Three random graph designs for group testing (Online) |
08:40 - 09:05 |
Jean Ravelomanana: The Sparse Parity Matrix ↓ Let A be an n×n matrix over F_2 each of whose entries equals 1 with probability d/n independently, for a fixed d > 0. Draw a vector y randomly from the column space of A. It is a simple observation that the entries of a random solution x to Ax = y are asymptotically pairwise independent. (Online) |
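The underlying model is straightforward to instantiate; a minimal sketch (numpy, hypothetical names) for generating an instance and experimenting with solutions of Ax = y over F_2:

```python
import numpy as np

def sparse_parity_instance(n, d, rng):
    """A is n x n over F_2 with i.i.d. Bernoulli(d/n) entries. Taking
    y = A x0 mod 2 for a uniformly random x0 yields a y that is uniform
    over the column space of A."""
    A = (rng.random((n, n)) < d / n).astype(int)
    x0 = rng.integers(0, 2, size=n)
    y = (A @ x0) % 2
    return A, x0, y

rng = np.random.default_rng(0)
A, x0, y = sparse_parity_instance(n=500, d=3.0, rng=rng)
```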