[Ml-stat-talks] Four talks in ORFE this week

Ramon van Handel rvan at Princeton.EDU
Sun Feb 5 22:46:57 EST 2017

Dear all, we have the unusual pleasure of having four (!) talks in ORFE 
this week, all of which could be of significant interest to people on this 
list. Please see below for the titles and abstracts. Best wishes, -- Ramon


Monday, February 6, 2017 - 4:30 PM - Sherrerd Hall 101

SPEAKER: Miki Racz (Microsoft)

TITLE: Statistical Inference in Networks and Genomics

ABSTRACT: From networks to genomics, large amounts of data are 
increasingly available and play critical roles in helping us understand 
complex systems. Statistical inference is crucial in discovering the 
underlying structures present in these systems, whether this concerns the 
time evolution of a network, an underlying geometric structure, or 
reconstructing a DNA sequence from partial and noisy information. In this 
talk I will discuss several fundamental detection and estimation problems 
in these areas.

I will present an overview of recent developments in source detection and 
estimation in randomly growing graphs. For example, can one detect the 
influence of the initial seed graph? How good are root-finding algorithms? 
I will also discuss inference in random geometric graphs: can one detect 
and estimate an underlying high-dimensional geometric structure? Finally, 
I will discuss statistical error correction algorithms for DNA sequencing 
that are motivated by DNA storage, which aims to use synthetic DNA as a 
high-density, durable, and easy-to-manipulate storage medium of digital 


Tuesday, February 7, 2017 - 4:30 PM - Sherrerd Hall 101

SPEAKER: Rachel Cummings (Caltech)

TITLE: The Implications of Privacy-Aware Choice

ABSTRACT: Privacy concerns are becoming a major obstacle to using data in 
the way that we want. It's often unclear how current regulations should 
translate into technology, and the changing legal landscape surrounding 
privacy can cause valuable data to go unused. In addition, when people 
know that their current choices may have future consequences, they might 
modify their behavior to ensure that their data reveal less --- or 
perhaps, more favorable --- information about themselves. Given these 
concerns, how can we continue to make use of potentially sensitive data, 
while providing satisfactory privacy guarantees to the people whose data 
we are using? Answering this question requires an understanding of how 
people reason about their privacy and how privacy concerns affect 

In this talk, we will see how strategic and human aspects of privacy 
interact with existing tools for data collection and analysis. I will 
begin by adapting the standard model of consumer choice theory to a 
setting where consumers are aware of, and have preferences over, the 
information revealed by their choices. In this model of privacy-aware 
choice, I will show that little can be inferred about a consumer's 
preferences once we introduce the possibility that she has concerns about 
privacy, even when her preferences are assumed to satisfy relatively 
strong structural properties. Next, I will analyze how privacy 
technologies affect behavior in a simple economic model of data-driven 
decision making. Intuition suggests that strengthening privacy protections 
will both increase utility for the individuals providing data and decrease 
usefulness of the computation. I will demonstrate that this intuition can 
fail when strategic concerns affect consumer behavior. Finally, I'll 
discuss ongoing behavioral experiments, designed to empirically measure 
how people trade off privacy for money, and to test whether human behavior 
is consistent with theoretical models for the value of privacy.


Wednesday, February 8, 2017 - 4:30 PM - Sherrerd Hall 101

SPEAKER: Tengyuan Liang (UPenn)

TITLE: Computational Constraints in Statistical Inference and Learning for 
Network Data

ABSTRACT: Network data analysis has wide applications in computational 
social science, computational biology, online social media, and data 
visualization. For many of these network inference problems, the 
brute-force (yet statistically optimal) methods involve combinatorial 
optimization, which is computationally prohibitive when we are faced with 
large scale networks. Therefore, it is important to understand the effect 
of computational constraints on statistical inference.

In this talk, we will discuss three closely related statistical models for 
different network inference problems. These models answer inference 
questions on cliques, communities, and ties, respectively. For each 
particular model, we will describe the statistical model, propose new 
computationally efficient algorithms, and study the theoretical properties 
and numerical performance of the algorithms. Further, we will quantify the 
computational optimality by describing the intrinsic barrier for 
certain efficient algorithm classes, and investigate the 
computational-to-statistical gap theoretically. A key feature shared by 
our studies is that, as the parameters of the model change, the problems 
exhibit different phases of computational difficulty.


Thursday, February 9, 2017 - 4:30 PM - Sherrerd Hall 101

SPEAKER: Edgar Dobriban (Stanford)

TITLE: ePCA: Exponential family PCA

ABSTRACT: Many applications, such as photon-limited imaging and genomics, 
involve large datasets with entries from exponential family distributions. 
It is of interest to estimate the covariance structure and principal 
components of the noiseless distribution. Principal Component Analysis 
(PCA), the standard method for this setting, can be inefficient for 
non-Gaussian noise. In this talk we present ePCA, a methodology for PCA on 
exponential family distributions. ePCA involves the eigendecomposition of 
a new covariance matrix estimator, constructed in a deterministic 
non-iterative way using moment calculations, shrinkage, and random matrix 
theory. We provide several theoretical justifications for our estimator, 
including the Marchenko-Pastur law in high dimensions. We illustrate ePCA 
by denoising single-molecule diffraction maps obtained using 
photon-limited X-ray free electron laser (XFEL) imaging.

This is joint work with Lydia T. Liu (ORFE '17) and Amit Singer 
(Mathematics and PACM).
