SCIENCE AT THE EDGE SEMINAR

QB/GEDD

Friday, March 1 at 11:30am

Room 1400 Biomedical and Physical Sciences Bldg.

Refreshments at 11:15

Mauro Maggioni

Departments of Mathematics, Computer Science, and

Electrical and Computer Engineering

Duke University, Durham, NC

Multiscale Geometric Methods for Data in High Dimensions

We discuss recent work on multiscale geometric analysis applied to high-dimensional data sets. The first application is the estimation of the intrinsic dimension of noisy data; the second is the construction of data-driven dictionaries for efficient sparse representations of data sets, together with a novel geometric multiresolution analysis framework for encoding data. Finally, we discuss the problem of estimating a probability measure in high dimensions whose support is (nearly) low-dimensional and has some geometric structure, for example that of a manifold or a union of hyperplanes. We construct a multiscale geometric tree decomposition of the data and use it to build an increasing family of approximation "spaces" in the space of probability measures, parametrized by certain subtrees of the multiscale tree, and we perform a multiscale bias-variance tradeoff over this family. We obtain finite-sample results guaranteeing that, with high probability, the Wasserstein distance between the (random) measure estimated by our algorithm and the true measure is small, with a bound depending on the number of samples, a measure of complexity of the models we use (typically this depends only on the intrinsic dimension and not on the ambient dimension!), and a notion of "regularity" of the true measure.

Helen Geiger, Administrative Assistant

Quantitative Biology Graduate Program and

Gene Expression in Development and Disease

Biochemistry

603 Wilson Road, Room 212

East Lansing, MI   48824

Phone:  517-432-9895

QB Website: http://www.qbi.msu.edu/

GEDD Website: http://www.gedd.msu.edu/