April 10, 2009
"Adaptive Sampling for Bayesian Inference"
by Dr. Robert Kohn (University of New South Wales)
Abstract: The talk is concerned with the construction of adaptive sampling schemes for Bayesian inference. Such schemes use previous iterates to tune the proposal distribution automatically and repeatedly, and they have two main advantages over conventional Markov chain Monte Carlo sampling. First, they can be much more efficient than conventional schemes. Second, they are much easier to code because they usually require problem-specific code only for the likelihood, while the code for the proposal densities is generic. Adaptation needs to be done carefully to ensure convergence to the correct target distribution because the resulting chain is not Markovian. We give conditions for adaptive sampling to work and discuss their application in practice. We introduce several sampling schemes and illustrate the methodology using challenging but realistic models and priors applied to real data examples.
Education: PhD and Master of Economics; Bachelor of Science
Academic Profile: Dr. Kohn's research and teaching interests include Bayesian methodology; variable selection and model averaging; nonparametric regression models; time series modeling; multivariate Gaussian and non-Gaussian regression; and Markov chain Monte Carlo simulation algorithms.
January 21, 2009
"Risk Assessment for Pyroclastic Flows"
by Dr. James Berger (Duke University)
Abstract: Risk assessment of rare natural hazards -- such as large volcanic block and ash or pyroclastic flows -- is addressed. Assessment is approached through a combination of computer modeling, statistical modeling, and extreme-event probability computation. A computer model of the natural hazard is used to provide the needed extrapolation to unseen parts of the hazard space. Statistical modeling of the available data is needed to determine the initializing distribution for exercising the computer model. In dealing with rare events, direct simulations involving the computer model are prohibitively expensive. The solution instead requires a combination of adaptive design of computer model approximations (emulators) and rare-event simulation. The techniques that are developed for risk assessment are illustrated on a test-bed example involving pyroclastic flow.
Brief Bio: Jim Berger received a Ph.D. in Mathematics from Cornell University in 1974. He was a faculty member in the Department of Statistics at Purdue University until 1997, at which time he moved to the Institute of Statistics and Decision Sciences (now the Department of Statistical Science) at Duke University, where he is currently the Arts and Sciences Professor of Statistics. He has also been Director of the national Statistical and Applied Mathematical Sciences Institute since 2002. Berger was president of the Institute of Mathematical Statistics from 1995-1996, chair of the Section on Bayesian Statistical Science of the American Statistical Association in 1995, and president of the International Society for Bayesian Analysis during 2004. He has been involved with numerous editorial activities, including co-editorship of the Annals of Statistics during the period 1998-2000, and has organized or participated in the organization of over 35 conferences. Among the awards and honors Berger has received are Guggenheim and Sloan Fellowships, the COPSS President's Award in 1985, the Sigma Xi Research Award at Purdue University for contribution of the year to science in 1993, the Fisher Lectureship in 2001, election as foreign member of the Spanish Real Academia de Ciencias in 2002, election to the USA National Academy of Sciences in 2003, award of an honorary Doctor of Science degree from Purdue University in 2004, and the Wald Lectureship in 2007. Berger's research has primarily been in Bayesian statistics, foundations of statistics, statistical decision theory, simulation, model selection, and various interdisciplinary areas of science and industry, especially astronomy and the interface between computer modeling and statistics. He has supervised 31 Ph.D. dissertations, published over 160 articles and has written or edited 14 books or special volumes.
For a list of papers & additional information about Dr. Berger: http://www.stat.duke.edu/~berger/.
Past Lectures: Fall 2008
November 21, 2008
"A Natural Nonparametric Generalization of Parametric Statistical Models"
by Dr. Timothy E. Hanson (University of Minnesota)
Abstract: Mixtures of Polya trees (MPTs) provide a robust modeling alternative to other nonparametric approaches, e.g. kernel smoothing, Dirichlet process mixtures, Bernstein polynomials, functional expansions, etc. A key property of the MPT prior is that it is easy to center an MPT distribution at any given parametric family of distributions. This facilitates testing the appropriateness of a parametric model, as well as the computation of Bayes factors or pseudo Bayes factors for testing against other nonparametric models. MPTs have been successfully applied in many settings, including failure time and reliability modeling; longitudinal and frailty random effects models; time series; ROC curve, disease prevalence, and diagnostic test accuracy assessment; nonparametric link estimation in generalized linear mixed models; Rasch models; and so on. This talk provides a brief introduction to MPT models, briefly summarizes some recent applications, then presents new extensions, including dependent Polya tree processes and smoothed multivariate Polya trees. Applications to survival modeling, ROC curve estimation, and meta-analysis illustrate the usefulness of this broad class of prior processes.
Brief Bio: Dr. Hanson holds a graduate degree in Mathematics and a Ph.D. in Statistics. An extremely versatile researcher, he is considered one of the foremost applied statisticians, given his highly influential work in biomedical, agricultural, engineering, environmental and veterinary science. At present, he is on the faculty in the Division of Biostatistics at the University of Minnesota. The author of numerous publications in top journals, Dr. Hanson's research has been supported by grants from the NSF, NIH, USDA, and Sandia Laboratories.
For a listing of Dr. Hanson's papers, please visit http://www.biostat.umn.edu/~hanson/papers.html.
October 31, 2008
"Random Partition Models Indexed with Covariates"
by Dr. Peter Mueller (M.D. Anderson Cancer Center)
Abstract: We propose a model for covariate-dependent clustering, i.e., we develop a probability model for random partitions that is indexed by covariates. The motivating application is inference for a clinical trial. As part of the desired inference we wish to define clusters of patients. Defining a prior probability model for cluster memberships should include a regression on patient baseline covariates. We build on product partition models (PPM). We define an extension of the PPM to include the desired regression. This is achieved by including in the cohesion function a new factor that increases the probability that experimental units with similar covariates are included in the same cluster. We discuss implementations suitable for continuous, categorical, count and ordinal covariates.
Brief Bio: Dr. Mueller is a world-renowned scholar in statistical theory and modeling. He holds a PhD in Statistics from Purdue University, and graduate degrees in Mathematics and Computer Science, as well as Physics Education, from the University of Vienna and the Technical University of Vienna. He has published over 75 papers in leading academic journals, and is the recipient of numerous NSF and NIH awards.
Dr. Mueller's research interests are extraordinarily diverse, spanning business, natural sciences, physical sciences, statistical theory, computational theory, etc. At present, he is considered one of the prominent experts in genetic research.
For a listing of Dr. Mueller's papers, please visit http://odin.mdacc.tmc.edu/~pm/.
October 10, 2008
"Fast Sparse Regression and Classification"
by Jerome H. Friedman (Stanford University)
Abstract: Regularized regression and classification methods fit a linear model to data, based on some loss criterion, subject to a constraint on the coefficient values. As special cases, ridge regression, the lasso, and subset selection all use squared-error loss with different particular constraint choices. For large problems the general choice of loss/constraint combinations is usually limited by the computation required to obtain the corresponding solution estimates, especially when non-convex constraints are used to induce very sparse solutions. A fast algorithm is presented that produces solutions that closely approximate those for any convex loss and a wide variety of convex and non-convex constraints, permitting application to very large problems. The benefits of this generality are illustrated by examples.
Brief Bio: Dr. Friedman earned his PhD in High Energy Particle Physics from the University of California at Berkeley. Until recently, he led the Computation Research Group at the Stanford Linear Accelerator Center. He has won numerous awards for his many contributions to mathematical physics and statistics. He serves or has served on the editorial boards of several top journals. Dr. Friedman is a Fellow of the American Academy of Arts and Sciences, and a recipient of the prestigious Parzen Prize for statistical innovation. He is also a recipient of the ACM Data Mining Lifetime Innovation Award. Technometrics recognized two of his papers as Paper of the Year and the Journal of the American Statistical Association recognized two others as Paper of the Year.
For a listing of Dr. Friedman’s publications, please visit http://www-stat.stanford.edu/~jhf/.
Past Lectures: Spring 2008
April 4, 2008
"Stochastic Space-time Modeling using Differential Equations"
by Dr. Alan Gelfand (Duke University)
Abstract: We will discuss the use of differential equation models to consider the analysis of two types of space-time data. In one case we look at soil moisture observed at a collection of locations in two-hour increments over the course of several months. Soil moisture is a complex process that is driven by inputs of precipitation along with transpiration and drainage. Infinitesimal change in soil moisture can be expressed using a differential equation involving forms for transpiration and drainage as a function of soil moisture. In a second case, we consider time point patterns driven by a latent space-time intensity surface which is a realization of a stochastic process; that is, our model is an example of a Cox process. The intensity surface is formulated as a growth process characterized through a stochastic differential equation. Our motivating example is to understand annual urban growth through single-family home construction. We show how both of these examples can be handled using hierarchical modeling specifications. In each case, we discretize time, replacing integrals by sums. However, each introduces further computational wrinkles. The former works with empirically specified transpiration and drainage functions while the latter treats a very large number of points (roughly 12,000 houses). Results of the analyses will be presented.
Dr. Gelfand's CV
February 29, 2008
"Competitive Brand Salience Analysis Through a Model of Eye-Movements During Visual Search"
by Dr. Michel Wedel (University of Maryland)
Abstract: Brand salience--the extent to which a branded package visually stands out from its competitors--is vital in competing on the shelf in supermarkets, yet not easy to achieve in practice. Supermarket shelves contain thousands of brands and SKUs, each of which attempts to stand out among competitors and attract consumers' attention. We propose a model of visual search and eye-movement recordings collected during brand search on shelves. The model allows for switching between two unobserved attention states and enables us to estimate brand salience at the point-of-purchase, based on perceptual features (color, luminance, edges) and how these are influenced by consumers’ search goals, which we manipulated in an experiment. We show that the salience of brands has a pervasive effect on search performance, and is determined by two key components. The bottom-up component is due to in-store activity and package design. The top-down component is due to out-of-store marketing activities such as advertising. We show that about one-third of salience on the shelf is due to out-of-store marketing and two-thirds is due to in-store marketing. The analysis exposes the optimal visual differentiation level of a brand versus its competitors, and of each SKU versus the other SKUs of the same brand. The model of the visual search process enables diagnostic analyses of the current levels of visual differentiation of brands and SKUs on shelves, and provides directions for increasing these.
Brief Bio: Dr. Wedel's main research interest is in Consumer Sciences, the application of statistical and econometric methods to further the understanding of consumer behavior and to improve marketing decision making. He has won the Hendrik Muller lifetime award for the social and behavioral sciences awarded by the Royal Netherlands Academy of Sciences for "exceptional achievements in the area of the behavioral and social sciences," and has been elected foreign correspondent of that Academy. He has also won the O'Dell best article award for the Journal of Marketing Research. He ranks third among all scholars in economics and business in the Netherlands based on productivity, as well as on citation impact. He has consulted for over 25 different companies in the nonprofit and profit sectors, including companies in market research, consulting, direct marketing, food, financial services, automotive, and telecommunications.
Dr. Wedel's CV
Past Lectures: Fall 2007
November 9, 2007
"Information Theory and Statistics"
by Dr. Arnold Zellner (University of Chicago)
Abstract: After describing some aspects of information theory, it will be shown how it has been employed not only to produce models for observations and prior densities for their parameters but also to derive optimal learning models, including Bayes' theorem. These optimal learning models have the property that input information equals output information and thus they are 100% efficient, as recognized in the literature. Some of these learning models permit inverse inference to be performed without the use of a prior density and/or a likelihood function. Examples and references illustrating the derivations and uses of such models will be provided and discussed. By having a set of optimal learning models, including Bayes' theorem, on the shelf to help solve a variety of inference and decision problems, statisticians and other scientists will be more effective in their work.
Brief Bio: Dr. Arnold Zellner received his Ph.D. in Economics from the University of California in 1957. Currently, Dr. Zellner is Professor Emeritus in the Graduate School of Business at the University of Chicago, and Adjunct Professor at the University of California, Berkeley. He is a member of many professional societies such as: Econometric Society, American Economic Association, American Statistical Association, American Association for the Advancement of Science, International Statistical Institute, American Academy of Arts & Sciences, Institute of Mathematical Statistics, International Society for Bayesian Analysis, & International Institute of Forecasters. Dr. Zellner has chaired and served on many Ph.D. committees of graduate students in Econometrics, Macroeconomics, Time Series Analysis, and Bayesian Inference & Decision Theory. He has also chaired Econometrics Preliminary Committees and organized new econometrics research workshops and courses.
October 5, 2007
"Sparsity in Statistical Modeling"
by Dr. Michael West (Duke University)
Abstract: The concepts and methods of sparsity modeling are key to problems of model specification, variable selection and multivariate structure assessment in statistical science. Sparsity also provides the foundation for scaling statistical models from the viewpoints of both scientific parsimony and computational accessibility/feasibility. My talk will overview a range of developments in the application of sparsity modeling in modern, model-based analysis of multivariate data arising in problems with very many parameters. I will touch on specific contexts including high-dimensional latent factor analysis and graphical models, speaking to both modeling concepts and computational questions, and drawing on applied studies in areas including pathway genomics and financial time series analysis.
Brief Bio: Dr. West’s research and teaching activities are in a number of areas of Bayesian statistics, computational and mathematical sciences, especially on complex modeling in higher-dimensional problems. Core areas of modeling research relate to multivariate analysis, high-dimensional inference and computation, time series modeling, among others. Key collaborative activities include multidisciplinary projects in a number of biomedical and genomic areas.