Research Interests: design and analysis of observational studies, design and analysis of experiments, health outcomes research
PhD, Harvard University, 1980
AM, Harvard University, 1978
BA, Hampshire College, 1977
R. A. Fisher Award and Lecture from the Committee of Presidents of Statistical Societies, 2019
George W. Snedecor Award from the Committee of Presidents of Statistical Societies, 2003
IMS Medallion Lecture, 2020
Long-Term Excellence Award from the Health Policy Statistics Section of the American Statistical Association, 2018
Nathan Mantel Award from the Section on Statistics in Epidemiology of the American Statistical Association, 2017
Wharton: 1986-present. (named Robert G. Putzel Professor, 2001; Robert B. Egelston Term Professor of Statistics, 1991-92; Joseph Wharton Term Associate Professor and Professor of Statistics, 1986-91)
Previous appointment: University of Wisconsin, Madison
Senior Research Scientist, Research Statistics Group, Educational Testing Service, 1986
Research Scientist, Research Statistics Group, Educational Testing Service, 1983-86
Statistician, Division of Statistics and Applied Mathematics, Office of Radiation Programs, U.S. Environmental Protection Agency, 1980-81
Member, Committee on National Statistics, National Research Council, 1996-99
Member, Committee on Data and Research for Policy on Illegal Drugs, National Research Council, 1998-2000
Member, Advisory Board of the Measurement, Methodology and Statistics Program of the U.S. National Science Foundation, 1999-2001
Book in the series: Chapman and Hall/CRC Monographs on Statistics and Applied Probability.
Absent randomization, causal conclusions gain strength if several independent evidence factors concur.
We develop a method for constructing evidence factors from several instruments plus a direct comparison
of treated and control groups, and we evaluate the method's performance in terms of design sensitivity
and simulation. In the application, we consider the effectiveness of Catholic versus public high schools,
constructing three evidence factors from three past strategies for studying this question, namely: (i) having
nearby access to a Catholic school as an instrument, (ii) being Catholic as an instrument for attending
Catholic school, and (iii) a direct comparison of students in Catholic and public high schools. Although these
three analyses use the same data, we: (i) construct three essentially independent statistical tests of no effect
that require very different assumptions, (ii) study the sensitivity of each test to the assumptions underlying
that test, (iii) examine the degree to which independent tests dependent upon different assumptions
concur, and (iv) pool evidence across independent factors. In the application, we conclude that the ostensible
benefit of Catholic education depends critically on the validity of one instrument, and is therefore quite
fragile.
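
Step (iv), pooling evidence across independent factors, can be illustrated with a small sketch. The abstract does not say which pooling rule the paper uses, so the code below assumes Fisher's method for combining independent p-values as a generic stand-in; the function name fisher_combine and the example p-values are hypothetical and are not results from the paper.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (an assumed stand-in).

    Under the joint null hypothesis, X = -2 * sum(log p_i) follows a
    chi-square distribution with 2k degrees of freedom, where k is the
    number of p-values. For even degrees of freedom the upper tail has
    the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must be a non-empty list in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0   # current term (x/2)^i / i!, starting at i = 0
    total = 1.0  # running sum of the terms
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values, one per evidence factor (not taken from the paper).
pooled = fisher_combine([0.04, 0.20, 0.30])
print(f"pooled p-value: {pooled:.4f}")
```

Pooling of this kind is only meaningful when the component tests are (essentially) independent, which is the property the evidence-factor construction is designed to deliver.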
(Authors: Ruoqi Yu, Jeffrey Silber, and Paul R. Rosenbaum)
We propose new optimal matching techniques for large administrative
data sets. In current practice, very large matched samples are constructed
by subdividing the population and solving a series of smaller problems,
for instance, matching men to men and separately matching women
to women. Without simplification of some kind, the time required to optimally
match T treated individuals to T controls selected from C ≥ T potential
controls grows much faster than linearly with the number of people to be
matched (the required time is of order O{(T + C)^3}), so splitting one large
problem into many small problems greatly accelerates the computations. This
common practice has several disadvantages that we describe. In its place, we
propose a single match, using everyone, that accelerates the computations in
a different way. In particular, we use an iterative form of Glover’s algorithm
for a doubly convex bipartite graph to determine an optimal caliper for the
propensity score, radically reducing the number of candidate matches; then
we optimally match in a large but much sparser graph. In this graph, a modified
form of near-fine balance can be used on a much larger scale, improving
its effectiveness. We illustrate the method using data from US Medicaid,
matching children receiving surgery at a children’s hospital to similar children
receiving surgery at a hospital that mostly treats adults. In the example,
we form 38,841 matched pairs from 159,527 potential controls, controlling
for 29 covariates plus 463 Principal Surgical Procedures, plus 973 Principal
Diagnoses. The method is implemented in an R package bigmatch available
from CRAN.
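
The caliper step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the bigmatch implementation: it assumes a single scalar propensity score per person, uses a Glover-style greedy pass (controls taken in sorted order, each given to the eligible treated unit whose caliper interval ends soonest) to count how many treated units can be paired within a given caliper, and then binary-searches the candidate distances for the smallest caliper that pairs everyone. The function names and the toy scores are hypothetical.

```python
import heapq


def max_pairs_within_caliper(treated, controls, caliper):
    """Maximum number of treated-control pairs with |score gap| <= caliper."""
    controls = sorted(controls)
    intervals = []
    for p in treated:
        # Eligible controls form a contiguous run in sorted order.
        # A linear scan keeps the sketch simple; a large-scale version
        # would locate the run by binary search instead.
        idx = [j for j, q in enumerate(controls) if abs(p - q) <= caliper]
        if idx:
            intervals.append((idx[0], idx[-1]))
    intervals.sort()
    heap = []  # right endpoints of intervals that have already started
    matched = 0
    nxt = 0
    for j in range(len(controls)):
        while nxt < len(intervals) and intervals[nxt][0] <= j:
            heapq.heappush(heap, intervals[nxt][1])
            nxt += 1
        while heap and heap[0] < j:  # interval ended before control j
            heapq.heappop(heap)
        if heap:  # pair control j with the most urgent treated unit
            heapq.heappop(heap)
            matched += 1
    return matched


def smallest_feasible_caliper(treated, controls):
    """Smallest caliper under which every treated unit can be paired."""
    if len(controls) < len(treated):
        raise ValueError("need at least as many controls as treated units")
    if not treated:
        return 0.0
    # The optimum is always one of the observed treated-control distances.
    candidates = sorted({abs(p - q) for p in treated for q in controls})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:  # feasibility is monotone in the caliper
        mid = (lo + hi) // 2
        if max_pairs_within_caliper(treated, controls, candidates[mid]) == len(treated):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


# Toy propensity scores (hypothetical, not from the Medicaid example).
toy_treated = [0.25, 0.5]
toy_controls = [0.125, 0.5, 0.875, 1.0]
print(smallest_feasible_caliper(toy_treated, toy_controls))
```

Once a caliper is fixed, only treated-control pairs inside it remain as candidate matches, which is what makes the subsequent optimal matching problem much sparser than the full T-by-C problem.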
An applied graduate-level course in multiple regression and analysis of variance for students who have completed an undergraduate course in basic statistical methods. Emphasis is on practical methods of data analysis and their interpretation. Covers model building, the general linear hypothesis, residual analysis, leverage and influence, one-way ANOVA, two-way ANOVA, and factorial ANOVA. Primarily for doctoral students in the managerial, behavioral, social and health sciences. Permission of instructor required to enroll.
An applied graduate-level course for students who have completed an undergraduate course in basic statistical methods. Covers two unrelated topics: loglinear and logit models for discrete data, and nonparametric methods for nonnormal data. Emphasis is on practical methods of data analysis and their interpretation. Primarily for doctoral students in the managerial, behavioral, social and health sciences. Permission of instructor required to enroll.
Dissertation