Fast, “Robust”, and Approximately Correct:

Estimating Mixed Demand Systems∗

Bernard Salanié† Frank A. Wolak‡

March 8, 2019

Abstract

Many econometric models used in applied work integrate over unobserved

heterogeneity. We show that a class of these models that includes many random

coefficients demand systems can be approximated by a “small-σ” expansion that

yields a linear two-stage least squares estimator. We study in detail the models

of product market shares and prices popular in empirical IO. Our estimator is

only approximately correct, but it performs very well in practice. It is extremely

fast and easy to implement, and it is “robust” to changes in the higher moments

of the distribution of the random coefficients. At the very least, it provides

excellent starting values for more commonly used estimators of these models.

∗We are grateful to Dan Ackerberg, John Asker, Steve Berry, Xiaohong Chen, Chris Conlon, Pierre Dubois, Jeremy Fox, Han Hong, Guy Laroque, Simon Lee, Arthur Lewbel, Thierry Magnac, Lars Nesheim, Ariel Pakes, Mathias Reynaert, Tobias Salz, Richard Smith, Pedro Souza, Frank Verboven, Martin Weidner, and Ken Wolpin for their useful comments, as well as to seminar audiences at NYU, Rice, UCL, and the Stanford Institute for Theoretical Economics (SITE). We also thank Zeyu Wang for excellent research assistance.

†Department of Economics, Columbia University, 1022 International Affairs Building, 420 West 118th Street, New York, NY 10027, bsalanie@columbia.edu.

‡Department of Economics and Program on Energy and Sustainable Development, Stanford University, Stanford CA 94305-6072, wolak@zia.stanford.edu.

Introduction

Many econometric models are estimated from conditional moment conditions that

express the mean independence of random unobservable terms η and instruments Z:

$$E(\eta \mid Z) = 0.$$

In structural models, the unobservable term is usually obtained by solving a set of

equations—often a set of first-order conditions—that define the observed endogenous

variables as functions of the observed exogenous variables and unobservables. That

is, we start from

$$G(y, \eta, \theta_0) = 0 \tag{1}$$

where y is the vector of observed endogenous variables and θ0 is the true value of the

vector of unknown parameters. The parametric function G is assumed to be known

and can depend on a vector of observed exogenous variables. Then (assuming that

the solution exists and is unique) we invert this system into

$$\eta = F(y, \theta_0)$$

and we seek an estimator of θ0 by minimizing an empirical analog of a norm

$$\left\| E\big(F(y, \theta)\, m(Z)\big) \right\|$$

where $m(Z)$ is a vector of measurable functions of $Z$.

Unless $F(y, \theta_0)$ exists in closed form, inversion is often a step fraught with difficulties. Even when a simple inversion algorithm exists, it is still costly and must be

done with a high degree of numerical precision, as errors may jeopardize the “outer”

minimization problem. One alternative is to minimize an empirical analog of the

norm

$$\left\| E\big(\eta\, m(Z)\big) \right\|$$

subject to the structural constraints (1). This “MPEC approach” has met with

some success in dynamic programming and empirical industrial organization (Su and

Judd 2012, Dubé et al. 2012). It still requires minimizing a nonlinear objective function under nonlinear constraints; convergence to a solution can be challenging in the absence of very good initial values. This is especially galling when the model has to be estimated many times, for instance within models of bargaining like those of Crawford and Yurukoglu (2012) or Ho and Lee (2017).

We propose an alternative that derives a linear model from a very simple series

expansion. To fix ideas, suppose that $\theta_0$ can be decomposed into a pair $(\beta_0, \sigma_0)$, where $\sigma_0$ is a scalar that we have reasons to think is not too far from zero. We rewrite (1) as

$$G\big(y, F(y, \beta_0, \sigma_0), \beta_0, \sigma_0\big) = 0.$$

We expand $\sigma \to F(y, \beta_0, \sigma)$ in a Taylor series around 0 and rewrite $F(y, \beta_0, \sigma_0)$ as:

$$F(y, \beta_0, \sigma_0) = F(y, \beta_0, 0) + F_\sigma(y, \beta_0, 0)\,\sigma_0 + \ldots + F_{\sigma\sigma\ldots\sigma}(y, \beta_0, 0)\,\frac{\sigma_0^L}{L!} + O(\sigma_0^{L+1}),$$

where the subscript $\sigma$ denotes a partial derivative with respect to the argument $\sigma$.

This suggests a sequence of “approximate estimators” that minimize the empirical

analogs of the following norms

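$$\begin{gathered}
\left\| E\big(F(y, \beta, 0)\, m(Z)\big) \right\| \\
\left\| E\big(\left(F(y, \beta, 0) + F_\sigma(y, \beta, 0)\,\sigma\right) m(Z)\big) \right\| \\
\left\| E\left(\left(F(y, \beta, 0) + F_\sigma(y, \beta, 0)\,\sigma + F_{\sigma\sigma}(y, \beta, 0)\,\frac{\sigma^2}{2}\right) m(Z)\right) \right\| \\
\ldots
\end{gathered}$$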

If the true value σ0 is not too large, one may hope to obtain a satisfactory estimator

with the third of these “approximate estimators.” In general, this still requires solving

a nonlinear minimization problem. However, suppose that the function F satisfies

the following three conditions:

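C1: $F_\sigma(y, \beta_0, 0) \equiv 0$

C2: $F(y, \beta, 0) \equiv f_0(y) - f_1(y)\beta$ is affine in $\beta$ for known functions $f_0(\cdot)$ and $f_1(\cdot)$.

C3: the second derivative $F_{\sigma\sigma}(y, \beta, 0)$ does not depend on $\beta$.

Denote $f_2(y) \equiv -F_{\sigma\sigma}(y, \beta, 0)$. Under C1–C3, we would minimize

$$\left\| E\left(\left(f_0(y) - f_1(y)\beta - f_2(y)\,\frac{\sigma^2}{2}\right) m(Z)\right) \right\|.$$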
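To make the role of each condition explicit, the substitution can be written out term by term. This is only a sketch: it assumes that the second-order truncation is accurate enough and that the first-order term vanishes at the candidate value of $\beta$, not only at $\beta_0$.

```latex
\begin{align*}
F(y, \beta, \sigma)
  &\approx F(y, \beta, 0) + F_\sigma(y, \beta, 0)\,\sigma
     + F_{\sigma\sigma}(y, \beta, 0)\,\frac{\sigma^2}{2} \\
  &= \underbrace{f_0(y) - f_1(y)\beta}_{\text{C2}}
     + \underbrace{0}_{\text{C1}}
     - \underbrace{f_2(y)}_{\text{C3}}\,\frac{\sigma^2}{2}.
\end{align*}
```

The approximate residual is therefore linear in $(\beta, \sigma^2)$, with $f_2(y)/2$ acting as the regressor attached to $\sigma^2$.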

Taking the parameters of interest to be $(\beta_0, \sigma_0^2)$, this is simply a two-stage least squares regression of $f_0(y)$ on $f_1(y)$ and $f_2(y)$ with instruments $m(Z)$. As this is a linear problem, the optimal¹ instruments $m(Z)$ associated with the conditional moment restrictions $E(\eta \mid Z) = 0$ are simply

$$m(Z) = Z^* = \big(E(f_1(y) \mid Z),\; E(f_2(y) \mid Z)\big).$$

These optimal instruments could be estimated directly from the data using nonparametric regressions. Or more simply, we can include flexible functions of the columns

of Z in the instruments used to compute the 2SLS estimates.
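The closed-form 2SLS step can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not the authors' implementation: the function names (`two_stage_least_squares`, `simulate`, `estimate`), the data-generating process, and the parameter values are all hypothetical. Since the approximate residual is $f_0 - f_1\beta - f_2\sigma^2/2$, the sketch regresses on $f_2/2$ so that the second coefficient estimates $\sigma^2$ directly, and it uses a constant plus the raw columns of $Z$ as instruments.

```python
import numpy as np


def two_stage_least_squares(y, X, Z):
    """Return 2SLS coefficients of y on X using instruments Z."""
    # First stage: project each regressor on the instruments.
    gamma, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ gamma
    # Second stage: regress y on the fitted regressors.
    coef, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return coef


def simulate(n=20_000, beta=1.5, sigma2=0.8, seed=0):
    """Simulate hypothetical f0, f1, f2 with f0 = f1*beta + f2*sigma2/2 + eta."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 2))   # exogenous instruments
    eta = rng.normal(size=n)      # structural error, independent of z
    # Both regressors load on eta, so they are endogenous.
    f1 = z[:, 0] + 0.5 * z[:, 1] + 0.7 * eta + rng.normal(size=n)
    f2 = z[:, 1] - 0.3 * z[:, 0] + 0.4 * eta + rng.normal(size=n)
    f0 = f1 * beta + f2 * sigma2 / 2 + eta
    return f0, f1, f2, z


def estimate(f0, f1, f2, z):
    """Estimate (beta, sigma^2); f2/2 is used so its coefficient is sigma^2."""
    X = np.column_stack([f1, f2 / 2])
    Z = np.column_stack([np.ones(len(f0)), z])  # constant plus instruments
    return two_stage_least_squares(f0, X, Z)


if __name__ == "__main__":
    beta_hat, sigma2_hat = estimate(*simulate())
    print(f"beta_hat = {beta_hat:.3f}, sigma2_hat = {sigma2_hat:.3f}")
```

Adding flexible functions of the instruments amounts to appending extra columns (squares or interactions, for example) to `Z` inside `estimate`; the two least-squares steps are unchanged.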

The resulting estimators of $\beta_0$ and $\sigma_0^2$ are only approximately correct, because

they consistently estimate an approximation of the original model. On the other

hand, they can be estimated in closed form using linear 2SLS. Moreover, because

they only rely on limited features of the data generating process, they are “robust”

in ways that we will explore later.

Conditions C1–C3 extend directly to a multivariate parameter σ0. They may

seem very demanding. Yet as we will show, under very weak conditions the Berry,

Levinsohn, and Pakes (1995) (macro-BLP) model that is the workhorse of empirical

IO satisfies all three. In this application, σ0 is taken to be the square root of the

variance–covariance matrix Σ of the random coefficients in the mixed demand model.

More generally, we will characterize in Section 6.1 a general class of models with

unobserved heterogeneity to which conditions C1–C3 apply.
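A toy numerical check may help build intuition for why such conditions can hold in a mixed logit. The sketch below is not the BLP model treated in the paper: it assumes a single inside good with mean utility `delta`, one characteristic `x`, and a standard normal random coefficient scaled by `sigma`. Under these assumptions, a second-order Taylor expansion of the share gives logit(s) ≈ delta + sigma² x² (1/2 − s), with no first-order term in sigma. The function names and numerical values are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def mixed_logit_share(delta, x, sigma, n_nodes=40):
    """Share of one inside good when utility is delta + sigma*eps*x, eps ~ N(0,1)."""
    # Gauss-Hermite nodes and weights for the weight function exp(-z^2 / 2).
    nodes, weights = hermegauss(n_nodes)
    probs = 1.0 / (1.0 + np.exp(-(delta + sigma * x * nodes)))
    return float(weights @ probs / np.sqrt(2.0 * np.pi))


def inversion_errors(delta, x, sigma):
    """Errors in recovering delta from the share, at orders 0 and 2 in sigma."""
    s = mixed_logit_share(delta, x, sigma)
    logit = np.log(s / (1.0 - s))             # exact inverse when sigma = 0
    correction = sigma**2 * x**2 * (0.5 - s)  # second-order term (toy derivation)
    return logit - delta, logit - correction - delta


if __name__ == "__main__":
    for sigma in (0.4, 0.2, 0.1):
        e0, e2 = inversion_errors(delta=1.0, x=1.2, sigma=sigma)
        print(f"sigma={sigma}: order-0 error {e0:+.2e}, order-2 error {e2:+.2e}")
```

In this toy case the uncorrected error shrinks roughly like sigma² and the corrected error roughly like sigma⁴. The corrected inverse is linear in the mean utility and in sigma², and the correction term depends only on observables, which mirrors the structure that C1–C3 require.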

Our approach builds on “small-Σ” approximations to construct successive approximations to the inverse mapping (from market shares to product effects). Kadane (1971) pioneered the “small-σ” method. He applied it to a linear, normal simultaneous equation system and studied the properties of k-class estimators² when the number of observations $n$ is fixed and $\sigma$ goes to zero. He showed that when the number of observations is large, under these “small-σ asymptotics” the k-class estimators have biases in $\sigma^2$, and that their mean-squared errors differ by terms of order $\sigma^4$. Kadane argued that small $\sigma$, fixed $n$ asymptotics are often a good approximation to finite-sample distributions when the estimation sample is large enough.

The small-σ approach was used by Chesher (1991) in models with measurement error. Most directly related to us, Chesher and Santos-Silva (2002) used a second-order approximation argument to reduce a mixed multinomial logit model to a “heterogeneity adjusted” unmixed multinomial logit model in which mean utilities have additional terms³. They suggested estimating the unmixed logit and using a score statistic based on these additional covariates to test the null of no random variation in preferences. Like them, we introduce additional covariates. Unlike them, we develop a method to estimate jointly the mean preference coefficients and parameters characterizing their random variation; and we only use linear instrumental variables estimators. To some degree, our method is also related to that of Harding and Hausman (2007), who use a Laplace approximation of the integral over the random coefficients in a mixed logit model without choice-specific random effects. Unlike them, we allow for endogenous prices; our approach is also much simpler to implement.

¹In the sense of Amemiya (1975).

²Which include OLS and 2SLS.

Section 1 presents the model popularized by Berry–Levinsohn–Pakes (1995) and

discusses some of the difficulties that practitioners have encountered when taking it

to data. We give a detailed description of our algorithm in Section 2; readers not interested in the derivation of our formulæ can in fact jump directly to our Monte Carlo simulations in