From: Chris Rohlfs <car@uchicago.edu>
To: statalist@hsphsun2.harvard.edu, joe jacob <otharain@hotmail.com>
Subject: Re: st: 3sls, selection
Date: Tue, 9 Sep 2003 14:44:02 -0500 (CDT)

joe,

if you're going to use a heckman two-step, you should really spend some
time to derive a selection equation from a maximization problem -- using
heckman's 1979 econometrica paper as a starting point. that said, my
feeling from what you've written is that it would be very hard to come up
with a reasonable & intuitive selection model based on the variables you
described.

i'm not quite clear on what the question is that you're trying to answer,
but i gather it's something like: "if we entice a company to export more
-- by changing the prices/costs that it faces -- does this company also
choose to spend more on research?" and that the selection problem is: a
cost shock might entice more exporters to enter the market -- which we
could mistakenly observe as a decrease in export intensity. is that right?

the simplest way that i can think of to address this problem would be to
create an "exporter" dummy for whether exports exceed zero -- and then to
estimate 2sls with two right-hand endogenous variables -- both the
exporter dummy and export intensity.

alternatively, you could try re-weighting the data based on your x
variables -- i.e., split the sample into quantiles using your x variables
& estimate a sampling frequency (the rate at which you observe a non-zero
export intensity) within each bin -- and then weight your regressions
using the inverse of that frequency. this would be similar to imputing
export intensity (based on your xes) for the non-exporters.

i'd say maybe try a couple different ways & hopefully there are some
robust relationships that aren't too sensitive to how you cut the data.
again, i wouldn't recommend the heckman procedure for this particular
problem but i think there are other ways you can try modeling the
selection.

chris

On Tue, 9 Sep 2003, joe jacob wrote:

> Hi Chris,
>
> Your comments have been quite useful. Many thanks for that.
>
> I think dropping IRD (learning from embodied R&D, which is an endogenous
> variable) from export intensity (EXPINT) and export decision equations
> can solve one big problem. (I agree that cost related factors are crucial
> to export (and so are technological). I am using these variables in the
> export intensity equation.)
>
> I still wonder if I could combine the selection and simultaneous
> estimation procedures (The idea is to derive IMR from heckman selection
> estimation and then insert that in the export intensity equation in the
> simultaneous estimation procedure). I describe these as Stata commands
> below.
>
> (1) Deriving IMR for use in step 2.
>
> . heckman EXPINT drd droy wagerate skill size size2 forg,
>     twostep select(drd droy wagerate skill outshare gov forg)
>     mills(IMR)
>
> (2) Estimating simultaneously equations with dependent variables IRD and
> EXPINT using 3sls
>
> . reg3 (IRD EXPINT drd droy skill outshare gov forg)
>     (EXPINT IMR drd droy wagerate skill size size2 forg),
>     3sls inst(drd droy wagerate skill outshare size size2 gov forg)
>
> Note that IMR estimated from step 1 is used in step 2 (in the second part
> where EXPINT is the dependent variable). My concern now is, is inserting
> IMR from step 1 in step 2 the right way of addressing selection bias?
>
> An alternative is to discard the question of bias as you hinted and do
> only the simultaneous estimation of step 2 above (without IMR variable).
>
> Thanks in advance,
>
> Joe
>
>
> > From: Chris Rohlfs <car@uchicago.edu>
> > Reply-To: statalist@hsphsun2.harvard.edu
> > To: statalist@hsphsun2.harvard.edu
> > Subject: Re: st: 3sls, selection
> > Date: Tue, 9 Sep 2003 10:06:23 -0500 (CDT)
> >
> > joe,
> >
> > this is a difficult problem.
> >
> > so heckman wrote the two-step method with the particular example of
> > education in mind, where agents have perfect foresight & face a decision
> > between a wage offer in the high school labor market versus a wage offer
> > in the college labor market.
> > the primary feature of the model is that agents maximize a known
> > function based on variables unobservable to the econometrician. and
> > that the variable they're maximizing (in this case wages) is the
> > dependent variable of interest. in this education example, you can use
> > the model to estimate how much an agent's schooling decision affects
> > his/her wages.
> >
> > ok -- so let's say you had a simple model in which firms decide whether
> > or not to enter the international sector or remain in the domestic
> > sector based entirely on long-term profits. in that case, i think the
> > heckman two-step would apply -- and you could use such a model to
> > determine how much the decision to export affects a company's profits.
> >
> > i think it makes a big difference that you're using REVENUE (as far as
> > i can tell, that's what EXPINT is) rather than PROFITS. i'd think that
> > most of the factors that firms consider are cost-related, not
> > revenue-related -- most of the variation in REVENUE is going to be
> > driven by scale. even if you had the profits data, though -- my feeling
> > is that the model is getting extremely complicated at this point & a
> > simpler model would probably do a much better job of explaining the
> > data in a credible way.
> >
> > i would strongly recommend considering another approach toward modeling
> > selection. you do have a lot of cost-related variables. you might want
> > to consider just assuming that the selection is entirely based on
> > observed cost variables (in which case unweighted least squares would
> > still be unbiased).
> >
> > chris
> >
> > On Tue, 9 Sep 2003, joe jacob wrote:
> >
> > > Chris and others,
> > >
> > > I should apologise for not describing the variables in the first
> > > mail. Let me explain.
> > >
> > > I have an establishment level data set for about 8 years (100,000
> > > plus observations)
> > >
> > > The key equation of interest is
> > >
> > > IRD = EXPINT + drd + droy + skill + outshare + gov + forg   /* Eqn 1 */
> > >
> > > where, IRD captures learning efforts from embodied R&D (derived from
> > > sectoral R&D stock of OECD countries and distributed across
> > > establishments of a developing country) EXPINT is the export
> > > intensity variable. (Other variables are basically control
> > > variables.) Since this variable (EXPINT) is an endogenous variable we
> > > have a second equation,
> > >
> > > EXPINT = IRD + drd + droy + wagerate + skill + size + size2 + forg   /* Eqn 2 */
> > >
> > > This calls for using a simultaneous estimation procedure like 3sls.
> > >
> > > The problem is, since all firms don't export, there is a selection
> > > bias, which has to be accounted for using the Heckman procedure.
> > >
> > > The selection variables for EXPINT are the following.
> > >
> > > IRD, drd, droy, wagerate, skill, outshare, gov, forg.
> > >
> > > What I originally thought (albeit not probably correctly) was to
> > > estimate Eqn2 using heckman procedure, calculate the inverse mills
> > > ratio (IMR), and then plug this variable in equation 2 and apply 3sls
> > > to equation 1 and 2. But when I do heckman I can't account for the
> > > endogeneity of the variable IRD.
> > >
> > > Hope the problem is clear now.
> > >
> > > Thanks in advance for any help.
> > >
> > > Joe
> > >
> > > > From: Chris Rohlfs <car@uchicago.edu>
> > > > Reply-To: statalist@hsphsun2.harvard.edu
> > > > To: statalist@hsphsun2.harvard.edu
> > > > Subject: Re: st: 3sls, selection
> > > > Date: Mon, 8 Sep 2003 15:36:58 -0500 (CDT)
> > > >
> > > > jacob,
> > > >
> > > > could you please describe the variables you're looking at?
> > > >
> > > > chris
> > > >
> > > > On Mon, 8 Sep 2003, joe jacob wrote:
> > > >
> > > > > Dear all,
> > > > >
> > > > > This is my first mail to statalist and this mail is made after
> > > > > days of learning from the discussions in the listserver.
> > > > >
> > > > > I have a two-equation system to estimate.
> > > > >
> > > > > Eq. (1) y1 = y2 + x1 + x2 + x3 + x4 + u
> > > > > Eq. (2) y2 = y1 + x1 + x2 + v,
> > > > >
> > > > > with the endogenous variables y1 and y2 (both continuous)
> > > > > appearing in the RHS of both equations. Thus a simultaneous
> > > > > equation is of course the right way to proceed.
> > > > >
> > > > > But variable y1 needs to be corrected for the Selection hazard
> > > > > using the Heckman procedure. This is because some observations
> > > > > are zero due to 'self selection'. Thus we have a selection
> > > > > equation involving the variables (y2, x1, x2, x3, x4, x5).
> > > > >
> > > > > One approach I could think of is to calculate the IMR from
> > > > > heckman estimation of equation 1, plugging it back in the same
> > > > > equation and running a 3sls estimation involving equations 1 and
> > > > > 2. BUT I think that does not make much sense because IMR is
> > > > > calculated from two equations (Eqn 1 and the selection equation)
> > > > > that has an endogenous explanatory variable (y2).
> > > > >
> > > > > My question is how could I take care of these two problems.
> > > > > 1. The endogeneity (simultaneity) of y1 and y2,
> > > > > 2. the selection bias pertaining to variable y1.
> > > > >
> > > > > Thanks in advance for your kind suggestions.
> > > > >
> > > > > Sincerely,
> > > > >
> > > > > Itty Jacob
> > > > >
> > > > > PS: My apologies for any wrong terminology.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
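
A minimal Stata sketch of the two alternatives suggested in the reply at
the top of this message (the exporter-dummy 2sls and the inverse-frequency
re-weighting). It is an illustration under stated assumptions, not a
tested specification. The model variables are the ones Joe lists; the
names exporter, sizebin, p_export and ipw are hypothetical and introduced
only for this example. Treating wagerate, size and size2 as excluded
instruments is an assumption that simply borrows the variables appearing
only in the EXPINT equation, and binning on size alone with 5 quantiles is
arbitrary. The code uses ivregress, which is available in newer Stata
releases (older releases used ivreg), and the /// line continuation, which
works in do-files.

```stata
* exporter dummy: 1 if the establishment exports at all
* (the if-qualifier keeps a missing EXPINT from being coded as 1)
generate byte exporter = EXPINT > 0 if !missing(EXPINT)

* (a) 2sls for the IRD equation with two right-hand endogenous
*     variables, EXPINT and exporter; wagerate, size and size2 are
*     ASSUMED here to be valid excluded instruments
ivregress 2sls IRD drd droy skill outshare gov forg ///
    (EXPINT exporter = wagerate size size2)

* (b) inverse-frequency weights: bin on an x variable (size is an
*     arbitrary choice), compute the share of exporters in each bin,
*     and weight each exporter by the inverse of that share
xtile sizebin = size, nquantiles(5)
bysort sizebin: egen double p_export = mean(exporter)
generate double ipw = 1/p_export if exporter == 1

* re-weighted 2sls on the exporters only
ivregress 2sls IRD drd droy skill outshare gov forg ///
    (EXPINT = wagerate size size2) if exporter == 1 [pweight = ipw]
```

Comparing (a), (b) and the unweighted estimates is one way to check
whether the relationships are robust to how the data are cut.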
