26 Aug 2014

Solved econometrics exercises using Stata (Chapter 4 of Wooldridge (2009))


Chapter 4 - Multiple Regression Analysis: Inference

*Exercise C4.1 vote1.csv
insheet using "C:\Users\Nerys\Documents\Biblioteca\Econometria, libos ebooks\Solucion a ejercicios de econometria\Base de datos wooldridge\vote1.csv ", comma clear

i.
*Holding the other factors fixed, a 1% increase in candidate A's spending changes the percentage of votes obtained by about B1/100 percentage points.

ii.
*H0: _b[lexpenda] = 1

iii.
. gen lexpenda=ln( expenda)
. regress votea lexpenda lexpendb prtystra

      Source |       SS       df       MS              Number of obs =     173
-------------+------------------------------           F(  3,   169) =  215.15
       Model |  38402.1673     3  12800.7224           Prob > F      =  0.0000
    Residual |  10055.0813   169  59.4975224           R-squared     =  0.7925
-------------+------------------------------           Adj R-squared =  0.7888
       Total |  48457.2486   172  281.728189           Root MSE      =  7.7135

------------------------------------------------------------------------------
       votea |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lexpenda |   6.081334   .3821187    15.91   0.000     5.326994    6.835675
    lexpendb |  -6.615268   .3788756   -17.46   0.000    -7.363206   -5.867329
    prtystra |   .1520142   .0620267     2.45   0.015     .0295674    .2744611
       _cons |   45.08597    3.92679    11.48   0.000     37.33409    52.83785
------------------------------------------------------------------------------

*3.1. Yes, it does (P>|t| = 0.000, so H0: B = 0 is rejected).
*3.2. Yes, it does (P>|t| = 0.000, so H0: B = 0 is rejected).
*3.3. No, not directly. The t statistic has to be rebuilt: by default the software tests against a hypothesized value of zero, whereas here the hypothesized value is 1.

iv.
. scalar tvalue=(_b[lexpenda]-1)/_se[lexpenda]
. scalar pvalue=ttail(169, tvalue)
. display "T-value: " tvalue ", P-value: " pvalue
T-value: 13.297791, P-value: 2.249e-28
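
The calculation above uses ttail(), which returns a one-sided (upper-tail) p-value. As a minimal sketch, the same statistic and a two-sided p-value can be reproduced from the numbers printed in the regression table, without the data in memory. The scalar names are illustrative, and the commented lines assume vote1 is loaded and the regression has just been run.

```stata
* Estimates copied from the regression table above (rounded as printed)
scalar b_lexpenda  = 6.081334
scalar se_lexpenda = .3821187
scalar df_resid    = 169

* t statistic for H0: coefficient on lexpenda equals 1
scalar t1 = (b_lexpenda - 1)/se_lexpenda

* Two-sided p-value: twice the upper-tail probability of |t|
scalar p2 = 2*ttail(df_resid, abs(t1))
display "t = " t1 ",  two-sided p = " p2

* With the data in memory, the built-in equivalent would be:
* test lexpenda = 1
* If the hypothesis of interest were instead that the two spending
* effects offset each other (sum of the coefficients equal to zero):
* lincom lexpenda + lexpendb
```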

*Exercise C4.2 lawsch85.csv
insheet using "C:\Users\Nerys\Documents\Biblioteca\Econometria, libos ebooks\Solucion a ejercicios de econometria\Base de datos wooldridge\lawsch85.csv ", comma clear

i.
. gen lsalary=ln(salary)
. gen lcost=ln(cost)
. gen llibvol=ln( libvol)
. regress lsalary lsat gpa llibvol lcost rank

      Source |       SS       df       MS              Number of obs =     136
-------------+------------------------------           F(  5,   130) =  138.23
       Model |  8.73362207     5  1.74672441           Prob > F      =  0.0000
    Residual |  1.64272974   130  .012636383           R-squared     =  0.8417
-------------+------------------------------           Adj R-squared =  0.8356
       Total |  10.3763518   135  .076861865           Root MSE      =  .11241

------------------------------------------------------------------------------
     lsalary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lsat |   .0046965   .0040105     1.17   0.244    -.0032378    .0126308
         gpa |   .2475239    .090037     2.75   0.007     .0693964    .4256514
     llibvol |   .0949932   .0332543     2.86   0.005     .0292035     .160783
       lcost |   .0375538   .0321061     1.17   0.244    -.0259642    .1010718
        rank |  -.0033246   .0003485    -9.54   0.000     -.004014   -.0026352
       _cons |   8.343226   .5325192    15.67   0.000       7.2897    9.396752
------------------------------------------------------------------------------

*For rank, t = -9.54. With 130 degrees of freedom the standard normal table can be used (one-sided 5% critical value = -1.645), so the statistic falls in the rejection region and H0 is rejected (P>|t| also shows evidence against H0).
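
As a quick check of the critical value quoted above, the following sketch compares the exact t critical value for 130 residual degrees of freedom with the normal approximation, and computes the one-sided p-value for the reported t statistic. It uses only built-in Stata functions and the numbers printed in the table.

```stata
* Upper-tail 5% critical value of the t distribution with 130 df
* (by symmetry, the lower-tail critical value is its negative)
display invttail(130, 0.05)

* Lower-tail 5% critical value of the standard normal
display invnormal(0.05)

* One-sided p-value for the reported t = -9.54:
* P(T < -9.54) equals P(T > 9.54) by symmetry
display ttail(130, 9.54)
```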

ii.
*Individually: gpa is significant (p = 0.007), whereas lsat is not (p = 0.244).
*Jointly, they can be tested with an F test that compares the model above with one that omits these two variables.

regress lsalary lsat gpa llibvol lcost rank
estimates store mz24, title(Model No_Rest)

regress lsalary llibvol lcost rank
estimates store mz26, title(Model Rest)

estout mz24 mz26, cells(b(star fmt(3)) se(par fmt(3))) legend label varlabels(_cons constant) stats(N r2 rss) title(Models of votes)

Models of votes
----------------------------------------------------
                     Model No_R~t      Model Rest  
                             b/se            b/se  
----------------------------------------------------
LSAT                        0.005                  
                          (0.004)                  
GPA                         0.248**                
                          (0.090)                  
llibvol                     0.095**         0.129***
                          (0.033)         (0.033)  
lcost                       0.038           0.027  
                          (0.032)         (0.030)  
rank                       -0.003***       -0.004***
                          (0.000)         (0.000)  
constant                    8.343***        9.880***
                          (0.533)         (0.343)  
----------------------------------------------------
N                         136.000         141.000  
r2                          0.842           0.822  
rss                         1.643           1.909  
----------------------------------------------------
* p<0.05, ** p<0.01, *** p<0.001
    
. scalar F=((1.909-1.643)/2)/(1.643/(136-5-1))
. display F
10.523433

. display invF(2,130,.95)
3.0658391

*Therefore the F test rejects the null hypothesis that the two variables are jointly insignificant.
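
One caveat on the hand calculation: regress drops observations with missing values, so the two models in the table are fitted on different samples (N = 136 and N = 141) and their RSS values are not strictly comparable. A minimal sketch of two ways around this, assuming lawsch85 is in memory and lsalary, lcost and llibvol were generated as above:

```stata
* Unrestricted model
regress lsalary lsat gpa llibvol lcost rank

* F test of H0: the coefficients on lsat and gpa are both zero,
* computed on the estimation sample of the unrestricted model
test lsat gpa

* By-hand alternative: re-estimate the restricted model on the same
* observations. e(sample) refers to the regression that is active
* when this command is issued, i.e. the unrestricted one.
regress lsalary llibvol lcost rank if e(sample)
```

The RSS from this last regression could then be used in the same F formula as above.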

iii. *To be corrected: model 3 does not account for missing values (N differs across the models)
eststo clear
regress lsalary llibvol lcost rank
estimates store mz26, title(Model Rest)

regress lsalary lsat gpa llibvol lcost rank
estimates store mz24, title(Model No_Rest)

regress lsalary lsat gpa llibvol lcost rank clsize faculty
estimates store mlo27, title(Model No_Rest2)

estout mz26 mz24 mlo27, cells(b(star fmt(3)) se(par fmt(3))) legend label varlabels(_cons constant) stats(N r2 rss) title(Models of votes3)

Models of votes3
--------------------------------------------------------------------
                       Model Rest    Model No_R~t    Model No_R~2  
                             b/se            b/se            b/se  
--------------------------------------------------------------------
llibvol                     0.129***        0.095**         0.055  
                          (0.033)         (0.033)         (0.040)  
lcost                       0.027           0.038           0.030  
                          (0.030)         (0.032)         (0.035)  
rank                       -0.004***       -0.003***       -0.003***
                          (0.000)         (0.000)         (0.000)  
LSAT                                        0.005           0.006  
                                          (0.004)         (0.004)  
GPA                                         0.248**         0.266**
                                          (0.090)         (0.093)  
clsize                                                      0.000  
                                                          (0.000)  
faculty                                                     0.000  
                                                          (0.000)  
constant                    9.880***        8.343***        8.416***
                          (0.343)         (0.533)         (0.552)  
--------------------------------------------------------------------
N                         141.000         136.000         131.000  
r2                          0.822           0.842           0.844  
rss                         1.909           1.643           1.573  
--------------------------------------------------------------------
* p<0.05, ** p<0.01, *** p<0.001
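
Since the three columns are estimated on 141, 136 and 131 observations, comparing their RSS values directly mixes samples. A sketch of how the joint significance of clsize and faculty could be tested on a single sample, assuming lawsch85 is in memory with the log variables generated as above:

```stata
* Largest model; observations with missing clsize or faculty are dropped
regress lsalary lsat gpa llibvol lcost rank clsize faculty

* F test of H0: the coefficients on clsize and faculty are both zero,
* using only the observations in this regression
test clsize faculty
```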


*Exercise C4.3 hprice1.csv
insheet using "C:\Users\Nerys\Documents\Biblioteca\Econometria, libos ebooks\Solucion a ejercicios de econometria\Base de datos wooldridge\hprice1.csv ", comma clear

i.
. regress lprice sqrft bdrms

      Source |       SS       df       MS              Number of obs =      88
-------------+------------------------------           F(  2,    85) =   60.73
       Model |  4.71671294     2  2.35835647           Prob > F      =  0.0000
    Residual |  3.30088996    85  .038833999           R-squared     =  0.5883
-------------+------------------------------           Adj R-squared =  0.5786
       Total |   8.0176029    87  .092156355           Root MSE      =  .19706

------------------------------------------------------------------------------
      lprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       sqrft |   .0003794   .0000432     8.78   0.000     .0002935    .0004654
       bdrms |   .0288844   .0296433     0.97   0.333    -.0300544    .0878231
       _cons |   4.766028   .0970445    49.11   0.000     4.573077    4.958978
------------------------------------------------------------------------------

scalar theta1=(150*_b[sqrft])+_b[bdrms]
. display theta1
.08580125

ii.
*Solving for the bdrms coefficient: _b[bdrms] = theta1 - (150*_b[sqrft])
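
Substituting this expression into the model gives lprice = b0 + b1*(sqrft - 150*bdrms) + theta1*bdrms + u, so theta1 can be estimated directly together with its standard error. A minimal sketch, assuming hprice1 is in memory; the variable name sqrft150 is made up for this illustration.

```stata
* Reparametrized regressor: sqrft - 150*bdrms
gen sqrft150 = sqrft - 150*bdrms

* The coefficient on bdrms is now theta1, reported with its
* standard error and 95% confidence interval
regress lprice sqrft150 bdrms

* Alternative without reparametrizing: estimate the original model
* and ask for the linear combination directly
regress lprice sqrft bdrms
lincom 150*sqrft + bdrms
```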

*Exercise C4.4 bwght.csv
insheet using "C:\Users\Nerys\Documents\Biblioteca\Econometria, libos ebooks\Solucion a ejercicios de econometria\Base de datos wooldridge\bwght.csv ", comma clear

eststo clear
regress bwght cigs parity faminc
estimates store mz2, title(Model Rest)

regress bwght cigs parity faminc motheduc fatheduc
estimates store mz1, title(Model No_Rest)

estout mz2 mz1, cells(b(star fmt(3)) se(par fmt(3))) legend label varlabels(_cons constant) stats(N r2 rss) title(Models. Parental education and birth weight)

Models. Parental education and birth weight
----------------------------------------------------
                       Model Rest    Model No_R~t  
                             b/se            b/se  
----------------------------------------------------
cigs                       -0.477***       -0.596***
                          (0.092)         (0.110)  
parity                      1.616**         1.788**
                          (0.604)         (0.659)  
faminc                      0.098***        0.056  
                          (0.029)         (0.037)  
motheduc                                   -0.370  
                                          (0.320)  
fatheduc                                    0.472  
                                          (0.283)  
constant                  114.214***      114.524***
                          (1.469)         (3.728)  
----------------------------------------------------
N                        1388.000        1191.000  
r2                          0.035           0.039  
rss                    554615.199      464041.135  
----------------------------------------------------
* p<0.05, ** p<0.01, *** p<0.001
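
The two columns are estimated on different samples (N = 1388 and N = 1191, because motheduc and fatheduc have missing values), so an F statistic built from the two RSS values in the table would not be valid. A sketch of a joint test on a common sample, assuming bwght is in memory:

```stata
* Model with the parents' education variables
regress bwght cigs parity faminc motheduc fatheduc

* F test of H0: the coefficients on motheduc and fatheduc are both zero
test motheduc fatheduc
```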


*Exercise C4.5 mlb1.csv
insheet using "C:\Users\Nerys\Documents\Biblioteca\Econometria, libos ebooks\Solucion a ejercicios de econometria\Base de datos wooldridge\mlb1.csv ", comma clear

eststo clear
regress lsalary years gamesyr bavg hrunsyr
estimates store mz3, title(Model 1)

regress lsalary years gamesyr bavg hrunsyr rbisyr
estimates store mz4, title(Model 2)

estout mz3 mz4, cells(b(star fmt(3)) p se(par fmt(3))) legend label varlabels(_cons constant) stats(N r2 rss) title(Models. Major league salaries)

Models. Major league salaries
----------------------------------------------------
                          Model 1         Model 2  
                           b/p/se          b/p/se  
----------------------------------------------------
years                       0.068***        0.069***
                            0.000           0.000  
                          (0.012)         (0.012)  
gamesyr                     0.016***        0.013***
                            0.000           0.000  
                          (0.002)         (0.003)  
bavg                        0.001           0.001  
                            0.184           0.376  
                          (0.001)         (0.001)  
hrunsyr                     0.036***        0.014  
                            0.000           0.369  
                          (0.007)         (0.016)  
rbisyr                                      0.011  
                                            0.134  
                                          (0.007)  
constant                   11.021***       11.192***
                            0.000           0.000  
                          (0.266)         (0.289)  
----------------------------------------------------
N                         353.000         353.000  
r2                          0.625           0.628  
rss                       184.375         183.186  
----------------------------------------------------

*When rbisyr is dropped (Model 2 versus Model 1), hrunsyr goes from not significant (p = 0.369) to significant (p = 0.000), and its coefficient rises from 0.014 to 0.036.

ii.

regress lsalary years gamesyr bavg hrunsyr runsyr fldperc sbasesyr
estimates store mzl5, title(Model 3)

estout mz4 mz3 mzl5, cells(b(star fmt(3)) p se(par fmt(3))) legend label varlabels(_cons constant) stats(N r2 rss) title(Models. Major league salaries)

Models. Major league salaries
--------------------------------------------------------------------
                          Model 2         Model 1         Model 3  
                           b/p/se          b/p/se          b/p/se  
--------------------------------------------------------------------
years                       0.069***        0.068***        0.070***
                            0.000           0.000           0.000  
                          (0.012)         (0.012)         (0.012)  
gamesyr                     0.013***        0.016***        0.008**
                            0.000           0.000           0.003  
                          (0.003)         (0.002)         (0.003)  
bavg                        0.001           0.001           0.001  
                            0.376           0.184           0.632  
                          (0.001)         (0.001)         (0.001)  
hrunsyr                     0.014           0.036***        0.023**
                            0.369           0.000           0.008  
                          (0.016)         (0.007)         (0.009)  
rbisyr                      0.011                                  
                            0.134                                  
                          (0.007)                                  
runsyr                                                      0.017***
                                                            0.001  
                                                          (0.005)  
fldperc                                                     0.001  
                                                            0.606  
                                                          (0.002)  
sbasesyr                                                   -0.006  
                                                            0.216  
                                                          (0.005)  
constant                   11.192***       11.021***       10.408***
                            0.000           0.000           0.000  
                          (0.289)         (0.266)         (2.003)  
--------------------------------------------------------------------
N                         353.000         353.000         353.000  
r2                          0.628           0.625           0.639  
rss                       183.186         184.375         177.665  
--------------------------------------------------------------------
* p<0.05, ** p<0.01, *** p<0.001

*Individually, only runsyr is significant among the added variables.

iii.
*Joint significance requires an F test between the last two models.
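
A sketch of that F test from the figures printed in the table, reading "the last two models" as Model 1 (restricted) and Model 3 (unrestricted). That reading is an assumption, and the restrictions being tested are then the three added variables runsyr, fldperc and sbasesyr. Both models use the same 353 observations, so the RSS values are comparable. The scalar names are illustrative, and the result carries the rounding of the printed RSS values.

```stata
* RSS values copied from the table above
* Model 1: years gamesyr bavg hrunsyr
scalar rss_r  = 184.375
* Model 3: Model 1 plus runsyr fldperc sbasesyr
scalar rss_ur = 177.665

* Number of restrictions and residual df of the unrestricted model
scalar q     = 3
scalar df_ur = 353 - 7 - 1

scalar F = ((rss_r - rss_ur)/q)/(rss_ur/df_ur)
display "F = " F
display "5% critical value = " invFtail(q, df_ur, 0.05)
display "p-value = " Ftail(q, df_ur, F)
```

With these rounded inputs F comes out at roughly 4.3. With the data in memory, running test runsyr fldperc sbasesyr after the Model 3 regression would give the same test without rounding.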

*Exercise C4.6 wage2.csv
insheet using "C:\Users\Nerys\Documents\Biblioteca\Econometria, libos ebooks\Solucion a ejercicios de econometria\Base de datos wooldridge\wage2.csv ", comma clear

i.
. regress lwage educ exper tenure
. *H0: _b[exper] = _b[tenure]

. testparm exper tenure, equal

 ( 1)  - exper + tenure = 0

       F(  1,   931) =    0.17
            Prob > F =    0.6805


. test exper =tenure

 ( 1)  exper - tenure = 0

       F(  1,   931) =    0.17
            Prob > F =    0.6805
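
Both commands report only the F statistic. If the size of the difference and a confidence interval are also of interest, lincom is one option. A minimal sketch, assuming wage2 is in memory:

```stata
regress lwage educ exper tenure

* Point estimate, standard error, t statistic and 95% confidence
* interval for the difference between the two coefficients
lincom exper - tenure
```

The squared t statistic from lincom should match the F(1, 931) statistic shown above.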


*Exercise C4.7 twoyear.csv
insheet using "C:\Users\Nerys\Documents\Biblioteca\Econometria, libos ebooks\Solucion a ejercicios de econometria\Base de datos wooldridge\twoyear.csv ", comma clear

i.

. summ phsrank

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     phsrank |      6763    56.15703    24.27296          0         99


ii.

. regress  lwage jc totcoll exper phsrank

      Source |       SS       df       MS              Number of obs =    6763
-------------+------------------------------           F(  4,  6758) =  483.85
       Model |  358.050584     4   89.512646           Prob > F      =  0.0000
    Residual |  1250.24551  6758  .185002295           R-squared     =  0.2226
-------------+------------------------------           Adj R-squared =  0.2222
       Total |  1608.29609  6762  .237843255           Root MSE      =  .43012

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          jc |  -.0093108   .0069693    -1.34   0.182    -.0229728    .0043512
     totcoll |   .0754757   .0025588    29.50   0.000     .0704595    .0804918
       exper |   .0049396   .0001575    31.36   0.000     .0046308    .0052483
     phsrank |   .0003032   .0002389     1.27   0.204    -.0001651    .0007716
       _cons |   1.458747   .0236211    61.76   0.000     1.412442    1.505052
------------------------------------------------------------------------------

. display _b[phsrank]*10
.00303232

iii.

eststo clear
regress lwage jc totcoll exper phsrank
estimates store mzl3, title(Model 1)

regress lwage jc totcoll exper
estimates store mzl4, title(Model 2)

estout mzl3 mzl4, cells(b(star fmt(3)) p se(par fmt(3))) legend label varlabels(_cons constant) stats(N r2 rss) title(Models. Wages and high school)

Models. Wages and high school
----------------------------------------------------
                          Model 1         Model 2  
                           b/p/se          b/p/se  
----------------------------------------------------
jc                         -0.009          -0.010  
                            0.182           0.142  
                          (0.007)         (0.007)  
totcoll                     0.075***        0.077***
                            0.000           0.000  
                          (0.003)         (0.002)  
exper                       0.005***        0.005***
                            0.000           0.000  
                          (0.000)         (0.000)  
phsrank                     0.000                  
                            0.204                  
                          (0.000)                  
constant                    1.459***        1.472***
                            0.000           0.000  
                          (0.024)         (0.021)  
----------------------------------------------------
N                        6763.000        6763.000  
r2                          0.223           0.222  
rss                      1250.246        1250.544  
----------------------------------------------------
* p<0.05, ** p<0.01, *** p<0.001
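
Both columns use the same 6763 observations, so the RSS values can be compared directly. As a small consistency check, the sketch below builds the F statistic for excluding phsrank from the printed RSS values and sets it beside the square of the reported t statistic (1.27). With a single restriction the two should agree up to rounding. The scalar names are illustrative.

```stata
* RSS values copied from the table above
* Model 2: without phsrank
scalar rss_r  = 1250.544
* Model 1: with phsrank
scalar rss_ur = 1250.246

* Residual df of the model with phsrank
scalar df_ur = 6763 - 4 - 1

scalar F = ((rss_r - rss_ur)/1)/(rss_ur/df_ur)
display "F = " F ",  t^2 = " 1.27^2
display "p-value = " Ftail(1, df_ur, F)
```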

