Dec 9, 2017

Regression tables in R

It is common to present regression results as tables, which uses space in articles more efficiently and makes documents easier to read. The following examples use the wage1.txt dataset from the book Introductory Econometrics: A Modern Approach, by J. Wooldridge (2009). With the stargazer package, the results of several equations can be displayed elegantly, either as plain text or as LaTeX code.

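# Tab-delimited data file; attach() is unnecessary because each model passes data = wage1
wage1 <- read.delim("wage1.txt")

# Three specifications of the wage equation
result1 <- lm(wage ~ educ, data = wage1)
result2 <- lm(wage ~ educ + exper, data = wage1)
result3 <- lm(wage ~ educ + I(educ^2) + exper + I(exper^2), data = wage1)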

library(stargazer)
stargazer(result1, result2, result3, type = "text")

===========================================================================================
                                              Dependent variable:                         
                    -----------------------------------------------------------------------
                                                     wage                                 
                              (1)                     (2)                     (3)         
-------------------------------------------------------------------------------------------
educ                       0.644***                0.644***                -0.541**       
                            (0.054)                 (0.054)                 (0.230)       
                                                                                          
I(educ2)                                                                   0.048***       
                                                                            (0.009)       
                                                                                          
exper                      0.070***                0.070***                0.277***       
                            (0.011)                 (0.011)                 (0.036)       
                                                                                          
I(exper2)                                                                  -0.005***      
                                                                            (0.001)       
                                                                                          
Constant                   -3.391***               -3.391***                 2.354        
                            (0.767)                 (0.767)                 (1.448)       
                                                                                           
-------------------------------------------------------------------------------------------
Observations                  526                     526                     526         
R2                           0.225                   0.225                   0.304        
Adjusted R2                  0.222                   0.222                   0.298        
Residual Std. Error    3.257 (df = 523)        3.257 (df = 523)        3.094 (df = 521)   
F Statistic         75.990*** (df = 2; 523) 75.990*** (df = 2; 523) 56.774*** (df = 4; 521)
===========================================================================================
Note:                                                           *p<0.1; **p<0.05; ***p<0.01

Note that column (1) reports the same estimates as column (2), including a coefficient on exper, although result1 as defined above contains only educ. The tables shown in this post appear to come from a run in which the first two models shared the same specification.


Next, we change only a few arguments of the stargazer function.

-   type = "text": omitting this argument from the previous code returns the LaTeX code of the table. Accepted values are "latex" (default), "html" and "text".
-   keep.stat="n": restricts the model statistics shown at the bottom of the table, here to the number of observations only.
-   single.row=FALSE: controls whether the standard error of each coefficient is placed below the estimate (FALSE) or beside it on the same row (TRUE).
-   intercept.bottom=FALSE: controls where the intercept appears; FALSE places it at the top of the table, TRUE at the bottom.

stargazer(result1, result2, result3, header=FALSE,
  type = "text",
  title="Tabla 1. Comparación de modelos",
  keep.stat="n",digits=2, single.row=FALSE,
  intercept.bottom=FALSE)

Tabla 1. Comparación de modelos
==========================================
                  Dependent variable:    
             -----------------------------
                         wage            
                (1)       (2)       (3)  
------------------------------------------
Constant     -3.39***  -3.39***    2.35  
              (0.77)    (0.77)    (1.45) 
                                         
educ          0.64***   0.64***   -0.54**
              (0.05)    (0.05)    (0.23) 
                                         
I(educ2)                          0.05***
                                  (0.01) 
                                         
exper         0.07***   0.07***   0.28***
              (0.01)    (0.01)    (0.04) 
                                          
I(exper2)                        -0.005***
                                  (0.001)
                                         
------------------------------------------
Observations    526       526       526  
==========================================
Note:          *p<0.1; **p<0.05; ***p<0.01


stargazer(result1, result2, result3, header=FALSE,
  type = "text",
  title="Tabla 1. Comparación de modelos",
  keep.stat="n",digits=2, single.row=TRUE,
  intercept.bottom=FALSE)

Tabla 1. Comparación de modelos
==============================================================
                            Dependent variable:              
             -------------------------------------------------
                                   wage                      
                   (1)             (2)              (3)      
--------------------------------------------------------------
Constant     -3.39*** (0.77) -3.39*** (0.77)    2.35   (1.45)  
educ          0.64*** (0.05)  0.64*** (0.05)   -0.54** (0.23) 
I(educ2)                                      0.05***  (0.01) 
exper         0.07*** (0.01)  0.07*** (0.01)  0.28***  (0.04) 
I(exper2)                                    -0.005*** (0.001)
--------------------------------------------------------------
Observations       526             526              526      
==============================================================
Note:                              *p<0.1; **p<0.05; ***p<0.01
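
The type argument described above can be sketched as follows. This is a minimal illustration rather than part of the original post: it uses the built-in mtcars data as a stand-in (so it runs without wage1.txt), and the names fit1, fit2 and html_path are hypothetical. It relies on the out argument of stargazer to write the table to a file.

```r
library(stargazer)

# Stand-in models on the built-in mtcars data (not the wage1 data used in this post)
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)

# LaTeX is the default, so type = "latex" could be left out entirely
latex_lines <- capture.output(
  stargazer(fit1, fit2, type = "latex", header = FALSE)
)

# type = "html" prints an HTML table; out = also saves the table to a file
html_path <- file.path(tempdir(), "models.html")
html_lines <- capture.output(
  stargazer(fit1, fit2, type = "html", out = html_path)
)

head(latex_lines, 3)    # first lines of the LaTeX code
file.exists(html_path)  # the HTML file was written
```

The LaTeX lines can be pasted into a document, while the HTML file can be opened in a browser or imported into a word processor.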


With the covariate.labels argument, we can change the names of the variables shown in the table.

stargazer(result1, result2, result3, header=FALSE,
  type = "text",
  title="Tabla 1. Comparación de modelos",
  digits=2, single.row=FALSE,
  intercept.bottom=TRUE,
  covariate.labels=c("Educación","Educación2","Experiencia","Experiencia2"),
  omit.stat=c("LL","ser","f")
   )

Tabla 1. Comparación de modelos
==========================================
                  Dependent variable:    
             -----------------------------
                         wage            
                (1)       (2)       (3)  
------------------------------------------
Educación     0.64***   0.64***   -0.54**
              (0.05)    (0.05)    (0.23) 
                                         
Educación2                        0.05***
                                  (0.01) 
                                          
Experiencia   0.07***   0.07***   0.28***
              (0.01)    (0.01)    (0.04) 
                                         
Experiencia2                     -0.005***
                                  (0.001)
                                          
Constant     -3.39***  -3.39***    2.35  
              (0.77)    (0.77)    (1.45) 
                                         
------------------------------------------
Observations    526       526       526  
R2             0.23      0.23      0.30  
Adjusted R2    0.22      0.22      0.30  
==========================================
Note:          *p<0.1; **p<0.05; ***p<0.01

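However, the previous command omits the residual standard error and the F statistic, which reports the joint significance of the model's variables. Dropping omit.stat restores them, and df = FALSE hides their degrees of freedom.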

stargazer(result1, result2, result3, header=FALSE,
  type = "text",
  title="Tabla 1. Comparación de modelos",
  digits=2, single.row=FALSE,
  intercept.bottom=TRUE,
  df = FALSE
)

Tabla 1. Comparación de modelos
=================================================
                         Dependent variable:    
                    -----------------------------
                                wage            
                       (1)       (2)       (3)  
-------------------------------------------------
educ                 0.64***   0.64***   -0.54**
                     (0.05)    (0.05)    (0.23) 
                                                
I(educ2)                                 0.05***
                                         (0.01) 
                                                
exper                0.07***   0.07***   0.28***
                     (0.01)    (0.01)    (0.04) 
                                                 
I(exper2)                               -0.005***
                                         (0.001)
                                                
Constant            -3.39***  -3.39***    2.35  
                     (0.77)    (0.77)    (1.45) 
                                                
-------------------------------------------------
Observations           526       526       526  
R2                    0.23      0.23      0.30  
Adjusted R2           0.22      0.22      0.30  
Residual Std. Error   3.26      3.26      3.09  
F Statistic         75.99***  75.99***  56.77***
=================================================
Note:                 *p<0.1; **p<0.05; ***p<0.01
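
The statistics at the bottom of the table can also be read directly from the fitted model, which helps to see what each row reports. The sketch below uses base R only and the built-in mtcars data as a stand-in for wage1; the object names are hypothetical, and the codes in the comments are the ones accepted by keep.stat and omit.stat.

```r
# Stand-in model on mtcars (not the wage1 data used in this post)
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

n_obs  <- nobs(fit)        # "n": Observations
r2     <- s$r.squared      # "rsq": R2
adj_r2 <- s$adj.r.squared  # "adj.rsq": Adjusted R2
rse    <- s$sigma          # "ser": Residual Std. Error
f_stat <- s$fstatistic     # "f": F statistic with its two degrees of freedom

# p-value of the F test of joint significance of all regressors
f_pvalue <- pf(f_stat[["value"]], f_stat[["numdf"]], f_stat[["dendf"]],
               lower.tail = FALSE)

round(c(n = n_obs, R2 = r2, adjR2 = adj_r2, RSE = rse,
        F = f_stat[["value"]], p = f_pvalue), 3)
```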

Additionally, the standard errors can be replaced by confidence intervals for the estimated coefficients at a chosen confidence level (90% in this example).

stargazer(result1, result2, result3, header=FALSE,
  type = "text",
  title="Tabla 1. Comparación de modelos",
  digits=2, single.row=FALSE,
  intercept.bottom=TRUE,
  omit.stat=c("LL","ser","f"),
  ci=TRUE,
  ci.level=0.90)

Tabla 1. Comparación de modelos
==========================================================
                          Dependent variable:            
             ---------------------------------------------
                                 wage                    
                  (1)            (2)             (3)     
----------------------------------------------------------
educ            0.64***        0.64***         -0.54**   
              (0.56, 0.73)   (0.56, 0.73)  (-0.92, -0.16)
                                                          
I(educ2)                                       0.05***   
                                            (0.03, 0.06) 
                                                         
exper           0.07***        0.07***         0.28***   
              (0.05, 0.09)   (0.05, 0.09)   (0.22, 0.34) 
                                                         
I(exper2)                                     -0.005***  
                                           (-0.01, -0.004)
                                                          
Constant        -3.39***       -3.39***         2.35     
             (-4.65, -2.13) (-4.65, -2.13)  (-0.03, 4.74)
                                                         
----------------------------------------------------------
Observations      526            526             526     
R2                0.23           0.23           0.30     
Adjusted R2       0.22           0.22           0.30     
==========================================================
Note:                          *p<0.1; **p<0.05; ***p<0.01
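
As a worked check, the interval reported for educ in column (1) can be reproduced by hand from the coefficient and standard error shown in the first table. The sketch assumes that the interval is the estimate plus or minus a standard normal critical value times the standard error; that is an assumption about how the interval is built, not something stated in this post. A t-based critical value is computed alongside for comparison.

```r
# Values read from column (1) of the first table: coefficient and standard error of educ
estimate  <- 0.644
std_error <- 0.054
ci_level  <- 0.90

# Two-sided critical value from the standard normal distribution (assumed)
z <- qnorm(1 - (1 - ci_level) / 2)

ci <- estimate + c(-1, 1) * z * std_error
round(ci, 2)
# 0.56 0.73, matching the (0.56, 0.73) interval in the table

# For comparison: t-based critical value with 523 residual degrees of freedom
t_crit <- qt(1 - (1 - ci_level) / 2, df = 523)
ci_t <- estimate + c(-1, 1) * t_crit * std_error
```

With 523 degrees of freedom the two critical values are nearly identical, so the choice matters little here; in small samples the t-based interval, as returned by confint(), would be noticeably wider.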

The style of the tables can also be changed; style = "qje" follows the layout of the Quarterly Journal of Economics.

stargazer(result1, result2, result3, header=FALSE,
  type = "text",
  title="Tabla 1. Comparación de modelos",
  digits=2, single.row=FALSE,
  intercept.bottom=TRUE,
  omit.stat=c("LL","ser","f"),
  style = "qje"
 )

Tabla 1. Comparación de modelos
====================================================
                              wage                 
                 (1)           (2)          (3)    
----------------------------------------------------
educ           0.64***       0.64***      -0.54**  
               (0.05)        (0.05)        (0.23)  
                                                    
I(educ2)                                  0.05***  
                                           (0.01)  
                                                   
exper          0.07***       0.07***      0.28***  
               (0.01)        (0.01)        (0.04)  
                                                   
I(exper2)                                -0.005*** 
                                          (0.001)  
                                                   
Constant      -3.39***      -3.39***        2.35   
               (0.77)        (0.77)        (1.45)  
                                                   
N                526           526          526    
R2              0.23          0.23          0.30   
Adjusted R2     0.22          0.22          0.30   
====================================================
Notes:        ***Significant at the 1 percent level.
               **Significant at the 5 percent level.
               *Significant at the 10 percent level.

Column labels can be added to name the models in the table:

stargazer(result1, result2, result3, header=FALSE,
  type = "text",
  title="Tabla 1. Comparación de modelos",
  digits=2, single.row=FALSE,
  intercept.bottom=TRUE,
  omit.stat=c("LL","ser","f"),
  column.labels = c("Model1", "Model2", "Model3")
 )

Tabla 1. Comparación de modelos
==========================================
                  Dependent variable:    
             -----------------------------
                         wage            
              Model1    Model2    Model3 
                (1)       (2)       (3)  
------------------------------------------
educ          0.64***   0.64***   -0.54**
              (0.05)    (0.05)    (0.23) 
                                         
I(educ2)                          0.05***
                                  (0.01) 
                                         
exper         0.07***   0.07***   0.28***
              (0.01)    (0.01)    (0.04) 
                                         
I(exper2)                        -0.005***
                                  (0.001)
                                         
Constant     -3.39***  -3.39***    2.35  
              (0.77)    (0.77)    (1.45) 
                                         
------------------------------------------
Observations    526       526       526  
R2             0.23      0.23      0.30  
Adjusted R2    0.22      0.22      0.30  
==========================================
Note:          *p<0.1; **p<0.05; ***p<0.01


References

Hlavac, Marek (2015). stargazer: Well-Formatted Regression and Summary Statistics Tables. R package version 5.2. http://CRAN.R-project.org/package=stargazer

Hlavac, Marek (2015). stargazer: beautiful LATEX, HTML and ASCII tables from R statistical output. Harvard University. Accessed 2 Dec 2017.

Russ, Jake (n.d.). Stargazer. Accessed 2 Dec 2017. https://www.jakeruss.com/cheatsheets/stargazer/

Torres, O. (2014). Using stargazer to report regression output and descriptive statistics in R (for non-LaTeX users). Princeton University. Accessed 2 Dec 2017. https://www.princeton.edu/~otorres/NiceOutputR.pdf
