First R class

REGRESSION CLASS IN R
https://www.r-project.org/
Getting Started
R is a free software environment for statistical computing and graphics.

It compiles and runs on a wide variety of UNIX platforms, Windows and
MacOS. To download R, please choose your preferred CRAN mirror.
Console
The next step is to install an IDE for R; in this case we will use RStudio
R functions are grouped into packages (often loosely called libraries). The packages with the
most commonly used functions ship by default with the R distribution, and the rest are
available from the Comprehensive R Archive Network (CRAN). The entities that R creates and
manipulates are called objects. These objects can be:
• Scalars: numbers, characters, logicals (booleans), factors
• Vectors/matrices/lists of scalars
• Functions
• Ad-hoc objects
These objects are stored in a workspace. During an R session all objects are held in memory,
and they can be saved to disk for later sessions
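As a minimal sketch of the object kinds listed above, the following creates one of each and then saves two of them to disk and restores them. All object names and values here are made up for illustration, and the file goes to a temporary location rather than the class folder.

```r
# Hypothetical objects, one of each kind mentioned in the notes
x_num <- 3.14                               # number (a length-1 vector in R)
x_chr <- "PIB"                              # character
x_lgl <- TRUE                               # logical (boolean)
x_fac <- factor(c("low", "high", "low"))    # factor with two levels
v     <- c(1, 2, 3)                         # numeric vector
m     <- matrix(1:6, nrow = 2)              # 2 x 3 matrix
l     <- list(name = "gasto", values = v)   # list
sq    <- function(z) z^2                    # function

# Save two objects to disk and restore them into a separate environment
path <- tempfile(fileext = ".RData")
save(v, m, file = path)
restored <- new.env()
load(path, envir = restored)
ls(restored)                                # names of the restored objects
```

In an interactive session, save.image() would store the whole workspace instead of selected objects.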
1) Step 1: open R and set the directory
Once the R console is open, we set the working directory
DIRECTORY … on the right in the …. And we choose the folder where we put all the documents for the practice session
To set the working directory: More > Set As Working Directory
The console then shows the equivalent R command
setwd("C:/Users/Usuario/Desktop/practica R clase")
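A small sketch of the same step done entirely from code may help. The folder name below is hypothetical and is created under a temporary directory so that the example runs on any machine; in class you would use your own path instead.

```r
# Hypothetical folder; replace it with the folder that holds the class files
carpeta <- file.path(tempdir(), "practica_R_clase")
dir.create(carpeta, showWarnings = FALSE)

anterior <- getwd()    # remember the current working directory
setwd(carpeta)         # the same call that the console echoes
basename(getwd())      # confirm where we are now
setwd(anterior)        # go back, so the sketch leaves no side effects
```

Note that R paths use forward slashes (or doubled backslashes) even on Windows.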
2) Script
To be able to save our work
The script starts out blank so that we can save all the work we do. The first step is to
save it, and from then on we write the commands in it
# with the hash sign I can write down the class notes as comments
# we can add as many lines as we like, as reminders and to share with the team
# now we start by setting the working directory
setwd("C:/Users/Usuario/Desktop/practica R clase")
We had already set it, but this way it stays in our script for the next session, once …
# now we load the data, and we need to know whether the file is csv, text, spss...
THIS IS THE R CODE FOR WHAT WE ARE DOING THROUGH THE WINDOWS
library(readxl)
datos_pib_para_R <- read_excel("datos pib para R.xlsx")
View(datos_pib_para_R)
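The comment above asks which format the data file is in. As a sketch of the CSV case, the following writes a tiny made-up data set to a temporary file and reads it back with base R. The column names and values are hypothetical and are not the class data.

```r
# Hypothetical data written to a temporary CSV so the sketch is self-contained
ruta_csv <- tempfile(fileext = ".csv")
writeLines(c("Year,Gasto,PIB",
             "2001,10.5,100.0",
             "2002,11.2,104.3",
             "2003,12.0,109.1"), ruta_csv)

datos_csv <- read.csv(ruta_csv)   # comma-separated, header in the first row
str(datos_csv)                    # check the column types
head(datos_csv)                   # look at the first rows
# SPSS (.sav) files need a separate package, for example haven or foreign
```

For semicolon-separated files with decimal commas, read.csv2() is the base R counterpart.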
In R practically everything is defined as an object: a number, a vector, a data matrix or a
function are all objects, and R operates on them. Each object has a name, and typing that
name in the console displays its contents. R distinguishes between different kinds of objects
(a vector, a matrix and a function are not the same thing), each with its own
characteristics, known in R as its mode and its attributes. In short, an object is the form
in which R stores information. To list the objects available in the workspace, use ls() or
objects().
TO SEE THEM ALL TOGETHER
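To make mode and attributes concrete, here is a short sketch with three made-up objects. The names v, m and f are illustrative only.

```r
v <- c(1.5, 2.5, 3.5)        # numeric vector
m <- matrix(1:4, nrow = 2)   # 2 x 2 matrix
f <- function(z) z + 1       # function

mode(v)          # storage mode of the vector
mode(f)          # functions have their own mode
attributes(m)    # a matrix carries a dim attribute
attributes(v)    # a plain vector has no attributes

ls()             # names of the objects in the current workspace
```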
WE ARE GOING TO RENAME THE VARIABLES TO MAKE THINGS SIMPLER
# Rename the variable Año
names(datos_pib_para_R)                 # ask for the column names
names(datos_pib_para_R)[1]              # select the name of the first column
names(datos_pib_para_R)[1] <- "Año"     # change the name of the first column
names(datos_pib_para_R)[2] <- "Gasto"
names(datos_pib_para_R)[3] <- "PIB"
# Save the data with the renamed columns
saveRDS(datos_pib_para_R, file = "datos.RDS")
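The renaming and saving steps can be tried on a small made-up data frame. This sketch renames all columns in one assignment and checks that the saved file reads back unchanged. The data and the file location are hypothetical, and it uses "Year" in place of "Año" only to keep the example ASCII-only.

```r
# Hypothetical data frame with uninformative column names
datos <- data.frame(a = 2001:2003,
                    b = c(10.5, 11.2, 12.0),
                    c = c(100, 104.3, 109.1))
names(datos)                                # current column names
names(datos) <- c("Year", "Gasto", "PIB")   # rename all columns at once

# Save to a temporary .RDS file and read it back
ruta_rds <- tempfile(fileext = ".RDS")
saveRDS(datos, file = ruta_rds)
datos_leidos <- readRDS(ruta_rds)
identical(datos, datos_leidos)              # compare original and restored copy
```

Unlike save(), saveRDS() stores a single object without its name, so readRDS() lets you assign it to any name you choose.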
# Plot the evolution of PIB
# datos_pib_para_R is the data set we plot from; $PIB selects the variable
plot(datos_pib_para_R$PIB)
# Plot the evolution of spending (Gasto)
plot(datos_pib_para_R$Gasto)
# Scatter plot of PIB and consumption (the Gasto column)
plot(datos_pib_para_R$Gasto, datos_pib_para_R$PIB)
The first argument goes on the X axis and the second on the Y axis
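The plots above use the row index and the default labels. A possible refinement, sketched here on made-up data, puts the year on the X axis and labels both axes. The data frame and its values are hypothetical.

```r
# Hypothetical yearly data
datos <- data.frame(Year  = 2001:2006,
                    Gasto = c(10.5, 11.2, 12.0, 12.9, 13.1, 14.4),
                    PIB   = c(100.0, 104.3, 109.1, 113.8, 116.0, 123.5))

pdf(NULL)   # null device so the sketch runs without a display; omit in RStudio

# Evolution over time: year on X, PIB on Y, drawn as a line
plot(datos$Year, datos$PIB, type = "l",
     xlab = "Year", ylab = "PIB", main = "Evolution of PIB")

# Scatter plot: first argument on X, second on Y
plot(datos$Gasto, datos$PIB,
     xlab = "Gasto", ylab = "PIB", main = "PIB against Gasto")

cerrado <- dev.off()   # close the device
```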
# Compute the linear correlation between PIB and consumption (Gasto)
cor(datos_pib_para_R$Gasto,datos_pib_para_R$PIB)
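To see what cor() computes, the Pearson coefficient can be reproduced by hand as the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations. The two vectors below are made up for illustration.

```r
# Hypothetical values
gasto <- c(10.5, 11.2, 12.0, 12.9, 13.1, 14.4)
pib   <- c(100.0, 104.3, 109.1, 113.8, 116.0, 123.5)

# Pearson correlation from its definition
dg <- gasto - mean(gasto)
dp <- pib - mean(pib)
r_manual <- sum(dg * dp) / sqrt(sum(dg^2) * sum(dp^2))

r_cor <- cor(gasto, pib)      # built-in version (Pearson by default)
all.equal(r_manual, r_cor)    # the two should agree
```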
# Fit a linear model
# The <- operator stores the result in an object, here modelo_lineal
# General form: modelo_lineal <- lm(Y ~ X, data = datos_pib_para_R)
modelo_lineal <- lm(Gasto ~ PIB, data = datos_pib_para_R)
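As a sketch of what can be done with the fitted object, the following fits the same kind of model on made-up data, extracts the coefficients and R-squared, and predicts spending for a new PIB value. The data and the value 120 are hypothetical.

```r
# Hypothetical data
datos <- data.frame(PIB   = c(100.0, 104.3, 109.1, 113.8, 116.0, 123.5),
                    Gasto = c(10.5, 11.2, 12.0, 12.9, 13.1, 14.4))

modelo <- lm(Gasto ~ PIB, data = datos)   # Gasto explained by PIB

coef(modelo)                   # intercept and slope
summary(modelo)$r.squared      # share of variance explained
predict(modelo, newdata = data.frame(PIB = 120))   # fitted Gasto at PIB = 120
```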
Autocorrelation
Example of the Durbin-Watson test (first-order autocorrelation)
Using the "lmtest" package
library(lmtest)
dwtest(modelo_lineal,alternative ="two.sided",iterations = 1000)
Durbin-Watson test
data: modelo_lineal
DW = 0.68054, p-value = 1.758e-05
alternative hypothesis: true autocorrelation is not 0
https://www.rpubs.com/Econ0metria/505378
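The DW statistic reported above can be computed directly from the residuals as the sum of squared successive differences divided by the sum of squared residuals. The sketch below does this in base R on simulated data with autocorrelated errors. The data, the seed and the AR(1) coefficient 0.7 are assumptions made for illustration, and only the statistic is reproduced, not the p-value.

```r
set.seed(1)
n <- 40
x <- seq_len(n)
# Simulated AR(1) errors: u_t = 0.7 * u_{t-1} + noise (hypothetical)
u <- as.numeric(stats::filter(rnorm(n), filter = 0.7, method = "recursive"))
y <- 2 + 0.5 * x + u

modelo <- lm(y ~ x)
e <- as.numeric(residuals(modelo))

# Durbin-Watson statistic
dw <- sum(diff(e)^2) / sum(e^2)

# First-order autocorrelation of the residuals; DW is roughly 2 * (1 - r1)
r1 <- sum(e[-1] * e[-n]) / sum(e^2)
c(dw = dw, approx = 2 * (1 - r1))
```

Values of DW near 2 suggest no first-order autocorrelation, while values well below 2, such as the 0.68 in the output above, point to positive autocorrelation.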
Breusch-Godfrey Test
Description
bgtest performs the Breusch-Godfrey test for higher-order serial correlation.
Usage
bgtest(formula, order = 1, order.by = NULL, type = c("Chisq", "F"),
data = list(), fill = 0)
bgtest(modelo_lineal)
Breusch-Godfrey test for serial correlation of order up to 1
data: modelo_lineal
LM test = 11.555, df = 1, p-value = 0.0006758
Arguments
formula a symbolic description for the model to be tested (or a fitted "lm" object).
order integer. maximal order of serial correlation to be tested.
order.by Either a vector z or a formula with a single explanatory variable like ~ z. The observations in the model are ordered by the size of z. If set to NULL (the default) the observations are assumed to be ordered (e.g., a time series).
type the type of test statistic to be returned. Either "Chisq" for the Chi-squared test statistic or "F" for the F test statistic.
data an optional data frame containing the variables in the model. By default the variables are taken from the environment which bgtest is called from.
fill starting values for the lagged residuals in the auxiliary regression. By default 0 but can also be set to NA.
Details
Under H_0 the test statistic is asymptotically Chi-squared with degrees of freedom as given in
parameter. If type is set to "F" the function returns a finite sample version of the test statistic,
employing an F distribution with degrees of freedom as given in parameter.
By default, the starting values for the lagged residuals in the auxiliary regression are chosen to
be 0 (as in Godfrey 1978) but could also be set to NA to omit them.
bgtest also returns the coefficients and estimated covariance matrix from the auxiliary
regression that includes the lagged residuals. Hence, coeftest can be used to inspect the
results. (Note, however, that standard theory does not always apply to the standard errors and
t-statistics in this regression.)
Value
A list with class "bgtest" inheriting from "htest" containing the following components:
statistic the value of the test statistic.
p.value the p-value of the test.
parameter degrees of freedom.
method a character string indicating what type of test was performed.
data.name a character string giving the name(s) of the data.
coefficients coefficient estimates from the auxiliary regression.
vcov corresponding covariance matrix estimate.
Author(s) David Mitchell <david.mitchell@dotars.gov.au>, Achim Zeileis
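The auxiliary regression mentioned in the help text can be sketched by hand for order 1: regress the residuals on the original regressor plus the lagged residuals (starting value 0), and take n times the R-squared as the Chi-squared statistic. This is the textbook form of the test and is assumed, not verified, to match the default output of bgtest. The simulated data, the seed and the coefficient 0.6 are hypothetical.

```r
set.seed(2)
n <- 50
x <- rnorm(n)
# Simulated AR(1) errors (hypothetical)
u <- as.numeric(stats::filter(rnorm(n), filter = 0.6, method = "recursive"))
y <- 1 + 2 * x + u

modelo <- lm(y ~ x)
e <- as.numeric(residuals(modelo))

e_lag1 <- c(0, e[-n])         # lagged residuals, first value filled with 0
aux <- lm(e ~ x + e_lag1)     # auxiliary regression

lm_stat <- n * summary(aux)$r.squared                    # LM statistic
p_valor <- pchisq(lm_stat, df = 1, lower.tail = FALSE)   # Chi-squared, 1 df
c(LM = lm_stat, p.value = p_valor)
```

A small p-value leads to rejecting the null hypothesis of no serial correlation up to the chosen order.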
HETEROSKEDASTICITY
https://fhernanb.github.io/libro_regresion/homo.html
Alternative estimations
modelo_lineal_log<-lm(log(Gasto) ~ log(PIB),data = datos_pib_para_R)
# Analyze the model
summary(modelo_lineal_log)
Call:
lm(formula = log(Gasto) ~ log(PIB), data = datos_pib_para_R)

Residuals:
      Min        1Q    Median        3Q       Max
-0.086746 -0.051583 -0.005912  0.051723  0.129441

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.37819    0.53710  -8.151 1.24e-08 ***
log(PIB)     1.19639    0.03917  30.546  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0633 on 26 degrees of freedom
Multiple R-squared: 0.9729, Adjusted R-squared: 0.9718
F-statistic: 933 on 1 and 26 DF, p-value: < 2.2e-16
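In a log-log model the slope is usually read as an elasticity, so the estimate 1.19639 above would mean that a 1% increase in PIB goes with roughly a 1.2% increase in Gasto. The sketch below fits a log-log model on simulated data and shows that the t value column is simply the estimate divided by its standard error. The data, the seed and the "true" elasticity of 1.2 are assumptions chosen for illustration.

```r
set.seed(3)
pib <- seq(100, 200, length.out = 28)
# Simulated spending with a made-up elasticity of 1.2
gasto <- exp(-4 + 1.2 * log(pib) + rnorm(28, sd = 0.05))

modelo_log <- lm(log(gasto) ~ log(pib))

tabla <- coef(summary(modelo_log))   # coefficient table as a matrix
tabla

# The t value is the estimate divided by its standard error
t_manual <- tabla[, "Estimate"] / tabla[, "Std. Error"]
all.equal(t_manual, tabla[, "t value"])
```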
dwtest(modelo_lineal_log,alternative ="two.sided",iterations = 1000)
Durbin-Watson test
data: modelo_lineal_log
DW = 0.55299, p-value = 1.124e-06
alternative hypothesis: true autocorrelation is not 0
bgtest(modelo_lineal_log)
Breusch-Godfrey test for serial correlation of order up to 1
LM test = 13.414, df = 1, p-value = 0.0002497
prueba_white_log <- bptest(modelo_lineal_log, ~ I(PIB^2), data = datos_pib_para_R)
prueba_white_log   # print the stored test result
studentized Breusch-Pagan test
BP = 0.064684, df = 1, p-value = 0.7992
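The studentized Breusch-Pagan statistic can be sketched by hand: regress the squared residuals on the chosen variance regressor and take n times the R-squared of that regression. The example below uses the square of the regressor, as in the call above. A fuller White-type check would typically include both the regressor and its square. The simulated data and seed are hypothetical, and agreement with bptest is assumed rather than verified.

```r
set.seed(4)
n <- 60
x <- runif(n, 1, 10)
y <- 1 + 2 * x + rnorm(n, sd = x)   # error spread grows with x (made up)

modelo <- lm(y ~ x)
e2 <- as.numeric(residuals(modelo))^2   # squared residuals

aux <- lm(e2 ~ I(x^2))                  # auxiliary variance regression
bp <- n * summary(aux)$r.squared        # test statistic
p_valor <- pchisq(bp, df = 1, lower.tail = FALSE)
c(BP = bp, p.value = p_valor)
```

A large p-value, like the 0.7992 in the output above, gives no evidence against constant variance for the chosen variance regressor.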
# To view the residuals
modelo_lineal_log$residuals
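A short sketch of two properties of the residuals that are worth checking: with an intercept in the model they sum to essentially zero, and fitted values plus residuals give back the observed response. The data are made up for illustration.

```r
# Hypothetical data
datos <- data.frame(PIB   = c(100.0, 104.3, 109.1, 113.8, 116.0, 123.5),
                    Gasto = c(10.5, 11.2, 12.0, 12.9, 13.1, 14.4))
modelo <- lm(Gasto ~ PIB, data = datos)

res <- residuals(modelo)            # extractor, same values as modelo$residuals
all.equal(res, modelo$residuals)

sum(res)                            # essentially zero when there is an intercept
reconstruido <- unname(fitted(modelo) + res)
all.equal(reconstruido, datos$Gasto)   # fitted + residual = observed
```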
# install from CRAN
install.packages("kableExtra")
# install the development version
remotes::install_github("haozhu233/kableExtra")
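Before installing anything, it can be useful to check which packages are already available. This sketch checks the two packages loaded earlier in these notes and lists the ones that are missing. The installation line is left commented out so that the example changes nothing on your machine.

```r
paquetes <- c("readxl", "lmtest")   # packages loaded with library() in these notes

# TRUE for each package that can be loaded, FALSE otherwise
instalado <- vapply(paquetes, requireNamespace, logical(1), quietly = TRUE)
faltan <- paquetes[!instalado]
faltan                              # packages still to be installed

# Uncomment to install whatever is missing from CRAN
# if (length(faltan) > 0) install.packages(faltan)
```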
https://scpoecon.github.io/ScPoEconometrics/R-intro.html