SAMPLING FINAL PROJECT
DECLER JOSE PABUENA VERGARA
July 10, 2019
Introduction
The purpose of this work is to apply several statistical techniques or methods
that make it possible to determine, as accurately as possible, the distribution
of certain variables and, from there, to define the sample size and confidence limits
for later analyses. The data set has 65534 observations and 20 variables.
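library(pander)        # pander(), used below to print summaries
library(fitdistrplus)  # descdist() and fitdist(), used below
setwd("~/MUESTREO")
med <- read.csv("Med18.csv", dec = ",", sep = ";", header = TRUE)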
1. DESCRIPTION OF THE TOTAL COST VARIABLE
I determine the characteristics of the variable of interest, total cost (VALOR.TOTAL).
dim(med)
[1] 65534 20
class(med$VALOR.TOTAL)
[1] "numeric"
pander(summary(med$VALOR.TOTAL))
Min.   1st Qu.   Median   Mean    3rd Qu.   Max.
12     1290      3024     23381   7800      31904343
The variable is numeric. Its range is extremely large relative to most of its values:
the maximum is 31904343, while the third quartile is only 7800.
hist(med$VALOR.TOTAL)
plot(med$VALOR.TOTAL)
I compress the range of the variable by applying the natural logarithm.
vtlog <- log(med$VALOR.TOTAL)
head(vtlog)
[1] 7.090077 11.533316 8.451053 6.445720 7.185387 9.305651
pander(summary(vtlog))
Min.    1st Qu.   Median   Mean    3rd Qu.   Max.
2.485   7.162     8.014    8.052   8.962     17.28
hist(vtlog)
plot(vtlog)
boxplot(vtlog)
barplot(vtlog)
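The effect of the log transform can be illustrated with a minimal sketch. It uses simulated right-skewed values as a stand-in (the vector `cost` and its parameters are assumptions, not the Med18.csv data) and compares how far the maximum sits from the median before and after the transform. The logarithm is only defined for positive values, which holds for the real variable since its minimum is 12.

```r
# Simulated stand-in for a right-skewed cost variable (NOT the Med18.csv data).
set.seed(1)
cost <- rlnorm(1000, meanlog = 8, sdlog = 1.6)

# Natural logarithm, as applied to med$VALOR.TOTAL in the report.
log_cost <- log(cost)

# Ratio of the maximum to the median, before and after the transform.
ratio_raw <- max(cost) / median(cost)
ratio_log <- max(log_cost) / median(log_cost)

print(c(raw = ratio_raw, log = ratio_log))
```

A much smaller ratio on the log scale means the histogram and box plot are no longer dominated by a few extreme values.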
I compute the variance and the standard deviation of the log-transformed variable.
var(vtlog)
[1] 2.71108
sd(vtlog)
[1] 1.646536
x=vtlog
head(dnorm(x, mean(vtlog), sd(vtlog)))
[1] 0.20427321 0.02592157 0.23528283 0.15054111 0.21094630 0.18133255
dnor<- dnorm(vtlog, mean(vtlog), sd(vtlog))

curve(dnorm(x, mean(vtlog), sd(vtlog)), xlim = c(3.052, 13.052), col = "blue", lwd = 2,
      xlab = "x", ylab = "f(x)", main = "Density function N(mean(vtlog), sd(vtlog))")
descdist(log(med$VALOR.TOTAL))
summary statistics
------
min: 2.484907 max: 17.27825
median: 8.014336
mean: 8.052096
estimated sd: 1.646536
estimated skewness: 0.2961642
estimated kurtosis: 4.563761
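The two shape statistics can be computed from their moment definitions, as in the sketch below. It is an illustration only: the data are simulated, the helper names `moment_skewness` and `moment_kurtosis` are hypothetical, and these plain moment estimators may differ slightly from the values descdist reports, since that function can apply a bias correction.

```r
# Simulated stand-in for the log-transformed variable (NOT the real vtlog).
set.seed(2)
x_sim <- rnorm(5000, mean = 8, sd = 1.6)

# Moment-based skewness: third central moment over variance^(3/2).
moment_skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based kurtosis: fourth central moment over variance^2.
# A normal distribution has kurtosis 3 under this definition.
moment_kurtosis <- function(v) {
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2
}

skew <- moment_skewness(x_sim)
kurt <- moment_kurtosis(x_sim)
print(c(skewness = skew, kurtosis = kurt))
```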
Because the skewness (0.30) is close to zero, the empirical distribution is roughly
symmetric. Kurtosis quantifies the weight of the tails: a normal distribution has
kurtosis 3, so the estimated value of 4.56 points to a shape close to normal with
somewhat heavier tails.
CHECKING THE DISTRIBUTION
fw<-fitdist(log(med$VALOR.TOTAL), "weibull")
summary(fw)
Fitting of the distribution ' weibull ' by maximum likelihood

Parameters :
      estimate  Std. Error
shape 5.010458 0.013814132
scale 8.716251 0.007192268
Loglikelihood:  -128502.4   AIC:  257008.8   BIC:  257026.9
Correlation matrix:
          shape     scale
shape 1.0000000 0.3273995
scale 0.3273995 1.0000000
plot(fw)
fg<-fitdist(log(med$VALOR.TOTAL), "gamma")
summary(fg)
Fitting of the distribution ' gamma ' by maximum likelihood

Parameters :
       estimate Std. Error
shape 22.583267 0.12384724
rate   2.804683 0.01555278
Correlation matrix:
          shape      rate
shape 1.0000000 0.9889523
rate  0.9889523 1.0000000
plot(fg)
fl<-fitdist(log(med$VALOR.TOTAL), "lnorm")
summary(fl)
Fitting of the distribution ' lnorm ' by maximum likelihood

Parameters :
         estimate   Std. Error
meanlog 2.0636322 0.0008502502
sdlog   0.2176607 0.0006011606
Loglikelihood:  -128299.4   AIC:  256602.8   BIC:  256621
Correlation matrix:
             meanlog        sdlog
meanlog 1.000000e+00 3.719011e-12
sdlog   3.719011e-12 1.000000e+00
plot(fl)
fn<-fitdist(log(med$VALOR.TOTAL), "norm")
summary(fn)
Fitting of the distribution ' norm ' by maximum likelihood

Parameters :
     estimate  Std. Error
mean 8.052096 0.006431829
sd   1.646523 0.004547982
Correlation matrix:
     mean sd
mean    1  0
sd      0  1
plot(fn)
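One common way to choose among fitted candidates is to compare their AIC, where a lower value indicates a better trade-off between fit and number of parameters. The sketch below is an assumption about how such a comparison could be done, not the procedure used in this report. It uses simulated data and base R only, and relies on the closed-form maximum likelihood estimates of the normal and log-normal distributions; the helper `aic` is hypothetical.

```r
# Simulated stand-in for the log-transformed variable (NOT the real data).
set.seed(3)
y <- rnorm(2000, mean = 8, sd = 1.6)
y <- y[y > 0]  # the log-normal density needs strictly positive values

# AIC = 2k - 2 * log-likelihood, with k the number of fitted parameters.
aic <- function(loglik, k) 2 * k - 2 * loglik

# Normal: the MLEs are the sample mean and the sd with denominator n.
mu <- mean(y)
sigma <- sqrt(mean((y - mu)^2))
ll_norm <- sum(dnorm(y, mu, sigma, log = TRUE))

# Log-normal: the same estimators applied to log(y).
ml <- mean(log(y))
sl <- sqrt(mean((log(y) - ml)^2))
ll_lnorm <- sum(dlnorm(y, ml, sl, log = TRUE))

aics <- c(norm = aic(ll_norm, 2), lnorm = aic(ll_lnorm, 2))
best <- names(which.min(aics))
print(aics)
print(best)
```

The same idea applies to the AIC values printed by summary() for each fit above, when they are available.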
2. SELECTION OF THE VARIABLES THAT MEET THE SPECIFICATION
FOR STRATA AND FOR CLUSTERS
I would choose the variable MEDICO (physician or medical specialty) as the
stratification variable, since the variability within each physician or medical
specialty should be very small and the variability between specialties very large.
For cluster sampling I would choose the variable CODIGO (the patient's diagnosis),
since the variability within each group of individuals sharing a diagnosis can be
large, while the variability between groups with different diagnoses could be small.
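The reasoning about within-group and between-group variability can be checked numerically by splitting the total sum of squares. The sketch below is illustrative only: the grouping factor `group`, the group means, and the values are simulated assumptions, not the MEDICO or CODIGO columns. A share of between-group variation close to 1 favors using the variable for strata; a share close to 0 favors using it for clusters.

```r
# Simulated example with 5 hypothetical groups (NOT the real data).
set.seed(4)
n_groups <- 5
group <- factor(rep(paste0("G", 1:n_groups), each = 200))
group_means <- c(6, 7, 8, 9, 10)
value <- rnorm(length(group), mean = group_means[as.integer(group)], sd = 0.5)

grand_mean <- mean(value)
sizes <- tapply(value, group, length)  # observations per group
means <- tapply(value, group, mean)    # mean per group

# Decomposition: total SS = between-group SS + within-group SS.
ss_between <- sum(sizes * (means - grand_mean)^2)
ss_within  <- sum((value - ave(value, group))^2)
ss_total   <- sum((value - grand_mean)^2)

share_between <- ss_between / ss_total
print(share_between)
```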
3. DETERMINATION OF THE SAMPLE SIZE FOR THE TOTAL COST VARIABLE
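As a hedged illustration of what a sample size calculation could look like, the sketch below applies the usual formula for estimating a mean under simple random sampling, n0 = (z * s / e)^2, followed by the finite population correction n = n0 / (1 + n0 / N). The population size N = 65534 and the standard deviation s = 1.646536 (on the log scale) come from the output above. The 95% confidence level, the margin of error e = 0.05, and the function name `srs_sample_size` are assumptions made only for this example.

```r
# Sample size for estimating a mean under simple random sampling.
# N: population size, s: standard deviation, e: margin of error,
# conf: confidence level. e and conf are illustrative assumptions.
srs_sample_size <- function(N, s, e, conf = 0.95) {
  z  <- qnorm(1 - (1 - conf) / 2)  # two-sided normal quantile
  n0 <- (z * s / e)^2              # size for an infinite population
  n  <- n0 / (1 + n0 / N)          # finite population correction
  list(n0 = n0, n = ceiling(n))
}

res <- srs_sample_size(N = 65534, s = 1.646536, e = 0.05)
print(res)
```

A tighter margin of error or a higher confidence level increases the required n; the correction matters more as n0 approaches N.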
