
Distributions

Applied Statistics 565, Fall 2011

Parameters

To date, we have considered parameters to be fixed indices that specified a

particular member of a distribution family. We have tinkered around the edges with

this idea. Now we will change the idea entirely. We will consider parameters to be

(unobservable) random variables. Our goal will be to find the marginal distribution

of the observable random variable.

You might ask what sort of problem this could address. Suppose that we are

employed by a public health agency, and working on a regional measure. This might

be measuring neonatal mortality rates, or post-operative infection rates for

hospitals.

There is no a priori reason to believe that these rates should be homogeneous

across hospitals. In fact, there are structural reasons to believe that they should

differ. Teaching hospitals serve a different clientele than community hospitals,

which in turn serve a different clientele than rural community hospitals. For

example, the Obstetrics department at a teaching hospital will handle many more

high-risk pregnancies. Their neonatal mortality rate should be higher than for a community hospital: obstetricians practicing in community hospitals generally send high-risk cases to specialists in high-risk pregnancies. These specialists often

prefer to practice in teaching hospitals, because more state-of-the-art equipment is

available. Even with the high-tech equipment, the death rate for these newborns

will be higher.

So, we need marginal models that make allowance for possibly different rates

for different hospitals. Now, for any particular hospital, the death rate should be an

unknown value, (say) P. But the value of P will vary for different hospitals.

One way to manage this problem is to take P to be a random variable, and allow the number of neonatal deaths to be Binomial with parameters n and P.¹ We can take P ~ Beta(α, β) and X | P ~ Binomial(n, P). This method of creating new distributions is called compounding. This particular compounding is written (data distribution) ⋀_parameter (parameter distribution). So, in this instance we are compounding a binomial distribution (on p) with a beta distribution.

¹ We could consider allowing n to be a random variable as well. My personal preference in these situations is to make inferences conditional on n, rather than on the marginal distribution.
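Compounding is easy to simulate: draw an unobservable rate for each hospital from the beta distribution, then draw the observed count conditionally on that rate. A minimal sketch in Python (NumPy), using α = 1, β = 5, n = 20 as illustrative values (the same parameters graphed later in these notes):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative setup: 10,000 hospitals, each with its own death rate P drawn
# from a Beta(1, 5) distribution, and n = 20 births observed per hospital.
alpha, beta, n = 1.0, 5.0, 20

p = rng.beta(alpha, beta, size=10_000)   # unobservable rates, one per hospital
x = rng.binomial(n, p)                   # observed counts: X | P ~ Binomial(n, P)

# The marginal mean of X should be n * alpha / (alpha + beta) = 20/6.
print(x.mean())
```

The array `x` holds draws from the marginal (compound) distribution of X, even though P itself is never observed.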

The Beta-binomial distribution

If we take the distributions as defined above, we find the joint distribution is

\[
f(x, p) = \binom{n}{x} p^{x} (1-p)^{n-x} \cdot \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{\alpha-1}(1-p)^{\beta-1},
\qquad x = 0, 1, \ldots, n;\; 0 < p < 1.
\]

The parameter space for this joint density is n = 1, 2, …; 0 < α; 0 < β. This simplifies by grouping the terms in p together:

\[
f(x, p) = \binom{n}{x} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{x+\alpha-1}(1-p)^{n-x+\beta-1}.
\]

Of course, we are not especially interested in the joint density/mass function, except as a tool to find the marginal mass function of X. This requires us to integrate out p:

\[
\begin{aligned}
f(x) &= \int_0^1 \binom{n}{x} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{x+\alpha-1}(1-p)^{n-x+\beta-1}\, dp \\
&= \binom{n}{x} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \int_0^1 p^{x+\alpha-1}(1-p)^{n-x+\beta-1}\, dp \\
&= \binom{n}{x} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \cdot \frac{\Gamma(x+\alpha)\,\Gamma(n-x+\beta)}{\Gamma(n+\alpha+\beta)},
\qquad x = 0, 1, \ldots, n.
\end{aligned}
\]

By using the recursion relationship for gamma functions, we can simplify this slightly:

\[
f(x) = \binom{n}{x} \frac{\alpha(\alpha+1)\cdots(\alpha+x-1)\;\beta(\beta+1)\cdots(\beta+n-x-1)}{(\alpha+\beta)(\alpha+\beta+1)\cdots(\alpha+\beta+n-1)},
\qquad x = 0, 1, \ldots, n.
\]

This is a discrete distribution with mean

\[
E(X) = n\,\frac{\alpha}{\alpha+\beta}.
\]

The variance is

\[
\operatorname{Var}(X) = n\,\frac{\alpha}{\alpha+\beta}\cdot\frac{\beta}{\alpha+\beta}\cdot\frac{\alpha+\beta+n}{\alpha+\beta+1}.
\]

These parameters are easily interpreted if we recall that the mean of the Beta distribution is α/(α+β) and think of this as the average value of p. Then the mean follows the binomial form exactly, and the variance is multiplied by a constant greater than 1. That is, the marginal distribution of X is more variable than a binomial with p = α/(α+β). A graph of a binomial and a beta-binomial with the same mean is shown below. The standard deviation of the beta-binomial is nearly twice the binomial value.
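SciPy ships the beta-binomial directly, so the same-mean, larger-variance comparison can be checked numerically. A sketch assuming `scipy.stats.betabinom` is available (SciPy 1.4 or later), using the two distributions graphed in these notes:

```python
import numpy as np
from scipy import stats

# The two distributions compared in the figure: same mean, different spread.
n, alpha, beta = 20, 1.0, 5.0
binom = stats.binom(n, alpha / (alpha + beta))   # Binomial(20, 1/6)
bb = stats.betabinom(n, alpha, beta)             # Beta-binomial(1, 5, 20)

print(binom.mean(), bb.mean())   # identical: n*alpha/(alpha+beta)
print(binom.std(), bb.std())     # beta-binomial sd is nearly twice as large

# The variance inflation factor is (alpha + beta + n)/(alpha + beta + 1).
inflation = (alpha + beta + n) / (alpha + beta + 1)
print(np.isclose(bb.var(), binom.var() * inflation))
```

With these parameters the inflation factor is 26/7 ≈ 3.71, so the standard deviation ratio is its square root, about 1.93 — the "nearly twice" in the text.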

This distribution is called the Beta-binomial distribution. The analysis is made possible by the relationship between the beta distribution density

\[
f(p \mid \alpha, \beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{\alpha-1}(1-p)^{\beta-1}, \qquad 0 < p < 1,
\]

and the binomial mass function

\[
f(x \mid n, p) = \binom{n}{x} p^{x}(1-p)^{n-x}, \qquad x = 0, 1, \ldots, n.
\]

Notice that the functional forms are nearly identical. If we consider the binomial mass function as a density for p with support 0 < p < 1, it works. (Rewrite the combinatorial symbol as the equivalent gamma functions, and a Beta(x + 1, n − x + 1) density results.) The beta density and the binomial mass function are conjugate densities². Finding the conjugate density for a given distribution is a matter of looking carefully at the kernel of the data density and asking, "When I consider this kernel as a function of the parameter, what density/mass function results?" Other pairs of conjugate distributions are the Poisson/gamma, negative binomial/beta, and Normal/Normal (for μ) and Normal/gamma (for σ).

All of this is also related to the problem of Bayesian inference. The compounding distribution is called the prior (distribution of the parameter). However, instead of finding the marginal distribution of X, in a Bayesian setting we need to find the conditional distribution of p | X = x. This may (but usually doesn't) involve finding the marginal density of X to produce the conditional distribution.
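In the beta/binomial case the conjugacy makes this conditional distribution immediate: combining the kernels above gives p | X = x ~ Beta(α + x, β + n − x), with no integration required. A sketch with hypothetical data (3 deaths in 20 births) and the illustrative Beta(1, 5) prior:

```python
from scipy import stats

# Conjugate update: P ~ Beta(alpha, beta), X | P ~ Binomial(n, P)
# implies p | X = x ~ Beta(alpha + x, beta + n - x).
alpha, beta = 1.0, 5.0   # illustrative prior
n, x = 20, 3             # hypothetical data: 3 deaths in 20 births

posterior = stats.beta(alpha + x, beta + n - x)
print(posterior.mean())  # (alpha + x)/(alpha + beta + n) = 4/26
```

The posterior mean sits between the prior mean α/(α+β) = 1/6 and the sample proportion x/n = 3/20, which is the usual shrinkage behavior of conjugate updates.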

The Gamma-Poisson Distribution

One of your recommended homework problems asked you to consider the case where X | M = m ~ Poisson(m), and M ~ χ²(1). We will generalize this to the

case where M ~ Gamma(r, p/(1−p)), writing r for the shape parameter. The joint density is

\[
f(x, m) = \frac{e^{-m} m^{x}}{x!} \cdot \frac{m^{r-1}\, e^{-m(1-p)/p}}{\Gamma(r)\,\bigl(p/(1-p)\bigr)^{r}},
\qquad x = 0, 1, 2, \ldots;\; m > 0.
\]

² This sort of thing is another reason it is convenient (in linguistic terms) to refer to the mass function of a discrete variable as a density.

[Figure: probability functions of the Binomial(20, 1/6) and the Beta-binomial(1, 5, 20), which share the same mean.]

We now integrate out the Poisson parameter m to obtain the marginal distribution of X:

\[
\begin{aligned}
f(x) &= \int_0^\infty \frac{e^{-m} m^{x}}{x!} \cdot \frac{m^{r-1}\, e^{-m(1-p)/p}}{\Gamma(r)\,\bigl(p/(1-p)\bigr)^{r}}\, dm \\
&= \frac{1}{x!\,\Gamma(r)} \left(\frac{1-p}{p}\right)^{r} \int_0^\infty m^{x+r-1}\, e^{-m/p}\, dm \\
&= \frac{\Gamma(x+r)}{x!\,\Gamma(r)} \left(\frac{1-p}{p}\right)^{r} p^{\,x+r} \\
&= \frac{\Gamma(x+r)}{x!\,\Gamma(r)}\, p^{x} (1-p)^{r}, \qquad x = 0, 1, 2, \ldots
\end{aligned}
\]

Perhaps you recognize this, perhaps you don't. In case you don't, this is a negative binomial with parameters r and (1 − p). We can interpret the negative binomial distribution as a gamma compounding of the Poisson distribution. The mean of this negative binomial is rp/(1 − p), and the mean of the gamma is (generically) αβ, and in this specific case that becomes rp/(1 − p). The mean of the compound distribution corresponds to the mean of the distribution of M. If X were Poisson with parameter rp/(1 − p), this would also be its variance. But the variance of the negative binomial is rp/(1 − p)²: this is (obviously) larger than the Poisson variance. The negative binomial is an over-dispersed alternative to the Poisson distribution.
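The identity can be checked by simulation: compound a Gamma(r, p/(1−p)) over the Poisson mean and compare against SciPy's negative binomial. Note that SciPy parameterizes `nbinom` by a "success probability," which is (1 − p) in our notation; r = 2 and p = 0.4 below are illustrative values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Compound by simulation: M ~ Gamma(r, scale = p/(1-p)), then X | M ~ Poisson(M).
r, p = 2.0, 0.4
m = rng.gamma(shape=r, scale=p / (1 - p), size=200_000)
x = rng.poisson(m)

# The derived marginal is negative binomial; SciPy's second argument is 1 - p.
nb = stats.nbinom(r, 1 - p)
print(x.mean(), nb.mean())   # both near r*p/(1-p)
print(x.var(), nb.var())     # both near r*p/(1-p)**2 -- over-dispersed
```

The simulated variance exceeds the simulated mean, which is exactly the over-dispersion relative to a Poisson (whose variance equals its mean).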

Mixture distributions

We mentioned in class that every random variable can be written as a combination of a discrete random variable, a continuous random variable, and a singular random variable³. Some of the exercises in Chapter I suggested that a convex combination of densities/mass/distribution functions is also a density/mass/distribution function. Showing this is not difficult. Suppose that F_1, F_2, …, F_m are distribution functions, and α_1, α_2, …, α_m are non-negative numbers summing to 1. Then

\[
F(x) = \sum_{i=1}^{m} \alpha_i F_i(x)
\]

is also a distribution function. Showing this involves passing to the limits as x → ±∞ and showing that for every x and every h > 0, F(x + h) ≥ F(x). These properties all follow immediately from the fact that the individual terms in the sum are themselves distribution functions. From this fact, the corresponding result that convex combinations of densities and mass functions are themselves densities or mass functions follows by taking the derivative with respect to the appropriate measure.

³ Singular random variables have a distribution function, but the distribution function is such a mess that it does not admit a density. Such variables are continuous, but not absolutely continuous. They are of theoretical interest only.
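Operationally, sampling from a finite mixture means picking component i with probability α_i and then drawing from F_i. The sketch below does this for three illustrative components and checks the convex-combination identity for the CDF at one point:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Illustrative mixture: weights alpha_i and components F_i of our choosing.
weights = np.array([0.5, 0.3, 0.2])
components = [stats.norm(0, 1), stats.norm(0, 3), stats.expon(scale=2)]

# Two-stage sampling: pick a component index, then draw from that component.
idx = rng.choice(len(weights), p=weights, size=100_000)
draws = np.empty(idx.size)
for i, c in enumerate(components):
    mask = idx == i
    draws[mask] = c.rvs(size=mask.sum(), random_state=rng)

# The mixture CDF is the same convex combination of the component CDFs.
x0 = 1.0
F_x0 = sum(w * c.cdf(x0) for w, c in zip(weights, components))
print(F_x0, (draws <= x0).mean())   # these should agree closely
```

The two-stage draw is exactly the compounding idea from earlier in these notes, with a discrete (rather than continuous) distribution on the "parameter" choosing the component.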

One application of mixture distributions is in modeling data with heavier-than-Normal tails. The ε-contamination model asks us to suppose that for some 0 < ε < 1, our data follow a contaminated Normal distribution; that is, the distribution function of the data is

\[
F(x) = (1-\varepsilon)\,\Phi\!\left(\frac{x-\mu}{\sigma}\right) + \varepsilon\,\Phi\!\left(\frac{x-\mu}{k\sigma}\right), \qquad k > 1.
\]

That is, we usually observe data with standard deviation σ, but we occasionally get an observation from a distribution with the same mean, but a much larger standard deviation kσ. One question of interest is, "How large must ε be before the mixture seriously damages the SD of the sampling model of the mean?" The answer (surprisingly) is, "Not very large at all." According to John Tukey, with this sort of contamination, ε as small as 3% is sufficient to make the median superior to the mean.
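Tukey's point is easy to see in a small Monte Carlo. The sketch below assumes μ = 0, σ = 1, ε = 0.03, and a contamination scale of 10σ (the notes say only "much larger"; the factor 10 is our assumption), with samples of size n = 20:

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed setup: mu = 0, sigma = 1, contamination fraction eps = 0.03,
# contaminated scale 10*sigma (the factor 10 is an illustrative choice).
eps, k, n, reps = 0.03, 10.0, 20, 20_000

# Each observation is contaminated independently with probability eps.
contaminated = rng.random((reps, n)) < eps
scales = np.where(contaminated, k, 1.0)
data = rng.standard_normal((reps, n)) * scales

means = data.mean(axis=1)
medians = np.median(data, axis=1)
print(means.std(), medians.std())
# Even at 3% contamination the median is the more stable estimator here.
```

With no contamination the mean would win easily (the median pays the usual π/2 efficiency penalty at the Normal); a few wild observations reverse the ranking.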

We can also perform mixtures with respect to a continuous distribution. For example, it can be shown that the T distribution is a gamma mixture (with respect to σ) of Normal distributions. At this point, the distinction between mixing and compounding grows blurry indeed.
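This scale-mixture representation can also be checked by simulation: draw a gamma mixing variable G = χ²_ν/ν, then draw T | G ~ Normal(0, 1/G); the resulting draws should match the t distribution with ν degrees of freedom. The choice ν = 5 below is illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Mixing variable G = chi-squared_nu / nu, a Gamma(nu/2, scale 2/nu) variable.
nu, size = 5, 200_000
g = rng.gamma(shape=nu / 2, scale=2 / nu, size=size)

# T | G ~ Normal(0, 1/G); marginally T should be t with nu degrees of freedom.
t_draws = rng.standard_normal(size) / np.sqrt(g)

# Compare empirical tail probabilities against scipy's t distribution.
for q in (1.0, 2.0, 3.0):
    print(q, (t_draws > q).mean(), stats.t(nu).sf(q))
```

The empirical and theoretical tail probabilities agree to Monte Carlo accuracy, including in the heavy tails where the t departs most from the Normal.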
