Discrete Distributions

• Poisson

• Geometric

• Hypergeometric

• Binomial

What is Discrete Data?

• Data that describes distinct attributes or outcomes, for example:

• Pass/Fail

• Red/Green

• Large/Small

• Go/No-Go

Poisson Distribution

• Poisson is used to model rates, for example:

• Defects per unit

• Occurrences per hour

• Represents the probability of “x” occurrences in a fixed interval

• Assumptions: the probability of occurrence in an interval must be proportional to the length of the interval, and the numbers of occurrences in separate intervals must be independent of each other.

• There is no upper bound to the number of occurrences

• In modeling defects per unit, there is theoretically no upper limit

• In modeling defective units per shipment, the upper bound is the number of units shipped, so Poisson does not apply

Poisson Distribution

• Exact value of x: $p(x, m) = \dfrac{e^{-m}\, m^{x}}{x!}$

• p(x, m) = probability of exactly x occurrences

• x = number of occurrences

• m = rate of occurrence (defects per unit, occurrences per hour, etc.)

• Cumulative value of x: $P(X \le x) = \sum_{i=0}^{x} p(i, m) = \sum_{i=0}^{x} \dfrac{e^{-m}\, m^{i}}{i!}$

Poisson Distribution – Example

• Given an average of 5.5 warranty claims per month:

• What is the probability of exactly 3 claims in a month?

• What is the probability of fewer than 3 claims in a month?

• What is the probability of more than 3 claims in a month?
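The three questions can be checked numerically. The following is a minimal sketch, assuming the Poisson model with m = 5.5 applies to the monthly claim count; the function names `poisson_pmf` and `poisson_cdf` are illustrative, not part of the original material.

```python
import math

def poisson_pmf(x, m):
    """Probability of exactly x occurrences at rate m."""
    return math.exp(-m) * m**x / math.factorial(x)

def poisson_cdf(x, m):
    """Probability of x or fewer occurrences at rate m."""
    return sum(poisson_pmf(i, m) for i in range(x + 1))

m = 5.5  # average warranty claims per month

p_exactly_3 = poisson_pmf(3, m)          # P(X = 3)
p_fewer_than_3 = poisson_cdf(2, m)       # P(X <= 2)
p_more_than_3 = 1.0 - poisson_cdf(3, m)  # P(X >= 4)

print(f"exactly 3:    {p_exactly_3:.4f}")
print(f"fewer than 3: {p_fewer_than_3:.4f}")
print(f"more than 3:  {p_more_than_3:.4f}")
```

The three results come out at roughly 0.113, 0.088, and 0.798, and they sum to 1 because the three events cover every possible count.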

Geometric Distribution

• Geometric Distribution is used to model the probability of first occurrence (success or failure)

• The number of trials is the variable

• The number of occurrences is fixed (at 1)

• The probability of occurrence from trial to trial is constant

• There is no upper bound to the number of trials required

$\text{Prob}(x, p) = p\,(1-p)^{x-1}$

x = number of trials to first occurrence

p = probability of occurrence (per trial)

Example

• An event has historically averaged a success rate of 0.247 on a single trial. What is the probability that 5 or more trials are required to obtain the first success?
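Needing 5 or more trials is the same as failing on each of the first 4 trials, so the answer can be computed directly as (1-p)^4 or as the complement of the first four geometric terms. A minimal sketch, assuming independent trials with constant p = 0.247; `geometric_pmf` is an illustrative name.

```python
def geometric_pmf(x, p):
    """Probability that the first occurrence happens on trial x (x >= 1)."""
    return p * (1 - p) ** (x - 1)

p = 0.247  # probability of success on a single trial

# Complement of "first success on trial 1, 2, 3, or 4"
p_at_least_5 = 1.0 - sum(geometric_pmf(x, p) for x in range(1, 5))

# Shortcut: the first four trials must all fail
p_shortcut = (1 - p) ** 4

print(f"P(X >= 5) = {p_at_least_5:.4f}")
print(f"shortcut  = {p_shortcut:.4f}")
```

Both routes give about 0.321.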

Hypergeometric Distribution

• Used to model the number of successes given a fixed number of trials and two possible outcomes on each trial

• The probability of success on one trial is dependent on the outcome of the previous trial

• Example: dealing cards from a deck. The probability of getting an Ace on the first card is 4/52. If the first card is not replaced, the probability of getting an Ace on the second card depends on whether the first card was an Ace: it is 3/51 if it was, and 4/51 if it was not.

• Sampling without replacement

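$p(x, N, n, m) = \dfrac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}$

The notation $\binom{a}{b}$ denotes a “combination” (the number of possible combinations of b items chosen from a):

$\binom{a}{b} = \dfrac{a!}{b!\,(a-b)!}$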

m = number of occurrences in population

x = number of occurrences in sample

N = population size

n = sample size

Cumulative Probability: $P(X \le x) = \sum_{i=0}^{x} p(i, N, n, m)$

Hypergeometric Distribution

Exercise

• A sample of 8 items is taken from a population of 20 containing 12 blue items.

• What is the probability of obtaining fewer than 5 blue items?
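One way to work this exercise is to sum the hypergeometric probabilities for 0 through 4 blue items. A minimal sketch using `math.comb` from the Python standard library (Python 3.8+); `hypergeometric_pmf` is an illustrative name, and the variable names follow the N, n, m, x definitions above.

```python
import math

def hypergeometric_pmf(x, N, n, m):
    """Probability of x occurrences in a sample of n drawn without
    replacement from a population of N containing m occurrences."""
    return math.comb(m, x) * math.comb(N - m, n - x) / math.comb(N, n)

N = 20  # population size
n = 8   # sample size
m = 12  # blue items in the population

# "Fewer than 5" means x = 0, 1, 2, 3, or 4
p_fewer_than_5 = sum(hypergeometric_pmf(x, N, n, m) for x in range(5))

print(f"P(X < 5) = {p_fewer_than_5:.4f}")
```

The result is about 0.388 (48,915 favorable samples out of 125,970 possible samples).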

Binomial Distribution

• Used to model the number of successes given a fixed number of trials and two possible outcomes on each trial

• The probability of success is independent from trial to trial

• Example: dealing cards from a deck, but putting the card back in and reshuffling after each card is drawn.

$p(x, N) = \dfrac{N!}{x!\,(N-x)!}\, p^{x}\, (1-p)^{N-x}$

N = Number of trials

x = Number of occurrences

p = Probability of occurrence

Example

• A plant produces marbles, and 18.5% of all marbles produced are red.

• What is the probability of selecting more than 3 red marbles in a randomly selected sample of 5 marbles?
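With a sample of 5, "more than 3" means exactly 4 or exactly 5 red marbles. A minimal sketch, assuming each marble is independently red with p = 0.185 (that is, the production volume is large enough that sampling does not change the proportion); `binomial_pmf` is an illustrative name.

```python
import math

def binomial_pmf(x, N, p):
    """Probability of exactly x occurrences in N independent trials."""
    return math.comb(N, x) * p**x * (1 - p) ** (N - x)

N = 5      # marbles in the sample
p = 0.185  # proportion of red marbles

# "More than 3" means x = 4 or x = 5
p_more_than_3 = binomial_pmf(4, N, p) + binomial_pmf(5, N, p)

print(f"P(X > 3) = {p_more_than_3:.5f}")
```

The result is about 0.005, so drawing four or five red marbles in a sample of five is a rare event at this proportion.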

Roadmap

• Start: Modeling a rate, with no upper bound for the number of occurrences? Yes: Poisson. No: go to the next question.

• Looking for the probability of first occurrence (the random variable is the number of trials)? Yes: Geometric. No: go to the next question.

• Looking for the probability of occurrence, with samples pulled from a fixed population with no replacement? Yes: Hypergeometric. No: go to the next question.

• Looking for the probability of occurrence, with the probability constant from trial to trial? Yes: Binomial.
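The roadmap is a chain of yes/no questions asked in a fixed order, which can be sketched as a small function. The function name and its parameter names are hypothetical, chosen only to mirror the four questions; the roadmap does not say what to do when every answer is "No", so the sketch returns `None` in that case.

```python
def choose_distribution(rate_without_upper_bound,
                        first_occurrence,
                        sampling_without_replacement,
                        constant_probability):
    """Walk the roadmap questions in order and return the first match."""
    if rate_without_upper_bound:
        return "Poisson"
    if first_occurrence:
        return "Geometric"
    if sampling_without_replacement:
        return "Hypergeometric"
    if constant_probability:
        return "Binomial"
    return None  # not covered by the roadmap

# Warranty claims per month: a rate with no upper bound
print(choose_distribution(True, False, False, False))
# Blue items in a sample drawn from a population of 20
print(choose_distribution(False, False, True, False))
```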

Exercises

2008-10-01 © SKF Group Slide 16SKF (Group Six Sigma) 1.07 Basic Statistics

Exercise

Real Example With Data Changed

• 100,000 Hub Units Shipped

• A hub unit is found with no retainer nut after an accident

• Five thousand units are inspected and another hub unit is found with no locking nut

• What is the expected number of hub units with no locking nut?

• What is the upper 90% confidence limit for this number?

• Should there be a recall? Think beyond statistics.
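One possible way to approach the two numerical questions is sketched below; it is not presented as the intended solution. It assumes that the 5,000 inspected units are a random sample of the shipment, that the unit found after the accident is excluded from the rate estimate, that the count of defective units can be treated as Poisson, and that the limit is a one-sided exact Poisson bound found by bisection. The function names are illustrative.

```python
import math

def poisson_cdf(x, m):
    """Probability of x or fewer occurrences at rate m."""
    return sum(math.exp(-m) * m**i / math.factorial(i) for i in range(x + 1))

def poisson_upper_limit(observed, confidence, hi=100.0, tol=1e-9):
    """Largest rate m for which seeing `observed` or fewer occurrences
    still has probability 1 - confidence (one-sided upper limit)."""
    alpha = 1.0 - confidence
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if poisson_cdf(observed, mid) > alpha:
            lo = mid  # rate still too plausible, move up
        else:
            hi = mid
    return (lo + hi) / 2

shipped = 100_000
inspected = 5_000
found = 1  # defective units found in the inspected sample

scale = shipped / inspected
expected_total = found * scale                         # point estimate
upper_in_sample = poisson_upper_limit(found, 0.90)     # limit on the sample count
upper_total = upper_in_sample * scale                  # limit scaled to shipment

print(f"expected defective units: {expected_total:.0f}")
print(f"upper 90% limit:          {upper_total:.1f}")
```

Under these assumptions the point estimate is 20 units and the upper 90% limit is roughly 78 units. The recall question depends on more than these numbers, as the last bullet points out.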