Distribution Basics

Statistics

• Descriptive statistics – Methods for collecting, summarising, presenting and characterising a set of data in order to describe its features.

• Inferential statistics – Methods that make it possible to estimate a characteristic of a population, or to make a decision about a population, based only on a sample.

Population vs. sample

• A population is a total collection of observations or measurements of interest

• A sample is a subset of measurements or observations from a population

Sample vs. population

• Population parameters: μ = population mean, σ = population standard deviation

• Sample statistics: X̄ = sample mean (the estimate μ̂), s = sample standard deviation (the estimate σ̂)

• Sample statistics are used to estimate population parameters
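The relationship between sample statistics and population parameters can be sketched in a few lines of Python. The population here is hypothetical (10,000 values simulated from a normal model with mean 50 and standard deviation 5), and the sample size of 100 is likewise an assumption made only for illustration.

```python
import random
import statistics

# Hypothetical population: simulated measurements, not data from the slides.
random.seed(0)
population = [random.gauss(50.0, 5.0) for _ in range(10_000)]

# Population parameters, computed from every member of the population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # divides by N

# Sample statistics, computed from a subset only.
sample = random.sample(population, 100)
x_bar = statistics.fmean(sample)        # estimate of mu
s = statistics.stdev(sample)            # divides by n - 1; estimate of sigma

print(f"mu    = {mu:.2f}   x_bar = {x_bar:.2f}")
print(f"sigma = {sigma:.2f}   s     = {s:.2f}")
```

The sample values should land close to, but not exactly on, the population values; that gap is what inferential statistics sets out to quantify.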

Describing the sample

• Central tendency – a distinct tendency to cluster about a central point (an average)

• Dispersion – amount of variation or spread in the data

• Shape – manner in which data are distributed

Statistical functions

• f(x) – Probability Density Function (PDF)

• Models a histogram

• F(x) – Cumulative Distribution Function (CDF)

• The area under f(x) to the left of x

• Probability of a value less than x: the integral of f(t) from negative infinity to x

• R(x) – Reliability function

• The area under f(x) to the right of x

• R(x) = 1 − F(x)

• Probability of a value greater than x: the integral of f(t) from x to infinity

• h(x) – Hazard function

• Instantaneous failure rate: f(x)/R(x)

• A measure of proneness to fail
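The four functions are tied together by R(x) = 1 − F(x) and h(x) = f(x)/R(x). A minimal sketch of those relationships follows, assuming an exponential model with a hypothetical rate LAM = 0.5; the slides do not name a distribution at this point, so the choice is purely illustrative.

```python
import math

LAM = 0.5  # hypothetical failure rate, chosen only for illustration


def pdf(x):
    """f(x): density of the assumed exponential model."""
    return LAM * math.exp(-LAM * x) if x >= 0 else 0.0


def cdf(x):
    """F(x): area under f to the left of x."""
    return 1.0 - math.exp(-LAM * x) if x >= 0 else 0.0


def reliability(x):
    """R(x) = 1 - F(x): area under f to the right of x."""
    return 1.0 - cdf(x)


def hazard(x):
    """h(x) = f(x) / R(x): instantaneous failure rate."""
    return pdf(x) / reliability(x)


for x in (0.5, 1.0, 2.0):
    print(x, round(cdf(x), 4), round(reliability(x), 4), round(hazard(x), 4))
```

For this particular model the hazard comes out constant and equal to LAM, which makes it a convenient check that h(x) = f(x)/R(x) has been wired up correctly.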

Probability Density Function Definition

Symmetric vs. skewed data

• Mean > Median: Positive or right-skewness

• Mean = Median: Symmetry or zero-skewness

• Mean < Median: Negative or left-skewness

[Figure: three density curves labelled positively skewed (right), symmetric, and negatively skewed (left)]
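The mean-versus-median rule above can be applied directly to data. The sketch below is a rule-of-thumb check only; the function name skew_direction, the tolerance and the three small data sets are all made up for illustration.

```python
import statistics


def skew_direction(data, tol=1e-9):
    """Classify skew by comparing the mean with the median."""
    mean = statistics.fmean(data)
    median = statistics.median(data)
    if mean - median > tol:
        return "positive (right) skew"
    if median - mean > tol:
        return "negative (left) skew"
    return "symmetric (zero skew)"


right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]       # one large value pulls the mean up
left_tailed = [1, 17, 18, 18, 19, 19, 19, 20]  # one small value pulls the mean down
balanced = [1, 2, 3, 4, 5]

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("balanced", balanced)]:
    print(name, "->", skew_direction(data))
```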

Moment Generating Functions

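Mean = the 1st moment about the origin: $\mu = \int_{-\infty}^{\infty} x \, f(x) \, dx$

Variance = the 2nd moment about the mean: $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx$

Skewness = the 3rd moment about the mean, $\int_{-\infty}^{\infty} (x - \mu)^3 f(x) \, dx$, usually standardised by dividing by $\sigma^3$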
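For a data set, the same quantities can be approximated by averaging powers of the observations. The sketch below assumes the usual definitions (mean as the first moment about the origin, variance as the second moment about the mean, skewness as the third moment about the mean divided by the standard deviation cubed) and divides by n throughout. The helper names and the five data values are hypothetical.

```python
def raw_moment(data, k):
    """k-th moment about the origin: average of x**k."""
    return sum(x ** k for x in data) / len(data)


def central_moment(data, k):
    """k-th moment about the mean: average of (x - mean)**k."""
    mean = raw_moment(data, 1)
    return sum((x - mean) ** k for x in data) / len(data)


data = [2.0, 4.0, 4.0, 5.0, 10.0]  # made-up values with a long right tail

mean = raw_moment(data, 1)
variance = central_moment(data, 2)
skewness = central_moment(data, 3) / variance ** 1.5  # standardised 3rd moment

print(f"mean = {mean}, variance = {variance:.3f}, skewness = {skewness:.3f}")
```

A positive skewness here agrees with the long right tail in the data.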

Example

• Given the density function f(x) = ax, 0 < x < 10

• Determine the value of a that makes f(x) a valid density function

• Determine the mean and variance of the distribution

• Derive expressions for the reliability and hazard functions

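Normalisation (the total area under f(x) must equal 1):

$$\int_0^{10} a x \, dx = \left[ \frac{a x^2}{2} \right]_0^{10} = \frac{a}{2}\left(10^2 - 0^2\right) = 50a = 1 \quad\Rightarrow\quad a = \frac{1}{50}$$

Mean:

$$E(x) = \int_0^{10} x \cdot \frac{x}{50} \, dx = \int_0^{10} \frac{x^2}{50} \, dx = \left[ \frac{x^3}{150} \right]_0^{10} = \frac{10^3}{150} = 6.667$$

Variance:

$$V(x) = \int_0^{10} \frac{x^3}{50} \, dx - (6.667)^2 = \left[ \frac{x^4}{200} \right]_0^{10} - 44.444 = 50 - 44.444 = 5.556$$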
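The normalisation, mean and variance for f(x) = ax on 0 < x < 10 can be checked with exact arithmetic. This is a minimal sketch rather than a general integrator: the helper integrate_power is hypothetical and only handles a single power of x over the fixed interval.

```python
from fractions import Fraction

LOWER, UPPER = 0, 10  # support of the density in the example


def integrate_power(coeff, power):
    """Exact integral of coeff * x**power over [LOWER, UPPER]."""
    n = power + 1
    return coeff * (Fraction(UPPER) ** n - Fraction(LOWER) ** n) / n


# Normalisation: a times the integral of x must equal 1.
a = 1 / integrate_power(Fraction(1), 1)

# E[X] integrates x * (a x); E[X^2] integrates x^2 * (a x).
mean = integrate_power(a, 2)
second_moment = integrate_power(a, 3)
variance = second_moment - mean ** 2

print(a, mean, variance)             # 1/50 20/3 50/9
print(float(mean), float(variance))
```

Working in fractions shows that the exact variance is 50/9, so any small differences in the last decimal place come from rounding the mean before squaring it.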

Example Continued
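The remaining part of the example asks for the reliability and hazard functions. Assuming a = 1/50 from the normalisation step, integrating t/50 from 0 to x gives F(x) = x²/100, so R(x) = 1 − x²/100 and h(x) = f(x)/R(x) = 2x/(100 − x²) for 0 < x < 10. The sketch below evaluates these expressions; the function names are illustrative only.

```python
def pdf(x):
    """f(x) = x / 50, valid for 0 < x < 10."""
    return x / 50.0


def cdf(x):
    """F(x) = x^2 / 100, the integral of t / 50 from 0 to x."""
    return x * x / 100.0


def reliability(x):
    """R(x) = 1 - F(x) = 1 - x^2 / 100."""
    return 1.0 - cdf(x)


def hazard(x):
    """h(x) = f(x) / R(x), which simplifies to 2x / (100 - x^2)."""
    return pdf(x) / reliability(x)


for x in (2.0, 5.0, 8.0):
    print(x, round(reliability(x), 4), round(hazard(x), 4))
```

The hazard rises as x approaches 10 because the remaining reliability in the denominator shrinks toward zero.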

Probability Density Function (PDF)

[Figure: histogram of the data; horizontal axis from 0.59 to 2.63, vertical axis Count from 0 to 40]

Probability Density Function

[Figure: f(x) plotted against x, with points a and b marked on the x axis]

[Figure: probability density and cumulative distribution plotted against Diameter]

Cumulative Distribution Function (CDF)

Reliability Function

[Figure: F(x) and R(x) plotted against x on a vertical scale from 0 to 1]

Hazard Function

• Instantaneous failure rate