Experimental Design

Introduction

• What is DOE?

• Why & when to use DOE?

• Main purpose of DOE: Gain knowledge with minimum expense

What is DOE?

• DOE is a structured method of data collection & analysis for empirical curve fitting

• It begins with the statement of the experimental objective and ends with the reporting of the results

• A systematic set of experiments which permits evaluation of the effect of one or more factors without concern about extraneous variables or subjective judgments

• It is the vehicle of the scientific method giving unambiguous results which can be used for inferring cause & effect

• Possible Objectives:

• Eliminate non-significant factors

• Estimate Y=f(x) relationship

• Design or process optimisation

• It may often lead to further experimentation (heuristic approach – with each step in experimentation, new hypotheses are generated that need to be tested)

Why & When to Use DOE?

• If you know the physics, you don’t need experimentation

• DOE may be more expensive than other options. Consider other options before DOE

• Multi-Vari charts

• Stepwise Regression with historical data

• SPC & process control

• Geographic Analysis

DOE Purpose

• Gain knowledge to

• Improve something

• Optimise something

• Solve a problem

• DOE enables knowledge to be gained

• Efficiently

• Objectively

Terminology

• Response

• The dependent variable

• Output

• Effect

• Y

• There may be more than one response variable!

• Be careful not to ignore response variables that are not the focus of the experiment

• An engineer wants to increase productivity, but not at the sacrifice of quality

• You want to reduce the electrical interference of a radio, but reception quality cannot be sacrificed

Terminology

• Factor

• The independent variable

• Input

• Controlled variable

• X

• It is the variable under investigation

• The variable settings are manipulated in a controlled way during the experiment

• May be quantitative or qualitative

Steps of DOE

1. Verify your measurement system

2. Build linear model (1st order)

• Pick many factors

• Screen out factors

• Eliminate confounding

3. Optimise non-linear model (2nd order or higher)

• Evolutionary Operation (EVOP)

• Response Surface Methodology (RSM)

DOE Process

• Objective & Planning Phase: define experimental objective and purpose

• Screening Phase: screen for influential variables (Xs)

• Confirmation Phase: correlate analysis results with the actual process

• Optimisation Phase: optimise the response variables (Ys)

Objective & Planning Phase Purpose

• Clearly define the purpose of the DOE (maximise, minimise, hit a target, or minimise variance)

• Answer the question: “Is the purpose of the DOE consistent with the practical problem statement?”

• Perform measurement system and stability assessments

• The objective phase is often overlooked

Objective & Planning Phase Tasks

• What is the practical problem?

• What is the response variable?

• Is it the correct one? Is it the only one?

• Gauge repeatability & reproducibility; can we measure the expected changes in our response variable(s)?

• What is our desired response?

• What is the objective (maximise, minimise, or hit target) in terms of response?

• Is the process stable?

Screening Phase Purpose

• Identify variables that have a significant effect on the response

• NOT interested in defining a mathematical relationship at this phase

• Goal is to determine the factors that are carried for further experimentation

Screening Phase Tasks

• What are potential X variables?

• What are the noise, control and signal factors for the experiment?

• Select experimental design (set-up)

• What are the factor level settings?

• Perform (series of) screening experiment(s)

DOE Pre-Work

P-Diagram: signal factors, control factors and noise factors act on the product or process, which produces the response

Concerns:

1. Variation due to shift, batch, environment, maintenance intervals, machines, etc.

2. Measurement system capability

3. Process control

4. Discipline in performing experiment

Countermeasures:

1. Randomisation, blocking

2. Measurement systems analysis; replication

3. Factor selection, factor level settings, replication

4. Experimental procedure

Screening Model

Assume a linear relationship – identify factors that move the response

Figure: response versus factor, comparing the assumed linear relationship with the real predictive relationship

Screening Guidelines

• Only use two levels (assume linear model)

• Set the levels as far apart as possible, but realistic

• Include as many factors as possible

•Demonstration: <Factor Levels Simulation.xls>

Screening Designs

• Use standard designs that minimise the number of trials

• Plackett-Burman

• Orthogonal or balanced

• Taguchi copied these

• In Minitab or at the link below

• <Factorial Designs.xls>

Analysis of Variance

Demonstration: <SupportFilesStandardDeviationSearch.xlsx>

Total variability = Between Groups variability (effects) + Within Groups variability (noise)

Variance = (Sum of Squares) / df = Mean Square
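The split of total variability into between-groups and within-groups parts can be checked numerically. The sketch below is only an illustration: the two groups (one factor at a low and a high setting) and their values are made-up data, and the function name is hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Split total variability into between-groups and within-groups parts."""
    values = [y for group in groups for y in group]
    grand_mean = mean(values)
    # Sum of squares of every observation about the grand mean
    ss_total = sum((y - grand_mean) ** 2 for y in values)
    # Between groups: group means about the grand mean (effects)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: observations about their own group mean (noise)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    # Mean Square = (Sum of Squares) / df
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f_ratio": ms_between / ms_within,
    }

# Illustrative (made-up) responses at the low and high setting of one factor
result = one_way_anova([[8, 10, 9], [14, 15, 16]])
print(result)
```

A large ratio of the between-groups mean square to the within-groups mean square suggests the factor moves the response by more than the noise does.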

Degrees of Freedom (DF)

• What is DF?

• With every increase in DF, the better you can predict what is going on

• With 3 factors and a screening design

• Y = c0 + c1X1 + c2X2 + c3X3 + error

• 9 = c0 + c1(-1) + c2(-1) + c3(+1) + error

• 7 = c0 + c1(-1) + c2(+1) + c3(-1) + error

• 5 = c0 + c1(+1) + c2(-1) + c3(-1) + error

• 19 = c0 + c1(+1) + c2(+1) + c3(+1) + error

•4 unknowns, 4 equations = perfect fit?

X1   X2   X3    Y
-1   -1   +1    9
-1   +1   -1    7
+1   -1   -1    5
+1   +1   +1   19
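The four equations above can be solved directly, which shows why this design leaves nothing for estimating error. The sketch below assumes only the four runs in the table; because the columns are orthogonal, each coefficient is the sum of (column × response) divided by the number of runs.

```python
# Runs from the table: (X1, X2, X3, Y)
runs = [
    (-1, -1, +1, 9),
    (-1, +1, -1, 7),
    (+1, -1, -1, 5),
    (+1, +1, +1, 19),
]
n = len(runs)

# Intercept c0 is the mean response
c0 = sum(run[3] for run in runs) / n
# For an orthogonal two-level design, c_j = sum(x_j * y) / n
coeffs = [sum(run[j] * run[3] for run in runs) / n for j in range(3)]

# Residual of each run under Y = c0 + c1*X1 + c2*X2 + c3*X3
residuals = [
    y - (c0 + sum(c * x for c, x in zip(coeffs, (x1, x2, x3))))
    for x1, x2, x3, y in runs
]

# Degrees of freedom left for error: runs minus fitted coefficients
df_error = n - (1 + len(coeffs))

print("c0 =", c0, "c1, c2, c3 =", coeffs)
print("residuals =", residuals, "df for error =", df_error)
```

The fit is perfect only because there are as many unknowns as equations. With zero degrees of freedom for error, the fit says nothing about noise, which is why replication is needed.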

DF Example

Falsely represents noise in the system

Is the slope of the line > 0?

Figure: response versus Factor A (axis from 0 to 20) with the low, centre point and high settings marked

DF Strategy

• Do not replicate centre points

• Replicate a subset of points – usually start with the treatments that generated the highest & lowest response (if more replicates are required, go to the treatments with the next highest and lowest response)

• Number of replications will always be a function of time and money

Repetitions vs. Replications

Repetitions: set up the equipment once, then run Trial 1 to Trial 9

Replications: three separate set-ups

Set-up 1   Set-up 2   Set-up 3
Trial 1    Trial 2    Trial 3
Trial 4    Trial 5    Trial 6
Trial 7    Trial 8    Trial 9

Repetitions

• Multiple observations of the same experimental run (no adjustments of the settings, average of responses)

• Minimises within subgroup error

Replicates

• Duplication of a series of runs (takes error setting up equipment into account)

• Gives information to predict experimental noise in the system

• Adds Degrees of Freedom

Screening Design for 7 Factors with 8 runs

Exp. No.   A   B   C   D   E   F   G   Results
1          –   –   –   +   +   +   –
2          +   –   –   –   –   +   +
3          –   +   –   –   +   –   +
4          +   +   –   +   –   –   –
5          –   –   +   +   –   –   +
6          +   –   +   –   +   –   –
7          –   +   +   –   –   +   –
8          +   +   +   +   +   +   +
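The table above can be reproduced from a full factorial in A, B and C by taking D = AB, E = AC, F = BC and G = ABC. These generators are not stated on the slide; they are inferred from the signs in the table, and the function name is hypothetical. The sketch also checks that every pair of columns is orthogonal (balanced).

```python
from itertools import combinations, product

def screening_design_7_factors():
    """Eight runs for seven two-level factors, coded -1 / +1."""
    runs = []
    # product() varies its last element fastest, so unpack as (c, b, a)
    # to make A alternate every run, as in the table
    for c, b, a in product((-1, 1), repeat=3):
        runs.append((a, b, c, a * b, a * c, b * c, a * b * c))
    return runs

design = screening_design_7_factors()

# Two columns are orthogonal when the sum of their products is zero
columns = list(zip(*design))
orthogonal = all(
    sum(x * y for x, y in zip(col_i, col_j)) == 0
    for col_i, col_j in combinations(columns, 2)
)

for number, run in enumerate(design, start=1):
    print(number, " ".join("+" if level > 0 else "-" for level in run))
print("all column pairs orthogonal:", orthogonal)
```

Because every column is a product of others, main effects in this design are aliased with two-way interactions.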

Randomise Trials

•Prevents a lurking (hidden) variable from influencing results

•Examples:

• Ambient temperature increasing from 15 °C to 30 °C

• Learning effects

• Experimenter fatigue
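One simple way to randomise trials is to shuffle the run numbers of the design before executing it. The sketch below assumes an eight-run design; the function name and the seed value are illustrative only (a seed makes the order reproducible for the experiment record).

```python
import random

def randomised_run_order(n_runs, seed=None):
    """Return the run numbers 1..n_runs in random order."""
    rng = random.Random(seed)  # private generator, so the seed is local
    order = list(range(1, n_runs + 1))
    rng.shuffle(order)
    return order

run_order = randomised_run_order(8, seed=2014)
print("execute the design rows in this order:", run_order)
```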

Why Do I Need Statistics?

• Paretos don’t always work

• Demonstration: <SupportFilesPareto.xlsx>

• Show main effects plot

• Compute significance

Interactions

• Coupled effects – the response is dependent upon the input of two or more factors

• Confounded

• Aliased

• Are you healthy if you weigh 23 kg (50 lbs)?

• Yes, if you are 1.22 m (4 feet) tall

• Weight and height are called interacting factors

Interactions

•Y = c0 + c1A + c2B + c12AB

Column A = Column B × Column C

Demonstration: <Interaction.xls>

Trial   Factor A   Factor B   Factor C
1       -1         -1         +1
2       -1         +1         -1
3       +1         -1         -1
4       +1         +1         +1

2-Way Interactions

A + BC

B + AC

C + AB
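The aliasing listed above can be verified from the four-trial table: each factor column equals the product of the other two. The sketch below also uses a made-up response, 10 + 3·B·C with no real effect of A, to show that the estimate for A picks up the BC interaction.

```python
# Four-trial design from the table: (A, B, C)
design = [(-1, -1, 1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]

col_a = [a for a, b, c in design]
col_b = [b for a, b, c in design]
col_c = [c for a, b, c in design]

# Interaction columns are element-wise products of factor columns
col_bc = [b * c for a, b, c in design]
col_ac = [a * c for a, b, c in design]
col_ab = [a * b for a, b, c in design]

aliased = (col_a == col_bc, col_b == col_ac, col_c == col_ab)
print("A = BC, B = AC, C = AB:", aliased)

# Hypothetical response driven only by the BC interaction
responses = [10 + 3 * b * c for a, b, c in design]
# The coefficient estimated for A is nevertheless non-zero
estimate_a = sum(a * y for a, y in zip(col_a, responses)) / len(design)
print("estimated coefficient for A:", estimate_a)
```

The estimate labelled "A" is really A + BC, so the experiment alone cannot tell the two apart.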

Resolution

• Resolution refers to the amount of information that may be obtained from a given experiment.

• The higher the resolution, the more information may be obtained from the experiment (i.e., learn about interactions and higher order terms).

• For most experimentation, three resolutions are appropriate to discuss (III, IV, & V).

Resolution III: A design in which main effects may be separated from other main effects, but not from interactions. That is, interactions are confounded or aliased with main effects.

Resolution IV: A design in which main effects may be separated from other main effects and two-way interactions (two factor), but two-way interactions are confounded with other two-way and higher order interactions.

Resolution V: A design in which main effects may be separated from other main effects, and two-way interactions may be separated from other two-way interactions, but higher order interactions are confounded.

Saturated Designs

•Beware of how software handles non-saturated designs

•Demonstration: <Dummy Variable.xls>

One Factor at a Time

Figure: surface and contour plots of the response (bands from 0 to 10000) over X1 and X2, with points A, B, C and D marked

Worst Case: Pure Interaction

F = ma, V = IR

Approach: Change one factor at a time (from min to max) while holding all other factors constant
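A pure interaction such as V = IR shows how one-factor-at-a-time can miss everything. The sketch below assumes both factors range from 0 to 10 and that the factor not being changed is held at its minimum; those levels and the function names are illustrative assumptions.

```python
from itertools import product

LOW, HIGH = 0.0, 10.0  # assumed factor range for illustration

def response(x1, x2):
    """Pure interaction: the response is the product of the two factors."""
    return x1 * x2

def one_factor_at_a_time(f, low, high):
    """Change each factor from min to max while the other stays at min."""
    baseline = f(low, low)
    effect_x1 = f(high, low) - baseline
    effect_x2 = f(low, high) - baseline
    return effect_x1, effect_x2

def full_factorial(f, low, high):
    """Observe the response at every combination of the two levels."""
    return {(x1, x2): f(x1, x2) for x1, x2 in product((low, high), repeat=2)}

ofat_effects = one_factor_at_a_time(response, LOW, HIGH)
factorial_results = full_factorial(response, LOW, HIGH)

print("one-factor-at-a-time effects:", ofat_effects)
print("full factorial responses:", factorial_results)
```

Neither factor appears to do anything on its own, yet the corner where both are high gives the largest response. Only a design that changes the factors together sees it.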

Full Factorial

• Used to determine which factors have a statistically significant effect on the response variable(s)

• Factors may be quantitative or qualitative

• At least one value of the response is observed at each treatment combination

• Normally the experiment is significantly larger due to the multiple treatment combinations

• No confounding is present in a Full Factorial

Fractional Factorial

• Used to determine which factors have a statistically significant effect on the response variable(s)

• Factors may be quantitative or qualitative

• Goal is not to define a mathematical model, but to determine which factors should be included in further experimentation

• Confounding is present in a Fractional Factorial

• Initial experimentation will utilise fewer runs than a Full Factorial

• Preferred Choice for Initial Screening Design

Example

• Screening

• Resolution separation

Response Surface Methodology

2

Objective – Response surface methodology

• Develop conceptual understanding of response surface methodology (RSM)

• Describe method of steepest ascent

• Discuss response surface methodology modelling including characterising the response surface

• Discuss designs for fitting response surfaces

• Describe analysis methods when multiple responses are present

Figure: surface and contour plots of a response (bands from -1500 to 3000) over two input variables

Response surface methodology

• A “response surface” is a topographical representation of the response over a region of the input variables

• Response Surface methods are simple curve fitting

• Use 2nd order polynomial

• Limitation: Real world is not 2nd order polynomials

• Sequential approach using conclusions from screening to reduce experimental region

Figure: four response surface shapes (maximum, minimum, saddle point, stationary ridge)

Steps of response surface methodology

• Typically a sequential experiment

• Start with response surface over a large region

• Use results of larger response surface to focus on smaller region that may contain optimum point

• Repeat as necessary

• May stop at any time and use EVOP

Response surface models

Use second order polynomial approximations

• Include the linear effects

• Allow us to estimate curvature (squared terms)

• Model all second order interactions

Second order polynomial: Two factors

Y = a0 + a1A + a2B + a3A² + a4B² + a5AB

Second order polynomial: Three factors

Y = a0 + a1A + a2B + a3C + a4A² + a5B² + a6C² + a7AB + a8AC + a9BC
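The number of coefficients in a second order polynomial grows quickly with the number of factors, and a design needs at least that many distinct runs to fit it. The sketch below lists the terms in the same order as the two equations above; the function name is hypothetical.

```python
from itertools import combinations

def second_order_terms(factors):
    """List the terms of a full second order polynomial in the factors."""
    terms = ["1"]                                   # constant a0
    terms += list(factors)                          # linear effects
    terms += [f"{name}^2" for name in factors]      # curvature (squared)
    terms += [a + b for a, b in combinations(factors, 2)]  # interactions
    return terms

two_factor_terms = second_order_terms("AB")
three_factor_terms = second_order_terms("ABC")

print(len(two_factor_terms), two_factor_terms)
print(len(three_factor_terms), three_factor_terms)
```

For k factors the count is (k + 1)(k + 2) / 2, which gives 6 coefficients for two factors and 10 for three.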

Type of response surface designs

• Box-Behnken design

• Minimal design points

• The factors are never all set at their high levels simultaneously

• Central composite design (CCD)

• Can incorporate information from a properly planned factorial experiment

• Only need to add axial and centre points


• For three factors, the Box-Behnken design consists of twelve “edge” points (shown as solid dots), all lying on a single sphere about the centre of the experimental region, plus three replicates of the centre point.
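The three-factor Box-Behnken layout described above can be generated by setting each pair of factors to every ±1 combination while the third factor stays at 0. This is a sketch for three factors only, in coded units; the function name is hypothetical.

```python
from itertools import combinations, product

def box_behnken_three_factors(centre_replicates=3):
    """Twelve edge points plus replicated centre points, coded -1/0/+1."""
    points = []
    for i, j in combinations(range(3), 2):          # each pair of factors
        for level_i, level_j in product((-1, 1), repeat=2):
            point = [0, 0, 0]                       # third factor at centre
            point[i], point[j] = level_i, level_j
            points.append(tuple(point))
    points += [(0, 0, 0)] * centre_replicates
    return points

bbd = box_behnken_three_factors()
edge_points = [p for p in bbd if p != (0, 0, 0)]

# Every edge point is the same distance from the centre (one sphere)
squared_radii = {sum(level ** 2 for level in p) for p in edge_points}

print(len(bbd), "runs;", len(edge_points), "edge points")
print("squared distance from centre:", squared_radii)
print("all-high corner included:", (1, 1, 1) in bbd)
```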

Central composite designs

• “Cube” + “Star” + Centre Points

• Variants: “Circumscribed”, “Face-centred”, “Inscribed”

Figure: design points on axes A, B and C at coded levels -1 and +1
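A central composite design in coded units can be assembled from its three parts: cube, star (axial) and centre points. In the sketch below, the function name and the `alpha` parameter (the axial distance) are illustrative; `alpha = 1` gives the face-centred variant, and `(2**k) ** 0.25` is a common rotatable choice for a circumscribed design.

```python
from itertools import product

def central_composite(k, alpha=1.0, centre_replicates=1):
    """Cube + star + centre points for k factors, in coded units."""
    # Cube: the two-level full factorial at -1 / +1
    cube = [tuple(float(level) for level in corner)
            for corner in product((-1, 1), repeat=k)]
    # Star: one factor at -alpha or +alpha, all others at the centre
    star = []
    for i in range(k):
        for distance in (-alpha, alpha):
            point = [0.0] * k
            point[i] = distance
            star.append(tuple(point))
    centre = [(0.0,) * k] * centre_replicates
    return cube + star + centre

face_centred = central_composite(2, alpha=1.0)
circumscribed = central_composite(2, alpha=(2 ** 2) ** 0.25)

print(len(face_centred), "runs, face-centred:", face_centred)
print("axial distance, circumscribed:", circumscribed[4])
```

For two factors the face-centred design has the same nine points as a 3×3 full factorial, so the cube runs from an earlier two-level factorial can be reused.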

Experimental set-up for 2 factors

Trial   Factor A   Factor B
1       -1         -1
2       -1          0
3       -1          1
4        0         -1
5        0          0
6        0          1
7        1         -1
8        1          0
9        1          1

Design using coded variables

Simplest design is 3×3 full factorial

Figure: grid of design points at coded levels -1, 0 and 1 (axis label: Factor A)

*** Re-use trials from screening experiments ***

Response surface procedure

• The real world is not 2nd order polynomials

• Limit the size of the region modelled to improve accuracy

• Consider sequential RSM with each new RSM modelling a smaller region

Sequential Simplex Optimisation

3

EVOP – Sequential simplex optimisation

Figure: simplex vertices plotted on Factor A (40 to 120) against Factor B (0 to 25), labelled with responses 60, 80, 90, 70, 85 and 93

EVOP principles

• Evolutionary operation, abbreviated EVOP

• An evolutionary, iterative path of the steepest ascent method for determining the optimum process setting

• Typically performed in manufacturing

• Small changes in factor settings with large sample sizes

• Non-disruptive to manufacturing process; experimental parts are shippable

• Also valuable approach outside of manufacturing

• Small sample sizes

• Alternative to response surface methodology

EVOP principles

• Factor levels may be the boundaries for producing acceptable (and saleable) products

• Can be run during normal production time

• Data from normal production runs provide input for calculations of subsequent process parameter settings

• Process is repeated until optimum response variable is obtained

Sequential simplex optimisation: fixed step size example

Figure: simplex vertices plotted on Factor A (40 to 120) against Factor B (0 to 25), labelled with responses 60, 80, 90, 70, 85 and 93

Sequential simplex fundamentals

• The number of trials in the initial simplex is k+1 (k is the number of experimental factors)

• Factorial approach has at least 2^k, and possibly 3^k or 4^k trials

• Only one new trial is required to move to a new area in the space defined by the factors

• Factorial design requires at least 2^(k-1) trials

• Don’t have to worry about a mathematical model

• Decisions are based on ranking of vertices of our simplex
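The trial counts quoted above can be tabulated for a given number of factors k. The sketch below simply evaluates those expressions; the function and key names are illustrative.

```python
def trial_counts(k):
    """Trials needed to start, for k experimental factors."""
    return {
        "simplex": k + 1,                 # initial simplex
        "full_2_level": 2 ** k,           # two-level full factorial
        "full_3_level": 3 ** k,           # three-level full factorial
        "half_fraction": 2 ** (k - 1),    # smallest count quoted for a move
    }

for k in (2, 3, 5):
    print(k, trial_counts(k))
```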

Sequential simplex fundamentals

• How does it work?

1. Choose initial simplex

2. Experiment at each setting defined by vertices of simplex

3. Rank the vertices as best, next best, and worst based on response

4. Calculate “Reflection” of worst point and experiment at the reflection

5. Go back to step 3 and repeat until optimum is reached

Figure: simplex with vertices BEST (B), NEXT BEST (N) and WORST (W) and the REFLECTION (R) of the worst point, plotted against Variable X1
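The reflection in step 4 can be computed by mirroring the worst vertex through the centroid of the remaining vertices. The sketch below assumes the response is being maximised; the vertex coordinates and responses are made-up illustration values, and the function name is hypothetical.

```python
def reflect_worst(vertices, responses):
    """Return the index of the worst vertex and its reflection."""
    # Worst vertex = lowest response (assumes we are maximising)
    worst = min(range(len(vertices)), key=lambda i: responses[i])
    others = [v for i, v in enumerate(vertices) if i != worst]
    n_factors = len(vertices[0])
    # Centroid of all vertices except the worst one
    centroid = [sum(v[d] for v in others) / len(others)
                for d in range(n_factors)]
    # Reflection: same distance beyond the centroid, R = 2*centroid - W
    reflection = tuple(2 * centroid[d] - vertices[worst][d]
                       for d in range(n_factors))
    return worst, reflection

# Illustrative simplex for two factors (Factor A, Factor B)
simplex = [(40, 5), (60, 5), (50, 15)]
observed = [60, 80, 70]

worst_index, new_point = reflect_worst(simplex, observed)
print("worst vertex:", simplex[worst_index])
print("next trial at:", new_point)
```

The next trial is run at the reflected point, which then replaces the worst vertex before the vertices are ranked again.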

Choosing starting point

In manufacturing

• Start with small “region” to minimise disruption to process

Where you don’t need to worry about disrupting a production process

• Start with a large “region”

• Large starting regions tend to converge on optimal solution more quickly

“Sequential Simplex Optimization” – Walters, F.H., Parker, L.R., Morgan, S.L. and Deming, S.N.

www.chem.sc.edu/faculty/morgan/pubs/sequentialsimplexoptimization.pdf

Tips for simplex optimisation

• Make sure that you have a real difference in results (check confidence intervals)

• Don’t be afraid to change sample sizes as vertices become closer or farther apart

• If your simplex collapses, pick new vertices and start again

• For example, three points in a straight line

• Can happen if you run into a boundary for a variable

Sequential simplex fundamentals

• Fixed step size

• No choice other than reflection

• Simplest form of simplex

• Issues with fixed size simplex model

• If a large step size is used, the optimum point is never reached

• If small steps are used, it takes an excessive number of steps to reach the optimum point

• Variable size simplex solves these issues

Sequential simplex fundamentals

• Variable size simplex

• Instead of simple, fixed step size reflections (R), there are other options:

• double the length of the reflection (E)

• reduce the length of the reflection by 50% (Cr)

• produce the reflection in the opposite direction but at 50% of the normal length (Cw)

• Makes the size of the simplex variable

• Increases rate to arrive at optimum

Variable size simplex concept

Figure: simplex with vertices B, N and W and the candidate points R, E, CR and CW, plotted against Variable X1

1. Rank vertices: B, N, W

2. Try and rank “R”

• If R > B, try E

– If E > B, use E

– Else, use R

• If N <= R <= B, use R

• If W <= R < N, use CR

• If R < W, use CW
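The decision rules above can be written as one step of a variable size simplex. This is a sketch under stated assumptions: the response is maximised, `run_trial` is a hypothetical stand-in for running an experiment at a setting, and the test surface is made up. The candidate points are taken along the line from W through the centroid P of the other vertices: R = P + (P − W), E = P + 2(P − W), CR = P + 0.5(P − W), CW = P − 0.5(P − W).

```python
def variable_size_step(vertices, run_trial):
    """Replace the worst vertex using the R / E / CR / CW rules."""
    # Rank vertices by response, best first (a real experiment would
    # reuse stored responses instead of re-running existing vertices)
    scored = sorted(((run_trial(v), v) for v in vertices), reverse=True)
    y_best = scored[0][0]
    y_next = scored[-2][0]          # N: next-to-worst vertex
    y_worst, worst = scored[-1]
    kept = [v for _, v in scored[:-1]]
    n_factors = len(worst)
    # P: centroid of every vertex except the worst
    p = [sum(v[d] for v in kept) / len(kept) for d in range(n_factors)]

    def along(scale):
        # Point at P + scale * (P - W)
        return tuple(p[d] + scale * (p[d] - worst[d])
                     for d in range(n_factors))

    r = along(1.0)
    y_r = run_trial(r)
    if y_r > y_best:                # R beats B: try the expansion
        e = along(2.0)
        new, rule = (e, "E") if run_trial(e) > y_best else (r, "R")
    elif y_r >= y_next:             # N <= R <= B
        new, rule = r, "R"
    elif y_r >= y_worst:            # W <= R < N
        new, rule = along(0.5), "CR"
    else:                           # R < W
        new, rule = along(-0.5), "CW"
    return kept + [new], rule

# Made-up response surface with its maximum at (5, 5)
def run_trial(setting):
    return -((setting[0] - 5) ** 2 + (setting[1] - 5) ** 2)

start = [(0, 0), (1, 0), (0, 1)]
new_simplex, rule_used = variable_size_step(start, run_trial)
print(rule_used, new_simplex)
```

Here the reflection improves on the best vertex, so the expansion is tried and kept. Near the optimum the contraction rules take over and the simplex shrinks.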

Sequential simplex fundamentals

• Confidence intervals

• When entering new points in the Simplex, it is important to determine that the new point is statistically different (with a degree of confidence) from the three prior points considered

• This is accomplished by a form of hypothesis test of the means

• It is critical to perform this test when the Simplex begins to converge on the optimum and the points grow progressively closer together

Demonstration: <Simplex Confidence Intervals.xls>

Traps in sequential simplex

• Collapse of the simplex

– Can collapse if the aspect ratio of the shape becomes too large

– To avoid, monitor the vertices and re-start the simplex with new points if collapse is evident

• Run into limit

– If the simplex falls outside the variable limit, the variable is set to the limit

– This could cause the simplex to collapse

– Consider re-starting with a small simplex in the best region

• Local vs. global optimum

– Multiple optimal points can cause the simplex to find a local optimum rather than the global optimum

– To avoid, a large starting simplex or starting with a response surface is advisable

Variable size simplex

• Tips for initial simplex:

• Allow sufficient room between the initial simplex and the factor limits

• Select a range for the initial simplex such that its vertices are not within one step of the factor limits

2014-11-06 Design of Experiments, Slide 58

Comparison: sequential simplex vs. RSM

Sequential simplex | Response surface method
No underlying mathematical model | Based on mathematical model
If “k” = # factors, you need (k+1) points to start, with 1 additional point per step | If “k” = # factors, you will need 3^k points to start, and 3^k points for each step
Seeks optimum “point” | Seeks optimum settings, with some information on surrounding regions
Information on “path” to optimum | Information collected for “regions” of operation
Better suited for ongoing manufacturing process improvement | Better suited for design optimisation, or brand new manufacturing process

Causes for a “noisy experiment”

• Measurement error

• Measured the wrong response

• Missed a significant factor

• Undisciplined experimental procedure

• Experimental procedure was not controlled

• Planning after research completed

• Should have blocked out noise

• No randomisation

• Levels were too close

• Interaction mistakenly used for degrees of freedom

When not to use experimental design

• If the relationship is defined with equations representing the physical relationship, it is better just to perform the calculations rather than try to derive the relationship through experimentation

• Remember that experimentation predicts the relationship between the dependent and independent variables by assuming a very simple mathematical model
