
Chapter 7

One Population

The only relevant test of the validity of a hypothesis is comparison of its predictions with experience.
Milton Friedman (1912 - 2006) Economist and Statistician


ONE POPULATION HYPOTHESIS TESTING


Humor Alert: I don't know why people are so negative about statistics and statisticians. I'm only a first-year student, and statistics has already taught me everything I need to know about life - always Proceed with Caution and Reject H0! - Priscilla Mok


NOMINAL DATA


THE RUNS TEST

The Runs Test addresses only nominal data, only two data categories, and only whether or not the sample observed was drawn from a population that is generated randomly. An examination of runs can assist in deciding upon the randomness or nonrandomness of what is actually observed. Runs can be used to formulate tests of hypotheses regarding the randomness of the population from which the sample was drawn.

H0: the number of runs observed is due to chance; nothing is influencing the generation of runs (number of runs = what is expected from a chance or random process)
H1: the number of runs observed is due to some factor other than chance; something is influencing the generation of runs (number of runs ≠ what is expected from a chance or random process)

A run is simply a series (1 or more) of identical values, ended by a change to the other value. For example, consider tossing a coin 26 times (the sample size: n=26) with this result:

H H H   T T T   H H   T T   H H H   T T T   H H   T T   H H   T T T T

In the example above, there are 10 runs (each run is set apart by extra spacing), which is the observed number of runs. Note that a run can be of any length: from 1 to ∞.
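
Counting runs by hand gets error-prone with long sequences. As a minimal sketch (Python is used only for illustration, and the function name count_runs is hypothetical, not part of the downloadable spreadsheets), the coin toss example above could be counted like this:

```python
from itertools import groupby

def count_runs(sequence):
    """Count the runs (unbroken blocks of identical adjacent values)."""
    return sum(1 for _value, _block in groupby(sequence))

# The 26 coin tosses from the example above, written without spaces.
tosses = "HHHTTTHHTTHHHTTTHHTTHHTTTT"

print(len(tosses))                            # sample size n
print(count_runs(tosses))                     # observed number of runs
print(tosses.count("H"), tosses.count("T"))   # number of heads and of tails
```

groupby starts a new block every time the value changes, so the number of blocks is the number of runs.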

The "H" is arbitrarily changed to a 1 and the "T" is changed to a 0. Then the number of 1s (n1), the number of 0s (n2), and the observed number of runs are counted. That is all that is needed. Using the formulae below, the expected number of runs (Rbar) and the variance of the number of runs (S2Rbar) can be calculated.

Expected number of runs = Rbar = (2*n1*n2)/(n1+n2) + 1
      where: n1 is the number of "1"s ("H") observed, and n2 is the number of "0"s ("T") observed

The variance of the number of runs = S2Rbar = ((2*n1*n2)*(2*n1*n2-n1-n2))/(((n1+n2)^2)*(n1+n2-1))
The standard deviation (needed to calculate Zcalc) is simply the square root of the variance.
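
For readers who want to check the arithmetic outside the spreadsheet, here is a minimal sketch of the two calculations. The function names are hypothetical. The formulas are written so that they reproduce the figures this chapter quotes for the coin toss example (12 heads and 14 tails give an expected number of runs of about 13.92 and a standard deviation of about 2.48).

```python
import math

def expected_runs(n1, n2):
    """Expected number of runs (Rbar) when the process is random."""
    return (2 * n1 * n2) / (n1 + n2) + 1

def runs_variance(n1, n2):
    """Variance of the number of runs when the process is random."""
    n = n1 + n2
    return (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (n ** 2 * (n - 1))

# Coin toss example: n1 = 12 heads ("1"s), n2 = 14 tails ("0"s).
rbar = expected_runs(12, 14)
sd = math.sqrt(runs_variance(12, 14))
print(rbar, sd)
```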

The above formulae appear formidable, so thank goodness for the "COPY" command, and that the RUNS spreadsheets can be downloaded. In other words, the hard work has already been done for you!

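So, the test statistic, conveniently named Z and called Zcalc, is: Zcalc = (observed number of runs - Rbar)/√(S2Rbar), where S2Rbar is the variance calculated above.
Equivalently, Zcalc = (observed number of runs - Rbar)/(standard deviation)   Note: this is the way it is calculated in the spreadsheets. It is the absolute value of Zcalc that is compared with the "critical" Z.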

The test statistic is approximately normally distributed, so you can make good decisions by using the OpenOffice Normal Distribution function NORMINV.


WARNING: in order to use the OpenOffice NORMINV function properly, the value of α (typically 0.05) MUST be halved, then subtracted from 1. That value, 1 - α/2 (typically 0.975), is entered as the first argument of the NORMINV function. The second and third arguments are ALWAYS 0 and 1 (the mean and standard deviation of the standard normal probability distribution).

You can, alternatively, use the NORMSINV function, where the second and third arguments (0 and 1) are assumed. All else is the same. Both functions return the same value.


Here is how you find the appropriate value by using the NORMINV function to get the "critical" Z value.
Typically, α = .05 (you want to be [at least] 95% confident that you are correct regarding your decision about H0), so first divide α by 2, yielding .025.
    Subtract that from 1, yielding .9750, the first argument value entered in the NORMINV function (the second and third arguments are ALWAYS 0 and 1). The value returned is 1.95996... (or 1.96). This is your "critical" Z value.
    Use this procedure to find the "critical" Z value when conducting any test (in this book) when using the NORMINV function.
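    The reason you must use this procedure is that NORMINV works from a cumulative (left-tail) probability: it returns the Z value that has the given proportion of the distribution below it. A two-tailed test puts α/2 in each tail, so the upper "critical" Z is the point with 1 - α/2 of the distribution below it.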
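
The same lookup can be sketched outside the spreadsheet. In the example below, Python's statistics.NormalDist (available from Python 3.8) stands in for NORMINV; the function name critical_z is hypothetical and used only for illustration.

```python
from statistics import NormalDist

def critical_z(alpha):
    """Two-tailed "critical" Z: the point with 1 - alpha/2 of the standard normal below it."""
    return NormalDist(mu=0, sigma=1).inv_cdf(1 - alpha / 2)

print(round(critical_z(0.05), 5))  # 1.95996
```

Passing a different α (for example 0.01) follows exactly the same halve-then-subtract-from-1 procedure.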


Now calculate the test statistic (Zcalc) using the formula above. This quantity is called your calculated Z score or value. Why Z? Why not? It has to be called something. Zcalc is the number of standard deviations by which what you actually observed (in your sample) differs from the expected (hypothesized) number of runs (Rbar).

Now compare your calculated Z score with the Z returned by NORMINV (often called the critical Z - note that "critical" is just a communication convenience).
IF the absolute value of your calculated Z score is greater than the "critical" Z, REJECT H0 and make your managerial decision based upon some other criterion (usually what you observed, since it is available and you have nothing else to go on).

IF the absolute value of your calculated Z score is NOT greater than the "critical" Z, fail to reject H0 (which is NOT the same as accepting H0) and make your managerial decision as if H0 is correct. (Note that H0 was not accepted, or proven correct, or proven to be true, just not rejected.)

NOTE: if H0 is correct (something you will never know for certain), then Zcalc will be "close" to zero. So, comparing Zcalc to Zcrit is the "statistics" way of comparing what you observed from the sample to what you hypothesized.

THAT'S ALL THERE IS TO IT!

In the coin toss example above, there are 10 runs (observed), the expected number of runs (Rbar) is 13.923, and the standard deviation (square root of the variance) of the number of runs is 2.482, so
Zcalc = (10 - 13.923)/2.482
or Zcalc = -1.580

The minus sign (-) indicates only that the number of runs observed is less than the number expected. So compare the absolute value of your calculated Z (1.580) score against your "critical" Z (1.95996). In this example, 1.580 is not greater than 1.95996 (the "critical" Z value returned by NORMINV), so DO NOT reject H0. Your managerial decision can now treat what you observed as having been produced randomly, since you could not, based on sample evidence, reject the idea of random production.
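
To see the whole procedure in one place, here is a minimal end-to-end sketch for the coin toss example. It is an illustration only, not the downloadable spreadsheet: the function name runs_test and its return values are hypothetical, and Python's statistics.NormalDist stands in for NORMINV.

```python
import math
from itertools import groupby
from statistics import NormalDist

def runs_test(sequence, alpha=0.05):
    """Two-tailed runs test; returns (z_calc, z_crit, reject_h0)."""
    categories = sorted(set(sequence))
    if len(categories) != 2:
        raise ValueError("the runs test needs exactly two categories")
    n1 = sequence.count(categories[0])
    n2 = sequence.count(categories[1])
    n = n1 + n2
    observed = sum(1 for _ in groupby(sequence))   # observed number of runs
    rbar = 2 * n1 * n2 / n + 1                      # expected number of runs
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z_calc = (observed - rbar) / math.sqrt(variance)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_calc, z_crit, abs(z_calc) > z_crit

z_calc, z_crit, reject = runs_test("HHHTTTHHTTHHHTTTHHTTHHTTTT")
print(round(z_calc, 2), round(z_crit, 2), reject)  # -1.58 1.96 False
```

The final value is only the comparison of |Zcalc| with the "critical" Z; what to do with that information is still the decision maker's call.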

You are (at least) 95% confident that you are making the correct decision regarding H0, which states that the coin toss runs pattern observed was generated by a random process.

Notice that the Runs Test did NOT make the decision whether H0 should be rejected or not. It only provided information so that the decision maker (you) can make a decision.

Below is the spreadsheet used to conduct the "Runs" test. This spreadsheet is part of "ch7-ex.xls" which can be downloaded for further inspection, as well as for copying.

 

Notice that Rbar, SRbar, and Zcalc are all part of the spreadsheet that can be downloaded. You can copy these formulae, then modify them to be used with your sample. The "hard work" has already been done for you. This is true for ALL nonparametric procedures presented in this book.

ANOTHER EXAMPLE

Consider again the tossing of a coin. Let's say you toss the coin 25 times, with this result:

T T T T H H H H H H H H H H H T T T T T T T T T T

You want to be (at least) 99% certain that you are correct when making a decision about the randomness of the coin toss process.

H0: The coin toss process is random, that the pattern observed was generated by a random process
H1: Not H0, The coin toss process is NOT random, that the pattern observed was NOT generated by a random process
That is step 1.

α = 0.01    (you want to be 99% certain that you are making the correct decision, so α is .01)

You have completed step 1 of the 5 step hypothesis testing procedure.

To complete step 2 you need to find the "critical" Z value.

To use the NORMINV function, you first divide α by 2, subtract that value from 1, then enter that value. The "critical" value of Z is returned. It is that simple!
In this example, the critical Z value is determined by:
1 - α/2 = .995, so .995 is entered into the NORMINV function.

That is step 2.

Now it is time to take your sample (toss the coin and observe the number of runs - the data above), then calculate how far what you observed is from what H0 leads you to expect, measured in standard deviations. The "sample" is in the spreadsheet.

That is step 3.

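Now compare (the absolute value of) your calculated Z score with your critical Z score (2.576). The data above contain 3 runs, with n1 = 11 heads and n2 = 14 tails, so Rbar = 13.32, the standard deviation is 2.411, and Zcalc = (3 - 13.32)/2.411 = -4.28.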

That is step 4.

You can see, from your sample evidence, that you CAN reject H0.

From this sample evidence, you can determine that the pattern (number of runs) was not generated by a random process, that the coin toss process is not random (you rejected the idea that it was). You are (at least) 99% confident that you are correct when describing the coin toss process. (Note: you did not prove H1 to be true)

That is step 5.


The second example spreadsheet is illustrated below:


YET ANOTHER EXAMPLE

Let's assume that you own a "sports bar," and that you have spent quite a large sum of money on advertising to attract women to the bar. Now you want to see if the money spent on advertising was effective. You randomly select several nights, several different hours from the nights, then observe the men and women as they enter the bar:

M M M F F M M M M M F M M F F F M M M M M M F F M F F M M M M M F F F F M M M M

H0: pattern (Runs) observed is random, sample was drawn from a random population (advertising was not effective)
H1: pattern (Runs) observed is not random, sample was not drawn from a random population (advertising was effective)

You want to be (at least) 95% certain that you are making the correct decision regarding H0 (advertising NOT effective), so Zcrit = 1.9599.

In this example, N = 40, n1 = 26, n2 = 14, the observed number of runs is 13, Rbar = 19.2, S2 = 8.03, so S = 2.83, Zcalc = (13 - 19.2)/2.83 = -2.19, and Zcrit = 1.96.

You CAN, based upon your hypothesis test, reject H0. You are inclined, based on the sample observed, to conclude (but did not prove) that the advertising is effective, and make your managerial decision accordingly. Note that you did not PROVE H1 to be correct, only that you rejected H0.

The advertising effectiveness spreadsheet is illustrated below:



SIGN TEST

The Sign Test is used in a one population situation to determine whether a population median (often of opinion scores) is equal to or not equal to a specific hypothesized number. It makes very few assumptions about the nature of the distribution of the population being examined, so it has very general applicability. The sign test allocates a sign, either positive (+) or negative (-), to each observation according to whether it is greater or less than some hypothesized value (usually the hypothesized median value), and considers whether the pattern of "+"s and "-"s is substantially different from what we would expect by chance.

There are two separate cases of the Sign Test, depending upon sample size (n).

Any sample of size less than 25 is considered to be small, and the table in the appendix can be used to determine the "critical" sign test value.

For a sample size of 25 or greater, the results of the Sign Test closely approximate the Normal probability distribution, so NORMINV can be used.

THAT'S ALL THERE IS TO IT!


EXAMPLE - Small Sample

Suppose that you are the manager of a fast food restaurant, and want to test your opinion about the median satisfaction score of customers who try (for free) a new product. You think the median satisfaction score is 90. You want to be (at least) 95% confident that you are correct regarding your decision about H0, so α=0.05. You randomly select 12 people, have them try the new product, then ask each for a satisfaction score. The spreadsheet illustration below shows the scores and procedure:

The lesser of the differences totals (4) is NOT less than the "critical" value of 2 (α=.05, N=12) found in the table, so H0 CANNOT be rejected. There is not enough observed evidence to cause you to abandon your opinion that the satisfaction score median is 90.

Observed values equal to the hypothesized median, the ties, receive a 0 value for both differences, so they "go away."
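
If the appendix table is not at hand, the same decision can be cross-checked with an exact binomial calculation. Note that this is a swapped-in technique, not the table lookup used in this chapter: under H0 each non-tied observation is equally likely to be a "+" or a "-", so the lesser count follows a binomial distribution with probability 0.5. The function name sign_test_p_value is hypothetical.

```python
from math import comb

def sign_test_p_value(x, n_signs):
    """Exact two-sided p-value for the sign test.

    x: the lesser of the "+" and "-" counts.
    n_signs: the sum of the "+" and "-" counts (ties ignored).
    """
    lower_tail = sum(comb(n_signs, k) for k in range(x + 1)) / 2 ** n_signs
    return min(1.0, 2 * lower_tail)

# Small sample example above: lesser count of 4 out of N = 12.
p = sign_test_p_value(4, 12)
print(round(p, 4))   # 0.3877
print(p < 0.05)      # False, so H0 cannot be rejected
```

The p-value is well above α = 0.05, which agrees with the table-based decision not to reject H0.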


ANOTHER EXAMPLE - Large Sample

For a large sample (n >= 25), the calculated Sign Test value, Zcalc, is approximately normally distributed, so the NORMINV function can be used.

The formula for Zcalc is: ( (X + 0.5) - (N / 2) ) / ( SQRT(N) / 2 )
where X and N are calculated by the Sign Test procedure: X= lesser of + or - count; N = sum of + and - counts, ties ignored.
Again, this formula appears formidable. But don't worry, it can be copied.

If the (absolute value of the) calculated Z score is greater than the "critical" Z score returned by NORMINV, reject H0. Otherwise, fail to reject H0.

Note that "n" is the sample size, while "N" is the sum of + and - counts, ties ignored.
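
Here is a minimal sketch of the large-sample calculation, taking N/2 as the count expected under H0 and SQRT(N)/2 as its standard deviation. The scores are hypothetical (they are NOT the data from the chapter's spreadsheet), as are the function names; the sample was made up so that two ties show the difference between "n" and "N".

```python
import math
from statistics import NormalDist

def sign_counts(scores, hypothesized_median):
    """Count the "+" and "-" signs; scores equal to the median (ties) are ignored."""
    plus = sum(1 for s in scores if s > hypothesized_median)
    minus = sum(1 for s in scores if s < hypothesized_median)
    return plus, minus

def sign_test_z(x, n_signs):
    """Normal approximation for the sign test, with the 0.5 continuity correction."""
    return ((x + 0.5) - n_signs / 2) / (math.sqrt(n_signs) / 2)

# Hypothetical sample: n = 30 scores, hypothesized median of 35.
scores = [40] * 20 + [30] * 8 + [35] * 2
plus, minus = sign_counts(scores, 35)
x = min(plus, minus)          # X: lesser of the + and - counts
n_signs = plus + minus        # N: ties ignored, so N = 28 while n = 30
z_calc = sign_test_z(x, n_signs)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
print(x, n_signs, round(z_calc, 3), abs(z_calc) > z_crit)  # 8 28 -2.079 True
```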

Say that as a manager of a large company, you think that your employees have a median score of 35 on some arbitrary classification score that does not connote any ranking. You randomly select 30 employees, classify them, and record the scores. You want to be (at least) 95% confident that you are correct regarding your decision about H0, so α=0.05, and the NORMSINV function is used to get Zcrit (so 1 - α/2 = 0.975 is entered into the function). The spreadsheet illustration below shows the scores and procedure:

As can be seen, the absolute value of Zcalc is greater than Zcrit, so H0 CAN be rejected (note: you did not prove H1 to be true) and you can be (at least) 95% confident that you are correct when you say the median classification score is NOT 35. What is the median classification score? The Sign Test cannot answer that question.


ORDINAL DATA


WILCOXON SIGNED RANK TEST for ONE POPULATION

The Wilcoxon Signed Rank Test is used to make an inference about the median of one population. The procedure requires the specification of (what is thought to be) the median of the population of interest.

EXAMPLE - Small Sample: n <= 20

Let's assume that you are the manager of a large company, and that you have ranked employees with respect to their effectiveness when dealing with potential customers, with higher rank scores indicating that they are more effective. Now you want to see if the median rank score assigned is 4. This is your null hypothesis: H0: median score = 4. Alternatively, the median rank score may not be 4: H1: median score ≠ 4.

You randomly select nine (n = 9) employees, record their scores, and analyze the scores. The analysis is illustrated below:

The "rule" is that if the "W" value from the analysis is not less than the "critical W" value from the table, DO NOT reject H0. If the "W" value from the analysis is LESS than the "critical W" value from the table, reject H0.

IT'S THAT SIMPLE!

The calculated "W" value is 15. This is the value compared to the "critical W" value from the appendix table, which (for n=9, α=.05) is 5. Since the calculated "W" value of 15 is larger than (not less than) the "critical W" value of 5, do NOT reject H0. There is no compelling evidence, based on the sample, that the median score ≠ 4. Make the managerial decision accordingly.
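
The "W" value itself comes from the spreadsheet. As a rough sketch of how such a value can be produced, the example below assumes the usual signed rank procedure: subtract the hypothesized median from each score, drop zero differences, rank the absolute differences (tied values share the average rank), and take W as the smaller of the two rank sums. That definition of W, the function name, and the scores are all assumptions made for illustration; the scores are NOT the nine values from the chapter's spreadsheet.

```python
def signed_rank_w(scores, hypothesized_median):
    """Return (W, sum of positive ranks, sum of negative ranks)."""
    diffs = [s - hypothesized_median for s in scores if s != hypothesized_median]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Find the block of equal absolute differences starting at position i.
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2   # mean of the ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(w_plus, w_minus), w_plus, w_minus

# Hypothetical scores for n = 9 employees, hypothesized median of 4.
hypothetical_scores = [1, 2, 3, 5, 6, 7, 8, 9, 10]
w, w_plus, w_minus = signed_rank_w(hypothetical_scores, 4)
print(w, w_plus, w_minus)  # 10.5 34.5 10.5
```

With these made-up scores W is 10.5, which is not less than the "critical W" of 5 for n=9, so H0 would not be rejected here either.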

Two notes are in order here: (1) Samples of size 20 or smaller MUST use the procedure illustrated above and the table in the appendix. Sample sizes less than 5 cannot be analyzed. Sample sizes greater than 20 can use either the table (if applicable) or the normal approximation method (below); (2) "W" is approximately normally distributed for samples of size greater than 20, so the NORMINV function can be used.


ANOTHER EXAMPLE - Large Sample: n > 20

The situation described in the above example has, in this example, been expanded to 25 people's scores (n = 25) so that a large sample is created. The function NORMINV can be used for a normal approximation, as can the table. The spreadsheet is illustrated below:

For a sample size of 25, a "W" value of 59.5 is calculated. From the table (α = .05), the "critical" W value is 89. Since 59.5 is less than 89, H0 CAN be rejected.

As can be seen in the spreadsheet, the absolute value of Zcalc (-2.435) is greater than the Zcrit value of 1.96 returned by NORMINV. So, H0 CAN be rejected.

You CANNOT validly play "what if" by entering different median score values until H0 cannot be rejected because analyzing the same sample numerous times inflates the probability of committing a Type I error.

NOTE: The Wilcoxon Signed Rank Test table goes as high as n = 30, but following these rules will "keep you out of trouble."


SUMMARY

You have been introduced to one population hypothesis testing using the RUNS Test, the SIGN Test, and the WILCOXON SIGNED RANK Test.

You were introduced to:


EXERCISES

  1. As the owner of a new car dealership, you have spent a lot of money on advertising to attract owners of car makes other than the ones you sell into your dealership. Is the advertising effective? To investigate the situation, you observe potential customers who drive into your dealership, and record whether or not each is driving a make you sell. You randomly select days and hours and record what make of car drives in (Y = make you sell, N = make you don't sell). The data are: Y N N N Y N N Y Y Y Y N N N Y N Y Y Y N N N N Y Y
    • a. State the null and alternative hypotheses.
    • b. Is the advertising effective or not? Show your analysis and how you reached your conclusion.
  2. As the manager of a large retail store, you have spent a lot of money on advertising to address potential customer awareness of your store. You now say that the median awareness score is 100 (whatever that means). Is your opinion correct? In support of your opinion, you randomly select 20 customers as they are exiting and ask for an overall awareness score.
    • a. State the null and alternative hypotheses.
    • b. Is your opinion correct or not? Show your analysis and how you reached your conclusion.
  3. As the manager of a large retail store, you have spent a lot of money on advertising to address potential customer shopping satisfaction. Higher satisfaction scores mean greater levels of satisfaction. You now say that the median satisfaction score is 10. In support of your opinion, you randomly select 20 customers as they are exiting and ask for an overall satisfaction score.
    • a. State the null and alternative hypotheses.
    • b. Is your opinion correct or not? Show your analysis and how you reached your conclusion.
  4. You are the local "outreach" director of a voter campaign. As such, before starting your "outreach" program, you want to assess whether the district that you serve has changed from its historical voting percentage of 75%. You randomly accost people, and ask them if they voted in the latest election (V = voted in latest election, S = skipped voting). Has the percentage changed?

    The data are: S S V V V S V V V V V V S V V V V V S V V V V S V

    • a. State the null and alternative hypotheses.
    • b. Has the voting percentage changed or not? Show your analysis and how you reached your conclusion.
  5. Let's use the same scenario as in No. 2, but this time you survey 50 customers.
    • a. State the null and alternative hypotheses.
    • b. Is your opinion correct or not? Show your analysis and how you reached your conclusion.
  6. Let's use the same scenario as in No. 3, but this time you survey 50 customers.
    • a. State the null and alternative hypotheses.
    • b. Is your opinion correct or not? Show your analysis and how you reached your conclusion.

Exercise data and solutions spreadsheets can be downloaded.

