Analysis of Variance and Nonparametric Statistics

Discussion 6

This discussion covers what nonparametric statistics are and why they are important. You have seen many tests and analyses involving nonparametric statistics. In this assignment, you will determine why these tests and analyses are necessary and in what situations they are appropriate.

Example

The number of offspring that female blue whales have follows a right-skewed distribution. Fifty years ago, researchers tagged ten female blue whales at birth and counted how many offspring each had over her lifetime. Now researchers want to collect similar data to determine whether the average number of offspring may be decreasing due to global warming or other environmental changes.

Instructions

  • In your own words, define nonparametric statistics.
  • For this example, discuss why an approach using nonparametric statistics would be appropriate.
  • For this example, what specific test might you use to test whether the average number of offspring has changed over the years? Why would you use this specific type of test?
  • Describe a brief scenario where you may need to use nonparametric statistics in your daily life.

6.1

Scenario

Suppose you want to test out two different trailers for an upcoming movie. So you take a sample of men and women and randomly separate them out into four groups. Some of the men will watch Trailer A and give it a rating. Some of the women will watch Trailer A and give it a rating. Some of the men will watch Trailer B and give it a rating. Some of the women will watch Trailer B and give it a rating.

Once you have the ratings for these four groups, you can run what is called a two-way analysis of variance. It is called a two-way analysis of variance because it tests for significantly different ratings based on not one but two variables: the type of trailer (A or B) and gender. Because you are using two variables, you can test whether the average rating differs by trailer type, by gender, or by the combination of the two.

Instructions

  • Review the movie data in the following table.

Gender  Trailer Type  Rating 1  Rating 2  Rating 3  Rating 4  Rating 5  Sample Size  Mean
Male    A             8         7         8         6         4         5            6.6
Male    B             4         6         4         5         4         5            4.6
Female  A             3         4         1         2         4         5            2.8
Female  B             6         7         5         4         7         5            5.8

  • Answer the following questions in a Word document:
  • State the hypotheses to test the main effects of gender.
  • State the hypotheses to test the main effects of trailer type.
  • Calculate the test statistic and p-value to test the main effects of gender.
  • Calculate the test statistic and p-value to test the main effects of trailer type.
  • State the conclusion of both these tests at the 0.05 level of significance.
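The main-effect calculations above can be sketched in a few lines of Python. This is a minimal illustration, not a required method: it assumes a balanced design (the same number of ratings in every cell), and the names `ratings` and `two_way_anova` are made up for this example. It returns the F statistics and the within-groups degrees of freedom only.

```python
# Ratings from the movie table, keyed by (gender, trailer type).
ratings = {
    ("Male", "A"): [8, 7, 8, 6, 4],
    ("Male", "B"): [4, 6, 4, 5, 4],
    ("Female", "A"): [3, 4, 1, 2, 4],
    ("Female", "B"): [6, 7, 5, 4, 7],
}


def mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """F statistics for a balanced two-factor design (assumes equal cell sizes)."""
    rows = sorted({r for r, _ in cells})      # levels of the first factor (gender)
    cols = sorted({c for _, c in cells})      # levels of the second factor (trailer)
    n = len(next(iter(cells.values())))       # observations per cell
    grand = mean([x for v in cells.values() for x in v])

    def level_mean(index, level):
        # Mean of every rating whose key matches `level` at position `index`.
        return mean([x for key, v in cells.items() if key[index] == level for x in v])

    ss_row = n * len(cols) * sum((level_mean(0, r) - grand) ** 2 for r in rows)
    ss_col = n * len(rows) * sum((level_mean(1, c) - grand) ** 2 for c in cols)
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in cells.values())
    ss_inter = ss_cells - ss_row - ss_col
    ss_within = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

    df_row, df_col = len(rows) - 1, len(cols) - 1
    df_within = len(rows) * len(cols) * (n - 1)
    ms_within = ss_within / df_within
    return {
        "F_gender": (ss_row / df_row) / ms_within,
        "F_trailer": (ss_col / df_col) / ms_within,
        "F_interaction": (ss_inter / (df_row * df_col)) / ms_within,
        "df_within": df_within,
    }


result = two_way_anova(ratings)
print(result)
```

Each p-value is the upper-tail area of the F distribution with the matching degrees of freedom. If SciPy happens to be installed, `scipy.stats.f.sf(F, df_numerator, df_within)` gives that area; otherwise an F table or a calculator works as well.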

6.2

This assignment covers a test used to compare data from matched pairs, such as before-and-after measurements, called the sign test for matched pairs. Remember that, as with all nonparametric tests, this is a test you can use when the data do not follow a normal distribution.

For this assignment, you will answer questions and solve problems involving paired data.

Instructions

  • Answer the following questions in a Word document:
  • Review the following data where patients with high blood pressure are given a drug that is thought to lower their blood pressure. The “before” amounts represent their blood pressure before taking the medicine. The “after” amounts represent their blood pressure after taking the medicine for three months:

Subject  Before  After
1        140     140
2        148     140
3        146     138
4        154     152
5        142     144
6        151     148
7        140     138
8        148     146
9        146     148
10       154     150
11       142     140
12       151     148
13       153     150

  • State why a sign test for matched pairs would be appropriate for this example.
  • Set up the hypotheses to test the claim that the medicine helps to reduce blood pressure.
  • Calculate the test statistic and p-value for this example.
  • What would your conclusion be at the 0.05 level of significance?
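One way to carry out the sign test on the blood pressure data is sketched below. It is an illustration under stated assumptions: ties (pairs with no change) are dropped, and the p-value is an exact one-sided binomial probability rather than the normal approximation that many textbooks use, so the figures may differ slightly from a table-based answer. The function name `sign_test_decrease` is made up for this example.

```python
from math import comb

# Blood pressure readings for the 13 subjects in the table.
before = [140, 148, 146, 154, 142, 151, 140, 148, 146, 154, 142, 151, 153]
after = [140, 140, 138, 152, 144, 148, 138, 146, 148, 150, 140, 148, 150]


def sign_test_decrease(before, after):
    """One-sided exact sign test for the claim that 'after' tends to be lower."""
    # Keep only pairs that changed; ties carry no sign and are dropped.
    diffs = [b - a for b, a in zip(before, after) if b != a]
    n = len(diffs)
    plus = sum(d > 0 for d in diffs)          # pairs where pressure went down
    # Under H0 each sign is equally likely, so plus ~ Binomial(n, 0.5).
    p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    return n, plus, p_value


n, plus, p_value = sign_test_decrease(before, after)
print(n, plus, round(p_value, 4))
```

The exact binomial calculation avoids any normality assumption for small n. For the normal approximation, the test statistic would instead be a z value built from the proportion of plus signs.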

6.3

  • Which national park has more bears? Random samples of ten-square-mile plots were taken in different parts of Yellowstone National Park, Yosemite National Park and Glacier National Park. The bear counts per plot were recorded as shown below:
  • What affects grade point average? Does GPA depend on gender? Does GPA depend on class (freshman, sophomore, junior, senior)? In a study, the following GPAs were collected from random samples of college students. There are four values in each cell:
  • A new medicinal drink is thought to help people stop smoking cigarettes. To test this, a random sample of 18 subjects agreed to drink one drink once a day for a month. Data is collected for these 18 subjects based on the number of cigarettes per day prior to starting the program, and the number of cigarettes per day after starting the program. Below is the data for these 18 subjects:
  • Two different methods are used to help children learn how to spell. Each child was given 60 words to spell, and it was noted how many words each child spelled correctly. Here are the counts for both methods:
  • A math class is given a difficult test and they score poorly on the test. The instructor asks each individual student approximately how many minutes they studied for the test to see if there is some correlation between how long they studied and their test score. Here is the data for seven students:

Yellowstone  Yosemite  Glacier
2            3         8
1            0         3
4            4         5
2            1         8

We want to test whether there is a difference in the mean number of bears per ten-square-mile plot among these three parks, using a 5% level of significance.

  • State the hypotheses.
  • Calculate SS_TOTAL, SS_BETWEEN, and SS_WITHIN.
  • Using these values, create the summary table for your ANOVA test.
  • From the table, state the test statistic and p-value, and state your conclusion at the 5% level of significance.
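The sums of squares for a one-way ANOVA can be computed as in the sketch below. It assumes the bear counts are read across the rows of the table, so each park has four plots (Yellowstone 2, 1, 4, 2; Yosemite 3, 0, 4, 1; Glacier 8, 3, 5, 8). The names `counts` and `one_way_anova` are made up for this example.

```python
# Bear counts per plot, assuming the table is read row by row.
counts = {
    "Yellowstone": [2, 1, 4, 2],
    "Yosemite": [3, 0, 4, 1],
    "Glacier": [8, 3, 5, 8],
}


def one_way_anova(groups):
    """Sums of squares, degrees of freedom, and F for a one-way ANOVA."""
    all_values = [x for g in groups.values() for x in g]
    total_n, k = len(all_values), len(groups)
    correction = sum(all_values) ** 2 / total_n        # (sum of all x)^2 / N
    ss_total = sum(x * x for x in all_values) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups.values()) - correction
    ss_within = ss_total - ss_between
    df_between, df_within = k - 1, total_n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "SS_total": ss_total,
        "SS_between": ss_between,
        "SS_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": f_stat,
    }


table = one_way_anova(counts)
for name, value in table.items():
    print(name, round(value, 4))
```

The p-value is the upper-tail area of the F distribution with `df_between` and `df_within` degrees of freedom, which can be read from an F table or a statistics package.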

        Freshman            Sophomore           Junior              Senior
Male    3.2, 3.6, 3.8, 3.5  3.7, 3.3, 3.6, 2.6  3.3, 3.6, 2.6, 2.4  2.3, 3.5, 3.9, 2.9
Female  3.8, 3.6, 3.5, 3.1  2.9, 3.8, 3.1, 3.2  2.3, 2.5, 2.9, 3.5  3.0, 2.1, 2.8, 2.7

  • List the factors and the number of levels for each factor.
  • Suppose the test statistic and p-value for the interaction term are 0.43 and 0.736, respectively. Determine if there is any evidence of interaction between the two factors at the 5% level of significance.
  • Suppose the test statistic and p-value for the class factor are 3.32 and 0.037, respectively. Determine if there is any evidence of a difference in mean GPA based on class at the 5% level of significance.
  • Suppose the test statistic and p-value for the gender factor are 1.26 and 0.273, respectively. Determine if there is any evidence of a difference in mean GPA based on gender at the 5% level of significance.

Subject  Number before program  Number after program
1        24                     24
2        25                     19
3        14                     2
4        17                     17
5        25                     29
6        30                     19
7        18                     7
8        15                     18
9        11                     1
10       20                     5
11       30                     12
12       40                     21
13       26                     28
14       38                     17
15       10                     0
16       32                     15
17       18                     4
18       24                     18

Using a sign test for matched pairs at the .01 level of significance, we will test the claim that the number of cigarettes smoked per day was less after the program.

  • State the null and alternative hypotheses.
  • Compute the sample test statistic.
  • Find the p-value.
  • State the conclusion at the .01 level of significance.
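The large-sample version of the sign test can be sketched as follows. This is a minimal illustration that assumes the usual normal approximation: ties are dropped, x is the proportion of the remaining pairs in which the count went down, and z = (x − 0.5) / sqrt(0.25 / n). The names `sign_test_z` and `normal_cdf` are made up for this example.

```python
from math import erf, sqrt

# Cigarettes per day for the 18 subjects in the table.
before = [24, 25, 14, 17, 25, 30, 18, 15, 11, 20, 30, 40, 26, 38, 10, 32, 18, 24]
after = [24, 19, 2, 17, 29, 19, 7, 18, 1, 5, 12, 21, 28, 17, 0, 15, 4, 18]


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def sign_test_z(before, after):
    """Right-tailed sign test (normal approximation) for 'after is lower'."""
    changed = [(b, a) for b, a in zip(before, after) if b != a]   # drop ties
    n = len(changed)
    x = sum(b > a for b, a in changed) / n    # proportion of decreases
    z = (x - 0.5) / sqrt(0.25 / n)
    p_value = 1 - normal_cdf(z)               # H1: proportion of decreases > 0.5
    return n, x, z, p_value


n, x, z, p_value = sign_test_z(before, after)
print(n, x, round(z, 2), round(p_value, 4))
```

The claim "fewer cigarettes after the program" corresponds to a right-tailed test on the proportion of decreases, so the p-value is the area to the right of z.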

Method A: 28, 35, 19, 41, 37, 31, 38, 40, 25, 27, 36, 43
Method B: 42, 33, 26, 24, 44, 46, 34, 20, 48, 39, 45

Use a rank-sum test at the 0.05 level of significance to test the claim that there is no difference between the distributions for each method.

  • State the null and alternative hypotheses.
  • Compute the sample test statistic.
  • Find the p-value.
  • State the conclusion at the .05 level of significance.
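A rank-sum calculation for the two spelling methods might look like the sketch below. It assumes the normal approximation with R taken as the rank sum of the smaller sample, mean n1(n1 + n2 + 1)/2 and standard deviation sqrt(n1 n2 (n1 + n2 + 1)/12), and a two-tailed p-value. The function names are made up for this example.

```python
from math import erf, sqrt

method_a = [28, 35, 19, 41, 37, 31, 38, 40, 25, 27, 36, 43]
method_b = [42, 33, 26, 24, 44, 46, 34, 20, 48, 39, 45]


def average_ranks(values):
    """Map each distinct value to its rank, averaging ranks across ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    return ranks


def rank_sum_test(sample_1, sample_2):
    """Two-tailed rank-sum test using the normal approximation."""
    small, large = sorted((sample_1, sample_2), key=len)
    ranks = average_ranks(small + large)
    n1, n2 = len(small), len(large)
    r = sum(ranks[v] for v in small)          # rank sum of the smaller sample
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (r - mu) / sigma
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return r, z, p_value


r, z, p_value = rank_sum_test(method_a, method_b)
print(r, round(z, 3), round(p_value, 3))
```

Ranking the pooled data and summing the ranks of one sample is what makes the test insensitive to the shape of the score distributions.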

Student          1   2   3   4   5   6   7
Minutes studied  60  85  78  90  93  45  51
Test score       78  42  68  53  62  50  76

Use a Spearman rank correlation test to determine if there is significant correlation between these two variables.

  • State the null and alternative hypotheses.
  • Compute the sample test statistic.
  • Find the p-value.
  • State the conclusion at the .05 level of significance.
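The Spearman coefficient for the seven students can be sketched as follows. The statistic uses the standard formula r_s = 1 − 6Σd² / (n(n² − 1)), which assumes no tied values. The p-value here comes from an exact permutation count, which is an assumption of this sketch; a course table of critical values for r_s is the more common route. The names `ranks` and `spearman` are made up for this example.

```python
from itertools import permutations

minutes = [60, 85, 78, 90, 93, 45, 51]
scores = [78, 42, 68, 53, 62, 50, 76]


def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(x, y):
    """Spearman r_s, the sum of squared rank differences, and an exact two-sided p-value."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    r_s = 1 - 6 * d2 / (n * (n * n - 1))
    center = n * (n * n - 1) / 6              # value of sum d^2 when r_s = 0
    extreme = total = 0
    # Under H0 every pairing of the two rank lists is equally likely.
    for perm in permutations(range(1, n + 1)):
        total += 1
        perm_d2 = sum((a - b) ** 2 for a, b in zip(rx, perm))
        if abs(perm_d2 - center) >= abs(d2 - center):
            extreme += 1
    return r_s, d2, extreme / total


r_s, d2, p_value = spearman(minutes, scores)
print(round(r_s, 4), d2, round(p_value, 4))
```

With only seven pairs there are 5,040 possible pairings, so enumerating them all is fast and avoids relying on a large-sample approximation.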