Category Archives: hypothesis testing

Plotting Correlations in R


A correlation indicates the strength of the relationship between two or more variables.  Plotting correlations allows you to see if there is a potential relationship between two variables. In this post, we will look at how to plot correlations with multiple variables.

In R, there is a built-in dataset called ‘iris’. This dataset includes information about different types of flowers. Specifically, the ‘iris’ dataset contains the following variables

  • Sepal.Length
  • Sepal.Width
  • Petal.Length
  • Petal.Width
  • Species

You can confirm this by inputting the following script

> names(iris)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

We now want to examine the relationship that each of these variables has with each other. In other words, we want to see the relationship of

  • Sepal.Length and Sepal.Width
  • Sepal.Length and Petal.Length
  • Sepal.Length and Petal.Width
  • Sepal.Width and Petal.Length
  • Sepal.Width and Petal.Width
  • Petal.Length and Petal.Width

The ‘Species’ variable will not be a part of our analysis since it is a categorical variable and not a continuous one. The type of correlation we are analyzing is for continuous variables.

We are now going to plot all of these variables above at the same time by using the ‘plot’ function. We also need to tell R not to include the “Species” variable. This is done by adding a subset code to the script. Below is the code to complete this task.

> plot(iris[-5])

Here is what we did

  1. We used the ‘plot’ function and told R to use the “iris” dataset
  2. In brackets, we told R to remove ( – ) the 5th variable, which is Species
  3. After pressing enter you should have seen the following

[Scatterplot matrix of the four iris measurement variables]

The variable names are placed diagonally from left to right. The x-axis of each plot is determined by the variable name in that column. For example,

  • The variable of the x-axis of the first column is “Sepal.Length”
  • The variable of the x-axis of the second column is “Sepal.Width”
  • The variable of the x-axis of the third column is “Petal.Length”
  • The variable of the x-axis of the fourth column is “Petal.Width”

The y-axis is determined by the variable that is in the same row as the plot. For example,

  • The variable of the y-axis of the first row is “Sepal.Length”
  • The variable of the y-axis of the second row is “Sepal.Width”
  • The variable of the y-axis of the third row is “Petal.Length”
  • The variable of the y-axis of the fourth row is “Petal.Width”

As you can see, this is the same information. We will now look at a few examples of plots.

  • The plot in the first column second row plots “Sepal.Length” as the x-axis and “Sepal.Width” as the y-axis
  • The plot in the first column third row plots “Sepal.Length” as the x-axis and “Petal.Length” as the y-axis
  • The plot in the first column fourth row plots “Sepal.Length” as the x-axis and “Petal.Width” as the y-axis

Hopefully, you can see the pattern. The plots above the diagonal are mirrors of the ones below. If you are familiar with correlational matrices this should not be surprising.

After a visual inspection, you can calculate the actual statistical value of the correlations. To do so, use the script below; the resulting table follows it.

> cor(iris[-5])
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000

As you can see, there are many strong relationships between the variables. For example, “Petal.Width” and “Petal.Length” have a correlation of .96, which is almost perfect. This means the two variables move together almost in lockstep: when one increases, the other almost always increases as well.
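If you want to go one step further and test whether a given correlation is statistically significant, base R’s ‘cor.test’ function will do this for a single pair of variables. Below is a minimal example using the strongest pair from the matrix above.

> cor.test(iris$Petal.Length, iris$Petal.Width)

This reports the correlation along with a t-value and p-value for the null hypothesis that the true correlation is zero.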

Conclusion

Plots help you to see the relationship between two variables. After visual inspection, it is beneficial to calculate the actual correlation.

Chi-Square Goodness-of-Fit Test


The chi-square test is a non-parametric test that is used in statistics to determine if an observed distribution or model conforms to an expected distribution or model. In simple terms, this test tells you if the data you collected is similar to other data or to what you expected.

There are several types of chi-square tests, such as the Chi-Square Test of Independence, which examines the relationship between two nominal variables, and the Goodness-of-Fit Test, which compares a single observed distribution with an expected distribution. This post is about the Goodness-of-Fit Test.

A unique caveat of the chi-square test is that the researcher normally wants to avoid rejecting the model. This is the opposite of traditional hypothesis testing, which often seeks to reject the null hypothesis because rejection indicates a statistical difference. With the chi-square test, we want our observed model to be similar to the values found in the expected model. This means our model represents what is happening in the real world and is not only theoretical. If we reject the null, it means the model we are trying to create is not similar to the expected values that might be found in the real world. In other words, we found something that does not conform to what is expected. If a model does not represent the world, it may not serve much purpose.

Here are the assumptions of the Goodness-of-Fit Test

  • Random selection of subjects
  • Mutually exclusive categories

Here are the steps

  1. Determine hypothesis
    • H0: There is no difference between the observed values/model and the expected values/model
    • H1: There is a difference between the observed values/model and the expected values/model
  2. Decide level of significance
  3. Determine degree of freedom to find chi-square critical
  4. Compute for the expected frequencies
  5. Compute chi-square
  6. Make a decision to reject or not reject the null
  7. State conclusion

Here is an example

A principal wants to know if the number of students absent each day of the week is the same. Below are the results for one week.

Day            Absences
Monday            17
Tuesday           20
Wednesday         16
Thursday          14
Friday            13

Step 1: Determine Hypothesis

  • H0: The number of students absent is the same every day
  • H1: The number of students absent is not the same every day

Step 2: Decide level of significance

  • 0.05

Step 3: Determine the chi-square critical value (computer does this for you)

  • Chi-square critical value = 9.48 (df = 5 − 1 = 4 at the 0.05 level)

Step 4: Compute expected frequencies

  • With 17 + 20 + 16 + 14 + 13 = 80 total absences over 5 days, the expected frequency for each day is 80/5 = 16 (the computer does this for you)

Step 5: Compute Chi square (computer does this for you)

  • Chi-square = 1.87

Step 6: Make decision

  • Since the computed chi-square of 1.87 is less than the critical chi-square value of 9.48, we do not reject the null hypothesis

Step 7: Conclusion

  • Since we do not reject the null hypothesis, we can say there is a lack of evidence of a difference in the number of absences on each day of the week. In other words, the number of students absent each day appears to be the same.

NOTE: There is also a way to do this test when the expected frequencies are unequal
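For those who want to run this example themselves, R’s built-in ‘chisq.test’ function performs the Goodness-of-Fit Test; given a plain vector, it assumes equal expected frequencies by default. The unequal-frequencies version is shown with hypothetical proportions.

> absences <- c(17, 20, 16, 14, 13)
> chisq.test(absences)                                 # equal expected frequencies by default
> chisq.test(absences, p = c(.3, .2, .2, .15, .15))    # hypothetical unequal expected proportions

The first call reproduces the chi-square value of 1.87 (more precisely, 1.875) from Step 5.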

Simple Linear Regression Analysis


Simple linear regression analysis is a technique that is used to model the dependency of one dependent variable upon one independent variable. The relationship between the two variables is expressed as an equation.

When regression is employed, the data points are normally graphed on a scatterplot. Next, the computer draws what is called the “best-fitting” line. The line is the best fit because it minimizes the error between the actual values and the predicted values of the model. The official name of the model is the least squares model, in that it is the model with the least amount of squared error. As such, it is the best model for predicting future values.

It is important to remember that one of the great enemies of statistics is error, or residual. In general, any particular data point that is not the mean is said to have some error in it. For example, if the average is 5 and one of the data points is 3, then 5 − 3 = 2, an error of 2. Statisticians often want to explain this error: what is causing this variation from the mean is a common question.

There are two ways that simple regression deals with error

  1. The error cannot be explained. This is known as unexplained variation.
  2. The error can be explained. This is known as explained variation.

When these two values are added together, you get the total variation, which is also known as the total sum of squares.

Another important term to be familiar with is the standard error of estimate. The standard error of estimate is a measurement of the standard deviation of the observed values of the dependent variable around the predicted values. Remember that there is always a slight difference between observed and predicted values, and the model wants to explain as much of this as possible.

In general, the smaller the standard error the better because this indicates that there is not much difference between observed data points and predicted data points. In other words, the model fits the data very well.

Another name for the explained variation is the coefficient of determination. The coefficient of determination is the proportion of the variation that is explained by the regression line and the independent variable. Another name for this value is r². The coefficient of determination is standardized to have a value between 0 and 1, or 0% to 100%.

The higher your r², the better your model is at explaining the dependent variable. However, there are a lot of qualifiers to this statement that go beyond this post.

Here are the assumptions of simple regression

  • Linearity–The mean of each error is zero
  • Independence of error terms–The errors are independent of each other
  • Normality of error terms–The error of each variable is normally distributed
  • Homoscedasticity–The variance of the error for the value of each variable is the same

There are many ways to check all of this in SPSS which is beyond this post.

Below is an example of simple regression using data from a previous post

You want to know how strong the relationship is between exam grade and the number of words in the students’ essays. The data are below

Student   Grade   Words on Essay
1          79          147
2          76          143
3          78          147
4          84          168
5          90          206
6          83          155
7          93          192
8          94          211
9          97          209
10         85          187
11         88          200
12         82          150

Step 1: Find the Slope (The computer does this for you)
slope = 3.74

Step 2: Find the mean of X (exam grade) and Y (words on the essay) (Computer does this for you)
X (Exam grade) = 85.75        Y (Words on Essay) = 176.25

Step 3: Compute the intercept of the simple linear regression (computer does this)
-145.27

Step 4: Create linear regression equation (you do this)
Y (words on essay) = 3.74*(exam grade) – 145.27
NOTE: you can use this equation to predict the number of words on the essay if you know the exam grade or to predict the exam grade if you know how many words they wrote in the essay. It is simple algebra.

Step 5: Calculate Coefficient of Determination r² (computer does this for you)
r² = 0.85
The model explains 85% of the variation in the number of words on the essay. In other words, exam grades strongly predict how many words a student will write in their essay.
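If you would rather let R do all of these steps at once, here is a minimal sketch; the data entry is assumed from the table above, and ‘lm’ is base R’s least squares function.

> grade <- c(79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82)
> words <- c(147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150)
> model <- lm(words ~ grade)                 # fits the least squares line
> summary(model)                             # slope, intercept, and r-squared
> predict(model, data.frame(grade = 90))     # predicted word count for an exam grade of 90

The coefficients and r² in the summary correspond to Steps 1 through 5 above.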

Spearman Rank Correlation


Spearman rank correlation, aka ρ, is used to measure the strength of the relationship between two variables. You may already be wondering what the difference is between Spearman rank correlation and Pearson product-moment correlation. The difference is that Spearman rank correlation is a non-parametric test while Pearson product-moment correlation is a parametric test.

A non-parametric test does not have to comply with the assumptions of parametric tests, such as the data being normally distributed. This allows a researcher to still make inferences from data that may lack normality. In addition, non-parametric tests are used for data that is at the ordinal or nominal level. In many ways, Spearman rank correlation and Pearson product-moment correlation complement each other: one is used in non-parametric statistics and the other in parametric statistics, and each analyzes the relationship between variables.

If you get suspicious results from your Pearson product-moment correlation analysis, or your data lacks normality, Spearman rank correlation may be useful if you still want to determine whether there is a relationship between the variables. Spearman correlation works by ranking the data within each variable. Next, the Pearson product-moment correlation is calculated between the two sets of ranks. Below are the assumptions of the Spearman correlation test.

  • Subjects are randomly selected
  • Observations are at the ordinal level at least

Below are the steps of Spearman correlation

  1. Set up the hypotheses
    1. H0: There is no correlation between the variables
    2. H1: There is a correlation between the variables
  2. Set the level of significance
  3. Calculate the degrees of freedom and find the t-critical value (computer does this for you)
  4. Calculate the value of Spearman correlation or ρ (computer does this for you)
  5. Calculate the t-value (computer does this for you) and make a statistical decision
  6. State conclusion

Here is an example

A clerk wants to see if there is a correlation between the overall grade students get on an exam and  the number of words they wrote for their essay. Below are the results

Student   Grade   Words on Essay
1          79          147
2          76          143
3          78          147
4          84          168
5          90          206
6          83          155
7          93          192
8          94          211
9          97          209
10         85          187
11         88          200
12         82          150

Note: The computer will rank the data of each variable, with a rank of 1 being the highest value of a variable and a rank of 12 being the lowest. Remember that the computer does this for you.

Step 1: State hypotheses
H0: There is no relationship between grades and words on the essay
H1: There is a relationship between grades and words on the essay

Step 2: Determine level of significance
Level set to 0.05

Step 3: Determine critical t-value
t = ±2.228 (computer does this for you)

Step 4: Compute Spearman correlation
ρ = 0.97 (computer does this for you)
Note: This correlation is very strong. Remember, the strongest possible relationship is ±1

Step 5: Calculate t-value and make a decision
t = 12.62 (the computer does this for you)
Since the computed t-value of 12.62 is greater than the t-critical value of 2.228 we reject the null hypothesis

Step 6: Conclusion
Since the null hypothesis is rejected, we can conclude that there is evidence of a strong relationship between exam grade and the number of words written on an essay. In practical terms, a teacher could tell students that longer essays tend to accompany higher exam grades, though the correlation alone does not show that one causes the other.
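As a hedged sketch, this analysis can be reproduced in R with the same data vectors used in the regression example; ‘cor.test’ with method = "spearman" ranks the data internally.

> grade <- c(79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82)
> words <- c(147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150)
> cor.test(grade, words, method = "spearman")   # reports rho; warns about ties since 147 appears twice
> cor(rank(grade), rank(words))                 # the same rho: Pearson correlation of the ranks

The second line illustrates the point above: Spearman correlation is simply the Pearson correlation computed on the ranks.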

Correlation


A correlation is a statistical method used to determine if a relationship exists between variables.  If there is a relationship between the variables it indicates a departure from independence. In other words, the higher the correlation the stronger the relationship and thus the more the variables have in common at least on the surface.

There are four common types of relationships between variables

  1. Positive–Both variables increase or decrease together
  2. Negative–One variable decreases in value while the other increases
  3. Non-linear–Both variables move together for a time, then one decreases while the other continues to increase
  4. Zero–No relationship

The most common way to measure the correlation between variables is the Pearson product-moment correlation aka correlation coefficient aka r.  Correlations are usually measured on a standardized scale that ranges from -1 to +1. The value of the number, whether positive or negative, indicates the strength of the relationship.

The Pearson product-moment correlation test confirms whether the r is statistically significant, or whether such a relationship would exist in the population and not just the sample. Below are the assumptions

  • Subjects are randomly selected
  • Both populations are normally distributed

Here is the process for finding the r.

  1. Determine hypotheses
    • H0: r = 0 (There is no relationship between the variables in the population)
    • H1: r ≠ 0 (There is a relationship between the variables in the population)
  2. Decide what the level of significance will be
  3. Calculate degrees of freedom to determine the t critical value (computer does this)
  4. Calculate Pearson’s r (computer does this)
  5. Calculate t value (computer does this)
  6. State conclusion.

Below is an example

A clerk wants to see if there is a correlation between the overall grade students get on an exam and the number of words they wrote for their essay. Below are the results

Student   Grade   Words on Essay
1          79          147
2          76          143
3          78          147
4          84          168
5          90          206
6          83          155
7          93          192
8          94          211
9          97          209
10         85          187
11         88          200
12         82          150

Step 1: State Hypotheses
H0: There is no relationship between grade and the number of words on the essay
H1: There is a relationship between grade and the number of words on the essay

Step 2: Level of significance
Set to 0.05

Step 3: Determine degrees of freedom and t critical value
t-critical = ±2.228 (This info is found in a chart in the back of most stat books)

Step 4: Compute r
r = 0.93 (calculated by the computer)

Step 5: Decision rule. Calculate t-value for the r

t-value for r = 8.00  (Computer found this)

Since the computed t-value of 8.00 is greater than the t-critical value of 2.228 we reject the null hypothesis.

Step 6: Conclusion
Since the null hypothesis was rejected, we conclude that there is evidence of a strong relationship between the overall grade on the exam and the number of words written for the essay. To make this practical, the teacher could tell the students to write longer essays if they want a better score on the test.
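A minimal R sketch of this test, assuming the same data entry as in the earlier posts (‘cor.test’ computes the Pearson correlation by default and reports r, the t-value, and the p-value together):

> grade <- c(79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82)
> words <- c(147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150)
> cor.test(grade, words)   # Pearson by default; reports t, df = 10, and r

The output corresponds to Steps 3 through 5 above.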

IMPORTANT NOTE

When a null hypothesis is rejected there are several possible relationships between the variables.

  • Direct cause and effect
  • The relationship between X and Y may be due to the influence of a third variable not in the model
  • This could be a chance relationship. For example, foot size and vocabulary: older people have bigger feet and also a larger vocabulary, yet the relationship is a nonsense one

Two-Way Analysis of Variance


Two-way analysis of variance is used when we want to know the following pieces of information.
• The means of the blocks or subpopulations
• The means of the treatment groups
• The means of the interaction of the subpopulation and treatment groups

Now you are probably confused, but remember that two-way analysis of variance is an extension of randomized block design. With randomized block design, there were two hypotheses: one for the treatment groups and one for the blocks or subpopulations. What we add in two-way analysis is an assessment of the interaction effect, which is the amount of variation attributable to the combination of subpopulation and treatment group. The assessment of the interaction effect gives us the third hypothesis. To put it simply, when the subpopulation and the treatment are present together, they have some sort of combined influence just as each does alone. Therefore, two-way analysis of variance is randomized block design plus an interaction effect hypothesis.

Another important difference is the use of repeated measures. In a two-way analysis of variance, at least one of the groups received the treatment more than once. In a randomized block design, each group receives the treatment only one time. Your research questions determine if any group needs to experience the treatment more than once.

Below are the assumptions
• Sample randomly selected
• Populations have homogeneous standard deviations
• Population distributions are normal
• Population covariances are equal.

Here are the steps
1. Set up hypotheses (there will be three of them)
a. Treatment means (AKA factor A)
i. H0: There is no difference in the treatment means
ii. H1: H0 is false
b. Block means (AKA factor B)
i. H0: There is no difference in the block means
ii. H1: H0 is false
c. Interaction between Factor A and B
i. H0: There is no interacting effect between factor A & B
ii. H1: There is an interacting effect between factor A & B
2. Determine your level of statistical significance
3. Determine F critical (there will be three now and the computer does this)
4. Calculate the F-test values (there will be three now and the computer does this)
5. Test hypotheses
6. State conclusion

Here is an example
A music teacher wants to study the effect of instrument type and service center on the repair time measured in minutes. Four instruments (sax, trumpet, clarinet, flute) were picked for the analysis. Each service center was assigned to perform the particular repair on two instruments in each category

Instrument
Service Center   Sax      Trumpet   Clarinet   Flute
1                60, 70   50, 56    58, 62     60, 64
2                50, 54   53, 57    48, 64     54, 46
3                62, 64   54, 66    46, 52     51, 49
(two repair times per cell)
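As a hedged sketch of how a computer produces the numbers used in the steps below, here is the same layout entered into R (the data entry is assumed from the table above; SPSS or Excel would work as well):

> repair <- data.frame(
+   center = factor(rep(1:3, each = 8)),
+   instrument = factor(rep(rep(c("Sax", "Trumpet", "Clarinet", "Flute"), each = 2), times = 3)),
+   minutes = c(60, 70, 50, 56, 58, 62, 60, 64,
+               50, 54, 53, 57, 48, 64, 54, 46,
+               62, 64, 54, 66, 46, 52, 51, 49)
+ )
> summary(aov(minutes ~ center * instrument, data = repair))   # three F-tests: centers, instruments, interaction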

Here are your research questions
• Is there a difference in the means of the repair time between service centers?
• Is there a difference in the means of the repair time between instrument type?
• Is there an interaction due to service center and type of instrument on the mean of the repair time?
Let us go through each of our steps
Step 1: State the hypotheses
• Treatment means (AKA factor A)
a. H0: There is no difference in the means of the service centers
b. H1: H0 is false
• Block means (AKA factor B)
a. H0: There is no difference in means of the instrument types
b. H1: H0 is false
• Interaction between Factor A and B
a. H0: There is no interacting effect between service center and instrument type
b. H1: There is an interacting effect between service center and instrument type

Step 2: Significance level
• Set at 0.1

Step 3: Determine F-Critical (computer does this)
For the service centers, F-critical is 2.81
For the instrument types, F-critical is 2.61
For the interaction effect, F-critical is 2.33

Step 4: Calculate F-values
Service centers 3.2
Instrument type 1.4
Interaction 2.1

Step 5: Make decision
Since the F-value of 3.2 is greater than the F-critical of 2.81, we reject the null hypothesis for the service centers

Since the F-value of 1.4 is less than the F-critical of 2.61 we do not reject the null hypothesis for the instrument types

Since the F-value of 2.1 is less than the F-critical of 2.33, we do not reject the null hypothesis for the interaction effect of service center and instrument type.

Step 6: Conclusion
Since we reject the null hypothesis that there is no difference in the means of the repair time of the service centers, we conclude that there is evidence of a difference in repair times between service centers. This means at least one service center is faster than the others. To find out which, do a post hoc test.

Since we do not reject the null hypothesis that there is no difference in the means of the repair time of the instrument types, we conclude that there is no evidence of a difference in the repair time between instrument types. In other words, it does not matter what type of instrument is being fixed as they will all take about the same amount of time.

Since we do not reject the null hypothesis that there is no interaction effect of service center and instrument type on the mean of the repair time, we conclude that there is no evidence of an interaction effect of service center and instrument type on repair time. In other words, if service center and instrument type are considered at the same time there is no difference in how fast the instruments are repaired.

Analysis of Variance: Randomized Block Design


Randomized block design is used when a researcher wants to compare treatment means. What is unique to this research design is that the experiment is divided into two or more mini-experiments.

The reason behind this is to reduce the within-treatment variation so that it is easier to find differences between means. Another unique characteristic of randomized block design is that since there is more than one experiment happening at the same time, there will be more than one set of hypotheses to consider. There will be a set of hypotheses for the treatment groups and also for the block groups. The block groups are the several subpopulations within the sample. Below are the assumptions

  • Samples are randomly selected
  • Populations are homogeneous
  • Populations are normally distributed
  • Populations covariances are equal
    •  Covariance is a measure of the commonality with which two variables deviate from their expected values. If two variables deviate in similar ways, the covariance will be high, and vice versa. The standardized version of covariance is correlation.

Looking at the equations and doing this by hand is tough. It is better to use SPSS or Excel to calculate the results. We are going to look at an example and see an application of randomized block design.

A professor wants to see if “time of day” affects his students’ scores on a quiz. He randomly divides his stat class into five sections and has each section take the quiz at four different times during the day. Below are the results
Time Period/Treatment
Section   8-9   10-11   11-12   1-2
1          25     22      20     25
2          28     24      29     23
3          30     25      25     27
4          24     27      28     25
5          21     28      30     24

The treatment groups here are the time periods. They are along the top and are 8-9, 10-11, 11-12, and 1-2. The block groups are along the left-hand side and they are sections 1, 2, 3, 4, and 5. The block groups are the five different experimental groups within the larger population of the statistics class. What is happening here is that members of every section took the quiz at each of the four times. For example, members of section one took the quiz at 8-9, 10-11, 11-12, and 1-2, and the same for section two and so forth. By having five different groups take the quiz at each of the time periods, it should hopefully improve the accuracy of the results. It is like sampling a population five times instead of one time.

In addition, by having four different time periods, we can hopefully see much more clearly if the time period makes a difference. We have four different time periods instead of two or three. Below are the steps for solving this problem.

Step 1: State hypotheses
For Time periods
Null hypothesis: There is no difference in the means between time periods
Alternative hypothesis: There is a difference in the means between time periods
For Blocks
Null hypothesis: There is no difference in the means among the sections of students
Alternative hypothesis: There is a difference in the means among the sections of students

Step 2: Significance level
Our alpha is set to .05

Step 3: Critical value of F
This is done by the computer and it indicates that the F critical for the treatment (time periods) is 3.49 and the F critical for the blocks (section of students) is 3.26. There are two F criticals because there are two sets of hypotheses, one for the time periods and one for the students.

Step 4: Calculate
The computed F-value for treatment (time periods) is 0.25
The computed F-value for the blocks (section of students) is 0.89

Step 5: Decision
Since the F-value for the treatment (time periods), 0.25, is less than the F critical of 3.49 at an alpha of .05, we do not reject the null hypothesis

Since the F-value for the blocks (section of students), 0.89, is less than the F critical of 3.26 at an alpha of .05, we do not reject the null hypothesis

Step 6: Conclusion
Treatment (Time period)
Since we did not reject the null hypothesis, we can conclude that there is no evidence that time of day affects the quiz scores.

Blocks (Section of Student)
Since we did not reject the null hypothesis, we can conclude that there is no evidence that group affects the quiz scores.

From this, we know that neither the time of day nor the group a student belongs to matters. If the time of day had mattered, it might have been due to a host of factors, such as early morning or late afternoon. For the groups, a difference could be traced by looking at how they did on individual items. Maybe one section struggled with finding the means in question 3.

Remember in this example there was no difference. The ideas above are for determining why there was a difference if that had happened.
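For readers who would rather use R than SPSS or Excel, here is a hedged sketch of this example; the data entry is assumed from the table above, and the blocks enter the model as a second term.

> quiz <- data.frame(
+   section = factor(rep(1:5, times = 4)),
+   period = factor(rep(c("8-9", "10-11", "11-12", "1-2"), each = 5)),
+   score = c(25, 28, 30, 24, 21,    # 8-9
+             22, 24, 25, 27, 28,    # 10-11
+             20, 29, 25, 28, 30,    # 11-12
+             25, 23, 27, 25, 24)    # 1-2
+ )
> summary(aov(score ~ period + section, data = quiz))   # two F-tests: one for time periods, one for blocks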

One-Way Analysis of Variance (ANOVA)


Analysis of variance is a statistical technique that is used to determine if there is a difference among three or more sample populations. Z-tests and t-tests are used when comparing one sample population to a known value or two sample populations to each other. When more than two sample populations are involved, it is necessary to use analysis of variance. The simple rule is: three or more, use analysis of variance

Analysis of variance is too complicated to do by hand, even though it is possible. It takes a great deal of time, and one error will ruin the answer. Therefore, we are not going to look at equations in this example. Instead, we will focus on the hypotheses and practical applications of analysis of variance. To calculate analysis of variance results, you can use SPSS or Microsoft Excel.

There are several types of analysis of variance. We are going to first look at one-way analysis of variance.

Here are the assumptions for one-way analysis of variance

  • Samples are randomly selected
  • Samples are independently assigned
  • Samples are homogeneous
  • Sample is normally distributed

One-way analysis of variance is used when three or more groups receive the same treatment or intervention. The treatment is the independent variable while the mean of each group is the dependent variable. This is because, as the researcher, you control the treatment but you do not control the resulting mean that is recorded. One-way analysis of variance is often used in performing experiments.

Let’s look at an example. You want to know if there is any difference in the average life of four different breeds of dogs. You take a random sample of five dogs from four different breeds. Below are the results

Terrier   Retriever   Hound   Bulldog
12           11          12        12
13           10          11        15
14           13          15        10
11           15          15        12
15           14          16        11

In this example, the independent variable is the breed of dog. This is because you control this; you can select whatever dog breed you want. The dependent variable is the average length of the dogs’ lives. You have no control over how long they live. You are trying to see if dog breed influences how long the dog will live.

Here are the hypotheses

Null hypothesis: There is no difference in the average length of a dog’s life because of breed

Alternative hypothesis: There is a difference in the average length of a dog’s life because of breed

The significance level is 0.05 and our F critical is 3.24

After running the results in the computer, we get an F-value of 0.76. This means we do not reject our null hypothesis. In other words, there is no evidence of a difference in the average life of the dog breeds in this study.
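Here is a hedged R sketch of this example (the data entry is assumed from the table above):

> dogs <- data.frame(
+   breed = factor(rep(c("Terrier", "Retriever", "Hound", "Bulldog"), each = 5)),
+   years = c(12, 13, 14, 11, 15,
+             11, 10, 13, 15, 14,
+             12, 11, 15, 15, 16,
+             12, 15, 10, 12, 11)
+ )
> summary(aov(years ~ breed, data = dogs))   # one F-test for the breed effect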

One-way analysis is used when we have one treatment and three or more groups that experience the treatment. This statistical tool is useful for research designs that call on the need for experiments.

Hypothesis Testing for Two Means: Large Independent Samples


Hypothesis testing for two large samples examines whether there is a difference between two means. We infer that there is a difference between the population means by seeing if there is a difference between the sample means. The assumptions for testing for the difference between two means are below.

  • Subjects are randomly selected and independently assigned to groups
  • Population is normally distributed
  • Sample size is greater than 30

The hypotheses can be stated as follows

  • Null hypothesis: There is no difference between the population means of the two groups
    • The technical way to say this is…  H0: μ1 = μ2
  • Alternative hypothesis: There is a difference between the population means of the two groups. One is greater or smaller than the other
    • The technical way to say this is… H1: μ1 ≠ μ2 or μ1 > μ2 or μ1 < μ2

The process for conducting a z test for independent samples is provided below

  1. Develop your hypotheses
  2. Determine the level of significance (normally .1, .05, or .01)
  3. Decide if it is a one-tail or two tail test.
  4. Determine the critical value of z. This is found in a chart in the back of most stat books; common values include ±1.64, ±1.96, or ±2.32
  5. Calculate the means and standard deviations of the two samples.
  6. Calculate the test for the two independent samples. Below is the formula

z = (sample mean 1 − sample mean 2) / √[(standard deviation of sample 1)²/n1 + (standard deviation of sample 2)²/n2]

where n1 and n2 are the two sample sizes.

7. If the computed z is less than the critical z, then you do not reject your null hypothesis. This means there is no evidence of a difference between the means. If the computed z is greater than the critical z, then you reject the null hypothesis, which indicates that there is evidence of a difference.

Below is an example

A businessman is comparing the prices of buildings in two different provinces to see if there is a difference. Below are the results. Determine if the buildings in Bangkok cost more than the buildings in Saraburi.

                         Bangkok      Saraburi
Average price          2,140,000     1,970,000
Standard deviation       226,000       243,000
Sample size                   47            45

Now let us go through the steps

  1. Develop your hypotheses
    • Null hypothesis: There is no difference between the average price of buildings in Bangkok and Saraburi
      • In stat language, it would be
      • H0: μ1 ≤ μ2
    • Alternative hypothesis: The average price of buildings in Bangkok is higher than in Saraburi
      • In stat language, it would be
      • H1: μ1 > μ2
  2. Determine the level of significance (normally .1, .05, or .01)
    • We will select .05
  3. Decide if it is a one-tail or two tail test.
    • This is a one-tail test. We want to know if one mean is greater than another. Therefore, to reject the null we need a z computed that is positive and larger than our z critical.
  4. Determine the critical value of z. This is found in a chart in the back of most stat books; common values include ±1.64, ±1.96, or ±2.32
    • Our z critical is +1.64. Since this is a one-tail test, we have only one value; we do not split the probability and place half on one side and half on the other. If this were two-tailed, we would have -1.96 and +1.96, which would allow the difference to be either greater or less
  5. Calculate the means and standard deviations of the two samples.
    • Already done in the table above
  6. Calculate the test for the two independent samples. Below is the formula.

z = (2,140,000 − 1,970,000) / √[(226,000²/47) + (243,000²/45)]

Our computed z is 3.47

Since 3.47 is greater than our z critical of +1.64 we reject the null hypothesis and state that there is evidence that building prices are higher in Bangkok than in Saraburi.
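A minimal R sketch of this computation (base R has no built-in two-sample z test, so we compute it directly from the numbers in the table above):

> xbar1 <- 2140000; s1 <- 226000; n1 <- 47   # Bangkok
> xbar2 <- 1970000; s2 <- 243000; n2 <- 45   # Saraburi
> z <- (xbar1 - xbar2) / sqrt(s1^2/n1 + s2^2/n2)
> z                                # about 3.47
> pnorm(z, lower.tail = FALSE)     # one-tailed p-value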

What is a One Sample z Test?


There are actually several different situations in which a researcher can use hypothesis testing. The first instance we will look at is the one sample z test. The one sample z test has the following assumptions that need to be met before employing it.

  • Sample size > 30
  • Subjects are randomly selected
  • Population is normally distributed
  • Cases within the sample are independent
  • One sample was taken

If your data collection meets the above assumptions one sample z test may be appropriate.

With the one sample z test, you are comparing your results to a known expected value. For example, if someone states that the average salary for teachers is $63,000.00, you can assess this by collecting data from teachers and comparing it to this known value. You collect some data and find that the average salary for 35 teachers was $65,700.00. The question you have is: who is right? Do teachers really make $63,000.00 on average like the report says, or do they make $65,700.00 as my data says? Before going further, let us establish our hypotheses for this example.

  • Null hypothesis: the average salary for my sample of teachers will be the same as the reported value of $63,000.00
    1. The mathematical shorthand for this is H0: μ = 63,000.00
  • Alternative hypothesis: the average salary for my sample of teachers will be different (greater or lesser) than the reported value of $63,000.00
    1. The mathematical shorthand for this is H1: μ ≠ 63,000.00

Keep in mind that this is a two-tailed test. This is because our final value has the option of being either greater or lesser than $63,000.00. Two-tailed means two options, greater or lesser than the expected value, while one-tailed means only one option: either we expect greater or we expect lesser, but not both. This is why we will have two z critical values to think about shortly.

We also need two more pieces of information before we put our numbers into the equation. The two items we need to know are the standard deviation of the sample and the level of statistical significance. For the sample data we collected, we will say the standard deviation is $5,250.00 and the level of statistical significance is α = 0.01. When we convert this alpha value to z critical values for a two-tailed test, we get 2.58 and -2.58. Do not be distracted by the z critical value; it is the alpha value translated onto the scale of the normal distribution. It is similar to switching from one language to another: same meaning but different language.

If our final value is greater than 2.58 or less than -2.58, we will reject the null hypothesis that average teacher salaries are $63,000.00. Now we can take a look at the equation

z = (sample mean − expected value) / (sample standard deviation / √n)

In simple English

z = (65,700 − 63,000) / (5,250 / √35)

z = 3.04

Our answer is 3.04, which is greater than +2.58. This indicates that we can reject the null hypothesis that the average salary of teachers is $63,000.00, as our data indicate that there is evidence that teachers make more on average.
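A minimal R sketch of this calculation (base R has no one-sample z test function, so we compute it directly):

> xbar <- 65700; mu0 <- 63000; s <- 5250; n <- 35
> z <- (xbar - mu0) / (s / sqrt(n))
> z                                        # about 3.04
> 2 * pnorm(abs(z), lower.tail = FALSE)    # two-tailed p-value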

We don’t want to get too excited here. We found evidence that teachers make more, but further testing would be needed to validate this claim. If more data confirm our findings, we can state with more confidence that teachers make more.

I would like to thank andydevil12 for the question and suggestion. If there are any other questions please send them to me as they help me to understand research and statistics much better as well.