### The Department of Psychology

**Lab 4 - Two Sample Independent and Dependent t-Test**

Consider the question, "Are females better students than males?" Let's assume you have collected information on the scholastic performance of 35 females and 28 males currently enrolled at USC. How would you answer the question above?

A good first step is to look at descriptive statistics for your data. This ought to give you some ideas.

- Retrieve file "**lab4_ind.sav**". Set the path to **C:\MYDOCUMENTS** to find the file.

Given the way the data file is structured, we cannot describe males and females separately, unless we tell SPSS to temporarily split the data file into two segments, based on gender. Then we will be able to describe differences between genders on the variable GPA.

**Splitting the file**

- Make sure you are in the "Data Editor" window.

- Click on "Data" and then "Split File".

- Select "Organize Output by Groups".

- Select variable "Gender" and move it into "Groups Based On".

- Click on the PASTE button.

- Go to Syntax Window and run the pasted commands.

**Describing data within the split file environment**

- Select variable "GPA" and move it into "Variables" field.

- Click on the OPTION button.

- Select "Mean", "S. E. Mean"; un-select "Minimum", "Maximum" and "Std. Deviation".

- Click on the CONTINUE button, then the PASTE button.

- Run the pasted commands.
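If you want to see what the split-file description computes, here is a minimal Python sketch of the same idea: group the cases by gender, then report the mean and the standard error of the mean for each group. The GPA values and the function name `describe_by_group` are made up for illustration; the real data are in lab4_ind.sav.

```python
import math
import statistics

# Hypothetical (gender, GPA) records; these are NOT the values in lab4_ind.sav.
records = [
    ("Male", 2.8), ("Male", 3.1), ("Male", 3.4), ("Male", 2.5),
    ("Female", 3.0), ("Female", 3.6), ("Female", 3.3),
    ("Female", 2.9), ("Female", 3.2),
]

def describe_by_group(rows):
    """Return {group: (n, mean, standard error of the mean)}."""
    groups = {}
    for group, value in rows:
        groups.setdefault(group, []).append(value)

    summary = {}
    for group, values in groups.items():
        n = len(values)
        mean = statistics.mean(values)
        # Standard error of the mean = sample standard deviation / sqrt(n).
        se = statistics.stdev(values) / math.sqrt(n)
        summary[group] = (n, mean, se)
    return summary

summary = describe_by_group(records)
for group, (n, mean, se) in summary.items():
    print(f"Gender Of Students = {group}: n={n}, mean={mean:.3f}, S.E. mean={se:.3f}")
```

Grouping first and describing second mirrors the "Organize Output by Groups" option: the statistics themselves do not change, only the set of cases they are computed on.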

**Table 1**

Gender Of Students = Male

Gender Of Students = Female

**Getting a rough idea using 95% CI**

Before we can accomplish this, let's un-split the data file.

- Make sure you are in the "Data Editor" window.

- Click on "Data", then "Split File".

- Select "Analyze All Cases" and then click on "PASTE" button.

- Run the pasted commands.

Now let's build the CIs.

- Click on "Graph", then "Error Bar".

- Select "Simple" and "Summaries for Groups of Cases", then click on DEFINE.

- Select variable "GPA" and move it into the "Variable" field.

- Select variable "Gender" and move it into the "Category Axis" field.

- Notice that by default we have a 95% CI for the mean.

- Click on the PASTE button, then run the pasted command.

Note that checking whether the two confidence intervals overlap is **not** equivalent to conducting the t-test. However, in many cases, this strategy will produce results that agree with the conclusions obtained based on the t-test.
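The error bar chart can also be checked by hand. The sketch below builds a 95% CI for each group mean as mean ± t-critical × standard error, then asks whether the two intervals overlap. The GPA values are hypothetical, the function names are made up for illustration, and the critical values (3.182 for df = 3, 2.776 for df = 4) are looked up in a t table rather than computed.

```python
import math
import statistics

def ci95(values, t_crit):
    """95% CI for the mean; t_crit is the two-tailed .05 critical t for df = n - 1."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean - t_crit * se, mean + t_crit * se

def intervals_overlap(a, b):
    """True if the intervals (low, high) a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical GPA scores, not the lab data.
male = [2.8, 3.1, 3.4, 2.5]           # n = 4, df = 3
female = [3.0, 3.6, 3.3, 2.9, 3.2]    # n = 5, df = 4

male_ci = ci95(male, t_crit=3.182)     # table value for df = 3
female_ci = ci95(female, t_crit=2.776) # table value for df = 4

print(f"Male   95% CI: {male_ci[0]:.3f} to {male_ci[1]:.3f}")
print(f"Female 95% CI: {female_ci[0]:.3f} to {female_ci[1]:.3f}")
print("Intervals overlap:", intervals_overlap(male_ci, female_ci))
```

Overlapping intervals only suggest that the means may not differ; the t-test on the difference between the means is still the formal check.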

Now that our expectation is that there is no difference in GPA between the two groups, let's conduct a two-sample t-test to confirm our expectation.

__1. Formulating the hypotheses (null, H_{0}, and alternative, H_{1})__

The alternative hypothesis (H_{1}) states the hope of the experimenter.
In other words, we hope to prove that there is a difference between
males and females in their scholastic performance as measured by their
GPA.

The null hypothesis (H_{0}) reflects the situation that the
experimenter hopes to disprove. In this case, that there is no difference
between males and females in their scholastic performance as measured by
their GPA.

Notice that the two hypotheses, combined, express competing ideas about the state of the world.

__2. Selecting a significance level α__

For this lab, we are going to use α = .05.
We could be more conservative, accept fewer Type I errors, and use
α = .01.

__3. Setting the Decision Stage__

__A. Choosing a statistical test__

In this problem, we are comparing two sample means. Since the population
means and standard deviations are unknown, the t-test is the correct choice.
However, there are two possibilities, the **independent** two-sample
t-test and the **dependent** two-sample t-test.

**Dependent** means that there is a relationship between pairs of
scores collected for the two groups. For example, if we had equal numbers
of males and females, where brothers and sisters were selected from the
same family, we would have dependent scores. This assumes that a common
family genetic heritage contributes to GPA. Another possibility is to have
the same subjects participate in both conditions of our experiment. Then
the subject's scores in the two different conditions would be dependent
because of the common contributions of each subject's ability to the pairs
of scores they obtained on our dependent measure. Scores for two groups
can only be dependent if they are paired, based on their common dependency,
and thus, both conditions of the independent variable have the same number
of scores.

**Independent** means there is no relationship between the scores
collected on the two groups. That is, there is no common influence on the
scores from the two groups.

Since the two groups in our case have unequal sample size, we know that
we are dealing with independent groups. Thus, we select the **two-sample
independent t-test**.

__B. Finding a critical value or values__

This part becomes unnecessary when using SPSS. This is because the information from tables such as A-2, on page 519 of your text, is built into the program. SPSS actually determines the probability of observing the t-value obtained. If that probability is less than or equal to .05, we will reject H_{0}. However, to keep SPSS honest, and you familiar with Table A-2, you should look up the critical value in your text, and write it on your printouts.

__C. Locating rejection region__

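**Figure 1.** Sampling distribution of t, with the lower bound (LB), t-observed = -.85, its mirror image (.85), and the upper bound (UB) marked.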

In the case of the two-tailed test, the area to the left of the lower bound (LB) and the area to the right of the upper bound (UB) in Figure 1 are each equal to one half of alpha. In our case there would be .025 of the area in each region. The P-value given by SPSS is the one associated with the observed value of t computed on the sample data. It is expressed as the combined area from t-observed (-.85) to the left and from its mirror image (+.85) to the right (just as two α/2's combine to α).
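To make the "combined area in both tails" idea concrete, here is a rough sketch that integrates the t density numerically and reports the area beyond ±t-observed. It uses only the Python standard library. The function name `two_tailed_p` is made up, and df = 61 is an assumption for illustration (35 + 28 - 2); the df on your SPSS output may differ, so the printed value need not match the output exactly.

```python
import math

def two_tailed_p(t_obs, df, steps=2000):
    """Area in both tails beyond |t_obs| under Student's t with df degrees of freedom."""
    # Normalizing constant of the t density; log-gamma avoids overflow for large df.
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)

    def density(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    # Simpson's rule on [0, |t_obs|]; steps must be an even number.
    upper = abs(t_obs)
    h = upper / steps
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    half_central = total * h / 3      # area between 0 and |t_obs|

    # The curve is symmetric, so the two tails hold whatever is left.
    return 1 - 2 * half_central

# df = 61 is an assumed value, not one taken from the lab output.
p = two_tailed_p(-0.85, df=61)
print(f"two-tailed P for t = -.85: {p:.3f}")
```

Because the t distribution is symmetric, the sign of t-observed does not matter: -.85 and +.85 give the same P-value.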

__D. Formulating the decision rules__

Knowing that the **Fail to reject H_{0}** region is in between
LB and UB and that the **Reject H_{0}** region is the remainder of the area under the curve (α), we can formulate the decision rule in terms of a P-value and Type I error (α).

We reject H_{0} if the P-value associated with the observed
statistic is less than or equal to α.

We fail to reject H_{0} if the P-value associated with the
observed statistic is greater than α.
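The decision rule is simple enough to write as a two-line function. This is only a sketch of the rule as stated above; the function name `decide` is made up, and .397 is the P-value reported for this lab's independent t-test.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the P-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.397))         # P-value from this lab, alpha = .05
print(decide(0.03))          # a hypothetical smaller P-value
print(decide(0.03, 0.01))    # the same P-value under the more conservative alpha = .01
```

Note that the same P-value can lead to different decisions under different alpha levels, which is why alpha is chosen before looking at the data.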

__4. Calculating the observed statistic__

- Select variable "GPA" and move it into the "Test Variable" field.

- Select variable "Gender" and move it into the "Grouping Variable" field.

- Click on the now-lit DEFINE GROUPS button.

- In the "Group 1" field, type 1 (the code for the male group).

- In the "Group 2" field, type 2 (the code for the female group).

- Click on the CONTINUE button.

- Click on the OPTION button.

- Since we chose alpha = .05, we leave value 95 in the field "Confidence interval".

- Click on the CONTINUE button, then the PASTE button.

- Go to the syntax window and run the t-test command.

- Click on the rectangle in the upper right corner of the output window.

The middle part of the output provides the mean difference between males and females. (This difference is computed by subtracting group 2 from group 1. Since we have labeled males as group 1 and females as group 2, the difference was achieved by subtracting the female mean GPA from the male mean GPA.) As you can see, the difference is very small, only -.1737.

Ignore the Levene's test for the equality of variance.

The bottom part of the output provides the observed t statistic (called
t-value), df, P-value for the two tail hypothesis (called Two Tail Sig),
standard error of the difference, and the 95% CI on the difference. For
reasons beyond what you will learn in this class, always use the line of
the output labeled **UNEQUAL**. (In general, this version of the t-test
will produce results closer to the alpha level we chose to use).
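For the curious, the unequal-variance line corresponds to what is usually called Welch's t-test. The sketch below computes the mean difference (group 1 minus group 2), the standard error of the difference, the observed t, and the Welch-Satterthwaite df. It assumes that this is the formula behind the UNEQUAL line; the GPA values and the function name `welch_t` are hypothetical.

```python
import math
import statistics

def welch_t(group1, group2):
    """Unequal-variance two-sample t-test: returns (diff, se_diff, t, df)."""
    n1, n2 = len(group1), len(group2)
    # Each group's contribution to the variance of the difference: s^2 / n.
    v1 = statistics.variance(group1) / n1
    v2 = statistics.variance(group2) / n2

    diff = statistics.mean(group1) - statistics.mean(group2)  # group 1 minus group 2
    se_diff = math.sqrt(v1 + v2)
    t = diff / se_diff

    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, se_diff, t, df

# Hypothetical GPA scores, not the lab data.
male = [2.8, 3.1, 3.4, 2.5]           # group 1
female = [3.0, 3.6, 3.3, 2.9, 3.2]    # group 2

diff, se_diff, t_obs, df = welch_t(male, female)
print(f"mean difference = {diff:.4f}, SE = {se_diff:.4f}, t = {t_obs:.3f}, df = {df:.2f}")
```

The fractional df is the visible sign of the unequal-variance version: the equal-variance line would instead use n1 + n2 - 2.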

**Table 2**

__5. Making a decision__

See Figure 1. The P-value (.397) is greater than α
= .05. As you should recall, the P-value is the area in the two outlying
regions of the tails of the sampling distribution. In this case, .397 of
the area falls in the extreme tails outside of the t-observed (-.85) and
its mirror image (.85). This area is greater than the combined area in the
tails beyond LB and UB (α). Since this can only
happen when t-observed lies in between LB and UB, that is, in the fail to reject
H_{0} region, we conclude that there is not enough evidence to
suggest that the null hypothesis is incorrect. So we fail to reject
H_{0}. What does that mean? Well, what we have concluded is that
our data do not show a difference between the males and the females in
their scholastic performance as measured by GPA.

A note regarding Type I and Type II errors: a Type I error results when the null hypothesis is true and we make the mistake of rejecting it (the chance of this happening is α, the significance level). A Type II error occurs when we fail to reject the null hypothesis and in fact this null hypothesis is false. Often, the chance of this happening is unknown.

In our case, we have failed to reject the null hypothesis. So, the only
error we could be making is a type II error. There is an unknown possibility
that the null hypothesis is wrong.

__Assignment:__

2. Now, consider the following problem. You just concluded an experiment
on human recall. Your experiment had two conditions: (1) 25 subjects were
allowed to study a list of unrelated words for 1 minute. Then, they waited
for 5 minutes in silence and were asked to recall as many words from the
list as possible. (2) These same subjects were again presented with a list
of unrelated words for a duration of 1 minute (different word list). This
time, however, the subjects waited for 5 minutes listening to the deafening
roar of heavy metal music. Then again, the subjects were asked to recall
as many words as possible from the second list. Also, about half of your
subjects experienced condition 1 before condition 2, and the rest experienced
them in the reverse order. This was to ensure that an order effect did not bias your findings. The
question you are trying to answer is: "Is there a difference in the recall
between these two conditions?" **Do not forget to discuss the possibility
of committing an error (be careful here)!**

- Retrieve file "**lab6_ass.sav**".

Hint: Are these groups dependent or independent? If you conclude the scores are dependent, the commands you will need to analyze the data are presented below.

**Example for Conducting Two-Sample t-test for Dependent Scores**

- Open the "Paired-Samples T Test" dialog.

- Select two variables ("Metal" and "Quiet") and move them into the "Paired Variables" field.

- Click on the OPTION button.

- Since alpha = .05, leave value 95 in the field "Confidence interval".

- Click on the CONTINUE button, then the PASTE button.

- Go to the syntax window and run the t-test command.

The first part of the output gives you basic descriptive statistics for each variable, separately.

The second part gives you an idea of how dependent the two samples are, using a measure of association called the Pearson Correlation Coefficient. A correlation of .688 indicates a strong relationship between the two variables.

The third part of the output, the left part of it, provides descriptive statistics for the individual differences computed by subtracting the second group (QUIET) from the first group (HEAVY METAL).

The right side of the third part provides t-observed (t-value), df, and P-value (2-tailed significance).
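The dependent t-test works on the individual differences, so it can be sketched in a few lines: subtract the second score from the first for each subject, then test whether the mean difference is zero. The recall scores below are hypothetical (they are not the values in the assignment file), and the function name `paired_t` is made up for illustration.

```python
import math
import statistics

def paired_t(first, second):
    """Dependent (paired) t-test: returns (mean_diff, se_diff, t, df)."""
    if len(first) != len(second):
        # Dependent scores must come in pairs, one pair per subject.
        raise ValueError("paired scores need the same number of cases")

    diffs = [a - b for a, b in zip(first, second)]  # first minus second
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, se_diff, mean_diff / se_diff, n - 1

# Hypothetical words recalled by six subjects under each condition.
metal = [8, 6, 7, 5, 9, 6]
quiet = [10, 9, 8, 7, 10, 9]

mean_diff, se_diff, t_obs, df = paired_t(metal, quiet)
print(f"mean difference = {mean_diff:.3f}, SE = {se_diff:.3f}, t = {t_obs:.3f}, df = {df}")
```

Notice that df is the number of pairs minus one, not the total number of scores minus two, because each subject contributes a single difference score.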