One-Sided Test or Two-Sided Test?

Let's look at one-sided tests versus two-sided tests in hypothesis testing.

A company claims that their product contains no more than 2 grams of saturated fat on average. You intend to test whether there is strong evidence the mean saturated fat content is greater than their claim. We're going to give the company the benefit of the doubt and test the null hypothesis that the population mean saturated fat content is 2 grams. And we want to see if there's evidence that it is in fact greater than 2 grams. If it's 2 grams or less the company's claim is true, and if it's more than 2 grams the claim is false.

And so we are going to use the one-sided alternative hypothesis here that the population mean saturated fat content is actually greater than 2 grams.

A production process creates silicon wafers. The desired thickness is 725 micrometers. You intend to test whether there is strong evidence the mean thickness differs from this amount. We'll give the process the benefit of the doubt and test the null hypothesis that the true mean thickness is in fact 725 micrometers. But we care about a difference in either direction. We don't want them to be too thick or too thin, so we're seeing if there's strong evidence that it's different from that amount. And so we will use a two-sided alternative hypothesis that the true mean thickness is different from 725 micrometers.

The choice between a one-sided or two-sided alternative hypothesis can be subject to debate, but it should not be based on the current sample's data. You might have noticed that in both of those previous examples we didn't have sample data, but we were still able to come up with our hypotheses by the nature of the problem.

We might strongly suspect that a newly developed drug decreases blood pressure. And we may wish to test the null hypothesis that the population mean change in blood pressure is zero. Or in other words, this drug does not have an effect on blood pressure. But which is the appropriate alternative? A two-sided alternative or a one-sided alternative? Because we're suspecting a decrease, we might be tempted to choose the one-sided alternative that the population mean change is less than zero. Depending on other factors, that may be the appropriate choice.

What we are gaining if we use the one-sided alternative is greater power in that direction. But if the population mean is greater than 0, say, we will not be able to detect that difference. We might not care about that side, in which case the one-sided alternative would be appropriate. But a two-sided alternative allows us to see a difference from the hypothesized value in either direction. It's not entirely obvious here which is the most appropriate alternative hypothesis, so let's look at the pros and cons in a little more detail.

Suppose we are carrying out this test using a Z test at an alpha level of 0.05. If we use a two-sided alternative hypothesis, then we're going to reject the null hypothesis if we get a Z value that is less than or equal to -1.96 or greater than or equal to 1.96. If on the other hand we choose the one-sided alternative that the population mean is less than zero, then we are going to be putting the entire alpha value in the left tail, and we are going to reject the null hypothesis if the Z value is less than or equal to -1.645. So compared to the two-sided alternative, for this one-sided alternative it's going to be easier to reject the null hypothesis in the left tail, and so our test will have greater power if the population mean is actually less than zero. So we're gaining something, but we have to give something up for that gain, and what we are giving up is any ability to detect a difference on the other side.
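
As a rough check on those cutoffs and on the power trade-off, here is a minimal Python sketch using only the standard library. The `shift` argument (where the Z statistic is centered when the null hypothesis is false) and the values of -2 and 2 tried below are illustrative assumptions, not numbers from the drug example.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal distribution
alpha = 0.05

# Cutoffs for a Z test at level alpha.
two_sided_cutoff = Z.inv_cdf(1 - alpha / 2)  # reject if |z| >= this (about 1.96)
left_cutoff = Z.inv_cdf(alpha)               # reject if z <= this (about -1.645)

def power_two_sided(shift):
    """Chance of rejecting when the Z statistic is centered at `shift`."""
    lower = Z.cdf(-two_sided_cutoff - shift)
    upper = 1 - Z.cdf(two_sided_cutoff - shift)
    return lower + upper

def power_left_sided(shift):
    """Chance of rejecting with the alternative 'mean less than 0'."""
    return Z.cdf(left_cutoff - shift)

# Hypothetical shifts: 2 standard errors below and above the null value.
for shift in (-2, 2):
    print(shift, round(power_two_sided(shift), 3), round(power_left_sided(shift), 3))
```

With the shift below zero the left-sided test rejects more often than the two-sided test; with the shift above zero it almost never rejects, while the two-sided test keeps the same power in both directions.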

So if the true mean is greater than 0, we're not going to be able to detect that. If we don't care about that side, then that's fine, but we might want to know about that. It might be interesting for us if this drug actually increases blood pressure. We might have an application for that. When we choose a one-sided alternative instead of a two-sided alternative, we gain power in the direction of the alternative hypothesis but completely lose the ability to detect a difference on the other side.

In practice it's going to be tempting to use the one-sided alternative if we suspect the difference is going to be in that direction and our observed sample data falls in that direction. But if our observed sample data shows a difference in the opposite direction, then it might be tempting to change our mind and choose a one-sided alternative in that other direction, or a two-sided alternative, so that we can report those results. But this at best is very poor statistical practice. You should not use the data to determine what the alternative hypothesis is.
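
To see what goes wrong, here is a small simulation sketch in Python, assuming a Z test at an alpha level of 0.05 and a null hypothesis that is actually true. The function name `rejection_rates`, the seed, and the number of simulated samples are arbitrary choices made for illustration.

```python
import random
from statistics import NormalDist

Z = NormalDist()
alpha = 0.05
one_sided_cutoff = Z.inv_cdf(1 - alpha)      # about 1.645
two_sided_cutoff = Z.inv_cdf(1 - alpha / 2)  # about 1.96

def rejection_rates(n_sims=100_000, seed=1):
    """Simulate Z statistics under a true null hypothesis."""
    rng = random.Random(seed)
    peeked = 0      # one-sided test whose direction is chosen after seeing z
    two_sided = 0   # two-sided test chosen in advance
    for _ in range(n_sims):
        z = rng.gauss(0, 1)
        if abs(z) >= one_sided_cutoff:   # rejects in whichever tail z landed
            peeked += 1
        if abs(z) >= two_sided_cutoff:
            two_sided += 1
    return peeked / n_sims, two_sided / n_sims

peeked_rate, two_sided_rate = rejection_rates()
print(peeked_rate, two_sided_rate)
```

The two-sided test should reject a true null hypothesis about 5% of the time, while the one-sided test whose direction is picked from the data should reject roughly 10% of the time, double the stated alpha level.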

It's a similar notion if we use the p-value approach. If we choose the alternative hypothesis based on the direction observed in the sample, the reported p-value will be half of what it should be. We will be overstating the amount of evidence against the null hypothesis, and we will sometimes be reporting results as significant when they are not.

And if we always chose the alternative hypothesis based on what we expected to get before running the experiment or getting our sample, then that's a problem as well.

Results different from what one expects can often be the most interesting results.

So here's my opinion on the matter, and it's one that's shared by many others. Err on the side of choosing a two-sided alternative hypothesis. And choose a one-sided alternative if you care about a difference in only one direction.

On a related note, in two-sided tests we are still almost always interested in the direction of the difference. For example, you are investigating possible differences in five-year survival rates for two types of treatment. p_1 here represents the five-year survival rate for treatment 1, and p_2 represents the five-year survival rate for treatment 2. And we're testing the null hypothesis that p_1 equals p_2 against a two-sided alternative hypothesis. If you are reporting these results to your boss and you tell them that there is very strong evidence of a difference in five-year survival rates, and your boss asks you which treatment is better and you say "gee, I never thought to look," that's not going to go over very well. We want to know which treatment has a better five-year survival rate. So when we are using a two-sided alternative hypothesis, we are still typically interested in the direction of the difference.

There are some statistical issues with trying to come up with a directional conclusion from a two-sided alternative hypothesis. But if we only say that there is a difference, that doesn't really get us anywhere.

On a related note, in tests with two-sided alternatives the null hypothesis is almost always wrong. In that last example we were testing the null hypothesis that the five-year survival rates were equal. What are the chances that the five-year survival rates are exactly equal? Not very good, to say the least. So a lot of times, going in, we know the null hypothesis is false, and if in the end we simply reject it without giving an indication of the direction of the difference, we just wasted our time.

This is another reason why it can be very informative to report an appropriate confidence interval in addition to the results of the hypothesis test, as the confidence interval can give us an indication of the direction and size of the difference.
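
As one possible illustration, here is a sketch of a large-sample (Wald) confidence interval for p_1 - p_2 in Python. The function name `diff_in_proportions_ci` and the survivor counts are made up for this example, and the Wald interval is just one common choice among several.

```python
from math import sqrt
from statistics import NormalDist

def diff_in_proportions_ci(x1, n1, x2, n2, conf_level=0.95):
    """Large-sample (Wald) interval for p_1 - p_2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Made-up counts of five-year survivors, for illustration only.
low, high = diff_in_proportions_ci(x1=120, n1=200, x2=100, n2=200)
print(round(low, 3), round(high, 3))
```

With these hypothetical counts the whole interval lies above zero, which points to treatment 1 having the higher survival rate and also shows how large or small the difference could plausibly be.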

In the end, the choice between a one-sided or two-sided alternative hypothesis does not need to matter much. Report the p-value, say whether it's a one-sided or two-sided p-value (usually two-sided), and let the reader make up their own mind. A knowledgeable person can make the appropriate adjustments if they feel that a one-sided alternative was more appropriate in that case.
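
For a test statistic with a symmetric null distribution, such as Z or t, that adjustment can be sketched in a couple of lines of Python. The function name and its arguments are hypothetical, chosen only for this illustration.

```python
def one_sided_from_two_sided(p_two_sided, observed_in_hypothesized_direction):
    """Convert a two-sided p-value to a one-sided one (symmetric null distribution)."""
    half = p_two_sided / 2
    # If the sample fell on the side named by the one-sided alternative, the
    # p-value is halved; if it fell on the other side, it is 1 minus that half.
    return half if observed_in_hypothesized_direction else 1 - half

print(one_sided_from_two_sided(0.08, True))
print(one_sided_from_two_sided(0.08, False))
```

The direction of the one-sided alternative still has to be fixed before looking at the data for the halved p-value to mean what it claims.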
