## Complaint about AP stats: one tail hypothesis testing

Apologies to people who don’t care…

But DC1 needed help on hypothesis testing, which is totally understandable since it’s non-intuitive for a lot of people (including me!), particularly the one-tailed tests.

So, most of us who use stats regularly are used to the idea of two-tail t-testing– that’s when we want to know if there is actually a relationship between two variables or if it’s just random chance making it look like two variables are related.  This is kind of the essence of regression analysis– we want to know if we can be 95% ok with the idea that the coefficient we are getting is different from zero.  Two-tailed t-tests are used for a lot of other things besides regression, like seeing if a variable is different from a mean or two means are different from each other, but the basic idea is we want to know if things are the same or different.  We don’t assign any direction to the difference– we don’t say which is bigger or smaller, just that the sample was drawn from a distribution with the same/different mean, or the two samples were drawn from one distribution or from two separate distributions.

Hypotheses then look like:

H1: μ1≠μ2

H0: μ1=μ2
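For anyone who wants to poke at this, here is a minimal sketch of a two-tailed comparison of two means. It is not a real t-test: to stay within Python's standard library it uses a large-sample normal approximation instead of the t distribution, so the p-values will be a bit too small for samples this tiny. The function name `two_sided_p` and the data are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sided_p(sample1, sample2):
    """Two-sided p-value for H0: mu1 = mu2 vs. H1: mu1 != mu2.

    Assumption: normal approximation to the test statistic (not a t distribution).
    """
    # Standard error of the difference in means, unequal variances allowed.
    se = sqrt(variance(sample1) / len(sample1) + variance(sample2) / len(sample2))
    z = (mean(sample1) - mean(sample2)) / se
    # Both tails count: a large |z| in either direction is evidence against H0.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Made-up data, purely for illustration.
group_a = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.2, 5.5]
group_b = [4.8, 5.0, 4.7, 5.2, 4.9, 5.1, 4.6, 5.0]

p_ab = two_sided_p(group_a, group_b)
p_ba = two_sided_p(group_b, group_a)
print(f"two-sided p (a vs. b): {p_ab:.5f}")
print(f"two-sided p (b vs. a): {p_ba:.5f}")
```

Swapping the two groups gives the same p-value, which is the "no direction" part: the test only asks whether the means differ, not which one is bigger.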

With a one-tailed test you are applying a direction.  Your alternative (H1) hypothesis is that μ1 is bigger than μ2.  (Or that μ1 is smaller than μ2, depending on what you’re trying to show.)  The null then covers everything else: the means could be equal, but the difference could also be going the other direction.

So the hypothesis should look like:

H1: μ1>μ2

H0: μ1≤μ2
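A rough sketch of the one-tailed version, again using a normal approximation in place of a proper t distribution (an assumption made to keep this standard-library only). The name `p_greater` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def p_greater(sample1, sample2):
    """One-sided p-value for H1: mu1 > mu2 (null: mu1 <= mu2).

    Assumption: normal approximation to the test statistic (not a t distribution).
    """
    se = sqrt(variance(sample1) / len(sample1) + variance(sample2) / len(sample2))
    z = (mean(sample1) - mean(sample2)) / se
    # Only the upper tail counts: a very negative z is NOT evidence for H1.
    return 1 - NormalDist().cdf(z)


# Made-up data, purely for illustration.
group_a = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.2, 5.5]
group_b = [4.8, 5.0, 4.7, 5.2, 4.9, 5.1, 4.6, 5.0]

p_right_direction = p_greater(group_a, group_b)  # data agree with H1
p_wrong_direction = p_greater(group_b, group_a)  # data point the other way
print(f"H1: mean(a) > mean(b): p = {p_right_direction:.5f}")
print(f"H1: mean(b) > mean(a): p = {p_wrong_direction:.5f}")
```

When the data point the other way, the p-value comes out close to 1 and you fail to reject, which is exactly the μ1<μ2 case the null has to leave room for.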

There are a number of different ways to write out H0.  You could write > but then cross it out (I couldn’t find that character in Word).  I tell my students they can write “We cannot say that μ1>μ2,” because that’s really the point.  We don’t know if μ1=μ2 or if μ1<μ2 or… even if μ1>μ2 and we just have too small of a sample size to say for sure.  (My students generally will be doing practical t-testing for non-academic employers rather than writing up fancy scientific papers that require formal hypotheses, so it’s more important for them to understand what they’re doing and what their outputs mean than it is for them to follow a specific jargon structure.)

Imagine my astonishment when the AP review sheet DC1 had showed this instead:

H1: μ1>μ2

H0: μ1=μ2

Surely the teacher made a mistake, I thought.  But no!  This is how all the AP stuff is.  This is genuinely what they’re teaching and testing for AP stats.

The problem is that when you formulate it like this, you’re not allowing for the possibility of μ1<μ2, and you’re making kids think that you’ve actually proven the null hypothesis, that you can really say that μ1=μ2 when really all you can do is say you’re not sure if μ1>μ2.
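To make that concrete, here is a small simulation sketch. Everything in it is an assumption for illustration: normally distributed data with standard deviation 1, ten observations per group, a normal approximation instead of a real t-test, and the hypothetical helper names `p_greater` and `fail_to_reject_rate`. It counts how often a one-tailed test at the 5% level fails to reject when the true difference is negative, zero, or small and positive.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance

ALPHA = 0.05


def p_greater(sample1, sample2):
    """One-sided p-value for H1: mu1 > mu2 (normal approximation, an assumption)."""
    se = sqrt(variance(sample1) / len(sample1) + variance(sample2) / len(sample2))
    z = (mean(sample1) - mean(sample2)) / se
    return 1 - NormalDist().cdf(z)


def fail_to_reject_rate(true_diff, n=10, trials=1000, seed=0):
    """Share of simulated studies that fail to reject when mu1 - mu2 = true_diff."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    fails = 0
    for _ in range(trials):
        sample1 = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        sample2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if p_greater(sample1, sample2) >= ALPHA:
            fails += 1
    return fails / trials


# True differences: mu1 < mu2, mu1 = mu2, and mu1 slightly > mu2.
rates = {diff: fail_to_reject_rate(diff) for diff in (-0.5, 0.0, 0.2)}
for diff, rate in rates.items():
    print(f"true mu1 - mu2 = {diff:+.1f}: failed to reject in {rate:.1%} of trials")
```

Under these assumptions the test fails to reject most of the time in all three scenarios, so "fail to reject" on its own cannot tell you which of them you are in.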

I took a picture and texted it to my friends without comment and got replies like, “What fresh hell is this?” and “!!, No!  less than or equal.” And they were astonished to find it was AP Stats and for real, not just a mistake in the notes.

No wonder some of my students who come in to my class after having taken AP Stats never really got hypothesis testing the first time around.

### 16 Responses to “Complaint about AP stats: one tail hypothesis testing”

1. Jenny F. Scientist Says:

I’m definitely with you: the opposite of greater than is…. NOT greater than. ≯!

2. 'Snough Says:

Huh! What you say makes so much sense. I’ve never taken a stats course, but I taught a “Stats-for-bio/business-majors” course, and so had to teach myself hypothesis testing as we went along. Apologies to the world in general for especially my first year of teaching, when I was still pretty ineffective and playing catch-up.

At any rate, the course used a book called “A First Course in Statistics”, 8th edition, and I’m pretty sure the null hypothesis in that book was ALWAYS “mean = something”, never “mean is not greater than something” or “not less than something”. This is the first time I’ve thought about that: I guess it’s because the test uses a single probability distribution, with mu = #, and not a bunch of different probability distributions with mu numbers less/greater than that number.

I haven’t taught stats for almost two decades; we hired faculty members who know the heck what they’re doing.

• nicoleandmaggie Says:

If it’s a first course they might not have done one sided at all. I don’t think my first stats class (at a community college) did. Sounds like they also only did one sample and not two sample. You can have one side with one sample, but one side is a difficult concept and IRL people think you’re cheating if you use it because it’s 2x as easy to reject the null, even if you only care about one side. So there’s not much point in teaching it unless you think your students will be working with small samples for which they can specify a direction (ex. Local government, medium sized firms).

3. CG Says:

Oy. I guess my oldest and I will need to have some conversations about this next year when he takes AP stats. It’s fine to teach one-sided, although I never use it in my analysis, but don’t teach it incorrectly! Was this on some standard curriculum or was it written by the teacher?

4. Matthew D Healy Says:

This is EXACTLY why in my world (pharmaceutical research), two-tailed hypothesis tests are usually preferred: there is always the possibility that your new drug might turn out to harm patients. Therefore you need to ask (1) is there a significant difference between groups and (2) which direction?

You also need to ask whether the effect size is big enough to be clinically meaningful; with large enough sample size even tiny effects can have small p values.

• nicoleandmaggie Says:

We spend quite a bit of time talking about type 1 vs type 2 error when we talk about which kind of test to use. Sometimes the other kind of error is worse! Sometimes the change is cheap. Sometimes it’s got bad side effects. There’s a lot that goes into cost-benefit analysis. (We also spend a lot of time going over magnitude vs. significance, or “oomph” vs significance as Deirdre McCloskey calls it.)

A lot of what my students will be doing when they get out is the kind where they have small samples and know a direction and rejecting the null when they shouldn’t isn’t a huge deal, so a one-sided test is better than just looking at the means. My second semester class is people who will be using large samples, so it tends not to matter if it’s one side or two side. Even if they know which side they care about.

5. This explains generally why I understood nothing in my college stats course. I don’t think anything you wrote here was covered at all. Not that my first two years of college registered much in my memory but I do remember worrying about how I’d pass because none of it was clear like this at all. I’m still going to need to reread this a couple dozen more times to fix it in my memory now, but at least it feels possible to understand.

• nicoleandmaggie Says:

It’s complicated for most people the first time through. (But also I am a brilliant teacher.)

• You have no idea how tempting it is to want to enlist your aid as a tutor just because I would like to understand math! Maybe someday when we both have an abundance of time and boredom, I could do a correspondence course with you. XD

• nicoleandmaggie Says:

ROFL, well, if you want to move to a horrific southern state for two years, you could get a masters!

• *shiver* I don’t think I value “feeling less stupid about math” quite that much! O_O Yet. Maybe. Probably.

6. First Gen American Says:

I freaking love statistics but I didn’t get exposed to it until college and beyond. I was also expected to use it in my projects when I never took one class in it either in HS or college. God, our kids will have it so much easier than I did…even with portions of it being taught incorrectly.

• nicoleandmaggie Says:

It was required for everyone at my SLAC, even humanities majors, which I think is fantastic. Though most of them satisfied that requirement with a joke class from the anthropology department. Being able to understand that just because two numbers are different that doesn’t mean they’re statistically different is an important concept for life.
