Adding variances is the Pythagorean Theorem of stats. We all probably say that to our learners, but it is ONLY true in one special instance: when the two random variables are independent of one another.

That pesky “independence or n < 10%” condition check plays a role here, because if we fail to check it, then we can’t add the variances.

Or can we?

Well, I won’t jump to the end yet. I want to explore a classroom activity on this first.

**Combining Random Variables: Speed Dating**

To save time and money, many single people have decided to try speed dating. At a speed dating event, women sit in a circle and men spend about 5 minutes getting to know a woman before moving on to the next one. Suppose that the height *M* of male speed daters follows a Normal distribution with a mean of 69.5 inches and a standard deviation of 4 inches and the height *F* of female speed daters follows a Normal distribution with a mean of 65 inches and a standard deviation of 3 inches. What is the probability that a randomly selected male speed dater is taller than the randomly selected female speed dater he is paired with?

Pair the class off, and have the boys do randNorm(69.5,4) and the girls do randNorm(65,3). Have them compare and raise hands. Once they do it, shift partners and do it again. Do this for a sample of 20 or so, just enough so they can see that sometimes the girl will be taller and sometimes the boy, and that doing this alone won’t answer the question.

Then use lists: put 100 males into list 1 and 100 females into list 2, using the store command on the board calculator up front. Go through the lists one by one. Who is taller, boy or girl? Who is taller, boy or girl? Etc. Wait for someone in the class to say “Couldn’t we just subtract the two?” Bingo. Let them come up with it, though, then store the differences (L1 minus L2) in L3.

Calculate the SDs of L1, L2, and L3. What do they notice? Hmmm, roughly 4, 3, and 5: a 3, 4, 5…. Wait for it. Wait for it. If you need to, ask questions, but don’t give it away. Let them reach back and think of the Pythagorean Theorem.
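If you want to preview the list activity before class, here is a minimal Python sketch of the same idea. The variable names (`males`, `females`, `diffs`) and the seed are my own choices, standing in for L1, L2, and L3 on the calculator. With only 100 pairs the SDs will land near 4, 3, and 5 rather than exactly on them.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

N = 100  # same list length as the calculator activity

# Stand-ins for L1 (male heights), L2 (female heights), L3 (L1 - L2)
males = [random.gauss(69.5, 4) for _ in range(N)]
females = [random.gauss(65, 3) for _ in range(N)]
diffs = [m - f for m, f in zip(males, females)]

# Sample standard deviations of the three lists
sd_m = statistics.stdev(males)
sd_f = statistics.stdev(females)
sd_d = statistics.stdev(diffs)

# Fraction of simulated pairs in which the male is taller
prop_taller = sum(d > 0 for d in diffs) / N

print(f"SD males:   {sd_m:.2f}")
print(f"SD females: {sd_f:.2f}")
print(f"SD diffs:   {sd_d:.2f}")
print(f"sqrt(sd_m^2 + sd_f^2): {(sd_m**2 + sd_f**2) ** 0.5:.2f}")
print(f"Proportion with taller male: {prop_taller:.2f}")
```

The last two SD lines should be close to each other but not identical, because two independently generated lists still show a small sample correlation by chance.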

Now they have discovered the rule instead of you telling them the rule, and it should stick a little better.
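For reference, here is how the rule answers the speed dating question, treating *M* and *F* as independent, as the problem's random pairing implies:

$$\mu_{M-F} = 69.5 - 65 = 4.5, \qquad \sigma_{M-F} = \sqrt{4^2 + 3^2} = 5$$

$$P(M > F) = P(M - F > 0) = P\left(Z > \frac{0 - 4.5}{5}\right) = P(Z > -0.9) \approx 0.816$$

So the male is taller in roughly 82% of random pairings.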

---

But back to the question above. If the two samples are not independent, then we can’t add the variances, right? Wrong. Here is where the Law of Cosines plays a role.

Think back to geometry or trigonometry and we had the following situation:
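In symbols, with sides $a$ and $b$, third side $c$, and $C$ the angle between $a$ and $b$ (labels assumed here for illustration), the two standard cases are:

$$c^2 = a^2 + b^2$$

$$c^2 = a^2 + b^2 - 2ab\cos C$$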

When the triangle was a right triangle, we had the top situation, the nice simple Pythagorean Theorem. But if the two sides are not orthogonal, then we don’t have a right triangle, and we use the law of cosines to find the length of the third side.

Guess what happens in stats when we do not have orthogonal (or independent) samples? You are right!
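Written out for two random variables $X$ and $Y$ (names assumed here) with standard deviations $\sigma_X$ and $\sigma_Y$, the standard result is:

$$\sigma^2_{X \pm Y} = \sigma_X^2 + \sigma_Y^2 \pm 2\rho\,\sigma_X\sigma_Y$$

The difference $X - Y$ takes the minus sign on the last term, just like the law of cosines, and the sum $X + Y$ takes the plus sign.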

Here rho is the correlation coefficient between the two variables. When the two are independent, rho is zero, and the correlation term goes away. It turns out that in AP Stats we are teaching a special case! If we have non-independent samples, we can still combine the variances, as long as we adjust for the correlation!
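To see the adjustment at work, here is a rough Python sketch that simulates correlated heights and compares the variance of the differences with and without the correlation term. The value `RHO = 0.6` is purely hypothetical (nothing in the speed dating problem says heights of pairs are correlated), and the helper `sample_corr` is my own, written out by hand so the sketch needs only the standard library.

```python
import math
import random
import statistics

random.seed(2)  # arbitrary seed so the run is repeatable

RHO = 0.6    # hypothetical correlation, chosen only for illustration
N = 10_000   # number of simulated pairs

xs, ys = [], []
for _ in range(N):
    z1 = random.gauss(0, 1)
    z2 = random.gauss(0, 1)
    # Build y from z1 and z2 so that corr(x, y) is RHO in the long run
    xs.append(69.5 + 4 * z1)
    ys.append(65 + 3 * (RHO * z1 + math.sqrt(1 - RHO**2) * z2))

diffs = [x - y for x, y in zip(xs, ys)]


def sample_corr(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))
    return num / den


sx, sy = statistics.stdev(xs), statistics.stdev(ys)
r = sample_corr(xs, ys)

var_diff = statistics.variance(diffs)            # what the data actually show
var_naive = sx**2 + sy**2                        # "Pythagorean" rule, ignores correlation
var_adjusted = sx**2 + sy**2 - 2 * r * sx * sy   # "law of cosines" rule

print(f"sample r:               {r:.3f}")
print(f"variance of diffs:      {var_diff:.3f}")
print(f"naive sum of variances: {var_naive:.3f}")
print(f"adjusted for r:         {var_adjusted:.3f}")
```

The adjusted value matches the observed variance of the differences (up to floating-point rounding), while the naive sum overshoots. With these assumed parameters the theoretical variance is 16 + 9 − 2(0.6)(4)(3) = 10.6, not 25.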

Isn’t it awesome when you learn where the material goes next, so you can teach the curriculum better? This kind of stuff truly makes me a better teacher.

The law of cosines example not only just made my month but also helps me understand a lot of economics that my brother was trying to teach me — dealing with dependent data being thrown together (though we mainly talked about multiple regression, the concept clearly applies here). If I have any kids working with dependent data I will make them experience this first. Thanks.