from datascience import *
import numpy as np
%matplotlib inline
When we observe something different from what we expect in real life (e.g., four 3’s in six rolls of a fair die), a natural question to ask is “Was this unexpected behavior due to random chance, or something else?”
Hypothesis testing allows us to answer the above question in a scientific and consistent manner, using the power of computation and statistics to conduct simulations and draw conclusions from our data.
Wayland is playing with a coin and he wants to test whether his coin is fair. His experiment is to toss the coin 100 times. He chooses the following null hypothesis.
Null Hypothesis: The coin is fair and any deviation observed is due to chance.
For each of the alternative hypotheses listed below, determine whether or not the test statistic is valid.
Alternative Hypothesis: The coin is biased towards heads.
Test Statistic: # of heads.
Alternative Hypothesis: The coin is not fair.
Test Statistic: # of heads.
Alternative Hypothesis: The coin is not fair.
Test Statistic: \(|\)# of heads - expected # of heads \(|\).
Correct.
Alternative Hypothesis: The coin is biased towards heads.
Test Statistic: \(|\)# of heads - expected # of heads \(|\).
Incorrect, this is the opposite case of part (b). We see that this test statistic will also account for a bias towards tails (because of the absolute value).
Alternative Hypothesis: The coin is not fair.
Test Statistic: 1/2 - proportion of heads.
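To see why the direction of the alternative matters when choosing a statistic, the sketch below evaluates three of the candidate statistics above on two outcomes of 100 tosses. The outcomes (70 and 30 heads) and the helper names are illustrative assumptions, not part of the exercise.

```python
# Hypothetical outcomes of 100 tosses; the values 70 and 30 are illustrative only.
NUM_TOSSES = 100
EXPECTED_HEADS = NUM_TOSSES // 2

def count_stat(num_heads):
    """Signed statistic: the raw number of heads."""
    return num_heads

def abs_diff_stat(num_heads):
    """Unsigned statistic: distance from the expected number of heads."""
    return abs(num_heads - EXPECTED_HEADS)

def signed_prop_stat(num_heads):
    """Signed statistic: 1/2 minus the proportion of heads."""
    return 0.5 - num_heads / NUM_TOSSES

for heads in (70, 30):
    print(heads, count_stat(heads), abs_diff_stat(heads), signed_prop_stat(heads))
```

The absolute difference takes the same value for both outcomes, so large values point away from fairness in either direction. The two signed statistics separate the outcomes, so a large value of either one points in a single direction only.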
Wayne is flipping a coin. He thinks it is unfair, but is not sure. He flips it 10 times and gets heads 9 times. He wants to determine whether the coin was actually unfair, or whether the coin was fair and his result of 9 heads in 10 flips was due to random chance.
What is a possible model that he can simulate under?
A possible model that you could simulate under is that on each flip, there is a 50% chance that the coin lands heads and a 50% chance that the coin lands tails. Any difference is due to chance.
If you are more familiar with probability: the flips are like independent and identically distributed draws at random from a distribution in which 50% are Heads and 50% are Tails.
What is an alternative model for Wayne’s coin? You do not necessarily have to be able to simulate under this model.
What is a good test statistic that you could compute from the outcome of his flips? Calculate that statistic for your observed data. Hint: If the coin was unfair, it could either be biased towards heads or biased towards tails.
A good test statistic is the absolute difference between the number of heads we observe and the expected number of heads (5). Our observed test statistic is \(|9 - 5| = 4\). Notice that this statistic is large both when the number of heads is large and when it is small.
We could also use proportions as our test statistic, i.e., \(\vert\) proportion of heads - 0.5 \(\vert\).
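As a quick check of the arithmetic, the sketch below computes both versions of the observed statistic from Wayne's 9 heads in 10 flips. The variable names are illustrative.

```python
num_flips = 10
observed_heads = 9

# Expected number of heads under the fair-coin model.
expected_heads = num_flips * 0.5

# Count version: |# of heads - expected # of heads|
observed_count_stat = abs(observed_heads - expected_heads)

# Proportion version: |proportion of heads - 0.5|
observed_prop_stat = abs(observed_heads / num_flips - 0.5)

print(observed_count_stat, observed_prop_stat)
```

The proportion version is the count version divided by the number of flips, so the two statistics rank outcomes identically.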
Complete the function flip_ten, which takes no arguments and does the following:
- Simulates flipping a fair coin 10 times
- Computes the simulated statistic, based on the one chosen in the previous question
def flip_ten():
    faces = make_array("Heads", "Tails")
    flips = ____________________
    num_heads = ____________________
    return ____________________
def flip_ten():
    faces = make_array("Heads", "Tails")
    flips = np.random.choice(faces, 10)
    num_heads = np.count_nonzero(flips == "Heads")
    return abs(num_heads - 5)
flip_ten()
0
Complete the code below to simulate the experiment 10000 times and record the statistic computed in each of those trials in an array called simulated_stats.
trials = ____________________
simulated_stats = ____________________
for ____________________:
    one_stat = ____________________
    ____________________ = ____________________
trials = 10000
simulated_stats = make_array()
for i in np.arange(trials):
    one_stat = flip_ten()
    simulated_stats = np.append(simulated_stats, one_stat)
simulated_stats
array([ 0., 2., 1., ..., 2., 2., 2.])
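The loop above can also be written without an explicit loop. The sketch below is an alternative, not the worksheet's intended solution. It assumes plain NumPy rather than the datascience helpers and draws the number of heads in each trial directly from a binomial distribution. The seed and the name simulated_stats_vectorized are arbitrary choices.

```python
import numpy as np

# Seed chosen arbitrarily so the sketch is reproducible.
rng = np.random.default_rng(0)

trials = 10000

# Each entry is the number of heads in 10 flips of a fair coin.
heads_per_trial = rng.binomial(n=10, p=0.5, size=trials)

# Same statistic as flip_ten: |number of heads - 5|.
simulated_stats_vectorized = np.abs(heads_per_trial - 5)

print(simulated_stats_vectorized[:10])
```

Both versions simulate under the same fair-coin model. The vectorized form avoids calling np.append repeatedly, which copies the array on every call.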
Suppose we performed the simulation and plotted a histogram of simulated_stats. The histogram is shown below.
Table().with_columns('Absolute Differences', simulated_stats).hist("Absolute Differences", bins = np.arange(11))
Is our observed statistic from part (c) consistent with the model we simulated under?
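One way to make that comparison concrete is to ask how often the fair-coin model produces a statistic at least as large as the observed value of 4. The sketch below runs its own NumPy simulation so that it is self-contained, and compares the empirical proportion with the exact binomial probability. The seed and variable names are assumptions.

```python
import numpy as np
from math import comb

observed_stat = 4
trials = 10000

# Simulate the statistic |heads - 5| under the fair-coin model.
rng = np.random.default_rng(42)  # arbitrary seed for reproducibility
heads = rng.binomial(n=10, p=0.5, size=trials)
simulated = np.abs(heads - 5)

# Proportion of simulated statistics at least as extreme as the observed one.
empirical_p = np.count_nonzero(simulated >= observed_stat) / trials

# Exact version: a statistic of 4 or more means 0, 1, 9, or 10 heads.
exact_p = sum(comb(10, k) for k in (0, 1, 9, 10)) / 2**10

print(empirical_p, exact_p)
```

The exact probability is 22/1024, or about 2%, so a result as extreme as 9 heads in 10 flips is uncommon under the fair-coin model. Whether that counts as inconsistent with the model depends on the cutoff chosen before running the test.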
You are playing a wheel-spinning game at a carnival, where you can earn prizes based on where the wheel stops. The booth attendant claims the distribution of prizes is as below, but you think the game is rigged and doesn’t follow the listed probabilities.
Prize | Chance |
---|---|
Nothing | 80% |
Teddy bear | 2% |
Pinwheel | 6% |
Sticker | 12% |
You would like to test your claim so you can report the carnival for fraud. Before you design your test, consider: do you have numerical data or categorical data?
What is your hypothesis?
What is the booth attendant’s hypothesis?
Which hypothesis (of the two we defined) can you simulate under?
What is a good statistic to use?
Write code that simulates playing the carnival game 1,000 times, returns an array of proportions representing how often each prize was won, and finally extracts the number of teddy bears won in the simulation.
prize_chances = ____________________
my_simulation = ____________________
num_teddy_bears = ____________________
prize_chances = make_array(0.8, 0.02, 0.06, 0.12)
my_simulation = sample_proportions(1000, prize_chances)
num_teddy_bears = my_simulation.item(1) * 1000
num_teddy_bears
17.0
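For readers without the datascience library, the sketch below reproduces the same kind of simulation with NumPy's multinomial sampler. The helper simulate_proportions is a hypothetical stand-in for sample_proportions, not the library function itself. The sketch also computes total variation distance, one common statistic for comparing two categorical distributions. The worksheet does not state which statistic it intends, so that choice is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for reproducibility

prizes = ["Nothing", "Teddy bear", "Pinwheel", "Sticker"]
claimed = np.array([0.8, 0.02, 0.06, 0.12])
num_spins = 1000

def simulate_proportions(sample_size, probabilities):
    """Hypothetical stand-in for sample_proportions, built on a multinomial draw."""
    return rng.multinomial(sample_size, probabilities) / sample_size

def total_variation_distance(dist1, dist2):
    """Half the sum of absolute differences between two distributions."""
    return np.sum(np.abs(dist1 - dist2)) / 2

simulated = simulate_proportions(num_spins, claimed)
print(dict(zip(prizes, simulated)))
print(total_variation_distance(simulated, claimed))
```

Repeating the last two steps many times would give an empirical distribution of the statistic under the attendant's claimed probabilities, against which the statistic from real spins could be compared.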
Suppose the wheel-spinning game received a lot of complaints at the carnival, and the owners of the game are pressured to release their true distribution of prizes as below:
Prize | Chance |
---|---|
Nothing | 90% |
Teddy bear | 1% |
Pinwheel | 3% |
Sticker | 6% |
Use the distribution above to answer the following probability questions.
What is the probability of winning a Teddy bear and a Sticker in two spins?
P(Teddy bear and Sticker) = 2 * P(Teddy bear) * P(Sticker) = 2 * 0.01 * 0.06 = 0.12%
We multiply by 2 because we could have won the Teddy bear and then the Sticker OR the Sticker first and then the Teddy bear.
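The factor of 2 can be checked by enumerating every ordered pair of outcomes for two spins and adding up the chances of the pairs that contain one Teddy bear and one Sticker. This sketch assumes the two spins are independent, as the calculation above does.

```python
from itertools import product

chances = {"Nothing": 0.90, "Teddy bear": 0.01, "Pinwheel": 0.03, "Sticker": 0.06}

# Sum the probability of each ordered pair (first spin, second spin)
# that consists of exactly one Teddy bear and one Sticker.
p_teddy_and_sticker = sum(
    chances[first] * chances[second]
    for first, second in product(chances, repeat=2)
    if {first, second} == {"Teddy bear", "Sticker"}
)

print(p_teddy_and_sticker)
```

Exactly two ordered pairs satisfy the condition, (Teddy bear, Sticker) and (Sticker, Teddy bear), which is where the factor of 2 comes from.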
What is the probability of winning at least one prize in 10 spins?
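One way to approach this question is the complement rule: the chance of at least one prize equals 1 minus the chance of winning nothing on every spin. The sketch below assumes that the spins are independent and that "Nothing" does not count as a prize.

```python
p_nothing = 0.90
num_spins = 10

# Chance that all 10 independent spins land on "Nothing".
p_no_prize = p_nothing ** num_spins

# Complement rule: at least one prize = not (no prize at all).
p_at_least_one = 1 - p_no_prize

print(round(p_at_least_one, 4))
```

Under these assumptions the result is \(1 - 0.9^{10}\), or about 65%.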
Researchers are studying the effectiveness of a particular flu vaccine. A large random sample was taken from the population of people who took the vaccine in 2016. Among the sampled people, 48% did not get the flu. Another large random sample was taken in 2017, from among the people who took the vaccine that year. Among these sampled people, 40% did not get the flu.
A researcher thinks the vaccine was less effective in 2017 than in 2016. To test this, a null hypothesis is needed. Exactly one of the following choices is the correct null hypothesis.
A. The vaccine was less effective in the 2017 population than in the 2016 population, due to chance.
B. The vaccine was equally effective in the two samples but its effectiveness was different in the two populations due to chance.
C. The vaccine was equally effective in the two populations but its effectiveness was different in the two samples due to chance.
Option A - Incorrect as it describes a model that is difficult to simulate under. How can we quantify “less effective”?
Option B - Incorrect as the question tells us that the vaccine was not equally effective in the two samples (48% vs 40%).
Option C - Correct. The null hypothesis would state that the vaccine was equally effective in the two populations, and that the differences we observe in the two samples are simply due to chance.
The researcher says, “The observed value of my test statistic is 40% – 48% = − 8%.” To perform the test, the statistic is simulated under the null hypothesis. One of the figures below is the empirical histogram of the simulated values. Which is it?
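To get a feel for what the simulated values look like, the sketch below draws two samples from populations with the same rate of not getting the flu and records the difference in sample percentages, repeated many times. The problem does not give the sample size or the common rate, so sample_size = 1000 and common_rate = 0.44 are purely illustrative assumptions, as are the seed and the number of repetitions.

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed for reproducibility

sample_size = 1000   # assumption: the "large" sample size is not given
common_rate = 0.44   # assumption: shared chance of not getting the flu under the null
repetitions = 5000

# Proportion who did not get the flu in each simulated sample.
p_2016 = rng.binomial(sample_size, common_rate, size=repetitions) / sample_size
p_2017 = rng.binomial(sample_size, common_rate, size=repetitions) / sample_size

# Same statistic as the researcher's: 2017 proportion minus 2016 proportion.
simulated_diffs = p_2017 - p_2016

print(simulated_diffs.mean(), simulated_diffs.min(), simulated_diffs.max())
```

Because both samples come from equally effective populations, the simulated differences take both negative and positive values and cluster around 0 rather than around the observed −8%.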