Statistical Significance
Statistical significance is a crucial concept in hypothesis testing: it is used to determine whether the result of an experiment or study is likely due to something other than pure chance. When we perform a hypothesis test, as in the given exercise where we compare the mean time Canadians spend online to a certain value, we aim to find out whether the observed difference is statistically significant. In practice, this means we use a predetermined significance level, commonly set at 0.05, as the threshold for deciding when a result counts as significant. If the calculated p-value is less than this threshold, we consider the findings statistically significant and reject the null hypothesis, since the observed sample then provides enough evidence to support the alternative hypothesis.
In the exercise, we compare the p-values in both scenarios (part a and part b) to the 0.05 significance level to conclude whether or not there is convincing evidence to support the claim that Canadians spend more than 12.5 hours online in a typical week.
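This decision rule can be sketched in a few lines of Python, using the p-values reported later in this summary (0.3446 for part a and 0.0228 for part b):

```python
# Decision rule: reject the null hypothesis when the p-value falls
# below the chosen significance level (alpha).
alpha = 0.05

# p-values from the two scenarios of the exercise (see the P-value section)
p_values = {"part a (sigma = 5 hours)": 0.3446,
            "part b (sigma = 2 hours)": 0.0228}

for part, p in p_values.items():
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{part}: p = {p:.4f} -> {decision}")
```

Only part (b) clears the threshold, which matches the exercise's conclusions.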
Null Hypothesis
The null hypothesis, often denoted as \(H_0\), is a statement of no effect or no difference, and it serves as the starting point for statistical significance testing. It's the hypothesis that we presume to be true until evidence suggests otherwise. In the context of the exercise, the null hypothesis posits that Canadian Internet users spend exactly 12.5 hours online each week on average. The null hypothesis is what we test against the alternative hypothesis using the data from our sample.
It is important not to confuse the null hypothesis with the claim we are trying to prove; instead, it acts as the claim we are trying to find evidence against. If we have sufficient evidence, we reject the null hypothesis in favor of the alternative hypothesis. Otherwise, we 'fail to reject' the null hypothesis, which is a cautious way of saying that the data do not provide enough evidence against it, not that we have proven it true.
Alternative Hypothesis
Contrasting the null hypothesis is the alternative hypothesis, denoted as \(H_1\) or \(H_a\), which represents the outcome we suspect might be true and are trying to find evidence for. In the textbook exercise, the alternative hypothesis is that the mean time Canadians spend online per week exceeds 12.5 hours. It's what we think might be the case—we believe that Canadians might actually be spending more time online than the status quo of 12.5 hours.
The alternative can be directional (specifying 'more than' or 'less than') or non-directional (simply stating 'not equal to'). The direction of the hypothesis is crucial as it affects the type of statistical test we use—namely, whether we opt for a one-tailed or two-tailed test.
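To illustrate how direction affects the result, here is a minimal Python sketch using the part (b) test statistic (z = 2.0); the two-tailed value is shown for comparison only and is not part of the exercise:

```python
import math

def upper_tail_p(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = 2.0  # test statistic from part (b) of the exercise

p_one_tailed = upper_tail_p(z)           # H_a: mu > 12.5 (directional)
p_two_tailed = 2 * upper_tail_p(abs(z))  # H_a: mu != 12.5 (non-directional)

print(round(p_one_tailed, 4))  # 0.0228
print(round(p_two_tailed, 4))  # 0.0455
```

The same z-score yields twice the p-value under a two-tailed test, so choosing the direction of the alternative before seeing the data matters.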
Standard Deviation
Standard deviation is a measure of the amount of variation or dispersion in a set of values. A low standard deviation indicates that the values lie close to the mean of the data set, while a high standard deviation indicates that they are spread over a wider range. It is essential in hypothesis testing because it describes the spread of the data around the mean and enters the test statistic calculation, which in turn affects the p-value.
In the solved example, part (a) used a standard deviation of 5 hours, whereas part (b) used 2 hours, which significantly impacted the z-score and p-value. The reason the null hypothesis was rejected in part (b) but not in part (a) can be traced back to this difference: a smaller standard deviation means a smaller standard error, so the same difference between the sample mean and the hypothesized mean produces a larger test statistic and a smaller p-value.
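This effect can be seen in a short Python sketch; note that the sample mean (13.0 hours) and sample size (n = 100) below are hypothetical illustration values, not the exercise's actual data:

```python
import math

def z_statistic(xbar, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))"""
    return (xbar - mu0) / (sigma / math.sqrt(n))

# Hypothetical numbers: same sample mean and sample size,
# two different population standard deviations.
print(z_statistic(13.0, 12.5, 5.0, 100))  # 1.0
print(z_statistic(13.0, 12.5, 2.0, 100))  # 2.5 (smaller sigma -> larger z)
```

Holding everything else fixed, shrinking sigma shrinks the denominator and inflates the z-score, pushing the result toward significance.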
Test Statistic
The test statistic, in simple terms, is a standardized value that is calculated from sample data during a hypothesis test. It's essential for making a decision regarding the null hypothesis. The idea is to see how far or how extreme the sample statistic is from the hypothesized parameter (mean, proportion, etc.) under the assumption that the null hypothesis is true. We use this test statistic to compute the p-value.
In our example, the test statistic is the z-score, representing the number of standard deviations the sample mean lies from the hypothesized mean. For standard deviations of 5 and 2 hours, we calculated z-scores of 0.40 and 2.0, respectively. These scores play a pivotal role in determining whether our findings are statistically significant, through the corresponding p-values.
P-value
The p-value is the probability of observing data as extreme as, or more extreme than, what we actually observed, assuming that the null hypothesis is true. A small p-value, typically less than the chosen significance level (0.05 is common), suggests that the observed data would be unlikely if the null hypothesis were true, leading us to reject the null hypothesis.
In the example provided, when the standard deviation was 5 hours, we found a p-value of 0.3446, which was higher than the 0.05 threshold, indicating that there was not enough evidence to reject the null hypothesis. Conversely, when the standard deviation was 2 hours, the p-value dropped to 0.0228, falling below the significance level and leading us to reject the null hypothesis.
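The p-values quoted above can be reproduced from the z-scores (0.40 and 2.0) with a short Python sketch, using only the standard library:

```python
import math

def p_value_upper(z):
    """One-sided p-value P(Z > z) for a standard normal test statistic."""
    return 0.5 * math.erfc(z / math.sqrt(2))

print(round(p_value_upper(0.40), 4))  # 0.3446 (part a: sigma = 5 hours)
print(round(p_value_upper(2.0), 4))   # 0.0228 (part b: sigma = 2 hours)
```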
One-tailed Test
A one-tailed test is a statistical test in which the critical region of the distribution lies entirely on one side, so the test checks for an effect in one direction and ignores the possibility of an effect in the other direction. This is appropriate when the research hypothesis makes a statement about a direction (e.g., 'greater than' or 'less than'). In our exercise, we used a one-tailed test because our alternative hypothesis was directional: it claimed that Canadians spend more time online than a specific value (12.5 hours). Therefore, we are only interested in the area of the distribution that represents values greater than the mean under the null hypothesis.
A one-tailed test is more powerful than a two-tailed test for detecting an effect in the specified direction because the entire significance level is placed in one tail of the distribution, which lowers the critical value the test statistic must exceed.
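This can be made concrete by comparing critical values at the 0.05 level, sketched here with the standard library's NormalDist:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
alpha = 0.05

# Critical z-value beyond which we reject H0:
one_tailed_crit = std_normal.inv_cdf(1 - alpha)      # ~1.645
two_tailed_crit = std_normal.inv_cdf(1 - alpha / 2)  # ~1.960

# A z-score of, say, 1.8 would be significant one-tailed but not two-tailed.
print(one_tailed_crit < 1.8 < two_tailed_crit)  # True
```

Because the one-tailed cutoff (about 1.645) is lower than the two-tailed cutoff (about 1.960), moderately large z-scores in the hypothesized direction reach significance under the one-tailed test first.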