Problem 25
Height and reading. A researcher studies children in elementary school and finds a strong positive linear association between height and reading scores. a) Does this mean that taller children are generally better readers? b) What might explain the strong correlation?
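One way to explore part b) is a small simulation in which a third variable drives both measurements. The sketch below is illustrative only: the ages, the height and reading formulas, and the noise levels are all invented assumptions, not data from the study. It compares the correlation across all simulated children with the correlation inside a narrow age band.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: age (in years) influences both height and reading score.
ages = [rng.uniform(6, 11) for _ in range(2000)]
heights = [80 + 6 * a + rng.gauss(0, 4) for a in ages]  # cm, made-up formula
reading = [10 * a + rng.gauss(0, 8) for a in ages]      # score, made-up formula

# Correlation across all ages combined.
r_all = pearson_r(heights, reading)

# Correlation among children of nearly the same age.
band = [(h, s) for a, h, s in zip(ages, heights, reading) if 8.0 <= a < 8.5]
r_band = pearson_r([h for h, _ in band], [s for _, s in band])

print(f"all ages: r = {r_all:.2f}")
print(f"ages 8.0 to 8.5 only: r = {r_band:.2f}")
```

In this invented model height has no direct effect on reading, so comparing the two printed values shows how much of the overall association is carried by age alone.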
Problem 26
Cellular telephones and life expectancy. A survey of the world's nations in 2004 shows a strong positive correlation between percentage of the country using cell phones and life expectancy in years at birth. a) Does this mean that cell phones are good for your health? b) What might explain the strong correlation?
Problem 27
Correlation conclusions I. The correlation between Age and Income as measured on 100 people is \(r=0.75\). Explain whether or not each of these possible conclusions is justified: a) When Age increases, Income increases as well. b) The form of the relationship between Age and Income is straight. c) There are no outliers in the scatterplot of Income vs. Age. d) Whether we measure Age in years or months, the correlation will still be \(0.75\).
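The claim in part d) can be checked numerically. The following is a minimal sketch with six invented (Age, Income) pairs, which are not the 100 people from the problem; it computes the correlation once with Age in years and once with Age in months.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data for illustration only.
ages_years = [23, 31, 38, 44, 52, 60]
incomes = [28, 41, 39, 58, 62, 55]  # thousands of dollars

# Same ages expressed in months: a change of units, nothing else.
ages_months = [12 * a for a in ages_years]

r_years = pearson_r(ages_years, incomes)
r_months = pearson_r(ages_months, incomes)

print(r_years, r_months)
```

The same experiment applies to any rescaling of one variable by a positive constant, such as miles per gallon to kilometers per liter in Problem 28.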
Problem 28
Correlation conclusions II. The correlation between Fuel Efficiency (as measured by miles per gallon) and Price of 150 cars at a large dealership is \(r=-0.34\). Explain whether or not each of these possible conclusions is justified: a) The more you pay, the lower the fuel efficiency of your car will be. b) The form of the relationship between Fuel Efficiency and Price is moderately straight. c) There are several outliers that explain the low correlation. d) If we measure Fuel Efficiency in kilometers per liter instead of miles per gallon, the correlation will increase.
Problem 29
Baldness and heart disease. Medical researchers followed 1435 middle-aged men for a period of 5 years, measuring the amount of Baldness present (none \(=1\), little \(=2\), some \(=3\), much \(=4\), extreme \(=5\)) and presence of Heart Disease (No \(=0\), Yes \(=1\)). They found a correlation of \(0.089\) between the two variables. Comment on their conclusion that this shows that baldness is not a possible cause of heart disease.
Problem 30
Sample survey. A polling organization is checking its database to see if the two data sources it used sampled the same zip codes. The variable Datasource \(=1\) if the data source is MetroMedia, 2 if the data source is DataQwest, and 3 if it's RollingPoll. The organization finds that the correlation between five-digit zip code and Datasource is \(-0.0229\). It concludes that the correlation is low enough to state that there is no dependency between Zip Code and Source of Data. Comment.
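A quick experiment can help when thinking about this problem: assign the same three sources different numeric codes and recompute the correlation. The sketch below uses six invented zip codes and an arbitrary relabeling; none of the values come from the polling organization's database.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical records: a zip code and the code of the source it came from.
zip_codes = [10001, 30301, 60601, 73301, 94101, 98101]
source_a = [1, 2, 3, 1, 2, 3]  # coding as in the problem statement

# Same sources, different (equally arbitrary) numeric labels.
relabel = {1: 3, 2: 1, 3: 2}
source_b = [relabel[s] for s in source_a]

r_a = pearson_r(zip_codes, source_a)
r_b = pearson_r(zip_codes, source_b)

print(f"original coding:  r = {r_a:.3f}")
print(f"relabeled coding: r = {r_b:.3f}")
```

With these made-up records the two codings give correlations of opposite sign, even though nothing about the records themselves has changed.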
Problem 31
Income and housing. The Office of Federal Housing Enterprise Oversight (www.ofheo.gov) collects data on various aspects of housing costs around the United States. Here is a scatterplot of the Housing Cost Index versus the Median Family Income for each of the 50 states. The correlation is \(0.65\). a) Describe the relationship between the Housing Cost Index and the Median Family Income by state. b) If we standardized both variables, what would the correlation coefficient between the standardized variables be? c) If we had measured Median Family Income in thousands of dollars instead of dollars, how would the correlation change? d) Washington, DC, has a Housing Cost Index of 548 and a median income of about \(\$45,000\). If we were to include DC in the data set, how would that affect the correlation coefficient? e) Do these data provide proof that by raising the median income in a state, the Housing Cost Index will rise as a result? Explain.
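For part d), it can help to see how a single unusual point changes a correlation. The sketch below uses seven invented (income, index) pairs, not the OFHEO state data, and then appends one point with the DC values quoted in the problem (income of about 45 thousand dollars, index 548).

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical "states": median income (thousands of dollars) and cost index.
incomes = [40, 45, 50, 55, 60, 65, 70]
index = [250, 300, 280, 340, 360, 350, 420]

r_before = pearson_r(incomes, index)

# Add one point with the DC values from the problem statement.
r_after = pearson_r(incomes + [45], index + [548])

print(f"without the extra point: r = {r_before:.2f}")
print(f"with the extra point:    r = {r_after:.2f}")
```

Because the other seven points are made up, only the direction of the change is informative here, not the size.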
Problem 32
Interest rates and mortgages. Since 1980, average mortgage interest rates have fluctuated from a low of under \(6\%\) to a high of over \(14\%\). Is there a relationship between the amount of money people borrow and the interest rate that's offered? Here is a scatterplot of Total Mortgages in the United States (in millions of 2005 dollars) versus Interest Rate at various times over the past 26 years. The correlation is \(-0.84\). a) Describe the relationship between Total Mortgages and Interest Rate. b) If we standardized both variables, what would the correlation coefficient between the standardized variables be? c) If we were to measure Total Mortgages in thousands of dollars instead of millions of dollars, how would the correlation coefficient change? d) Suppose in another year, interest rates were \(11\%\) and mortgages totaled \(\$250\) million. How would including that year with these data affect the correlation coefficient? e) Do these data provide proof that if mortgage rates are lowered, people will take out more mortgages? Explain.
Problem 37
Attendance 2006. American League baseball games are played under the designated hitter rule, meaning that pitchers, often weak hitters, do not come to bat. Baseball owners believe that the designated hitter rule means more runs scored, which in turn means higher attendance. Is there evidence that more fans attend games if the teams score more runs? Data collected from American League games during the 2006 season indicate a correlation of \(0.667\) between runs scored and the number of people at the game. (http://mlb.mlb.com) a) Does the scatterplot indicate that it's appropriate to calculate a correlation? Explain. b) Describe the association between attendance and runs scored. c) Does this association prove that the owners are right that more fans will come to games if the teams score more runs?