Problem 15
A random sample of records of sales of homes from Feb. 15 to Apr. 30, 1993, from the files maintained by the Albuquerque Board of Realtors gives the Price and Size (in square feet) of 117 homes. A regression to predict Price (in thousands of dollars) from Size has an \(R\)-squared of \(71.4\%\). The residuals plot indicated that a linear model is appropriate. a) What are the variables and units in this regression? b) What units does the slope have? c) Do you think the slope is positive or negative? Explain.
Problem 23
A Biology student who created a regression model to use a bird's Height when perched for predicting its Wingspan made these two statements. Assuming the calculations were done correctly, explain what is wrong with each interpretation. a) My \(R^{2}\) of \(93\%\) shows that this linear model is appropriate. b) A bird 10 inches tall will have a wingspan of 17 inches.
Problem 24
A Sociology student investigated the association between a country's Literacy Rate and Life Expectancy, then drew the conclusions listed below. Explain why each statement is incorrect. (Assume that all the calculations were done properly.) a) The Literacy Rate determines \(64 \%\) of the Life Expectancy for a country. b) The slope of the line shows that an increase of \(5 \%\) in Literacy Rate will produce a 2-year improvement in Life Expectancy.
Problem 26
Players in any sport who are having great seasons, turning in performances that are much better than anyone might have anticipated, often are pictured on the cover of Sports Illustrated. Frequently, their performances then falter somewhat, leading some athletes to believe in a "Sports Illustrated jinx." Similarly, it is common for phenomenal rookies to have less stellar second seasons - the so-called "sophomore slump." While fans, athletes, and analysts have proposed many theories about what leads to such declines, a statistician might offer a simpler (statistical) explanation. Explain.
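The statistical idea behind this problem can be explored with a small simulation. The sketch below is only an illustration under an assumed model (not part of the problem): each player's observed season is a stable skill plus independent luck, both drawn from a standard normal distribution. The function name, the number of players, and the 5% cutoff for "cover athletes" are all hypothetical choices.

```python
import random

def simulate_two_seasons(n_players=10_000, top_fraction=0.05, seed=0):
    """Compare the best first-season performers with their own second season."""
    rng = random.Random(seed)
    # Assumed model: observed performance = stable skill + independent luck.
    skill = [rng.gauss(0, 1) for _ in range(n_players)]
    season1 = [s + rng.gauss(0, 1) for s in skill]
    season2 = [s + rng.gauss(0, 1) for s in skill]
    # "Cover athletes": the players with the best first seasons.
    n_top = int(n_players * top_fraction)
    top = sorted(range(n_players), key=lambda i: season1[i], reverse=True)[:n_top]
    mean1 = sum(season1[i] for i in top) / n_top
    mean2 = sum(season2[i] for i in top) / n_top
    return mean1, mean2

mean1, mean2 = simulate_two_seasons()
print(f"top group, season 1: {mean1:.2f}; season 2: {mean2:.2f}")
```

Under this assumed model the selected group still performs above average in the second season, but less extremely than in the first, even though no player's skill changed.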
Problem 33
We learned that the Office of Federal Housing Enterprise Oversight (OFHEO) collects data on various aspects of housing costs around the United States. Here's a scatterplot (by state) of the Housing Cost Index (HCI) versus the Median Family Income (MFI) for the 50 states. The correlation is \(r=0.65\). The mean HCI is \(338.2\), with a standard deviation of \(116.55\). The mean MFI is \(\$46,234\), with a standard deviation of \(\$7072.47\). a) Is a regression analysis appropriate? Explain. b) What is the equation that predicts Housing Cost Index from median family income? c) For a state with \(\mathrm{MFI}=\$44,993\), what would be the predicted HCI? d) Washington, DC, has an MFI of \(\$44,993\) and an HCI of \(548.02\). How far off is the prediction in b) from the actual HCI? e) If we standardized both variables, what would be the regression equation that predicts standardized HCI from standardized MFI? f) If we standardized both variables, what would be the regression equation that predicts standardized MFI from standardized HCI?
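Parts b) through d) rest on the standard relations slope \(= r s_y / s_x\) and intercept \(= \bar{y} - \text{slope} \cdot \bar{x}\). A minimal sketch of that arithmetic follows, assuming the linear model is judged appropriate in part a). The function name `line_from_summary` is hypothetical, and only the summary statistics quoted in the problem are used.

```python
def line_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Return (intercept, slope) of the least squares line for y on x."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Summary statistics quoted in the problem: x = MFI (dollars), y = HCI.
intercept, slope = line_from_summary(r=0.65, mean_x=46234, sd_x=7072.47,
                                     mean_y=338.2, sd_y=116.55)

# Prediction and residual for an MFI of $44,993 and an observed HCI of 548.02.
predicted = intercept + slope * 44993
residual = 548.02 - predicted

print(f"HCI-hat = {intercept:.2f} + {slope:.5f} * MFI")
print(f"predicted HCI: {predicted:.1f}, residual: {residual:.1f}")
```

The same helper applies to any of the problems in this set that give a correlation, two means, and two standard deviations. Swapping the roles of x and y gives the line for predicting in the other direction.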
Problem 34
We saw a plot of total mortgages in the United States (in millions of 2005 dollars) versus the interest rate at various times over the past 26 years. The correlation is \(r=-0.84\). The mean mortgage amount is \(\$151.9\) million and the mean interest rate is \(8.88\%\). The standard deviations are \(\$23.86\) million for mortgage amounts and \(2.58\%\) for the interest rates. a) Is a regression model appropriate for predicting mortgage amount from interest rates? Explain. b) What is the equation that predicts mortgage amount from interest rates? c) What would you predict the mortgage amount would be if the interest rates climbed to \(20\%\)? d) Do you have any reservations about your prediction in part c? e) If we standardized both variables, what would be the regression equation that predicts standardized mortgage amount from standardized interest rates? f) If we standardized both variables, what would be the regression equation that predicts standardized interest rates from standardized mortgage amount?
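As a hedged aside for parts c) and d), it can help to express the \(20\%\) interest rate in standard deviations from the mean, using only the summary statistics quoted above. The symbol \(z_{\text{rate}}\) is introduced here for illustration.

```latex
\[
z_{\text{rate}} = \frac{20 - 8.88}{2.58} \approx 4.31
\]
```

A rate more than four standard deviations above the mean probably lies outside the range of the observed data, so any prediction made there would be an extrapolation.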
Problem 37
The SAT is a test often used as part of an application to college. SAT scores are between 200 and 800, but have no units. Tests are given in both Math and Verbal areas. Doing the SAT-Math problems also involves the ability to read and understand the questions, but can a person's verbal score be used to predict the math score? Verbal and math SAT scores of a high school graduating class are displayed in the scatterplot, with the regression line added. a) Describe the relationship. b) Are there any students whose scores do not seem to fit the overall pattern? c) For these data, \(r=0.685\). Interpret this statistic. d) These verbal scores averaged \(596.3\), with a standard deviation of \(99.5\), and the math scores averaged \(612.2\), with a standard deviation of \(96.1\). Write the equation of the regression line. e) Interpret the slope of this line. f) Predict the math score of a student with a verbal score of \(500\). g) Every year some student scores a perfect 1600. Based on this model, what would be that student's Math score residual?
Problem 38
Colleges use SAT scores in the admissions process because they believe these scores provide some insight into how a high school student will perform at the college level. Suppose the entering freshmen at a certain college have mean combined SAT scores of 1833, with a standard deviation of 123. In the first semester these students attained a mean GPA of \(2.66\), with a standard deviation of \(0.56\). A scatterplot showed the association to be reasonably linear, and the correlation between \(SAT\) score and \(GPA\) was \(0.47\). a) Write the equation of the regression line. b) Explain what the \(y\)-intercept of the regression line indicates. c) Interpret the slope of the regression line. d) Predict the GPA of a freshman who scored a combined \(2100\). e) Based upon these statistics, how effective do you think SAT scores would be in predicting academic success during the first semester of the freshman year at this college? Explain. f) As a student, would you rather have a positive or a negative residual in this context? Explain.
Problem 53
Consider the four points \((10,10)\), \((20,50)\), \((40,20)\), and \((50,80)\). The least squares line is \(\hat{y}=7.0+1.1x\). Explain what "least squares" means, using these data as a specific example.
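One way to make the phrase concrete is to compute the sum of squared residuals for the stated line and compare it with a few other candidate lines. The sketch below does that for the four points given. The alternative lines are hypothetical, picked only for comparison, and the function name is an assumption.

```python
points = [(10, 10), (20, 50), (40, 20), (50, 80)]

def sum_squared_residuals(intercept, slope, data):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in data)

# The line stated in the problem: y-hat = 7.0 + 1.1 x.
stated = sum_squared_residuals(7.0, 1.1, points)

# Hypothetical competing lines (intercept, slope), chosen only for comparison.
alternatives = [(7.0, 1.0), (10.0, 1.1), (0.0, 1.3)]
for b0, b1 in alternatives:
    print(f"intercept {b0}, slope {b1}: {sum_squared_residuals(b0, b1, points):.1f}")
print(f"stated line: {stated:.1f}")
```

Each alternative gives a larger total than the stated line. Checking a handful of lines is not a proof, but it illustrates the criterion the least squares line satisfies.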
Problem 54
Consider the four points \((200,1950)\), \((400,1650)\), \((600,1800)\), and \((800,1600)\). The least squares line is \(\hat{y}=1975-0.45x\). Explain what "least squares" means, using these data as a specific example.
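The stated line can also be recovered directly from the closed-form least squares formulas, slope \(= \sum (x-\bar{x})(y-\bar{y}) / \sum (x-\bar{x})^2\) and intercept \(= \bar{y} - \text{slope} \cdot \bar{x}\). The following is a minimal sketch using those standard formulas on the four points given. The function name `fit_least_squares` is hypothetical.

```python
def fit_least_squares(data):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

points = [(200, 1950), (400, 1650), (600, 1800), (800, 1600)]
intercept, slope = fit_least_squares(points)

# Residual = observed y minus the value the fitted line predicts.
residuals = [y - (intercept + slope * x) for x, y in points]
print(intercept, slope, residuals)
```

The residuals come out both positive and negative and sum to zero (up to rounding). It is the sum of their squares that the fitted line makes as small as possible.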