Week 7 (Mar 1-3)
On Monday this week we will review the Market Models case study from last time, and begin a new demand modeling case study. On Wednesday we’ll discuss the demand modeling case study and review the concept of sampling distributions.
In-class activities, case studies, exercises, etc for this week
We will work on these in breakout rooms. I suggest having one person share their screen and serve as note-taker, or quickly sharing a Google doc so you can work together. Some of the activities will be a component of the homework you hand in next week.
Activity 1 (Monday): Modeling the Demand for Milk
Activity 2 (Wednesday): College Scorecard
Other materials used in class
Readings, videos, and activities to do for next week
For Monday:
- Read Dummy variables and The bootstrap
- Watch: Bootstrap overview
- Complete the following walkthrough: Creatinine
Exercises and other things due next week
These will be due on Wednesday 3/10 as usual, but note that this due date will overlap with the time your midterm exam is available. So you may want to complete it by Tuesday, but I’ll leave it up to you.
(Optional) Read Interpreting coefficients in log-log regression models if you want to see why the slope is approximately the percentage change in the predicted value for y for a one percent increase in x.
Watch this video about interpreting models with a log-transformed response, to help with problem 3 below.
Complete Activity 1 from weeks 6 and 7 and turn them in. You don’t need to turn anything in for Activity 2 this week. Complete the following exercises:
Problem 1
Use the following commands in R to load the USArrests dataset from the MASS package and read the dataset’s documentation:
library(MASS)
data(USArrests)
?USArrests
In this problem you will compute estimates of regression coefficients in R.
We’re interested in the relationship between murder and assault rates.
i) Produce a scatterplot of the two variables with the Murder rate on the y-axis. Do you think a simple linear regression with the assault rate as the dependent variable (x) and murder rate as the independent variable (y) is appropriate? Why or why not?
ii) Fit a simple linear regression of the Murder rate against the assault rate. Interpret the estimated intercept and slope of the regression model. Are both statistics meaningful here? Make sure that you use the actual units of measurement as given in the documentation of the dataset.
iii) Based on this analysis, do you think high assualt rates cause high murder rates in U.S. cities?
Problem 2
Revisit the Austin food critics dataset. Fit a multiplicative model relating Price to the food quality score by log-transforming the Price variable.
i) Compare the residual plots for this new fit to the residual plots for the linear fit we computed in class. How do they differ? Which model seems to do a better job taking the “X-ness” out of Y?
ii) Plot the fitted values from your multiplicative model (using the original scale for Price). Visually, does this model seem to fit the data better than a linear model?
iii) Interpret the slope of your multiplicative model. Give your interpretation on the original scale of $y$ (Price).
iv) When we first examined this dataset, I asked you to decide whether the FeelScore or FoodScore was a better predictor of average meal price. How would you use $R^2$ to address this question? Compare these two predictors using a multiplicative models as in parts i - iii above. Which is the better predictor of average meal price?
Problem 3
Revisit the milk case study and estimate the optimal price of milk when your unit cost is $1.40, adapting the code above for optimizing functions.