# Econ 3389 Machine Learning

Econ 3389 Machine Learning: Homework 4
1 Programming Practice
Load German Credit Score data Before beginning these exercises, down-
load the credit score csv le and load the data set using the following code.
X = data.values[:,[1,4,12]]
X = X.astype(‘float’)
Y = data.values[:, 20]
Y = Y-1
Y = Y.astype(‘float’)
This data sets has a lot of predictive variables, and we will deal with that
later. For now, our X will include three variables in the following order:
the duration of the loan in years, the amount of the loan, and the age of
the person applying for a loan. The Y variable is whether the loan was

1. Using sklearn.linear model.LinearRegression.fit as in class, t a
linear model to this data set.
2. Imagine you are an 18 year old who desires to take a 30 year loan of
100,000 to buy a house. Use sklearn.linear model.LinearRegression.predict
to predict the probability that you will pay the loan back.
1
ECON3389 – Big Data – Fall 2018 2
? Use scikit learn.cross validation.train test split to divide
X and Y into a training and a test test. The training set should be
contain 70% of the observations.
? Use your sklearn.linear model.LinearRegression.fit model
to t a model on your train set.
? Use your sklearn.linear model.LinearRegression.predict model
to predict values in the train and test set.
? Use ols mse from your previous HW to nd out test and training
set MSE, and print out the values of both.
4. Using sklearn.linear model.LogisticRegression.fit as in class, t
a logistic regression model to this data set.
5. Imagine you are an 18 year old who desires to take a 30 year loan of
100,000 to buy a house. Use sklearn.linear model.LogisticRegression.predict proba
to predict the probability that you will pay the loan back.
? Use scikit learn.cross validation.train test split to divide
X and Y into a training and a test test. The training set should be
contain 70% of the observations.
? Use your sklearn.linear model.LogisticRegression.fit model
to t a model on your train set.
? Use your sklearn.linear model.LogisticRegression.predict proba
model to predict values in the train and test set.
? Use ols mse from your previous HW to nd out test and training
set MSE, and print out the values of both.
2 Theory