{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h1>Programming Exercise 2: Logistic Regression</h1>\n",
|
|
"\n",
|
|
"<h3>Introduction</h3>\n",
|
|
"\n",
"In this exercise we will implement logistic regression and apply it to two different datasets.\n",
|
|
"\n",
|
|
"<h3>Files Included in this exercise</h3>\n",
|
|
"\n",
|
|
"- ex2data1.txt\n",
|
|
"- ex2data2.txt\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h3>1 Logistic Regression</h3>\n",
"Here we will build a logistic regression model to predict whether a student gets admitted into a university given the results of two exams. Each training example consists of an applicant's scores on the two exams and the admissions decision."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# used for manipulating directory paths\n",
|
|
"import os\n",
|
|
"\n",
|
|
"# Scientific and vector computation for python\n",
|
|
"import numpy as np\n",
|
|
"\n",
|
|
"# Plotting library\n",
|
|
"from matplotlib import pyplot as plt\n",
|
|
"\n",
|
|
"# Optimization module in scipy\n",
|
|
"from scipy import optimize\n",
|
|
"\n",
|
|
"# tells matplotlib to embed plots within the notebook\n",
|
|
"%matplotlib inline"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"<h4>1.1 Visualizing the data</h4>\n",
"\n",
"Before implementing the algorithm, we load and visualize the data."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Load data\n",
"# The first two columns contain the exam scores and the third column\n",
|
|
"# contains the label.\n",
|
|
"data = np.loadtxt(os.path.join('Data', 'ex2data1.txt'), delimiter=',')\n",
|
|
"X, y = data[:, 0:2], data[:, 2]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"def plotData(X, y):\n",
"    # New figure\n",
"    fig = plt.figure()\n",
"\n",
"    # Find indices of positive and negative examples,\n",
"    # then plot the two groups separately (do not plot everything first and label afterwards)\n",
"    pos = y == 1\n",
"    neg = y == 0\n",
"\n",
"    plt.plot(X[pos, 0], X[pos, 1], 'k*', lw=2, ms=7)\n",
"    plt.plot(X[neg, 0], X[neg, 1], 'yo', mec='k', ms=7)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"image/png": "\n",
|
|
"text/plain": [
|
|
"<Figure size 432x288 with 1 Axes>"
|
|
]
|
|
},
|
|
"metadata": {
|
|
"needs_background": "light"
|
|
},
|
|
"output_type": "display_data"
|
|
}
|
|
],
|
|
"source": [
|
|
"plotData(X,y)\n",
|
|
"plt.xlabel(\"Exam 1 score\")\n",
|
|
"plt.ylabel(\"Exam 2 score\")\n",
|
|
"plt.legend([\"Admitted\", \"Not admitted\"])\n",
|
|
"pass"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"<h4>1.2 Implementation</h4>\n",
"First we construct the sigmoid function $g$, which defines the logistic regression hypothesis:\n",
"$$h_\\theta(x) = g(\\theta^Tx), \\qquad g(z) = \\dfrac{1}{1+e^{-z}}$$"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"def sigmoid(z):\n",
"    \"\"\"\n",
"    Compute sigmoid function given the input z.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    z : array_like\n",
"        The input to the sigmoid function. This can be a 1-D vector\n",
"        or a 2-D matrix.\n",
"\n",
"    Returns\n",
"    -------\n",
"    g : array_like\n",
"        The computed sigmoid function. g has the same shape as z, since\n",
"        the sigmoid is computed element-wise on z.\n",
"    \"\"\"\n",
"    # convert input to a numpy array\n",
"    z = np.array(z)\n",
"\n",
"    # element-wise g(z) = 1 / (1 + exp(-z))\n",
"    g = 1 / (1 + np.exp(-z))\n",
"\n",
"    return g"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Sigmoid of 0 is 0.5\n",
|
|
"Sigmoid of 100 is 1.0\n",
|
|
"Sigmoid of -100 is 3.7200759760208356e-44\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Check a few values of sigmoid\n",
|
|
"print(\"Sigmoid of 0 is \",sigmoid(0))\n",
|
|
"print(\"Sigmoid of 100 is \",sigmoid(100))\n",
|
|
"print(\"Sigmoid of -100 is \",sigmoid(-100))\n",
|
|
"\n",
|
|
"# sigmoid of 0 should be exactly 0.5\n",
|
|
"# sigmoid of large positive numbers should be close to 1\n",
|
|
"# sigmoid of large negative numbers should be close to 0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"With a working sigmoid function, we can now implement a cost function that returns both the cost and its gradient. The cost is defined as:\n",
|
|
"\n",
|
|
"$$\\begin{align}\n",
|
|
"J(\\theta) & = \\dfrac{1}{m} \\sum_{i=1}^m \\mathrm{Cost}(h_\\theta(x^{(i)}),y^{(i)}) \\\\\n",
|
|
"& = - \\dfrac{1}{m} [\\sum_{i=1}^{m} y^{(i)} \\log(h_\\theta(x^{(i)})) + (1 - y^{(i)}) \\log(1-h_\\theta(x^{(i)}))] \\\\\n",
|
|
"\\end{align}$$\n",
|
|
"\n",
"and its partial derivative with respect to $\\theta_j$ is:\n",
|
|
"\n",
|
|
"$$\\frac{\\partial}{\\partial \\theta_j} J(\\theta) = \\dfrac{1}{m} \\sum_{i=1}^{m} (h_\\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$$"
|
|
]
|
|
},
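{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of how the two formulas above translate into array operations (the matrix notation here is an assumption for illustration, not part of the original exercise text): stack the examples as the rows of an $m \\times (n+1)$ matrix $X$, let $y$ be the vector of labels, and let $h = g(X\\theta)$ be the vector of predictions, with the logarithm applied element-wise. Then\n",
"\n",
"$$\\begin{align}\n",
"J(\\theta) & = - \\dfrac{1}{m} \\left[ y^T \\log(h) + (1 - y)^T \\log(1 - h) \\right] \\\\\n",
"\\nabla_\\theta J(\\theta) & = \\dfrac{1}{m} X^T (h - y)\n",
"\\end{align}$$\n",
"\n",
"Component $j$ of the second line is $\\frac{1}{m} \\sum_{i=1}^{m} (h_\\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$, which matches the derivative above."
]
},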
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Setup the data matrix appropriately, and add ones for the intercept term\n",
|
|
"m, n = X.shape\n",
|
|
"\n",
|
|
"# Add intercept term to X\n",
|
|
"X = np.concatenate([np.ones((m, 1)), X], axis=1)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"def costFunction(theta, X, y):\n",
"    \"\"\"\n",
"    Compute cost and gradient for logistic regression.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    theta : array_like\n",
"        The parameters for logistic regression. This is a vector\n",
"        of shape (n+1, ).\n",
"\n",
"    X : array_like\n",
"        The input dataset of shape (m x n+1) where m is the total number\n",
"        of data points and n is the number of features. We assume the\n",
"        intercept has already been added to the input.\n",
"\n",
"    y : array_like\n",
"        Labels for the input. This is a vector of shape (m, ).\n",
"\n",
"    Returns\n",
"    -------\n",
"    J : float\n",
"        The computed value for the cost function.\n",
"\n",
"    grad : array_like\n",
"        A vector of shape (n+1, ) which is the gradient of the cost\n",
"        function with respect to theta, at the current values of theta.\n",
"    \"\"\"\n",
"    m = y.size  # number of training examples\n",
"\n",
"    # Hypothesis h = g(X theta) for every training example\n",
"    h = sigmoid(X.dot(theta))\n",
"\n",
"    # Cost: J = -(1/m) * [y . log(h) + (1 - y) . log(1 - h)]\n",
"    J = -(y.dot(np.log(h)) + (1 - y).dot(np.log(1 - h))) / m\n",
"\n",
"    # Gradient: (1/m) * X^T (h - y)\n",
"    grad = X.T.dot(h - y) / m\n",
"\n",
"    return J, grad"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"We now test our cost function at two values of theta: all zeros and a hand-picked non-zero vector."
|
|
]
|
|
},
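{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick sanity check for the all-zeros case, worked by hand from the cost definition: when $\\theta = 0$ we have $\\theta^Tx = 0$ for every example, so $h_\\theta(x) = g(0) = 0.5$ and\n",
"\n",
"$$J(0) = - \\dfrac{1}{m} \\sum_{i=1}^{m} \\left[ y^{(i)} \\log(0.5) + (1 - y^{(i)}) \\log(0.5) \\right] = -\\log(0.5) = \\log 2 \\approx 0.693$$\n",
"\n",
"This value does not depend on the data, which is why 0.693 is the expected cost at the initial theta."
]
},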
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Cost at initial theta (zeros): 0.693\n",
|
|
"Expected cost (approx): 0.693\n",
|
|
"\n",
|
|
"Gradient at initial theta (zeros):\n",
|
|
"\t[-0.1000, -12.0092, -11.2628]\n",
|
|
"Expected gradients (approx):\n",
|
|
"\t[-0.1000, -12.0092, -11.2628]\n",
|
|
"\n",
|
|
"Cost at test theta: 0.218\n",
|
|
"Expected cost (approx): 0.218\n",
|
|
"\n",
|
|
"Gradient at test theta:\n",
|
|
"\t[0.043, 2.566, 2.647]\n",
|
|
"Expected gradients (approx):\n",
|
|
"\t[0.043, 2.566, 2.647]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Initialize fitting parameters\n",
|
|
"initial_theta = np.zeros(n+1)\n",
|
|
"\n",
|
|
"cost, grad = costFunction(initial_theta, X, y)\n",
|
|
"\n",
|
|
"print('Cost at initial theta (zeros): {:.3f}'.format(cost))\n",
|
|
"print('Expected cost (approx): 0.693\\n')\n",
|
|
"\n",
|
|
"print('Gradient at initial theta (zeros):')\n",
|
|
"print('\\t[{:.4f}, {:.4f}, {:.4f}]'.format(*grad))\n",
|
|
"print('Expected gradients (approx):\\n\\t[-0.1000, -12.0092, -11.2628]\\n')\n",
|
|
"\n",
|
|
"# Compute and display cost and gradient with non-zero theta\n",
|
|
"test_theta = np.array([-24, 0.2, 0.2])\n",
|
|
"cost, grad = costFunction(test_theta, X, y)\n",
|
|
"\n",
|
|
"print('Cost at test theta: {:.3f}'.format(cost))\n",
|
|
"print('Expected cost (approx): 0.218\\n')\n",
|
|
"\n",
|
|
"print('Gradient at test theta:')\n",
|
|
"print('\\t[{:.3f}, {:.3f}, {:.3f}]'.format(*grad))\n",
|
|
"print('Expected gradients (approx):\\n\\t[0.043, 2.566, 2.647]')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"Now that we have a working cost function, we can fit the parameters. Instead of writing gradient descent by hand, we use SciPy's built-in optimizer, `scipy.optimize.minimize`.\n",
"\n",
"To use this function we need to pass in:\n",
"\n",
"- A function that, given theta and the training set (X, y), computes the logistic regression cost and its gradient with respect to theta\n",
"- The initial values of the parameters we are trying to optimize\n",
"- args: the extra arguments forwarded to the cost function, here the training set (X, y)\n",
"- jac: set to True to indicate that our function returns the Jacobian (gradient) along with the cost\n",
"- method: the optimization algorithm we would like to use\n",
"- options: options specific to the chosen algorithm (here, the maximum number of iterations)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Cost at theta found by optimize.minimize: 0.203\n",
|
|
"Expected cost (approx): 0.203\n",
|
|
"\n",
|
|
"theta:\n",
|
|
"\t[-25.161, 0.206, 0.201]\n",
|
|
"Expected theta (approx):\n",
|
|
"\t[-25.161, 0.206, 0.201]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# set options for optimize.minimize\n",
|
|
"options= {'maxiter': 400}\n",
|
|
"\n",
"# See the documentation for scipy's optimize.minimize for a description\n",
"# of the different parameters.\n",
"# The function returns an `OptimizeResult` object.\n",
"# We use the truncated Newton algorithm (TNC) for optimization, which plays\n",
"# a role similar to MATLAB's fminunc.\n",
"# See https://stackoverflow.com/questions/18801002/fminunc-alternate-in-numpy\n",
"res = optimize.minimize(costFunction,\n",
"                        initial_theta,\n",
"                        (X, y),\n",
"                        jac=True,\n",
"                        method='TNC',\n",
"                        options=options)\n",
|
|
"\n",
|
|
"# the fun property of `OptimizeResult` object returns\n",
|
|
"# the value of costFunction at optimized theta\n",
|
|
"cost = res.fun\n",
|
|
"\n",
|
|
"# the optimized theta is in the x property\n",
|
|
"theta = res.x\n",
|
|
"\n",
|
|
"# Print theta to screen\n",
|
|
"print('Cost at theta found by optimize.minimize: {:.3f}'.format(cost))\n",
"print('Expected cost (approx): 0.203\\n')\n",
|
|
"\n",
|
|
"print('theta:')\n",
|
|
"print('\\t[{:.3f}, {:.3f}, {:.3f}]'.format(*theta))\n",
|
|
"print('Expected theta (approx):\\n\\t[-25.161, 0.206, 0.201]')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Now that we have an optimal theta, we can use it to get a decision boundary."
|
|
]
|
|
},
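{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a short derivation of the straight line we are about to plot (assuming $\\theta_2 \\neq 0$): the model predicts 1 when $h_\\theta(x) \\geq 0.5$, and since $g(z) \\geq 0.5$ exactly when $z \\geq 0$, the boundary is the set of points where $\\theta^Tx = 0$. With two features this gives\n",
"\n",
"$$\\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 = 0 \\quad \\Longrightarrow \\quad x_2 = - \\dfrac{\\theta_0 + \\theta_1 x_1}{\\theta_2}$$\n",
"\n",
"so two values of $x_1$ are enough to draw the line."
]
},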
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"def mapFeature(X1, X2, degree=6):\n",
"    \"\"\"\n",
"    Maps the two input features to polynomial features used in the regularization exercise.\n",
"\n",
"    Returns a new feature array with more features, comprising\n",
"    1, X1, X2, X1.^2, X1*X2, X2.^2, X1.^3, etc., up to the given degree.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    X1 : array_like\n",
"        A vector of shape (m, ), containing one feature for all examples.\n",
"\n",
"    X2 : array_like\n",
"        A vector of shape (m, ), containing a second feature for all examples.\n",
"        Inputs X1, X2 must be the same size.\n",
"\n",
"    degree: int, optional\n",
"        The polynomial degree.\n",
"\n",
"    Returns\n",
"    -------\n",
"    : array_like\n",
"        A matrix of m rows, with the number of columns determined by the polynomial degree.\n",
"    \"\"\"\n",
"    if X1.ndim > 0:\n",
"        out = [np.ones(X1.shape[0])]\n",
"    else:\n",
"        # scalar inputs: use a plain scalar bias term so np.array(out) is a flat vector\n",
"        out = [1.0]\n",
"\n",
"    # For each degree i, append the i+1 terms X1^(i-j) * X2^j, j = 0..i\n",
"    for i in range(1, degree + 1):\n",
"        for j in range(i + 1):\n",
"            out.append((X1 ** (i - j)) * (X2 ** j))\n",
"\n",
"    if X1.ndim > 0:\n",
"        return np.stack(out, axis=1)\n",
"    else:\n",
"        return np.array(out)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 13,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"def plotDecisionBoundary(plotData, theta, X, y):\n",
"    \"\"\"\n",
"    Plots the data points X and y into a new figure with the decision boundary defined by theta.\n",
"    Plots the data points with * for the positive examples and o for the negative examples.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    plotData : func\n",
"        A function reference for plotting the X, y data.\n",
"\n",
"    theta : array_like\n",
"        Parameters for logistic regression. A vector of shape (n+1, ).\n",
"\n",
"    X : array_like\n",
"        The input dataset. X is assumed to be either:\n",
"            1) Mx3 matrix, where the first column is an all ones column for the intercept.\n",
"            2) MxN, N>3 matrix, where the first column is all ones.\n",
"\n",
"    y : array_like\n",
"        Vector of data labels of shape (m, ).\n",
"    \"\"\"\n",
"    # make sure theta is a numpy array\n",
"    theta = np.array(theta)\n",
"\n",
"    # Plot the data (the first column is the all-ones intercept column, so we skip it)\n",
"    plotData(X[:, 1:3], y)\n",
"\n",
"    if X.shape[1] <= 3:\n",
"        # Only need 2 points to define a line, so we choose the two endpoints\n",
"        plot_x = np.array([np.min(X[:, 1]) - 2, np.max(X[:, 1]) + 2])\n",
"\n",
"        # The decision boundary is the line theta0 + theta1*x1 + theta2*x2 = 0;\n",
"        # solving for x2 gives the y-coordinates to plot\n",
"        plot_y = (-1. / theta[2]) * (theta[1] * plot_x + theta[0])\n",
"\n",
"        # Plot and adjust axes\n",
"        plt.plot(plot_x, plot_y)\n",
"\n",
"        # Setup legend\n",
"        plt.legend(['Admitted', 'Not admitted', 'Decision Boundary'])\n",
"        plt.xlim([30, 100])\n",
"        plt.ylim([30, 100])\n",
"\n",
"    else:\n",
"        # Setup grid range\n",
"        u = np.linspace(-1, 1.5, 50)\n",
"        v = np.linspace(-1, 1.5, 50)\n",
"\n",
"        z = np.zeros((u.size, v.size))\n",
"        # Evaluate z = theta*x over the grid\n",
"        for i, ui in enumerate(u):\n",
"            for j, vj in enumerate(v):\n",
"                z[i, j] = np.dot(mapFeature(ui, vj), theta)\n",
"\n",
"        z = z.T  # important to transpose z before calling contour\n",
"        # Plot z = 0\n",
"        plt.contour(u, v, z, levels=[0], linewidths=2, colors='g')\n",
"        plt.contourf(u, v, z, levels=[np.min(z), 0, np.max(z)], cmap='Greens', alpha=0.4)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"image/png": "\n",
|
|
"text/plain": [
|
|
"<Figure size 432x288 with 1 Axes>"
|
|
]
|
|
},
|
|
"metadata": {
|
|
"needs_background": "light"
|
|
},
|
|
"output_type": "display_data"
|
|
}
|
|
],
|
|
"source": [
|
|
"plotDecisionBoundary(plotData, theta, X, y)\n",
|
|
"plt.xlabel(\"Exam 1 score\")\n",
|
|
"plt.ylabel(\"Exam 2 score\")\n",
|
|
"pass"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Now that we have a decision boundary we can create a function to predict whether a given student (a single sample of two exam scores) will be admitted."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"def predict(theta, X):\n",
"    \"\"\"\n",
"    Predict whether the label is 0 or 1 using learned logistic regression.\n",
"    Computes the predictions for X using a threshold at 0.5\n",
"    (i.e., if sigmoid(theta.T*x) >= 0.5, predict 1)\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    theta : array_like\n",
"        Parameters for logistic regression. A vector of shape (n+1, ).\n",
"\n",
"    X : array_like\n",
"        The data to use for computing predictions. The number of rows is the\n",
"        number of points to compute predictions for, and the number of columns\n",
"        is the number of features.\n",
"\n",
"    Returns\n",
"    -------\n",
"    p : array_like\n",
"        Predictions (0 or 1) for each row in X.\n",
"    \"\"\"\n",
"    # Predict 1 wherever the estimated probability is at least 0.5, else 0\n",
"    p = (sigmoid(X.dot(theta)) >= 0.5).astype(float)\n",
"\n",
"    return p"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 16,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"For a student with scores 45 and 85,we predict an admission probability of 0.776\n",
|
|
"Train Accuracy: 89.00 %\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Predict probability for a student with score 45 on exam 1 \n",
|
|
"# and score 85 on exam 2 \n",
|
|
"prob = sigmoid(np.dot([1, 45, 85], theta))\n",
|
|
"print('For a student with scores 45 and 85,'\n",
|
|
" 'we predict an admission probability of {:.3f}'.format(prob))\n",
|
|
"\n",
|
|
"# Compute accuracy on our training set\n",
|
|
"p = predict(theta, X)\n",
|
|
"print('Train Accuracy: {:.2f} %'.format(np.mean(p == y) * 100))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h3>2 Regularized Logistic Regression</h3>\n",
|
|
"\n",
|
|
"In this part of the exercise, we will implement regularized logistic regression to predict whether microchips from a fabrication plant pass quality assurance. \n",
|
|
"\n",
"First we visualize the data."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 17,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Load data\n",
"# The first two columns contain the microchip test scores and the third column\n",
|
|
"# contains the label.\n",
|
|
"data = np.loadtxt(os.path.join('Data', 'ex2data2.txt'), delimiter=',')\n",
|
|
"X, y = data[:, 0:2], data[:, 2]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 18,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"image/png": "\n",
|
|
"text/plain": [
|
|
"<Figure size 432x288 with 1 Axes>"
|
|
]
|
|
},
|
|
"metadata": {
|
|
"needs_background": "light"
|
|
},
|
|
"output_type": "display_data"
|
|
}
|
|
],
|
|
"source": [
|
|
"plotData(X,y)\n",
|
|
"plt.xlabel(\"Microchip Test 1\")\n",
|
|
"plt.ylabel(\"Microchip Test 2\")\n",
|
|
"plt.legend([\"y=1\", \"y=0\"])\n",
|
|
"pass"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"To obtain a more complex decision boundary, we now map the features into all polynomial terms of $x_1$ and $x_2$ up to the sixth power. This transforms our vector of two features into a vector of 28 features."
|
|
]
|
|
},
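{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see where the 28 comes from, here is a short count (a worked check, not part of the original exercise text): the terms of exact degree $i$ are $x_1^{i-j} x_2^{j}$ for $j = 0, \\dots, i$, which gives $i + 1$ terms. Summing over degrees 0 (the constant term) through 6:\n",
"\n",
"$$\\sum_{i=0}^{6} (i + 1) = 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28$$"
]
},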
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 19,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Note that mapFeature also adds a column of ones for us, so the intercept\n",
|
|
"# term is handled\n",
|
|
"X = mapFeature(X[:, 0], X[:, 1])"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"We can now compute the regularized cost function and gradient for our newly mapped features."
|
|
]
|
|
},
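{
"cell_type": "markdown",
"metadata": {},
"source": [
"The regularized objective is not written out in this notebook. As a sketch, assuming the standard L2 penalty that leaves the intercept $\\theta_0$ unregularized (which is what `costFunctionReg` computes), the cost is\n",
"\n",
"$$J(\\theta) = - \\dfrac{1}{m} \\left[ \\sum_{i=1}^{m} y^{(i)} \\log(h_\\theta(x^{(i)})) + (1 - y^{(i)}) \\log(1-h_\\theta(x^{(i)})) \\right] + \\dfrac{\\lambda}{2m} \\sum_{j \\geq 1} \\theta_j^2$$\n",
"\n",
"and the gradient is\n",
"\n",
"$$\\begin{align}\n",
"\\frac{\\partial}{\\partial \\theta_0} J(\\theta) & = \\dfrac{1}{m} \\sum_{i=1}^{m} (h_\\theta(x^{(i)}) - y^{(i)}) x_0^{(i)} \\\\\n",
"\\frac{\\partial}{\\partial \\theta_j} J(\\theta) & = \\dfrac{1}{m} \\sum_{i=1}^{m} (h_\\theta(x^{(i)}) - y^{(i)}) x_j^{(i)} + \\dfrac{\\lambda}{m} \\theta_j \\qquad \\text{for } j \\geq 1\n",
"\\end{align}$$"
]
},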
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 20,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"def costFunctionReg(theta, X, y, lambda_):\n",
"    \"\"\"\n",
"    Compute cost and gradient for logistic regression with regularization.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    theta : array_like\n",
"        Logistic regression parameters. A vector with shape (n, ). n is\n",
"        the number of features including any intercept. If we have mapped\n",
"        our initial features into polynomial features, then n is the total\n",
"        number of polynomial features.\n",
"\n",
"    X : array_like\n",
"        The data set with shape (m x n). m is the number of examples, and\n",
"        n is the number of features (after feature mapping).\n",
"\n",
"    y : array_like\n",
"        The data labels. A vector with shape (m, ).\n",
"\n",
"    lambda_ : float\n",
"        The regularization parameter.\n",
"\n",
"    Returns\n",
"    -------\n",
"    J : float\n",
"        The computed value for the regularized cost function.\n",
"\n",
"    grad : array_like\n",
"        A vector of shape (n, ) which is the gradient of the cost\n",
"        function with respect to theta, at the current values of theta.\n",
"    \"\"\"\n",
"    m = y.size  # number of training examples\n",
"\n",
"    h = sigmoid(X.dot(theta))\n",
"\n",
"    # Copy of theta with the intercept zeroed out, so theta[0] is not\n",
"    # regularized and the caller's theta is never modified in place\n",
"    theta_reg = np.array(theta, dtype=float)\n",
"    theta_reg[0] = 0\n",
"\n",
"    # Unregularized cost plus the penalty (lambda / 2m) * sum of theta_j^2 for j >= 1\n",
"    J = -(y.dot(np.log(h)) + (1 - y).dot(np.log(1 - h))) / m\n",
"    J = J + (lambda_ / (2 * m)) * np.sum(np.square(theta_reg))\n",
"\n",
"    # Unregularized gradient plus (lambda / m) * theta_j for j >= 1\n",
"    grad = X.T.dot(h - y) / m + (lambda_ / m) * theta_reg\n",
"\n",
"    return J, grad"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 21,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Cost at initial theta (zeros): 0.693\n",
|
|
"Gradient at initial theta (zeros) - first two values only:\n",
|
|
"\t[0.0085, 0.0188]\n",
|
|
"------------------------------\n",
|
|
"\n",
|
|
"Cost at test theta : 3.16\n",
|
|
"Gradient at test theta - first two values only:\n",
|
|
"\t[0.3460, 0.1614]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Initialize fitting parameters\n",
|
|
"initial_theta = np.zeros(X.shape[1])\n",
|
|
"\n",
|
|
"# Set regularization parameter lambda to 1\n",
|
|
"# DO NOT use `lambda` as a variable name in python\n",
|
|
"# because it is a python keyword\n",
|
|
"lambda_ = 1\n",
|
|
"\n",
|
|
"# Compute and display initial cost and gradient for regularized logistic\n",
|
|
"# regression\n",
|
|
"cost, grad = costFunctionReg(initial_theta, X, y, lambda_)\n",
|
|
"\n",
|
|
"print('Cost at initial theta (zeros): {:.3f}'.format(cost))\n",
|
|
"print('Gradient at initial theta (zeros) - first two values only:')\n",
"print('\\t[{:.4f}, {:.4f}]'.format(*grad[:2]))\n",
|
|
"\n",
|
|
"\n",
|
|
"# Compute and display cost and gradient\n",
|
|
"# with all-ones theta and lambda = 10\n",
|
|
"test_theta = np.ones(X.shape[1])\n",
|
|
"cost, grad = costFunctionReg(test_theta, X, y, 10)\n",
|
|
"\n",
|
|
"print('------------------------------\\n')\n",
|
|
"print('Cost at test theta : {:.2f}'.format(cost))\n",
|
|
"print('Gradient at test theta - first two values only:')\n",
"print('\\t[{:.4f}, {:.4f}]'.format(*grad[:2]))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
"With a working regularized cost function, we can now use `optimize.minimize` to fit our logistic regression parameters."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 25,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Train Accuracy: 83.1 %\n",
|
|
"Expected accuracy (with lambda = 1): 83.1 % (approx)\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"image/png": "\n",
|
|
"text/plain": [
|
|
"<Figure size 432x288 with 1 Axes>"
|
|
]
|
|
},
|
|
"metadata": {
|
|
"needs_background": "light"
|
|
},
|
|
"output_type": "display_data"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Initialize fitting parameters\n",
|
|
"initial_theta = np.zeros(X.shape[1])\n",
|
|
"\n",
"# Set regularization parameter lambda to 1 (we can vary this to\n",
"# change how closely the model fits the data)\n",
|
|
"lambda_ = 1\n",
|
|
"\n",
|
|
"# set options for optimize.minimize\n",
|
|
"options= {'maxiter': 100}\n",
|
|
"\n",
|
|
"res = optimize.minimize(costFunctionReg,\n",
|
|
" initial_theta,\n",
|
|
" (X, y, lambda_),\n",
|
|
" jac=True,\n",
|
|
" method='TNC',\n",
|
|
" options=options)\n",
|
|
"\n",
"# the fun property of the OptimizeResult object holds\n",
"# the value of costFunctionReg at the optimized theta\n",
|
|
"cost = res.fun\n",
|
|
"\n",
|
|
"# the optimized theta is in the x property of the result\n",
|
|
"theta = res.x\n",
|
|
"\n",
|
|
"plotDecisionBoundary(plotData, theta, X, y)\n",
|
|
"plt.xlabel('Microchip Test 1')\n",
|
|
"plt.ylabel('Microchip Test 2')\n",
|
|
"plt.legend(['y = 1', 'y = 0'])\n",
|
|
"plt.grid(False)\n",
|
|
"plt.title('lambda = %0.2f' % lambda_)\n",
|
|
"\n",
|
|
"# Compute accuracy on our training set\n",
|
|
"p = predict(theta, X)\n",
|
|
"\n",
|
|
"print('Train Accuracy: %.1f %%' % (np.mean(p == y) * 100))\n",
|
|
"print('Expected accuracy (with lambda = 1): 83.1 % (approx)\\n')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.7.3"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 2
|
|
}
|