\n",
"\n",
"In this exercise we will implement linear regression and see it work on data.\n",
"\n",
"Files included with this exercise:\n",
"\n",
" - ex1data1.txt - Dataset for linear regression with one variable\n",
" - ex1data2.txt - Dataset for linear regression with multiple variables\n",
" \n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# used for manipulating directory paths\n",
"import os\n",
"\n",
"# Scientific and vector computation for python\n",
"import numpy as np\n",
"\n",
"# Plotting library\n",
"from matplotlib import pyplot\n",
"from mpl_toolkits.mplot3d import Axes3D # needed to plot 3-D surfaces\n",
"import matplotlib.patches as mpatches \n",
"import matplotlib.lines as mlines # for creating a legend\n",
"\n",
"# tells matplotlib to embed plots within the notebook\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1 Simple Python function\n",
"\n",
"We will warm up by creating a function that returns an n x n identity matrix."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def warmupexercise(x):\n",
"    # Return an x-by-x identity matrix\n",
"    A = np.identity(x)\n",
"    return A"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can run the function with an input of 5 to create a 5 x 5 identity matrix."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1., 0., 0., 0., 0.],\n",
" [0., 1., 0., 0., 0.],\n",
" [0., 0., 1., 0., 0.],\n",
" [0., 0., 0., 1., 0.],\n",
" [0., 0., 0., 0., 1.]])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"warmupexercise(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2 Linear Regression with One Variable\n",
"\n",
"In this part of the exercise, we will implement linear regression with one variable to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities, and we have profit and population data from those cities.\n",
"\n",
"We would like to use this data to help you select which city to expand to next. The file ex1data1.txt contains the dataset for our linear regression problem. The first column is the population of a city (in units of 10,000) and the second column is the profit of a food truck in that city (in units of $10,000). A negative value for profit indicates a loss.\n",
"\n",
"We now load the data."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# Read comma separated data\n",
"data = np.loadtxt(os.path.join('Data', 'ex1data1.txt'), delimiter=',')\n",
"X, y = data[:, 0], data[:, 1]\n",
"\n",
"m = y.size # number of training examples"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.1 Plotting the Data\n",
"\n",
"Before starting on the task it is useful to visualize the data. Since we are dealing with only two variables, we can do this with a scatterplot."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def plotData(X, y):\n",
"    # Scatter plot of the training data: red circles with black edges\n",
"    pyplot.plot(X, y, 'ro', ms=10, mec='k')\n",
"    pyplot.title('Figure 1: Scatter Plot of Training Data')\n",
"    pyplot.xlabel('Population of City in 10,000s')\n",
"    pyplot.ylabel('Profit in $10,000s')"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plotData(X,y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.2 Gradient Descent\n",
"\n",
"Here we will fit the parameters theta to our data using gradient descent."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"m = y.size  # number of samples\n",
"X = np.stack([np.ones(m), X], axis=1)  # add a column of ones to the data for the theta0 intercept term\n",
"\n",
"# NOTE: If ValueError: all input arrays must have the same shape appears, you have probably run this cell more than once:\n",
"# X already contains the column of ones, so it no longer has the same shape as np.ones(m)\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"# Here we set the number of iterations as well as learning rate alpha\n",
"iterations = 1500\n",
"alpha = 0.01"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we are properly set up, we can begin implementing gradient descent. At each iteration we subtract from theta the derivative of the cost function scaled by the learning rate alpha. We will also keep track of the cost at each iteration to check that it is decreasing. The relevant formulas are as follows:\n",
"\n",
"$$J(\\theta) = 1/(2m)\\sum_{i = 1}^{m} (h_\\theta (x^i) - y^i)^2$$\n",
"\n",
"$$h_\\theta(x) = \\theta^Tx = \\theta_0 + \\theta_1x_1$$\n",
"\n",
"$$\\theta_j = \\theta_j - (\\alpha/m)\\sum_{i = 1}^{m}(h_\\theta(x^i) - y^i)x_j^i$$\n",
"\n",
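"The squared-error cost and the update rule can also be written in matrix form, which is roughly what a vectorized NumPy implementation computes. The notation in this sketch is an assumption introduced for illustration: $X$ is the $m \\times 2$ matrix whose rows are $(1, x^i)$, $y$ is the vector of the $m$ target values, and $\\theta = (\\theta_0, \\theta_1)^T$. Under those assumptions:\n",
"\n",
"$$J(\\theta) = 1/(2m)\\,(X\\theta - y)^T(X\\theta - y)$$\n",
"\n",
"$$\\theta = \\theta - (\\alpha/m)\\,X^T(X\\theta - y)$$\n",
"\n",
"Entry $i$ of $X\\theta$ is $h_\\theta(x^i)$, and component $j$ of $X^T(X\\theta - y)$ is $\\sum_{i = 1}^{m}(h_\\theta(x^i) - y^i)x_j^i$, so the matrix update changes every $\\theta_j$ simultaneously and should agree with the per-component rule.\n",
"\n",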
"We begin by creating a function that returns the cost J given the training set X and y and an initial theta.\n"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"def computeCost(X, y, theta):\n",
"    # initialize some useful values\n",
"    m = y.size  # number of training examples\n",
"\n",
"    # Vectorized implementation of cost function J(theta)\n",
"    errors = X.dot(theta) - y  # h_theta(x) - y for every training example\n",
"    J = np.sum(np.square(errors)) / (2 * m)\n",
"    return J"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can run the function with a few different theta initializations."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"ename": "ValueError",
"evalue": "shapes (47,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mJ\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mcomputeCost\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0my\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtheta\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0marray\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0.0\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m0.0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'With theta = [0, 0] \\nCost computed = %.2f'\u001b[0m \u001b[1;33m%\u001b[0m \u001b[0mJ\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 3\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[1;31m# further testing of the cost function\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0mJ\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mcomputeCost\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0my\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtheta\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0marray\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m-\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m2\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m\u001b[0m in \u001b[0;36mcomputeCost\u001b[1;34m(X, y, theta)\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[1;31m# Vectorized implementation of cost function J(theta)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 7\u001b[1;33m \u001b[0mH\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mX\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtheta\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 8\u001b[0m \u001b[0mJ\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msubtract\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mH\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0my\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 9\u001b[0m \u001b[0mJ\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msquare\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mJ\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mValueError\u001b[0m: shapes (47,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)"
]
}
],
"source": [
"J = computeCost(X, y, theta=np.array([0.0, 0.0]))\n",
"print('With theta = [0, 0] \\nCost computed = %.2f' % J)\n",
"\n",
"# further testing of the cost function\n",
"J = computeCost(X, y, theta=np.array([-1, 2]))\n",
"print('With theta = [-1, 2]\\nCost computed = %.2f' % J)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have a working cost function, we can implement gradient descent."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"def gradientDescent(X, y, theta, alpha, num_iters):\n",
"    # Initialize useful values\n",
"    m = y.size  # number of training examples\n",
"    J_history = []  # cost after each iteration\n",
"\n",
"    # make a copy of theta, to avoid changing the original array, since numpy arrays\n",
"    # are passed by reference to functions\n",
"    theta = theta.copy()\n",
"\n",
"    for i in range(num_iters):\n",
"        hypothesis = X.dot(theta)  # predictions for all training examples\n",
"        errors = hypothesis - y\n",
"        # scaled gradient step: (alpha / m) * X^T (X theta - y)\n",
"        gradient = (alpha / m) * X.T.dot(errors)\n",
"        theta = theta - gradient\n",
"        J_history.append(computeCost(X, y, theta))\n",
"    return theta, J_history"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"ename": "ValueError",
"evalue": "shapes (47,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[0;32m 8\u001b[0m \u001b[0malpha\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;36m0.01\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 9\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 10\u001b[1;33m \u001b[0mtheta\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mJ_history\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mgradientDescent\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX\u001b[0m \u001b[1;33m,\u001b[0m\u001b[0my\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtheta\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0malpha\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0miterations\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 11\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'Theta found by gradient descent: {:.4f}, {:.4f}'\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0mtheta\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m\u001b[0m in \u001b[0;36mgradientDescent\u001b[1;34m(X, y, theta, alpha, num_iters)\u001b[0m\n\u001b[0;32m 10\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 11\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnum_iters\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 12\u001b[1;33m \u001b[0mhypothesis\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mX\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtheta\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 13\u001b[0m \u001b[0merrors\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msubtract\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mhypothesis\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0my\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 14\u001b[0m \u001b[0mXtrans\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mX\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtranspose\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mValueError\u001b[0m: shapes (47,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)"
]
}
],
"source": [
"# Test Case\n",
"\n",
"# initialize fitting parameters\n",
"theta = np.zeros(2)\n",
"\n",
"# some gradient descent settings\n",
"iterations = 1500\n",
"alpha = 0.01\n",
"\n",
"theta, J_history = gradientDescent(X ,y, theta, alpha, iterations)\n",
"print('Theta found by gradient descent: {:.4f}, {:.4f}'.format(*theta))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we have a theta we can fit our data to a line"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"ename": "ValueError",
"evalue": "shapes (47,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[0mplotData\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0my\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[0mpyplot\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mplot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mtheta\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'-'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 3\u001b[0m \u001b[0mpyplot\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mlegend\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'Training Data'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'Linear Regression'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[1;32mpass\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mValueError\u001b[0m: shapes (47,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plotData(X[:,1], y)\n",
"pyplot.plot(X[:,1],np.dot(X,theta),'-')\n",
"pyplot.legend(['Training Data', 'Linear Regression'])\n",
"pass"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"For population = 35,000, we predict a profit of 0.00\n",
"\n",
"For population = 70,000, we predict a profit of 0.00\n",
"\n"
]
}
],
"source": [
"# Predict values for population sizes of 35,000 and 70,000\n",
"predict1 = np.dot([1, 3.5], theta)\n",
"print('For population = 35,000, we predict a profit of {:.2f}\\n'.format(predict1*10000))\n",
"\n",
"predict2 = np.dot([1, 7], theta)\n",
"print('For population = 70,000, we predict a profit of {:.2f}\\n'.format(predict2*10000))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
2.4 Visualizing J(theta)
\n",
"\n",
"To better understand our cost function, we will now plot the cost over a 2-d grid of theta0 and theta1 values."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"ename": "ValueError",
"evalue": "shapes (47,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[0;32m 9\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mi\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtheta0\u001b[0m \u001b[1;32min\u001b[0m \u001b[0menumerate\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtheta0_vals\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 10\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mj\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtheta1\u001b[0m \u001b[1;32min\u001b[0m \u001b[0menumerate\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtheta1_vals\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 11\u001b[1;33m \u001b[0mJ_vals\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mi\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mj\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mcomputeCost\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0my\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m[\u001b[0m\u001b[0mtheta0\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtheta1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 12\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 13\u001b[0m \u001b[1;31m# Because of the way meshgrids work in the surf command, we need to\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m\u001b[0m in \u001b[0;36mcomputeCost\u001b[1;34m(X, y, theta)\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[1;31m# Vectorized implementation of cost function J(theta)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 7\u001b[1;33m \u001b[0mH\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mX\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtheta\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 8\u001b[0m \u001b[0mJ\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msubtract\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mH\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0my\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 9\u001b[0m \u001b[0mJ\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msquare\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mJ\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mValueError\u001b[0m: shapes (47,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)"
]
}
],
"source": [
"# grid over which we will calculate J\n",
"theta0_vals = np.linspace(-10, 10, 100)\n",
"theta1_vals = np.linspace(-1, 4, 100)\n",
"\n",
"# initialize J_vals to a matrix of 0's\n",
"J_vals = np.zeros((theta0_vals.shape[0], theta1_vals.shape[0]))\n",
"\n",
"# Fill out J_vals\n",
"for i, theta0 in enumerate(theta0_vals):\n",
" for j, theta1 in enumerate(theta1_vals):\n",
" J_vals[i, j] = computeCost(X, y, [theta0, theta1])\n",
" \n",
"# Because of the way meshgrids work in the surf command, we need to\n",
"# transpose J_vals before calling surf, or else the axes will be flipped\n",
"J_vals = J_vals.T\n",
"\n",
"# surface plot\n",
"fig = pyplot.figure(figsize=(12, 5))\n",
"ax = fig.add_subplot(121, projection='3d')\n",
"ax.plot_surface(theta0_vals, theta1_vals, J_vals, cmap='viridis')\n",
"pyplot.xlabel('theta0')\n",
"pyplot.ylabel('theta1')\n",
"pyplot.title('Surface')\n",
"\n",
"# contour plot\n",
"# Plot J_vals as 15 contours spaced logarithmically between 0.01 and 100\n",
"ax = pyplot.subplot(122)\n",
"pyplot.contour(theta0_vals, theta1_vals, J_vals, linewidths=2, cmap='viridis', levels=np.logspace(-2, 3, 20))\n",
"pyplot.xlabel('theta0')\n",
"pyplot.ylabel('theta1')\n",
"pyplot.plot(theta[0], theta[1], 'ro', ms=10, lw=2)\n",
"pyplot.title('Contour, showing minimum')\n",
"pass"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
3 Linear Regression with Multiple Variables
\n",
"\n",
"Here we implement linear regression with multiple variable to predict the price of houses\n",
"\n",
"
3.1 Feature Normalization
\n",
"\n",
"We begin by creating a function to normalize our features by setting the mean to zero and standard deviation to 1"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" X[:,0] X[:, 1] y\n",
"--------------------------\n",
" 2104 3 399900\n",
" 1600 3 329900\n",
" 2400 3 369000\n",
" 1416 2 232000\n",
" 3000 4 539900\n",
" 1985 4 299900\n",
" 1534 3 314900\n",
" 1427 3 198999\n",
" 1380 3 212000\n",
" 1494 3 242500\n"
]
}
],
"source": [
"# Load data\n",
"data = np.loadtxt(os.path.join('Data', 'ex1data2.txt'), delimiter=',')\n",
"X = data[:, :2]\n",
"y = data[:, 2]\n",
"m = y.size\n",
"\n",
"# print out some data points\n",
"print('{:>8s}{:>8s}{:>10s}'.format('X[:,0]', 'X[:, 1]', 'y'))\n",
"print('-'*26)\n",
"for i in range(10):\n",
" print('{:8.0f}{:8.0f}{:10.0f}'.format(X[i, 0], X[i, 1], y[i]))"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"def featureNormalize(X):\n",
" # Normalize features in x returning normalized version of X where \n",
" # mean value of each feature is zero and the standard deviation is 1\n",
" \n",
" # You need to set these values correctly\n",
" X_norm = X.copy()\n",
" mu = np.zeros(X.shape[1])\n",
" sigma = np.zeros(X.shape[1])\n",
" m = X.shape[0]\n",
" n = X.shape[1]\n",
"\n",
" # =========================== YOUR CODE HERE =====================\n",
" mu = np.mean(X, axis = 0)\n",
" sigma = np.std(X, axis = 0)\n",
" tempMu = np.zeros(X.shape)\n",
" for i in range(m):\n",
" tempMu[i,:] = mu\n",
" X_norm = np.subtract(X_norm,tempMu)\n",
" for i in range(n):\n",
" X_norm[:,i] = np.divide(X_norm[:,i],sigma[i])\n",
" \n",
" \n",
" # ================================================================\n",
" return X_norm, mu, sigma"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Computed mean is [2000.68085106 3.17021277]\n",
"Computed sigma is [7.86202619e+02 7.52842809e-01]\n"
]
}
],
"source": [
"X_norm, mu, sigma =featureNormalize(X);\n",
"print(\"Computed mean is \", mu)\n",
"print(\"Computed sigma is \", sigma)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"# Add intercept term to X\n",
"X = np.concatenate([np.ones((m, 1)), X_norm], axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
3.2 Gradient Descent
\n",
"We can now apply gradient descent to our normalized, multivariate data set and plot the cost relative to the number of iterations"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"theta computed from gradient descent: [334302.06399328 99411.44947359 3267.01285407]\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Initialize (Adjusting these values can change how the effectiveness\n",
"# of our minimization as seen on our graph)\n",
"alpha = 0.01\n",
"num_iters = 400\n",
"\n",
"# initialize theta and run Gradient Descent\n",
"theta = np.zeros(3)\n",
"theta, J_history = gradientDescent(X,y,theta,alpha,num_iters)\n",
"\n",
"# Graph it\n",
"pyplot.plot(np.arange(len(J_history)), J_history)\n",
"pyplot.xlabel(\"Number of Iterations\")\n",
"pyplot.ylabel(\"Cost J\")\n",
"\n",
"# Resulting theta\n",
"print('theta computed from gradient descent: {:s}'.format(str(theta)))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"