Add Network Modeling Assignment 1 notes

2026-03-22 15:58:12 +01:00
parent 6551f7a011
commit 5de46cc713
26 changed files with 1469 additions and 0 deletions
--- a//task1b_ql4jfi6af0/template_solution.ipynb
+++ b//task1b_ql4jfi6af0/template_solution.ipynb
@@ -0,0 +1,253 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### General guidance\n",
+    "\n",
+    "This template guides you through the implementation of this task. We recommend reading the whole template\n",
+    "first to get a sense of the overall code structure before filling in any of the TODO gaps.\n",
+    "This is the Jupyter notebook version of the template. For the Python file version, see `template_solution.py`.\n",
+    "\n",
+    "First, we import necessary libraries:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2026-03-15T18:22:13.169924Z",
+     "start_time": "2026-03-15T18:22:13.165934Z"
+    }
+   },
+   "source": [
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Add any additional imports here (however, the task is solvable without using \n",
+    "# any additional imports)\n",
+    "# import ..."
+   ],
+   "outputs": [],
+   "execution_count": 30
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Loading data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2026-03-15T18:22:13.190312Z",
+     "start_time": "2026-03-15T18:22:13.181357Z"
+    }
+   },
+   "source": [
+    "data = pd.read_csv(\"train.csv\")\n",
+    "y = data[\"y\"].to_numpy()\n",
+    "data = data.drop(columns=[\"Id\", \"y\"])\n",
+    "# Print the first few data samples\n",
+    "print(data.head())\n",
+    "X = data.to_numpy()"
+   ],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "     x1    x2    x3    x4    x5\n",
+      "0  0.02  0.05 -0.09 -0.43 -0.08\n",
+      "1 -0.13  0.11 -0.08 -0.29 -0.03\n",
+      "2  0.08  0.06 -0.07 -0.41 -0.03\n",
+      "3  0.02 -0.12  0.01 -0.43 -0.02\n",
+      "4 -0.14 -0.12 -0.08 -0.02 -0.08\n"
+     ]
+    }
+   ],
+   "execution_count": 31
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Transform features"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2026-03-15T18:22:13.201976Z",
+     "start_time": "2026-03-15T18:22:13.198370Z"
+    }
+   },
+   "source": [
+    "\"\"\"\n",
+    "Transform the 5 input features of matrix X (x_i denoting the i-th component of a given row in X) \n",
+    "into 21 new features phi(X) in the following manner:\n",
+    "5 linear features: phi_1(X) = x_1, phi_2(X) = x_2, phi_3(X) = x_3, phi_4(X) = x_4, phi_5(X) = x_5\n",
+    "5 quadratic features: phi_6(X) = x_1^2, phi_7(X) = x_2^2, phi_8(X) = x_3^2, phi_9(X) = x_4^2, phi_10(X) = x_5^2\n",
+    "5 exponential features: phi_11(X) = exp(x_1), phi_12(X) = exp(x_2), phi_13(X) = exp(x_3), phi_14(X) = exp(x_4), phi_15(X) = exp(x_5)\n",
+    "5 cosine features: phi_16(X) = cos(x_1), phi_17(X) = cos(x_2), phi_18(X) = cos(x_3), phi_19(X) = cos(x_4), phi_20(X) = cos(x_5)\n",
+    "1 constant feature: phi_21(X) = 1\n",
+    "\n",
+    "Parameters\n",
+    "----------\n",
+    "X: matrix of floats, dim = (700,5), inputs with 5 features\n",
+    "\n",
+    "Compute\n",
+    "----------\n",
+    "X_transformed: matrix of floats, dim = (700,21), transformed input with 21 features\n",
+    "\"\"\"\n",
+    "# Build each block of features with vectorized NumPy operations; they are stacked column-wise below.\n",
+    "quadratic_features = np.power(X, 2)\n",
+    "exponential_features = np.exp(X)\n",
+    "cosine_features = np.cos(X)\n",
+    "constant_feature = np.ones((X.shape[0], 1))\n",
+    "X_transformed = np.concatenate((X, quadratic_features, exponential_features, cosine_features, constant_feature), axis=1)\n",
+    "assert X_transformed.shape == (700, 21)"
+   ],
+   "outputs": [],
+   "execution_count": 32
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Fit data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2026-03-15T18:22:13.625674Z",
+     "start_time": "2026-03-15T18:22:13.217438Z"
+    }
+   },
+   "source": [
+    "\"\"\"\n",
+    "Fit a logistic regression model to the transformed data points X_transformed \n",
+    "and compute the weights of the fitted model. \n",
+    "\n",
+    "Parameters\n",
+    "----------\n",
+    "X_transformed: array of floats, dim = (700,21), transformed input with 21 features\n",
+    "y: array of integers in {0,1}, dim = (700,), input labels\n",
+    "\n",
+    "Compute\n",
+    "----------\n",
+    "weights: array of floats, dim = (21,), optimal parameters of logistic regression\n",
+    "\"\"\"\n",
+    "weights = np.zeros((21,))\n",
+    "# Step size 2n / sigma_max^2: half the inverse of the bound sigma_max^2 / (4n) on the full-data gradient's Lipschitz constant\n",
+    "learning_rate = 2 * X_transformed.shape[0] / np.linalg.svd(X_transformed, compute_uv=False)[0]**2\n",
+    "tolerance = 0.001\n",
+    "sigma = lambda x: 1 / (1 + np.exp(-x))\n",
+    "gradient = lambda w, X, y: X.T @ (sigma(X @ w) - y) / X.shape[0]\n",
+    "update = np.inf  # guarantees that the loop body runs at least once\n",
+    "while np.linalg.norm(update) > tolerance:  # heuristic stop: the mini-batch gradient is noisy\n",
+    "    # Draw a random mini-batch of 100 rows without replacement (mini-batch SGD)\n",
+    "    selection = np.random.choice(X_transformed.shape[0], 100, replace=False)\n",
+    "    update = learning_rate * gradient(weights, X_transformed[selection, :], y[selection])\n",
+    "    weights -= update\n",
+    "\n",
+    "assert weights.shape == (21,)"
+   ],
+   "outputs": [],
+   "execution_count": 33
+  },
+  {
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2026-03-15T18:22:13.630314Z",
+     "start_time": "2026-03-15T18:22:13.629075Z"
+    }
+   },
+   "cell_type": "code",
+   "source": "",
+   "outputs": [],
+   "execution_count": null
+  },
+  {
+   "cell_type": "code",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2026-03-15T18:22:13.636358Z",
+     "start_time": "2026-03-15T18:22:13.633885Z"
+    }
+   },
+   "source": [
+    "# Save results in the required format\n",
+    "np.savetxt(\"./results.csv\", weights, fmt=\"%.12f\")"
+   ],
+   "outputs": [],
+   "execution_count": 34
+  },
+  {
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2026-03-15T18:22:13.643898Z",
+     "start_time": "2026-03-15T18:22:13.641041Z"
+    }
+   },
+   "cell_type": "code",
+   "source": [
+    "matrix = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])\n",
+    "np.linalg.svd(X_transformed,compute_uv=False)\n",
+    "matrix[np.random.choice(matrix.shape[0], 1, replace=False), :]"
+   ],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([[1, 2]])"
+      ]
+     },
+     "execution_count": 35,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "execution_count": 35
+  },
+  {
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2026-03-15T18:22:13.707327Z",
+     "start_time": "2026-03-15T18:22:13.706165Z"
+    }
+   },
+   "cell_type": "code",
+   "source": "",
+   "outputs": [],
+   "execution_count": null
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}