feat: add simple linear regression

2 years ago · de8b685aaf
parent 36da0d1685
commit de8b685aaf
2 changed files with 32 additions and 2 deletions
--- a/main.tex
+++ b/main.tex
@@ -48,14 +48,14 @@
 \section{Introduction}
 This first section defines the core concepts developed in the course, most importantly what an intelligent system and an intelligent machine are.

-\begin{definition}{Intelligent System}
+\begin{definition}{Intelligent System}{intellignet-system}
    An algorithm enabled by constraints, exposed by representations that support models, and targeted at reflection, perception, and action.\\
    Without loss of generality, an intelligent system is one that generates hypotheses and tests them.
 \end{definition}

 We say that the algorithm is \textit{enabled by constraints} because the constraints give it a direction to follow to solve the problem. Without constraints, the field of possible solutions is too broad and the algorithm cannot possibly start choosing a direction. The constraints are necessary to enable the intelligence. The main feature of an intelligent system is the generation of outputs based on the inputs and the nature of the system. Common capabilities of intelligent systems include sensory perception, pattern recognition, learning and knowledge acquisition, and inference from incomplete information.

-\begin{definition}{Intelligent Machine}
+\begin{definition}{Intelligent Machine}{intelligent-machine}
    An intelligent machine is one that can exhibit one or more intelligent characteristics of a human. An intelligent machine embodies machine intelligence. The term intelligent machine, however, may take a broader meaning than intelligent computer.
 \end{definition}

@@ -64,7 +64,36 @@ We say that the algorithm is \textit{enabled by constraints} because the constra
    \includegraphics[width=0.8\textwidth]{images/intelligent_machine_example.png}
 \end{figure}

+ML algorithms can be sorted into two categories: predictive and descriptive.

+\begin{definition}{Predictive vs Descriptive}{desc-pred}
+Predictive models return a prediction of an outcome that is not known. Classification and regression are examples of predictive analysis.\\
+Descriptive models provide new information that describes the data in a new way. Clustering and summarization are examples of descriptive analysis.
+\end{definition}
+
+The difference between classification and regression is that classification provides a categorical output (one element of a predetermined finite set, for example a label), whereas regression provides a continuous output, i.e. a real number.
+
+\section{Simple Linear Regression}
+We are interested in finding the function that best represents the non-functional relationship\footnote{Meaning that the output cannot be expressed exactly as a mathematical function of the input.} between the input and the output. We suppose that we have a dataset of $n$ input/output points $(x_i, y_i)$. We can first check whether there is a statistical relationship between input and output by looking at the covariance and the correlation.
+
+With $\bar{X}$ and $\bar{Y}$ the averages of the inputs and the outputs, and $S_x$ and $S_y$ their sample standard deviations:
+\begin{align}
+    \mathrm{Cov}(X,Y) &= \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{X})(y_i-\bar{Y})\\
+    \mathrm{Corr}(X,Y) &= \dfrac{\mathrm{Cov}(X,Y)}{S_xS_y}
+\end{align}
+
+The correlation is normalized between $-1$ and $1$. A correlation of $-1$ or $1$ indicates a perfect linear relationship, the ideal case for simple linear regression (SLR). The closer the correlation is to $0$, the weaker the linear relationship. A low correlation does not mean that there is no statistical relationship; it only means that the relationship, if any, is not linear.
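+
+As a quick check of these formulas, consider two small hypothetical data sets, chosen only to keep the arithmetic short. For $x = (1, 2, 3)$ and $y = (2, 4, 6)$ we have $\bar{X} = 2$, $\bar{Y} = 4$, $S_x = 1$ and $S_y = 2$, so
+\begin{align}
+    \mathrm{Cov}(X,Y) &= \dfrac{1}{2}\left[(-1)(-2) + (0)(0) + (1)(2)\right] = 2\\
+    \mathrm{Corr}(X,Y) &= \dfrac{2}{1 \cdot 2} = 1
+\end{align}
+which is the perfectly linear case $y = 2x$. For $x = (-1, 0, 1)$ and $y = x^2 = (1, 0, 1)$ we have $\bar{X} = 0$ and $\bar{Y} = \frac{2}{3}$, so
+\begin{equation}
+    \mathrm{Cov}(X,Y) = \dfrac{1}{2}\left[(-1)\left(\tfrac{1}{3}\right) + (0)\left(-\tfrac{2}{3}\right) + (1)\left(\tfrac{1}{3}\right)\right] = 0
+\end{equation}
+and the correlation is $0$ even though $y$ is fully determined by $x$.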
+
+Assuming the relationship is approximately linear, we want to find the parameters of the affine function that best approximates it. The function is of the form $\hat{y}_i = f(x_i) = \beta_1 x_i + \beta_0$. The best estimate of $\beta = (\beta_0, \beta_1)$ is the one that minimizes the squared error. We can find $\beta$ by differentiating the squared error with respect to $\beta_0$ and $\beta_1$ and setting the derivatives to zero, where the squared error is defined by:
+\begin{equation}
+    Q = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \sum_{i=1}^{n}\left(y_i - (\beta_1x_i+\beta_0)\right)^2
+\end{equation}
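+
+For reference, a sketch of this minimization (a standard derivation, written here with the residual $y_i - \beta_1 x_i - \beta_0$): setting both partial derivatives of $Q$ to zero gives
+\begin{align}
+    \dfrac{\partial Q}{\partial \beta_0} &= -2\sum_{i=1}^{n}(y_i - \beta_1 x_i - \beta_0) = 0\\
+    \dfrac{\partial Q}{\partial \beta_1} &= -2\sum_{i=1}^{n}x_i(y_i - \beta_1 x_i - \beta_0) = 0
+\end{align}
+Solving this system, assuming the $x_i$ are not all equal, yields
+\begin{align}
+    \beta_1 &= \dfrac{\sum_{i=1}^{n}(x_i-\bar{X})(y_i-\bar{Y})}{\sum_{i=1}^{n}(x_i-\bar{X})^2} = \dfrac{\mathrm{Cov}(X,Y)}{S_x^2}\\
+    \beta_0 &= \bar{Y} - \beta_1\bar{X}
+\end{align}
+For instance, for the hypothetical points $(1,2)$, $(2,4)$ and $(3,6)$ we have $\bar{X} = 2$ and $\bar{Y} = 4$; the numerator of $\beta_1$ is $4$ and its denominator is $2$, so $\beta_1 = 2$ and $\beta_0 = 4 - 2 \cdot 2 = 0$, which recovers $y = 2x$.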
+
+This is equivalent to using the Maximum Likelihood Estimator (MLE) under the (reasonable) assumption that each output is the function of the input plus Gaussian noise, i.e. that the errors follow a Normal distribution.
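+
+To see why, here is a sketch under the assumption that $y_i = \beta_1 x_i + \beta_0 + \varepsilon_i$ with independent errors $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$ (the symbols $\varepsilon_i$ and $\sigma^2$ are introduced here for illustration only). The log-likelihood of the data is
+\begin{equation}
+    \ln L(\beta_0, \beta_1) = -\dfrac{n}{2}\ln(2\pi\sigma^2) - \dfrac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_1 x_i - \beta_0)^2
+\end{equation}
+The first term does not depend on $\beta_0$ or $\beta_1$, and the second is $-Q/(2\sigma^2)$, so maximizing the likelihood over $\beta_0$ and $\beta_1$ is the same as minimizing $Q$.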
+
+This works well for univariate problems, i.e. problems with a single input variable. With several input variables, the problem generalizes to multiple linear regression, where all coefficients are estimated jointly; this is more involved and is not detailed here.
+
+\section{Logistic Regression}


 \end{document}
--- a/requirements.txt
+++ b/requirements.txt
@@ -2,3 +2,4 @@ fancyhdr
 import
 tcolorbox
 environ
+pgf