[INCOMPLETE] Thoughts on AI
[Someday], 2026
One of the main issues with modern AI discourse, particularly among those unfamiliar with the mechanisms of AI, is the mystical thinking surrounding its implementation. In this article, I would like to give a generalized description of said mechanisms, specifically in the language of interpolation. Using this description, I hope that we can more precisely discuss the powers, potential applications, and limitations of AI systems.
The fundamental problem of AI
I consider the fundamental problem of AI to be formulated as follows: any underlying task that an AI can be designed to work on can be modelled functionally. Specifically, we can analyze any task as some input-output map. For example:
- Image Classification: input image, output class
- Text-Based Generative AI: input prompt text, output response text; or perhaps input previous text in window, output next text in window (depending on the architecture)
The fundamental problem, then, is to find the underlying function describing the process.
Solving the Fundamental Problem
Generally, to design an AI system/architecture, the designer makes a guess at the parameterized function governing the process. For example, I can look at the below data, and perhaps, I will guess that the underlying function to predict this data may be some parameterized form $f(x; \theta)$, where the components of $\theta$ are unknown parameters. In AI terminology, we must "train" the AI to find the "parameters", or "weights", of the model.
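To make this concrete, here is a minimal sketch (with made-up data and a guessed linear family of my own choosing, since the figure's data is not reproduced here) of what "training" means: searching the parameter space for values that make the guessed function match the data.

```python
import numpy as np

# hypothetical data points; the article's actual figure is not reproduced here
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# a guessed parameterized family: f(x; a, b) = a*x + b
def f(x, a, b):
    return a * x + b

# "training" amounts to finding parameters that make f match the data;
# the squared error measures how badly a candidate (a, b) fits
def squared_error(a, b):
    return np.sum((f(x, a, b) - y) ** 2)

# a poor parameter guess versus a better one
bad = squared_error(0.0, 0.0)
good = squared_error(2.0, 1.0)
```

The rest of the article is concerned with how to find such parameters systematically rather than by trial and error.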
An introduction to interpolation
Scientific computing is a discipline concerned with the development and study of numerical algorithms for solving mathematical problems that arise in various disciplines in science and engineering.
- A First Course in Numerical Methods, Uri M. Ascher and Chen Greif
The problem of interpolation is stated as follows: a series of data is given, and we are looking for the underlying function which generates the data. Those who have taken high school science may recall the process of analysis after experimentation, where the underlying physical laws are derived from empirical data. This process is a precursor of the more general task of interpolation. Whereas in class you may have had to guess simple functions to fit data, such as lines or parabolas, interpolation tasks may require you to fit arbitrary function bases against arbitrarily complex data, often with hard-to-guess forms.
In this section, we will explore the process of interpolation, and identify some common techniques deployed in interpolation tasks.
Polynomial Interpolation
Pictured above is the method of Lagrange interpolation. The Lagrange interpolating polynomial provides an algorithm-like process for creating these polynomials. Suppose we are given data $(x_1, y_1), \ldots, (x_n, y_n)$ with distinct $x_k$. Suppose $n = 2$; then the interpolant is constructed as
$$p(x) = y_1 \frac{x - x_2}{x_1 - x_2} + y_2 \frac{x - x_1}{x_2 - x_1}.$$
We can see that if we plug $x_1$ into the first and second terms, the coefficient of the first term is 1, and the coefficient of the second term is 0. Similarly, plugging in $x_2$ gives a 0 first-term and 1 second-term coefficient. In general,
$$p(x) = \sum_{k=1}^{n} y_k L_k(x), \qquad L_k(x) = \prod_{j \neq k} \frac{x - x_j}{x_k - x_j}.$$
Specifically, we can see that $L_k$ is a function where plugging in any node except $x_k$ returns 0, and plugging in $x_k$ returns 1. We can recover the polynomial of form $a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}$ by expanding and combining all the terms of $p(x)$ to find our parameters.
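The construction above can be sketched directly in code (a minimal illustration; the choice of nodes, sampled from $x^2$, is my own):

```python
import numpy as np

def lagrange_interpolant(xs, ys):
    """Return p(x) = sum_k y_k * L_k(x), where L_k is the k-th
    Lagrange basis polynomial: 1 at x_k and 0 at every other node."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)

    def p(x):
        total = 0.0
        for k in range(len(xs)):
            others = np.delete(xs, k)                # all nodes except x_k
            Lk = np.prod((x - others) / (xs[k] - others))
            total += ys[k] * Lk
        return total

    return p

# three nodes sampled from f(x) = x^2; three points determine the parabola exactly
p = lagrange_interpolant([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
```

Because three points determine a unique quadratic, this interpolant reproduces $x^2$ not only at the nodes but everywhere.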
Linear Regression
In the last example, we were obsessed with finding an exact fit of a polynomial through our data. Suppose we had the following data:
It is clear that the underlying function we are looking for is a simple line, not a high-degree polynomial. However, there is no line which passes exactly through all of the points. This is likely because the data is noisy: error was introduced when the data was measured. In this case, we deploy regression techniques, specifically linear regression.
To derive regression, we consider an optimization problem. Formally, suppose we are given points $(x_1, y_1), \ldots, (x_n, y_n)$, suppose we collect the candidate line $f(x) = ax + b$ into the parameter vector $\theta = (a, b)$, and suppose we define the following cost function:
$$J(\theta) = \sum_{i=1}^{n} (a x_i + b - y_i)^2 = \lVert X\theta - y \rVert^2, \qquad X = \begin{pmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}.$$
As one might expect, the goal of regression is to minimize the cost, namely to solve $\min_{\theta} J(\theta)$.
We can consider applying optimization techniques to $J$; namely, $J$ may be optimized when $\nabla J(\theta) = 0$, which yields
$$X^\top X \theta = X^\top y.$$
Above are the so-called "normal equations" for linear regression. We see that since $X$ and $y$ are known quantities, we can solve for $\theta$. We also know that this solution is the global minimum, since $J$ is a convex function (specifically a quadratic form).
To summarize, if there exists $f(x) = ax + b$ such that $f(x_i) \approx y_i$, then we can find the optimal $\theta = (a, b)$ which minimizes the squared error between $f(x_i)$ and $y_i$ by solving the normal equations.
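A minimal sketch of solving the normal equations numerically (the noisy line data here is synthetic, generated by me for illustration):

```python
import numpy as np

# synthetic noisy samples of y = 3x + 2 (illustrative data, not from the article)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
y = 3.0 * x + 2.0 + 0.05 * rng.standard_normal(x.size)

# design matrix with rows (x_i, 1); theta = (a, b)
X = np.column_stack([x, np.ones_like(x)])

# normal equations: X^T X theta = X^T y
theta = np.linalg.solve(X.T @ X, X.T @ y)
a, b = theta
```

In practice one often prefers `np.linalg.lstsq` (or a QR factorization) over forming $X^\top X$ explicitly, for better numerical conditioning, but the normal equations make the derivation transparent.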
High Dimensional Linear Regression
Suppose instead of having $f : \mathbb{R} \to \mathbb{R}$, $f$ was a linear function of several variables (perhaps $f(u, v) = a u + b v + c$). This framework is still capable of handling this, in particular by making the $i$-th row of $X$ equal to $(u_i, v_i, 1)$ and setting $\theta = (a, b, c)$. We can verify that $J(\theta) = \lVert X\theta - y \rVert^2$ still, and the normal equations still find the optimal $\theta$.
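Concretely, only the design matrix changes; the solve is identical (synthetic two-feature data of my own, generated from $2u - v + 0.5$):

```python
import numpy as np

# synthetic data: y = 2u - v + 0.5 plus a little noise
rng = np.random.default_rng(1)
n = 200
u = rng.random(n)
v = rng.random(n)
y = 2.0 * u - 1.0 * v + 0.5 + 0.01 * rng.standard_normal(n)

# rows (u_i, v_i, 1); theta = (a, b, c) -- same normal equations as before
X = np.column_stack([u, v, np.ones(n)])
theta = np.linalg.solve(X.T @ X, X.T @ y)
```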
Polynomial Regression
Suppose instead of having linear $f$, $f$ was a polynomial function (perhaps $f(x) = a x^2 + b x + c$). This framework is still capable of handling this, in particular by making the $i$-th row of $X$ equal to $(x_i^2, x_i, 1)$ and setting $\theta = (a, b, c)$. We can verify that $J(\theta) = \lVert X\theta - y \rVert^2$ still, and the normal equations still find the optimal $\theta$.
Notice that this is still a linear problem, as any polynomial is a linear combination of the function basis composed of powers of $x$.
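The same solve again, now with powers of $x$ as the columns of the design matrix (synthetic data of my own, sampled from $x^2 - 2x + 0.5$ with noise):

```python
import numpy as np

# synthetic noisy samples of y = x^2 - 2x + 0.5
rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 100)
y = 1.0 * x**2 - 2.0 * x + 0.5 + 0.02 * rng.standard_normal(x.size)

# rows (x_i^2, x_i, 1): the problem is nonlinear in x but linear in theta
X = np.column_stack([x**2, x, np.ones_like(x)])
theta = np.linalg.solve(X.T @ X, X.T @ y)
```

This is why "linear" regression fits parabolas perfectly well: linearity refers to the parameters, not the basis functions.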
Generalized Regression and Iterative Descent/Optimization Methods
To summarize, in generalized regression we must
- Perform regression (rather than exact interpolation)
- Handle high dimensional data
- Guess an arbitrary function (which cannot be eyeballed due to complexity and the aforementioned high dimensionality)
There are many nice properties of linear regression which break down very quickly when generalizing. First, we are no longer guaranteed a convex cost function, hence we cannot simply set the gradient to 0 to find the global minimum. We are also no longer necessarily given nicely weighted linear combinations of functions, hence we cannot simply construct the $X$ matrix as before.
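When a closed-form solve is unavailable, the standard fallback is an iterative descent method such as gradient descent: repeatedly step the parameters against the gradient of the cost. As a minimal sketch (on a line-fit problem with synthetic data of my own, so the answer can be checked against the closed form; in the non-convex general case the same loop only promises a local minimum):

```python
import numpy as np

# synthetic noisy samples of y = 3x + 2
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
y = 3.0 * x + 2.0 + 0.05 * rng.standard_normal(x.size)
X = np.column_stack([x, np.ones_like(x)])

theta = np.zeros(2)   # initial guess for (a, b)
lr = 0.1              # step size / learning rate

for _ in range(5000):
    # gradient of the mean squared error J(theta) = mean((X theta - y)^2)
    grad = 2.0 * X.T @ (X @ theta - y) / len(y)
    theta -= lr * grad   # step downhill
```

On this convex problem the loop converges to the same solution the normal equations would give; for a genuinely nonlinear model one differentiates the model itself (by hand or by automatic differentiation) and runs the same loop, accepting that the starting point now matters.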