Recursive functions
I explore recursive functions in Python with some ML applications.
Often I deal with methods that are built on a recursive structure, for example gradient descent, time series, or recursive sequences in general like the Fibonacci sequence. By a recursive sequence I have in mind something of the form $x_n(x_{n-1})$, meaning that $x_n$ depends on its "past" value in some way. In gradient descent this looks, for example, as follows:
$\beta_n = \beta_{n-1} - \nu \nabla f(\beta_{n-1})$
Usually I implement this with a simple for loop, but today I want to implement it with the recursion capabilities of Python. On the way I want to develop a general framework for writing recursive functions that solve arbitrary problems of the form $x_n(x_{n-1}, x_{n-2}, x_{n-3}, \dots)$.
Let's start by writing a recursive function for the factorial to get a feel for it. Remember, the factorial is defined as
$n! = n * (n-1) * (n-2) * \dots * 1$
which can be written as
$n! = n * (n-1)!$, i.e. $x_n = n * x_{n-1}$
def fact(x):
    return x * fact(x - 1)
fact(5)
Hmm, what happened? I clearly wrote a function that never terminates and hence would run forever. To guard against this, Python limits the maximum recursion depth and raises a RecursionError once that depth is exceeded. The current limit can be inspected with the following code:
import sys
sys.getrecursionlimit()
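As a small aside (my addition, not needed for the rest of the post): the limit can also be raised with sys.setrecursionlimit, and the RecursionError thrown by the base-case-free fact above can be caught like any other exception.

import sys

# raise the maximum recursion depth (use with care: a recursion that is too
# deep can still crash the interpreter with a stack overflow)
sys.setrecursionlimit(5000)

# the fact without a base case eventually hits the limit
try:
    fact(5)
except RecursionError as err:
    print("recursion limit reached:", err)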
Of course our sequence has a natural end, namely when a sequence member reaches 1. So we have to "break" out of the recursion with a base case.
def fact(x):
    # base case: 1! = 1 ends the recursion
    if x == 1:
        return 1
    return x * fact(x - 1)
fact(5)
Much better! Now we increase the complexity a bit by considering a sequence that depends on two of its past values. A natural candidate is the Fibonacci sequence, which is defined as follows:
$x_n = x_{n-1} + x_{n-2}$ with $x_1 = 1$ and $x_0 = 0$
def fib(n):
    # x_1 = 1
    if n == 1:
        return 1
    # x_0 = 0
    if n == 0:
        return 0
    return fib(n - 1) + fib(n - 2)
fib(10)
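One caveat worth noting (my addition): the naive recursive fib recomputes the same subproblems over and over, so the number of calls grows exponentially with n. A standard remedy that keeps the recursive structure is to cache results, for example with functools.lru_cache:

from functools import lru_cache

@lru_cache(maxsize=None)
def fib_cached(n):
    # same recursion as above, but every computed value is memoized
    if n == 1:
        return 1
    if n == 0:
        return 0
    return fib_cached(n - 1) + fib_cached(n - 2)

fib_cached(100)  # returns instantly, while the naive version would take ages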
Next I will implement gradient descent in a recursive way.
import numpy as np

def gradient_descent(estimate, old_estimate, gradient, X, y, step_size=.01, tol=.000001):
    # base case: stop once the estimate barely changes between iterations
    if np.abs(old_estimate - estimate).sum() < tol:
        return estimate
    old_estimate = estimate.copy()
    estimate = estimate - step_size * gradient(estimate, X, y)
    # pass step_size and tol along so non-default values survive the recursion
    return gradient_descent(estimate, old_estimate, gradient, X, y, step_size, tol)
def gradient(estimate, X, y):
    # gradient of the mean squared error of the linear model y ≈ X @ estimate
    n, p = X.shape
    return -2 / n * X.T @ (y - X @ estimate)
X = np.random.randn(10000, 2)
beta = np.array([1, 2])
y = X @ beta + np.random.randn(10000)
gradient_descent(np.array([0, 0]), np.array([1, 1]), gradient, X, y)
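As a quick sanity check (my addition, not part of the original write-up), the recursive solver should land close to the closed-form least squares solution, and both should roughly recover the true $\beta = (1, 2)$ used to simulate the data. One caveat: every iteration adds a stack frame, so a much smaller step size or tighter tolerance could run into the recursion limit discussed above.

# hypothetical sanity check: compare recursive gradient descent with the
# closed-form least squares solution
beta_gd = gradient_descent(np.array([0.0, 0.0]), np.array([1.0, 1.0]), gradient, X, y)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_gd)   # should be roughly [1, 2]
print(beta_ols)  # should also be roughly [1, 2]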
In my opinion this is a very clean way of doing gradient descent, as it avoids the for loop. And with some small tweaks this could be generalized so that you simply pass a different gradient function, say from a logistic regression, and get back the solution, as sketched below.
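To illustrate what that could look like (a minimal sketch of my own, with hypothetical names sigmoid, logistic_gradient and y_binary): the recursive driver stays exactly the same, only the gradient function changes. For logistic regression with labels in $\{0, 1\}$, the gradient of the average negative log-likelihood is $\frac{1}{n} X^T (\sigma(X\beta) - y)$, where $\sigma$ is the sigmoid. A larger step size is used here because this gradient is much flatter than the squared-error one.

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def logistic_gradient(estimate, X, y):
    # gradient of the average negative log-likelihood of logistic regression
    n, p = X.shape
    return X.T @ (sigmoid(X @ estimate) - y) / n

# simulate binary labels from the same X and reuse the recursive solver
y_binary = (np.random.rand(10000) < sigmoid(X @ beta)).astype(float)
gradient_descent(np.array([0.0, 0.0]), np.array([1.0, 1.0]), logistic_gradient, X, y_binary, step_size=1.0)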