Say you formulated a recursive solution to some algorithmic problem – you can either implement the recursion directly, or do a top-down Dynamic Programmic (DP) approach. The direct approach is usually easier, but can lead to a Stack Overflow exception (SO) relatively soon in a language with low stack depth, such as Java.
Take the following Java code that computes the n-th Fibonacci number directly from the recurrence:
Try running fib(1000)
, it’s too slow because we are recomputing answers we already know. How about we memoize it:
The idea is to use an array FIBS
in which we store the computed values. A \(-1\) signifies we still haven’t computed an answer. (Note that we take the answers modulo another number so that we don’t overflow).
The above code easily solves fib(1000)
. Say now I ask you to compute the 10,000th Fibonacci number (modulo mod
). Try running the above code, and you will also get a stackoverflow error, since the default (at least in Eclipse) Java has a rather low stack size. We have two ways to fix this – one of them is to use a bottom-up approach to fill the DP table using only values we already know. Before looking at the second way, let’s look at some more theory of DP.
1 Dynamic Programming Theory
Suppose you formulated a recursive solution to an algorithmic problem of yours. Informally, if your formulation:
- Has a base case
- The recursive formulation is made up of formulations that are ‘closer’ to the base case.
you can use DP to solve it more efficiently.
More formally, let \(G=(V,E)\) be the directed acyclic graph (DAG) of the recurrence relation defining our formulation, where \(V\) is all the subproblems and \((u,v) \in E\) iff the solution to problem \(v\) depends on the solution to problem \(u\). Since \(G\) is a DAG (if it’s not your problem cannot be solved using this method), you can construct a topological sort of \(G\), \(G_{sort}\). A topological sort of a DAG \(G\) is a permutation of the vertices of \(G\), \(p(V)\) such that \(\forall (u,v) \in E\), \(u\) appears before \(v\) in \(p(V)\) (\(p(V)_i = u \wedge p(V)_j = v \Leftrightarrow i < j\)). Notice that there can be many valid choices of \(p\).
If you want to come up with a bottom-up solution, you can traverse your problem space in the order of \(p(V)\) until you reach the subproblem that you are interested in. Coding-wise, this is many times difficult or bug prone when compared to the top-down way since you need to figure out a good way to visit the problem space.
Alternatively, you can do it in a top-down way using recursion and your initial problem specification. This is akin to doing a DFS on the inverted graph of \(G\), \(G_{inv} = (V,E_{inv})\) where \(E_{inv} = \{(a,b) | (b,a) \in E \}\) until you explore all the nodes reachable from the state you desire an answer to. The problem with this, is that if it’s directly implemented, you might run out of stack space when running your DFS. A solution, which takes up minimal extra coding time, is to re-run your recursive algorithm on intermediate subproblems so that solving them won’t produce a stack overflow. In practice, it might be difficult to figure out the choice of subproblems to guarantee success, but you should be able to come up with a conservative estimate.
2 Fibonacci Examples
2.1 Bottom-up DP
The bottom-up approach usually involves an array to store the computed results and a couple of loops to go through the whole problem space.
Running fibDP(10000)
is effortless.
2.2 Top-Down Recursive
The previous top-down approach was faulty due to stack-overflow errors for large values of \(n\). We can fix this by computing some intermediaries:
Running fibTopDown(100000)
is effortless because at each step we have at most \(1000\) recursion steps which means we won’t run into a stack-overflow error.
3 Conclusion
In this article we’ve seen a neat trick to ensure that a stack overflow error does not happen when solving a DP problem in a top-down recursive fashion. I’m not entirely sure I would recommend this as the ‘the-end’ solution (try rewriting it bottom-up), but if you are pressed for time in some programming contest and need to make sure your top-down recursive solution will work for large input cases, use this trick! Indeed, I got the idea from this topcoder post. If you are worried, I of course know the simple example of Fibonacci can be memoized with only a two-element DP table, but for any real-life problem, it most likely won’t be so simple and you can spare a bit more memory.