No longer "holding on to the head"
In a recent reddit discussion, joinr mentions that holding on to the head of lazy sequences (especially large ones) is detrimental to performance. But what does "holding on to the head" even mean?
Turns out I made this mistake in my 8 queens solver as well (complete code):
```clojure
;; holds onto the head
(def all-combinations
  (apply c/cartesian-product all-positions))

;; ...

;; causes realization of the huge list
(println (count result))
```
Holding on to the head means there's a var in the code, "all-combinations" in my case, that references the first element (the head) of the lazy sequence. As long as that reference exists, the garbage collector cannot free any of the already-realized elements, because they all remain reachable from the head. I knew that, but I did not realize what a huge difference it would make.
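To make the failure mode concrete, here is a minimal, hypothetical sketch (not taken from the solver) of the two situations:

```clojure
;; Holding on to the head: the var `xs` references the head of the
;; lazy seq, so every element realized while counting stays reachable
;; and the GC cannot reclaim it. On a big enough seq this exhausts memory.
(def xs (filter odd? (range 10000000)))
;; (count xs)  ; realizes the whole seq and retains every element

;; Not holding on: no reference to the head survives the expression,
;; so already-counted elements can be garbage-collected as we go.
(count (filter odd? (range 10000000)))  ;; => 5000000
```

The crucial detail: `(def xs ...)` by itself is harmless (nothing is realized yet); the damage happens when the seq is traversed *while* the var still points at its head.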
Calculation times using the same algorithm:
- Holding on to the head: "Elapsed time: 109568.375293 msecs" (almost 2 minutes)
- Letting the GC free memory while calculating: "Elapsed time: 38448.036158 msecs" (38 seconds)
A nearly threefold speedup just from allowing the garbage collector to free unused memory? That is huge!
What to keep in mind with holding on to the head:
- The style of the program depends strongly on where and how it is used: Holding on to the head is really useful when developing interactively at the REPL. You can save the results of expensive calculations that way. The problem is that production software has to be written a little differently, i.e. with performance in mind.
- Execution time does not help detect this: "(def all-combinations…" returns really fast because the huge list hasn't been realized yet (it's a lazy seq, after all). Only when "(count result)" is executed is the list realized. That is where the time is spent, but not where the head is held.
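This split between definition and realization is easy to observe with a side-effecting counter (a small illustrative sketch, not part of the solver):

```clojure
;; `def` returns immediately: the lazy seq is only a recipe.
(def realized (atom 0))
(def xs (map (fn [x] (swap! realized inc) x) (range 100)))
@realized   ;; => 0, nothing has been computed yet

;; Forcing the seq (here via count) is what triggers the work.
(count xs)  ;; => 100
@realized   ;; => 100
```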
My previous algorithm is extremely dumb (i.e. brute force). It generates all 16.7M board positions and then filters out the ones where queens can beat each other. That makes it super-exponential, more precisely O(n^n), where n is the dimension of the board.
A more efficient approach is to reduce the number of combinations during the calculation: e.g. if two queens in the top rows get in each other's way, stop right there. This avoids trying to place queens in the rows below, since that can never lead to a valid solution.
My improved algorithm starts with all board positions (a list of rows, each itself a list of positions to place queens on) and an empty list of chosen positions (no queens placed yet). It works from the top down and places a queen in every row. Thus, while the list of row positions shrinks, the list of chosen positions grows.
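The actual setup lives in the linked complete code; as a rough sketch, assuming positions are represented as `[row col]` pairs, a hypothetical `board-rows` could build the starting data like this:

```clojure
;; A seq of rows for an n×n board, each row a seq of [row col]
;; positions where a queen could be placed.
(defn board-rows [n]
  (for [r (range n)]
    (for [c (range n)]
      [r c])))

(board-rows 2)  ;; => (([0 0] [0 1]) ([1 0] [1 1]))
```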
- Given a list of board rows (where no queen has been placed yet) and a list of previously chosen positions (in rows above), see if there are more rows to place queens on:
- if not, we’ve reached the bottom of the board. Return the previously chosen positions
- if there are more rows, map over every available position in the first row and for each position find the solutions recursively like so:
- add that position to the list of chosen positions
- from the rest of the rows remove all positions that get in the way of the newly chosen position
- call the search for solutions recursively, with the filtered positions and the new list of chosen positions
Here’s the respective function (complete code):
```clojure
(defn solutions [rows chosen]
  (if (empty? rows)
    [chosen]
    (mapcat (fn [new-pos]
              (solutions (map #(remove (partial beats? new-pos) %)
                              (rest rows))
                         (conj chosen new-pos)))
            (first rows))))
```
"Removing all positions from the rest of the rows" is the magic step. It removes all positions that get in the way of already placed queens and thus avoids doing unnecessary work.
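The `beats?` predicate is defined in the linked complete code; a plausible version, assuming `[row col]` pairs, might look like this:

```clojure
;; Two queens beat each other if they share a column or a diagonal.
;; (Rows are distinct by construction: one queen per row.)
(defn beats? [[r1 c1] [r2 c2]]
  (or (= c1 c2)
      (= (Math/abs (- r1 r2))
         (Math/abs (- c1 c2)))))

(beats? [0 0] [1 1])  ;; => true  (same diagonal)
(beats? [0 0] [1 2])  ;; => false (knight's-move apart)
```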
The speedup is astonishing: "Elapsed time: 30.79253 msecs" — more than 1000 times faster for an 8×8 board. It's pretty fast for bigger boards as well: "Elapsed time: 9065.330115 msecs" for 14,200 results on a 12×12 board.
Such a speedup can only come from a reduction in algorithmic complexity.
My first guess was that the new algorithm is at O(n^4), or maybe even O(n^3), due to the elimination of possibilities. After thinking about it a little more, my guess is that it is now O(n!) instead of O(n^n). Faster strategies probably require randomized algorithms. Anyone care to do a proper analysis? 😉
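One way to motivate the O(n!) guess: the first queen has n column choices, the second at most n−1 (its column has been pruned away), and so on, so the pruned search tree has at most n! leaves instead of n^n. A quick back-of-the-envelope comparison:

```clojure
;; Unpruned search space: n^n candidate boards.
(defn n-pow-n [n] (reduce * (repeat n n)))

;; Upper bound after pruning same-column clashes alone: n! boards.
(defn fact [n] (reduce * (range 1 (inc n))))

(n-pow-n 8)  ;; => 16777216 (the 16.7M from the brute-force version)
(fact 8)     ;; => 40320
```

Diagonal pruning shrinks the tree even further, which fits the 2057 calls observed below.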
Edit: In its depth-first search strategy, my variant is similar to the backtracking algorithm used in a lot of the n-queens examples. It differs, however, in that it does not backtrack: it prevents even trying invalid queen placements by eliminating those positions beforehand. As I've learned, this is called search space pruning. What I found interesting is that my algorithm only checks 2057 positions for an 8×8 board, i.e. my "solutions" function is called 2057 times. Roughly a third of those calls terminate early (fewer than 8 queens placed and already unsolvable).