The Nitty-Gritty
To make this fast, we have to know why it's slow. This usually involves measuring the program execution time and seeing what parts of it are "hot" and causing the slowdown. (It cannot be overstated how important measurement is in optimization. The only way to know what's fast and what's slow is by testing it: everything else is guesswork at best.)
I'll save you the trouble of testing this yourself: basically all of matrix_fib's runtime is spent computing matrix exponents. Specifically, computing each repeated square of the matrix ($M^2$, $M^4$, $M^8$, and so on) takes a long time, and depending on how many of those we need to multiply together at the end, that final combining step can contribute a significant fraction of the runtime as well. Computing matrix exponents is dominated by performing multiplications.
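To make that concrete, here is a minimal sketch of matrix exponentiation by repeated squaring. The helper names (mat_mul, mat_pow) and the exact structure are my own illustration, not necessarily what matrix_fib looks like inside, but the shape is the same: essentially all the work happens in the big-integer multiplications inside mat_mul.

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices of (arbitrarily large) Python ints."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def mat_pow(m, n):
    """Compute m**n by repeated squaring; the mat_mul calls (big-integer
    multiplications) are where essentially all the time goes."""
    result = ((1, 0), (0, 1))            # 2x2 identity
    while n:
        if n & 1:
            result = mat_mul(result, m)  # fold in the power for this set bit
        m = mat_mul(m, m)                # square: m, m^2, m^4, m^8, ...
        n >>= 1
    return result

def matrix_fib(n):
    """F(n) read off from the n-th power of the Fibonacci matrix [[1, 1], [1, 0]]."""
    return mat_pow(((1, 1), (1, 0)), n)[0][1]

print(matrix_fib(10))  # 55
```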
As you probably picked up in grade school, multiplications are trickier than additions. Computing $a + b$ the standard way requires adding together each column of digits, being sure to carry the 1s when necessary. Computing $a \times b$ the standard way requires multiplying every digit of $a$ with every digit of $b$, which is a lot more work. The grade-school method of multiplication is just one way of organizing it, but this core fact remains: note how each of the individual products we add up to get the final result comes from a pair of the original digits.1
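Here is that pairwise-digit fact in runnable form. This is a deliberately naive illustration, not something you'd use in practice (Python's built-in `*` is far better):

```python
def grade_school_multiply(x, y):
    """Multiply x and y by summing one single-digit product per pair of digits,
    each weighted by its place value -- the grade-school method, spelled out."""
    xs = [int(d) for d in str(x)][::-1]  # least-significant digit first
    ys = [int(d) for d in str(y)][::-1]
    total, count = 0, 0
    for i, a in enumerate(xs):
        for j, b in enumerate(ys):
            total += a * b * 10 ** (i + j)  # one single-digit product per digit pair
            count += 1
    return total, count

print(grade_school_multiply(123, 456))  # (56088, 9): two 3-digit numbers -> 9 digit products
```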
Another important fact to realize is that squaring can be significantly faster than multiplication. If we're squaring a 3-digit number, instead of 9 individual pairs of digits to multiply, we only need to multiply 6. (For a number with digits $a$, $b$, $c$, we need to compute $a^2$, $b^2$, $c^2$, $ab$, $ac$, and $bc$; the symmetric products like $ba$ are duplicates.)
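A quick way to see the savings is to count the digit pairs directly; this little counter is purely illustrative:

```python
def square_product_pairs(x):
    """The distinct digit pairs whose products we need when squaring x:
    positions (i, j) with i <= j, since the (j, i) product is the same."""
    ds = [int(d) for d in str(x)]
    return [(ds[i], ds[j]) for i in range(len(ds)) for j in range(i, len(ds))]

print(len(square_product_pairs(123)))  # 6 pairs, versus 3 * 3 = 9 for a general product
```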
So our goal is to eliminate expensive multiplications as much as possible, and, when we do have to multiply, to prefer squaring. How can we do that for our specific problem?
- Multiplying two $n$-digit numbers in fewer than $n^2$ single-digit multiplications might seem impossible, but in fact it can be done. As it turns out, that's not especially relevant here, so I'm not going to talk about it much, but if I just piqued your interest, try Googling Karatsuba multiplication and go from there.↩