The Nitty-Gritty

To make this fast, we have to know why it's slow. This usually involves measuring the program execution time and seeing what parts of it are "hot" and causing the slowdown. (It cannot be overstated how important measurement is in optimization. The only way to know what's fast and what's slow is by testing it: everything else is guesswork at best.)
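In Python, that measurement step can be as simple as running the code under the standard-library profiler. This is a generic sketch, not the post's actual benchmark: the `slow_square` and `work` functions here are hypothetical stand-ins, just to show what a "hot" function looks like in the output.

```python
import cProfile
import io
import pstats

def slow_square(x):
    # Deliberately naive repeated addition, so it shows up as "hot".
    total = 0
    for _ in range(x):
        total += x
    return total

def work():
    return sum(slow_square(i) for i in range(2000))

# Profile work() and print the functions sorted by cumulative time;
# the entries at the top are where optimization effort should go.
profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The guesswork-free version of "why is it slow" is whatever function dominates that cumulative-time column.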

I'll save you the trouble of testing this yourself: basically all of matrix_fib's runtime is computing matrix exponents. Specifically, computing $M^2, M^4, M^8, \dots$ takes a long time, and depending on how many of those we need to multiply together at the end, that can contribute a significant fraction of the runtime as well. Computing matrix exponents is dominated by performing multiplications.
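Since matrix_fib itself isn't shown in this excerpt, here is a sketch of what its exponentiation step plausibly looks like; the helper names `mat_mul` and `mat_pow` are mine, not the post's.

```python
def mat_mul(a, b):
    # 2x2 integer matrix product.
    return [
        [a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
        [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]],
    ]

def mat_pow(m, n):
    # Binary exponentiation: repeatedly square m (giving M^2, M^4, M^8, ...)
    # and fold in the squares selected by the bits of n.
    result = [[1, 0], [0, 1]]  # identity
    while n:
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)  # the squarings here dominate the runtime
        n >>= 1
    return result

M = [[1, 1], [1, 0]]
print(mat_pow(M, 10)[0][1])  # F(10) = 55
```

For large n the entries of these matrices are huge integers, which is why the big-integer multiplications inside `mat_mul` end up being the hot spot.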

As you probably picked up in grade school, multiplications are trickier than additions. Computing $269 + 135$ the standard way requires adding together each column of numbers, being sure to carry the 1s when necessary. Computing $269 \times 135$ the standard way requires multiplying every digit of one number by every digit of the other, which is a lot more work. The grade-school method of multiplication is just one way of organizing it, but this core fact remains: note how each digit in the individual products we add to get the final product comes from the product of a pair of the original digits.1
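That digit-pair structure is easy to make concrete. Here's a small sketch of long multiplication that also counts the single-digit products it performs; the function name `long_multiply` is mine.

```python
def long_multiply(x, y):
    # Grade-school multiplication: every digit of x times every digit of y,
    # shifted by the right power of ten. For n-digit times m-digit inputs,
    # that's n * m single-digit products.
    xs = [int(d) for d in str(x)][::-1]  # least-significant digit first
    ys = [int(d) for d in str(y)][::-1]
    total = 0
    count = 0
    for i, dx in enumerate(xs):
        for j, dy in enumerate(ys):
            total += dx * dy * 10 ** (i + j)
            count += 1
    return total, count

print(long_multiply(269, 135))  # (36315, 9): 3 digits x 3 digits = 9 products
```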

Another important fact to realize is that squaring can be significantly faster than multiplication. If we're squaring a 3-digit number, instead of 9 individual pairs of digits to multiply, we only need to multiply 6. (For $abc \times abc$, we need to compute $a \times b$, $a \times c$, $b \times c$, $a^2$, $b^2$, and $c^2$.)
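The saving comes from symmetry: each cross term like $a \times b$ appears twice in the full expansion, so we can compute it once and double it. A sketch that squares a number this way and counts the distinct digit products (the name `square_digit_products` is mine):

```python
def square_digit_products(x):
    # Squaring via digit pairs, exploiting symmetry: each cross product
    # d_i * d_j (i != j) appears twice in the expansion, so compute it
    # once and double it. For 3 digits: 3 squares + 3 cross terms = 6.
    ds = [int(d) for d in str(x)][::-1]  # least-significant digit first
    total = 0
    products = 0
    for i in range(len(ds)):
        total += ds[i] * ds[i] * 10 ** (2 * i)  # the squares: a^2, b^2, c^2
        products += 1
        for j in range(i + 1, len(ds)):
            total += 2 * ds[i] * ds[j] * 10 ** (i + j)  # doubled cross term
            products += 1
    return total, products

print(square_digit_products(269))  # (72361, 6): 269^2 with only 6 products
```

In general an $n$-digit square needs $n$ squares plus $n(n-1)/2$ cross products, versus $n^2$ products for a general multiplication — roughly half the work.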

So our goal is to eliminate expensive multiplications as much as possible, and when we do have to multiply, to prefer squaring integers. How can we do that for our specific problem?


  1. Multiplying $n$-digit numbers seems to require $n^2$ digit multiplications, but in fact it's possible to do fewer than that. As it turns out, that's not especially relevant here, so I'm not going to talk about it that much, but if I just piqued your interest, try Googling Karatsuba multiplication and go from there.
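For the curious, here is a minimal sketch of the Karatsuba trick the footnote mentions: it replaces the four half-size multiplications of the naive split with three, which is where the sub-$n^2$ cost comes from. This is a standard textbook formulation, not code from the post.

```python
def karatsuba(x, y):
    # Base case: single-digit operands are multiplied directly.
    if x < 10 or y < 10:
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)
    y_hi, y_lo = divmod(y, base)
    hi = karatsuba(x_hi, y_hi)
    lo = karatsuba(x_lo, y_lo)
    # The third multiplication recovers both cross terms at once:
    # (x_hi + x_lo)(y_hi + y_lo) - hi - lo = x_hi*y_lo + x_lo*y_hi.
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo
    return hi * base * base + mid * base + lo

print(karatsuba(269, 135))  # 36315
```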
