Wed, 08 Jan 2014
When QuickCheck Fails Me
This is an old trick I picked up from a colleague over a decade ago and have re-invented or re-remembered a number of times since.
When I'm implementing a complicated, performance-critical algorithm and things don't work immediately, the best idea is to drop back to the old formula of:
- Make it compile.
- Make it correct.
- Make it fast.
Often that means implementing slow, naive versions of parts of the algorithm first and then replacing the slow versions with fast versions one by one. For a given function of two inputs, this might give me two functions with identical type signatures:
functionSlow :: A -> B -> C
functionFast :: A -> B -> C
that can be used interchangeably.
When it comes to implementing the fast versions, the slow versions can be used to check the correctness of the fast versions using a simple QuickCheck property like:
\ a b -> functionFast a b == functionSlow a b
This property basically just asks QuickCheck to generate a, b pairs, pass these to both functions and compare their outputs.
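As a concrete illustration of such a property, here is a minimal sketch. The functions rangeSumSlow and rangeSumFast are hypothetical stand-ins for functionSlow and functionFast (they are not from any real algorithm of mine), and the sketch assumes the QuickCheck package is installed:

```haskell
import Test.QuickCheck (quickCheck)

-- Hypothetical slow reference: sum the integers from a to b by enumeration.
-- For a > b the list is empty, so the result is 0.
rangeSumSlow :: Int -> Int -> Int
rangeSumSlow a b = sum [a .. b]

-- Hypothetical fast version: closed-form sum of an arithmetic series.
rangeSumFast :: Int -> Int -> Int
rangeSumFast a b
  | a > b     = 0
  | otherwise = (a + b) * (b - a + 1) `div` 2

-- The property from the text, given a name and a concrete type.
prop_fastMatchesSlow :: Int -> Int -> Bool
prop_fastMatchesSlow a b = rangeSumFast a b == rangeSumSlow a b

main :: IO ()
main = quickCheck prop_fastMatchesSlow
```

Giving the property a monomorphic type signature tells QuickCheck which generators to use for a and b.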
With something like this, QuickCheck usually finds all the corner cases really quickly. Except for when it doesn't. QuickCheck uses a random number generator to generate inputs to the function under test. If, for instance, you have a function that takes a floating point number and only behaves incorrectly when that input is, say, exactly 10.3, the chances of QuickCheck generating exactly 10.3 and hitting the bug are very small.
For exactly this reason, using the technique above sometimes doesn't work. Sometimes the fast version has a bug that QuickCheck wasn't able to find.
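The same effect can be sketched with an integer instead of a floating point number. The functions below are hypothetical, and the bug at 123456789 is planted deliberately. As far as I know, QuickCheck's default Int generator is size-bounded and favours small values, so a random run is very unlikely to reach that input:

```haskell
import Test.QuickCheck (quickCheck)

-- Hypothetical slow reference: double a number by multiplication.
doubleSlow :: Int -> Int
doubleSlow x = x * 2

-- Hypothetical fast version with a deliberate bug at one specific input.
doubleFast :: Int -> Int
doubleFast x
  | x == 123456789 = 0
  | otherwise      = x + x

prop_doubleMatches :: Int -> Bool
prop_doubleMatches x = doubleFast x == doubleSlow x

main :: IO ()
main = do
  -- Random testing: unlikely to generate the one bad input.
  quickCheck prop_doubleMatches
  -- Evaluating the property directly at the bad input exposes the bug.
  print (prop_doubleMatches 123456789)
```

The property is false at exactly one input, but nothing in the random generator steers it towards that input.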
When this happens the trick is to write a third function:
functionChecked :: A -> B -> C
functionChecked a b =
    let fast = functionFast a b
        slow = functionSlow a b
    in if fast == slow
        then fast
        else error $ "functionFast " ++ show a ++ " " ++ show b
                ++ "\nreturns " ++ show fast
                ++ "\n should be " ++ show slow
which calculates the function output using both the slow and the fast versions, compares the outputs and fails with an error if the two function outputs are not identical.
Using this in my algorithm I can then collect failing test cases that QuickCheck couldn't find. With a failing test case, it's usually pretty easy to fix the broken fast version of the function.
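To show the checked wrapper catching a mismatch, here is a small self-contained sketch. The rangeSum functions are hypothetical, the bug in rangeSumFast is planted on purpose, and the try/evaluate wrapping in main is only there so the demo can print the message instead of crashing:

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Hypothetical slow reference: sum the integers from a to b by enumeration.
rangeSumSlow :: Int -> Int -> Int
rangeSumSlow a b = sum [a .. b]

-- Hypothetical fast version with a deliberate bug: it does not handle
-- the empty range (a > b) separately.
rangeSumFast :: Int -> Int -> Int
rangeSumFast a b = (a + b) * (b - a + 1) `div` 2

-- Checked wrapper in the style described above: run both versions and
-- fail loudly, reporting the inputs, if they disagree.
rangeSumChecked :: Int -> Int -> Int
rangeSumChecked a b =
  let fast = rangeSumFast a b
      slow = rangeSumSlow a b
  in if fast == slow
       then fast
       else error $ "rangeSumFast " ++ show a ++ " " ++ show b
                 ++ "\nreturns " ++ show fast
                 ++ "\n should be " ++ show slow

main :: IO ()
main = do
  -- An input where both versions agree.
  print (rangeSumChecked 1 10)
  -- An input where they disagree: catch the error and print its message.
  result <- try (evaluate (rangeSumChecked 5 1)) :: IO (Either ErrorCall Int)
  case result of
    Left (ErrorCall msg) -> putStrLn msg
    Right v              -> print v
```

In a real program the wrapper would simply replace calls to the fast function while debugging, and the error message gives the exact inputs to turn into a regression test.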