Eventually, I ended with a layout which I'm 99.9% sure is equivalent to the real one. We cannot know the exact contents of the real s-boxes without getting them from the actual hardware, but the current ones should be matematically equivalent.
The result is here: http://xoomer.alice.it/nicola.salmoria/cps2crptv2.zip.
The most notable news is that the key is now reduced to 64 bits, and the one we are currently using should be identical to the one used by the hardware, apart from a fixed permutation of the bits.
Finding the real permutation would be nice, but obviously that's not something we can determine from the algorithm, since the order of the bits of the key is completely irrelevant.
What is interesting to note is that the keys used by some games don't seem to be random. If they were random one would expect there to be around 32 0s and 32 1s, but sometimes this isn't the case. E.g.
Of these, the last three literally scream "I'm not a random number!". Guessing the right bit order to make something appear, of course, is another matter.
Some of the watchdog values contain birth dates, e.g. cmpi.l #$19660419,D1, so I expect the same thing might be happening here.
Also, it makes sense for the pzloop2 one to be more regular than the others because it's third party game.
On the key extraction front, things are going reasonably well. The brute force attack described in the previous article is working decently on most games, however for some of them the available data isn't enough. I'll have a more precise list once I've finished going through all the games. After that, we'll need to devise a better attack if we want to get the missing keys.
The discovery that the key is only 64 bits might help to construct a better attack, though at the moment I don't have many ideas. The fact that the algorithm is divided in two parts, with the output of the first one affecting the key on the second part, complicates things.