True, at least the 4GB tables could now be replaced with a single 128kB table, but that's still not how the hardware works, and it wouldn't be possible to generate proper keys for the games that are missing the full 8GB tables, which is the main concern.
From the full tables, we can extract a 96-bit subkey for every address.
Edit: I had a link to a file here. I've removed it since I've found an error which I'll correct soon.
These wouldn't be the actual keys used by the hardware, however I think me and Andy have agreed that they are, apart from a fixed XOR and bit permutation that wouldn't change from game to game. So we can forget about this step of the decryption for the time being, and move on.
The problem now is to understand how the hardware generates the 96-bit subkey starting from the address and from the global key.
I have determined that only 16 bits are needed to generate the 96-bit subkey; this was sort of expected, but it isn't without problems.
We can generate only 94 bits of the subkey starting from those 16 bits and using only XOR operations. The remaining 2 bits need AND/OR operations, something which I have no explaination for at this point.
That was the least of the problems anyway. The major hurdle at this point is that the mapping from address to 16-bit key seed is very far from trivial.
The scheme should be as follows:
- take 16-bit address, N-bit global key, and generate a 16-bit key seed using an unknown algorithm.
- take 16-bit key seed, M-bit global key, and generate 96-bit subkey using an unknown algorithm.
- take 16-bit ciphertext, 96-bit subkey, and generate 16-bit plaintext using the algorithm we have discovered.
I think it would make sense for the algorithm in step 1. to be another 4-round Feistel network.
If this is the case, things are quite harder than before. To break the other Feistel network, we could rely on complete knowledge of ciphertext-plaintext relationship. Now we can't: we only have a vague idea of what the 16-bit key seed could be.
If we could rely on a 16-bit value except for a constant XOR and permutation, it wouldn't be a problem, since that wouldn't change the nature of the Feistel network. Unfortunately, we don't have that luxury.
Let's see that with an example. Let's say that the first algorithm generates a 4-bit key seed, abcd, which is expanded into the 6-bit subkey ABCDEF this way:
A = a
B = a XOR b
C = a XOR c
D = b XOR c
E = b XOR d
F = b XOR c XOR d
We don't know anything about abcd, all we see is ABCDEF, but we need to guess what abcd looks like. So we notice that
D = B XOR C
F = A XOR C XOR E
and we decide that
a' = A
b' = B
c' = C
d' = E
so we would have
A = a'
B = b'
C = c'
D = b' XOR c'
E = d'
F = a' XOR c' XOR d'
This all works as far as generating the subkey from the seed goes, the problem is that abcd and a'b'c'd' are two completely different numbers! We have that
a'b'c'd' = abcd XOR aab
so this isn't simply a XOR with a constant value, it's a variable modification of the number. And this is something that the Feistel network cannot handle.