Author Topic: Cheat Code Checksum (Read 2425 times)

Hooman · « **on:** January 05, 2008, 09:05:39 PM »

Some people were curious about some code I used to brute force the 4th cheat using the value in the checksum table, so here it is.

It basically just loads a dictionary of English words, forms all possible strings of the form "Dan's X could Y more Z.", where X, Y, and Z are from the dictionary, computes the checksum of each string, and stops if it finds a string with a desired result. It currently only checks for the hash of the 4th cheat code, since that code is the only one that's still unknown. (But we do have a grammatiaclly incorrect collision that works). You can check for collisions of the other codes at a slight loss in performance. (see below first)

It's basically the hashing algorithm from OP2, ripped out and placed in a .cpp file (still in an assembly form, but probably changed a little). The values from the hash table are also in the file, and it does a quick test with known cheat codes at the start.

Btw, the expected run time to test all combinations of words in the dictionary on my computer (800 MHz) is approximately 323 days.

The dictionary was downloaded somewhere off the net. Found with google, and source since forgotten.

Note: Some of the speed optimizations may require the dictionary to be in increasing word order.

alice · « **Reply #1 on:** January 05, 2008, 10:20:32 PM »

Thanks for this Hooman ^^

(Figured I might as well post so that you know people appreciate you do around here rather than just download it and secretly build my hooman sourcecode cache d:

BlackBox · « **Reply #2 on:** January 06, 2008, 09:38:29 AM »

Ahh yes, I remember you telling me about this.

Did we ever find out if this is some sort of standard hashing algorithm? Or completely custom made?

I don't know a lot about cryptography, do you (or anyone else) think there is some sort of cryptanalysis of this hash function? (so that we could use it to reduce the search time) I don't know enough about this subject to do anything like that myself.

Also, as a test, I compiled it with the heaviest optimizations my compiler (Visual Studio 2005) would allow. (O2, favor fast code, omit FP, use SSE2, etc). Unfortunately it doesn't seem to improve the speed that much.

Interestingly enough, one iteration of the inner loop seems to run faster with larger words in the dictionary.

Some day when I am bored, perhaps I will try to optimize the code further (and make it brute force different parts of the dictionary in parallel, since I have multiple machines capable of SMP -- this laptop has a dual core CPU, I have two more Sparc machines that have 2 and 4 CPUs). I'm too lazy to do it now however. :rolleyes:

Hooman · « **Reply #3 on:** January 06, 2008, 05:02:30 PM »

I imagine it's custom made. It looks like it even contains a bug. Not that it really matters for a hashing algorithm since in a sense you're just trying to output random garbage anyways.

It doesn't seem particularly secure cryptographically, but due to some irregularity, which strikes me as a bug, it's somewhat more difficult to analyse.

I've had a number of ideas to make it go faster. Some are implemented, some I think I started on, and then forgot about the project. I optimized the inner loop so it only replaces the last word (provided the length doesn't change), so it doesn't need to rewrite the whole string with sprintf. That was a huge speedup. (It also means you want the loaded words to be sorted according to length). Also, since the game only allows strings up to a certain size, it won't check through the whole dictionary if the size of the 3 substituted words exceeds the maximum string length. Hence, if you have long words at the start, it will run faster since it needs to try less words for the final position.

As for parallelizing it, just choose different values for k. Gpgarretboast said his computer could check a single value of k in about 68 seconds. (Dictionary size of a little less than 27000 words). It shouldn't be too hard to setup a different range for k for each computer it's run on. Now if only we had 27000 computers, we could check the whole dictionary in about 68 seconds.

(Actually, substantially less due to the string length speed optimization, since all the longer words are at the end of the dictionary).

Oh, and the planned but unimplemented part, was to hash the first part of the string, and save the results, and then continue the hash after the last word was changed. Then it wouldn't need to rehash the whole string each time. I think I added some code for that into the project, but it's not complete and not called anywhere. That should probably yield a speedup of about 2x or more.

Sirbomber · « **Reply #4 on:** January 06, 2008, 05:34:10 PM »

Those seem like important things to add to this.
Unless you want it to take 200+ days to complete.

News:

Author Topic: Cheat Code Checksum (Read 2425 times)