Password Cracking with Rainbowcrack and Rainbow Tables including source codes

Posted by Shashank Krishna Tuesday, February 3, 2009

sharethis:

What is RainbowCrack & Rainbow Tables?

RainbowCrack is a general propose implementation of Philippe Oechslin’s faster time-memory trade-off technique.

In 1980 Martin Hellman described a cryptanalytic time-memory trade-off which reduces the time of cryptanalysis by using precalculated data stored in memory. This technique was improved by Rivest before 1982 with the introduction of distinguished points which drastically reduces the number of memory lookups during cryptanalysis. This improved technique has been studied extensively but no new optimisations have been published ever since.

You can find the official Rainbowcrack project here, where you can download the latest version of Rainbowcrack.

In short, the RainbowCrack tool is a hash cracker. A traditional brute force cracker try all possible plaintexts one by one in cracking time. It is time consuming to break complex password in this way. The idea of time-memory trade-off is to do all cracking time computation in advance and store the result in files so called “rainbow table”.

Basically these types of password crackers are working with pre-calculated hashes of ALL passwords available within a certain character space, be that a-z or a-zA-z or a-zA-Z0-9 etc.

These files are called Rainbow Tables.

You are trading speed for memory and disk space, the Rainbow Tables can be VERY large.

WORKING OF RAINBOW TABLES
COURTESY-(http://kestas.kuliukas.com/RainbowTables/)

I found the creator of Rainbow Table's paper, aimed at cryptanalysts, was pretty inaccessible considering the simplicity and elegance of Rainbow Tables, so this is an overview of it for a layman.

Hash functions map plaintext to hashes so that you can't tell a plaintext from its hash.

If you want to find a given plaintext for a certain hash there are two simple methods:
- Hash each plaintext one by one, until you find the hash.
- Hash each plaintext one by one, but store each generated hash in a sorted table so that you can easily look the hash up later without generating the hashes again

Going one by one takes a very long time, and storing each hash takes an amount of memory which simply doesn't exist (for all but the smallest of plaintext sets). Rainbow tables are a compromise between pre-computation and low memory usage.

The key to understanding rainbow tables is understanding the (unhelpfully named) reduction function.
A hash function maps plaintexts to hashes, the reduction function maps hashes to plaintexts.

It's important to note that it does the reverse of a hash function (mapping hashes to plaintexts), but it is /not/ an inverse hash function. The whole purpose of hash functions is that inverse hash functions can't be made. If you take the hash of a plaintext, and take the reduction of the hash, it will not give you the original plaintext; but some other plaintext.

If the set of plaintexts is [0123456789]{6} (we want a rainbow table of all numeric passwords of length 6), and the hashing function is MD5(), a hash of a plaintext might be MD5("493823") -> "222f00dc4b7f9131c89cff641d1a8c50".
In this case the reduction function R() might be as simple as taking the first six numbers from the hash; R("222f00dc4b7f9131c89cff641d1a8c50") -> "222004".
We now have generated another plaintext from the hash of the previous plaintext, this is the purpose of the reduction function.

Hashes are one-way functions, and so are reduction functions. The chains which make up rainbow tables are chains of one way hash and reduction functions starting at a certain plaintext, and ending at a certain hash. A chain in a rainbow table starts with an arbitrary plaintext, hashes it, reduces the hash to another plaintext, hashes the new plaintext, and so on. The table only stores the starting plaintext, and the final hash you choose to end with, and so a chain "containing" millions of hashes can be represented with only a single starting plaintext, and a single finishing hash.

After generating many chains the table might look something like:
iaisudhiu -> 4259cc34599c530b1e4a8f225d665802
oxcvioix -> c744b1716cbf8d4dd0ff4ce31a177151
9da8dasf -> 3cd696a8571a843cda453a229d741843
[...]
sodifo8sf -> 7ad7d6fa6bb4fd28ab98b3dd33261e8f

The chains are now ready to be used. We have a certain hash with an unknown plaintext, and we want to check to see whether it is inside any of the generated chains.

The algorithm is:

Look for the hash in the list of final hashes, if it is there break out of the loop.
If it isn't there reduce the hash into another plaintext, and hash the new plaintext.
Goto the start.
If the hash matches one of the final hashes, the chain for which the hash matches the final hash contains the original hash.

You can now get that chain's starting plaintext, and start hashing and reducing it, until you come to the known hash along with its secret plaintext.

In this way you check through the hashes in the chains, which aren't actually stored anywhere on disk, by iterating column by column through the table of chains, backwards from the last column in the chain, to the starting plaintext.
If you wanted to check whether the hash exists in the fourth from last column in all the chains you reduce and hash the given hash four times, and check the generated hash against the final hashes.

Collisions are the only problem with Rainbow Tables. Ironically collisions are seen as a bad thing for hashing algorithms, but in the case of Rainbow Tables a hashing algorithm which generates collisions fairly regularly will be more secure.

A given hash may be generated by multiple plaintexts (this is called a collision), which is a big problem for chains because it causes chains which start different to converge into one. Also you get loops, which are caused when a hash is reduced to a plaintext that was hashed at a previous point in the chain.

Because of these collision problems there is no guarantee that there will be a hash of a plaintext that will reduce to some other given plaintext.
If you have a simple list of hashes and corresponding plaintexts for every plaintext in a set you will know that if you have not found the hash in the generated hashes the plaintext that generated the hash is not in the set.
If you have a table of chains where the reduction function reduces hashes into the set of plaintexts you could have trillions of chains generated but you still may not have generated every plaintext in the set you want to check. You can only say how probable it is that a table of chains contains a certain plaintext, and this can approach 1 but will probably never reach 1.
If you have a rainbow table with 10 chains of length 100 you have hashed 1000 plaintexts, but even if there are only 100 plaintexts in the set of desired plaintexts the 1000 hashes you have in the chains may not contain all the desired hashes.

The way collisions are handled is what sets Rainbow Tables apart from its predecessor which was developed in 1980.

The predecessor solved the problem of certain plaintexts never being reduced to by using many small tables. Each small table uses a different reduction function. This doesn't solve the problem completely, but it does help.
To solve chain merges and loops each chain ended at a "distinct point"; a hash which was unique in some way, eg hashes where the first 4 characters are 0. The chains keep on going until it reaches a distinct point. If two chains end up at the same distinct point then there has been a collision somewhere in the chain, and one of the chains is discarded. If a chain is generated for an unusually long time without reaching a distinct point a loop is suspected (where a chain of hashes ends up reducing and hashing to a previous hash in the chain). The problem with this is that if there is a collision there is potentially a whole branch which has to be cut off and won't make it into the chains, and a loop will cause all the hashes which came before the loop in the chain to be discarded.

Also all the time spend generating that chain will be wasted, and by ending only at distinct points you have chains of variable length. This means that you may have to keep checking for a hash within especially long chains long after the other chains have ended.

Rainbow tables differ in that they don't use multiple tables with different reduction functions, they only use one table. However in Rainbow Tables a different reduction function is used for each column. This way different tables with different reduction functions aren't needed, because different reduction functions are used within the same table. It is still unlikely that all plaintexts in the desired set will be hashed, but the chances are higher for a given number of chains. Chain merges are much, much rarer, because collisions have to occur on the same column. For a chain of length l the chance of a collision causing a merge is reduced to 1/l. Loops are also solved, because if a hash in a chain is the same as a previous hash it won't reduce to the same plaintext.

The reason they're called Rainbow Tables is because each column uses a different reduction function. If each reduction function was a different color, and you have starting plaintexts at the top and final hashes at the bottom, it would look like a rainbow (a very vertically long and thin one).
By using Rainbow Tables the only problem that remains is that you can never be certain that the chains contain all the desired hashes, to get higher success rates from a given Rainbow Table you have to generate more and more chains, and get diminishing returns.

I hope by explaining the Rainbow Table I haven't made them any less wonderful ...

Project RainbowCrack

Download

The latest version of RainbowCrack is 1.2

download	platform	supported charset	supported algorithm
rainbowcrack-1.2-win.zip(547K) rainbowcrack-1.2-src.zip(44K)	windows binary source for windows and linux	customizable	lm, md5, sha1, customizable
rainbowcrack-1.1-win.zip(403K) rainbowcrack-1.1-win-src.zip(59K)	windows binary windows source	customizable	lm
rainbowcrack-1.01-win.zip(400K) rainbowcrack-1.01-win-src.zip(56K)	windows binary windows source	alpha and alpha-numeric	lm
rainbowcrack-1.0-win.zip(400K) rainbowcrack-1.0-win-src.zip(56K)	not recommended

lm: The LanManager hash algorithm. "lm" table can be used to break windows password.
customizable charset: Charset of rainbow table can be customized as described in documentation.
customizable algorithm: Support of new algorithm can be done with ease, as described in FAQ. A ready to work algorithm patch supporting NTLM, MD2, MD4 and RIPEMD160 is here Algorithm patch for RainbowCrack 1.2(3K).

Before you leave, please promote this article with your favorite bookmarking site using the Share/save button! AND DO please give your valuable comment