by Zhu Shuanglei <shuanglei@hotmail.com>
http://www.antsight.com/zsl/rainbowcrack/
RainbowCrack is a general propose implementation of Philippe
Oechslin's faster
time-memory trade-off technique. In the tutorial and some other document of the
tool, four ready to work configurations are introduced. However, you may be puzzled
by the parameters(m, t and l) if you want to build tables for your custom
charset. If these parameters are casually selected, the performance
will be really poor.
In this article, we present some useful information which is helpful if you want
to determine your parameters of a rainbow table set, which can be used to attack
the ciphertexts of a selected key space of a hash algorithm .
The RainbowCrack tool is mainly based on the paper Making a Faster Cryptanalytic Time-Memory Trade-Off.
We will not talk about the basic concepts here, so please read the paper and
make clear the following concepts before you continue:
1. What is a rainbow chain
2. Why the rainbow chains merge in a rainbow table
| Catalog | Parameter | Explanation |
| Const values: they will not change | speed of hash computation | time-memory trade-off cryptanalysis need a lot of computing power and a lot of disk access |
| speed of the harddisk | ||
| N | the key space | |
| Input parameters: parameters that you can adjust | t | rainbow chain length |
| m | rainbow chain count of a rainbow table | |
| l | rainbow table count | |
| Output parameters: time and space complexity depending on the "input parameters" | disk usage | disk space required to store the generated tables |
| success probability | the probability you can find the plaintext of the ciphertext | |
| mean cryptanalysis time | mean time used to compute the hashes when cracking a ciphertext | |
| max cryptanalysis time | max time used to compute the hashes when cracking a ciphertext | |
| mean disk access time | mean time used to load the files when cracking a ciphertext | |
| max disk access time | max time used to load the files when cracking a ciphertext | |
| table precomputation time | time needed to compute the rainbow table set |
To determine the parameter "speed of hash computation":
Run the program rtgen like this:
C:\rainbowcrack>rtgen lm alpha 1 7 0 -bench
lm hash speed: aaaaaa / s
lm step speed: bbbbbb / s
C:\rainbowcrack>rtgen md5 alpha 1 7 0 -bench
md5 hash speed: cccccc / s
md5 step speed: dddddd / s
Here "bbbbb" is the speed of lm hash, "dddddd" is the speed
of md5 hash
To determine the parameter "speed of the harddisk":
If the program rcrack can read 610 MB of data in 19 seconds, the speed is
considered as "19 / 610"
To determine the parameter "N":
If the charset is "alpha"(26 characters) and the plaintext length
range is 1 to 7, N = 26^1 + 26^2 + 26^3 + 26^4 + 26^5 + 26^6 + 26^7 = 8353082582
If the charset is "byte"(256 characters) and the plaintext length
range is 1 to 4, N = 256 + 256^2 + 256^3 + 256^4 = 4311810304
Parameter t, m and l are the only parameters you can adjust. If you change
any of them, the output parameters will change. Of course we would like to select some
good input parameters which will produce better output parameters.
Some parameters are considered more important than the others. These
"important" parameters are in bold font in the above table.
In this section, we will talk about how the output parameters are determined by the input parameters. The discussion is not complete, the prototype of the matlab scripts used to compute the output parameters are given in appendix A. However, you can also take the source of these scripts as another reference.
The format of a rainbow table generated by rainbowcrack is given in appendix
B. Each rainbow chain will take 16 bytes, so the formula is:
disk_usage = m * 16 * l
Fig: More rainbow chains will merge when t increase
Matlab: demo_chain_merge(8353082582, 2100, 8000000)

Fig: More rainbow chains will merge when m increase(compared to the
figure above)
Matlab: demo_chain_merge(8353082582, 2100, 16000000)
The matlab scripts which will produce the figures are given too. So you can generate the figures yourself, change the parameters and see how the rainbow chains will merge.
If we store all rainbow chains in a single rainbow table(with same reduce
function set), the success probability will increase very slowly. So we use
different rainbow tables to store the chains:

Fig: We can archive better success probability with same disk usage if we
use different rainbow tables
Matlab: demo_advantage_of_multiple_table(8353082582, 2100, 256, 4096)

Fig: We can archive better success probability with same disk usage if we
use different rainbow tables(zoomed in)
Matlab: demo_advantage_of_multiple_table(8353082582, 2100, 256, 4096)
We use different curves to represent how the success probability will
increase if we use more disk space(larger m*l).
There are five curves in the figure(l=1 to 5). When l(rainbow table count)
is equal to 5, The curve reach the success probability 0.999 first.
When l increase, we will reach same success probability with less disk space. But the cryptanalysis time will increase too. This is the dilemma.
The marked point is the configuration we selected in configuration #0 in rainbowcrack tutorial. However, any points in these curves represent a configuration.
We need "t*t/2" times rainbow chain walk(hash computation and a
reduce function) to finish the search of a rainbow table. Thus the time to
search a rainbow table is "t*t/2 / step_speed".
Each rainbow table has its success probability. If we can't find the plaintext in
a rainbow table, we will continue with the next. The formula is:
mean_cryptanalysis_time = rate_of_each_table * 1 * t*t/2 / step_speed
+ (1 - rate_of_each_table) ^ 1 * rate_of_each_table * 2 * t*t/2 / step_speed
+ (1 - rate_of_each_table) ^ 2 * rate_of_each_table * 3 * t*t/2 / step_speed
...
+ (1 - rate_of_each_table) ^ (table_count - 1) * rate_of_each_table * table_count * t*t/2 / step_speed
+ (1 - rate_of_each_table) ^ table_count * table_count * t*t/2 / step_speed
We use the matlab script performance_chart.m to select our configurations. We pass the script with our desired success probability, disk usage range and the rainbow chain count range.

Fig: Performance chart of key space 8353082582
Matlab: performance_chart(8353082582, 0.999, 250, 2000, 3, 8, 354000, 19 / 610)
We use different colors to represent the performance curves if we use different rainbow table count.
There are three charts in the figure:
chart 1: The minimal t we can use if we want to reach the desired success
probability
chart 2: mean cryptanalysis time if we use the t selected in char 1
chart 3: mean cryptanalysis time plus mean disk access time if we use the t
selected in char 1
The cryptanslysis time will decrease when disk space increase(m*l increase).
However, we will need longer disk access time if use larger tables. Look at the
chart 3 to find out how "mean_cryptanalysis_time + mean_disk_access_time"
change when disk space change.
You can select any point in these curves as your configuration of the rainbow table set. Of course some configurations are better than the others.
Fig: Performance chart of key space 8353082582(zoomed to where we select the
configuration #0)
Matlab: performance_chart(8353082582, 0.999, 250, 2000, 3, 8, 354000, 19 / 610)
Now we zoomed the figure to where we select the configuration #0.
When a configuration is selected, you can determine the parameter t and l
directly from the figure. m*l can be determined by:
m * l =
disk_usage / 16
If you want to crack one ciphertext at a time, the total time is:
mean_cryptanalysis_time + mean_disk_access_time
If you want to crack n ciphertexts at a time, the total time is: mean_cryptanalysis_time * n + max_disk_access_time (when n >> 1)
the mean time is: (mean_cryptanalysis_time * n + max_disk_access_time) / n
=
mean_cryptanalysis_time (when n >> 1)
You should also take the curves in chart 2 into consideration if you always have
dozens of ciphertexts to crack. In this situation, the disk access time will
take very little percent in cracking progress.
| basic scripts | description |
| function ret = calc_disk_usage(m, table_count) | calculate disk usage |
| function ret = calc_success_probability(N, t, m) | calculate success probability |
| function ret = calc_mean_cryptanalysis_time(N, t, m, table_count, step_speed) | calculate mean cryptanalysis time |
| function ret = calc_max_cryptanalysis_time(t, table_count, step_speed) | calculate max cryptanalysis time |
| function ret = calc_mean_disk_access_time(N, t, m, table_count, disk_speed) | calculate mean disk access time |
| function ret = calc_max_disk_access_time(m, table_count, disk_speed) | calculate max disk access time |
| function ret = calc_table_precomputation_time(t, m, table_count, step_speed) | calculate table precomputation time |
| function calc_configuration_parameters(N, t, m, table_count, step_speed, disk_speed) | wrapper of the above scripts |
| parameter optimization scripts | description |
| function performance_chart(N, target_success_probability, ... disk_usage_min, disk_usage_max, ... table_count_min, table_count_max, ... step_speed, disk_speed) |
you can find your custom configuration of a selected charset with this script |
| demonstration scripts | description |
| function demo_chain_merge(N, t, m) | demonstrate the merge of rainbow chains |
| function demo_advantage_of_multiple_table(N, t, disk_usage_min, disk_usage_max) | demonstrate how the success probability change if we create multiple rainbow tables |
A rainbow table is a raw array of rainbow chains. Only the starting point
and the end point are stored in a rainbow chain. The data struct is defined as:
struct RainbowChain
{
uint64 nIndexS;
uint64 nIndexE;
};
Each rainbow chain takes 16 bytes.
For example:
| [user@linux90
rainbowcrack]$ ./rtgen lm alpha 1 7 0 100 4 test hash routine: lm hash length: 8 plain charset: ABCDEFGHIJKLMNOPQRSTUVWXYZ plain charset in hex: 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56 57 58 59 5a plain length range: 1 - 7 plain charset name: alpha plain space total: 8353082582 rainbow table index: 0 reduce offset: 0 generating... 4 of 4 rainbow chains generated (0 m 0 s) [user@linux90 rainbowcrack]$ hexdump lm_alpha#1-7_0_100x4_test.rt 0000000 ba40 a519 0001 0000 57bd 9f74 0001 0000 0000010 7af6 42b8 0000 0000 3dad 910f 0000 0000 0000020 728b 499c 0001 0000 c840 cb54 0000 0000 0000030 feab b2b4 0000 0000 77ac c144 0000 0000 0000040 [user@linux90 rainbowcrack]$ ./rtsort lm_alpha#1-7_0_100x4_test.rt available physical memory: 15585280 bytes loading rainbow table... sorting rainbow table... writing sorted rainbow table... [user@linux90 rainbowcrack]$ hexdump lm_alpha#1-7_0_100x4_test.rt 0000000 7af6 42b8 0000 0000 3dad 910f 0000 0000 0000010 feab b2b4 0000 0000 77ac c144 0000 0000 0000020 728b 499c 0001 0000 c840 cb54 0000 0000 0000030 ba40 a519 0001 0000 57bd 9f74 0001 0000 0000040 |
We generated a simple rainbow table with 4 rainbow chains and the size of the generated file is 64 bytes.
The rtsort program is very simple too. It sort the end points and write the rainbow chains back to the file.
Interestingly, you can dump the content of a rainbow chain like this:
| [user@linux90
rainbowcrack]$ ./rtdump lm_alpha#1-7_0_100x4_test.rt 0 #0 0000000042b87af6 CPENHYM|4350454e48594d dc755c2de44e8620 #1 0000000138480a7a PXXXZOM|505858585a4f4d acdd711d4d251cae #2 0000000017f34221 AGUGSNB|41475547534e42 62c9104f81dd89ac #3 000000002703a3d0 BCBICDU|42434249434455 7103a8b83dd78800 #4 00000001a14058f0 VQDTIFQ|56514454494651 0a965e6115101515 ...... ...... #95 00000001abbf9d13 WEZDJYB|57455a444a5942 500da7c384c8a4b6 #96 0000000161daa38d SEQFGUP|53455146475550 3d25c6e9d112ed31 #97 00000000bb89fb89 JDUEUNR|4a445545554e52 12586e9d6b663bff #98 0000000012216749 YOPLXZ |594f504c585a e3d94db4cd0f94e5 #99 00000000910f3dad |
The starting point and end point are same in the output of hexdump and rtdump.
Create date: 2004/1/1