Cracking passwords is a kind of e-sport, really. There's competition among amateurs and professionals "players", tools, gear. There are secrets, home-made recipes, software helpers, etc.
One of this software is PCFG password guess generator, for "Probabilistic Context-Free Grammar". I won't explain the concept of PCFG, some scientific literature exists you can read to discover all the math inside.
PCFG password guess generator comes as two main python programs: pcfg_trainer.py and pcfg_manager.py. Basic mechanism is the following:
- you feed pcfg_trainer.py with enough known passwords to generate comprehensive rules describing the grammar of known passwords, and supposedly unknown passwords too.
- you run pcfg_manager.py, using previously created grammar, to create millions of password candidates to feed into your favorite password cracker (John the Ripper, Hashcat…).
In order to measure PCFG password guess generator's efficiency I've made few tests. Here is my setup:
- Huge password dump, 117205873 accounts with 61829207 unique Raw-SHA1 hashes;
- John the Ripper, Bleeding Jumbo, downloaded 20160728, compiled on FreeBSD 10.x;
- PCFG password guess generator, downloaded 20160801, launched with Python 3.x;
Here's my methodology:
Of these 61829207 hashes, about 35 millions are already cracked. I've extracted a random sample of 2 millions known passwords to feed the trainer. Then I've used pcfg_manager.py to create a 10 millions lines word list. I've also trimmed the famous Rockyou list to it's 10 millions first lines, to provide a known reference.
Finally, I've launched this shell script:
for i in none wordlist jumbo; do
./john --wordlist=pcfg_crckr --rules=$i --session=pcfg_cracker-$i --pot=pcfg_cracker-$i.pot HugeDump
./john --wordlist=ry10m --rules=$i --session=ry10m-$i --pot=ry10m-$i.pot HugeDump
No forking, I'm running on one CPU core here. Each word list is tested three times, with no word mangling rules, with defaults JtR rules, and finally with Jumbo mangling rules.
Some results (number of cracked passwords):
That I can translate into efficiency, where efficiency is Cracked/WordlistLength as percentage:
It's quite interesting to see that the PCFG generated word list has a very good efficiency, compared to Rockyou list, when no rules are involved. That's to be expected, as PCFG password guess generator has been trained with a quite large sample of known passwords from the same dump I am attacking.
Also, the PCFG password guess generator creates candidates that are not very well suited for mangling, and only the jumbo set of rules achieves good results with this source. Rockyou on the other hand starts quite low with only 27.7% but jumps to 50.1% with common rules, and finally defeats PCFG when used with jumbo rules.
On the word list side, Rockyou is known and limited: it will never grow. But PCFG password guess generator looks like it can create an infinite list of candidates. Let see what happens when I create a list of +110 M candidates and feed them to JtR.
Efficiency plummets: only 9.7 M hashes cracked with a list of 110398024 candidates, and only 1.1 M more when the set of rules "wordlist" is applied. It's even less beneficial than with a list of 10 M candidates (+1.3 M with "wordlist" rules, compared to "none").
On the result side, both word list with jumbo rules yields to +21 M cracked passwords. But are those passwords identical, or different?
||Total unique cracked
Yield = UniqueCracked / (PcfgCracked + RockyouCracked)
A high yield basically says that you should run both word lists into John. A yield of 50% means that all pwd cracked thanks to PCFG are identical to those cracked with the Rockyou list.
As a conclusion, I would say that the PCFG password guess generator is a very interesting tool, as it provides a way to generate valid candidates pretty easily. You probably still need a proper known passwords corpus to train it.
It's also very efficient with no rules at all, compared to the Rockyou list. That might make it a good tool for very slow hashes when you can't afford to try thousands of mangling rules on each candidate.
Some graphs to illustrate this post:
every john session on the same graph
every session, zoomed on the first 2 minutes
Rules "wordlist" on both lists of candidates
Rules "none", both lists of candidates