Standard wordlists like RockYou.txt target linguistic probability but do not
account for spatial patterns generated by physical input devices. Users
frequently utilize keyboard patterns (e.g., 1qaz@WSX) to meet complexity
requirements while maintaining memorability.
To target these patterns, I developed `kbwalk.py` (part of the keyb-walking repository), a Python-based generator optimized for creating spatial wordlists and rainbow tables.
Development History
This project is an iteration of the `kb-walk` concept originally authored by Ronald Broberg. My implementation builds on the codebase maintained by Austin Scott, refactoring the core logic to support direct cryptographic hashing and optimizing the traversal algorithms for modern hardware.
Implementation Details
The primary constraint for generating spatial wordlists is the combinatorics involved. A full vertical walk generation results in over 33 million unique strings.
Memory Efficiency
To prevent excessive memory consumption during generation, the script utilizes Python’s `itertools` library and buffered I/O streams. Rather than storing the generated list in memory, the tool acts as a generator, yielding results directly to the output buffer. This allows for the creation of multi-gigabyte wordlists with constant memory usage.
Native Hashing
Standard cracking workflows often involve piping generator output to a separate hashing utility:
python3 kbwalk.py | openssl md5
This introduces context-switching overhead between processes. `kbwalk.py` implements native support for NTLM, MD5, SHA1, and SHA256 hashing. By computing hashes within the generation loop, the tool can write rainbow tables directly to disk, bypassing standard output stream limitations.
# Direct NTLM generation
python3 kbwalk.py -m v --hash ntlm -f ntlm_rainbow.txt
Performance Analysis (Occam)
To verify the algorithmic efficiency of the generator, I analyzed the code using Occam, a tool I developed for static and dynamic code analysis.
Occam performs static analysis to determine cyclomatic complexity and dynamic analysis to measure runtime performance (time and memory) relative to input size (\(N\)).
Benchmark Results
Executing `occam` against `kbwalk.py` produced the following runtime metrics:
-----------------------------------------------------------------
N | Time (avg) | Memory (avg) | Stability (CV)
-----------------------------------------------------------------
10 | 0.002631 | 6890 | 0.2041
...
100 | 22.543129 | 12131 | 0.0175
-----------------------------------------------------------------
Time Slope: 3.95 (R^2: 0.9999) -> O(n^k) Polynomial
Space Slope: 0.25 -> O(log n) Logarithmic
Interpretation
- Time Complexity O(N^4): The runtime grows polynomially. This is expected behavior, as the number of possible keyboard walks increases exponentially with the allowed path length.
- Space Complexity O(log(N)): Memory usage remains logarithmic relative to the input size.
The analysis confirms that the use of generator expressions and buffered I/O successfully decouples memory usage from the volume of data generated.
Conclusion
`kbwalk.py` provides a resource-efficient method for generating spatial pattern wordlists. By modeling keyboard adjacency graphs and optimizing for low-memory environments, the tool facilitates the creation of targeted attack vectors for password auditing and penetration testing.
Source code: https://github.com/antonhibl/keyb-walking