Standard wordlists like RockYou.txt target linguistic probability but do not account for spatial patterns generated by physical input devices. Users frequently utilize keyboard patterns (e.g., 1qaz@WSX) to meet complexity requirements while maintaining memorability.

To target these patterns, I developed `kbwalk.py` (part of the keyb-walking repository), a Python-based generator optimized for creating spatial wordlists and rainbow tables.

Development History

This project is an iteration of the `kb-walk` concept originally authored by Ronald Broberg. My implementation builds on the codebase maintained by Austin Scott, refactoring the core logic to support direct cryptographic hashing and optimizing the traversal algorithms for modern hardware.

Implementation Details

The primary constraint for generating spatial wordlists is the combinatorics involved. A full vertical walk generation results in over 33 million unique strings.

Memory Efficiency

To prevent excessive memory consumption during generation, the script utilizes Python’s `itertools` library and buffered I/O streams. Rather than storing the generated list in memory, the tool acts as a generator, yielding results directly to the output buffer. This allows for the creation of multi-gigabyte wordlists with constant memory usage.

Native Hashing

Standard cracking workflows often involve piping generator output to a separate hashing utility:

python3 kbwalk.py | openssl md5

This introduces context-switching overhead between processes. `kbwalk.py` implements native support for NTLM, MD5, SHA1, and SHA256 hashing. By computing hashes within the generation loop, the tool can write rainbow tables directly to disk, bypassing standard output stream limitations.

# Direct NTLM generation
python3 kbwalk.py -m v --hash ntlm -f ntlm_rainbow.txt

Performance Analysis (Occam)

To verify the algorithmic efficiency of the generator, I analyzed the code using Occam, a tool I developed for static and dynamic code analysis.

Occam performs static analysis to determine cyclomatic complexity and dynamic analysis to measure runtime performance (time and memory) relative to input size (\(N\)).

Benchmark Results

Executing `occam` against `kbwalk.py` produced the following runtime metrics:

-----------------------------------------------------------------
N        | Time (avg)    | Memory (avg) | Stability (CV)
-----------------------------------------------------------------
10       | 0.002631      | 6890         | 0.2041
...
100      | 22.543129     | 12131        | 0.0175
-----------------------------------------------------------------
Time Slope:  3.95 (R^2: 0.9999) -> O(n^k) Polynomial
Space Slope: 0.25 -> O(log n) Logarithmic

Interpretation

  • Time Complexity O(N^4): The runtime grows polynomially. This is expected behavior, as the number of possible keyboard walks increases exponentially with the allowed path length.
  • Space Complexity O(log(N)): Memory usage remains logarithmic relative to the input size.

The analysis confirms that the use of generator expressions and buffered I/O successfully decouples memory usage from the volume of data generated.

Conclusion

`kbwalk.py` provides a resource-efficient method for generating spatial pattern wordlists. By modeling keyboard adjacency graphs and optimizing for low-memory environments, the tool facilitates the creation of targeted attack vectors for password auditing and penetration testing.

Source code: https://github.com/antonhibl/keyb-walking