Competitive programming often reduces algorithmic efficiency to Big O
notation. While theoretically sound, Big O ignores the constants, interpreter
overhead, and memory allocation patterns inherent in actual execution. A
solution might be O(N) on paper yet still exceed a platform's time limit due to
Python's specific implementation details.
I developed Occam to bridge the gap between theoretical complexity and hardware reality. The goal was to build a local profiling tool capable of analyzing solution behavior under stress before submission to platforms like LeetCode.
Dynamic Analysis: Empirical Benchmarking
The primary component of Occam is a dynamic analyzer that measures runtime
behavior relative to input size (N). Rather than running a single test case,
the tool executes the target function against procedurally generated datasets of
increasing scale.
Implementation
The analyzer uses Python's `timeit` module for precise timing and
`tracemalloc` to snapshot peak memory usage. By collecting data points across a
range of input sizes (e.g., N=10 to N=10,000), the tool assembles a dataset
representing the algorithm's growth curve.
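The measurement loop looks roughly like the following sketch. The `solve` callable and `make_input` factory are hypothetical stand-ins for however the target solution and dataset generator are wired up in practice:

```python
import timeit
import tracemalloc

def profile(solve, make_input, sizes=(10, 100, 1_000, 10_000), repeats=5):
    """Measure wall time and peak memory of `solve` at each input size."""
    results = []
    for n in sizes:
        data = make_input(n)  # hypothetical factory for a size-n dataset
        # Best-of-`repeats` wall time; timeit disables GC during timing
        # and uses a high-resolution timer under the hood
        elapsed = min(timeit.repeat(lambda: solve(data), number=1, repeat=repeats))
        # Separate pass to capture peak allocation via tracemalloc
        tracemalloc.start()
        solve(data)
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append((n, elapsed, peak))
    return results
```

Taking the minimum over repeats filters out scheduler noise, and the memory pass runs separately because `tracemalloc` itself adds timing overhead.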
To determine the complexity class programmatically, Occam performs a linear regression on log(Time) versus log(N). The resulting slope approximates the polynomial degree:
- Slope approx. 1: Linear O(N)
- Slope approx. 2: Quadratic O(N^2)
- Slope approx. 0: Constant O(1)
This allows for the validation of algorithmic assumptions against actual hardware performance.
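The log-log fit works because if Time = c·N^k, then log(Time) = log(c) + k·log(N), so the degree k falls out as the slope. A minimal version of the fit, reusing the `(n, time, peak)` tuples from the sketch above; `scipy.stats.linregress` and the class thresholds here are illustrative assumptions rather than Occam's exact code:

```python
import math
from scipy.stats import linregress

def estimate_complexity(results):
    """Fit log(time) against log(N); the slope estimates the polynomial degree."""
    log_n = [math.log(n) for n, _, _ in results]
    log_t = [math.log(t) for _, t, _ in results]
    fit = linregress(log_n, log_t)
    # Bucket the slope into the nearest complexity class (illustrative cutoffs)
    if fit.slope < 0.5:
        return fit.slope, "O(1)"
    if fit.slope < 1.5:
        return fit.slope, "O(N)"
    return fit.slope, "O(N^2) or worse"
```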
Static Analysis: CPython Internals
Runtime metrics can sometimes mask inefficient implementation details. To address this, Occam includes a static analysis engine that inspects the raw CPython bytecode using the `dis` module.
Bytecode & Instruction Counting
The tool disassembles the function object to count specific bytecode instructions. This reveals hidden costs, such as implicit loop overheads or expensive global lookups, that high-level Python syntax obscures.
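The counting itself can be a thin wrapper over `dis.get_instructions`. The `hot_loop` function below is a hypothetical example, chosen to show how builtin calls surface as repeated `LOAD_GLOBAL` instructions:

```python
import dis
from collections import Counter

def count_instructions(func):
    """Tally bytecode opcode names for a function object."""
    return Counter(instr.opname for instr in dis.get_instructions(func))

def hot_loop(values):
    total = 0
    for v in values:
        total += len(str(v))  # len/str resolve via LOAD_GLOBAL each iteration
    return total

print(count_instructions(hot_loop).most_common(5))
```

Spotting a high `LOAD_GLOBAL` count inside a loop, for instance, suggests hoisting the lookup into a local variable.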
Cyclomatic Complexity
Occam calculates the cyclomatic complexity number by analyzing the Control Flow
Graph (CFG). It iterates through the bytecode to count branching instructions
(`POP_JUMP_IF_FALSE`, `FOR_ITER`, etc.). This metric provides a quantitative
measure of code maintainability and branch density, independent of execution
time.
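One way to approximate that count from raw bytecode is sketched below. The branching opcode set is version-dependent (CPython has renamed and merged jump opcodes across 3.10-3.12), so treat the set here as illustrative:

```python
import dis

# Decision-point opcodes; names vary by CPython version (shown for ~3.11/3.12)
BRANCH_OPS = {
    "POP_JUMP_IF_FALSE", "POP_JUMP_IF_TRUE",
    "POP_JUMP_IF_NONE", "POP_JUMP_IF_NOT_NONE",
    "FOR_ITER",
}

def cyclomatic_complexity(func):
    """Approximate V(G) = decision points + 1 for a single function."""
    branches = sum(1 for i in dis.get_instructions(func) if i.opname in BRANCH_OPS)
    return branches + 1
```

The +1 reflects the single entry path through the function's CFG.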
Battle Mode: Comparative Profiling
Optimization is often about trade-offs. To facilitate A/B testing, I implemented a “Battle Mode” that accepts multiple script files.
The tool executes all provided solutions against the exact same generated input vectors, then overlays the performance graphs and outputs a statistical comparison. This allows for direct analysis of competing approaches (for example, weighing the memory overhead of a recursive DFS against the runtime of an iterative BFS) to determine which implementation is optimal for a given constraint set.
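The core of the mode is a shared-input harness along these lines, where each entry in `solutions` maps a label to that script's entry-point callable; the loading of the script files is omitted, and `dfs_solve`, `bfs_solve`, and `make_graph` in the usage line are hypothetical:

```python
import timeit

def battle(solutions, make_input, sizes=(100, 1_000, 10_000), repeats=3):
    """Time every candidate against identical generated inputs."""
    table = {name: [] for name in solutions}
    for n in sizes:
        data = make_input(n)  # one dataset, shared by all candidates
        for name, solve in solutions.items():
            t = min(timeit.repeat(lambda: solve(data), number=1, repeat=repeats))
            table[name].append((n, t))
    return table

# e.g. battle({"recursive_dfs": dfs_solve, "iterative_bfs": bfs_solve}, make_graph)
```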
Visualization
Raw numerical data can be difficult to interpret at a glance. Occam integrates `scipy` for statistical smoothing and `matplotlib` to generate complexity graphs. These visualizations display not just the average runtime, but also the standard deviation (stability) and peak memory consumption at each input size.
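A sketch of how one such graph could be assembled, using `scipy.signal.savgol_filter` for the smoothing (the specific filter and styling are assumptions, not necessarily Occam's choices); `samples[i]` holds the repeated timings collected at input size `ns[i]`:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import savgol_filter

def plot_growth(ns, samples, label="solution"):
    """Plot mean runtime per input size with a smoothed curve and std-dev band."""
    means = np.array([np.mean(s) for s in samples])
    stds = np.array([np.std(s) for s in samples])
    # Savitzky-Golay smoothing; window must be odd and no larger than the data
    window = min(7, len(means))
    if window % 2 == 0:
        window -= 1
    smooth = savgol_filter(means, window_length=window, polyorder=min(2, window - 1))
    plt.plot(ns, smooth, label=f"{label} (smoothed mean)")
    plt.fill_between(ns, means - stds, means + stds, alpha=0.3, label="±1 std dev")
    plt.xlabel("Input size N")
    plt.ylabel("Runtime (s)")
    plt.legend()
    plt.show()
```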
Conclusion
Understanding code requires more than looking at the syntax; it requires understanding how that syntax executes on the machine. Occam provides the tooling necessary to dissect Python solutions, offering both the theoretical confirmation of static analysis and the practical reality of dynamic stress testing.
Source code: https://github.com/antonhibl/occam