Usage
Python API
from_code
The main entrypoint to our API is the CodeData object. You can create it from any Python CodeType:
# Load rich first for prettier output
from rich import pretty
pretty.install()
from code_data import CodeData
def fn(a, b):
    return a + b
cd = CodeData.from_code(fn.__code__)
cd
CodeData(
    (
        (
            Instruction('LOAD_FAST', Varname('a'), line_number=7),
            Instruction('LOAD_FAST', Varname('b'), line_number=7),
            Instruction('BINARY_ADD', line_number=7),
            Instruction('RETURN_VALUE', line_number=7)
        ),
    ),
    filename='/tmp/ipykernel_701/3063237183.py',
    first_line_number=6,
    name='fn',
    stacksize=2,
    type=Function(Args(positional_or_keyword=('a', 'b'))),
    _additional_args=(Constant(None, _index_override=0),)
)
Unlike Python’s built-in code object or the dis module, CodeData keeps only the information needed to recreate the code object. Details of how the bytecode happens to be stored on disk, such as the bytecode offset of each instruction, are omitted, which makes it simpler to use.
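To see what is being left out, it can help to look at the same function through the standard library's dis module. The sketch below uses only dis.get_instructions (no code_data calls) and lists each instruction's byte offset, which is the kind of storage detail CodeData drops. The helper name offsets_and_opnames is made up for this illustration, and the exact opcode names and offsets vary between Python versions.

```python
import dis


def fn(a, b):
    return a + b


def offsets_and_opnames(code):
    """Return (offset, opname) pairs for every instruction in a code object."""
    return [(ins.offset, ins.opname) for ins in dis.get_instructions(code)]


pairs = offsets_and_opnames(fn.__code__)
for offset, opname in pairs:
    # The offset is a position in the raw bytecode, not part of the program logic.
    print(f"{offset:>4}  {opname}")
```

The offsets only describe where each instruction sits in the raw bytecode, so they can be recomputed whenever the code object is rebuilt.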
normalize
We are also able to “normalize” the code object, removing pieces of it that are unused. For example, if you have dead code, Python will still include the constants that are present in it, even though there is no way they can be accessed:
def fn():
    if False:
        x = 20
    x = 1
cd = CodeData.from_code(fn.__code__)
cd
CodeData(
    (
        (
            Instruction('NOP', line_number=2),
            Instruction(
                'LOAD_CONST',
                Constant(1, _index_override=3),
                line_number=4
            ),
            Instruction(
                'STORE_FAST',
                Varname('x', _index_override=0),
                line_number=4
            ),
            Instruction(
                'LOAD_CONST',
                Constant(None, _index_override=0),
                line_number=4
            ),
            Instruction('RETURN_VALUE', line_number=4)
        ),
    ),
    filename='/tmp/ipykernel_701/2121495508.py',
    first_line_number=1,
    name='fn',
    stacksize=1,
    type=Function(),
    _additional_args=(
        Constant(False, _index_override=1),
        Constant(20, _index_override=2)
    )
)
cd.normalize()
CodeData(
    (
        (
            Instruction('NOP', line_number=2),
            Instruction('LOAD_CONST', Constant(1), line_number=4),
            Instruction('STORE_FAST', Varname('x'), line_number=4),
            Instruction('LOAD_CONST', Constant(None), line_number=4),
            Instruction('RETURN_VALUE', line_number=4)
        ),
    ),
    filename='/tmp/ipykernel_701/2121495508.py',
    first_line_number=1,
    name='fn',
    stacksize=1,
    type=Function()
)
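The leftover constants can also be observed on the raw code object, without code_data. This is a small sketch that relies only on the standard co_consts attribute. Which dead-code constants survive depends on the interpreter version (newer CPython releases may already drop some of them), so treat the printed values as illustrative rather than guaranteed.

```python
def fn():
    if False:
        x = 20
    x = 1


# co_consts holds every constant the compiler stored for this function.
consts = fn.__code__.co_consts
print(consts)

# Constants that could only come from the unreachable `if False:` branch.
# The list may be empty on interpreters that already prune unused constants.
leftover = [c for c in consts if c is False or (type(c) is int and c == 20)]
print("dead-code constants still stored:", leftover)
```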
JSON Support
Since the code object is now a simple data structure, we can serialize it to and from JSON. This provides a nice option if you want to analyze Python bytecode in a different language or save it on disk:
code_json = cd.to_json_data()
assert CodeData.from_json_data(code_json) == cd
code_json
{
    'blocks': [
        [
            {'name': 'NOP', 'line_number': 2},
            {
                'name': 'LOAD_CONST',
                'arg': {'constant': 1, '_index_override': 3},
                'line_number': 4
            },
            {
                'name': 'STORE_FAST',
                'arg': {'varname': 'x', '_index_override': 0},
                'line_number': 4
            },
            {
                'name': 'LOAD_CONST',
                'arg': {'constant': None, '_index_override': 0},
                'line_number': 4
            },
            {'name': 'RETURN_VALUE', 'line_number': 4}
        ]
    ],
    'filename': '/tmp/ipykernel_701/2121495508.py',
    'first_line_number': 1,
    'name': 'fn',
    'stacksize': 1,
    'type': {},
    '_additional_args': [
        {'constant': False, '_index_override': 1},
        {'constant': 20, '_index_override': 2}
    ]
}
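For the "save it on disk" case, one possible approach is to pass the JSON data through the standard json module. The sketch below is self-contained: it uses a hand-written excerpt shaped like the output above instead of calling to_json_data(), and save_json_data and load_json_data are hypothetical helper names, not part of code_data. It assumes the JSON data contains only JSON-native types, as in the example shown here, and does not cover constants that have no direct JSON form.

```python
import json
import tempfile
from pathlib import Path

# Hand-written excerpt shaped like the to_json_data() output; not generated here.
code_json = {
    "blocks": [
        [
            {"name": "NOP", "line_number": 2},
            {
                "name": "LOAD_CONST",
                "arg": {"constant": None, "_index_override": 0},
                "line_number": 4,
            },
            {"name": "RETURN_VALUE", "line_number": 4},
        ]
    ],
    "name": "fn",
    "stacksize": 1,
}


def save_json_data(data, path):
    """Write JSON-compatible data to a file (hypothetical helper)."""
    Path(path).write_text(json.dumps(data, indent=2))


def load_json_data(path):
    """Read the data back from a file (hypothetical helper)."""
    return json.loads(Path(path).read_text())


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "fn.json"
    save_json_data(code_json, path)
    restored = load_json_data(path)

print(restored == code_json)
```

With the real library, the loaded data would then be passed to CodeData.from_json_data, as in the round trip shown above.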
Command Line
We provide a CLI command, python-code-data, which is useful for debugging or introspecting code objects from the command line. It accepts many of the same flags for loading Python code as the default Python CLI: from a string (-c), from a module (-m), or from a path (<file name>). It can also take a Python expression that is evaluated first to produce the program string (-e), which is useful for generating test programs on the command line.
! python-code-data -h
usage: python-code-data [-h] [-c cmd] [-e eval] [-m mod] [--dis] [--dis-after]
                        [--source] [--no-normalize] [--json]
                        [file]

Inspect Python code objects.

positional arguments:
  file            path to Python program

options:
  -h, --help      show this help message and exit
  -c cmd          program passed in as string
  -e eval         string evalled to make program
  -m mod          python library
  --dis           print Python's dis analysis
  --dis-after     print Python's dis analysis after round tripping to code-
                  data, for testing
  --source        print the source code
  --no-normalize  don't normalize code data before printing
  --json          Print the JSON represenation of the code data as well
! python-code-data -c 'x if y else z'
CodeData(
    (
        (
            Instruction('LOAD_NAME', Name('y'), line_number=1),
            Instruction('POP_JUMP_IF_FALSE', Jump(1), line_number=1),
            Instruction('LOAD_NAME', Name('x'), line_number=1),
            Instruction('POP_TOP', line_number=1),
            Instruction('LOAD_CONST', Constant(None), line_number=1),
            Instruction('RETURN_VALUE', line_number=1)
        ),
        (
            Instruction('LOAD_NAME', Name('z'), line_number=1),
            Instruction('POP_TOP', line_number=1),
            Instruction('LOAD_CONST', Constant(None), line_number=1),
            Instruction('RETURN_VALUE', line_number=1)
        )
    ),
    filename='<string>',
    first_line_number=1,
    name='<module>',
    stacksize=1
)
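If you want to drive the CLI from a script, for example to generate several test cases, a thin wrapper around subprocess is one option. The sketch below only uses flags listed in the help output above (-c, --json, --no-normalize). build_command is a hypothetical helper, and the tool is only invoked if it is found on the PATH.

```python
import shlex
import shutil
import subprocess


def build_command(program, *, as_json=False, normalize=True):
    """Build an argv list for `python-code-data -c <program>` (hypothetical helper)."""
    cmd = ["python-code-data", "-c", program]
    if as_json:
        cmd.append("--json")
    if not normalize:
        cmd.append("--no-normalize")
    return cmd


cmd = build_command("x if y else z", as_json=True)
print(shlex.join(cmd))

# Only run the tool if it is actually installed on this machine.
if shutil.which(cmd[0]):
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    print(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the program string itself contains quotes or spaces.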