Example walk through
Let us take a look at how a simple program is executed in the Python world.
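A program along the following lines fits this walkthrough. Treat it as a sketch rather than the exact listing: it assumes a four-line script whose first line is `x = 1` and which prints 3. The names `y` and `z` are assumptions made for illustration.

```python
# test.py (a hypothetical reconstruction; `y` and `z` are assumed names)
x = 1
y = 2
z = x + y
print(z)
```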
Save this program in a file named `test.py`, run it with Python, and you should see the output below.
    python test.py
    3
Now, in order to understand what happened in the background (the Purple box), let us take a first step and disassemble this code. Since the source file you wrote (`test.py`) has been compiled into bytecode, let us take a look at that bytecode. Before we actually look at it, start the Python interpreter in your console and execute the following commands.
What happened here is…
1. We compiled `test.py` and stored the resulting object in a variable `c`.
2. When we print `c`, the output indicates that it is a code object, shows the memory location where it lives, and says that it starts at line 1 of the file.
3. This code object `c` has many properties. If you want to see all the properties available, execute `dir(c)` in the Python interpreter. One such property is `co_code`. We printed it in this step and it gave us a string of hex-looking characters separated by `\`. This is technically the real bytecode, which is then fed to the actual interpreter.
4. In the next step we check the length of this bytecode.
5. Here we check the type of `co_code`, and it clearly says `bytes`.
6. This command gets the numeric value of every byte in the bytecode. As we can see, we now have a bunch of numbers in the form of a list. I know what you are thinking here; for now just hold that thought and assume that this is what is executed by the interpreter.
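The six steps above can be reproduced with a session roughly like the one below. This is a sketch under assumptions: the source string stands in for the contents of `test.py`, and it is compiled from a string rather than read from disk so the snippet runs on its own. The exact bytes and their count depend on your Python version.

```python
# Hypothetical stand-in for the contents of test.py (an assumption).
source = "x = 1\ny = 2\nz = x + y\nprint(z)\n"

# Step 1: compile the source into a code object and keep it in `c`.
c = compile(source, "test.py", "exec")

# Step 2: printing `c` shows that it is a code object, plus its address,
# file name and starting line.
print(c)

# Step 3: the raw bytecode, shown as a bytes literal with \x escapes.
print(c.co_code)

# Step 4: the length of the bytecode in bytes.
print(len(c.co_code))

# Step 5: the type of co_code.
print(type(c.co_code))

# Step 6: the numeric value of every byte, as a list of integers.
print(list(c.co_code))
```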
So we have travelled from the source file `test.py`, which contained the Python code, to the bytecode. We then represented it as a list of byte values. Let's disassemble this code now. The Python source code ships with a disassembler, which can be found in its library at `Lib > dis.py`. Feel free to take some time to go through it and maybe connect it back to the screenshot below.
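If you want to produce the disassembly yourself, something like the following should work. It is a minimal sketch that assumes the same four-line source as a stand-in for `test.py`. The instruction names and their count vary between Python versions, so your listing may not match line for line.

```python
import dis

# Hypothetical stand-in for the contents of test.py (an assumption).
source = "x = 1\ny = 2\nz = x + y\nprint(z)\n"
c = compile(source, "test.py", "exec")

# Print the human-readable disassembly of the code object.
dis.dis(c)

# The same information is available programmatically, one Instruction
# object per bytecode instruction.
opnames = [ins.opname for ins in dis.get_instructions(c)]
print(opnames)
```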
Alright, very first impression: this looks like assembly code. It is a kind of assembly code for the Python interpreter, but it is more human readable. Our program had 4 lines of code, and the compiler converted these four lines into this many lines of bytecode. The first column from the left shows the corresponding line number in your source file. As we can see, the first line `x = 1` is converted into 2 lines of bytecode, and each of the remaining 3 lines likewise maps to its own group of instructions. The first column helps correlate your source code with the bytecode.
The second column represents the byte offset of every instruction, that is, the position in the bytecode at which that instruction starts. It starts with 0 for the first instruction, which is 2 bytes long, so the next instruction starts at 2, and so on. If you look at this value for the last instruction (`RETURN_VALUE`), it starts at 26, which means the bytecode ends at byte 28. This corresponds to step 4 in the previous section, where we checked the length of `co_code` after compilation and it correctly said 28.
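The relationship between offsets and the total length can be checked with a short sketch like this one. It assumes the same stand-in source for `test.py`. The concrete numbers depend on the Python version, so the snippet prints them instead of hard-coding 26 and 28.

```python
import dis

# Hypothetical stand-in for the contents of test.py (an assumption).
source = "x = 1\ny = 2\nz = x + y\nprint(z)\n"
c = compile(source, "test.py", "exec")

# Collect the byte offset at which each instruction starts.
offsets = [ins.offset for ins in dis.get_instructions(c)]
print(offsets)

# The last instruction starts before the end of the bytecode, so the
# total length is always larger than the last offset.
print(offsets[-1], len(c.co_code))
```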
The third column is the human-readable name of the instruction. The fourth column is the argument of that instruction, usually an index into one of the code object's tables, such as its constants or its names. The instructions themselves work on something the Python interpreter maintains called the value stack, and we shall know more about it in upcoming sections where we discuss frames and scopes. The same applies to the values you can see in the brackets in the fifth column. Just to give you an overview, the values in the brackets show what that argument resolves to, in other words what the interpreter is dealing with in that particular instruction. Python's `dis` module has this way of putting it together nicely for us humans.
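To see how the bracketed values relate to the numeric arguments, you can compare each instruction's argument with the code object's own tables. The sketch below assumes the same stand-in source for `test.py` and looks only at `STORE_NAME`, whose argument is an index into `co_names`.

```python
import dis

# Hypothetical stand-in for the contents of test.py (an assumption).
source = "x = 1\ny = 2\nz = x + y\nprint(z)\n"
c = compile(source, "test.py", "exec")

# Offset, instruction name, raw argument, and the resolved value that
# dis shows in brackets.
for ins in dis.get_instructions(c):
    print(ins.offset, ins.opname, ins.arg, ins.argrepr)

# The tables that the raw arguments point into.
print(c.co_consts)
print(c.co_names)

# For STORE_NAME, the raw argument indexes co_names.
store_names = [
    (ins.arg, ins.argval)
    for ins in dis.get_instructions(c)
    if ins.opname == "STORE_NAME"
]
print(store_names)
```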