By Nick Cotes
There are different ways of debugging code in Python, one of which is to introduce breakpoints into the code at points where one would like to invoke a Python debugger. The statements that one would use to enter a debugging session at different call sites depend on the version of the Python interpreter that one is working with, as we shall see in this tutorial.
In this tutorial, you will discover various ways of setting breakpoints in different versions of Python.
After completing this tutorial, you will know:
How to invoke the pdb debugger in earlier versions of Python.
How to make use of the new, built-in breakpoint() function introduced in Python 3.7.
How to write your own breakpoint() function to simplify the debugging process in earlier versions of Python.
How to use a post-mortem debugger.
Let’s get started.
Setting Breakpoints in Different Versions of Python
Tutorial overview
This tutorial is divided into the following parts:
Setting Breakpoints in Python Code
Invoking the pdb Debugger in Earlier Versions of Python
Using the breakpoint() Function in Python 3.7
Writing One’s Own breakpoint() Function for Earlier Versions of Python
Limitations of the breakpoint() function
Setting breakpoints in Python code
We have previously seen that one way of debugging a Python script is to run it in the command line with the Python debugger.
In order to do so, we would need to use the -m pdb option, which loads the pdb module before executing the Python script. At the debugger prompt, we would then issue a debugger command of choice, such as n to move to the next line, or s if our intention is to step into a function.
This method can quickly become cumbersome as the length of the code increases. One way to address this problem, and to gain better control over where to break your code, is to insert a breakpoint directly into the code.
Invoking the pdb debugger in earlier versions of Python
Doing so prior to Python 3.7 would require you to import pdb, and to call pdb.set_trace() at the point in your code where you would like to enter an interactive debugging session.
If we reconsider, as an example, the code for implementing the general attention mechanism, we can break into the code as follows:
from numpy import array
from numpy import random
from numpy import dot
from scipy.special import softmax

# importing the Python debugger module
import pdb

# encoder representations of four different words
word_1 = array([1, 0, 0])
word_2 = array([0, 1, 0])
word_3 = array([1, 1, 0])
word_4 = array([0, 0, 1])

# stacking the word embeddings into a single array
words = array([word_1, word_2, word_3, word_4])

# generating the weight matrices
random.seed(42)
W_Q = random.randint(3, size=(3, 3))
W_K = random.randint(3, size=(3, 3))
W_V = random.randint(3, size=(3, 3))

# generating the queries, keys and values
Q = dot(words, W_Q)
K = dot(words, W_K)
V = dot(words, W_V)

# inserting a breakpoint
pdb.set_trace()

# scoring the query vectors against all key vectors
scores = dot(Q, K.transpose())

# computing the weights by a softmax operation
weights = softmax(scores / K.shape[1] ** 0.5, axis=1)

# computing the attention by a weighted sum of the value vectors
attention = dot(weights, V)

print(attention)
Executing the script now opens up the pdb debugger right before we compute the variable scores, and we can proceed to issue any debugger commands of choice, such as n to move to the next line, or c to continue execution.
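As a rough, non-interactive illustration of such a session, the sketch below feeds the n, p and c commands to the debugger from a string and captures what it prints. It relies on the stdin and stdout parameters of pdb.Pdb; the function scaled_sum and its inputs are made up for this example and are not part of the attention code above.

```python
import io
import pdb

# Debugger commands that would normally be typed at the (Pdb) prompt.
commands = io.StringIO("n\np total\nc\n")
# Everything the debugger prints is captured here instead of the terminal.
transcript = io.StringIO()


def scaled_sum(values, factor):
    # Hypothetical function used only to have something to break into.
    total = sum(values)
    # Same idea as pdb.set_trace(), but with scripted input and captured output.
    pdb.Pdb(stdin=commands, stdout=transcript, nosigint=True, readrc=False).set_trace()
    return total * factor


result = scaled_sum([1000, 200, 34], 2)

print(transcript.getvalue())  # the captured debugger session
print("result:", result)
```

The exact line at which the debugger first stops differs slightly between Python versions, but in each case the variable total is already defined, so p total reports its value before c resumes the function.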
Although functional, this is not the most elegant or intuitive way of inserting a breakpoint into your code. Python 3.7 implements a more straightforward way of doing so, as we shall see next.
Using the breakpoint() function in Python 3.7
Python 3.7 comes with a built-in breakpoint() function that enters the Python debugger at the call site (or the point in the code at which the breakpoint() statement is placed).
When called, the default implementation of the breakpoint() function will call sys.breakpointhook(), which in turn calls the pdb.set_trace() function. This is convenient because we will not need to import pdb and call pdb.set_trace() explicitly ourselves.
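The indirection through sys.breakpointhook() can be observed without opening a debugger at all. The following is a minimal sketch, assuming Python 3.7 or later; recording_hook is a hypothetical stand-in for a debugger that simply notes that it was called.

```python
import sys

calls = []


def recording_hook(*args, **kwargs):
    # Hypothetical stand-in for a debugger: records the call instead of
    # opening an interactive session.
    calls.append("hook called")


saved_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    breakpoint()  # dispatched to recording_hook, not to pdb
finally:
    sys.breakpointhook = saved_hook  # put the previous hook back

print(calls)  # ['hook called']
```

The interpreter also keeps the original hook in sys.__breakpointhook__, so the default behavior can always be restored from there.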
Let’s reconsider the code for implementing the general attention mechanism, and now introduce a breakpoint via the breakpoint() statement:
from numpy import array
from numpy import random
from scipy.special import softmax

# encoder representations of four different words
word_1 = array([1, 0, 0])
word_2 = array([0, 1, 0])
word_3 = array([1, 1, 0])
word_4 = array([0, 0, 1])

# stacking the word embeddings into a single array
words = array([word_1, word_2, word_3, word_4])

# generating the weight matrices
random.seed(42)
W_Q = random.randint(3, size=(3, 3))
W_K = random.randint(3, size=(3, 3))
W_V = random.randint(3, size=(3, 3))

# generating the queries, keys and values
Q = words @ W_Q
K = words @ W_K
V = words @ W_V

# inserting a breakpoint
breakpoint()

# scoring the query vectors against all key vectors
scores = Q @ K.transpose()

# computing the weights by a softmax operation
weights = softmax(scores / K.shape[1] ** 0.5, axis=1)

# computing the attention by a weighted sum of the value vectors
attention = weights @ V

print(attention)
One advantage of using the breakpoint() function is that the default implementation of sys.breakpointhook() consults the value of a new environment variable, PYTHONBREAKPOINT. This environment variable can take various values, based on which different operations can be performed.
For example, setting the value of PYTHONBREAKPOINT to 0 disables all breakpoints. Hence, your code could contain as many breakpoints as necessary, but these can easily be stopped from halting the execution of the code without having to remove them physically. If (for example) the name of the script containing the code is main.py, we would disable all breakpoints by calling it in the command line interface as follows:
PYTHONBREAKPOINT=0 python main.py
Alternatively, we can achieve the same outcome by setting the environment variable in the code itself:
from numpy import array
from numpy import random
from scipy.special import softmax

# setting the value of the PYTHONBREAKPOINT environment variable
import os
os.environ['PYTHONBREAKPOINT'] = '0'

# encoder representations of four different words
word_1 = array([1, 0, 0])
word_2 = array([0, 1, 0])
word_3 = array([1, 1, 0])
word_4 = array([0, 0, 1])

# stacking the word embeddings into a single array
words = array([word_1, word_2, word_3, word_4])

# generating the weight matrices
random.seed(42)
W_Q = random.randint(3, size=(3, 3))
W_K = random.randint(3, size=(3, 3))
W_V = random.randint(3, size=(3, 3))

# generating the queries, keys and values
Q = words @ W_Q
K = words @ W_K
V = words @ W_V

# inserting a breakpoint
breakpoint()

# scoring the query vectors against all key vectors
scores = Q @ K.transpose()

# computing the weights by a softmax operation
weights = softmax(scores / K.shape[1] ** 0.5, axis=1)

# computing the attention by a weighted sum of the value vectors
attention = weights @ V

print(attention)
The value of PYTHONBREAKPOINT is consulted every time that sys.breakpointhook() is called. This means that the value of this environment variable can be changed during the code execution and the breakpoint() function would respond accordingly.
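To see this run-time behavior without an interactive session, the sketch below switches the variable between a callable and 0 while the program is running. It assumes Python 3.7 or later; the module name demo_hooks and its record function are hypothetical and are registered in sys.modules only so that the default hook can import them by name.

```python
import os
import sys
import types

hits = []


def record(*args, **kwargs):
    # Hypothetical debugger entry point that only records its arguments.
    hits.append(args)


# Register a throwaway module so that "demo_hooks.record" is importable by name.
demo_hooks = types.ModuleType("demo_hooks")
demo_hooks.record = record
sys.modules["demo_hooks"] = demo_hooks

saved_hook = sys.breakpointhook
saved_env = os.environ.get("PYTHONBREAKPOINT")
# Make sure the default hook (the one that reads PYTHONBREAKPOINT) is active.
sys.breakpointhook = sys.__breakpointhook__

try:
    os.environ["PYTHONBREAKPOINT"] = "demo_hooks.record"
    breakpoint("first")   # forwarded to demo_hooks.record

    os.environ["PYTHONBREAKPOINT"] = "0"
    breakpoint("second")  # disabled, so nothing happens

    os.environ["PYTHONBREAKPOINT"] = "demo_hooks.record"
    breakpoint("third")   # enabled again
finally:
    # Restore the previous hook and environment.
    sys.breakpointhook = saved_hook
    if saved_env is None:
        os.environ.pop("PYTHONBREAKPOINT", None)
    else:
        os.environ["PYTHONBREAKPOINT"] = saved_env

print(hits)  # [('first',), ('third',)]
```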
The PYTHONBREAKPOINT environment variable can also be set to other values, such as the name of a callable. Say, for instance, that we’d like to use a Python debugger other than pdb, such as ipdb (run pip install ipdb first, if the debugger has not yet been installed). In this case, we would call the main.py script in the command line interface, and hook the debugger without making any changes to the code itself:
PYTHONBREAKPOINT=ipdb.set_trace python main.py
In doing so, the breakpoint() function enters the ipdb debugger at the next call site:
> /Users/Stefania/Documents/PycharmProjects/BreakpointPy37/main.py(33)<module>()
32 # scoring the query vectors against all key vectors
---> 33 scores = Q @ K.transpose()
34
ipdb> n
> /Users/Stefania/Documents/PycharmProjects/BreakpointPy37/main.py(36)<module>()
35 # computing the weights by a softmax operation
---> 36 weights = softmax(scores / K.shape[1] ** 0.5, axis=1)
37
ipdb> c
[[0.98522025 1.74174051 0.75652026]
[0.90965265 1.40965265 0.5 ]
[0.99851226 1.75849334 0.75998108]
[0.99560386 1.90407309 0.90846923]]
The function can also take input arguments, as in breakpoint(*args, **kws), which are then passed on to sys.breakpointhook(). This is because any callable (such as a third-party debugger module) might accept optional arguments, which can be passed through the breakpoint() function.
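A small sketch of this forwarding, assuming Python 3.7 or later, is given below. The hook labelled_hook and its label and verbose parameters are invented for the illustration; a real debugger would define its own signature.

```python
import sys

received = []


def labelled_hook(label, verbose=False):
    # Hypothetical hook with its own signature; it only records what it was given.
    received.append((label, verbose))


saved_hook = sys.breakpointhook
sys.breakpointhook = labelled_hook
try:
    # Positional and keyword arguments are passed straight through to the hook.
    breakpoint("after projections", verbose=True)
finally:
    sys.breakpointhook = saved_hook

print(received)  # [('after projections', True)]
```

With the default hook, the same mechanism means that a call such as breakpoint(header="checking scores") hands the header keyword to pdb.set_trace(), which prints it before the debugger prompt appears.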
Writing your own breakpoint() function in earlier versions of Python
Let’s return to the fact that versions of Python earlier than v3.7 do not come with the breakpoint() function readily built in. We can write our own.
Similarly to how the breakpoint() function is implemented from Python 3.7 onwards, we can implement a function that checks the value of an environment variable and:
Skips all breakpoints in the code if the value of the environment variable is set to 0.
Enters into the default Python pdb debugger if the environment variable is an empty string.
Enters into another debugger as specified by the value of the environment variable.
...

# defining our breakpoint() function
def breakpoint(*args, **kwargs):
    import importlib
    # reading the value of the environment variable (an empty string if it is not set)
    val = os.environ.get('PYTHONBREAKPOINT', '')
    # if the value has been set to 0, skip all breakpoints
    if val == '0':
        return None
    # else if the value is an empty string, invoke the default pdb debugger
    elif len(val) == 0:
        hook_name = 'pdb.set_trace'
    # else, assign the value of the environment variable
    else:
        hook_name = val
    # split the string into the module name and the function name
    mod, dot, func = hook_name.rpartition('.')
    # get the function from the module
    module = importlib.import_module(mod)
    hook = getattr(module, func)

    return hook(*args, **kwargs)

...
We can include this function in the code and run it (using a Python 2.7 interpreter, in this case). If we set the value of the environment variable to an empty string, we find that the pdb debugger stops at the point in the code at which we have placed our breakpoint() function. We can then issue debugger commands into the command line from there onwards:
from numpy import array
from numpy import random
from numpy import dot
from scipy.special import softmax

# setting the value of the environment variable
import os
os.environ['PYTHONBREAKPOINT'] = ''

# defining our breakpoint() function
def breakpoint(*args, **kwargs):
    import importlib
    # reading the value of the environment variable (an empty string if it is not set)
    val = os.environ.get('PYTHONBREAKPOINT', '')
    # if the value has been set to 0, skip all breakpoints
    if val == '0':
        return None
    # else if the value is an empty string, invoke the default pdb debugger
    elif len(val) == 0:
        hook_name = 'pdb.set_trace'
    # else, assign the value of the environment variable
    else:
        hook_name = val
    # split the string into the module name and the function name
    mod, dot, func = hook_name.rpartition('.')
    # get the function from the module
    module = importlib.import_module(mod)
    hook = getattr(module, func)

    return hook(*args, **kwargs)

# encoder representations of four different words
word_1 = array([1, 0, 0])
word_2 = array([0, 1, 0])
word_3 = array([1, 1, 0])
word_4 = array([0, 0, 1])

# stacking the word embeddings into a single array
words = array([word_1, word_2, word_3, word_4])

# generating the weight matrices
random.seed(42)
W_Q = random.randint(3, size=(3, 3))
W_K = random.randint(3, size=(3, 3))
W_V = random.randint(3, size=(3, 3))

# generating the queries, keys and values
Q = dot(words, W_Q)
K = dot(words, W_K)
V = dot(words, W_V)

# inserting a breakpoint
breakpoint()

# scoring the query vectors against all key vectors
scores = dot(Q, K.transpose())

# computing the weights by a softmax operation
weights = softmax(scores / K.shape[1] ** 0.5, axis=1)

# computing the attention by a weighted sum of the value vectors
attention = dot(weights, V)

print(attention)
This facilitates the process of breaking into the code for Python versions earlier than v3.7, because it now becomes a matter of setting the value of an environment variable, rather than having to manually introduce (or remove) the import pdb; pdb.set_trace() statement at different call sites in the code.
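The branches of such a hand-written function can be checked without starting a debugger. The sketch below is a condensed variant of the function above, renamed compat_breakpoint so that it does not shadow the built-in on newer interpreters; the fake_debugger module is a hypothetical stand-in registered only for this check.

```python
import importlib
import os
import sys
import types

hits = []

# Hypothetical stand-in debugger module, registered so it can be imported by name.
fake_debugger = types.ModuleType("fake_debugger")
fake_debugger.set_trace = lambda *args, **kwargs: hits.append(args)
sys.modules["fake_debugger"] = fake_debugger


def compat_breakpoint(*args, **kwargs):
    """Condensed variant of the hand-written breakpoint() function."""
    val = os.environ.get("PYTHONBREAKPOINT", "")
    if val == "0":
        return None  # breakpoints are disabled
    hook_name = val if val else "pdb.set_trace"
    mod, _, func = hook_name.rpartition(".")
    hook = getattr(importlib.import_module(mod), func)
    return hook(*args, **kwargs)


saved_env = os.environ.get("PYTHONBREAKPOINT")
try:
    os.environ["PYTHONBREAKPOINT"] = "fake_debugger.set_trace"
    compat_breakpoint("reached")  # forwarded to the stand-in debugger

    os.environ["PYTHONBREAKPOINT"] = "0"
    compat_breakpoint("skipped")  # ignored because breakpoints are disabled
finally:
    # Restore the environment variable to its previous state.
    if saved_env is None:
        os.environ.pop("PYTHONBREAKPOINT", None)
    else:
        os.environ["PYTHONBREAKPOINT"] = saved_env

print(hits)  # [('reached',)]
```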
Limitations of the breakpoint() function
The breakpoint() function allows you to bring in the debugger at a chosen point in the program. You therefore need to know in advance the exact position at which you want the debugger to start, and place the breakpoint there. Consider the following code:
try:
    func()
except:
    breakpoint()
    print("exception!")
This will bring up the debugger when the function func() raises an exception. The exception can be triggered by the function itself, or deep inside some other function that it calls. But the debugger will start at the line print("exception!") above, which may not be very useful.
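The information about where the exception really occurred is not lost, however, because it travels with the traceback. The following sketch, with hypothetical functions inner and func, compares the frame that handles the exception with the frame that raised it.

```python
import sys
import traceback


def inner(x):
    return 1 / x  # the exception is actually raised here


def func():
    return inner(0)


try:
    func()
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    frames = traceback.extract_tb(tb)
    handler_function = frames[0].name   # frame that contains the try block
    raising_function = frames[-1].name  # frame in which the error occurred

print("handler frame:", handler_function)
print("raising frame:", raising_function)
```

A breakpoint() placed in the except block starts in the handler frame, whereas the innermost traceback entry points at inner, which is where one usually wants to look.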
The way that we can bring up the debugger at the point of exception is called the post-mortem debugger. It works by asking Python to register a function that calls pdb.pm() as the handler for uncaught exceptions. When it is called, it looks for the last exception raised and starts the debugger at that point. To use the post-mortem debugger, we just need to register such a handler through sys.excepthook before the program is run.
This is handy because nothing else needs to be changed in the program. As an example, assume we want to evaluate the average of $1/x$ using the following program. It is quite easy to overlook some corner cases, but we can catch the issue when an exception is raised:
import sys
import pdb
import random

def debughook(etype, value, tb):
    pdb.pm() # post-mortem debugger
sys.excepthook = debughook

# Experimentally find the average of 1/x where x is a random integer in 0 to 10000
N = 1000
randomsum = 0
for i in range(N):
    x = random.randint(0, 10000)
    randomsum += 1/x

print("Average is", randomsum/N)
When we run the above program, it may terminate normally or it may raise a division by zero exception, depending on whether the random number generator ever produces zero in the loop. In the latter case, we may see the following:
> /Users/mlm/py_pmhook.py(17)<module>()
-> randomsum += 1/x
(Pdb) p i
16
(Pdb) p x
0
Here we can see at which line the exception was raised, and we can check the values of the variables as we usually do in pdb.
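What the p i and p x commands read in such a session are the local variables of the frame in which the exception occurred. A deterministic sketch of the same idea is shown below; it replaces the random numbers with a fixed, made-up list that contains a zero, and it inspects the innermost traceback frame directly instead of opening the debugger.

```python
import sys


def sum_of_reciprocals(values):
    total = 0.0
    for i, x in enumerate(values):
        total += 1 / x  # fails when x is zero
    return total


try:
    sum_of_reciprocals([4, 2, 0, 5])
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    # Walk to the innermost frame, which is where the exception was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    crash_locals = dict(tb.tb_frame.f_locals)

print("i =", crash_locals["i"])  # i = 2
print("x =", crash_locals["x"])  # x = 0
```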
In fact, it is more convenient to print the traceback and the exception when the post-mortem debugger is launched:
import sys
import pdb
import traceback

def debughook(etype, value, tb):
    traceback.print_exception(etype, value, tb)
    print() # make a new line before launching post-mortem
    pdb.pm() # post-mortem debugger
sys.excepthook = debughook
and the debugger session will be started as follows:
Traceback (most recent call last):
  File "/Users/mlm/py_pmhook.py", line 17, in <module>
    randomsum += 1/x
ZeroDivisionError: division by zero

> /Users/mlm/py_pmhook.py(17)<module>()
-> randomsum += 1/x
(Pdb)
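One possible variation, offered here only as a sketch, is to pass the traceback received by the hook directly to pdb.post_mortem() rather than relying on pdb.pm() to look up the last exception. The factory function make_debughook and its post_mortem parameter are assumptions made for this example; they allow the hook to be exercised with a harmless stand-in instead of an interactive debugger.

```python
import pdb
import sys
import traceback


def make_debughook(post_mortem=pdb.post_mortem):
    """Build an excepthook that prints the traceback, then starts a post-mortem."""
    def debughook(etype, value, tb):
        traceback.print_exception(etype, value, tb)
        post_mortem(tb)  # debug the traceback that was handed to the hook
    return debughook


# For real use, one would install the hook before the program runs:
# sys.excepthook = make_debughook()

# Non-interactive check: swap the debugger for a list that collects tracebacks.
inspected = []
hook = make_debughook(post_mortem=inspected.append)
try:
    1 / 0
except ZeroDivisionError:
    hook(*sys.exc_info())

print("tracebacks handed to the debugger:", len(inspected))
```

Keeping the debugger entry point as a parameter is purely a convenience for testing; with the default argument, the hook opens pdb at the frame where the exception was raised.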
Summary
In this tutorial, you discovered various ways of setting breakpoints in different versions of Python.
Specifically, you learned:
How to invoke the pdb debugger in earlier versions of Python.
How to make use of the new, built-in breakpoint() function introduced in Python 3.7.
How to write your own breakpoint() function to simplify the debugging process in earlier versions of Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.