I have previously unravelled for loops, so the concept of looping has already come up in this blog post series on removing the syntactic sugar from Python. But one aspect of looping that I didn't touch upon is break and continue. Both are statements used to control the flow within a loop: break leaves the loop, while continue jumps back to the top of it.

How the bytecode does it

CPython's interpreter has the ability to jump around to various opcodes. That ability is what allows for break and continue to work. Take the following example (whose print call is there just to have a marker for the end of the loop body):

for x in y:
    if a:
        break
    if b:
        continue
    print('hi')
Example for loop

If you disassemble that for loop you end up with:

  2           0 LOAD_GLOBAL              0 (y)
              2 GET_ITER
        >>    4 FOR_ITER                26 (to 32)
              6 STORE_FAST               0 (x)

  3           8 LOAD_GLOBAL              1 (a)
             10 POP_JUMP_IF_FALSE       16

  4          12 POP_TOP
             14 JUMP_ABSOLUTE           32

  5     >>   16 LOAD_GLOBAL              2 (b)
             18 POP_JUMP_IF_FALSE       22

  6          20 JUMP_ABSOLUTE            4

  7     >>   22 LOAD_GLOBAL              3 (print)
             24 LOAD_CONST               1 ('hi')
             26 CALL_FUNCTION            1
             28 POP_TOP
             30 JUMP_ABSOLUTE            4
        >>   32 LOAD_CONST               0 (None)
             34 RETURN_VALUE
Bytecode for the example for loop

The bytecode at offset 14 is for break and offset 20 is for continue (the POP_TOP at offset 12 also belongs to break, as it discards the loop's iterator before leaving). As you can see, both are JUMP_ABSOLUTE instructions, which means that when the interpreter runs them it immediately goes to the bytecode at the given offset. In this instance break jumps just past the loop, which here happens to be the end of the function, and continue jumps to the top of the for loop. So the bytecode has a way to skip over chunks of code.
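If you want to look at the jumps yourself, a minimal sketch using the standard library's dis module could look like the following. The function name example and its parameters are made up for illustration (the original loop uses globals), and the exact opcodes you get back differ between CPython versions, so no particular output is shown.

```python
import dis


def example(y, a=False, b=False):
    # Same shape as the loop above; y, a and b are parameters here only so
    # that the function is self-contained.
    for x in y:
        if a:
            break
        if b:
            continue
        print('hi')


# Collect the opcode names; which ones appear depends on the CPython version.
opnames = [instruction.opname for instruction in dis.get_instructions(example)]

# Keep only the jump-style instructions that implement the control flow.
jump_opnames = sorted({name for name in opnames if "JUMP" in name})
print(jump_opnames)
```

Calling dis.dis(example) instead prints the full listing in the same layout as the one above.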

How we are going to do it

So how do we do something similar without using those two statements? Exceptions to the rescue! In both instances we need some form of control flow that lets us jump either to the beginning of a loop or to right after it. We can do that based on whether we put the loop inside or outside of a try block.

For break, since we want to jump just past the end of the loop, we want to put the loop inside of a try block and raise an exception where the break statement was. We can then catch that exception and let execution carry us outside of the loop.

class _BreakStatement(Exception):
    pass

try:
    for x in y:
        if a:
            raise _BreakStatement
        if b:
            continue
        print('hi')
except _BreakStatement:
    pass
Using exceptions to desugar break

Handling continue is similar, although the try block is inside the loop this time.

class _BreakStatement(Exception):
    pass
    
class _ContinueStatement(Exception):
    pass

try:
    for x in y:
        try:
            if a:
                raise _BreakStatement
            if b:
                raise _ContinueStatement
            print('hi')
        except _ContinueStatement:
            pass
except _BreakStatement:
    pass
Using exceptions to desugar continue

Because the try block for continue extends to the bottom of the loop body, once the exception is caught control flow just naturally goes back to the top of the loop as expected.
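As a sanity check, here is a small sketch that runs the original loop and the exception-based version side by side. To make the comparison observable, a and b are turned into predicate functions and the print call is replaced with appending to a list; those changes, along with the names with_statements, with_exceptions, is_seven and is_even, are assumptions made purely for illustration.

```python
class _BreakStatement(Exception):
    pass


class _ContinueStatement(Exception):
    pass


def with_statements(y, a, b):
    """The loop written with real break and continue statements."""
    seen = []
    for x in y:
        if a(x):
            break
        if b(x):
            continue
        seen.append(x)
    return seen


def with_exceptions(y, a, b):
    """The same loop with both statements replaced by exceptions."""
    seen = []
    try:
        for x in y:
            try:
                if a(x):
                    raise _BreakStatement
                if b(x):
                    raise _ContinueStatement
                seen.append(x)
            except _ContinueStatement:
                pass  # fall off the end of the body, back to the top
    except _BreakStatement:
        pass  # execution carries on after the loop
    return seen


def is_seven(x):
    return x == 7


def is_even(x):
    return x % 2 == 0


# Stop at 7 and skip even numbers on the way there.
result_statements = with_statements(range(10), is_seven, is_even)
result_exceptions = with_exceptions(range(10), is_seven, is_even)
print(result_statements, result_exceptions)  # [1, 3, 5] [1, 3, 5]
```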

A nice thing about this solution is that it nests appropriately. Since Python has no way to break out of multiple loops via a single break statement (some languages allow this by letting you label the loop and having the break specify which loop you're breaking out of), you will always hit the tightest try block that you're in. And since you need at most one try block per loop for break and one for continue, no matter how many of each statement the loop contains, there's little room to get it wrong. Raising an exception is also a common way to break out of nested loops in Python, so there's already precedent for using it for this sort of control flow.
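To illustrate the nesting behaviour, here is a small sketch with two loops where only the inner one has its break replaced. The function first_negative_per_row and its data are hypothetical; the point is just that the exception is caught by the try block wrapped around the inner loop, so the outer loop keeps going.

```python
class _BreakStatement(Exception):
    pass


def first_negative_per_row(rows):
    """Collect the first negative value of each row, if it has one."""
    found = []
    for row in rows:
        try:
            for value in row:
                if value < 0:
                    found.append(value)
                    raise _BreakStatement  # stands in for `break`
        except _BreakStatement:
            pass  # only the inner loop ends; the outer loop carries on
    return found


result = first_negative_per_row([[1, -2, -3], [4, 5], [-6, 7]])
print(result)  # [-2, -6]
```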

Bringing else clauses into the mix

This also works nicely for else clauses on for and while loops as they simply become else clauses on the try block! So this:

while x:
    break
else:
    print("no `break`")
Example while loop with an else clause

becomes:

try:
    while x:
        raise _BreakStatement
except _BreakStatement:
    pass
else:
    print("no `break`")
Unravelling an else clause on a loop

It's literally just a move of the entire clause from one statement to another!
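Here is a runnable sketch of that move, under the assumption that the loop does a simple search. The functions search_with_else and search_desugared and their arguments are invented for the example; both should agree on whether the else clause runs.

```python
class _BreakStatement(Exception):
    pass


def search_with_else(items, target):
    """A while loop whose else clause runs only when no break happened."""
    remaining = list(items)
    while remaining:
        if remaining.pop(0) == target:
            break
    else:
        return "no `break`"
    return "found"


def search_desugared(items, target):
    """The same loop with the else clause moved onto the try statement."""
    remaining = list(items)
    try:
        while remaining:
            if remaining.pop(0) == target:
                raise _BreakStatement
    except _BreakStatement:
        pass
    else:
        return "no `break`"
    return "found"


print(search_with_else([1, 2, 3], 2), search_desugared([1, 2, 3], 2))
print(search_with_else([1, 2, 3], 9), search_desugared([1, 2, 3], 9))
```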