A bit of Python black magic that lets you efficiently inspect and manipulate execution contexts after crashes, aka, post-mortem debugging.
Author

Nima Sarang

Published

January 30, 2025

1 Introduction

I bet this experience sounds familiar:

Your code has been running smoothly for a while, and you’re finally breathing easy. Then, an hour into execution, an error occurs. The worst part? You have no idea what caused it based on the logs alone, and the source is buried 8 calls deep in the stack trace. You’re dreading the thought of validating the data from start to finish to find the root cause, but there’s no other way.

Well, I might’ve found the solution to this. Python’s built-in traceback objects let you inspect variables at each step of the stack trace and, even better, save the current context for reuse later! If you’re familiar with the pdb debugger, this article is similar in spirit, but it skips the interactive debugger prompt entirely.

This is the most useful code I’ve written in ages, and it took some mighty effort to get it right due to the lack of proper documentation. If you want the TL;DR, just skip to Section 4. Otherwise, let’s dive in.

2 Post-mortem

In debugging, “post-mortem” means figuring out what went wrong after an error has already occurred. It requires no advance setup such as breakpoints or logging.

When a Python program runs, it maintains a call stack that tracks the sequence of function calls. Each entry on this stack is called a frame and represents a function call in progress. A frame contains information about the function in progress, such as the function’s code object, its local variables, the global variables in its namespace, references to the previous frame, and so on. When an exception occurs in Python, the interpreter captures the entire call stack at the moment of the error and preserves all the frames. This stack trace shows the execution path that led to the error.
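To make the idea of linked frames concrete, here is a minimal sketch that walks the live call stack from the innermost frame outward. The function names call_chain, inner, and outer are made up for illustration; inspect.currentframe() and the f_back attribute are part of standard CPython.

```python
import inspect

def call_chain():
    """Return the names of the functions on the call stack, innermost first."""
    names = []
    frame = inspect.currentframe().f_back  # skip call_chain's own frame
    while frame is not None:
        names.append(frame.f_code.co_name)  # name stored in the frame's code object
        frame = frame.f_back                # reference to the caller's frame
    return names

def inner():
    return call_chain()

def outer():
    return inner()

chain = outer()
print(chain[:2])
# ['inner', 'outer']
```

Everything after the first two entries depends on how the snippet is launched (a script, a notebook cell, a test runner), which is why only the first two are printed.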

To leverage this, the most common approach is using the Python debugger, pdb, which lets you inspect each frame in the traceback, and even execute code within the context of the frame. The IPython version of pdb, ipdb, is available in the IPython/Jupyter based environments and makes navigating the stack even easier.

I won’t be explaining pdb here, since what I’m about to propose is an alternative to it. Still, if you’re interested in learning more, the pdb section of the official Python documentation is a good place to start.

3 A New Perspective

For years I’ve wanted a way to easily inspect the context that led to an error. This might be manageable in a notebook where all your code lives together, but for any decent-sized project where code sprawls across multiple files? Not so much.

I understand that pdb offers a solution to some extent, but my main gripe with it is how clunky it feels in terms of navigation and code execution. Why do we need another interactive interface within an interactive notebook?

What sparked my interest was this article by Andy Jones, where he created a helper function that grabs a frame’s context and copies it to IPython’s namespace. This is a great idea, but I wanted to take it further and make it even more user-friendly.

What if we could pick and choose the frames we want to inspect based on the error message, make a copy of the context, and skip pdb altogether as a bonus?

Toy Example

Let’s say we have a project structured like this:

notebook.ipynb
src/
    model.py
    run.py

The code in run.py looks like:

import numpy as np
from .model import ModelClass

def train_model(length: int):
    model = ModelClass()
    data = np.arange(length)
    model.train(data)

And in model.py we have:

class ModelClass:
    def train(self, data):
        try:
            data_norm = data / sum(data)
            self._validate_data(data_norm)
        except:
            raise RuntimeError("Second exception: This is a dummy exception.")

    def _validate_data(self, data):
        if len(data) < 10:
            raise ValueError("First exception: Data is too short.")

Consider a simple scenario where we pass in the wrong data type:

from src.run import train_model

train_model(length="10")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 3
      1 from src.run import train_model
----> 3 train_model(length="10")

File ~/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py:8, in train_model(length)
      6 def train_model(length: int):
      7     model = ModelClass()
----> 8     data = np.arange(length)
      9     model.train(data)

TypeError: arange() not supported for inputs with DType <class 'numpy.dtypes.StrDType'>.

Oops, wrong input! Before doing anything else, let’s grab a reference to the exception:

import sys
exception_1 = sys.last_value
exception_1
TypeError("arange() not supported for inputs with DType <class 'numpy.dtypes.StrDType'>.")

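sys.last_value holds the most recent exception that went unhandled and was reported by the interpreter. There’s also sys.last_type and sys.last_traceback, but we can get both from the exception object directly. (On Python 3.12 and later, the standard interpreter also exposes the same exception as sys.last_exc and deprecates the three older names.)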

import traceback

exc_type = type(exception_1)
exc_tb = exception_1.__traceback__

# Helper function to print the traceback
traceback.print_tb(exc_tb)
  File "/Users/nsarang/micromamba/envs/arclight/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/var/folders/29/fh16rbz95b99yz5df3c6yt2h0000gn/T/ipykernel_8225/205929029.py", line 3, in <module>
    train_model(length="10")
  File "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py", line 8, in train_model
    data = np.arange(length)
           ^^^^^^^^^^^^^^^^^

The traceback reveals the chain of events:

  1. The first frame is from the IPython shell running our notebook.
  2. The second frame is our cell call, with that funky filename like T/ipykernel_8225/205929029.py. That’s a temp-file-style name the Jupyter kernel assigns to each cell so its source can be looked up when building tracebacks.
  3. The third frame is where the error actually happened.

exc_tb is the entry point to the exception’s traceback, which is a linked list of traceback objects, one per call level. Each object’s tb_next points to the next level (one call deeper, and None at the end), and tb_frame gives you the actual frame object at that level.
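As a rough sketch of that linked-list traversal, the generator below follows tb_next until it runs out. The names frames_from_traceback, fail, and caller are illustrative only; the standard library's traceback.walk_tb does the same job.

```python
def frames_from_traceback(tb):
    """Yield (frame, line number) pairs, from the outermost call to the point of failure."""
    while tb is not None:
        yield tb.tb_frame, tb.tb_lineno
        tb = tb.tb_next  # becomes None after the level that raised

def fail(x):
    return 1 / x

def caller():
    return fail(0)

try:
    caller()
except ZeroDivisionError as exc:
    # Collect the function name of every level in the traceback
    visited = [frame.f_code.co_name for frame, _ in frames_from_traceback(exc.__traceback__)]

print(visited[-2:])
# ['caller', 'fail']
```

The last entry is always the frame where the exception was raised, which is usually the first place worth looking.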

traceback.print_tb(exc_tb.tb_next)
  File "/var/folders/29/fh16rbz95b99yz5df3c6yt2h0000gn/T/ipykernel_8225/205929029.py", line 3, in <module>
    train_model(length="10")
  File "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py", line 8, in train_model
    data = np.arange(length)
           ^^^^^^^^^^^^^^^^^
traceback.print_tb(exc_tb.tb_next.tb_next)
  File "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py", line 8, in train_model
    data = np.arange(length)
           ^^^^^^^^^^^^^^^^^
tb_last = exc_tb.tb_next.tb_next
frame = tb_last.tb_frame
frame
<frame at 0x107b065c0, file '/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py', line 8, code train_model>
import inspect
info = inspect.getframeinfo(frame)
info._asdict() # for pretty printing
{'filename': '/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py',
 'lineno': 8,
 'function': 'train_model',
 'code_context': ['    data = np.arange(length)\n'],
 'index': 0}

Now we’re cooking! 🤘 This is exactly the info your IDE uses to create those colorful error messages. Let’s peek at the local variables in this frame:

frame.f_locals
{'length': '10', 'model': <src.model.ModelClass at 0x107a8af50>}

The globals are too massive to print in full, so just the keys:

frame.f_globals.keys()
dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__', '__cached__', '__builtins__', 'np', 'ModelClass', 'train_model'])

With all this data we can:

  • Save local and global variables from each frame
  • Control how much context code gets printed around the error
  • Assign an index to each frame for easy selection later on
  • Control code indentation for readability

Here’s the implementation:

import sys
import inspect
import textwrap

def process_exception(exception, context_lines=1, max_indent=float('inf'), frame_index=0):
    """
    Process an exception and return a formatted traceback message and frame information.

    Parameters
    ----------
    exception : Exception
        The exception to process.
    context_lines : int, optional
        The number of context lines to include around the current line, by default 1.
    max_indent : int, optional
        The maximum indentation to use for the code block in the error message. Defaults to no limit.
    frame_index : int, optional
        The index of the first frame, by default 0.
    """
    tb = exception.__traceback__
    frame_info = []
    
    while tb is not None:
        # Get high-level frame information
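        filename, lineno, function_name, lines, index = inspect.getframeinfo(
            tb, context=context_lines
        )
        # Both are None when the source is unavailable (or context_lines is 0)
        lines = lines or []
        index = index or 0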
        
        # Dedent the lines if the entire block is indented more than max_indent
        lines_dedented = textwrap.dedent("".join(lines)).splitlines()
        if lines_dedented and lines[0] and len(lines[0]) - len(lines_dedented[0]) > max_indent:
            lines = textwrap.indent(
                "\n".join(lines_dedented), " " * max_indent
            ).splitlines()
        
        # Construct the frame message
        start_no = lineno - index
        end_no = lineno + len(lines) - index
        number_width = len(str(end_no))
        
        frame_message = [
            f"┌─── Frame {frame_index} " + "─" * 40,
            f'Function {function_name}, in file "{filename}"',
        ]
        
        for i, file_lineno in enumerate(range(start_no, end_no)):
            line = lines[i].rstrip() if i < len(lines) else ""
            prefix = "➤➤➤ " if i == index else "    "
            frame_message.append(f" {prefix}{file_lineno:{number_width}}:  {line}")
        
        frame_message.append("")
        
        frame_info.append({
            "message": "\n".join(frame_message),
            "frame": tb.tb_frame,
            "locals": tb.tb_frame.f_locals.copy(), # shallow copy just in case
            "globals": tb.tb_frame.f_globals.copy(),
            "metadata": {
                "filename": filename,
                "lineno": lineno,
                "function_name": function_name,
                "lines": lines if lines else [],
                "index": index,
            },
        })
            
        tb = tb.tb_next
        frame_index += 1
    
    exception_header = f"{type(exception).__name__}: {str(exception)}"
    traceback_message = "\n".join([frame["message"] for frame in frame_info] + [exception_header])
    return traceback_message, frame_info

Let’s give it a try:

traceback_message, frame_info = process_exception(exception_1, context_lines=5, max_indent=6)
print(traceback_message)
┌─── Frame 0 ────────────────────────────────────────
Function run_code, in file "/Users/nsarang/micromamba/envs/arclight/lib/python3.11/site-packages/IPython/core/interactiveshell.py"
     3575:                await eval(code_obj, self.user_global_ns, self.user_ns)
     3576:            else:
 ➤➤➤ 3577:                exec(code_obj, self.user_global_ns, self.user_ns)
     3578:        finally:
     3579:            # Reset our crash handler in place
 
┌─── Frame 1 ────────────────────────────────────────
Function <module>, in file "/var/folders/29/fh16rbz95b99yz5df3c6yt2h0000gn/T/ipykernel_8225/205929029.py"
     1:  from src.run import train_model
     2:  
 ➤➤➤ 3:  train_model(length="10")
 
┌─── Frame 2 ────────────────────────────────────────
Function train_model, in file "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py"
      5:  
      6:  def train_model(length: int):
      7:      model = ModelClass()
 ➤➤➤  8:      data = np.arange(length)
      9:      model.train(data)
 
TypeError: arange() not supported for inputs with DType <class 'numpy.dtypes.StrDType'>.

frame_info gives us a list of dictionaries, each packed with a frame’s context:

frame_info[2]["locals"]
{'length': '10', 'model': <src.model.ModelClass at 0x107a8af50>}

Processing Chained Exceptions

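Chained exceptions happen when an exception is raised while another is being handled. PEP 3134 describes two kinds: implicit chaining, where the original exception is stored in the new one’s __context__, and explicit chaining via raise ... from ..., which stores it in __cause__.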
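A small sketch of the difference, using throwaway functions (implicit, explicit, suppressed, and capture are names invented for this example):

```python
def implicit():
    try:
        {}["missing"]
    except KeyError:
        raise RuntimeError("implicit chaining")  # KeyError lands in __context__

def explicit():
    try:
        {}["missing"]
    except KeyError as err:
        raise RuntimeError("explicit chaining") from err  # KeyError lands in __cause__

def suppressed():
    try:
        {}["missing"]
    except KeyError:
        raise RuntimeError("chain hidden") from None  # sets __suppress_context__

def capture(func):
    """Run func and return the RuntimeError it raises."""
    try:
        func()
    except RuntimeError as exc:
        return exc

exc_implicit = capture(implicit)
exc_explicit = capture(explicit)
exc_suppressed = capture(suppressed)

print(type(exc_implicit.__context__).__name__, exc_implicit.__cause__)
# KeyError None
print(type(exc_explicit.__cause__).__name__, exc_explicit.__suppress_context__)
# KeyError True
print(exc_suppressed.__cause__, exc_suppressed.__suppress_context__)
# None True
```

Note that `from None` does not erase the original exception: it is still reachable through __context__, it is just flagged as not worth displaying.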

To handle these, I’ll borrow a function from the pdb module that walks the linked exceptions and returns them in chronological order:

def get_chained_exceptions(exc):
    """
    Given an exception, return a tuple of chained exceptions.

    Borrowed and modified from the `pdb` module.
    """
    _exceptions = []
    current = exc
    reason = None

    while current is not None:
        if (current, reason) in _exceptions:
            break
        _exceptions.append((current, reason))

        if current.__cause__ is not None:
            current = current.__cause__
            reason = "__cause__"
        elif (
            current.__context__ is not None and not current.__suppress_context__
        ):
            current = current.__context__
            reason = "__context__"

    return tuple(reversed(_exceptions))

Now let’s wrap it all together:

def extract_from_exception(exception=None, context_lines=5, max_indent=8):
    """
    Build a traceback message with surrounding code context, supporting chained exceptions.

    exception : Exception, optional
        The exception to process. If not provided, the last exception will be used.
    
    For other parameters, see `process_exception`.
    """
    if exception is None:
        exception = sys.last_value

    frames_info = []
    traceback_message = []

    for exc_value, exc_reason in get_chained_exceptions(exception):
        traceback_message_single, frame_info_single = process_exception(
            exc_value, context_lines, max_indent, frame_index=len(frames_info)
        )
        traceback_message.append(traceback_message_single)
        frames_info.extend(frame_info_single)

        if exc_reason == "__context__":
            traceback_message.append(
                "\n\nDuring handling of the above exception, "
                "another exception occurred:\n\n"
            )
        elif exc_reason == "__cause__":
            traceback_message.append(
                "\n\nThe above exception was the direct cause "
                "of the following exception:\n\n"
            )

    traceback_message = "\n".join(traceback_message)
    return traceback_message, frames_info

4 Working Examples

4.1 Frame Inspection

Let’s try a more complex example with multiple exceptions. Using our earlier example from Section 3, let’s run train_model with a length of 5:

from src.run import train_model

train_model(length=5)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/model.py:5, in ModelClass.train(self, data)
      4     data_norm = data / sum(data)
----> 5     self._validate_data(data_norm)
      6 except:

File ~/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/model.py:11, in ModelClass._validate_data(self, data)
     10 if len(data) < 10:
---> 11     raise ValueError("First exception: Data is too short.")

ValueError: First exception: Data is too short.

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
Cell In[15], line 3
      1 from src.run import train_model
----> 3 train_model(length=5)

File ~/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py:9, in train_model(length)
      7 model = ModelClass()
      8 data = np.arange(length)
----> 9 model.train(data)

File ~/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/model.py:7, in ModelClass.train(self, data)
      5     self._validate_data(data_norm)
      6 except:
----> 7     raise RuntimeError("Second exception: This is a dummy exception.")

RuntimeError: Second exception: This is a dummy exception.

The above is Jupyter’s error message. Let’s see how ours compares:

traceback_message, frame_info = extract_from_exception(context_lines=5, max_indent=4)
print(traceback_message)
┌─── Frame 0 ────────────────────────────────────────
Function train, in file "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/model.py"
     3:      try:
     4:          data_norm = data / sum(data)
 ➤➤➤ 5:          self._validate_data(data_norm)
     6:      except:
     7:          raise RuntimeError("Second exception: This is a dummy exception.")
 
┌─── Frame 1 ────────────────────────────────────────
Function _validate_data, in file "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/model.py"
      7:              raise RuntimeError("Second exception: This is a dummy exception.")
      8:  
      9:      def _validate_data(self, data):
     10:          if len(data) < 10:
 ➤➤➤ 11:              raise ValueError("First exception: Data is too short.")
 
ValueError: First exception: Data is too short.


During handling of the above exception, another exception occurred:


┌─── Frame 2 ────────────────────────────────────────
Function run_code, in file "/Users/nsarang/micromamba/envs/arclight/lib/python3.11/site-packages/IPython/core/interactiveshell.py"
     3575:              await eval(code_obj, self.user_global_ns, self.user_ns)
     3576:          else:
 ➤➤➤ 3577:              exec(code_obj, self.user_global_ns, self.user_ns)
     3578:      finally:
     3579:          # Reset our crash handler in place
 
┌─── Frame 3 ────────────────────────────────────────
Function <module>, in file "/var/folders/29/fh16rbz95b99yz5df3c6yt2h0000gn/T/ipykernel_8225/4277452375.py"
     1:  from src.run import train_model
     2:  
 ➤➤➤ 3:  train_model(length=5)
 
┌─── Frame 4 ────────────────────────────────────────
Function train_model, in file "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py"
      5:  
      6:  def train_model(length: int):
      7:      model = ModelClass()
      8:      data = np.arange(length)
 ➤➤➤  9:      model.train(data)
 
┌─── Frame 5 ────────────────────────────────────────
Function train, in file "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/model.py"
      5:              self._validate_data(data_norm)
      6:          except:
 ➤➤➤  7:              raise RuntimeError("Second exception: This is a dummy exception.")
      8:  
      9:      def _validate_data(self, data):
 
RuntimeError: Second exception: This is a dummy exception.

Look at that! The full chain of exceptions is displayed with the relevant code context. We can now inspect the variables at each step and even rerun the code to see what happens.

frame_info[0]["locals"]
{'self': <src.model.ModelClass at 0x107a5f6d0>,
 'data': array([0, 1, 2, 3, 4]),
 'data_norm': array([0. , 0.1, 0.2, 0.3, 0.4])}
print(frame_info[1]["message"])

# Check if the length of 'data' is actually less than 10
frame_info[1]["locals"]
┌─── Frame 1 ────────────────────────────────────────
Function _validate_data, in file "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/model.py"
      7:              raise RuntimeError("Second exception: This is a dummy exception.")
      8:  
      9:      def _validate_data(self, data):
     10:          if len(data) < 10:
 ➤➤➤ 11:              raise ValueError("First exception: Data is too short.")
{'self': <src.model.ModelClass at 0x107a5f6d0>,
 'data': array([0. , 0.1, 0.2, 0.3, 0.4])}
print(frame_info[4]["message"])

# Going back to the 'train_model' call
frame_info[4]["locals"]
┌─── Frame 4 ────────────────────────────────────────
Function train_model, in file "/Users/nsarang/Nimas/nsarang.github.io/blog/2025-01-30-post-mortem/src/run.py"
      5:  
      6:  def train_model(length: int):
      7:      model = ModelClass()
      8:      data = np.arange(length)
 ➤➤➤  9:      model.train(data)
{'length': 5,
 'model': <src.model.ModelClass at 0x107a5f6d0>,
 'data': array([0, 1, 2, 3, 4])}
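Picking frames by index works, but in a long traceback it can be handier to filter by function or file. Below is one possible helper, not part of the code above: find_frames is a hypothetical name, and it only assumes that each entry has the "metadata" dictionary built by process_exception. The demo list is stand-in data with the same shape.

```python
def find_frames(frame_info, function_name=None, filename_contains=None):
    """Return the entries of frame_info that match all of the given filters."""
    matches = []
    for entry in frame_info:
        meta = entry["metadata"]
        if function_name is not None and meta["function_name"] != function_name:
            continue
        if filename_contains is not None and filename_contains not in meta["filename"]:
            continue
        matches.append(entry)
    return matches

# Stand-in data mimicking the structure of frame_info
demo_frames = [
    {"metadata": {"function_name": "run_code",
                  "filename": "/site-packages/IPython/core/interactiveshell.py"},
     "locals": {}},
    {"metadata": {"function_name": "train_model", "filename": "/project/src/run.py"},
     "locals": {"length": 5}},
    {"metadata": {"function_name": "train", "filename": "/project/src/model.py"},
     "locals": {"data": [0, 1, 2, 3, 4]}},
]

# Keep only frames from the project's own source tree
project_frames = find_frames(demo_frames, filename_contains="/src/")
print([f["metadata"]["function_name"] for f in project_frames])
# ['train_model', 'train']

print(find_frames(demo_frames, function_name="train")[0]["locals"])
# {'data': [0, 1, 2, 3, 4]}
```

Filtering on the file path is a cheap way to hide the IPython machinery frames and keep only your own code.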

4.2 Code Execution

What if we want to execute some code within a frame’s context, just like in pdb?

def execute(source: str, context: dict):
    """
    Execute the given source code in the given context.
    """
    source = textwrap.dedent(source)
    # compile to a code object; "<string>" is the filename reported in tracebacks
    code = compile(source, "<string>", "exec")
    exec(code, context["globals"], context["locals"])
execute(r"""
        data = data + 10
        print("data:", data)
        print("locals:", locals().keys())
        """,
        context=frame_info[4])
data: [10 11 12 13 14]
locals: dict_keys(['length', 'model', 'data'])

See that? The assignment rebound data in our saved copy of the frame’s locals (the output lists that dictionary’s keys), so later calls with the same context will see the new value.
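The reason this works is that exec, when given separate globals and locals mappings, stores any names the code binds in the locals mapping. Here is a stripped-down sketch with plain dictionaries standing in for a saved frame context (saved_globals, saved_locals, and original are illustrative names):

```python
original = [0, 1, 2]
saved_globals = {"offset": 10}
saved_locals = {"data": original}  # same object as `original`, but a separate dict

# The assignment creates a new list and rebinds `data` in saved_locals only
exec("data = data + [offset]", saved_globals, saved_locals)

print(saved_locals["data"])
# [0, 1, 2, 10]
print(original)
# [0, 1, 2]
print("data" in saved_globals)
# False
```

Rebinding a name leaves the original object alone. An in-place mutation such as data.append(offset) would be visible through every reference to that list, because the saved locals are only a shallow copy.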

We could also bring the frame’s locals into IPython’s namespace like in Andy’s approach, but there’s a risk of overwriting existing variables. Best to be selective about what we bring in.
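One way to stay selective is sketched below. pull_variables is a hypothetical helper, not something from Andy's article or the code above; it copies only the names you ask for and, by default, refuses to clobber anything already defined in the target namespace.

```python
def pull_variables(context, names, target, overwrite=False):
    """Copy selected locals from a saved frame context into the target namespace."""
    copied = []
    for name in names:
        if name not in context["locals"]:
            raise KeyError(f"{name!r} is not a local variable of this frame")
        if name in target and not overwrite:
            continue  # leave the existing variable untouched
        target[name] = context["locals"][name]
        copied.append(name)
    return copied

# Stand-in for one entry of frame_info
demo_context = {"locals": {"length": 5, "data": [0, 1, 2, 3, 4]}}
namespace = {"data": "already defined in the notebook"}

print(pull_variables(demo_context, ["length", "data"], namespace))
# ['length']
print(namespace["data"])
# already defined in the notebook
```

In a notebook, passing globals() as the target would drop the selected variables straight into the interactive namespace.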

Citation

BibTeX citation:
@online{sarang2025,
  author = {Sarang, Nima},
  title = {Saving {Time:} {Post-mortem} {Debugging} in {Python}},
  date = {2025-01-30},
  url = {https://www.nimasarang.com/blog/2025-01-30-post-mortem/},
  langid = {en}
}
For attribution, please cite this work as:
N. Sarang, “Saving Time: Post-mortem Debugging in Python.” [Online]. Available: https://www.nimasarang.com/blog/2025-01-30-post-mortem/