Distributed tracing, context managers, Django’s StreamingHttpResponse

In my work at Bevy, I recently implemented a streaming endpoint in our Django app using StreamingHttpResponse. For anyone unfamiliar with this class, using it is straightforward. You just need to pass it some iterable, and it will stream the content.

from django.http import StreamingHttpResponse

def my_generator():
  for i in range(3):
    yield i

def my_view(request):
  return StreamingHttpResponse(
    my_generator(),
    content_type="text/event-stream",
  )

And you'll get a view that streams 012, one digit at a time.

But the gotcha came from how we normally do instrumentation. We use Datadog for distributed tracing and structlog for structured logging, and using a streaming response broke our normal usage patterns for both.

First, the tracing. We typically trace an entire callable, like so:

from ddtrace import tracer

class MyStreamingThing:
  @tracer.wrap(service="my_service")
  def stream_stuff(self):
    for i in range(3):
      yield i

This works for callables that never yield up to the caller. But it doesn't work for streaming. The trace doesn't stay open for the entire duration of the work. Instead, I had to manually manage the traces, something like this:

from ddtrace import tracer

class MyStreamingThing:
  def stream_stuff(self):
    span = tracer.trace(
      name="my_streaming_thing.stream_stuff",
      service="my_service",
    )
    try:
      for i in range(3):
        yield i
    finally: # spans MUST be finished to avoid a memory leak!
      span.finish() # FYI, this method is idempotent; very useful!

This worked as expected. Hooray! Logging, on the other hand, was a little more complicated. We use an admittedly outdated version of django-structlog, which includes a piece of middleware that uses structlog's thread-local context to bind logger variables. This middleware uses structlog's threadlocal.tmp_bind context manager around every request to bind request variables like the user ID, IP address, correlation ID, etc. to a new bound logger. These variables are unbound when the context manager exits.

And I'm sure you can see where this is going: we were losing all our request-level logger bindings! Now, the correct solution to this is to upgrade our version of django-structlog, which uses structlog's contextvars module to bind request-level variables that persist across threads. And that's what I'll be doing next week! But for now, I needed a quicker solution.

Enter the immutable logger. structlog lets you separate a logger instance from the thread-local context, which is just what we needed to preserve our request-level variables even after our middleware context manager had exited. We used it like so:

import structlog
from ddtrace import tracer
from django.http import StreamingHttpResponse

# this logger has the context provided by the middleware
logger = structlog.get_logger(__name__)

def my_view(request):
  # this logger will keep its context after the middelware exits
  immutable_logger = structlog.threadlocal.as_immutable(logger)
  return StreamingHttpResponse(
    MyStreamingThing().stream_stuff(immutable_logger),
    content_type="text/event-stream",
  )

class MyStreamingThing:
  def stream_stuff(self, logger):
    # now our logger won't lose its context, and we can continue 
    # to bind extra context as we normally would
    logger = logger.bind(foo="bar")
    span = tracer.trace(
      name="my_streaming_thing.stream_stuff",
      service="my_service",
    )
    try:
      for i in range(3):
        logger.info("Yielding...")
        # log event: {"event": "Yielding...", "foo": "bar", "request-id": ... }
        yield i
    finally:
      span.finish()

And now everything was working as expected. The logger was a bit tedious, as we had to pass it around to any other callables that needed to log anything, but it worked as we expected. As mentioned, using contextvars makes this less verbose (at the cost of more "magic"), but the bigger lesson for me was this: streaming responses and context managers can combine into a bit of a footgun, so it pays to be extra careful!

Instrumentation and Django's StreamingHttpResponse

Subscribe to my newsletter

Justin Thurman

Justin Thurman