Optimize BlockingRunner: Drive By Deadlines, Not Polling

by Alex Johnson

The Problem with Constant Polling in BlockingRunner

Hey there! Let's dive into a bit of a technical puzzle we've been wrestling with in our runtime architecture. At the heart of our system is the SchedulerCore, which acts as the central brain, keeping track of all our commands, inquiries, retries, and, critically, when timeouts should kick in. We have two main ways of interacting with this core: an asynchronous path and a blocking path. The async side is pretty slick; it already knows how to ask the scheduler, "When's the next important thing happening?" using `next_deadline()`. This lets the rest of our event loop play nicely with the scheduler's timing. The blocking path, handled by `BlockingRunner`, does things differently: instead of asking the scheduler, it relentlessly polls for incoming data every 10 milliseconds. This isn't just a minor inefficiency; it's causing a few headaches:

  • Wasting CPU cycles: Imagine asking "Anything new?" 100 times a second, even when you know nothing's coming and the next scheduled event is ages away. That's what `BlockingRunner` has been doing. It leads to a lot of unnecessary wakeups and system calls, burning CPU cycles that could be used elsewhere.
  • Adding jitter and overhead: Every time we try to read data with a timeout, especially in blocking transports like TCP, we end up doing extra work. We have to mess with existing timeout settings, set new ones, read, and then put the old settings back. Doing this every 10 milliseconds adds up, causing unnecessary work and making our timing less predictable. The whole process of handling timeouts also becomes clunky, tied to this fixed polling pace rather than the actual deadlines.
  • Architectural inconsistency: It feels a bit redundant, right? The scheduler already calculates the *exact* "next time something matters" using `SchedulerCore::next_deadline()`. Yet, the `BlockingRunner`’s polling loop completely ignores this valuable piece of information.

Essentially, we have a smart scheduler telling us precisely when to pay attention, but our blocking interface is stubbornly sticking to its own, less informed, 10ms timer. This is where the proposed change comes in – we want to make `BlockingRunner` smarter and more aligned with the rest of the system by making it deadline-driven.

Evidence of the Current Polling Problem

Let's dig a bit deeper into why this 10ms polling in BlockingRunner is problematic and how the evidence points towards a better solution. We've observed a few key behaviors that highlight the inefficiencies:

1. The Unconditional 10ms Poll Loop: The core of the issue lies within the BlockingRunner::run_until_complete method. Currently, it consistently calls recv_into_with_timeout with a hardcoded Duration::from_millis(10). Crucially, when this call times out (returning Err(Error::Timeout)), the `BlockingRunner` simply does nothing and continues its loop. This means that even if the scheduler's next deadline is hours away, the `BlockingRunner` will keep polling every 10ms, regardless of any actual need to do so. This unconditional polling is the primary source of wasted CPU cycles and unnecessary system activity. It's like having a security guard who checks the front door every 10 milliseconds, 24/7, even when the building is clearly empty and locked.

match transport.recv_into_with_timeout(&mut read_buf, Duration::from_millis(10)) {
    Ok(n) => { /* hand the received bytes to the scheduler */ }
    Err(Error::Timeout) => {} // Simply swallow the timeout and poll again in 10ms!
    Err(e) => { /* handle other errors */ }
}

2. Amplified Cost of Per-Call Timeout Mutation: When we look at the implementation of recv_into_with_timeout in blocking transports, we see another layer of inefficiency. For instance, TCP's implementation involves reading the current read timeout, setting a new timeout for the specific read operation, performing the read, and then restoring the original timeout. This sequence of operations – get, set, read, restore – is performed on every single receive attempt. When combined with the 10ms polling interval, this repeated mutation of timeout settings becomes a significant performance drain. It's a lot of overhead just to check for data that might not even be there. Similarly, UDP transports often configure a default read timeout at the connection level (e.g., 5 seconds), but the BlockingRunner's aggressive 10ms polling bypasses the intended use of this configured timeout, forcing it into a much more frequent, and less efficient, polling mode.

// Example of timeout mutation in TCP's recv_into_with_timeout:
// get, set, read, restore on every single receive attempt.
let original_timeout = self.reader.get_ref().read_timeout()?;
self.reader.get_ref().set_read_timeout(Some(duration))?;
let result = self.reader.read(dst);
self.reader.get_ref().set_read_timeout(original_timeout)?;

3. The Scheduler Already Knows Best: This is a critical point. Our SchedulerCore is designed to be the ultimate source of truth for timing. Its next_deadline() method already calculates the earliest moment something important needs to happen. This includes things like acknowledging received data (ACK timeouts), waiting for responses to our requests (socket timeouts), handling outstanding inquiries, managing retries, and even spacing out successive inquiries. The fact that this information is readily available but ignored by BlockingRunner represents a clear architectural disconnect.

4. Async Precedent: The asynchronous part of our system, specifically AsyncAdapter, already follows the desired pattern. Its next_deadline() method directly delegates to core.next_deadline(now). This provides a perfect blueprint and a strong precedent for how the blocking path should operate. If the async path can intelligently defer to the scheduler's timing, the blocking path should be able to do the same.
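
In sketch form, that delegation is a one-liner (the signature here is assumed from the description above, not taken from the real code):

// AsyncAdapter's next_deadline simply defers to the scheduler core.
fn next_deadline(&self, now: Instant) -> Option<Instant> {
    self.core.next_deadline(now)
}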

5. Masked Behavior in Test Transports: To make matters more challenging, our current test transport, ScriptedBlockingTransport, often masks the timeout-driven behavior. Its recv_into_with_timeout implementation tends to ignore the timeout duration passed to it and simply calls the underlying recv_into. This makes it difficult to accurately test and assert the correct timeout selection within the BlockingRunner without updating these test doubles.

These pieces of evidence collectively paint a clear picture: the fixed 10ms polling in BlockingRunner is a significant source of inefficiency and architectural inconsistency. The solution lies in leveraging the existing scheduling intelligence to drive the blocking loop.

The Proposed Solution: A Deadline-Driven BlockingRunner

Our proposed change is straightforward yet impactful: we're going to refactor BlockingRunner::run_until_complete to be deadline-driven. Instead of mindlessly polling every 10 milliseconds, it will intelligently calculate how long it *actually* needs to wait before the next important event. This timeout will be derived directly from the scheduler's calculated deadlines and other relevant timing information. By doing this, we eliminate the fixed polling interval and align the blocking path with the sophisticated timing logic already present in the `SchedulerCore`.

Design: How We'll Make it Deadline-Driven

To achieve this, we'll introduce a small, private helper function within BlockingRunner (or potentially as a local function right inside run_until_complete). This helper's job will be to compute an effective receive timeout. Here’s how it breaks down:

  • Core Scheduler Deadline: First, we'll ask the `SchedulerCore` for its next crucial deadline: core_deadline = self.core.next_deadline(now). From this, we'll calculate how long we need to wait until that deadline: core_wait = core_deadline.saturating_duration_since(now). Using `saturating_duration_since` matters because a deadline that's already in the past yields `Duration::ZERO` instead of an invalid negative duration.
  • Current Call Deadline: We'll also consider any deadline associated with the current command or inquiry. If there's an active deadline, we'll find out how much time is remaining: call_wait = deadline.map(|d| d.remaining_at(now)). This uses our existing `Deadline` API.
  • Transport Read Timeout: Finally, we'll factor in the transport's own configured read timeout: transport_wait = transport.transport_config().read_timeout.

The magic happens when we combine these. Our `recv_timeout` will be the smallest of these non-zero waits: min_nonzero(core_wait, call_wait, transport_wait). We need to be careful here to handle cases where some of these values might be `None` (meaning no specific deadline) or `Duration::ZERO` (meaning we should act immediately or not block at all). If all calculated waits are zero or none, we'll effectively skip blocking and proceed to the next iteration.
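
To make this concrete, here's a minimal sketch of that computation as a free function. It assumes the three inputs have already been extracted (the real helper would pull them from `self.core`, the active `Deadline`, and the transport config), and it reads `Duration::ZERO` as "a deadline is already due, don't block at all":

use std::time::{Duration, Instant};

/// Pick the smallest non-zero wait among the scheduler deadline, the
/// call-level deadline, and the transport's configured read timeout.
/// `None` means "don't block this iteration".
fn effective_recv_timeout(
    core_deadline: Option<Instant>,
    call_remaining: Option<Duration>,
    transport_read_timeout: Option<Duration>,
    now: Instant,
) -> Option<Duration> {
    // saturating_duration_since clamps a past deadline to Duration::ZERO.
    let core_wait = core_deadline.map(|d| d.saturating_duration_since(now));
    let waits = [core_wait, call_remaining, transport_read_timeout];
    // Any zero wait means something is due right now: skip blocking entirely.
    if waits.iter().flatten().any(|d| d.is_zero()) {
        return None;
    }
    waits.into_iter().flatten().min()
}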

Once we have this calculated recv_timeout, we'll replace the hard-coded Duration::from_millis(10) with it. The overall structure of run_until_complete will remain largely the same, ensuring a smooth transition (a sketch of the resulting receive step follows this list):

  1. First, we'll handle any pending sends using next_item_to_send and get_ready_retries.
  2. Next, we'll check for and process any timeouts that have occurred using check_timeouts(now) and execute the resulting `SchedulerAction`s.
  3. Finally, instead of a fixed poll, we'll now block in recv_into_with_timeout using our calculated recv_timeout. This block will only last until either data arrives or the next meaningful scheduler deadline is reached.
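
Putting it together, step 3 might look roughly like this. It's a sketch reusing the `effective_recv_timeout` helper from above; `transport`, `read_buf`, and the surrounding loop are as in the existing implementation:

let now = Instant::now();
let recv_timeout = effective_recv_timeout(
    self.core.next_deadline(now),
    deadline.map(|d| d.remaining_at(now)),
    transport.transport_config().read_timeout,
    now,
);

match recv_timeout {
    // Nothing to wait for, or a deadline is already due: loop again so
    // check_timeouts(now) can act immediately.
    None => continue,
    Some(timeout) => match transport.recv_into_with_timeout(&mut read_buf, timeout) {
        Ok(n) => { /* hand the received bytes to the scheduler */ }
        // Timing out now means a scheduler deadline is due, not "poll again":
        // the next iteration's check_timeouts(now) will fire the action.
        Err(Error::Timeout) => {}
        Err(e) => { /* handle other errors as before */ }
    },
}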

We also need to consider an edge case: what if `core.next_deadline(now)` returns `None` even though there's still a pending `target_cmd_id`? Ideally this shouldn't happen if everything is wired up correctly, but to be safe, we'll fall back to using the transport's configured read timeout (bounded, of course). We might also emit a `trace!` log under `RUNTIME_TRACE` in this scenario to help developers debug any unexpected behavior.
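
In sketch form, that guard could sit right before the timeout computation (assuming `target_cmd_id` is accessible as a field and `trace!` is the `RUNTIME_TRACE`-gated macro mentioned above):

if self.core.next_deadline(now).is_none() && self.target_cmd_id.is_some() {
    // A command is still pending but the scheduler reports no deadline;
    // the min computation falls back to the transport read timeout, and
    // this log makes the unexpected state visible under RUNTIME_TRACE.
    trace!("BlockingRunner: pending target_cmd_id but no scheduler deadline");
}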

Why This Aligns and Is "Zero-Cost" in Spirit

This approach is particularly appealing because it's not about introducing new, complex machinery. Instead, it leverages the existing components and guarantees of our system:

  • Uses Existing Primitives: We're making use of types and invariants that are already fundamental to our codebase, such as SchedulerCore::next_deadline and the Deadline::remaining_at API. No new scheduling systems need to be invented or integrated.
  • Reduces Runtime Work: The primary goal is to cut down on unnecessary operations. By eliminating the constant wakeups and the repeated, per-call timeout mutations in transport implementations, we significantly reduce the runtime overhead without altering the fundamental behavior or correctness of the protocol. It's a pure gain in efficiency.

This makes the change feel like a natural evolution rather than a disruptive overhaul. It's about making the blocking path work in harmony with the rest of the system's intelligence.

Expected Benefits: What We'll Gain

By adopting this deadline-driven approach for `BlockingRunner`, we anticipate several significant improvements:

  • Enhanced Correctness and Reduced Latency: Timeout handling will become precisely aligned with the actual computed deadlines. This means that actions triggered by timeouts – whether they are acknowledgments, retries, or other scheduled events – will fire exactly when they are supposed to, rather than being constrained by a coarse 10ms polling interval. This can lead to a noticeable reduction in latency and jitter for operations that depend on timely timeout responses.
  • Improved Maintainability Through Architectural Consistency: One of the key benefits is ironing out an architectural inconsistency. Both the asynchronous and blocking runtimes will now operate under the same fundamental scheduling contract: "Wake up and act when the scheduler says it's time." This shared paradigm simplifies understanding, debugging, and future development, as there's one less disparate behavior to account for.
  • Tangible Performance Gains: The most immediate and measurable benefit will be in performance. Eliminating the ~100 wakeups per second that occurred during idle periods means less CPU consumption. Furthermore, the drastic reduction in the frequency of per-call timeout mutations within transport implementations (like TCP's `set_read_timeout`) will free up valuable processing time and reduce overhead.

In essence, this change promises a more responsive, efficient, and easier-to-manage system by making the blocking interface as intelligent about timing as its asynchronous counterpart.

Acceptance Criteria: How We'll Know We've Succeeded

To ensure this refactoring is successful and meets our goals, we'll be looking for the following conditions to be met:

  • No More Fixed 10ms Polling: The most obvious indicator will be that the BlockingRunner no longer contains a hard-coded 10ms interval for receive polling. The polling duration should be dynamically determined based on scheduler deadlines.
  • No Regression in Behavior: Critically, we must ensure that this change doesn't break existing functionality. This means that all command completion times, timeout firings, and retry mechanisms must continue to operate correctly and no later than the deadlines calculated by the scheduler. Specifically, call-level deadlines that are intended to cancel operations must still enforce that cancellation promptly, even if the calculated blocking timeout is longer than the remaining call duration.
  • Scalable Receive Attempts: In scenarios where no data is inbound, the frequency of receive attempts should scale with the scheduler's deadlines. Instead of a constant ~100 attempts per second, the number of calls to recv_into_with_timeout should drop dramatically, ideally to one wakeup per scheduled timeout or retry event. This scaling is the hallmark of a truly deadline-driven approach.

Meeting these criteria will confirm that we've successfully implemented a more efficient and architecturally consistent deadline-driven `BlockingRunner`.

Considering Alternatives: Other Paths Explored

When tackling a problem like this, it's always wise to consider alternative solutions and understand why the chosen path is the most appropriate. We've thought about a few other ways we could have approached making `BlockingRunner` more efficient, but each comes with its own set of trade-offs:

  1. Make the Polling Interval Configurable

    One idea might be to simply make the 10ms polling interval configurable, perhaps via a new setting in TransportConfig like poll_interval.

    Trade-off: While this offers some flexibility, it doesn't fundamentally solve the core issue. We'd still be performing periodic wakeups that are entirely decoupled from the actual state of the scheduler. If the scheduler's next deadline is two minutes away, we'd still be waking up every configured interval (e.g., 10ms or 50ms), unnecessarily consuming resources. This approach fails to leverage the rich, existing scheduling contract that tells us precisely *when* we need to wake up. It keeps the unnecessary runtime churn.

  2. Rely Solely on Transport-Configured Read Timeout

    Another possibility is to remove the 10ms poll altogether and rely solely on the read_timeout already configured within the transport itself (like the 5-second default in UDP).

    Trade-off: This is problematic because transport-level read timeouts are often independent of the scheduler's internal deadlines. The scheduler might determine that an acknowledgment or a retry needs to be sent much sooner than the transport's default read timeout allows. For example, if an ACK is due in 50ms but the transport's read timeout is 5 seconds, relying only on the transport timeout would mean we miss the critical scheduler deadline. Our `next_deadline()` function already encodes the *earliest* time something needs attention, and this approach would ignore that critical information, potentially leading to missed deadlines and incorrect protocol behavior.

  3. Introduce OS-Level Multiplexing (select/poll/epoll)

    A more significant undertaking would be to introduce operating system-level I/O multiplexing primitives like `select`, `poll`, or `epoll` directly into the blocking runtime. These mechanisms allow a program to monitor multiple file descriptors (like network sockets) and wait until one or more of them become ready for I/O.

    Trade-off: While powerful, this approach introduces a considerable amount of complexity and platform-specific code. We'd need to manage the registration of sockets, handle various event types, and integrate this new multiplexing layer. Given that our system *already* has a well-defined concept of "when to wake" via the `SchedulerCore` and a usable `recv_into_with_timeout` abstraction, adding OS-level multiplexing feels like overkill. It doesn't leverage the existing scheduler-driven timing that we've already built and relies upon. The complexity increase seems disproportionate to the problem we're solving, especially when a simpler, more integrated solution exists.

Recommendation: The Deadline-Driven Path

Based on this evaluation, we strongly recommend adopting the deadline-driven approach. It represents the path of least resistance while yielding the greatest benefits:

  • Minimal Change, Maximum Alignment: It's the smallest modification required to make the blocking runtime fully conform to the repository's existing scheduler-centric architecture.
  • Removes Measurable Overhead: It directly addresses and eliminates significant runtime inefficiencies without introducing new dependencies or expanding the public API surface.

This approach ensures that our blocking interface is not just functional but also as efficient and intelligent as its asynchronous counterpart, seamlessly integrating with the system's core timing mechanisms.

Testing Plan: Ensuring a Smooth Transition

To make sure this refactoring goes smoothly and doesn't introduce any regressions, we've outlined a comprehensive testing plan. This plan covers unit tests for the core logic, integration-style tests for the `BlockingRunner` itself, and ensures our existing tests remain effective.

1. Unit Tests for Timeout Computation

We'll start by creating focused unit tests for the new helper function (or internal logic) responsible for calculating the effective receive timeout. These tests will isolate the computation logic and verify its correctness under various conditions (example tests follow this list):

  • Inputs: We'll feed the computation function different scenarios, including:
    • Results from SchedulerCore::next_deadline(now) (we'll simulate pending ACK timeouts and retry queue entries).
    • Calculated remaining time from Deadline::remaining_at(now).
    • Values from TransportConfig.read_timeout.
  • Assertions: We'll verify that the computed timeout:
    • Correctly selects the minimum of the available non-zero wait times.
    • Handles cases where deadlines are `None` appropriately.
    • Never returns a negative duration (ensuring `saturating_duration_since` works as expected).
    • Treats `Duration::ZERO` as an instruction to not block and proceed immediately.
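
Using the `effective_recv_timeout` sketch from the design section (with the same `std::time` imports), a couple of these assertions might look like this:

#[test]
fn picks_the_smallest_nonzero_wait() {
    let now = Instant::now();
    let core = Some(now + Duration::from_millis(50)); // e.g. a pending ACK timeout
    let call = Some(Duration::from_secs(2));          // call-level deadline remaining
    let transport = Some(Duration::from_secs(5));     // TransportConfig.read_timeout
    assert_eq!(
        effective_recv_timeout(core, call, transport, now),
        Some(Duration::from_millis(50))
    );
}

#[test]
fn due_or_absent_deadlines_mean_no_blocking() {
    let now = Instant::now();
    // A core deadline already in the past saturates to ZERO, so the helper
    // says "don't block" rather than ever producing a negative duration.
    let past = Some(now - Duration::from_millis(1));
    assert_eq!(effective_recv_timeout(past, Some(Duration::from_secs(2)), None, now), None);
    // With no deadlines at all there's nothing to wait for either.
    assert_eq!(effective_recv_timeout(None, None, None, now), None);
}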

2. Integration-Style Blocking Runner Test (Non-Flaky)

To test the `BlockingRunner` in a more integrated fashion, we'll implement a specialized test double for the transport layer. This new `BlockingTransport` test harness (sketched after this list) will:

  • Record Timeout Arguments: It will meticulously record the `timeout` duration passed to every call of recv_into_with_timeout.
  • Simulate Blocking: It will simulate blocking for approximately the duration specified by the timeout argument. If a deterministic simulation is feasible, we'll opt for that to ensure test stability.
  • Return Controlled Frames: It will be able to return controlled data frames when needed, simulating actual network activity.
  • Validation: We will then validate that:
    • The recorded timeout is not a constant 10ms. Instead, it should dynamically track the scheduler's next deadline (e.g., being close to an ACK timeout or the next scheduled retry time).
    • Over a fixed period, the number of calls to recv_into_with_timeout is significantly less than the previous ~100Hz polling rate, confirming deadline-driven behavior.
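
Since the real `BlockingTransport` trait isn't reproduced here, the following is a minimal sketch of such a recording double; the inherent method, field names, and stand-in error type are all illustrative:

use std::time::Duration;

/// Stand-in for the crate's real error type.
#[derive(Debug)]
enum Error {
    Timeout,
}

/// Records every timeout the runner passes to recv_into_with_timeout,
/// then behaves as if no data arrived within that window.
struct RecordingTransport {
    observed_timeouts: Vec<Duration>,
}

impl RecordingTransport {
    fn recv_into_with_timeout(&mut self, _dst: &mut [u8], timeout: Duration) -> Result<usize, Error> {
        self.observed_timeouts.push(timeout);
        // A deterministic harness would advance a mock clock here rather
        // than sleeping for `timeout`; either way, report a timeout.
        Err(Error::Timeout)
    }
}

A test can then assert that `observed_timeouts` never settles on a constant 10ms and that its length over a fixed window stays well below the old ~100 calls per second.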

3. Update Existing Scripted Transport for Better Coverage

To avoid needing a separate transport harness for all blocking runner tests, we plan to update the existing ScriptedBlockingTransport. Currently, its recv_into_with_timeout often ignores the provided timeout. We'll modify it to respect the timeout duration (instead of just delegating to recv_into). This will allow our existing test suite to automatically gain better coverage for correct timeout selection without requiring major test rewrites.
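
Assuming the scripted transport holds its scripted frames in a queue (the real field and error names may differ), the adjusted method might look like this:

fn recv_into_with_timeout(&mut self, dst: &mut [u8], timeout: Duration) -> Result<usize, Error> {
    match self.scripted_frames.pop_front() {
        Some(frame) => {
            // Deliver the next scripted frame, as before.
            let n = frame.len().min(dst.len());
            dst[..n].copy_from_slice(&frame[..n]);
            Ok(n)
        }
        None => {
            // Respect the caller's timeout instead of delegating straight
            // to recv_into, so tests can observe the durations the runner picks.
            std::thread::sleep(timeout);
            Err(Error::Timeout)
        }
    }
}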

4. Regression Suite

We will run our existing suite of blocking runner tests. These tests typically cover scenarios like send failures and timeout paths. Ensuring that these tests pass without modification will give us confidence that the external behavior of the `BlockingRunner` remains unchanged. We'll also add a specific regression test to confirm that call-level `Deadline` cancellations continue to function correctly and promptly return `Error::Timeout`, even when the computed `recv_timeout` is longer than the remaining call duration.

5. Observability Check (Manual/Dev)

Finally, for manual verification and developer confidence, we'll enable the `RUNTIME_TRACE` logging. By running a scenario with no inbound data and observing the logs (or simple counters if implemented), we'll confirm that receive attempts occur at the cadence dictated by scheduler deadlines, rather than the old fixed 10ms interval. We'll also verify that scheduled retry and timeout actions still trigger as expected, ensuring the system's core logic remains intact.

Conclusion: A Smarter, More Efficient Runtime

By shifting `BlockingRunner` from a fixed-interval polling model to a deadline-driven one, we're not just fixing an inefficiency; we're fundamentally aligning its behavior with the intelligent, scheduler-centric design of our entire system. This change promises to reduce unnecessary CPU wakeups, eliminate redundant work in transport operations, and importantly, ensure that timeouts and retries happen precisely when they need to, not just because a timer ticked over. The result is a more responsive, performant, and maintainable runtime environment.

For further insights into asynchronous programming and efficient I/O handling in Rust, you might find these resources helpful:

  • Read more about asynchronous programming in Rust at the official Rust documentation: Rust Async Book.
  • Explore best practices for network programming and I/O in Rust: Tokio Tutorial.