Whenever messages are stored in Honcho, background processes kick off to reason about the conversation and generate insights. Reasoning is an asynchronous process and will not immediately generate insights for the latest message you’ve sent. This is by design: we want to reason efficiently over batches of messages rather than assessing each message in a vacuum. Honcho provides several utilities to check the status of the queue.
from honcho import Honcho
honcho = Honcho()

status = honcho.queue_status()
Output types
class QueueStatus(BaseModel):
    completed_work_units: int
    """Completed work units"""

    in_progress_work_units: int
    """Work units currently being processed"""

    pending_work_units: int
    """Work units waiting to be processed"""

    pending_stalled_work_units: int
    """Pending representation work units waiting to accumulate enough tokens to
    hit DERIVER_REPRESENTATION_BATCH_MAX_TOKENS. Always 0 when
    DERIVER_FLUSH_ENABLED is true."""

    pending_ready_work_units: int
    """Pending work units eligible to be claimed: non-representation task types,
    plus representation work units whose pending tokens are at/above the batch
    threshold (or when flush is enabled).
    pending_stalled_work_units + pending_ready_work_units == pending_work_units."""

    total_work_units: int
    """Total work units"""

    sessions: Optional[Dict[str, Sessions]] = None
    """Per-session status when not filtered by session"""
The pending_stalled_work_units / pending_ready_work_units split tells you why pending work isn’t moving: stalled items are sitting below the deriver’s batch token threshold waiting for more messages, while ready items are eligible to be picked up by a worker. The two always sum to pending_work_units.
Each message generates several tasks when it is sent, such as generating insights, cleaning up a representation, or summarizing a conversation. These tasks are defined by who sent the message, which session the message is in, and potentially who is observing the message. We call the combination of these parameters a work_unit. This has a few implications:
  • Tasks within the same work_unit are processed sequentially, but multiple work_units are processed in parallel
  • If local representations are turned on in a Session, a message generates an additional work unit for every peer that has observe_others=True
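The ordering rule above can be illustrated with a small model. This is only a sketch of the idea, not Honcho's deriver: the WorkUnit tuple, run_work_unit, and process_queue are hypothetical names invented for the example, and a thread pool stands in for the real workers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import NamedTuple, Optional


class WorkUnit(NamedTuple):
    """Hypothetical stand-in for a work unit key (not Honcho's internal type)."""
    session_id: str
    sender_id: str
    observer_id: Optional[str]


def run_work_unit(tasks: list[str]) -> list[str]:
    # Tasks that share a work unit run one after another, in arrival order.
    return [f"done:{task}" for task in tasks]


def process_queue(queue: list[tuple[WorkUnit, str]]) -> dict[WorkUnit, list[str]]:
    # Group queued tasks by work unit, preserving arrival order within each group.
    grouped: dict[WorkUnit, list[str]] = defaultdict(list)
    for unit, task in queue:
        grouped[unit].append(task)

    # Distinct work units are independent, so they can run in parallel.
    with ThreadPoolExecutor() as pool:
        futures = {unit: pool.submit(run_work_unit, tasks) for unit, tasks in grouped.items()}
        return {unit: future.result() for unit, future in futures.items()}


alice = WorkUnit("sess_1", "alice", None)
bob = WorkUnit("sess_1", "bob", "alice")
results = process_queue([(alice, "m1"), (bob, "m2"), (alice, "m3")])
print(results[alice])  # ['done:m1', 'done:m3']
```

The two tasks for the first work unit keep their relative order even though the second work unit's task was enqueued between them.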

Tracked task types

The queue status endpoint reports on the following task types:
Task Type       Description
representation  Memory formation: the deriver processes messages and extracts observations about peers
summary         Session summarization: creates short and long summaries at configurable message intervals
dream           Memory consolidation: explores and consolidates observations to improve memory quality
Internal infrastructure tasks (such as webhook delivery, resource deletion, and vector reconciliation) are not included in queue status counts.
Completed counts are not lifetime totals. Honcho periodically cleans up processed queue items to keep the queue table lean. As a result, completed_work_units reflects items completed since the last cleanup cycle, not the total number of items ever processed.
The queue_status method can take additional parameters to scope the status to a specific work unit:
def queue_status(
    self,
    observer_id: str | None = None,
    sender_id: str | None = None,
    session_id: str | None = None,
) -> QueueStatus:
    ...
Additionally, there are queue status methods available on the session objects in each of the SDKs.
Do not wait for the queue to be empty. The queue is a continuous processing system: new messages may arrive at any time, and “completion” is not a meaningful state. Design your application to work without assuming the queue will ever be fully drained. Use queue_status() / queueStatus() for observability and debugging, not for synchronization.
Below is the function signature for the session-level queue status method:
@validate_call
def queue_status(
    self,
    observer_id: str | None = None,
    sender_id: str | None = None,
) -> QueueStatus:
    ...
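In keeping with the advice to treat queue status as an observability signal, a snapshot can be turned into a single log line instead of a wait loop. The sketch below is one possible shape: StatusSnapshot is a stand-in dataclass that mirrors a subset of the QueueStatus fields so the example runs without a server, and describe is a hypothetical helper, not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class StatusSnapshot:
    """Stand-in mirroring a subset of the QueueStatus fields shown earlier."""
    completed_work_units: int
    in_progress_work_units: int
    pending_work_units: int
    pending_stalled_work_units: int
    pending_ready_work_units: int


def describe(status) -> str:
    # The stalled/ready split is documented to sum to the pending count.
    stalled = status.pending_stalled_work_units
    ready = status.pending_ready_work_units
    if stalled + ready != status.pending_work_units:
        raise ValueError("stalled + ready should equal pending")

    if status.pending_work_units == 0:
        reason = "nothing pending"
    elif ready == 0:
        reason = "all pending work is waiting for more tokens"
    else:
        reason = f"{ready} ready for a worker, {stalled} waiting for more tokens"
    return (
        f"in_progress={status.in_progress_work_units} "
        f"pending={status.pending_work_units} ({reason})"
    )


snapshot = StatusSnapshot(
    completed_work_units=12,
    in_progress_work_units=1,
    pending_work_units=3,
    pending_stalled_work_units=2,
    pending_ready_work_units=1,
)
print(describe(snapshot))
```

Because describe only reads attributes, the object returned by queue_status() could presumably be passed in the same way. The point of the design is that the result is logged once and the application moves on, rather than polling until the pending count reaches zero.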

Inspecting individual work units

When the aggregate counts tell you “something is stalled” but not which work units are stalled, use queue_work_units / queueWorkUnits. It returns one row per unprocessed work unit with the token totals, in-progress flag, and threshold classification needed to debug “why isn’t this advancing?”.
The two endpoints count at different granularities. queue/status counts individual queue items (one per message awaiting processing), while queue/work-units returns one row per work unit — items sharing a work_unit_key collapse into a single row. So a status response reporting pending_work_units: 3 can correspond to a single row (total: 1) here when those three items belong to the same work unit.
from honcho import Honcho
honcho = Honcho()

page = honcho.queue_work_units()

# Inspect just the current page
for wu in page.items:
    print(wu.work_unit_key, wu.pending_tokens, wu.hit_threshold)

# Threshold context from the envelope
print(page.representation_batch_max_tokens, page.flush_enabled)

# Walk the full queue (auto-fetches subsequent pages)
for wu in page:
    ...
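The difference in granularity between the two endpoints can be reproduced with plain Python. The list below is made-up data (three queue items sharing one work_unit_key), used only to show how three pending items collapse into a single work-unit row.

```python
from collections import Counter

# Hypothetical pending queue items, each tagged with its work_unit_key.
pending_items = [
    "representation:ws_abc:sess_xyz:peer_observed",
    "representation:ws_abc:sess_xyz:peer_observed",
    "representation:ws_abc:sess_xyz:peer_observed",
]

# queue/status-style count: one per queue item.
item_count = len(pending_items)

# queue/work-units-style rows: one per distinct work_unit_key.
rows = Counter(pending_items)

print(item_count, len(rows))  # 3 1
```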

Cursor pagination

The endpoint is cursor-paginated, not offset-paginated. The queue mutates rapidly (workers claim and complete items continuously), so with offset pagination the remaining rows would shift between fetches and some would be skipped. Cursor pagination uses opaque tokens (next_page / previous_page) that are stable across these mutations. Pass cursor and optionally size to fetch a specific page, or use the helpers on the returned page object to navigate.
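# Explicit cursor navigation
# process() is a placeholder for your own handling code
page = honcho.queue_work_units(size=50)
process(page.items)  # handle the first page before advancing
while page.has_next_page():
    page = page.get_next_page()
    process(page.items)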

# Or pass a cursor token directly
page = honcho.queue_work_units(cursor="<token-from-previous-page>")
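To see why offsets are fragile on a mutating queue, here is a small simulation. It is a sketch under a simplifying assumption: the cursor is modelled as "the last id already returned" (keyset style), whereas Honcho's real cursors are opaque tokens whose format is not described here. offset_page and cursor_page are hypothetical helpers.

```python
from typing import Optional


def offset_page(rows: list[int], offset: int, size: int) -> list[int]:
    # Offset pagination: positions are counted from the current start of the data.
    return rows[offset:offset + size]


def cursor_page(rows: list[int], after: Optional[int], size: int) -> list[int]:
    # Keyset-style cursor: resume strictly after the last key already returned.
    remaining = [r for r in rows if after is None or r > after]
    return remaining[:size]


queue = [1, 2, 3, 4, 5, 6]
offset_first = offset_page(queue, 0, 2)      # [1, 2]
cursor_first = cursor_page(queue, None, 2)   # [1, 2]

# A worker completes item 1 between the two fetches.
queue.remove(1)

offset_second = offset_page(queue, 2, 2)                 # [4, 5]: item 3 was skipped
cursor_second = cursor_page(queue, cursor_first[-1], 2)  # [3, 4]: nothing skipped
print(offset_second, cursor_second)
```

With offsets, item 3 slides into a position the client has already read and is never returned. The cursor resumes relative to the last row seen, so it is unaffected by the removal.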

Per-work-unit fields

Field                   Type        Description
work_unit_key           str         Full key, e.g. representation:ws_abc:sess_xyz:peer_observed
task_type               str         "representation", "summary", or "dream"
session_id              str | null  FK to the session row; null for task types without a session
session_name            str | null  Human-readable session name
observer                str | null  Observer peer (from queue payload)
observed                str | null  Observed peer (from queue payload)
pending_items           int         Unprocessed queue items in this work unit
pending_tokens          int         Sum of messages.token_count across the pending items
tokens_until_threshold  int         Tokens still needed to fire the batch (0 for non-representation task types or when flush is enabled)
hit_threshold           bool        True if eligible to be claimed; false means stalled
in_progress             bool        True if a deriver worker has claimed this work unit
oldest_item_at          datetime    Oldest pending queue-item creation timestamp
newest_item_at          datetime    Newest pending queue-item creation timestamp
Each page also carries the deriver’s threshold configuration so you can interpret per-row classification without re-querying server settings:
Field                            Type  Description
representation_batch_max_tokens  int   DERIVER_REPRESENTATION_BATCH_MAX_TOKENS at request time
flush_enabled                    bool  DERIVER_FLUSH_ENABLED at request time
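The relationship between the per-row fields and the envelope fields can be sketched as a small function. This is an illustration of the documented rules, not the server's code: Row is a stand-in dataclass, classify is a hypothetical helper, the threshold of 1024 is an arbitrary example value, and the shortfall formula is an assumption consistent with the field descriptions above.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Stand-in for a subset of the per-work-unit fields listed above."""
    task_type: str
    pending_tokens: int


def classify(row: Row, batch_max_tokens: int, flush_enabled: bool) -> tuple[int, bool]:
    """Return (tokens_until_threshold, hit_threshold) under the documented rules."""
    # Only representation work waits on the token threshold, and only when flush is off.
    if row.task_type != "representation" or flush_enabled:
        return 0, True
    # Assumption: the remaining tokens are the simple shortfall against the threshold.
    remaining = max(0, batch_max_tokens - row.pending_tokens)
    return remaining, remaining == 0


print(classify(Row("representation", 300), 1024, False))   # (724, False): stalled
print(classify(Row("representation", 1024), 1024, False))  # (0, True): at the threshold
print(classify(Row("summary", 10), 1024, False))           # (0, True): no threshold applies
print(classify(Row("representation", 300), 1024, True))    # (0, True): flush enabled
```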
Cursor pagination is stable, but the underlying data is not. Items can be processed between pages, and new items can be enqueued. Each page is a snapshot of what the server saw at request time, but pages may be inconsistent with each other under concurrent processing. Use page.items for a stable per-page view; iterate the page object only when an approximate walk is acceptable.