How to Read This Changelog
Each release is documented with:
- Added: New features and capabilities
- Changed: Modifications to existing functionality
- Deprecated: Features that will be removed in future versions
- Removed: Features that have been removed
- Fixed: Bug fixes and corrections
- Security: Security-related improvements
Version Format
Honcho follows Semantic Versioning:
- MAJOR version for incompatible API changes
- MINOR version for backwards-compatible functionality additions
- PATCH version for backwards-compatible bug fixes
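As a rough illustration of how these three components order releases, the sketch below parses a version label into a numeric tuple. The helper name parse_version is hypothetical and not part of Honcho; it only shows why v3.0.10 sorts after v3.0.9 even though a plain string comparison says the opposite.

```python
def parse_version(label: str) -> tuple[int, int, int]:
    """Split a label such as 'v3.0.10' into (major, minor, patch) integers."""
    major, minor, patch = label.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


# Version labels taken from this changelog.
releases = ["v3.0.9", "v3.0.10", "v2.5.1", "v3.0.4"]

# Comparing numeric tuples orders MAJOR first, then MINOR, then PATCH.
newest_first = sorted(releases, key=parse_version, reverse=True)
```

Sorting the raw strings instead would place v3.0.10 before v3.0.9, because the character "1" compares lower than "9".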
Honcho API and SDK Changelogs
- Honcho API
- Python SDK
- TypeScript SDK
v3.0.10 (Current)
Added
- Messages are now embedded via a background task rather than blocking the API request
- Read-only DB session mode (get_read_db/tracked_db(..., read_only=True)) so reads don’t hold a transaction open across the work
- CORS_ORIGINS env var to configure CORS allowed origins without editing source; defaults match the prior hardcoded list, so self-hosted deployments behind custom domains can whitelist their frontend (#697)
- scripts/generate_jwt.py: utility for minting scoped or admin Honcho JWTs (--admin, --workspace/--peer/--session, --expires with human-friendly durations, --print-only) without calling the keys API (#757)
- STALE_WORK_UNIT_CLEANUP_INTERVAL_SECONDS (default 60s): minimum jittered spacing between deriver stale-work-unit cleanup runs, so cleanup no longer runs on every seconds-scale poll (0.0 keeps the legacy every-poll behavior) (#773)
Changed
- Optimized the deriver and dreamer prompt cache prefixes to improve prompt-cache hit rates (#806)
Fixed
- times_derived is now properly reinforced when a duplicate conclusion is detected. It had been pinned at 1 for nearly every conclusion (the reject-new branch dropped the increment and the new-wins branch reset the count to 1), so ORDER BY times_derived DESC fell back to arbitrary heap order and froze stale conclusions to the front of injected context. Reinforcement is now an atomic increment and both most-derived queries gained a created_at DESC recency tiebreaker (#768)
- Webhook creation now correctly rejects private/internal IP addresses (#793)
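The times_derived fix combines two ideas: an atomic increment and a recency tiebreaker. The sketch below shows both against an in-memory SQLite table, chosen only so the example runs with the standard library. The table name conclusions, the content column, and the reinforce helper are assumptions for illustration; only times_derived and created_at come from the entry above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conclusions ("
    "id INTEGER PRIMARY KEY, content TEXT UNIQUE, "
    "times_derived INTEGER NOT NULL DEFAULT 1, created_at INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO conclusions (content, created_at) VALUES (?, ?)",
    [("likes tea", 100), ("lives in Oslo", 200), ("owns a dog", 300)],
)


def reinforce(content: str) -> None:
    # Atomic increment: the database computes times_derived + 1 in a single
    # statement, so there is no read-modify-write step that could reset it.
    conn.execute(
        "UPDATE conclusions SET times_derived = times_derived + 1 WHERE content = ?",
        (content,),
    )


reinforce("likes tea")  # a duplicate conclusion was detected

# Ties on times_derived are broken by recency rather than arbitrary row order.
rows = conn.execute(
    "SELECT content, times_derived FROM conclusions "
    "ORDER BY times_derived DESC, created_at DESC"
).fetchall()
```

Here the reinforced row comes first, and the two rows still at 1 are ordered newest first.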
v3.0.9
Changed
- Connection acquisition is now a single attempt with no server-side retry, on a vanilla AsyncSession. A new DB_CONNECT_TIMEOUT_SECONDS (default 2s) bounds the attempt so a saturated or unreachable pooler fails fast instead of holding a client connection open to re-knock. A saturated DB now surfaces to the caller (the API returns an error and the deriver backs off and retries on a later poll), which lets the pooler drain rather than amplifying saturation.
Added
- Deriver poll jitter so instances that start together don’t poll in lockstep: DERIVER_POLLING_STARTUP_JITTER_SECONDS (random delay before the first poll, default 30s) and DERIVER_POLLING_JITTER_RATIO (±fraction applied to every poll sleep, default 0.5). Both disable at 0.0; the underlying backoff schedule is unchanged.
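One plausible reading of the two jitter settings is sketched below. The function names and the choice of a uniform distribution are assumptions; the entry above only states that the startup delay is random, that the ratio is a ± fraction of each sleep, and that 0.0 disables either one.

```python
import random


def startup_delay(startup_jitter_seconds: float, rng: random.Random) -> float:
    """Random delay before the first poll; 0.0 disables it."""
    if startup_jitter_seconds <= 0.0:
        return 0.0
    return rng.uniform(0.0, startup_jitter_seconds)


def jittered_sleep(base_seconds: float, jitter_ratio: float, rng: random.Random) -> float:
    """Scale one poll sleep by a random factor in [1 - ratio, 1 + ratio]."""
    if jitter_ratio <= 0.0:
        return base_seconds
    return base_seconds * rng.uniform(1.0 - jitter_ratio, 1.0 + jitter_ratio)


rng = random.Random(0)  # seeded only to make the sketch repeatable
first_poll_delay = startup_delay(30.0, rng)
sleeps = [jittered_sleep(10.0, 0.5, rng) for _ in range(5)]
```

With a ratio of 0.5 and a 10-second base sleep, every value in sleeps falls between 5 and 15 seconds, so instances started together drift apart instead of polling in lockstep.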
Removed
- Reverted the connection-checkout retry and HonchoAsyncSession custom session introduced in 3.0.8. Removed the DB_CONNECTION_RETRY_ENABLED/DB_CONNECTION_RETRY_MAX_DELAY_SECONDS/DB_CONNECTION_RETRY_BACKOFF_INITIAL_SECONDS/DB_CONNECTION_RETRY_BACKOFF_MAX_SECONDS settings, the db_connection_acquisitions{outcome=...} Prometheus counter, and the db.pool.acquire Sentry span. Alerting built on db_connection_acquisitions should migrate to db_pool_connections/db_queries_in_flight.
v3.0.8
Added
- Connection-checkout retry with bounded exponential backoff (tenacity) on get_db/tracked_db: transient transaction-pooler (Supavisor) rejections (SQLAlchemy TimeoutError and OperationalError) now retry with backoff instead of surfacing as 500s under client-connection saturation. Gated by DB_CONNECTION_RETRY_ENABLED with configurable delay/backoff knobs; ~10s default budget (#758)
- HonchoAsyncSession: a lazy AsyncSession that checks out its pooled connection (with retry) on the first DB-touching call rather than at construction. Request handlers doing non-DB work (embedding, file, LLM) before their first query no longer pin a pooler connection across it. Only the checkout is retried; the statement still runs exactly once, so writes are never duplicated (#758)
- Adaptive deriver queue polling: the poll interval backs off when the queue is idle or erroring (base → max, doubling each cycle) and snaps back to base the moment work is claimed, cutting steady-state query load against the DB. Gated by DERIVER_POLLING_BACKOFF_ENABLED with configurable max/multiplier (#758)
- New Prometheus db_pool_connections gauge (checked_out / checked_in / size / overflow), labeled api|deriver, registered in both the API lifespan and the deriver metrics server (#758)
- New Prometheus db_connection_acquisitions{outcome=ok|retried|exhausted} counter: the alertable early-warning signal that connection checkouts are retrying through pooler rejection, before requests start failing (#758)
- New Prometheus db_queries_in_flight gauge: statements actually executing on the wire (via SQLAlchemy cursor-execute events). Paired with checked_out, the gap reveals connections held but parked (the “idle in transaction during an external call” antipattern). Gated on METRICS.ENABLED for zero overhead when off (#758)
- Explicit SqlalchemyIntegration in both the API and deriver Sentry inits; connection acquisition wrapped in a db.pool.acquire span with live pool stats captured on retry exhaustion (#758)
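The adaptive polling behavior described above (multiply the interval while idle or erroring, cap it at a maximum, snap back to base when work is claimed) can be sketched as a small state holder. The class name PollBackoff and the 1s/8s values are hypothetical and are not Honcho's implementation or defaults.

```python
from dataclasses import dataclass, field


@dataclass
class PollBackoff:
    base_seconds: float = 1.0   # illustrative value, not a Honcho default
    max_seconds: float = 8.0    # illustrative value, not a Honcho default
    multiplier: float = 2.0
    current_seconds: float = field(init=False)

    def __post_init__(self) -> None:
        self.current_seconds = self.base_seconds

    def record(self, claimed_work: bool) -> float:
        """Return the sleep interval to use after one poll cycle."""
        if claimed_work:
            # Work was claimed: snap straight back to the base interval.
            self.current_seconds = self.base_seconds
        else:
            # Idle or erroring: grow the interval, never past the cap.
            self.current_seconds = min(
                self.current_seconds * self.multiplier, self.max_seconds
            )
        return self.current_seconds


backoff = PollBackoff()
idle_intervals = [backoff.record(claimed_work=False) for _ in range(5)]
after_work = backoff.record(claimed_work=True)
```

Five idle cycles give intervals of 2, 4, 8, 8, and 8 seconds, and the first claimed unit of work resets the interval to 1 second.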
Changed
- Default POOL_TIMEOUT lowered to 5s, with validation that it stays under the connection-retry budget when a pooled (non-null) POOL_CLASS is configured; config.toml.example and the v2/v3 configuration docs updated to match (#758)
- HonchoAsyncSession wraps every DB-touching session method (execute / scalar / scalars / flush / merge / refresh / commit / get / get_one / stream / stream_scalars / delete) so the lazy-checkout-with-retry guarantee has no holes; the acquired flag resets on close()/reset() so a reused session re-acquires on next use (#758)
Fixed
- Roll the session back on a retryable checkout failure before retrying — a failed autobegin could otherwise leave it pending-rollback, making the next connection attempt raise instead of cleanly re-checking-out (#758)
- Guard DBPoolCollector.collect() so a pool-read/import hiccup can’t raise and abort the entire /metrics scrape (Prometheus drops all metrics if any collector raises) (#758)
- Clamp the pool overflow gauge to ≥ 0 (it could report negative before the pool fills) (#758)
- Removed a double-sleep in the deriver idle poll so the backoff cap is a true cap rather than 2× (#758)
v3.0.7
Added
- New src/llm/ package as the single owner of provider runtime: clients, backends, history adapters, tool loop, request builder, credentials, and caching policy (#459)
- New cloudevent LLMCallCompletedEvent (llm.call.completed) fires once per provider hit with full cost-attribution context: transport/provider_label, model, token counts with cache breakdown, finish_reason, outcome, retry/fallback state, duration, tool-call shape, streaming flag, and agent correlation (run_id + iteration) (#637)
- RepresentationCompletedEvent now carries total_input_tokens for full-trace cost attribution; per-emitter honcho_version injection; deterministic per-run_id high-volume sampler via TelemetrySettings.HIGH_VOLUME_SAMPLE_RATE (#637)
- Deriver custom instructions: per-workspace/peer guidance threaded into the deriver prompt with a MAX_CUSTOM_INSTRUCTIONS_TOKENS budget (default 2000); deriver MAX_INPUT_TOKENS raised 23000 → 25000 (#609)
- Configurable embedding dimensions: EMBEDDING_MODEL_CONFIG__DIMENSIONS_MODE (auto/always/never) controls whether OpenAI dimensions= is forwarded (#678)
- New honcho-cli package: Python CLI for inspecting and managing peers, sessions, and configuration against a Honcho deployment (#424)
- HONCHO_API_URL env var support in the MCP Worker for self-hosted deployments (#575)
- API ID max_length increased from 100 to 512 across WorkspaceCreate, PeerCreate, and SessionCreate to align with the DB schema (#684)
- AttemptPlan dataclass pins per-retry provider selection across stream-final retries so streaming doesn’t bounce back to primary after the tool loop has settled on fallback (#459)
- Gemini JSON-schema sanitizer for function_declarations: strips keywords Gemini’s validator rejects while preserving semantics for other backends (#459)
Changed
- All LLM orchestration moved out of src/utils/clients.py into src/llm/ with modules split by responsibility (#459)
- Default ModelConfig factories (deriver, summary, dreamer specialists, dialectic levels) normalized with no extra parameters set by default; operators add transport/thinking overrides explicitly (#459)
- OpenAI reasoning-model routing widened to cover gpt-5.x and o1/o3/o4; these models receive max_completion_tokens instead of max_tokens (#459)
- Peer card prompts reframed as stable identity markers; induction specialist now opts out of peer card writes so only deduction touches the card (#686)
- Vector store queries no longer fetch embedding vectors; only document metadata is returned, reducing payload size and DB load (pgvector, lancedb, turbopuffer) (#682)
- Langfuse trace metadata now includes namespace, model, and provider so traces can be filtered by deployment slice (#565)
- Deriver: model-aware tokenizer (replaces the previously hardcoded encoding) and explicit guard on empty message content (#647)
- Dialectic level defaults now merge correctly with per-level overrides (#656)
- Default dialectic tool choice switched to auto (#630)
- Vector sync given a substantial retry budget to tolerate transient embedding provider outages (#604)
- AgentToolConclusionsDeletedEvent payload now carries levels (#612)
- Turbopuffer: InternalServerError caught and surfaced as a warning rather than a hard failure; vector store sync errors downgraded to warnings (#561)
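The reasoning-model routing change above amounts to choosing which token-limit parameter name a request carries. The sketch below assumes a simple prefix match on the model name; the prefix tuple and the function token_limit_param are hypothetical, and the actual matching rule in src/llm/ may differ.

```python
# Assumed prefixes for the model families named in the entry above.
REASONING_MODEL_PREFIXES = ("gpt-5", "o1", "o3", "o4")


def token_limit_param(model: str, limit: int) -> dict[str, int]:
    """Pick the token-limit keyword argument for a given model name."""
    if model.startswith(REASONING_MODEL_PREFIXES):
        # Reasoning models take max_completion_tokens.
        return {"max_completion_tokens": limit}
    # Everything else keeps the older max_tokens parameter.
    return {"max_tokens": limit}


reasoning_kwargs = token_limit_param("o3-mini", 1000)
standard_kwargs = token_limit_param("gpt-4o", 1000)
```

Note that "gpt-4o" does not start with "o4", so it stays on max_tokens under this assumed rule.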
Fixed
- reverse query parameter is now honored on the v3 workspace list, peer list, workspace-scoped session list, and peer-scoped session list. Honcho SDKs at 2.1.0+ were already sending reverse=true for these routes but the server silently ignored it. Ties on created_at now fall back to the internal nanoid id for stable ordering across pages (#685)
- LLM client factories now receive base_url from LLMSettings for default providers; operators pointing at OpenAI-compatible proxies via LLM__OPENAI_BASE_URL were previously ignored on the default path (#643, fixes #641)
- Internal N+1 query in dialectic agent tool execution: collapsed per-iteration DB lookups into a single fetch (#652)
- Dreamer threshold and time-guard semantics: count filter now includes only documents.level == 'explicit' (was inflating threshold via dreamer-created levels and creating a feedback loop); last_dream_at write relocated from enqueue to process so duplicate enqueues or failed runs no longer reset the 8-hour time guard (#573)
- Deriver: blank observations are filtered out before embedding (previously triggered noisy embedding calls and persisted empty rows) (#615)
- Surprisal module: filter format corrected from {"level": levels} to {"level": {"in": levels}}; the prior call silently returned 0 results and made the entire Surprisal phase of the Dream cycle a no-op (#581, fixes #559)
- Removed hardcoded stop_sequences override from Deriver ModelConfig (was clobbering operator-configured stop sequences) (#587)
- Embedding client: embed() now wraps single-string input in an array, restoring compatibility with OpenAI-compatible third-party providers that reject scalar input (#586)
- Docker Compose: deriver service startup gated on the API service healthcheck, which prevents races where the deriver starts before the API has run migrations (#689)
- Docker image: HEALTHCHECK directive removed from the shared base image; service-level health checks now belong in each service’s own configuration (#530)
- Removed strict parameter validation for thinking params on Anthropic and OpenAI transports, which was rejecting valid per-transport configs (#686)
- Stream-final retries pin to the AttemptPlan that succeeded rather than re-running provider selection through the outer current_attempt ContextVar (#459)
- Gemini cached_content reuse keys now include system_instruction and tool_config so cache hits don’t cross configurations (#459)
- CrewAI example updated for the latest CrewAI protocol (#631)
Removed
- src/utils/clients.py deleted; its responsibilities are split across src/llm/registry.py, src/llm/credentials.py, and the backend-specific modules (#459)
- HEALTHCHECK directive from the shared Docker image (#530)
v3.0.6
Changed
- Tightened transaction scopes across search, agent tools, queue manager, and webhook delivery to minimize DB connection hold time during external operations (#525)
- Search operations refactored to two-phase pattern — external work (embeddings, LLM calls) completes before opening a transaction (#525)
- Agent tool executor performs external operations before acquiring DB sessions (#525)
- Queue manager transaction scope reduced to only the critical section (#525)
- Webhook delivery no longer holds a DB session parameter (#525)
Fixed
- Session leakage in non-session-scoped dialectic chat calls (#526)
Added
- Health check endpoint (/health) for container orchestration and load balancer probes (#510)
v3.0.4
Added
- JSONB metadata validation enforces 100 key limit and max depth of 5 (#419)
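A minimal sketch of a validator enforcing limits like these follows. How Honcho counts keys and depth is not stated here, so this version makes two labeled assumptions: keys are totaled across all nested objects, and every nested dict or list adds one level, with the top-level object at depth 1. The function name validate_metadata is hypothetical.

```python
from typing import Any

MAX_KEYS = 100   # key limit from the entry above
MAX_DEPTH = 5    # depth limit from the entry above


def validate_metadata(metadata: dict[str, Any]) -> None:
    """Raise ValueError if metadata exceeds the assumed key or depth limits."""
    total_keys = 0

    def walk(node: Any, depth: int) -> None:
        nonlocal total_keys
        if not isinstance(node, (dict, list)):
            return  # scalars add neither keys nor depth
        if depth > MAX_DEPTH:
            raise ValueError(f"metadata nested deeper than {MAX_DEPTH} levels")
        if isinstance(node, dict):
            total_keys += len(node)
            if total_keys > MAX_KEYS:
                raise ValueError(f"metadata has more than {MAX_KEYS} keys")
            children = list(node.values())
        else:
            children = node
        for child in children:
            walk(child, depth + 1)

    walk(metadata, 1)


validate_metadata({"plan": {"tier": "pro", "tags": ["a", "b"]}})  # passes
```

Rejecting oversized metadata at the schema boundary keeps pathological payloads out of the JSONB column before any query touches them.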
Changed
- Schemas refactored from single schemas.py into schemas/api.py, schemas/configuration.py, and schemas/internal.py with backwards-compatible re-exports (#419)
Fixed
- Missing deleted_at filter on RepresentationManager._query_documents_recent() and ._query_documents_most_derived() allowed soft-deleted documents to leak into the deriver’s working representation (#456)
- CleanupStaleItemsCompletedEvent emitted spuriously when no queue item was actually deleted (#454)
- Empty JSON file uploads caused unhandled errors; now returns normalized error responses (#434)
- Memory leak: _observation_locks switched to WeakValueDictionary to prevent unbounded growth (#419)
- SQL injection in dependencies.py: parameterized set_config calls to prevent injection via request context (#419)
- NUL byte crashes: string inputs (message content, queries, peer cards) now stripped at schema level (#419)
- Filter recursion depth capped at 5 to prevent stack overflow (#419)
- Dedup-skipped observations now correctly reflected in created counts (#477)
- External vector store support for message search — routes queries through configured external vector store with oversampling and deduplication to handle chunked embeddings (#479)
- Dialectic agent no longer holds a DB connection during LLM calls: embeddings are pre-computed before tool execution, DB sessions isolated in extract_preferences, query_documents no longer accepts a DB session parameter (#477)
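The WeakValueDictionary change for _observation_locks listed above relies on a standard-library behavior: an entry disappears once nothing else references its value. The sketch below shows that pattern with a per-key asyncio.Lock registry. The names _locks, get_lock, and demo, and the key format, are hypothetical; this is not the Honcho code.

```python
import asyncio
import gc
from weakref import WeakValueDictionary

# Values are held weakly: when the last strong reference to a lock goes
# away, its entry is dropped, so the registry cannot grow without bound.
_locks: "WeakValueDictionary[str, asyncio.Lock]" = WeakValueDictionary()


def get_lock(key: str) -> asyncio.Lock:
    """Return the shared lock for key, creating it on first use."""
    lock = _locks.get(key)
    if lock is None:
        lock = asyncio.Lock()
        _locks[key] = lock
    return lock


def demo() -> tuple[int, int]:
    """Registry size while a lock is referenced, then after it is released."""
    lock = get_lock("workspace-1:peer-a")
    size_while_held = len(_locks)
    del lock
    gc.collect()  # not required on CPython, included for other runtimes
    return size_while_held, len(_locks)
```

With a plain dict, every key ever locked would stay in memory for the life of the process, which is the unbounded growth the fix addresses.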
v3.0.3
Added
- Consolidated session context into a single DB session with 40/60 token budget allocation between summary and messages
- Observation validation via ObservationInput Pydantic schema with partial-success support and batch embedding with per-observation fallback
- Peer card hard cap of 40 facts with case-insensitive deduplication and whitespace normalization
- Safe integer coercion (_safe_int) for all LLM tool inputs to handle non-integer values like "Infinity"
- Embedding pre-computation and reuse across multiple search calls in dialectic and representation flows
- Peer existence validation in dialectic chat endpoints: raises ResourceNotFoundException instead of silently failing
- Logging filter to suppress noisy GET /metrics access logs
- Oolong long-context aggregation benchmark (synth and real variants, 1K–4M token context windows)
- MolecularBench fact quality evaluation (ambiguity, decontextuality, minimality scoring)
- CoverageBench information recall evaluation (gold fact extraction, coverage matching, QA verification)
- LoCoMo summary-as-context baseline evaluation
- Webhook delivery tests, dependency lifecycle tests, queue cleanup tests, summarizer fallback tests
- Parallel test execution via pytest-xdist with worker-specific databases
- test_reasoning_levels.py script for LoCoMo dataset testing across reasoning levels
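The safe integer coercion entry above is easier to picture with an example. The sketch below is one way such a helper could behave; the name safe_int, the default parameter, and the decision to truncate fractional values are assumptions and not the actual _safe_int implementation.

```python
import math
from typing import Any


def safe_int(value: Any, default: int) -> int:
    """Coerce an LLM-supplied tool argument to int, else return default."""
    if isinstance(value, bool):
        return default  # bools are ints in Python but are not meaningful counts
    try:
        number = float(value)
    except (TypeError, ValueError, OverflowError):
        return default  # None, non-numeric strings, or integers too large for float
    if not math.isfinite(number):
        return default  # covers "Infinity", "-inf", and "NaN"
    return int(number)  # truncates any fractional part


limit = safe_int("Infinity", default=10)
```

Calling int("Infinity") directly raises ValueError, and int(float("inf")) raises OverflowError, which is why the sketch checks for finiteness before converting.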
Changed
- Workspace deletion is now async — returns 202 Accepted, validates no active sessions (409 Conflict), cascade-deletes in background
- Redis caching layer now stores plain-dict instead of ORM objects, with v2-prefixed keys, storage, resilient safe_cache_set/safe_cache_delete helpers, and deferred post-commit cache invalidation
- All get_or_create_* CRUD operations now use savepoints (db.begin_nested()) instead of commit/rollback for race condition prevention
- Reconciler vector sync uses direct ORM mutation instead of batch parameterized UPDATE statements
- Summarizer enforces hard word limit in prompt and creates fallback text for empty summaries with summary_tokens = 0
- Blocked Gemini responses (SAFETY, RECITATION, PROHIBITED_CONTENT, BLOCKLIST) now raise LLMError to trigger retry/backup-provider logic
- Gemini client explicitly sets max_output_tokens from max_tokens parameter
- All deriver and metrics collector logging replaced with structured logging.getLogger(__name__) calls
- Dreamer specialist prompts updated to enforce durable-facts-only peer cards with max 40 entries and deduplication
- GetOrCreateResult changed from NamedTuple to dataclass with async post_commit() method
- FastAPI upgraded from 0.111.0 to 0.131.0; added pyarrow dependency
- Queue status filtering to only show user-facing tasks (representation, summary, dream); excludes internal infrastructure tasks
Fixed
- JWT timestamp bug: JWTParams.t was evaluated once at class definition time instead of per-instance
- Session cache invalidation on deletion was missing
- get_peer_card() now properly propagates ResourceNotFoundException instead of swallowing it
- set_peer_card() ensures peer exists via get_or_create_peers() before updating
- Backup provider failover with proper tool input type safety
- Removed setup_admin_jwt() from server startup
- Sentry coroutine detection switched from asyncio.iscoroutinefunction to inspect.iscoroutinefunction
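The JWT timestamp bug listed above is an instance of a common Python pitfall: a default value written as a call in the class body runs once, when the class is defined. The sketch below reproduces the pattern with standard-library dataclasses as a stand-in. Both class names are hypothetical and the real JWTParams may be defined differently; only the field name t comes from the entry.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ParamsEvaluatedOnce:
    # Bug pattern: time.time() runs once, when the class body executes,
    # so every instance shares that same timestamp.
    t: float = time.time()


@dataclass
class ParamsPerInstance:
    # Fix pattern: the factory is called each time an instance is created.
    t: float = field(default_factory=time.time)


stale_a = ParamsEvaluatedOnce()
stale_b = ParamsEvaluatedOnce()
shared_timestamp = stale_a.t == stale_b.t  # True: both reuse the class-level value
```

In a long-running server the first pattern means tokens minted hours apart all carry the timestamp of process start.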
Removed
- explicit.py and obex.py benchmarks replaced by coverage.py and molecular.py
- Claude Code review automation workflow (.github/workflows/claude.yml)
- Coverage reporting from default pytest configuration
v3.0.2
Added
- Documentation for reasoning_level and Claude Code plugin
Changed
- Gave dreaming sub-agents better prompting around peer card creation, tweaked overall prompts
Fixed
- Added message-search fallback for memory search tool, necessary in fresh sessions
- Made FLUSH_ENABLED a config value
- Removed N+1 query in search_messages
v3.0.0
Added
- Agentic Dreamer for intelligent memory consolidation using LLM agents
- Agentic Dialectic for query answering using LLM agents with tool use
- Reasoning levels configuration for dialectic (minimal, low, medium, high, max)
- Prometheus token tracking for deriver and dialectic operations
- n8n integration
- Cloud Events for auditable telemetry
- External Vector Store support for turbopuffer and lancedb with reconciliation flow
Changed
- API route renaming for consistency
- Dreamer and dialectic now respect peer card configuration settings
- Observations renamed to Conclusions across API and SDKs
- Deriver to buffer representation tasks to normalize workloads
- Local Representation tasks to create singular QueueItems
- getContext endpoint to use search_query rather than force last_user_message
Fixed
- Dream scheduling bugs
- Summary creation when start_message_id > end_message_id
- Cashews upgrade to prevent NoScriptError
- Memory leak in accumulate_metric call
Removed
- Peer card configuration from message configuration; peer cards no longer created/updated in deriver process
v2.5.1
Fixed
- Backwards compatibility for message_ids field in documents to handle legacy tuple format
v2.5.0
Added
- Message level configurations
- CRUD operations for observations
- Comprehensive test cases for harness
- Peer level get_context
- Set Peer Card Method
- Manual dreaming trigger endpoint
Changed
- Configurations to support more flags for fine-grained control of the deriver, peer cards, summaries, etc.
- Working Representations to support more fine-grained parameters
Fixed
- File uploads to match MessageCreate structure
- Cache invalidation strategy
v2.4.3
Added
- Redis caching to improve DB IO
- Backup LLM provider to avoid failures when a provider is down
Changed
- QueueItems to use standardized columns
- Improved Deduplication logic for Representation Tasks
- More fine-grained metrics for representation, summary, and peer card tasks
- DB constraint to follow standard naming conventions
v2.4.2
v2.4.1
v2.4.0
Added
- Unified Representation class
- vllm client support
- Periodic queue cleanup logic
- WIP Dreaming Feature
- LongMemEval to Test Bench
- Prometheus Client for better Metrics
- Performance metrics instrumentation
- Error reporting to deriver
- Workspace Delete Method
- Multi-db option in test harness
Changed
- Working Representations are Queried on the fly rather than cached in metadata
- EmbeddingStore to RepresentationFactory
- Summary Response Model to use public_id of message for cutoff
- Semantics across the codebase to reference resources based on observer and observed
- Prompts for Deriver & Dialectic to reference peer_id and add examples
- Get Context route returns peer card and representation in addition to messages and summaries
- Refactoring logger.info calls to logger.debug where applicable
Fixed
- Gemini client to use async methods
v2.3.3
v2.3.2
Added
- Get peer cards endpoint (GET /v2/peers/{peer_id}/card) for retrieving targeted peer context information
Changed
- Replaced Mirascope dependency with small client implementation for better control
- Optimized deriver performance by using joins on messages table instead of storing token count in queue payload
- Database scope optimization for various operations
- Batch representation task processing for ~10x speed improvement in practice
Fixed
- Separated clean and claim work units in queue manager to prevent race conditions
- Skip locked ActiveQueueSession rows on delete operations
- Langfuse SDK integration updates for compatibility
- Added configurable maximum message size to prevent token overflow in deriver
- Various minor bugfixes
v2.3.0
Added
- getSummaries endpoint to get all available summaries for a session directly
- Peer Card feature to improve context for deriver and dialectic
Changed
- Session Peer limit to be based on observers instead, renamed config value to SESSION_OBSERVERS_LIMIT
- Messages can take a custom timestamp for the created_at field, defaulting to the current time
- get_context endpoint returns detailed Summary object rather than just summary content
- Working representations use a FIFO queue structure to maintain facts rather than a full rewrite
- Optimized deriver enqueue by prefetching message sequence numbers (eliminates N+1 queries)
Fixed
- Deriver uses get_context internally to prevent context window limit errors
- Embedding store will truncate context when querying documents to prevent embedding token limit errors
- Queue manager to schedule work based on available workers rather than total number of workers
- Queue manager to use atomic db transactions rather than long lived transaction for the worker lifecycle
- Timestamp formats unified to ISO 8601 across the codebase
- Internal get_context method’s cutoff value is exclusive now
v2.2.0
Added
- Arbitrary filters now available on all search endpoints
- Search combines full-text and semantic using reciprocal rank fusion
- Webhook support (currently only supports queue_empty and test events, more to come)
- Small test harness and custom test format for evaluating Honcho output quality
- Added MCP server and documentation for it
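Reciprocal rank fusion, named in the search entry above, merges several ranked lists by summing 1 / (k + rank) for each item across the lists. The sketch below is a generic version of that formula. The function name and the constant k = 60 (a commonly used value) are assumptions, not details confirmed for Honcho's search.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of document IDs into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Items ranked highly in any list receive a larger contribution.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


full_text_hits = ["a", "b", "c"]   # hypothetical full-text ranking
semantic_hits = ["b", "c", "d"]    # hypothetical semantic ranking
fused = reciprocal_rank_fusion([full_text_hits, semantic_hits])
```

Because only ranks are used, the two result sets do not need comparable relevance scores, which is what makes the method convenient for mixing full-text and embedding search.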
Changed
- Search has 10 results by default, max 100 results
- Queue structure generalized to handle more event types
- Summarizer now exhaustive by default and tuned for performance
Fixed
- Resolve race condition for peers that leave a session while sending messages
- Added explicit rollback to solve integrity error in queue
- Re-introduced Sentry tracing to deriver
- Better integrity logic in get_or_create API methods
v2.1.2
Fixed
- Summarizer module to ignore empty summaries and pass appropriate one to get_context
- Structured Outputs calls with OpenAI provider to pass strict=True to Pydantic Schema
v2.1.1
Added
- Test harness for custom Honcho evaluations
- Better support for session and peer aware dialectic queries
- Langfuse settings
- Added recent history to dialectic prompt, dynamic based on new context window size setting
Fixed
- Summary queue logic
- Formatting of logs
- Filtering by session
- Peer targeting in queries
Changed
- Made query expansion in dialectic off by default
- Overhauled logging
- Refactor summarization for performance and code clarity
- Refactor queue payloads for clarity
v2.1.0
Added
- File uploads
- Brand new “ROTE” deriver system
- Updated dialectic system
- Local working representations
- Better logging for deriver/dialectic
- Deriver Queue Status no longer has redundant data
Fixed
- Document insertion
- Session-scoped and peer-targeted dialectic queries work now
- Minor bugs
Removed
- Peer-level messages
Changed
- Dialectic chat endpoint takes a single query
- Rearranged configuration values (LLM, Deriver, Dialectic, History->Summary)
v2.0.4
Fixed
- Migration/provision scripts did not have correct database connection arguments, causing timeouts
v2.0.2
Fixed
- Database initialization was misconfigured and led to provision_db script failing: switch to consistent working configuration with transaction pooler
v2.0.1
Added
- Ergonomic SDKs for Python and TypeScript (uses Stainless underneath)
- Deriver Queue Status endpoint
- Complex arbitrary filters on workspace/session/peer/message
- Message embedding table for full semantic search
Changed
- Overhauled documentation
- BasedPyright typing for entire project
- Resource filtering expanded to include logical operators
Fixed
- Various bugs
- Use new config arrangement everywhere
- Remove hardcoded responses
v2.0.0
Added
- Ability to get a peer’s working representation
- Metadata to all data primitives (Workspaces, Peers, Sessions, Messages)
- Internal metadata to store Honcho’s state no longer exposed in API
- Batch message operations and enhanced message querying with token and message count limits
- Search and summary functionalities scoped by workspace, peer, and session
- Session context retrieval with summaries and token allocation
- HNSW Index for Documents Table
- Centralized Configuration via Environment Variables or config.toml file
Changed
- New architecture centered around the concept of a “peer” replaces the former “app”/“user”/“session” paradigm
- Workspaces replace “apps” as top-level namespace
- Peers replace “users”
- Sessions no longer nested beneath peers and no longer limited to a single user-assistant model. A session exists independently of any one peer and peers can be added to and removed from sessions.
- Dialectic API is now part of the Peer, not the Session
- Dialectic API now allows queries to be scoped to a session or “targeted” to a fellow peer
- Database schema migrated to adopt workspace/peer/session naming and structure
- Authentication and JWT scopes updated to workspace/peer/session hierarchy
- Queue processing now works on ‘work units’ instead of sessions
- Message token counting updated with tiktoken integration and fallback heuristic
- Queue and message processing updated to handle sender/target and task types for multi-peer scenarios
Fixed
- Improved error handling and validation for batch message operations and metadata
- Database Sessions to be more atomic to reduce idle in transaction time
Removed
- Metamessages removed in favor of metadata
- Collections and Documents no longer exposed in the API, solely internal
- Obsolete tests for apps, users, collections, documents, and metamessages
v1.1.0
Added
- Normalize resources to remove joins and increase query performance
- Query tracing for debugging
Changed
- /list endpoints to not require a request body
- metamessage_type to label with backwards compatibility
- Database Provisioning to rely on alembic
- Database Session Manager to explicitly rollback transactions before closing the connection
Fixed
- Alembic Migrations to include initial database migrations
- Sentry Middleware to not report Honcho Exceptions
v1.0.0
Added
- JWT based API authentication
- Configurable logging
- Consolidated LLM Inference via ModelClient class
- Dynamic logging configurable via environment variables
Changed
- Deriver & Dialectic API to use Hybrid Memory Architecture
- Metamessages are not strictly tied to a message
- Database provisioning is a separate script instead of happening on startup
- Consolidated session/chat and session/chat/stream endpoints
Previous Releases
For a complete history of all releases, see our GitHub Releases page.
Getting Help
If you encounter issues using the Honcho API or its SDKs:
- Open an issue on GitHub
- Join our Discord community for support