Messages API v3
Messages API v3 is the rc2-backed OpenAI-compatible text generation endpoint.
Base URL:
https://msg.hidoba.com
Use this API when you want synchronous chat or Responses API generations with Hidoba quota tracking, character prompts, RAG, streaming, fallback models, and usage attribution.
The older Messages API v2 docs cover the legacy /v2/completions text/audio flow. Use Messages API v3 for new OpenAI-compatible text generation integrations.
Typical Flow
- Send an OpenAI-compatible request with a quota API key.
- rc2 validates the API key, quota, character access, and request lifecycle.
- Messages API v3 transforms the request, injects character and RAG context when configured, and routes the request to Bifrost.
- The model response is returned directly or streamed.
- Usage is recorded automatically; clients do not call billing or usage endpoints.
Features
- Chat Completions:
POST /v3/chat/completions - Responses API:
POST /v3/responses - Compatibility aliases:
/v1/chat/completionsand/v1/responses - Authentication:
Authorization: Bearer <quota_api_key>orX-API-Key: <quota_api_key> - Characters: Optional GitHub or inline characters under
metadata.hidoba.character - RAG: Derived from character and server config. Do not send request-level RAG config.
- Routing: Server-owned provider routing with optional request
fallback_model - Reasoning controls: OpenRouter-style
reasoningoptions can be passed through - Streaming: Standard streaming responses are proxied for supported models
Important Considerations
important
metadata.hidobamay contain onlycharacterandcharacter_params.metadata.hidoba.rag,metadata.hidoba.routing,metadata.hidoba.request_id, and unknownmetadata.hidobafields are rejected.metadata.hidobais stripped before provider-visible model payloads.- RAG uses internal retrieval, including dense, SPLADE, and BM25 signals when configured. Clients do not call SPLADE directly.
- Character
max_new_tokens, when present in old character schemas, is not used as the output-token limit. Use request-level token fields such asmax_completion_tokens,max_tokens, ormax_output_tokens.