Bedrockify: An OpenAI-Compatible Proxy for Bedrock (Completions + Embeddings)

Happy to release bedrockify: an OpenAI-compatible chat completions and embeddings proxy for Amazon Bedrock. Many current agents don't support Bedrock out of the box, but do support a custom OpenAI API. Bedrockify gives you a proxy you can run with one command on the same machine as your agent, then point the agent at it. Written in Go, it's fast, easy to run as a daemon, and just works. Works with OpenClaw, Hermes, and other agents.

Supported APIs

| API | Endpoint | Description |
|---|---|---|
| Chat Completions | POST /v1/chat/completions | Non-streaming and SSE streaming |
| Embeddings | POST /v1/embeddings | Titan, Cohere, Nova embedding models |
| Models | GET /v1/models | List available Bedrock foundation models |
| Health | GET / | Health check with config info |
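
Since the proxy speaks the OpenAI wire format, pointing any OpenAI client at it is just a base-URL change. A minimal sketch of building such a request with the standard library (the localhost port and model alias here are assumptions for illustration, not defaults shipped by bedrockify):

```python
import json
import urllib.request

def chat_request(base_url, model, messages, stream=False):
    """Build an OpenAI-style chat completions request aimed at the proxy."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request(
    "http://localhost:8000/v1",           # assumed local proxy address
    "claude-sonnet-4-5",                  # alias, OpenRouter ID, or Bedrock ID
    [{"role": "user", "content": "Hi!"}],
)
print(req.full_url)
```

An OpenAI SDK works the same way: set its `base_url` to the proxy address and use it unchanged.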

Features

Core

  • Unified proxy — chat completions + embeddings on a single port
  • OpenAI-compatible API — drop-in for any OpenAI SDK client
  • Streaming — SSE streaming for chat completions with stream_options.include_usage support
  • Tool use — function calling via Bedrock Converse (tool_choice: auto, required, or a specific tool)
  • Vision — image inputs via base64 data URLs AND remote HTTP/HTTPS URLs
  • Model aliases — short names, OpenRouter IDs, bare Bedrock IDs all work
  • Cross-region inference — auto-prefixes model IDs (us., eu., ap., global.) based on region
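
Because streaming follows the standard OpenAI SSE format, the usual `data:` line parsing applies on the client side. A rough sketch of accumulating text deltas from a chunk stream (the chunk contents below are illustrative, not captured proxy output):

```python
import json

def collect_stream(lines):
    """Accumulate assistant text from OpenAI-style SSE 'data:' lines."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":          # OpenAI's end-of-stream sentinel
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)

sse = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print(collect_stream(sse))  # → Hello
```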

Intelligence

  • Reasoning / Extended Thinking — reasoning_effort (low/medium/high) for Claude 3.7/4/4.5 with reasoning_content in responses and streaming
  • DeepSeek R1 Reasoning — automatic format detection (string format for DeepSeek, object format for Claude)
  • Interleaved Thinking — extra_body.thinking for Claude 4/4.5 thinking between tool calls
  • Prompt Caching — extra_body.prompt_caching for up to 90% cost reduction, with ENABLE_PROMPT_CACHING env var for global default
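
These knobs all ride along in the request body. A sketch of a request enabling them (the model alias and the boolean values for the extra_body keys are assumptions about the accepted shape):

```python
import json

body = {
    "model": "claude-sonnet-4-5",   # assumed alias
    "messages": [{"role": "user", "content": "Refactor this module."}],
    "reasoning_effort": "high",     # low / medium / high
}

# With an OpenAI SDK these would be passed as extra_body={...}, which the SDK
# merges into the top-level JSON; over raw HTTP they sit beside "model".
body["prompt_caching"] = True
body["thinking"] = True

payload = json.dumps(body)
print(sorted(json.loads(payload).keys()))
```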

Compatibility

  • Application Inference Profiles — pass ARN as model ID for cost tracking (no alias mangling)
  • Developer role — developer messages treated as system (OpenAI compatibility)
  • Message coalescing — consecutive same-role messages automatically merged (Bedrock requirement)
  • No-prefill handling — models that reject assistant-ending conversations (e.g. claude-opus-4-6) get automatic user continuation
  • temperature/topP conflict — auto-stripped for models that reject both simultaneously (Claude 4.5, Haiku 4.5, Opus 4.5)
  • Extra body passthrough — unknown extra_body keys forwarded to Bedrock additionalModelRequestFields
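
Message coalescing, for instance, amounts to folding runs of same-role messages into one before handing the conversation to Converse. A toy version of the idea (not bedrockify's actual code, which is in Go):

```python
def coalesce(messages):
    """Merge consecutive same-role messages, as Bedrock's Converse API requires."""
    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1] = {
                "role": msg["role"],
                "content": merged[-1]["content"] + "\n" + msg["content"],
            }
        else:
            merged.append(dict(msg))
    return merged

msgs = [
    {"role": "user", "content": "First."},
    {"role": "user", "content": "Second."},
    {"role": "assistant", "content": "Reply."},
]
print(coalesce(msgs))
```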

Embedding Models

  • Amazon Titan Embed v2 — 1024 dimensions, English + multilingual
  • Cohere Embed v3/v4 — English, multilingual, latest v4 (1536 dims)
  • Amazon Nova Multimodal Embeddings v2 — configurable dimensions (256/384/1024/3072)
  • base64 encoding — encoding_format: "base64" for compact responses
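
In the OpenAI format that the proxy mirrors, a base64 embedding is the raw little-endian float32 array, base64-encoded. A sketch of decoding one (round-tripping a tiny fake vector to show the layout; assumes the proxy follows the OpenAI convention exactly):

```python
import base64
import struct

def decode_embedding(b64):
    """Decode an OpenAI-style base64 embedding: little-endian float32 array."""
    raw = base64.b64decode(b64)
    return list(struct.unpack("<%df" % (len(raw) // 4), raw))

# Round-trip a small vector of exactly-representable float32 values.
vec = [0.5, -1.0, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *vec)).decode("ascii")
print(decode_embedding(encoded))  # → [0.5, -1.0, 2.0]
```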

Infrastructure

  • Auth — IAM/SigV4 (default) or Bedrock API key (bearer token)
  • Adaptive retries — max 8 attempts with adaptive backoff
  • Config — TOML file, env vars, CLI flags (layered priority)
  • Systemd daemon — install-daemon subcommand
  • Self-update — update subcommand
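
The layered config resolution boils down to a precedence merge: CLI flags override env vars, which override the TOML file, which overrides built-in defaults. A schematic version of that logic (the key names and values below are made up for illustration):

```python
def resolve(cli_flags, env, toml_values, defaults):
    """Merge config layers with CLI > env > TOML > defaults precedence."""
    merged = dict(defaults)
    # Apply lowest-priority layer first so later layers win.
    for layer in (toml_values, env, cli_flags):
        for key, value in layer.items():
            if value is not None:
                merged[key] = value
    return merged

cfg = resolve(
    cli_flags={"region": "eu-west-1"},                # highest priority
    env={"region": "us-east-1", "port": "8000"},
    toml_values={"port": "9000", "log": "info"},
    defaults={"region": "us-west-2", "port": "8080", "log": "warn"},
)
print(cfg)  # region from flags, port from env, log from TOML
```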