Bedrockify: An OpenAI-Compatible Proxy for Bedrock (Completions + Embeddings)

Happy to release bedrockify: an OpenAI-compatible chat completions and embeddings proxy for Amazon Bedrock. Many current agents don't support Bedrock out of the box, but do support a custom OpenAI API. Bedrockify gives you a proxy you can run with one command on the same machine as your agent, then point the agent at it. Written in Go, it's fast, easy to run as a daemon, and just works. Works with OpenClaw, Hermes, and other agents.

Supported APIs

| API | Endpoint | Description |
|---|---|---|
| Chat Completions | POST /v1/chat/completions | Non-streaming and SSE streaming |
| Embeddings | POST /v1/embeddings | Titan, Cohere, Nova embedding models |
| Models | GET /v1/models | List available Bedrock foundation models |
| Health | GET / | Health check with config info |
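
Since the proxy speaks the OpenAI wire format, pointing any OpenAI client at it is just a base-URL change. A minimal sketch of building such a request with the standard library (the localhost port and model alias here are assumptions for illustration, not defaults shipped by bedrockify):

```python
import json
import urllib.request

def chat_request(base_url, model, messages, stream=False):
    """Build an OpenAI-style chat completions request aimed at the proxy."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request(
    "http://localhost:8000/v1",           # assumed local proxy address
    "claude-sonnet-4-5",                  # alias, OpenRouter ID, or Bedrock ID
    [{"role": "user", "content": "Hi!"}],
)
print(req.full_url)
```

An OpenAI SDK works the same way: set its `base_url` to the proxy address and use it unchanged.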

Features

Core

  • Unified proxy — chat completions + embeddings on a single port
  • OpenAI-compatible API — drop-in for any OpenAI SDK client
  • Streaming — SSE streaming for chat completions with stream_options.include_usage support
  • Tool use — function calling via Bedrock Converse (tool_choice: auto, required, or a specific tool)
  • Vision — image inputs via base64 data URLs AND remote HTTP/HTTPS URLs
  • Model aliases — short names, OpenRouter IDs, bare Bedrock IDs all work
  • Cross-region inference — auto-prefixes model IDs (us., eu., ap., global.) based on region
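
Because streaming follows the standard OpenAI SSE format, the usual `data:` line parsing applies on the client side. A rough sketch of accumulating text deltas from a chunk stream (the chunk contents below are illustrative, not captured proxy output):

```python
import json

def collect_stream(lines):
    """Accumulate assistant text from OpenAI-style SSE 'data:' lines."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":          # OpenAI's end-of-stream sentinel
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)

sse = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print(collect_stream(sse))  # → Hello
```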

Intelligence

  • Reasoning / Extended Thinking — reasoning_effort (low/medium/high) for Claude 3.7/4/4.5 with reasoning_content in responses and streaming
  • DeepSeek R1 Reasoning — automatic format detection (string format for DeepSeek, object format for Claude)
  • Interleaved Thinking — extra_body.thinking for Claude 4/4.5 thinking between tool calls
  • Prompt Caching — extra_body.prompt_caching for up to 90% cost reduction, with ENABLE_PROMPT_CACHING env var for global default
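
These knobs all ride along in the request body. A sketch of a request enabling them (the model alias and the boolean values for the extra_body keys are assumptions about the accepted shape):

```python
import json

body = {
    "model": "claude-sonnet-4-5",   # assumed alias
    "messages": [{"role": "user", "content": "Refactor this module."}],
    "reasoning_effort": "high",     # low / medium / high
}

# With an OpenAI SDK these would be passed as extra_body={...}, which the SDK
# merges into the top-level JSON; over raw HTTP they sit beside "model".
body["prompt_caching"] = True
body["thinking"] = True

payload = json.dumps(body)
print(sorted(json.loads(payload).keys()))
```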

Compatibility

  • Application Inference Profiles — pass ARN as model ID for cost tracking (no alias mangling)
  • Developer role — developer messages treated as system (OpenAI compatibility)
  • Message coalescing — consecutive same-role messages automatically merged (Bedrock requirement)
  • No-prefill handling — models that reject assistant-ending conversations (e.g. claude-opus-4-6) get automatic user continuation
  • temperature/topP conflict — auto-stripped for models that reject both simultaneously (Claude 4.5, Haiku 4.5, Opus 4.5)
  • Extra body passthrough — unknown extra_body keys forwarded to Bedrock additionalModelRequestFields
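
Message coalescing, for instance, amounts to folding runs of same-role messages into one before handing the conversation to Converse. A toy version of the idea (not bedrockify's actual code, which is in Go):

```python
def coalesce(messages):
    """Merge consecutive same-role messages, as Bedrock's Converse API requires."""
    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1] = {
                "role": msg["role"],
                "content": merged[-1]["content"] + "\n" + msg["content"],
            }
        else:
            merged.append(dict(msg))
    return merged

msgs = [
    {"role": "user", "content": "First."},
    {"role": "user", "content": "Second."},
    {"role": "assistant", "content": "Reply."},
]
print(coalesce(msgs))
```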

Embedding Models

  • Amazon Titan Embed v2 — 1024 dimensions, English + multilingual
  • Cohere Embed v3/v4 — English, multilingual, latest v4 (1536 dims)
  • Amazon Nova Multimodal Embeddings v2 — configurable dimensions (256/384/1024/3072)
  • base64 encoding — encoding_format: "base64" for compact responses
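
In the OpenAI format that the proxy mirrors, a base64 embedding is the raw little-endian float32 array, base64-encoded. A sketch of decoding one (round-tripping a tiny fake vector to show the layout; assumes the proxy follows the OpenAI convention exactly):

```python
import base64
import struct

def decode_embedding(b64):
    """Decode an OpenAI-style base64 embedding: little-endian float32 array."""
    raw = base64.b64decode(b64)
    return list(struct.unpack("<%df" % (len(raw) // 4), raw))

# Round-trip a small vector of exactly-representable float32 values.
vec = [0.5, -1.0, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *vec)).decode("ascii")
print(decode_embedding(encoded))  # → [0.5, -1.0, 2.0]
```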

Infrastructure

  • Auth — IAM/SigV4 (default) or Bedrock API key (bearer token)
  • Adaptive retries — max 8 attempts with adaptive backoff
  • Config — TOML file, env vars, CLI flags (layered priority)
  • Systemd daemon — install-daemon subcommand
  • Self-update — update subcommand
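
The layered config resolution boils down to a precedence merge: CLI flags override env vars, which override the TOML file, which overrides built-in defaults. A schematic version of that logic (the key names and values below are made up for illustration):

```python
def resolve(cli_flags, env, toml_values, defaults):
    """Merge config layers with CLI > env > TOML > defaults precedence."""
    merged = dict(defaults)
    # Apply lowest-priority layer first so later layers win.
    for layer in (toml_values, env, cli_flags):
        for key, value in layer.items():
            if value is not None:
                merged[key] = value
    return merged

cfg = resolve(
    cli_flags={"region": "eu-west-1"},                # highest priority
    env={"region": "us-east-1", "port": "8000"},
    toml_values={"port": "9000", "log": "info"},
    defaults={"region": "us-west-2", "port": "8080", "log": "warn"},
)
print(cfg)  # region from flags, port from env, log from TOML
```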