Open Source LLM Gateway for AI Governance

Large language models are part of everyday software development, DevOps, and automation workflows. Coding assistants, CI/CD pipelines, and agentic tools all require access to AI providers. Managing individual AI subscriptions, credentials, and usage costs for every developer is impractical and difficult to govern at scale.

At the same time, uncontrolled access to AI services introduces security, compliance, and cost risks. Organizations need a central place to control which models may be used and who may access them.

Membrane comes with an integrated LLM Gateway for AI governance and secure access to LLM APIs. It provides centralized access control, token usage tracking, rate limiting, model governance, and security policies for LLM APIs. Combined with API Gateway features such as OAuth2, validation, and transformations, it forms a flexible toolkit for building tailored AI governance solutions.

This page describes the capabilities of the Membrane LLM Gateway and how it can be used to govern and secure AI usage across an organization.

What is an AI or LLM Gateway?

An AI or LLM Gateway acts as a centralized entry point between applications and AI providers such as OpenAI, Anthropic, or Google AI.

Central AI Gateway for OpenAI, Anthropic, and Google AI

Instead of applications connecting directly to external AI services, requests are routed through the gateway. The gateway controls authentication, model access, token usage, security policies, and monitoring.

This allows organizations to govern AI usage centrally. Developers and tools can use AI services without managing provider API keys or individual subscriptions. At the same time, organizations gain visibility into costs, usage patterns, and security relevant activity.

An LLM Gateway extends the capabilities of a traditional API Gateway with AI specific functionality such as token-based quotas, model governance, prompt filtering, streaming support, provider abstraction, and AI usage metrics.

API Key Sharing

An organization can centrally manage access to LLM providers using a shared provider account. Developers do not use the real API key of the AI provider directly. Instead, they authenticate against the internal LLM Gateway using organization managed API keys or security tokens.

The Membrane LLM Gateway validates the internal credentials and replaces them with the actual provider API key before forwarding the request to the LLM provider.

Sharing AI provider API keys through the Membrane LLM Gateway

This approach centralizes AI access and allows organizations to govern usage consistently across teams and tools:

The company only needs one account with the AI provider instead of many individual subscriptions.
Token usage can still be tracked per developer, team, or application through the internal API keys.
The gateway collects metrics about token consumption that can be used for internal chargeback models.
Provider API keys remain hidden from client applications and developer tools, reducing the risk of credential leaks.
Agents, IDEs, and AI tools are easy to configure. Developers only need the URL of the LLM Gateway and their internal API key instead of a real provider API key.

Token Usage Tracking and Quotas

Uncontrolled spending of tokens could lead to unpleasant surprises. The LLM Gateway can track token usage for every request and enforce quotas to control spending.

An upcoming Membrane release will include a JDBC based token usage store. Token usage data can then be persisted in a database and used as a source for reporting, monitoring, analytics, or billing systems.

Model Governance

The Membrane LLM Gateway can enforce policies that define which models are available to users.

How to setup the LLM Gateway?

The initial setup takes less than 10 minutes.

Download Membrane API Gateway.
Explore the examples in the tutorials/ai/llm-gateway folder.
Adapt a tutorial configuration to your use case.

Sample Configuration

The following configuration exposes an OpenAI compatible LLM Gateway. Client applications send their requests to the gateway instead of directly to OpenAI.

The gateway enforces quotas and forwards the request to the OpenAI API using the centrally managed provider key.

api:
  port: 2000
  flow:
    - llmGateway:
        apiKey: <<Replace with your API_KEY>>
        openai: {}
        maxInputTokens: 10000
        maxOutputTokens: 20000
  target:
    url: https://api.openai.com