Announcing APISIX Integration with AI/ML API


We're thrilled to announce that AI/ML API is now a supported provider for the ai-proxy, ai-proxy-multi, and ai-request-rewrite plugins in Apache APISIX. AI/ML API support will be available in the next APISIX release.
Introduction
AI/ML API is a single endpoint that gives you access to more than 300 ready-to-use AI models—large language models, embeddings, image and audio tools—through one standard REST interface. It is used by over 150,000 developers and organizations as a centralized LLM API gateway.
AI/ML API provides a unified OpenAI-compatible API with access to 300+ LLMs such as GPT-4, Claude, Gemini, DeepSeek, and others. This integration bridges the gap between your API infrastructure and leading AI services, enabling you to deploy intelligent features—like chatbots, real-time translations, and data analysis—faster than ever.
Proxy to OpenAI via AI/ML API
Prerequisites
Generate your API key on the AI/ML API dashboard.
Configure the Route
Create a route and configure the ai-proxy plugin as follows:
# OPENAI_API_KEY holds the key generated on the AI/ML API dashboard
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${ADMIN_API_KEY}" \
  -d '{
    "id": "ai-proxy-route",
    "uri": "/anything",
    "methods": ["POST"],
    "plugins": {
      "ai-proxy": {
        "provider": "aimlapi",
        "auth": {
          "header": {
            "Authorization": "Bearer '"$OPENAI_API_KEY"'"
          }
        },
        "options": {
          "model": "gpt-4"
        }
      }
    }
  }'
Test the Integration
Send a POST request to the route with a system prompt and a sample user question in the request body:
curl "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -H "Host: api.openai.com" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "What is 1+1?" }
    ]
  }'
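The same request can also be issued from application code. The following is a minimal sketch using only the Python standard library; it assumes the gateway listens on http://127.0.0.1:9080 and that the /anything route above exists. The helper names build_chat_request and send are illustrative and not part of APISIX. No model is set in the body because the route's options already pin it.

```python
import json
import urllib.request

# Assumed local gateway address and the route created in the previous step.
GATEWAY_URL = "http://127.0.0.1:9080/anything"


def build_chat_request(question: str,
                       system_prompt: str = "You are a mathematician") -> urllib.request.Request:
    """Build a POST request carrying an OpenAI-style chat payload."""
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ]
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (requires a running gateway)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a gateway running, calling send(build_chat_request("What is 1+1?")) would return the decoded completion as a dictionary.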
Verify Response
You should receive a response similar to the following:
{
  ...,
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "logprobs": null,
      "message": {
        "role": "assistant",
        "content": "1 + 1 equals 2.",
        "refusal": null,
        "annotations": []
      }
    }
  ],
  "created": 1753845968,
  "model": "gpt-4-0613",
  "usage": {
    "prompt_tokens": 1449,
    "completion_tokens": 1008,
    "total_tokens": 2457,
    ...
  }
}
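In client code, the fields of interest are usually the assistant message and the token usage. A small sketch of pulling both out of a response body is shown here; the function name extract_reply is hypothetical, and the sample body reuses only the fields shown in the response above.

```python
import json


def extract_reply(response_body: str) -> tuple[str, int]:
    """Return the assistant's text and the total token count from a chat completion body."""
    data = json.loads(response_body)
    content = data["choices"][0]["message"]["content"]
    # usage may be absent in some responses, so fall back to 0
    total_tokens = data.get("usage", {}).get("total_tokens", 0)
    return content, total_tokens


sample = json.dumps({
    "choices": [{
        "index": 0,
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "1 + 1 equals 2."},
    }],
    "usage": {"prompt_tokens": 1449, "completion_tokens": 1008, "total_tokens": 2457},
})
print(extract_reply(sample))  # ('1 + 1 equals 2.', 2457)
```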
Core Use Cases
Unified AI Service Management
Multi-Model Proxy and Load Balancing: Replace hardcoded vendor endpoints with a single APISIX interface, dynamically routing requests to models from OpenAI, Claude, DeepSeek, Gemini, Mistral, etc., based on cost, latency, or performance needs.
Vendor-Agnostic Workflows: Seamlessly switch between models (e.g., GPT-4 for creative tasks, Claude for document analysis) without code changes.
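As a rough illustration of weighted multi-model routing, the sketch below assembles a plugin block for ai-proxy-multi with one instance per model. The field names (instances, name, provider, weight, auth, options) reflect the ai-proxy-multi plugin schema as understood at the time of writing and should be verified against the documentation for your APISIX version; the function name, the instance names, and the second model name are placeholders invented for this example.

```python
import json


def build_multi_instance_plugin(api_key: str, weighted_models: dict[str, int]) -> dict:
    """Assemble an ai-proxy-multi plugin block with one weighted instance per model."""
    instances = [
        {
            "name": f"aimlapi-{model}",      # illustrative instance name
            "provider": "aimlapi",
            "weight": weight,                 # relative share of traffic
            "auth": {"header": {"Authorization": f"Bearer {api_key}"}},
            "options": {"model": model},
        }
        for model, weight in weighted_models.items()
    ]
    return {"ai-proxy-multi": {"instances": instances}}


# Roughly 90% of requests to gpt-4 and 10% to a placeholder cheaper model.
plugins = build_multi_instance_plugin(
    "<your-api-key>", {"gpt-4": 9, "cheaper-model-placeholder": 1}
)
print(json.dumps(plugins, indent=2))
```

The resulting dictionary would take the place of the "plugins" object in a route definition sent to the Admin API.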
Cost-Optimized Token Governance
Token-Based Budget Enforcement: Set per-team/monthly spending limits; auto-throttle requests when thresholds are exceeded.
Caching & Fallbacks: Cache frequent LLM responses (e.g., FAQ answers) or reroute to cheaper models during provider outages.
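To make the idea of token-based throttling concrete, here is a conceptual sketch of a fixed-window token budget. It is not how APISIX implements rate limiting; the class TokenBudget and its methods are hypothetical, and the only assumption carried over from the article is that each response reports usage.total_tokens.

```python
import time


class TokenBudget:
    """Fixed-window budget: allow requests until the window's token limit is spent."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def _roll_window(self) -> None:
        # Start a fresh window once the current one has elapsed.
        if self.clock() - self.window_start >= self.window_seconds:
            self.window_start = self.clock()
            self.used = 0

    def allow(self) -> bool:
        """True while the current window still has budget left."""
        self._roll_window()
        return self.used < self.limit

    def record(self, total_tokens: int) -> None:
        """Charge the tokens reported in a response's usage.total_tokens."""
        self._roll_window()
        self.used += total_tokens


budget = TokenBudget(limit=3000, window_seconds=60)
budget.record(2457)
print(budget.allow())  # True: 2457 of 3000 tokens used
budget.record(1000)
print(budget.allow())  # False: budget exceeded until the window resets
```

Charging tokens after the response, rather than before the request, mirrors the fact that the exact token count is only known once the model has answered.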
Real-Time AI Application Scaling
Chatbots & Virtual Agents: Power low-latency conversational interfaces with streaming support for token-by-token responses.
Data Enrichment Pipelines: Augment APIs with AI, for example by auto-summarizing user reviews or translating product descriptions on the fly.
Hybrid/Multi-Cloud AI Deployment
Unified Control Plane: Manage on-prem LLMs (e.g., Llama 3) alongside cloud APIs (OpenAI, Azure) with consistent policy enforcement.
High Availability & Fault Tolerance: Built-in health checks, automatic retries, and failover; if one LLM fails, traffic is rerouted within seconds to keep services alive.
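The retry-then-failover behavior described above can be sketched in a few lines. This is a conceptual illustration only, not APISIX's implementation: call_with_failover and the two stand-in backends are hypothetical, and a real gateway would retry only on failures it considers retriable.

```python
from typing import Callable, Optional, Sequence


def call_with_failover(backends: Sequence[Callable[[dict], dict]],
                       request: dict,
                       retries_per_backend: int = 1) -> dict:
    """Try each backend in order; move on once its attempts are exhausted."""
    last_error: Optional[Exception] = None
    for backend in backends:
        for _ in range(1 + retries_per_backend):
            try:
                return backend(request)
            except Exception as error:  # simplified: treat every failure as retriable
                last_error = error
    raise RuntimeError("all backends failed") from last_error


def primary(_request: dict) -> dict:
    raise ConnectionError("primary model unavailable")


def secondary(request: dict) -> dict:
    return {"model": "fallback", "echo": request["prompt"]}


print(call_with_failover([primary, secondary], {"prompt": "hi"}))
# {'model': 'fallback', 'echo': 'hi'}
```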
Enterprise AI Security & Compliance
Data Security and Compliance: Prompt Guard, content moderation, PII redaction and full audit logs in a single place.
One Auth Layer for 300+ LLMs: Unified authentication (JWT/OAuth2/OIDC) and authorization in front of all 300+ models, with keys and policies managed in one place.
Conclusion
With AI/ML API now natively supported in Apache APISIX, you no longer have to choose between speed, security, and scale; you get all three.
A single ai-proxy plugin configuration gives your gateway access to more than 300 models.
Zero code changes let you hot-swap GPT-4 for Claude, or route 10% of traffic to a cheaper model for instant cost savings.
Built-in guardrails (PII redaction, token budgets, content moderation) keep compliance teams happy while your product team ships faster.
More Resources
Related APISIX AI Plugins
Written by

Yilia Lin
Technical Writer at API7.ai