Getting Started with Anura: Learn about Lilypad's Inference API


As artificial intelligence rapidly becomes foundational to modern applications, developers face growing challenges around scalability, cost, and centralized infrastructure constraints. For instance, buying a rig capable of running local inference can cost you several thousand dollars if you want a machine equipped with Nvidia’s latest graphics cards, or even an Apple Mac Studio.
Lilypad’s Anura Inference API offers a solution: it provides access to the latest models, powered by Lilypad’s decentralized AI compute infrastructure, for tasks like text generation, image creation, and web search. Whether you're building AI agents, creative tools, or intelligent apps, Anura gives you open access to the power of AI, without gatekeepers or vendor lock-in.
What Is the Anura Inference API?
Anura, the official inference API from Lilypad, enables developers to run AI inference tasks on a decentralized compute network. Instead of relying on traditional cloud platforms, Anura taps into a permissionless network of compute nodes, reducing costs while increasing scalability and transparency.
With Anura, you can:
Generate content using large language models (LLMs)
Create high-quality images with diffusion models
Perform vision queries against images using Lilypad’s multimodal models
Perform real-time web searches that you can use to boost your AI’s contextual knowledge
Top Features of Anura API
1. Text-to-Text Generation with LLMs
Run powerful models like Llama3, DeepSeek, and Qwen2.5 using OpenAI-compatible chat completion endpoints.
Endpoint:
POST /api/v1/chat/completions
Supports: Function calling, vision, streaming via Server-Sent Events (SSE), customizable temperature, top-p, and stop sequences
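To make this concrete, here is a minimal Python sketch of a chat completion call using only the standard library. It assumes the API lives on the same host as the sign-up page (anura.lilypad.tech), that the key is sent as a Bearer token, and that streamed chunks follow the usual OpenAI-style `data: {...}` lines ending with `data: [DONE]`. The function names are made up for illustration, and the model name is whatever you pick from the models endpoint. Treat all of this as assumptions to verify against the docs.

```python
import json
import urllib.request

# Assumption: same host as the sign-up page; confirm the real API host in the docs.
BASE_URL = "https://anura.lilypad.tech"


def build_chat_request(api_key, model, messages, stream=False,
                       temperature=None, top_p=None, stop=None):
    """Build (but do not send) a POST request for /api/v1/chat/completions."""
    payload = {"model": model, "messages": messages, "stream": stream}
    # Only include the optional sampling settings that were actually provided.
    if temperature is not None:
        payload["temperature"] = temperature
    if top_p is not None:
        payload["top_p"] = top_p
    if stop:
        payload["stop"] = stop
    return urllib.request.Request(
        BASE_URL + "/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(req):
    """Send a non-streaming request and return the assistant's reply text."""
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style response shape: choices[0].message.content
    return body["choices"][0]["message"]["content"]


def parse_sse_line(line):
    """Return the text delta carried by one SSE line, or None if it has none."""
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank lines, comments, and other SSE fields
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None  # end-of-stream marker
    chunk = json.loads(data)
    choices = chunk.get("choices") or [{}]
    return choices[0].get("delta", {}).get("content")
```

For a regular request, `send_chat(build_chat_request(key, model, messages))` should hand back the reply as a string. With `stream=True`, you would instead read the response line by line and pass each decoded line through `parse_sse_line`, printing whatever text comes back.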
2. AI-Powered Image Generation
Use models like sdxl-turbo to generate images from natural language prompts.
List models:
GET /api/v1/image/models
Generate image:
POST /api/v1/image/generate
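Here is a rough Python sketch of how those two image endpoints could be called. The endpoint paths and the sdxl-turbo model name come from above; everything else is an assumption: the host, Bearer auth, the `model` and `prompt` field names in the JSON body, and the idea that the response body is the raw image bytes. The function names are purely illustrative, so check the docs before relying on any of it.

```python
import json
import urllib.request

# Assumption: same host as the sign-up page; confirm the real API host in the docs.
BASE_URL = "https://anura.lilypad.tech"


def build_image_models_request(api_key):
    """Build a GET request that lists the available image models."""
    return urllib.request.Request(
        BASE_URL + "/api/v1/image/models",
        headers={"Authorization": f"Bearer {api_key}"},  # assumption: Bearer auth
        method="GET",
    )


def build_image_request(api_key, prompt, model="sdxl-turbo"):
    """Build a POST request asking the given model to render the prompt."""
    # Assumption: the body uses "model" and "prompt" as field names.
    payload = {"model": model, "prompt": prompt}
    return urllib.request.Request(
        BASE_URL + "/api/v1/image/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_image(req, path):
    """Send the request and write the response body to a file.

    Assumption: the endpoint replies with raw image bytes rather than JSON.
    """
    with urllib.request.urlopen(req) as resp, open(path, "wb") as out:
        out.write(resp.read())
```

Something like `save_image(build_image_request(key, "a frog on a lilypad at sunrise"), "frog.png")` would then drop the result on disk, provided the response really is the image itself.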
3. Web Search API for Real-Time Context
Retrieve live internet search results to enhance AI agent accuracy and context awareness.
Endpoint:
POST /api/v1/websearch
Returns: Titles, URLs, snippets, and related queries
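As a sketch of how search results can feed an agent, the snippet below builds a web search request and flattens the results into a text block you could paste into a prompt. Only the endpoint path is taken from above. The host, Bearer auth, the `query` and `number_of_results` request fields, and the `results`, `title`, `url`, and `snippet` response keys are all hypothetical names chosen for illustration, so adjust them to whatever the docs actually specify.

```python
import json
import urllib.request

# Assumption: same host as the sign-up page; confirm the real API host in the docs.
BASE_URL = "https://anura.lilypad.tech"


def build_websearch_request(api_key, query, number_of_results=3):
    """Build a POST request for /api/v1/websearch."""
    # Assumption: hypothetical field names for the request body.
    payload = {"query": query, "number_of_results": number_of_results}
    return urllib.request.Request(
        BASE_URL + "/api/v1/websearch",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


def format_search_context(response):
    """Turn a parsed search response into numbered, prompt-ready text.

    Assumption: results live under "results", each with "title", "url",
    and "snippet" keys. Missing keys fall back to empty strings.
    """
    blocks = []
    for i, item in enumerate(response.get("results", []), start=1):
        title = item.get("title", "")
        url = item.get("url", "")
        snippet = item.get("snippet", "")
        blocks.append(f"[{i}] {title} ({url})\n{snippet}")
    return "\n\n".join(blocks)
```

The formatted string can go into a system or user message ahead of the actual question, which is the basic idea behind using web search to ground a model's answer.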
Getting Started Just Takes 3 Simple Steps
Register for an API key at anura.lilypad.tech
Browse available models via the /models endpoint
Start running inference jobs using Anura's clean, RESTful interface. Check out their docs for all the details.
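The first two steps could look something like this in Python. This is only a sketch: the `ANURA_API_KEY` environment variable name is made up, the host is assumed to match the sign-up page, Bearer auth is assumed, and the `/models` endpoint is assumed to sit under the same `/api/v1` prefix as the other endpoints in this article.

```python
import json
import os
import urllib.request

# Assumptions: host matches the sign-up page, and /models sits under /api/v1.
BASE_URL = "https://anura.lilypad.tech"
MODELS_PATH = "/api/v1/models"


def load_api_key(env_var="ANURA_API_KEY"):
    """Read the API key from the environment (hypothetical variable name)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set {env_var} to your Anura API key first")
    return key


def build_models_request(api_key):
    """Build a GET request that lists the available models."""
    return urllib.request.Request(
        BASE_URL + MODELS_PATH,
        headers={"Authorization": f"Bearer {api_key}"},  # assumption: Bearer auth
        method="GET",
    )


def fetch_json(req):
    """Send a request and return the parsed JSON response body."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Keeping the key in an environment variable means it never ends up in source control. From there, `fetch_json(build_models_request(load_api_key()))` should return whatever the models endpoint reports, and you can pick a model name from that for your inference calls.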
Why I dig the API:
Open & Permissionless: All I need is an API key and I’m good to go
Decentralized Compute: Run workloads across a global compute network, including access to A5000s, A6000s, Nvidia 5090s, 4090s, and more
Cost-Effective: Anura is FREE right now, and you can’t beat that in today’s expensive inference API landscape
OpenAI-Compatible: Easy integration with existing SDKs and tools. I can plug Anura into the OpenAI SDK without having to onboard another API client
Access to Web Search out of the box: Having the tooling available for web search is an amazing way to boost the context of my AI workflow
Here are some example applications I’ve built using the Anura API, for inspiration:
A RAG using web search: https://github.com/narbs91/lilypad-websearch-agent-demo
A Crypto Onboarding Agent: https://github.com/narbs91/crypto-onboarding-agent
Final Thoughts
As the demand for decentralized AI infrastructure grows, Lilypad’s Anura API is emerging as a critical enabler of scalable, open AI development. Whether you're a solo hacker, a startup founder, or an enterprise innovator, Anura gives you the flexibility to build and scale AI-powered applications without the limits of centralization. If you enjoyed this article, consider giving it a thumbs up or subscribing so you never miss any content I push out. Cheers until the next one!
Written by Narbeh Shahnazarian
