ASG Inference
Get instant access to state-of-the-art AI models with per-token pricing.
Overview
ASG Inference provides:
- 100+ models: GPT-5.2, Claude Sonnet 4, Gemini 2.5 Pro, DeepSeek R1, and more
- OpenAI-compatible: drop-in replacement
- Per-token billing: pay only for the tokens you use
- Automatic fallback: reliability across providers
Quick Example
curl -X POST https://agent.asgcompute.com/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "inference_chat",
      "arguments": {
        "model": "openai/gpt-4o-mini",
        "messages": [
          {"role": "user", "content": "Explain quantum computing in one sentence."}
        ]
      }
    }
  }'
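For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The endpoint, headers, method name, and tool name are copied from the curl example above; the helper names `build_chat_request` and `send` are hypothetical and not part of any ASG SDK.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
ENDPOINT = "https://agent.asgcompute.com/mcp"


def build_chat_request(api_key, model, messages, request_id=1):
    """Build (but do not send) the JSON-RPC request shown in the curl example.

    Hypothetical helper: the name and signature are illustrative only.
    """
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "inference_chat",
            "arguments": {"model": model, "messages": messages},
        },
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect or log before any tokens are billed. A call would look like `send(build_chat_request("YOUR_API_KEY", "openai/gpt-4o-mini", [{"role": "user", "content": "Hello"}]))`.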
Available Models
Pass the full model identifier in the model parameter:
| Model | Best For | Cost |
|---|---|---|
| openai/gpt-4o-mini | Quick responses, chat | $ |
| openai/gpt-4.1 | General purpose | $$ |
| openai/gpt-5.2 | Complex reasoning | $$$ |
| anthropic/claude-sonnet-4 | Coding, analysis | $$$ |
| google/gemini-2.5-pro | Multimodal, long context | $$ |
| deepseek/deepseek-r1 | Math, reasoning | $$ |
Use the Quote response to see exact per-token pricing for any model before execution.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier (see table above) |
| messages | array | Yes | Conversation messages |
| max_tokens | number | No | Max output tokens (default: 1024) |
| temperature | number | No | Randomness (0-2, default: 1) |
| stream | boolean | No | Enable streaming (default: false) |
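As a rough illustration of the parameter table, the sketch below assembles the `arguments` object and checks values on the client before sending. The function name `build_arguments` is hypothetical, and the client-side checks are an assumption about what is worth catching early; the service may validate differently. Optional parameters that are left out are simply omitted so the documented defaults apply.

```python
def build_arguments(model, messages, max_tokens=None, temperature=None, stream=None):
    """Assemble the `arguments` object for inference_chat (hypothetical helper)."""
    if not isinstance(model, str) or not model:
        raise ValueError("model must be a non-empty string")
    if not isinstance(messages, list):
        raise ValueError("messages must be a list")

    arguments = {"model": model, "messages": messages}

    # Optional parameters are only included when the caller sets them,
    # so the documented defaults (1024, 1, false) apply otherwise.
    if max_tokens is not None:
        if max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        arguments["max_tokens"] = max_tokens
    if temperature is not None:
        # Range taken from the parameter table: 0 to 2.
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        arguments["temperature"] = temperature
    if stream is not None:
        arguments["stream"] = bool(stream)
    return arguments
```

For example, `build_arguments("openai/gpt-4o-mini", [{"role": "user", "content": "Hi"}], temperature=0.2)` returns a dictionary with only `model`, `messages`, and `temperature` set.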
Response
{
  "result": {
    "content": "Quantum computing uses quantum bits...",
    "usage": {
      "prompt_tokens": 12,
      "completion_tokens": 45,
      "total_tokens": 57
    },
    "_meta": {
      "receipt_id": "rcpt_abc123",
      "debited_usdc_microusd": 2400
    }
  }
}
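A minimal sketch of reading this response in Python follows. It assumes, from the field name alone, that `debited_usdc_microusd` is denominated in millionths of a US dollar; that unit is not stated on this page, so confirm it before relying on the conversion. The `error` check reflects the JSON-RPC 2.0 convention that failed calls carry an `error` member instead of `result`. The function name `summarize` is hypothetical.

```python
import json
from decimal import Decimal

# Sample body copied from the response shown above.
SAMPLE = """
{
  "result": {
    "content": "Quantum computing uses quantum bits...",
    "usage": {"prompt_tokens": 12, "completion_tokens": 45, "total_tokens": 57},
    "_meta": {"receipt_id": "rcpt_abc123", "debited_usdc_microusd": 2400}
  }
}
"""


def summarize(response):
    """Extract text, token usage, and cost from a decoded response (hypothetical helper)."""
    if "error" in response:
        # JSON-RPC 2.0 reports failures in an "error" member.
        raise RuntimeError(f"request failed: {response['error']}")
    result = response["result"]
    usage = result["usage"]
    meta = result["_meta"]
    return {
        "text": result["content"],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
        "receipt_id": meta["receipt_id"],
        # ASSUMPTION: micro-USD means 1/1,000,000 of a dollar.
        "cost_usd": Decimal(meta["debited_usdc_microusd"]) / Decimal(1_000_000),
    }


summary = summarize(json.loads(SAMPLE))
```

Under that assumption, the sample debit of 2400 corresponds to 0.0024 USD. `Decimal` is used so that small amounts are not distorted by floating-point rounding when receipts are summed.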
Streaming
For real-time responses, set stream: true in arguments. Streaming responses use Server-Sent Events.
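The sketch below shows one way to consume a Server-Sent Events stream in Python. It implements only the generic SSE framing rules (events separated by blank lines, `data:` fields, `:` comment lines); the JSON shape inside each event is not documented on this page, so the `delta` field in the usage example is purely hypothetical. The function name `iter_sse_data` is also illustrative.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            buffer.append(value)
    # An event not terminated by a blank line is discarded, as in the SSE spec.


# Usage with hypothetical event payloads (the real payload shape may differ).
sample_stream = [
    'data: {"delta": "Quan"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"delta": "tum"}\n',
    "\n",
]
events = list(iter_sse_data(sample_stream))
```

Because the parser accepts any iterable of text lines, it can be fed a decoded HTTP response body read line by line, or a plain list as shown here.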
Pricing
See Pricing for current rates.
Cost Optimization: Use lightweight models like openai/gpt-4o-mini for simple tasks and reserve frontier models for complex reasoning.