OpenAI Agents SDK Skill
Build AI agents with the OpenAI Agents SDK, with support for Gemini and other LLMs via direct integration or OpenRouter.
Architecture
```text
┌──────────────────────────────────────────────────────┐
│                  OpenAI Agents SDK                   │
│  ┌────────────────────────────────────────────────┐  │
│  │ Agent                                          │  │
│  │   model: OpenAIChatCompletionsModel            │  │
│  │     (via AsyncOpenAI with Gemini base_url)     │  │
│  │   tools: [function_tool, ...]                  │  │
│  │   mcp_servers: [MCPServerStreamableHttp(...)]  │  │
│  │   handoffs: [specialized_agent, ...]           │  │
│  └────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────┘
                            │
                            │ MCP Protocol
                            ▼
┌──────────────────────────────────────────────────────┐
│                 MCP Server (FastMCP)                 │
│           @mcp.tool() for task operations            │
└──────────────────────────────────────────────────────┘
```
Quick Start
Installation
```bash
# Base installation
pip install openai-agents

# Or with uv
uv add openai-agents
```
Environment Variables
```bash
# For direct Gemini integration (recommended)
GOOGLE_API_KEY=your-gemini-api-key

# OR for OpenRouter (alternative)
OPENROUTER_API_KEY=your-openrouter-api-key

# For OpenAI (optional, used for tracing)
OPENAI_API_KEY=your-openai-api-key
```
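A script can fail fast when neither key is present. A minimal sketch using the variable names above (`resolve_provider` and the provider labels are illustrative, not part of the SDK):

```python
import os

def resolve_provider() -> tuple[str, str]:
    """Return (provider_name, api_key) based on which environment variable is set.

    Prefers direct Gemini integration, falling back to OpenRouter.
    """
    if key := os.environ.get("GOOGLE_API_KEY"):
        return "gemini", key
    if key := os.environ.get("OPENROUTER_API_KEY"):
        return "openrouter", key
    raise RuntimeError("Set GOOGLE_API_KEY or OPENROUTER_API_KEY before running the agent")
```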
Using Gemini via Direct Integration (Recommended)
The recommended approach is to point an AsyncOpenAI client at Gemini's OpenAI-compatible endpoint. This avoids the quota issues some users hit when routing Gemini through LiteLLM.
```python
import asyncio
import os

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner
from agents.run import RunConfig
from dotenv import load_dotenv

load_dotenv()

# Create a custom OpenAI client pointing at Gemini's OpenAI-compatible endpoint
gemini_api_key = os.getenv("GOOGLE_API_KEY")
external_provider = AsyncOpenAI(
    api_key=gemini_api_key,
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

# Create a model backed by the custom client
model = OpenAIChatCompletionsModel(
    model="gemini-2.0-flash-exp",
    openai_client=external_provider,
)

# Run configuration shared by every run
# (model_provider is not needed when a Model instance is passed as model)
config = RunConfig(
    model=model,
    tracing_disabled=True,
)

# Create the agent
agent = Agent(
    name="Todo Assistant",
    instructions="You are a helpful task management assistant.",
)

async def main():
    result = await Runner.run(agent, "Help me organize my tasks", run_config=config)
    print(result.final_output)

asyncio.run(main())
```
Alternative: Using OpenRouter
OpenRouter provides access to multiple models through a single API, including free options.
```python
import asyncio
import os

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner
from agents.run import RunConfig
from dotenv import load_dotenv

load_dotenv()

# Create an OpenRouter client
openrouter_key = os.getenv("OPENROUTER_API_KEY")
external_provider = AsyncOpenAI(
    api_key=openrouter_key,
    base_url="https://openrouter.ai/api/v1",
)

# Use a free model available on OpenRouter
model = OpenAIChatCompletionsModel(
    model="openai/gpt-oss-20b:free",
    openai_client=external_provider,
)

config = RunConfig(model=model, tracing_disabled=True)

# Create and run the agent
agent = Agent(
    name="Todo Assistant",
    instructions="You are a helpful task management assistant.",
)

async def main():
    result = await Runner.run(agent, "Help me organize my tasks", run_config=config)
    print(result.final_output)

asyncio.run(main())
```
Reference
| Pattern | Guide |
|---------|-------|
| Agent Creation | reference/agents.md |
| Function Tools | reference/function-tools.md |
| MCP Integration | reference/mcp-integration.md |
| Handoffs | reference/handoffs.md |
Examples
| Example | Description |
|---------|-------------|
| examples/todo-agent.md | Complete todo agent with MCP tools |
Templates
| Template | Purpose |
|----------|---------|
| templates/agent_gemini.py | Basic Gemini agent template with direct integration |
| templates/agent_mcp.py | Agent with MCP server integration |
Basic Agent with Function Tools
```python
import asyncio
import os

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner, function_tool
from agents.run import RunConfig

@function_tool
def get_weather(city: str) -> str:
    """Get the weather for a city."""
    return f"The weather in {city} is sunny."

async def main():
    # Set up the Gemini client
    external_provider = AsyncOpenAI(
        api_key=os.getenv("GOOGLE_API_KEY"),
        base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    )
    model = OpenAIChatCompletionsModel(
        model="gemini-2.0-flash-exp",
        openai_client=external_provider,
    )
    config = RunConfig(model=model, tracing_disabled=True)

    agent = Agent(
        name="Assistant",
        instructions="You are a helpful assistant.",
        tools=[get_weather],
    )

    result = await Runner.run(agent, "What's the weather in Tokyo?", run_config=config)
    print(result.final_output)

asyncio.run(main())
```
Agent with MCP Server
Connect your agent to an MCP server to access tools, resources, and prompts.
```python
import asyncio
import os

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner
from agents.mcp import MCPServerStreamableHttp
from agents.run import RunConfig

async def main():
    # Set up the Gemini client
    external_provider = AsyncOpenAI(
        api_key=os.getenv("GOOGLE_API_KEY"),
        base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    )
    model = OpenAIChatCompletionsModel(
        model="gemini-2.0-flash-exp",
        openai_client=external_provider,
    )
    config = RunConfig(model=model, tracing_disabled=True)

    async with MCPServerStreamableHttp(
        name="Todo MCP Server",
        params={
            "url": "http://localhost:8000/api/mcp",
            "timeout": 30,
        },
        cache_tools_list=True,
    ) as mcp_server:
        agent = Agent(
            name="Todo Assistant",
            instructions="""You are a task management assistant.
Use the MCP tools to help users manage their tasks:
- add_task: Create new tasks
- list_tasks: View existing tasks
- complete_task: Mark tasks as done
- delete_task: Remove tasks
- update_task: Modify tasks""",
            mcp_servers=[mcp_server],
        )

        result = await Runner.run(agent, "Show me my pending tasks", run_config=config)
        print(result.final_output)

asyncio.run(main())
```
Agent Handoffs
Create specialized agents that hand off conversations. Note: When using handoffs, all agents share the same RunConfig.
```python
import asyncio
import os

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner
from agents.run import RunConfig
from agents.extensions.handoff_prompt import prompt_with_handoff_instructions

# Shared configuration
external_provider = AsyncOpenAI(
    api_key=os.getenv("GOOGLE_API_KEY"),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
model = OpenAIChatCompletionsModel(
    model="gemini-2.0-flash-exp",
    openai_client=external_provider,
)
config = RunConfig(model=model, tracing_disabled=True)

# Specialized agents (no model specified; the RunConfig model is used)
task_agent = Agent(
    name="Task Agent",
    instructions=prompt_with_handoff_instructions(
        "You specialize in task management. Help users create, update, and complete tasks."
    ),
)

help_agent = Agent(
    name="Help Agent",
    instructions=prompt_with_handoff_instructions(
        "You provide help and instructions about using the todo app."
    ),
)

# Triage agent
triage_agent = Agent(
    name="Triage Agent",
    instructions=prompt_with_handoff_instructions(
        """Route users to the appropriate agent:
- Task Agent: for creating, viewing, or managing tasks
- Help Agent: for questions about how to use the app"""
    ),
    handoffs=[task_agent, help_agent],
)

async def main():
    result = await Runner.run(triage_agent, "How do I add a task?", run_config=config)
    print(result.final_output)

asyncio.run(main())
```
Streaming Responses
```python
from openai.types.responses import ResponseTextDeltaEvent
from agents import Runner

result = Runner.run_streamed(agent, "List my tasks", run_config=config)
async for event in result.stream_events():
    # Print text deltas as they arrive
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)
print()
print(result.final_output)
```
Model Settings
```python
from agents import Agent, ModelSettings, Runner

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
    model_settings=ModelSettings(
        include_usage=True,   # Track token usage
        tool_choice="auto",   # or "required", "none"
    ),
)

result = await Runner.run(agent, "Hello!", run_config=config)
print(f"Tokens used: {result.context_wrapper.usage.total_tokens}")
```
Tracing Control
Disable tracing when not using OpenAI:
```python
from agents.run import RunConfig

config = RunConfig(
    model=model,
    tracing_disabled=True,  # Disable tracing when not using OpenAI
)
```
Supported Model Providers
| Provider | Base URL | Model Examples |
|----------|----------|----------------|
| Gemini | https://generativelanguage.googleapis.com/v1beta/openai | gemini-2.0-flash-exp, gemini-1.5-pro |
| OpenRouter | https://openrouter.ai/api/v1 | openai/gpt-oss-20b:free, google/gemini-2.0-flash-exp:free |
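The table can be captured in a small lookup helper so switching providers is a one-line change. A sketch: the base URLs come from the table above, while `ProviderConfig` and `provider_config` are hypothetical names, not part of the SDK.

```python
from dataclasses import dataclass

# Base URLs from the provider table above
PROVIDER_BASE_URLS = {
    "gemini": "https://generativelanguage.googleapis.com/v1beta/openai/",
    "openrouter": "https://openrouter.ai/api/v1",
}

@dataclass
class ProviderConfig:
    base_url: str  # passed to AsyncOpenAI(base_url=...)
    model: str     # passed to OpenAIChatCompletionsModel(model=...)

def provider_config(provider: str, model: str) -> ProviderConfig:
    """Pair a known provider's base URL with a model name."""
    try:
        return ProviderConfig(base_url=PROVIDER_BASE_URLS[provider], model=model)
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None
```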
MCP Connection Types
| Type | Use Case | Class |
|------|----------|-------|
| Streamable HTTP | Production, low-latency | MCPServerStreamableHttp |
| SSE | Web clients, real-time | MCPServerSse |
| Stdio | Local processes | MCPServerStdio |
| Hosted | OpenAI-hosted MCP | HostedMCPTool |
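As a sketch, the `params` dicts for the two most common transports look like this. The URL, command, and timeout values are examples; the HTTP keys mirror the Streamable HTTP example earlier in this document, and the stdio keys (`command`, `args`) follow the SDK's stdio convention.

```python
def http_mcp_params(url: str, timeout: int = 30) -> dict:
    """params for MCPServerStreamableHttp / MCPServerSse (URL is an example)."""
    return {"url": url, "timeout": timeout}

def stdio_mcp_params(command: str, *args: str) -> dict:
    """params for MCPServerStdio, which launches the server as a local subprocess."""
    return {"command": command, "args": list(args)}
```

For example, `MCPServerStdio(params=stdio_mcp_params("python", "todo_server.py"))` would run a local FastMCP server over stdio (the file name is hypothetical).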
Error Handling
```python
from agents import Runner
from agents.exceptions import AgentsException, MaxTurnsExceeded

try:
    result = await Runner.run(agent, "Hello", run_config=config)
    print(result.final_output)
except MaxTurnsExceeded:
    print("The agent hit the max-turns limit.")
except AgentsException as e:
    print(f"Agent error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
Best Practices
- Use direct integration: prefer `AsyncOpenAI` with a custom `base_url` over LiteLLM to avoid quota issues
- Pass RunConfig: always pass a `RunConfig` to `Runner.run()` for non-OpenAI providers
- Cache MCP tools: use `cache_tools_list=True` for performance
- Use handoffs: create specialized agents for different functionality
- Enable usage tracking: set `include_usage=True` in `ModelSettings` to monitor costs
- Disable tracing: set `tracing_disabled=True` in `RunConfig` when not using OpenAI
- Handle errors gracefully: wrap agent execution in try/except
- Use streaming for a better user experience
- Share configuration: when using handoffs, all agents share the same `RunConfig`
Troubleshooting
Quota Exceeded Error on First Request
Problem: Getting quota exceeded error even on first request when using LiteLLM.
Solution: Switch to direct integration using AsyncOpenAI:
```python
import os

from openai import AsyncOpenAI
from agents import OpenAIChatCompletionsModel, Runner
from agents.run import RunConfig

external_provider = AsyncOpenAI(
    api_key=os.getenv("GOOGLE_API_KEY"),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
model = OpenAIChatCompletionsModel(
    model="gemini-2.0-flash-exp",
    openai_client=external_provider,
)
config = RunConfig(model=model, tracing_disabled=True)

result = await Runner.run(agent, "Hello", run_config=config)
```
MCP connection fails
- Check that the MCP server is running
- Verify the URL is correct
- Check the timeout settings
- Set `cache_tools_list=True` for performance (avoids re-fetching the tool list on every run)
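A quick way to rule out the first two items is a plain TCP probe of the server's host and port. This is only a liveness check, not an MCP handshake:

```python
import socket
from urllib.parse import urlparse

def mcp_server_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if something is listening at the URL's host:port."""
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```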
Gemini API errors
- Verify `GOOGLE_API_KEY` is set correctly
- Check the model name: `gemini-2.0-flash-exp` or `gemini-1.5-pro`
- Verify the base_url is correct: `https://generativelanguage.googleapis.com/v1beta/openai`
- Ensure the API quota is not exceeded
Agent not using provided model
Problem: Agent ignores model configuration.
Solution: Always pass the RunConfig via the run_config parameter of Runner.run():

```python
result = await Runner.run(agent, message, run_config=config)
```