Structured Outputs Skill

Structured Outputs

Overview

Structured Outputs ensure that an LLM's response always conforms to a predefined JSON schema. This moves beyond simple prompt engineering to guaranteed syntactic and structural correctness, essential for programmatic integration.

When to Use

API Integration: When LLM output must be parsed by a machine (e.g., generating database records).
UI Components: When the LLM generates data to populate a specific frontend interface.
Data Extraction: Converting unstructured text into standardized JSON objects.

Decision Tree

Do you need 100% guarantee of schema adherence?
- YES: Use Structured Output mode (e.g., OpenAI's strict: true or Gemini's response_mime_type).
- NO: JSON Mode may suffice.
Is the structure complex or nested?
- YES: Use Pydantic models and model_rebuild if recursive.
Do you need to handle safety refusals programmatically?
- YES: Check for the refusal field in the response.

Workflows

1. Defining a Pydantic-based Schema

Define a Python class inheriting from BaseModel.
Use specific types (e.g., int, List[str]) and Field descriptions to guide the model.
Pass the model directly to the SDK's parse method (e.g., client.beta.chat.completions.parse).
Access the parsed result via response.choices[0].message.parsed.

2. Handling Optional and Nullable Fields

In the JSON schema, define types as an array: ["string", "null"].
In Pydantic, use Optional[str] = None.
Instruct the model in the field description when to use null vs. an empty value.

3. Streaming Structured JSON

Initiate a generate_content_stream call with the JSON schema configuration.
Iterate over chunks as they arrive from the model.
Concatenate the partial JSON strings; note that the full object is only valid JSON once the stream finishes.

Non-Obvious Insights

Order Matters: Models typically produce outputs in the same order as the keys defined in the schema.
Refusals as First-Class Citizens: Safety-based model refusals are now programmatically detectable as a separate field, preventing the parser from failing on malformed JSON when the model won't answer.
Beyond JSON Mode: Structured Outputs guarantee schema adherence, whereas JSON Mode only guarantees valid JSON syntax without specific structure enforcement.

Evidence

"Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema." - OpenAI
"The model will produce outputs in the same order as the keys in the schema." - Google AI
"Safety-based model refusals are now programmatically detectable." - OpenAI

Scripts

scripts/structured-outputs_tool.py: Pydantic model definition and parsing logic.
scripts/structured-outputs_tool.js: Equivalent JSON schema definition for Node.js.

Dependencies

pydantic
openai >= 1.40.0 or google-generativeai

References

references/README.md

Agent Skills: Structured Outputs

Install this agent skill to your local

Skill Files