Hume AI API
Analyze emotions in text, audio, and video, generate expressive text-to-speech, and build speech-to-speech conversational agents via Hume's REST API.
Official docs: https://dev.hume.ai/reference
When to Use
Use this skill when you need to:
- Analyze emotions and expressions in text, audio, images, or video (batch processing)
- Generate expressive text-to-speech audio with emotional control
- Manage EVI (Empathic Voice Interface) configurations, prompts, and tools
- Retrieve chat transcripts and events from EVI conversations
Prerequisites
- Sign up at https://app.hume.ai
- Navigate to the API Keys page
- Copy your API key
Set the environment variable:
export HUME_TOKEN="your-api-key"
Placeholders: Values in {curly-braces}, like {job-id}, are placeholders. Replace them with actual values when executing.
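Every command in this document reads HUME_TOKEN, so an unset variable would typically only surface as a 401 from the server. A minimal sketch of a local guard that fails earlier; the function name require_hume_token is hypothetical and not part of any Hume tooling:

```shell
# Hypothetical helper: fail early if HUME_TOKEN is unset or empty.
require_hume_token() {
  if [ -z "${HUME_TOKEN:-}" ]; then
    echo "HUME_TOKEN is not set; export it before calling the API" >&2
    return 1
  fi
}
```

Calling require_hume_token at the top of a script, before the first curl command, makes a missing key a clear local error.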
Expression Measurement (Batch)
Start Inference Job from URLs
Write to /tmp/hume_request.json:
{
"urls": ["https://example.com/media-file.mp4"],
"models": {
"face": {},
"prosody": {},
"language": {}
},
"notify": true
}
Then run:
curl -s -X POST "https://api.hume.ai/v0/batch/jobs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .
Start Inference Job with Text
Write to /tmp/hume_request.json:
{
"text": ["I am so excited about this!", "This is really disappointing."],
"models": {
"language": {}
}
}
Then run:
curl -s -X POST "https://api.hume.ai/v0/batch/jobs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .
List Jobs
curl -s "https://api.hume.ai/v0/batch/jobs?limit=10" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
List Jobs by Status
curl -s "https://api.hume.ai/v0/batch/jobs?limit=10&status=COMPLETED" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
Get Job Details
curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
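Because batch jobs are asynchronous, the job details request is usually repeated until the job finishes. The following is a sketch of such a polling loop, not an official client. The helper names wait_for_job and job_status are hypothetical, and the sketch assumes the status is found at .state.status in the job details response and that a failed job reports FAILED.

```shell
# Hypothetical helper: run a status command repeatedly until it prints COMPLETED.
# Usage: wait_for_job <interval-seconds> <max-attempts> <command> [args...]
wait_for_job() {
  interval=$1
  max_attempts=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # The remaining arguments are the command that prints the current status.
    job_state=$("$@") || return 1
    case "$job_state" in
      COMPLETED) return 0 ;;
      FAILED) echo "job failed" >&2; return 1 ;;  # assumed failure status
    esac
    # Wait between attempts, but not after the last one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$interval"
    fi
    attempt=$((attempt + 1))
  done
  echo "timed out; last status: ${job_state:-unknown}" >&2
  return 1
}

# Prints the status of one job. Assumption: the field lives at .state.status.
job_status() {
  curl -s "https://api.hume.ai/v0/batch/jobs/$1" \
    --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq -r '.state.status'
}
```

For example, wait_for_job 5 60 job_status "{job-id}" checks every 5 seconds for up to 60 attempts and returns success once the job reports COMPLETED. Keeping the status command separate from the loop lets the loop be exercised with a stub instead of a live request.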
Get Job Predictions
curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}/predictions" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
Download Job Artifacts
curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}/artifacts" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" --output /tmp/hume_artifacts.zip
Text-to-Speech
Synthesize Speech (JSON Response)
Write to /tmp/hume_request.json:
{
"utterances": [
{
"text": "Hello, how are you today?",
"description": "A warm and friendly greeting"
}
],
"format": {
"type": "mp3"
}
}
Then run:
curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .
The response contains base64-encoded audio in .generations[].audio.
Synthesize Speech with Specific Voice
Write to /tmp/hume_request.json:
{
"utterances": [
{
"text": "Welcome to our platform!",
"speed": 1.0
}
],
"voice": {
"name": "{voice-name}"
},
"format": {
"type": "mp3"
}
}
Then run:
curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .
Synthesize and Save Audio File
Write to /tmp/hume_request.json:
{
"utterances": [
{
"text": "This is a test of text-to-speech synthesis."
}
],
"format": {
"type": "mp3"
}
}
Then extract and decode the audio:
curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq -r '.generations[0].audio' | base64 -d > /tmp/hume_output.mp3
Synthesize Multiple Utterances
Write to /tmp/hume_request.json:
{
"utterances": [
{
"text": "First sentence.",
"description": "calm and measured"
},
{
"text": "Second sentence!",
"description": "excited and energetic",
"trailing_silence": 0.5
}
],
"format": {
"type": "mp3"
}
}
Then run:
curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .
EVI Configs
List Configs
curl -s "https://api.hume.ai/v0/evi/configs?page_size=20" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
Create Config
Write to /tmp/hume_request.json:
{
"name": "My EVI Config"
}
Then run:
curl -s -X POST "https://api.hume.ai/v0/evi/configs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .
EVI Prompts
List Prompts
curl -s "https://api.hume.ai/v0/evi/prompts?page_size=20" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
Create Prompt
Write to /tmp/hume_request.json:
{
"name": "Customer Support Agent",
"text": "You are a friendly and empathetic customer support agent. Listen carefully and help resolve issues."
}
Then run:
curl -s -X POST "https://api.hume.ai/v0/evi/prompts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .
EVI Chats
List Chat Events
curl -s "https://api.hume.ai/v0/evi/chats/{chat-id}?page_size=50&ascending_order=true" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
Available Models for Expression Measurement
| Model | Description |
|-------|-------------|
| face | Facial expression analysis from images/video |
| prosody | Vocal expression analysis from audio |
| language | Emotion analysis from text |
| ner | Named entity recognition from text |
| burst | Non-speech vocal sounds (laughter, sighs) |
| facemesh | Detailed facial landmark detection |
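Since a batch request should name only the models it needs, it can help to generate the "models" object from a short list of names. A minimal sketch, assuming each model is requested with an empty configuration object as in the examples above; models_json is a hypothetical helper and does no escaping, so it is only suitable for the fixed model names in this table:

```shell
# Hypothetical helper: print a "models" JSON object with one empty entry per name.
# Names are inserted verbatim, so pass only known model names (face, prosody, ...).
models_json() {
  out=""
  for model in "$@"; do
    # Prefix a comma for every entry after the first.
    out="${out:+$out,}\"$model\":{}"
  done
  printf '{%s}\n' "$out"
}
```

For example, models_json face prosody prints {"face":{},"prosody":{}}, which can be used as the value of "models" in /tmp/hume_request.json.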
TTS Utterance Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| text | string | Content to convert to speech (required) |
| description | string | Acting directions or voice style prompt |
| speed | number | Speech rate multiplier (default: 1.0) |
| trailing_silence | number | Silence after utterance in seconds (default: 0) |
Response Codes
| Status | Description |
|--------|-------------|
| 200 | Success |
| 201 | Resource created |
| 400 | Invalid request parameters |
| 401 | Missing or invalid API key |
| 404 | Resource not found |
| 429 | Rate limit exceeded |
| 5xx | Server error |
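The commands in this document pipe straight into jq, which hides the status code. The sketch below shows one way to capture the code and decide whether a retry makes sense. Treating 429 and 5xx as retryable is a common client convention and an assumption here, not something the table prescribes; both helper names are hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) when a status code is worth retrying.
is_retryable_status() {
  case "$1" in
    429|5[0-9][0-9]) return 0 ;;  # rate limited or server error
    *) return 1 ;;                # success or a client error that a retry will not fix
  esac
}

# Sends a GET request to URL $1, saves the body to file $2,
# and prints only the HTTP status code.
hume_get_status() {
  curl -s -o "$2" -w '%{http_code}' "$1" \
    --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)"
}
```

A caller could run code=$(hume_get_status "https://api.hume.ai/v0/batch/jobs?limit=10" /tmp/hume_response.json) and then wait and retry only when is_retryable_status "$code" succeeds.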
Guidelines
- Authentication: Use the X-Hume-Api-Key header (not a Bearer token) for all requests
- Batch Jobs: Jobs are asynchronous; poll the job details endpoint until status is COMPLETED
- Models: Specify only the models you need in batch requests to reduce processing time
- TTS Description: Use the description field to control emotional tone and style of generated speech
- Pagination: EVI endpoints use zero-based page numbering with configurable page size (1-100)
- API Key Types: Organization API Key for TTS and EVI; Personal API Key for Expression Measurement
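The pagination guideline can be sketched as a small URL builder for the EVI list endpoints. It assumes the page index is passed as a page_number query parameter alongside the page_size parameter used earlier in this document; evi_page_url is a hypothetical helper name.

```shell
# Hypothetical helper: print the URL for one page of an EVI list endpoint.
# Assumption: the zero-based page index is sent as the page_number parameter.
# Usage: evi_page_url <resource> <page-number> [page-size]
evi_page_url() {
  resource=$1
  page_number=$2
  page_size=${3:-20}
  # Enforce the documented page size range of 1-100.
  if [ "$page_size" -lt 1 ] || [ "$page_size" -gt 100 ]; then
    echo "page_size must be between 1 and 100" >&2
    return 1
  fi
  echo "https://api.hume.ai/v0/evi/${resource}?page_number=${page_number}&page_size=${page_size}"
}

# Example: fetch the first page (page 0) of configs, 50 per page.
# curl -s "$(evi_page_url configs 0 50)" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
```

Because numbering is zero-based, the first page is page 0 and the second page is page 1.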
API Reference
- Documentation: https://dev.hume.ai/reference
- Portal: https://app.hume.ai
- API Keys: https://app.hume.ai/keys