TTS LiveKit Plugin Skill Skill

TTS LiveKit Plugin Skill

This skill provides a complete solution for building self-hosted Text-to-Speech (TTS) systems integrated with LiveKit voice agents.

What This Skill Does

Creates a Self-Hosted TTS API Server
- FastAPI-based REST API
- Uses MeloTTS model from Hugging Face
- Supports streaming audio responses
- Multi-language and multi-voice support
- Production-ready with proper error handling
Builds a LiveKit TTS Plugin
- Fully compatible with LiveKit agents framework
- Implements standard TTS interface
- Streaming audio support
- Proper error handling and retries
- Drop-in replacement for commercial TTS providers
Provides Complete Testing
- Comprehensive test suite for API
- Integration tests for plugin
- No mocked functions - all real implementations
- Performance and concurrency tests
Includes Full Documentation
- API documentation with examples
- Plugin usage guide
- Deployment guide for production
- Multiple usage examples

Components

API Server (`api/`)

server.py: FastAPI server with MeloTTS integration
requirements.txt: Python dependencies
Endpoints:
- GET /: Health check
- GET /voices: List available voices
- POST /synthesize: Full audio synthesis
- POST /synthesize/stream: Streaming synthesis

LiveKit Plugin (`plugin/`)

melotts_plugin.py: LiveKit TTS plugin implementation
Extends livekit.agents.tts.TTS base class
Implements ChunkedStream for audio streaming
Uses aiohttp for HTTP requests
Proper exception handling (APIConnectionError, APITimeoutError, APIStatusError)

Tests (`tests/`)

test_api.py: API server tests
- Health checks
- Voice listing
- Synthesis (streaming and non-streaming)
- Multiple languages
- Error handling
- Concurrency
test_plugin.py: Plugin integration tests
- Plugin initialization
- Synthesis with real API
- Multiple languages
- Error handling
- Concurrency
- Timeout handling

Examples (`examples/`)

test_api_client.py: Standalone API testing script
simple_agent.py: Basic LiveKit agent example
voice_assistant.py: Complete voice assistant implementation

Documentation (`docs/`)

API.md: Complete API reference
PLUGIN.md: Plugin usage guide
DEPLOYMENT.md: Production deployment guide

Quick Start

1. Start the TTS API Server

cd api
pip install -r requirements.txt
python -m unidic download
python server.py

Server runs on http://localhost:8000

2. Test the API

cd examples
python test_api_client.py

3. Use in LiveKit Agent

from melotts_plugin import TTS

tts = TTS(
    api_base_url="http://localhost:8000",
    language="EN",
    speaker="EN-US",
    speed=1.0
)

stream = tts.synthesize("Hello from LiveKit!")

Features

✅ Self-hosted (no external API dependencies)
✅ High-quality natural speech (MeloTTS)
✅ 6 languages: English, Spanish, French, Chinese, Japanese, Korean
✅ Multiple voices per language
✅ Streaming audio for low latency
✅ CPU-friendly (optimized for real-time inference)
✅ GPU support (automatic if available)
✅ LiveKit agents framework compatible
✅ Production-ready error handling
✅ Comprehensive test coverage
✅ Full documentation

Architecture

┌─────────────────┐      HTTP POST       ┌──────────────────┐
│  LiveKit Agent  │ ──────────────────►  │   TTS API        │
│                 │                       │   Server         │
│  ┌───────────┐  │                       │                  │
│  │ MeloTTS   │  │   Audio Stream       │  ┌────────────┐  │
│  │ Plugin    │  │ ◄──────────────────  │  │  MeloTTS   │  │
│  └───────────┘  │    (WAV chunks)      │  │  Model     │  │
└─────────────────┘                       │  └────────────┘  │
                                          └──────────────────┘

Why MeloTTS?

High Quality: Natural-sounding speech
Fast: Optimized for real-time inference
CPU-Friendly: Works well even without GPU
Multi-lingual: 6 languages supported
Low Latency: ~150-200ms TTFB
Open Source: Free to use and modify

Performance

Latency: 150-200ms time-to-first-byte
CPU Usage: Optimized for real-time on CPUs
GPU Support: Automatic acceleration if available
Streaming: Chunked delivery for low latency
Concurrent Requests: Handles multiple simultaneous requests

Supported Languages

| Language | Code | Speakers | |----------|------|----------| | English | EN | EN-US, EN-BR, EN-AU, EN-IN | | Spanish | ES | ES | | French | FR | FR | | Chinese | ZH | ZH | | Japanese | JP | JP | | Korean | KR | KR |

Testing

All tests use real implementations - no mocks:

# Start API server
cd api && python server.py

# Run API tests
cd tests && pytest test_api.py -v

# Run plugin tests
cd tests && pytest test_plugin.py -v

Deployment

Multiple deployment options:

Standalone: Run directly with Python/Uvicorn
Docker: Containerized deployment
Kubernetes: Scalable cloud deployment
Cloud: AWS, GCP, Azure support

See docs/DEPLOYMENT.md for detailed guides.

Integration with LiveKit

The plugin is a drop-in replacement for other TTS providers:

# Instead of:
# from livekit.plugins import openai
# tts = openai.TTS()

# Use:
from melotts_plugin import TTS
tts = TTS(api_base_url="http://localhost:8000")

# Same interface, self-hosted!

Use Cases

Voice assistants
Interactive voice response (IVR) systems
Accessibility tools
Educational applications
Multilingual customer service bots
Real-time voice agents
Live streaming with voice synthesis

Requirements

API Server:

Python 3.9+
2GB+ RAM
FastAPI, MeloTTS, Uvicorn
Optional: GPU for faster inference

LiveKit Plugin:

Python 3.9+
livekit-agents >= 0.8.0
aiohttp >= 3.9.0

Security

For production:

Add API authentication
Enable HTTPS/TLS
Implement rate limiting
Configure CORS
Set up monitoring

See docs/DEPLOYMENT.md#security for details.

When to Use This Skill

Use this skill when you need to:

Build a self-hosted TTS solution
Create LiveKit voice agents with custom TTS
Avoid commercial TTS API costs
Have full control over voice synthesis
Support multiple languages
Deploy TTS in private/air-gapped environments
Build voice assistants
Integrate TTS into existing applications

Troubleshooting

Server won't start:

Run python -m unidic download
Check port 8000 is available
Verify dependencies installed

Plugin connection errors:

Ensure API server is running
Check api_base_url configuration
Verify network connectivity

Audio quality issues:

Try different voices/speakers
Adjust speed parameter
Check sample rate configuration

See documentation for more troubleshooting tips.

Resources

License

Apache 2.0 License

Support

Check the documentation in docs/
Review examples in examples/
Run the test suite to verify setup
Check logs for error messages

This skill provides everything needed for production-ready, self-hosted TTS with LiveKit integration. All code is fully functional with no mocks or placeholders.

Agent Skills: TTS LiveKit Plugin Skill

Install this agent skill to your local

Skill Files