Hugging Face API
Use the Hugging Face API via direct curl calls to search models and datasets, run serverless inference, and manage Hub repositories.
Official docs: https://huggingface.co/docs/hub/en/api
OpenAPI spec: https://huggingface.co/.well-known/openapi.json
When to Use
Use this skill when you need to:
- Search and discover models, datasets, and spaces on the Hugging Face Hub
- Run serverless inference (text generation, image generation, embeddings, etc.)
- Get model or dataset metadata (tags, downloads, likes, card info)
- Manage repositories (create, delete, list files)
- Verify account access with whoami
Prerequisites
- Sign up at Hugging Face
- Go to Settings > Access Tokens and create a new token
- Select appropriate permissions (read access for browsing, write for repo management)
Then export the token in your shell:
export HUGGING_FACE_TOKEN="hf_..."
Rate Limits
All API calls are subject to Hugging Face rate limits. Authenticated requests have higher limits than anonymous ones. Upgrade to a Pro or Enterprise account for elevated access.
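When a request does hit the limit, the usual remedy is to back off and retry. As a minimal sketch (the hf_get_with_retry helper and the assumption that throttled requests answer with HTTP 429 are mine, not from the official docs):

```shell
# Sketch: retry a GET with exponential backoff when the API answers HTTP 429.
hf_get_with_retry() {
  local url="$1" attempt status
  for attempt in 1 2 3; do
    status=$(curl -s -o /tmp/hf_body.json -w '%{http_code}' \
      --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" "$url")
    if [ "$status" != "429" ]; then
      cat /tmp/hf_body.json   # success (or a non-rate-limit error): pass the body through
      return 0
    fi
    sleep $((2 ** attempt))   # wait 2s, 4s, 8s between attempts
  done
  echo "still rate limited after 3 attempts" >&2
  return 1
}

# Usage:
#   hf_get_with_retry "https://huggingface.co/api/models?search=llama&limit=5" | jq '.[].id'
```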
How to Use
All examples below assume you have HUGGING_FACE_TOKEN set.
The base URLs are:
- Hub API: https://huggingface.co/api
- Inference API: https://router.huggingface.co
1. Verify Account (whoami)
Check your token and account information:
curl -s "https://huggingface.co/api/whoami-v2" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '{name: .name, email: .email, type: .type}'
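A failed whoami (e.g. an invalid or expired token) typically comes back as a JSON body with an error field rather than account data. As a sketch for failing fast on that case (the whoami_name helper is hypothetical):

```shell
# Sketch: print the account name, or fail loudly if the response carries
# an "error" field (as it does for an invalid token).
whoami_name() {
  local resp="$1"
  if echo "$resp" | jq -e '.error' >/dev/null 2>&1; then
    echo "auth failed: $(echo "$resp" | jq -r '.error')" >&2
    return 1
  fi
  echo "$resp" | jq -r '.name'
}

# Usage:
#   whoami_name "$(curl -s "https://huggingface.co/api/whoami-v2" \
#     --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)")"
```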
2. Search Models
Search for models with filters:
curl -s "https://huggingface.co/api/models?search=llama&sort=downloads&direction=-1&limit=5" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.[].id'
Filter by pipeline task:
curl -s "https://huggingface.co/api/models?pipeline_tag=text-generation&sort=trending&limit=5" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.[].id'
Common query parameters:
- search - Search term
- pipeline_tag - Filter by task (text-generation, text-to-image, fill-mask, etc.)
- sort - Sort by: downloads, likes, trending, created_at, lastModified
- direction - Sort direction: -1 (descending), 1 (ascending)
- limit - Number of results (default 30)
- author - Filter by author/organization (e.g. meta-llama)
- filter - Filter by tags (e.g. pytorch, en)
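These parameters combine freely in one query string. As an illustrative sketch (the hf_models_url helper is hypothetical, and it assumes values are already URL-encoded), a tiny builder keeps long queries readable:

```shell
# Sketch: assemble a Hub model-search URL from key=value pairs.
hf_models_url() {
  local base="https://huggingface.co/api/models"
  local query
  query=$(IFS='&'; printf '%s' "$*")   # join the arguments with '&'
  printf '%s?%s\n' "$base" "$query"
}

# Example: top 3 meta-llama text-generation models by downloads.
hf_models_url "author=meta-llama" "pipeline_tag=text-generation" "sort=downloads" "direction=-1" "limit=3"
# → https://huggingface.co/api/models?author=meta-llama&pipeline_tag=text-generation&sort=downloads&direction=-1&limit=3
```

The result can be passed straight to curl with the usual Authorization header.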
3. Get Model Details
Get detailed information about a specific model:
curl -s "https://huggingface.co/api/models/meta-llama/Llama-3.1-8B-Instruct" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '{id, downloads, likes, pipeline_tag, tags: .tags[:5]}'
4. Search Datasets
Search for datasets:
curl -s "https://huggingface.co/api/datasets?search=squad&sort=downloads&direction=-1&limit=5" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.[].id'
5. Get Dataset Details
Get detailed information about a specific dataset:
curl -s "https://huggingface.co/api/datasets/squad" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '{id, downloads, likes, tags: .tags[:5]}'
6. Search Spaces
Search for Spaces:
curl -s "https://huggingface.co/api/spaces?search=chatbot&sort=likes&direction=-1&limit=5" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.[].id'
7. List Repository Files
List files in a model repository:
curl -s "https://huggingface.co/api/models/meta-llama/Llama-3.1-8B-Instruct/tree/main" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.[] | {path, size}'
For datasets, replace models with datasets:
curl -s "https://huggingface.co/api/datasets/squad/tree/main" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.[] | {path, size}'
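The tree listing also makes it easy to estimate a repository's size before downloading anything. A minimal sketch (the total_size helper is mine; it assumes each entry reports its size in bytes, treating missing values as 0):

```shell
# Sketch: sum the reported file sizes (bytes) from a tree listing.
total_size() { jq '[.[].size // 0] | add'; }

# Usage:
#   curl -s "https://huggingface.co/api/models/meta-llama/Llama-3.1-8B-Instruct/tree/main" \
#     --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | total_size
```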
8. Run Serverless Inference (Text Generation)
Run text generation using the Inference API with an OpenAI-compatible endpoint:
Write to /tmp/hugging_face_request.json:
{
"model": "meta-llama/Llama-3.1-8B-Instruct",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
],
"max_tokens": 100
}
Then run:
curl -s "https://router.huggingface.co/hf-inference/v1/chat/completions" --header "Content-Type: application/json" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" -d @/tmp/hugging_face_request.json | jq -r '.choices[0].message.content'
9. Run Serverless Inference (Text-to-Image)
Generate an image from text:
curl -s "https://router.huggingface.co/hf-inference/models/black-forest-labs/FLUX.1-schnell" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" --header "Content-Type: application/json" -d '{"inputs": "A cute cat wearing sunglasses"}' --output /tmp/hugging_face_image.png
The response is the raw image binary saved to the output file.
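Because a failed request writes a JSON error body to the same output file, it is worth checking the file before treating it as an image. A small sketch (the is_png helper is hypothetical; it assumes the model emits PNG, whose signature contains the bytes "PNG"):

```shell
# Sketch: verify the saved file starts with the PNG signature.
is_png() { head -c 8 "$1" | grep -q 'PNG'; }

# Usage after the curl above:
#   if is_png /tmp/hugging_face_image.png; then
#     echo "image saved"
#   else
#     cat /tmp/hugging_face_image.png   # show the JSON error body
#   fi
```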
10. Run Serverless Inference (Embeddings)
Generate text embeddings:
Write to /tmp/hugging_face_request.json:
{
"inputs": "Hello, how are you?"
}
Then run:
curl -s "https://router.huggingface.co/hf-inference/models/sentence-transformers/all-MiniLM-L6-v2" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" --header "Content-Type: application/json" -d @/tmp/hugging_face_request.json | jq '.[0][:5]'
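Embeddings are typically compared with cosine similarity. As an offline sketch (the cosine helper is mine; it takes two JSON arrays, such as vectors extracted from two responses above with jq -c '.[0]'):

```shell
# Sketch: cosine similarity of two embedding vectors, entirely in jq.
cosine() {
  jq -n --argjson a "$1" --argjson b "$2" '
    def dot(x; y): [x, y] | transpose | map(.[0] * .[1]) | add;
    dot($a; $b) / ((dot($a; $a) | sqrt) * (dot($b; $b) | sqrt))'
}

cosine '[3,4]' '[3,4]'   # identical vectors → 1
cosine '[1,0]' '[0,1]'   # orthogonal vectors → 0
```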
11. Run Serverless Inference (Text Classification)
Classify text using sentiment analysis or other classification models:
Write to /tmp/hugging_face_request.json:
{
"inputs": "I love using Hugging Face!"
}
Then run:
curl -s "https://router.huggingface.co/hf-inference/models/distilbert-base-uncased-finetuned-sst-2-english" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" --header "Content-Type: application/json" -d @/tmp/hugging_face_request.json | jq .
12. List Models with Inference Provider Support
Find models available for serverless inference:
curl -s "https://huggingface.co/api/models?inference_provider=all&pipeline_tag=text-generation&sort=trending&limit=10" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.[].id'
Filter by a specific provider:
curl -s "https://huggingface.co/api/models?inference_provider=hf-inference&pipeline_tag=text-to-image&limit=5" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.[].id'
13. Get Model Inference Providers
Check which inference providers serve a specific model:
curl -s "https://huggingface.co/api/models/meta-llama/Llama-3.1-8B-Instruct?expand[]=inferenceProviderMapping" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" | jq '.inferenceProviderMapping'
14. Create a Repository
Create a new model repository:
Write to /tmp/hugging_face_request.json:
{
"name": "my-new-model",
"type": "model",
"private": true
}
Then run:
curl -s -X POST "https://huggingface.co/api/repos/create" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" --header "Content-Type: application/json" -d @/tmp/hugging_face_request.json | jq .
Repository types: model, dataset, space
15. Delete a Repository
Delete a repository (requires write token):
Write to /tmp/hugging_face_request.json:
{
"name": "my-new-model",
"type": "model"
}
Then run:
curl -s -X DELETE "https://huggingface.co/api/repos/delete" --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" --header "Content-Type: application/json" -d @/tmp/hugging_face_request.json | jq .
Guidelines
- Use Bearer authentication: Pass the token via the Authorization: Bearer $HUGGING_FACE_TOKEN header
- Prefer serverless inference for quick tasks: Use the Inference API for prototyping; deploy Inference Endpoints for production
- Check model availability: Not all models support serverless inference; use the inference_provider filter to find available models
- Use the OpenAI-compatible chat endpoint for text generation: https://router.huggingface.co/hf-inference/v1/chat/completions
- Complex JSON payloads: Write JSON to a temp file and use -d @/tmp/hugging_face_request.json to avoid shell quoting issues
- Respect rate limits: Authenticated requests have higher rate limits; consider a Pro account for heavy usage
- Model IDs use org/name format: Always specify the full model ID (e.g. meta-llama/Llama-3.1-8B-Instruct)
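Several of these guidelines can be folded into one small wrapper. A sketch (the hf_api function is hypothetical, covering only the Hub API base URL):

```shell
# Sketch: Hub API wrapper that prefixes the base URL and attaches the
# Bearer header; extra curl flags pass through unchanged.
hf_api() {
  local path="$1"; shift
  curl -s "https://huggingface.co/api${path}" \
    --header "Authorization: Bearer $(printenv HUGGING_FACE_TOKEN)" "$@"
}

# Usage:
#   hf_api "/whoami-v2" | jq .name
#   hf_api "/models?search=llama&limit=3" | jq '.[].id'
#   hf_api "/repos/create" -X POST --header "Content-Type: application/json" \
#     -d @/tmp/hugging_face_request.json
```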