Web Fetch
Overview
Extract clean, readable content from a URL using the Jina Reader API. The script returns the raw JSON response, which includes the page title, content, and metadata in a form suited to LLM consumption.
When to Use
- User wants to read or analyze webpage content
- Need to extract article text from a URL
- Fetching documentation or reference pages
- Converting web pages to clean text for processing
Workflow
- Identify the URL from user request
- Validate URL format
- Run the fetch script
- Present extracted content to user
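The first two workflow steps can be sketched in a few lines of Python. This is an illustration only: `is_valid_url` and `build_reader_url` are hypothetical helpers, and the assumption that the script prepends the `https://r.jina.ai/` Reader prefix to the target URL is not confirmed by this document.

```python
from urllib.parse import urlparse

# Assumed Jina Reader endpoint prefix; the actual script may build requests differently.
JINA_READER_PREFIX = "https://r.jina.ai/"


def is_valid_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a non-empty host."""
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse raises ValueError for some malformed inputs (e.g. bad IPv6 hosts).
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_reader_url(url: str) -> str:
    """Validate url, then prefix it with the assumed Reader endpoint."""
    if not is_valid_url(url):
        raise ValueError(f"Invalid URL: {url!r}")
    return JINA_READER_PREFIX + url


print(build_reader_url("https://example.com"))
```

Rejecting URLs before any network call keeps the "Invalid URL" failure cheap and separate from timeouts and HTTP errors.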
Usage
# Basic fetch
uv run --script scripts/web_fetch.py --url "https://example.com"
# With custom timeout
uv run --script scripts/web_fetch.py \
--url "https://example.com/article" \
--timeout 60
Parameters
| Parameter | Default | Description |
| ----------- | ---------- | ------------------------------------- |
| --url | (required) | URL to fetch and extract content from |
| --timeout | 30 | Request timeout in seconds |
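For readers who want to see how such a parameter table typically maps to code, here is a minimal `argparse` sketch. It mirrors the two documented flags and their defaults; `build_parser` is a hypothetical name, and treating the timeout as an integer is an assumption, since the actual parsing code in `scripts/web_fetch.py` is not shown here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented --url and --timeout flags."""
    parser = argparse.ArgumentParser(
        description="Fetch and extract content from a URL."
    )
    parser.add_argument(
        "--url", required=True, help="URL to fetch and extract content from"
    )
    # Assumption: the timeout is parsed as whole seconds.
    parser.add_argument(
        "--timeout", type=int, default=30, help="Request timeout in seconds"
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--url", "https://example.com"])
print(args.url, args.timeout)
```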
Output Contract
| Scenario | stdout | stderr | exit code |
| ----------- | ------------------ | ------------------ | --------- |
| Success | Raw JSON from Jina | (empty) | 0 |
| Invalid URL | (empty) | Error message | 1 |
| Timeout | (empty) | Timeout error | 1 |
| HTTP Error | (empty) | HTTP error details | 1 |
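A caller can rely on this contract by checking the exit code first and only then parsing stdout. The wrapper below is a sketch under that assumption; `run_web_fetch` and its `command` parameter are hypothetical names introduced for illustration, and the default command simply repeats the invocation from the Usage section.

```python
import json
import subprocess


def run_web_fetch(
    url: str,
    timeout: int = 30,
    command: tuple = ("uv", "run", "--script", "scripts/web_fetch.py"),
) -> dict:
    """Run the fetch script and return its parsed JSON output.

    Raises RuntimeError carrying the stderr text when the exit code is non-zero.
    """
    result = subprocess.run(
        [*command, "--url", url, "--timeout", str(timeout)],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Per the contract, stdout is empty on failure and stderr holds the message.
        message = result.stderr.strip() or f"exit code {result.returncode}"
        raise RuntimeError(f"web_fetch failed: {message}")
    return json.loads(result.stdout)
```

Because every failure shares exit code 1, the caller has to read the stderr text to tell an invalid URL from a timeout or an HTTP error.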
Success output contains:
- Page title and description
- Clean extracted content (markdown-formatted)
- URL and metadata
- Token usage information
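To present these fields to a user, the raw JSON has to be unpacked. The sketch below assumes the response wraps its payload in a `data` object with `title`, `description`, `url`, `content`, and `usage.tokens` keys. Those key names are assumptions about the Jina response shape and should be checked against real output; `summarize_reader_response` is a hypothetical helper, and the sample payload is invented.

```python
import json


def summarize_reader_response(raw: str) -> dict:
    """Pull the commonly needed fields out of a raw JSON response string.

    Missing keys come back as None (or "" for content) instead of raising.
    """
    payload = json.loads(raw)
    # Assumed layout: fields live under "data"; fall back to the top level if not.
    data = payload.get("data") or payload
    usage = data.get("usage") or {}
    return {
        "title": data.get("title"),
        "description": data.get("description"),
        "url": data.get("url"),
        "content": data.get("content", ""),
        "tokens": usage.get("tokens"),
    }


# Hypothetical sample payload, for illustration only.
sample = json.dumps(
    {
        "data": {
            "title": "Example Domain",
            "description": "",
            "url": "https://example.com",
            "content": "# Example Domain\n\nIllustrative text.",
            "usage": {"tokens": 12},
        }
    }
)
summary = summarize_reader_response(sample)
print(summary["title"], summary["tokens"])
```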
Prerequisites
- Uses the Jina Reader API (no API key required)
- Requires uv for running PEP 723 scripts
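For context, a PEP 723 script declares its Python version and dependencies in a comment block at the top of the file, which uv reads before running it. The file below is a generic sketch of that format, not the contents of `scripts/web_fetch.py`; the empty dependency list and the `describe_runtime` function are placeholders.

```python
# /// script
# requires-python = ">=3.11"
# dependencies = []
# ///
# The comment block above is PEP 723 inline metadata; a real script would
# list its third-party packages in "dependencies".
import platform


def describe_runtime() -> str:
    """Report the interpreter version that was selected to run this script."""
    return f"Running on Python {platform.python_version()}"


print(describe_runtime())
```

Saved as a standalone file, a script like this can be started with `uv run --script <file>`, which resolves the declared requirements in an isolated environment first.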
Examples
Fetch a webpage
uv run --script scripts/web_fetch.py \
--url "https://docs.python.org/3/whatsnew/3.12.html"
Fetch with longer timeout for slow pages
uv run --script scripts/web_fetch.py \
--url "https://example.com/large-article" \
--timeout 60