Agent Skills: Storage Format Guide

SQLite file format, B-trees, pages, cells, overflow, freelist that is used in tursodb

UncategorizedID: tursodatabase/turso/storage-format

Install this agent skill to your local

pnpm dlx add-skill https://github.com/tursodatabase/turso/tree/HEAD/.claude/skills/storage-format

Skill Files

Browse the full folder contents for storage-format.

Download Skill

Loading file tree…

.claude/skills/storage-format/SKILL.md

Skill Metadata

Name
storage-format
Description
SQLite file format, B-trees, pages, cells, overflow, freelist that is used in tursodb

Storage Format Guide

Database File Structure

┌─────────────────────────────┐
│ Page 1: Header + Schema     │  ← First 100 bytes = DB header
├─────────────────────────────┤
│ Page 2..N: B-tree pages     │  ← Tables and indexes
│            Overflow pages   │
│            Freelist pages   │
└─────────────────────────────┘

Page size: power of 2, 512-65536 bytes. Default 4096.

Database Header (First 100 Bytes)

| Offset | Size | Field | |--------|------|-------| | 0 | 16 | Magic: "SQLite format 3\0" | | 16 | 2 | Page size (big-endian) | | 18 | 1 | Write format version (1=rollback, 2=WAL) | | 19 | 1 | Read format version | | 24 | 4 | Change counter | | 28 | 4 | Database size in pages | | 32 | 4 | First freelist trunk page | | 36 | 4 | Total freelist pages | | 40 | 4 | Schema cookie | | 56 | 4 | Text encoding (1=UTF8, 2=UTF16LE, 3=UTF16BE) |

All multi-byte integers: big-endian.

Page Types

| Flag | Type | Purpose | |------|------|---------| | 0x02 | Interior index | Index B-tree internal node | | 0x05 | Interior table | Table B-tree internal node | | 0x0a | Leaf index | Index B-tree leaf | | 0x0d | Leaf table | Table B-tree leaf | | - | Overflow | Payload exceeding cell capacity | | - | Freelist | Unused pages (trunk or leaf) |

B-tree Structure

Two B-tree types:

  • Table B-tree: 64-bit rowid keys, stores row data
  • Index B-tree: Arbitrary keys (index columns + rowid)
Interior page:  [ptr0] key1 [ptr1] key2 [ptr2] ...
                   │         │         │
                   ▼         ▼         ▼
               child     child     child
               pages     pages     pages

Leaf page:     key1:data  key2:data  key3:data ...

Page 1 always root of sqlite_schema table.

Cell Format

Table Leaf Cell

[payload_size: varint] [rowid: varint] [payload] [overflow_ptr: u32?]

Table Interior Cell

[left_child_page: u32] [rowid: varint]

Index Cells

Similar but key is arbitrary (columns + rowid), not just rowid.

Record Format (Payload)

[header_size: varint] [type1: varint] [type2: varint] ... [data1] [data2] ...

Serial types: | Type | Meaning | |------|---------| | 0 | NULL | | 1-4 | 1/2/3/4 byte signed int | | 5 | 6 byte signed int | | 6 | 8 byte signed int | | 7 | IEEE 754 float | | 8 | Integer 0 | | 9 | Integer 1 | | ≥12 even | BLOB, length=(N-12)/2 | | ≥13 odd | Text, length=(N-13)/2 |

Overflow Pages

When payload exceeds threshold, excess stored in overflow chain:

[next_page: u32] [data...]

Last page has next_page=0.

Freelist

Linked list of trunk pages, each containing leaf page numbers:

Trunk: [next_trunk: u32] [leaf_count: u32] [leaf_pages: u32...]

Turso Implementation

Key files:

  • core/storage/sqlite3_ondisk.rs - On-disk format, PageType enum
  • core/storage/btree.rs - B-tree operations (large file)
  • core/storage/pager.rs - Page management
  • core/storage/buffer_pool.rs - Page caching

Debugging Storage

# Integrity check
cargo run --bin tursodb test.db "PRAGMA integrity_check;"

# Page count
cargo run --bin tursodb test.db "PRAGMA page_count;"

# Freelist info
cargo run --bin tursodb test.db "PRAGMA freelist_count;"

References