dot-skills Graph Database Schema Design Best Practices
Comprehensive graph database data modeling guide for property graphs (Neo4j, Memgraph, Amazon Neptune, etc.). Contains 46 rules across 8 categories, prioritized by modeling impact from critical (entity classification, relationship design) to incremental (scale and evolution). Each rule includes detailed explanations, real-world Cypher examples comparing incorrect vs. correct models, and specific impact descriptions.
Philosophy: Data modeling correctness first, performance second. Always ask "what is the user trying to achieve?" before choosing structure.
When to Apply
Reference these guidelines when:
- Designing a new graph database schema from domain requirements
- Translating a relational schema to a graph model
- Deciding whether something should be a node, relationship, or property
- Reviewing an existing graph schema for modeling errors
- Refactoring a graph that produces awkward or slow queries
- Planning for schema evolution and data growth
Rule Categories by Priority
| Priority | Category | Impact | Prefix |
|----------|----------|--------|--------|
| 1 | Entity Classification | CRITICAL | entity- |
| 2 | Relationship Design | CRITICAL | rel- |
| 3 | Property Placement | HIGH | prop- |
| 4 | Query-Driven Refinement | HIGH | query- |
| 5 | Structural Patterns | HIGH | pattern- |
| 6 | Anti-Patterns | MEDIUM | anti- |
| 7 | Constraints & Integrity | MEDIUM | constraint- |
| 8 | Scale & Evolution | LOW-MEDIUM | scale- |
Quick Reference
1. Entity Classification (CRITICAL)
- entity-events - Model multi-participant events as first-class nodes
- entity-shared-values - Promote shared property values to nodes
- entity-specific-labels - Use specific labels over generic ones
- entity-multi-label - Qualify entities with multiple labels
- entity-identity-state - Separate identity from mutable state
- entity-reify-actions - Reify lifecycle actions into nodes
- entity-avoid-god-nodes - Avoid kitchen-sink entity nodes
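As a rough illustration of entity-events, the sketch below contrasts a single relationship with an event node. The labels (Customer, Product, Store, Purchase), relationship types, and property names are hypothetical and chosen only for this example; the individual rule files remain the authoritative source.

```cypher
// Hypothetical labels and properties, for illustration only.

// Before: one relationship connects only two participants, so the store
// ends up as a string property that cannot be traversed.
// CREATE (c)-[:PURCHASED {store: 'Downtown', at: date('2024-01-15')}]->(p)

// After: the purchase is a node that every participant connects to.
MERGE (c:Customer {customerId: 'c-1'})
MERGE (p:Product {sku: 'sku-42'})
MERGE (s:Store {name: 'Downtown'})
CREATE (e:Purchase {purchasedAt: date('2024-01-15')})
CREATE (c)-[:MADE]->(e)
CREATE (e)-[:OF_PRODUCT]->(p)
CREATE (e)-[:AT_STORE]->(s)
```

With the event as a node, a query such as "which stores sold this product last month" becomes a traversal through Purchase instead of a string comparison on a relationship property.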
2. Relationship Design (CRITICAL)
- rel-specific-types - Use specific relationship types over generic ones
- rel-meaningful-direction - Choose semantically meaningful direction
- rel-naming-conventions - Follow UPPER_SNAKE_CASE for relationship types
- rel-no-redundant-reverse - Don't create redundant reverse relationships
- rel-properties-scope - Put data on relationships only when it describes the connection
- rel-single-semantic - One relationship type per semantic meaning
- rel-typed-over-filtered - Prefer typed relationships over generic + property filter
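A minimal sketch of rel-typed-over-filtered, assuming a hypothetical Person label with a name property and a MANAGES relationship; none of these names come from the rule files.

```cypher
// Hypothetical schema, for illustration only.

// Generic type plus property filter: every RELATED_TO relationship on the
// start node has to be read before the filter can discard it.
MATCH (a:Person {name: 'Ada'})-[r:RELATED_TO]->(b:Person)
WHERE r.kind = 'MANAGES'
RETURN b.name;

// Typed relationship: the type itself selects which relationships to follow.
MATCH (a:Person {name: 'Ada'})-[:MANAGES]->(b:Person)
RETURN b.name;
```

The second form also follows rel-naming-conventions (UPPER_SNAKE_CASE) and keeps one type per meaning, as rel-single-semantic suggests.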
3. Property Placement (HIGH)
- prop-no-foreign-keys - Don't embed foreign keys as properties
- prop-promote-to-node - Promote frequently-queried values to nodes
- prop-correct-data-types - Use appropriate data types for properties
- prop-no-arrays-for-connections - Don't use property arrays when you need relationships
- prop-relationship-vs-node-data - Know when data belongs on relationship vs. node
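One way prop-no-foreign-keys might look in practice is sketched below. It assumes a hypothetical import where Order nodes carry a customerId property copied from a relational table; the labels, the PLACED type, and the property names are illustrative assumptions.

```cypher
// Hypothetical schema, for illustration only.

// One-off refactor: turn the embedded key into a relationship,
// then remove the now-redundant property.
MATCH (o:Order)
WHERE o.customerId IS NOT NULL
MATCH (c:Customer {customerId: o.customerId})
MERGE (c)-[:PLACED]->(o)
REMOVE o.customerId;

// Afterwards the former join is a plain traversal.
MATCH (c:Customer {customerId: 'c-1'})-[:PLACED]->(o:Order)
RETURN o;
```

Orders whose customerId matches no Customer node are left untouched by the refactor, so they can be inspected separately rather than silently losing the key.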
4. Query-Driven Refinement (HIGH)
- query-critical-traversals - Design for your most critical traversals first
- query-shortcut-relationships - Add shortcut relationships for frequent multi-hop queries
- query-denormalize-reads - Denormalize for read-heavy paths
- query-filter-by-rel-props - Use relationship properties to filter traversals
- query-test-before-deploy - Test model against real queries before deploying
5. Structural Patterns (HIGH)
- pattern-intermediary-nodes - Use intermediary nodes for multi-entity relationships
- pattern-hierarchy - Model hierarchies with category nodes and depth relationships
- pattern-linked-list - Use linked lists for ordered sequences
- pattern-timeline-tree - Apply timeline trees for temporal data
- pattern-fan-out - Fan-out pattern for event streams and activity feeds
- pattern-bipartite - Use bipartite structure for many-to-many with context
6. Anti-Patterns (MEDIUM)
- anti-join-table-nodes - Don't model relational join tables as nodes
- anti-generic-relationships - Don't use generic RELATED_TO or CONNECTED relationships
- anti-relational-porting - Don't port relational schemas directly to graph
- anti-over-modeling - Don't make everything a node
- anti-duplicate-data - Don't duplicate data instead of creating relationships
- anti-string-encoded-structure - Don't encode structured data as delimited strings
7. Constraints & Integrity (MEDIUM)
- constraint-unique-identifiers - Define uniqueness constraints on natural identifiers
- constraint-existence - Use existence constraints for required properties
- constraint-index-traversals - Create indexes on traversal entry point properties
- constraint-no-over-index - Don't over-index; each index has a write cost
- constraint-node-key - Use composite node keys for natural multi-part identifiers
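For orientation, constraint-unique-identifiers and constraint-index-traversals could be expressed roughly as follows. This sketch assumes Neo4j 5 syntax; Memgraph, Neptune, and older Neo4j releases use different DDL. The constraint and index names, labels, and properties are hypothetical.

```cypher
// Neo4j 5 syntax (assumption); other engines use different DDL.
// Names, labels, and properties are hypothetical.

// Uniqueness constraint on a natural identifier.
CREATE CONSTRAINT customer_id_unique IF NOT EXISTS
FOR (c:Customer) REQUIRE c.customerId IS UNIQUE;

// Index on a property used as a traversal entry point.
CREATE INDEX product_name_index IF NOT EXISTS
FOR (p:Product) ON (p.name);
```

In Neo4j a uniqueness constraint is backed by an index, so customerId needs no separate index here, which is in keeping with constraint-no-over-index.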
8. Scale & Evolution (LOW-MEDIUM)
- scale-supernode-mitigation - Mitigate supernodes with fan-out or partitioning
- scale-temporal-versioning - Separate current state from historical state
- scale-schema-migration - Plan for label and relationship type evolution
- scale-batch-refactoring - Use APOC or batched queries for schema refactoring
- scale-dense-node-detection - Monitor and detect emerging supernodes
How to Use
Read individual reference files for detailed explanations and code examples:
- Section definitions - Category structure and impact levels
- Rule template - Template for adding new rules
Reference Files
| File | Description |
|------|-------------|
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |