Data Visualization
Visualization is communication. Every visual element must serve understanding.
Critical Rules
π¨ Use established algorithms. Graph layout, tree layout, spatial indexingβthese problems are solved. Check dagre, d3-force, ELK.js before implementing anything custom.
π¨ Choose encodings by perceptual accuracy. Position beats length beats angle beats area beats color. Prefer bar charts over pie charts over bubble charts.
π¨ Never rely on color alone. 8% of men are colorblind. Use shape, pattern, or labels as backup encoding.
π¨ Match rendering to scale. SVG for <1000 elements, Canvas for 1000-10000, WebGL for >10000.
1. Visual Encoding
Marks & Channels
Marks are geometric primitives representing data:
- Points (scatter plots, dot plots)
- Lines (line charts, network edges)
- Areas (bar charts, area charts, maps)
Channels are visual properties applied to marks:
- Position (x, y coordinates)
- Size (length, area, volume)
- Color (hue, saturation, lightness)
- Shape (circle, square, triangle)
- Orientation (angle, slope)
Cleveland & McGill Hierarchy (1984)
Visual encodings ranked by perceptual accuracy:
- Position along common scale (most accurate)
- Position on non-aligned scales
- Length
- Angle/slope
- Area
- Volume
- Color saturation/hue (least accurate)
Implication: Bar charts (position) > pie charts (angle) > bubble charts (area)
Preattentive Attributes
Properties processed in <250ms without conscious effort:
- Color (hue, saturation)
- Form (orientation, length, width, size, shape)
- Spatial position
- Motion
Use preattentive attributes for the most important dataβthey "pop out" automatically.
Channel Effectiveness by Data Type
| Data Type | Best Channels | |-----------|---------------| | Quantitative | Position, length, angle, area | | Ordinal | Position, density, saturation | | Categorical | Shape, hue, spatial region |
2. Interaction Design
Shneiderman's Mantra (1996)
"Overview first, zoom and filter, then details on demand"
- Overview β Show entire dataset, establish context
- Zoom & Filter β Reduce complexity, focus on subset
- Details on Demand β Tooltips, click-to-expand, drill-down
Interaction Patterns
| Pattern | Use Case | |---------|----------| | Brushing & linking | Cross-highlighting across coordinated views | | Focus + context | Fisheye lens, detail-on-demand panels | | Direct manipulation | Drag nodes, resize elements, reorder | | Animated transitions | Help users track changes between states | | Pan & zoom | Navigate large visualizations | | Filtering | Reduce data to relevant subset | | Selection | Highlight specific data points |
3. Chart Selection
By Question Type
| Question | Chart Type | Why | |----------|------------|-----| | How do values compare? | Bar chart | Position encoding is most accurate | | How has this changed over time? | Line chart | Shows trends, handles many points | | What's the distribution? | Histogram, box plot | Shows spread, outliers, shape | | What's the relationship? | Scatter plot | Reveals correlation, clusters | | What's the part-to-whole? | Stacked bar, treemap | Shows composition | | What are the connections? | Network graph, Sankey | Shows relationships, flows | | What's the hierarchy? | Tree, sunburst, treemap | Shows parent-child structure | | Where is it? | Choropleth, symbol map | Geographic context |
By Data Volume
| Volume | Approach | |--------|----------| | <20 points | Simple charts, direct labeling | | 20-500 | Standard visualization | | 500-5000 | Consider aggregation, filtering | | 5000+ | Aggregation mandatory, or Canvas/WebGL |
Common Anti-Patterns
- β Pie charts with >5 slices (use bar chart)
- β 3D charts without strong justification
- β Dual-axis with unrelated scales (misleading)
- β Non-zero baselines for bar charts (distorts perception)
- β Truncated axes without clear indication
4. Color
Palette Types
| Type | Use Case | Examples | |------|----------|----------| | Sequential | Low to high values | Blues, Greens, Viridis | | Diverging | Values diverge from midpoint | RdBu, BrBG, Spectral | | Categorical | Distinct categories | Set2, Tableau10, Category10 |
Colorblind Safety
- 8% of men, 0.5% of women have color vision deficiency
- Never rely on color aloneβuse shape, pattern, labels
- Safe sequential: viridis, cividis, plasma
- Safe categorical: ColorBrewer's colorblind-safe options
- Test with: Coblis, Sim Daltonism, Chrome DevTools
Perceptual Uniformity
- Avoid rainbow colormaps (jet)βperceptual steps are uneven
- Use viridis, parula, cividis for sequential data
- These ensure equal perceptual distance between values
Color Guidelines
- 4.5:1 contrast ratio for text (WCAG AA)
- 3:1 contrast for UI components
- Max 7-10 distinct categorical colors
- Use saturation/lightness variation for emphasis
5. Layout Algorithms
π¨ Before implementing ANY layout algorithm, check if a library exists.
Algorithm β Library Mapping
| Problem | Algorithm | Libraries | |---------|-----------|-----------| | Layered/DAG graphs | Sugiyama (1981) | dagre, ELK.js | | Force-directed networks | Fruchterman-Reingold (1991) | d3-force, Cytoscape.js | | Tree layouts | Reingold-Tilford (1981) | d3-hierarchy | | Treemaps | Squarified (2000) | d3-hierarchy, ECharts | | Circle packing | Wang (2006) | d3-hierarchy | | Sankey diagrams | β | d3-sankey | | Chord diagrams | β | d3-chord | | Large graphs (10k+) | WebGL + spatial indexing | Sigma.js, G6, deck.gl | | Spatial queries | Quadtree, R-tree | d3-quadtree, rbush | | Edge crossing minimization | Barth (2002) | Built into dagre/ELK |
When to Use Each Layout
| Layout | Best For | |--------|----------| | Sugiyama (dagre) | Flowcharts, dependency graphs, DAGs with direction | | Force-directed | Social networks, organic relationships, exploration | | Tree | Hierarchies with single parent per node | | Treemap | Hierarchies with quantitative values | | Circular | Emphasizing central nodes, ring structures | | Matrix | Dense graphs where edges would overlap |
These problems are solved. Never implement from scratch.
6. Rendering & Performance
Rendering Technology Thresholds
<1000 elements β SVG
- DOM events work naturally
- Accessibility (ARIA) supported
- Crisp at any zoom level
- CSS styling
1000-10000 β Canvas
- Batch rendering
- Manual hit testing required
- Lower memory footprint
- requestAnimationFrame for animation
>10000 β WebGL
- GPU acceleration
- Sigma.js, deck.gl, regl
- Complex setup
- Limited text rendering
Performance Patterns
| Pattern | When to Use | |---------|-------------| | Web Workers | Layout computation (never block main thread) | | Spatial indexing | Hit detection with quadtree/R-tree | | Level-of-detail | Simplify distant/small elements | | Viewport culling | Only render visible elements | | Debouncing | Expensive interactions (zoom, filter) | | Virtualization | Long lists of chart components | | Aggregation | Too many data points to render individually |
Anti-Patterns
- β 5000 SVG nodes (use Canvas)
- β Layout computation on main thread
- β Hit testing without spatial indexing
- β Rendering off-screen elements
- β Animating thousands of elements individually
7. Libraries
Graph Layouts
| Library | Best For | Notes | |---------|----------|-------| | dagre | Layered DAGs, flowcharts | Sugiyama algorithm, good defaults | | dagre-d3 | dagre + D3 rendering | SVG output | | ELK.js | Complex layouts, compound graphs | Eclipse Layout Kernel, highly configurable | | d3-force | Organic networks | Fruchterman-Reingold, customizable forces | | Cytoscape.js | Graph analysis + visualization | Rich algorithm library | | Sigma.js | Large graphs (10k+) | WebGL rendering | | G6/AntV | Enterprise graphs | Full-featured, Chinese ecosystem | | vis-network | Quick prototypes | Easy API, limited customization |
Charting
| Library | Best For | Notes | |---------|----------|-------| | D3.js | Custom, highly interactive | Low-level, maximum control | | Observable Plot | Quick exploration | D3 team, excellent defaults | | Recharts | React integration | Declarative, composable | | Victory | React integration | Animation support | | ECharts | Feature-rich dashboards | Great mobile, large dataset support | | Vega-Lite | Grammar of graphics | Declarative JSON spec | | Chart.js | Simple charts | Easy setup, limited customization | | Plotly | Scientific visualization | 3D support, interactivity |
When to Use D3 vs Higher-Level Libraries
Use D3 when:
- Need complete control over rendering
- Building novel/custom visualizations
- Integrating with existing SVG/Canvas code
- Performance-critical with custom optimizations
Use higher-level libraries when:
- Standard chart types suffice
- Faster development time matters
- Team less experienced with D3
- Need built-in responsiveness/animation
8. Composition & Layout
Project Composition (Dashboard Level)
- Visual hierarchy β Guide eye to most important first
- Grid systems β Align elements for coherence
- Grouping β Related visualizations together
- White space β Breathing room, not wasted space
- Reading flow β Z-pattern or F-pattern for Western audiences
Chart Composition (Single Chart)
| Element | Guidelines | |---------|------------| | Title | Clear, descriptive; top-left or centered above | | Subtitle | Additional context; smaller, below title | | Axes | Labeled with units; tick marks at meaningful intervals | | Legend | Embedded when possible; external if complex | | Aspect ratio | Affects slope perception; 45Β° banking for trends | | Margins | Enough for labels; consistent across charts |
Aspect Ratio Guidelines
- Line charts: ~16:9 for trends (banking to 45Β°)
- Bar charts: Depends on number of bars
- Scatter plots: Often square (1:1) for correlation
- Maps: Preserve geographic proportions
9. Annotation
Annotation Types
| Type | Purpose | |------|---------| | Title | The "what" β identifies the visualization | | Subtitle | Additional context, data source | | Caption | The "so what" β key insight or takeaway | | Axis labels | Variable names and units | | Legend | Decode color/shape/size mappings | | Callouts | Highlight specific data points | | Reference lines | Benchmarks, targets, averages | | Source citation | Data provenance |
Best Practices
- Annotate the insight, not just the data β "Sales peaked in Q3" not just "Sales over time"
- Use callouts sparingly β Highlight 1-3 key points maximum
- Direct labeling β Embed labels in chart when possible (vs separate legend)
- Provide context β Benchmarks, historical reference, targets
- Layer information β Overview visible, details on interaction
Text Hierarchy
- Title (largest, boldest)
- Subtitle/caption
- Axis titles
- Tick labels
- Annotations
- Source (smallest)
10. Accessibility
WCAG Requirements
- AA minimum (AAA preferred)
- 4.5:1 contrast ratio for normal text
- 3:1 contrast for large text and UI components
- No information conveyed by color alone
Keyboard Navigation
- Tab through interactive elements
- Arrow keys for traversing data points
- Enter/Space for selection
- Escape to cancel/close
Screen Reader Support
<svg role="img" aria-labelledby="chart-title chart-desc">
<title id="chart-title">Monthly Sales 2024</title>
<desc id="chart-desc">Bar chart showing sales increasing from $10M in January to $15M in December</desc>
</svg>
- Use ARIA labels and roles
- Provide text alternatives
- Announce dynamic updates with live regions
- Structure for logical reading order
Alternative Representations
- Data tables β Provide as fallback for all charts
- Text summaries β Describe key insights
- Sonification β Audio representation for time-series
- Tactile graphics β For physical accessibility
11. Anti-Patterns Summary
Design Anti-Patterns
| Anti-Pattern | Why It's Wrong | What to Do | |--------------|----------------|------------| | 3D charts | Distorts perception | Use 2D | | Pie >5 slices | Hard to compare | Use bar chart | | Dual unrelated axes | Misleading correlation | Separate charts | | Non-zero baseline | Exaggerates differences | Start at zero | | Rainbow colormap | Perceptually uneven | Use viridis | | Color-only encoding | Excludes colorblind | Add shape/pattern | | Chart junk | Distracts from data | Remove decoration | | Overplotting | Hides data density | Aggregate or jitter |
Implementation Anti-Patterns
| Anti-Pattern | Why It's Wrong | What to Do | |--------------|----------------|------------| | Custom graph layout | Reinventing solved problem | Use dagre/ELK | | 5000 SVG nodes | Poor performance | Use Canvas | | Main thread layout | Blocks UI | Use Web Worker | | No spatial indexing | Slow hit detection | Use quadtree | | Rendering off-screen | Wasted computation | Viewport culling |
12. Academic Foundations
Seminal Papers
| Paper | Year | Contribution | |-------|------|--------------| | Cleveland & McGill "Graphical Perception" | 1984 | Visual encoding hierarchy | | Shneiderman "The Eyes Have It" | 1996 | Overview-zoom-filter-details mantra | | Gansner et al. "Drawing Directed Graphs" | 1993 | Foundation for dagre | | Fruchterman & Reingold "Force-directed Placement" | 1991 | Foundation for d3-force | | Sugiyama et al. "Hierarchical Systems" | 1981 | Layered graph layout | | Barth et al. "Bilayer Cross Counting" | 2002 | Edge crossing minimization | | Brewer "Color Use Guidelines" | 1994 | ColorBrewer palettes |
Essential Resources
| Resource | Type | Focus | |----------|------|-------| | ColorBrewer (colorbrewer2.org) | Tool | Accessible color palettes | | From Data to Viz (data-to-viz.com) | Guide | Chart selection decision tree | | Visualization Analysis & Design (Munzner) | Textbook | Comprehensive theory | | Data Visualisation (Kirk) | Textbook | Practitioner guide | | Visual Display of Quantitative Information (Tufte) | Textbook | Data-ink ratio, chart junk | | D3 Gallery (observablehq.com/@d3/gallery) | Examples | Implementation patterns |
Summary
π¨ Before implementing visualization:
- What question are you answering? β Select chart type
- What's your data volume? β Select rendering technology
- Is there an established algorithm? β Use the library
- Is it accessible? β Color, keyboard, screen reader
- Does it follow perceptual best practices? β Encoding hierarchy