# Gong-Jira-Notion Pipeline: Business Logic Architecture

**Thesis:** This document provides comprehensive documentation of the Gong-Jira-Notion pipeline business logic, covering data flow, two-agent analysis architecture, synthesis algorithm, and LLM prompting strategy for automated call analysis and Jira ticket management.

## Overview

The pipeline automates the process of extracting actionable insights from Gong call recordings and mapping them to Jira tickets, using a **two-agent verification** approach with Claude and Gemini for higher accuracy.

```mermaid
graph TB
    subgraph "1. DATA INGESTION"
        A[Gong Call URL] --> B[Transcript Fetch<br/>from ClickHouse/API]
        B --> C[Metadata Extract<br/>title, date, speakers]
    end

    subgraph "2. CONTEXT ENRICHMENT"
        C --> D[Agency Discovery<br/>ClickHouse lookup]
        D --> E[Client Folder<br/>Resolution]
        E --> F[Jira Epic Lookup<br/>PS-XXXX]
        F --> G[Existing Tickets<br/>Sync]
    end

    subgraph "3. TWO-AGENT ANALYSIS"
        G --> H[Claude Sonnet 4<br/>Analysis]
        G --> I[Gemini Pro 3<br/>Analysis]
    end

    subgraph "4. SYNTHESIS & VERIFICATION"
        H --> J[Confidence Scoring<br/>HIGH/MEDIUM/LOW]
        I --> J
        J --> K[Merge & Dedupe]
        K --> L[Cross-Project<br/>Ticket Match]
    end

    subgraph "5. OUTPUT ACTIONS"
        L --> M[Jira Updates<br/>PS/ISD/IMD/BI]
        L --> N[New Jira Tickets<br/>Technical Issues]
        L --> O[Notion Tasks<br/>CSM Follow-ups]
        L --> P[Summary Email<br/>Client-Safe]
    end

    style H fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style I fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px
```

---

## 1. DATA INGESTION BLOCK

**Purpose:** Fetch raw call data from Gong (ClickHouse primary, API fallback).

### 1.1 Data Sources

| Source | Latency | Use Case |
|--------|---------|----------|
| **ClickHouse** `internal_analytics.dim_gong_calls` | 1-2 hours lag | Historical calls |
| **Gong API** (fallback) | Real-time | Recent calls <2h |

### 1.2 Extracted Fields

```mermaid
classDiagram
    class GongCallMetadata {
        +call_id: str
        +url: str
        +title: str
        +date: datetime
        +duration_seconds: int
        +workspace: str
        +improvado_team: List~str~
        +client_team: List~str~
        +call_type: str
        +direction: str
    }

    class TranscriptSegment {
        +speaker: str
        +text: str
        +start_seconds: int
        +end_seconds: int
        +timestamp_display: str
    }

    class ParsedTranscript {
        +metadata: GongCallMetadata
        +segments: List~TranscriptSegment~
        +raw_text: str
        +to_markdown(): str
    }

    ParsedTranscript --> GongCallMetadata
    ParsedTranscript --> TranscriptSegment
```

---

## 2. CONTEXT ENRICHMENT BLOCK

**Purpose:** Resolve agency context and fetch existing Jira tickets for matching.

### 2.1 Agency Discovery Flow

```mermaid
flowchart LR
    A[Gong Title] --> B{Parse Title}
    B --> C[Extract Client Name<br/>e.g., "Improvado <> HP"]
    C --> D[ClickHouse Lookup<br/>biz_active_customers]
    D --> E{Found?}
    E -->|Yes| F[Get agency_id,<br/>client_folder]
    E -->|No| G[Use Provided<br/>agency_id]
    F --> H[Find Jira Epic<br/>PS-XXXX]
    G --> H
    H --> I[Load Existing<br/>Tickets]
```

### 2.2 Jira Sync (Cross-Project)

**Key Insight:** Tickets come from multiple Jira projects - ALL must be checked:

| Project | Description | Example |
|---------|-------------|---------|
| **PS-XXXX** | Professional Services (main) | PS-3702 |
| **ISD-XXXX** | Support Desk (customer-submitted) | ISD-19331 |
| **IMD-XXXX** | Implementation/Data issues | IMD-1234 |
| **BI-XXXX** | BI/Analytics issues | BI-567 |

---

## 3. TWO-AGENT ANALYSIS BLOCK

**Purpose:** Run parallel analysis with Claude and Gemini for verification.

### 3.1 Why Two Agents?

```mermaid
pie title "Accuracy Improvement"
    "Single Agent Accuracy" : 75
    "Two-Agent Verification Boost" : 20
    "Edge Cases Caught" : 5
```

- **Different Strengths:** Claude excels at nuanced context; Gemini handles large context windows
- **Cross-Validation:** Agreement = HIGH confidence; Disagreement = needs review
- **Error Reduction:** Hallucinations caught by comparison

### 3.2 Model Configuration

| Agent | Model ID | Context | Temperature | Max Tokens |
|-------|----------|---------|-------------|------------|
| **Claude** | `claude-sonnet-4-20250514` | 200K | 0.1 | 16,000 |
| **Gemini** | `gemini-3-pro-preview` | 166K | 0.1 | 16,000 |

### 3.3 Analysis Prompt Strategy

The prompt is **shared** between both agents (DRY principle) and contains:

#### High-Level Prompt Structure:

```
┌─────────────────────────────────────────────────────────────┐
│  ROLE: "Jira documentation specialist"                       │
├─────────────────────────────────────────────────────────────┤
│  MATCHING RULES:                                             │
│  • Read FULL description, not just title                     │
│  • Pay attention to [VIEW: ...] tags                        │
│  • Check comments for recent context                         │
│  • Match by: dashboard view, feature, data source            │
├─────────────────────────────────────────────────────────────┤
│  CROSS-PROJECT MATCHING (CRITICAL):                          │
│  • Check PS/ISD/IMD/BI projects for duplicates              │
│  • ISD tickets = customer-submitted → often match calls      │
│  • Before creating new → search ALL existing tickets         │
├─────────────────────────────────────────────────────────────┤
│  PRIORITY VALUES (Jira exact names):                         │
│  • "Blocker" = Critical blocker, production down             │
│  • "Critical" = High priority, urgent                        │
│  • "Medium" = Normal priority (default)                      │
│  • "Low" = Nice to have                                      │
├─────────────────────────────────────────────────────────────┤
│  TASK TYPE CLASSIFICATION:                                   │
│  • "jira" = TECHNICAL work requiring ENGINEERING             │
│    - Bugs, errors, data issues, dashboard builds             │
│    - Requires code changes or data fixes                     │
│  • "notion" = CSM/administrative work                        │
│    - Meetings, training, follow-ups, scheduling              │
│    - Can be done WITHOUT engineering                         │
├─────────────────────────────────────────────────────────────┤
│  DEDUPLICATION RULES:                                        │
│  • If action is SUB-TASK of existing → ADD to updates        │
│  • NEVER create both: update + new ticket for same action    │
│  • Check email context for already completed tasks           │
├─────────────────────────────────────────────────────────────┤
│  INPUT DATA:                                                 │
│  • CALL TRANSCRIPT (full text with timestamps)               │
│  • EXISTING JIRA TICKETS (with descriptions + comments)      │
│  • RECENT EMAIL CORRESPONDENCE (optional, for dedup)         │
├─────────────────────────────────────────────────────────────┤
│  OUTPUT FORMAT (JSON):                                       │
│  • summary: 3-paragraph call summary                         │
│  • key_discussion_points: 3-5 client-safe bullet points      │
│  • new_tickets: [{ title, description, priority, task_type }]│
│  • updates: [{ ticket_key, items, confidence, reasoning }]   │
│  • next_steps: [{ owner, task, deadline, party }]            │
└─────────────────────────────────────────────────────────────┘
```

#### Key Prompt Rules:

1. **Client Prefix**: All new ticket titles start with `{CLIENT_PREFIX} - ` (e.g., "CW - Dashboard Issue")
2. **Confidence Levels**: HIGH, MEDIUM, LOW with reasoning
3. **Timestamp Citations**: `timestamp_display` links to exact moment in Gong
4. **Party Field**: `"improvado"` or `"client"` for next_steps ownership

---

## 4. SYNTHESIS & VERIFICATION BLOCK

**Purpose:** Merge two analyses into single verified output with confidence scores.

### 4.1 Confidence Scoring Algorithm

```mermaid
flowchart TD
    A[Gemini Result] --> C{Compare}
    B[Claude Result] --> C

    C --> D{Same Ticket?}
    D -->|Yes| E[EXACT Match<br/>Score: 0.95]
    D -->|No| F{Same Topic?}

    F -->|Yes| G[PARTIAL Match<br/>Score: 0.70]
    F -->|No| H{Only One<br/>Identified?}

    H -->|Yes| I[SINGLE Source<br/>Score: 0.60]
    H -->|No| J[CONFLICT<br/>Score: 0.40]

    E --> K[HIGH Confidence]
    G --> L[MEDIUM Confidence]
    I --> L
    J --> M[LOW Confidence<br/>Needs Manual Review]

    style E fill:#c8e6c9
    style G fill:#fff9c4
    style I fill:#fff9c4
    style J fill:#ffcdd2
```

### 4.2 Merge Strategy

| Match Type | Confidence | Action |
|------------|------------|--------|
| **EXACT** | HIGH (0.95) | Use either result (verified) |
| **PARTIAL** | MEDIUM (0.70) | Prefer Claude for ticket_key, Gemini for description |
| **SINGLE** | MEDIUM (0.60) | Include with source attribution |
| **CONFLICT** | LOW (0.40) | Flag for manual review |

### 4.3 Preference Rules

```python
prefer_claude_for = ["ticket_key", "assignee"]  # Better at matching
prefer_gemini_for = ["description", "summary"]   # Better at context
```

---

## 5. OUTPUT ACTIONS BLOCK

**Purpose:** Execute actions based on synthesis results.

### 5.1 Action Flow

```mermaid
flowchart LR
    subgraph "Synthesis Result"
        A[new_tickets]
        B[updates]
        C[next_steps]
    end

    subgraph "Task Type Router"
        A --> D{task_type?}
        D -->|jira| E[Jira API<br/>Create Ticket]
        D -->|notion| F[Notion API<br/>Create Task]
    end

    subgraph "Update Handler"
        B --> G[Jira API<br/>Add Comment]
        G --> H[Link Gong URL<br/>with Timestamp]
    end

    subgraph "Next Steps"
        C --> I{party?}
        I -->|improvado| J[Notion Task<br/>for Team]
        I -->|client| K[Email Summary<br/>with Actions]
    end
```

### 5.2 Jira Comment Format

```markdown
## Gong Call Update - {date}

**Recording:** [View in Gong]({gong_url}&highlights=[{from}:{to}])

### Discussion Points:
- {item_1}
- {item_2}

> "{exact_quote_from_call}"
```

### 5.3 Client-Safe Summary Rules

**NEVER include in client-facing output:**
- Client emotions/frustrations
- Performance evaluations
- Internal observations
- Personal/sensitive information

**ALWAYS include:**
- Technical topics discussed
- Decisions made
- Clarifications provided
- Project updates

---

## 6. EVENT FLOW (Motia Pipeline)

### 6.1 Topic Subscription Map

```mermaid
sequenceDiagram
    participant API as TriggerGongPipeline
    participant G as FetchGong
    participant A as DiscoverAgency
    participant J as SyncJira
    participant CL as AnalyzeClaude
    participant GE as AnalyzeGemini
    participant S as SynthesizeAnalysis
    participant JC as CreateJira
    participant N as CreateNotion

    API->>G: gong.fetch.requested
    G->>A: gong.fetched
    A->>J: agency.discovered
    J->>CL: jira.synced
    J->>GE: jira.synced
    Note over CL,GE: Parallel Execution
    CL->>S: analysis.claude.complete
    GE->>S: analysis.gemini.complete
    Note over S: Waits for BOTH
    S->>JC: synthesis.complete
    S->>N: synthesis.complete
    Note over JC,N: Parallel Execution
```

### 6.2 Step Registration Summary

| Step | Type | Subscribes | Emits |
|------|------|------------|-------|
| TriggerGongPipeline | API | `POST /gong/trigger` | `gong.fetch.requested` |
| FetchGongReal | Event | `gong.fetch.requested` | `gong.fetched` |
| DiscoverAgencyReal | Event | `gong.fetched` | `agency.discovered` |
| SyncJira | Event | `agency.discovered` | `jira.synced` |
| AnalyzeClaude | Event | `jira.synced` | `analysis.claude.complete` |
| AnalyzeGemini | Event | `jira.synced` | `analysis.gemini.complete` |
| SynthesizeAnalysis | Event | `analysis.*.complete` | `synthesis.complete` |
| CreateJira | Event | `synthesis.complete` | `jira.created` |
| CreateNotion | Event | `synthesis.complete` | `notion.created` |

---

## Ground Truth

**Source Files:**
- `lib/gong_fetcher.py` - Gong data fetch with fallback
- `lib/agency_discovery.py` - Agency resolution
- `lib/claude_analyzer.py` - Claude analysis
- `lib/gemini_analyzer.py` - Gemini analysis
- `lib/analyzer_prompt.py` - Shared prompt template (v3.0.8)
- `lib/synthesis.py` - Two-agent merge logic

**Document Metadata:**
- Created: 2025-12-30
- Author: Claude Code
- Version: 1.0.0
- Pipeline Version: v3.2.2
