---
name: hubspot
description: Query HubSpot data from ClickHouse internal_analytics - contacts, forms, deals, engagements, pre-qualification status. Use when user says "HubSpot contacts", "form submissions", "hubspot leads", "pre-qualification status", "hubspot engagements", "хабспот", "формы хабспот".
version: "1.0.0"
---
## HubSpot = ClickHouse Internal Analytics Tables

**Core Principle:** Query HubSpot CRM data via ClickHouse `internal_analytics` shard using `stg_hubspot_*` tables for contacts, forms, engagements, and lead qualification analysis.

**HubSpot Tables Schema (Continuant - TD):**
```mermaid
graph TD
    HS[HubSpot Tables] --> Contacts[stg_hubspot_contacts]
    HS --> Forms[stg_hubspot_form_submissions]
    HS --> Engage[stg_hubspot_engagements]

    Contacts --> CFields[email, pre_qual_status, segment, lead_score]
    Forms --> FFields[form_id, submitted_at, values_extract_map]
    Engage --> EFields[type: EMAIL/MEETING/CALL/TASK]

    Engage --> Meetings[stg_hubspot_engagements_meetings]
    Engage --> Calls[stg_hubspot_engagements_calls]
    Engage --> Tasks[stg_hubspot_engagements_tasks]
```

**Query Workflow (Occurrent - LR):**
```mermaid
graph LR
    A[Identify Data Need] --> B[Select Table]
    B --> C[Build Query with Joins]
    C --> D[Handle Arrays with arrayJoin]
    D --> E[Execute via internal.query]
```

**Primary source:** `07_HubSpot_Schema_Description.md`
**Session ID:** `76fd56b0-a051-44b2-b9e5-1563e48bd806` (2026-01-29)

### 1. Tables Reference

| Table | Purpose | Key Fields |
|-------|---------|------------|
| `stg_hubspot_contacts` | Contact info | `email`, `hubspot_contacts_id`, `hubspot_contact_pre_qualification_status`, `hubspot_contact_segment`, `hubspot_create_datetime` |
| `stg_hubspot_form_submissions` | Form fills | `email`, `hubspot_form_id`, `hubspot_form_submitted_at`, `hubspot_page_url`, `hubspot_values_extract_map` |
| `stg_hubspot_engagements` | All interactions | `hubspot_engagements_id`, `hubspot_engagements_type`, `hubspot_contact_id_arr`, `hubspot_engagements_created_at` |
| `stg_hubspot_engagements_meetings` | Meetings | `hubspot_event_id`, `hubspot_event_title`, `hubspot_event_start_datetime`, `hubspot_contacts` |
| `stg_hubspot_engagements_calls` | Calls | `hubspot_engagement_call_status`, `hubspot_engagement_call_duration_milliseconds` |

### 2. Pre-Qualification Statuses

Common values in `hubspot_contact_pre_qualification_status`:
- `Pre-Qualified` - Qualified for sales
- `Went Dark` - No response
- `Disqualified` - Not a fit
- `Disqualified – spam` - Spam lead
- `Actively Working` - In progress
- `Added to sequence` - In nurture sequence

### 3. Python Access Pattern

```python
from data_sources.clickhouse import internal

# Get contacts with pre-qual status
df = internal.query("""
    SELECT email, hubspot_contact_pre_qualification_status as status,
           hubspot_contact_segment as segment, hubspot_create_datetime
    FROM stg_hubspot_contacts
    WHERE hubspot_create_datetime >= '2024-01-01'
      AND hubspot_contact_pre_qualification_status != ''
""")

# Get form submissions with field values
df_forms = internal.query("""
    SELECT email, hubspot_form_submitted_at,
           hubspot_values_extract_map['monthly_ad_spend'] as ad_spend,
           hubspot_values_extract_map['contact_channel__c'] as channel
    FROM stg_hubspot_form_submissions
    WHERE hubspot_form_submitted_at >= '2024-01-01'
""")
```

### 4. Working with Arrays

HubSpot tables use arrays for contact/company associations. Use `arrayJoin` to expand:

```sql
-- Get all meetings with contact info
SELECT
    m.hubspot_event_title,
    m.hubspot_event_start_datetime,
    arrayJoin(e.hubspot_contact_id_arr) AS contact_id
FROM stg_hubspot_engagements e
JOIN stg_hubspot_engagements_meetings m
    ON e.hubspot_engagements_id = m.hubspot_event_id
WHERE e.hubspot_engagements_type = 'MEETING'
```

### 5. Common Joins

**Forms + Contacts:**
```sql
SELECT fs.email, fs.hubspot_form_submitted_at,
       c.hubspot_contact_pre_qualification_status,
       c.hubspot_contact_segment
FROM stg_hubspot_form_submissions fs
JOIN stg_hubspot_contacts c ON fs.email = c.email
```

**Contacts + Engagements:**
```sql
WITH contact_engagements AS (
    SELECT hubspot_engagements_id, hubspot_engagements_type,
           arrayJoin(hubspot_contact_id_arr) AS contact_id
    FROM stg_hubspot_engagements
)
SELECT c.email, ce.hubspot_engagements_type, COUNT(*) as cnt
FROM stg_hubspot_contacts c
JOIN contact_engagements ce ON c.hubspot_contacts_id = ce.contact_id
GROUP BY c.email, ce.hubspot_engagements_type
```

### 6. Form Field Extraction

`hubspot_values_extract_map` is a Map(String, String). Common fields:
- `monthly_ad_spend` - Budget indicator
- `contact_channel__c` - How they found us
- `how_can_we_help` - Interest area
- `jobtitle` - Job title
- `company` - Company name
- `firstname`, `lastname`, `phone`

Extract with: `hubspot_values_extract_map['field_name']`
