Agent Skills: Performing Log Source Onboarding in SIEM

Perform structured log source onboarding into SIEM platforms by configuring collectors, parsers, normalization, and validation for complete security visibility.

Performing Log Source Onboarding in SIEM

Overview

Log source onboarding is the systematic process of integrating new data sources into a SIEM platform to enable security monitoring and detection. Proper onboarding requires planning data sources, configuring collection agents, building parsers, normalizing fields to a common schema, and validating data quality. According to the UK NCSC, onboarding should prioritize log sources that provide the highest security value relative to their ingestion cost.

When to Use

  • When conducting security assessments that involve onboarding log sources into a SIEM
  • When following incident response procedures for related security events
  • When performing scheduled security testing or auditing activities
  • When validating security controls through hands-on testing

Prerequisites

  • SIEM platform deployed (Splunk, Elastic, Sentinel, QRadar, or similar)
  • Network access from source systems to SIEM collectors
  • Administrative access on source systems for agent installation
  • Common Information Model (CIM) or equivalent schema documentation
  • Change management approval for production system modifications

Log Source Priority Framework

Tier 1 - Critical (Onboard First)

| Source | Log Type | Security Value |
|---|---|---|
| Active Directory | Security Event Logs | Authentication, privilege escalation |
| Firewalls | Traffic logs | Network access, C2 detection |
| EDR/AV | Endpoint alerts | Malware, process execution |
| VPN/Remote Access | Connection logs | Unauthorized access |
| DNS Servers | Query logs | C2 beaconing, data exfiltration |
| Email Gateway | Email security logs | Phishing, BEC |

Tier 2 - High Priority

| Source | Log Type | Security Value |
|---|---|---|
| Web Proxy | HTTP/HTTPS logs | Web-based attacks, data exfiltration |
| Cloud platforms (AWS/Azure/GCP) | Audit logs | Cloud security posture |
| Database servers | Audit/query logs | Data access, SQL injection |
| DHCP/IPAM | Address allocation | Asset tracking |
| File servers | Access logs | Data access monitoring |

Tier 3 - Standard

| Source | Log Type | Security Value |
|---|---|---|
| Application servers | App logs | Application-level attacks |
| Print servers | Print logs | Data loss prevention |
| Badge/physical access | Access logs | Physical security correlation |
| Network devices (switches/routers) | Syslog | Network anomalies |
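
The tiers above can drive an onboarding backlog. The following is a minimal sketch, assuming a hypothetical `LogSource` planning record and illustrative daily volumes (neither comes from this guide), that orders candidates by tier first and by estimated daily volume second:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSource:
    """Hypothetical planning record; field names are illustrative."""
    name: str
    tier: int              # 1 = critical, 2 = high priority, 3 = standard
    est_gb_per_day: float  # assumed estimate, e.g. from the Step 1 assessment


def onboarding_order(sources):
    """Sort by tier, then cheapest-first within a tier."""
    return sorted(sources, key=lambda s: (s.tier, s.est_gb_per_day))


# Volumes below are made-up placeholders for illustration only.
backlog = [
    LogSource("Print servers", 3, 0.2),
    LogSource("Firewalls", 1, 40.0),
    LogSource("Web Proxy", 2, 25.0),
    LogSource("Active Directory", 1, 8.0),
]

for source in onboarding_order(backlog):
    print(f"Tier {source.tier}: {source.name} (~{source.est_gb_per_day} GB/day)")
```

Sorting cheapest-first within a tier is only one possible tie-breaker; a team might equally weight detection coverage or ease of integration.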

Onboarding Process

Step 1: Discovery and Assessment

1. Identify the log source:
   - System type and version
   - Log format (syslog, CEF, JSON, Windows Events, etc.)
   - Log volume estimate (EPS - events per second)
   - Network location and firewall requirements

2. Assess security value:
   - What threats can this source help detect?
   - Which MITRE ATT&CK techniques does it cover?
   - Is there an existing SIEM parser?

3. Estimate ingestion cost:
   - Daily volume in GB
   - License impact (per-GB or per-EPS pricing)
   - Storage retention requirements
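
The cost estimate in item 3 usually starts from the EPS figure gathered in item 1. As a rough sketch, assuming a sustained event rate and an average raw event size (both inputs here are illustrative, and compression or indexing overhead is ignored), daily and retained volume can be approximated like this:

```python
SECONDS_PER_DAY = 86_400


def daily_volume_gb(eps: float, avg_event_bytes: float) -> float:
    """Estimate raw daily ingest in GB (10^9 bytes) from a sustained EPS."""
    return eps * avg_event_bytes * SECONDS_PER_DAY / 1e9


def retained_volume_gb(daily_gb: float, retention_days: int) -> float:
    """Raw volume held for the retention period, before any compression."""
    return daily_gb * retention_days


# Example inputs are assumptions: 500 EPS at roughly 400 bytes per event.
daily = daily_volume_gb(eps=500, avg_event_bytes=400)
print(f"{daily:.2f} GB/day, {retained_volume_gb(daily, 90):.1f} GB over 90 days")
```

Peak EPS can be several times the sustained rate, so collectors and queues are better sized against the peak, while license impact follows the daily total.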

Step 2: Configure Log Collection

Syslog-Based Collection (Firewalls, Network Devices)

# rsyslog configuration for receiving syslog
# /etc/rsyslog.d/10-siem-collection.conf

# UDP reception
module(load="imudp")
input(type="imudp" port="514" ruleset="siem_forwarding")

# TCP reception
module(load="imtcp")
input(type="imtcp" port="514" ruleset="siem_forwarding")

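# TLS reception (imtcp is already loaded above; loading it a second time is rejected)
# Per-input stream driver settings need a recent rsyslog 8.x release; certificates
# and permitted peers for the gtls driver must be configured separately.
input(type="imtcp" port="6514" ruleset="siem_forwarding"
      StreamDriver.Name="gtls" StreamDriver.Mode="1"
      StreamDriver.AuthMode="x509/name")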

ruleset(name="siem_forwarding") {
    # Forward to SIEM
    action(type="omfwd" target="siem.company.com" port="9514"
           protocol="tcp" queue.type="LinkedList"
           queue.filename="siem_fwd" queue.maxdiskspace="1g"
           queue.saveonshutdown="on" action.resumeRetryCount="-1")
}
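
Once the relay is listening, a synthetic message helps confirm the path end to end. The sketch below builds a BSD-style (RFC 3164) syslog line and can send it over plain TCP. The host name `collector.example.internal`, the tag, and the message text are placeholders, not values from this guide.

```python
import socket
from datetime import datetime


def build_syslog_message(facility: int, severity: int, hostname: str,
                         tag: str, text: str, when: datetime) -> bytes:
    """Build an RFC 3164 style line; PRI = facility * 8 + severity."""
    pri = facility * 8 + severity
    # RFC 3164 timestamps use a space-padded day of month, e.g. "Jan  5".
    stamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    return f"<{pri}>{stamp} {hostname} {tag}: {text}\n".encode("utf-8")


def send_tcp(message: bytes, host: str, port: int, timeout: float = 5.0) -> None:
    """Send one newline-terminated message to a plain TCP syslog listener."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(message)


# facility 16 (local0), severity 6 (informational)
msg = build_syslog_message(16, 6, "fw-test-01", "onboarding",
                           "SIEM onboarding connectivity test",
                           datetime(2024, 1, 5, 3, 4, 5))
print(msg)
# To exercise a real listener, replace the placeholder host and uncomment:
# send_tcp(msg, "collector.example.internal", 514)
```

Searching the SIEM for the test string afterwards shows whether the message survived both the relay and the forwarding action.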

Windows Event Log Collection (Splunk Universal Forwarder)

# inputs.conf on Splunk Universal Forwarder
[WinEventLog://Security]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:Security
evt_resolve_ad_obj = 1
checkpointInterval = 5

[WinEventLog://System]
disabled = 0
index = wineventlog
sourcetype = WinEventLog:System

[WinEventLog://Microsoft-Windows-Sysmon/Operational]
disabled = 0
index = wineventlog
sourcetype = XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
renderXml = true

[WinEventLog://Microsoft-Windows-PowerShell/Operational]
disabled = 0
index = wineventlog
sourcetype = XmlWinEventLog:Microsoft-Windows-PowerShell/Operational
renderXml = true

Cloud Log Collection (AWS CloudTrail)

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Resources": {
    "CloudTrailToSIEM": {
      "Type": "AWS::CloudTrail::Trail",
      "Properties": {
        "TrailName": "siem-cloudtrail",
        "S3BucketName": "company-cloudtrail-logs",
        "IsLogging": true,
        "IsMultiRegionTrail": true,
        "IncludeGlobalServiceEvents": true,
        "EnableLogFileValidation": true,
        "EventSelectors": [
          {
            "ReadWriteType": "All",
            "IncludeManagementEvents": true,
            "DataResources": [
              {
                "Type": "AWS::S3::Object",
                "Values": ["arn:aws:s3"]
              }
            ]
          }
        ]
      }
    }
  }
}
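
CloudTrail delivers log files to the S3 bucket as JSON documents with a top-level `Records` array. As an illustrative sketch of what the SIEM side then has to do, the function below flattens one record into a handful of fields. The output key names (`time`, `action`, `src_ip`, and so on) are assumptions chosen for this example, and the sample record is synthetic.

```python
import json


def flatten_cloudtrail(record: dict) -> dict:
    """Pick the CloudTrail fields that detections commonly rely on."""
    identity = record.get("userIdentity", {})
    return {
        "time": record.get("eventTime"),
        "action": record.get("eventName"),
        "service": record.get("eventSource"),
        "region": record.get("awsRegion"),
        "src_ip": record.get("sourceIPAddress"),
        "user": identity.get("arn") or identity.get("type"),
        "error": record.get("errorCode"),  # absent on successful calls
    }


# Synthetic example in the delivered shape: {"Records": [...]}.
sample = json.loads("""
{"Records": [{
  "eventTime": "2024-01-05T03:04:05Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "GetObject",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {"type": "IAMUser",
                   "arn": "arn:aws:iam::123456789012:user/example"}
}]}
""")

events = [flatten_cloudtrail(r) for r in sample["Records"]]
print(events[0])
```

In practice a vendor add-on or built-in connector usually performs this step; the sketch only shows which raw fields feed the normalized ones.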

Step 3: Parse and Normalize

Custom Parser Example (Splunk props.conf/transforms.conf)

# props.conf
[custom:firewall:logs]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%z
MAX_TIMESTAMP_LOOKAHEAD = 30
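# REPORT- applies the transform at search time, matching the search-time
# FIELDALIAS/EVAL/LOOKUP settings below (TRANSFORMS- runs at index time)
REPORT-firewall = firewall_extract_fields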
FIELDALIAS-src = src_addr AS src_ip
FIELDALIAS-dst = dst_addr AS dest_ip
EVAL-action = case(fw_action=="allow", "allowed", fw_action=="deny", "blocked", true(), "unknown")
EVAL-vendor_product = "Custom Firewall"
LOOKUP-geo = geo_ip_lookup ip AS dest_ip OUTPUT country city latitude longitude

# transforms.conf
[firewall_extract_fields]
REGEX = ^(\S+)\s+(\S+)\s+action=(\w+)\s+src=(\S+):(\d+)\s+dst=(\S+):(\d+)\s+proto=(\w+)\s+bytes=(\d+)
FORMAT = timestamp::$1 hostname::$2 fw_action::$3 src_addr::$4 src_port::$5 dst_addr::$6 dst_port::$7 protocol::$8 bytes::$9
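
Before deploying a parser, it helps to run the extraction pattern against sample lines offline. The sketch below reuses the REGEX from the transforms.conf stanza in Python's `re` module; the sample log line is invented to fit the pattern and is not taken from a real device.

```python
import re

# Same pattern as the REGEX setting in the transforms.conf stanza.
FIREWALL_RE = re.compile(
    r"^(\S+)\s+(\S+)\s+action=(\w+)\s+src=(\S+):(\d+)"
    r"\s+dst=(\S+):(\d+)\s+proto=(\w+)\s+bytes=(\d+)"
)
# Field names in the same order as the FORMAT setting.
FIELDS = ("timestamp", "hostname", "fw_action", "src_addr", "src_port",
          "dst_addr", "dst_port", "protocol", "bytes")


def extract(line: str):
    """Return the extracted fields as a dict, or None if the line does not match."""
    m = FIREWALL_RE.match(line)
    return dict(zip(FIELDS, m.groups())) if m else None


# Invented sample line that fits the expected layout.
sample = ("2024-01-05T03:04:05+0000 fw01 action=deny src=10.0.0.5:51514 "
          "dst=203.0.113.10:443 proto=tcp bytes=1420")
print(extract(sample))
print(extract("malformed line"))
```

Counting how many lines of a real capture return `None` gives an early estimate of parser coverage before any data reaches the index.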

CIM Field Mapping

| Raw Field | CIM Field | Data Model |
|---|---|---|
| src_addr | src_ip | Network_Traffic |
| dst_addr | dest_ip | Network_Traffic |
| dst_port | dest_port | Network_Traffic |
| fw_action | action | Network_Traffic |
| bytes_sent + bytes_recv | bytes | Network_Traffic |
| user_name | user | Authentication |
| login_result | action | Authentication |
| process_path | process | Endpoint |
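
The mapping can also be expressed as a small normalization function, which is handy for unit-testing parser logic outside the SIEM. This is a sketch only: it covers the simple renames and the firewall rows of the table, reuses the allow/deny translation from the EVAL-action setting in props.conf, and leaves out `login_result`, whose raw values are not specified here.

```python
# Renames from the mapping table: raw field name -> CIM field name.
RENAMES = {
    "src_addr": "src_ip",
    "dst_addr": "dest_ip",
    "dst_port": "dest_port",
    "user_name": "user",
    "process_path": "process",
}
# Value translation mirroring the EVAL-action rule in props.conf.
ACTIONS = {"allow": "allowed", "deny": "blocked"}
# Raw fields that need more than a rename and are handled separately.
SPECIAL = ("fw_action", "bytes_sent", "bytes_recv")


def normalize(raw: dict) -> dict:
    """Return a copy of a raw firewall event using CIM-style field names."""
    event = {RENAMES.get(k, k): v for k, v in raw.items() if k not in SPECIAL}
    if "fw_action" in raw:
        event["action"] = ACTIONS.get(raw["fw_action"], "unknown")
    if "bytes_sent" in raw or "bytes_recv" in raw:
        event["bytes"] = int(raw.get("bytes_sent", 0)) + int(raw.get("bytes_recv", 0))
    return event


print(normalize({"src_addr": "10.0.0.5", "dst_addr": "203.0.113.10",
                 "dst_port": "443", "fw_action": "deny",
                 "bytes_sent": "900", "bytes_recv": "520"}))
```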

Step 4: Validate Data Quality

# Verify events are arriving
index=new_source earliest=-1h
| stats count by sourcetype, host, source

# Check field extraction quality
index=new_source earliest=-1h
| stats count(src_ip) as has_src count(dest_ip) as has_dest count(action) as has_action count by sourcetype
| eval src_coverage=round(has_src/count*100,1)
| eval dest_coverage=round(has_dest/count*100,1)
| eval action_coverage=round(has_action/count*100,1)

# Verify CIM compliance
| datamodel Network_Traffic All_Traffic search
| search sourcetype=new_sourcetype
| stats count by source, sourcetype

# Check for timestamp parsing issues
index=new_source earliest=-1h
| eval time_diff=abs(_time - _indextime)
| stats avg(time_diff) as avg_lag max(time_diff) as max_lag by host
| where avg_lag > 300
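
The same coverage and lag checks can be run against an exported sample of events, for example while a parser is still being developed. The sketch below assumes events are available as Python dicts with epoch-second `_time` and `_indextime` values; the sample data is invented.

```python
def field_coverage(events, fields=("src_ip", "dest_ip", "action")):
    """Percentage of events in which each field is present and non-empty."""
    total = len(events)
    if total == 0:
        return {f: 0.0 for f in fields}
    return {
        f: round(sum(1 for e in events if e.get(f) not in (None, "")) / total * 100, 1)
        for f in fields
    }


def max_index_lag(events):
    """Largest |event time - index time| in seconds (epoch values assumed)."""
    return max((abs(e["_time"] - e["_indextime"]) for e in events), default=0)


# Invented sample: one empty dest_ip, one missing src_ip, one 400 s lag.
sample = [
    {"src_ip": "10.0.0.5", "dest_ip": "203.0.113.10", "action": "blocked",
     "_time": 1000, "_indextime": 1004},
    {"src_ip": "10.0.0.6", "dest_ip": "", "action": "allowed",
     "_time": 2000, "_indextime": 2400},
    {"src_ip": None, "dest_ip": "198.51.100.7", "action": "allowed",
     "_time": 3000, "_indextime": 3002},
    {"src_ip": "10.0.0.8", "dest_ip": "198.51.100.9", "action": "blocked",
     "_time": 4000, "_indextime": 4001},
]
print(field_coverage(sample))
print(max_index_lag(sample))
```

A lag above the 300-second threshold used in the search usually points to a time zone or TIME_FORMAT mismatch rather than genuine transport delay.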

Step 5: Enable Detection Coverage

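# Verify the new source populates the Authentication data model
# (tstats is a generating command, so it must be the first command in the search)
| tstats count from datamodel=Authentication where index=new_source sourcetype=new_sourcetype by _time span=1h
| timechart span=1h sum(count) as count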

# Create source-specific detection rule (savedsearches.conf stanza)
[New Source - Authentication Anomaly]
search = index=new_source sourcetype=new_sourcetype action=failure \
| stats count by src_ip, user \
| where count > 10
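
The logic of that rule can be prototyped outside the SIEM to check the threshold against sample data. A minimal sketch, assuming normalized events as dicts with `src_ip`, `user`, and `action` keys (the events below are synthetic):

```python
from collections import Counter


def failed_login_offenders(events, threshold=10):
    """Return {(src_ip, user): count} for pairs with more than `threshold` failures."""
    counts = Counter(
        (e["src_ip"], e["user"]) for e in events if e.get("action") == "failure"
    )
    return {pair: n for pair, n in counts.items() if n > threshold}


# Synthetic data: only the first pair exceeds the threshold of 10.
events = (
    [{"src_ip": "203.0.113.10", "user": "alice", "action": "failure"}] * 11
    + [{"src_ip": "203.0.113.10", "user": "bob", "action": "failure"}] * 10
    + [{"src_ip": "198.51.100.7", "user": "alice", "action": "success"}] * 50
)
print(failed_login_offenders(events))
```

As in the saved search, the comparison is strict, so exactly 10 failures does not trigger. The rule as written also has no time window, so the scheduled search's time range effectively defines one.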

Onboarding Checklist

  • [ ] Log source assessed and approved
  • [ ] Network connectivity verified
  • [ ] Collection agent/method configured
  • [ ] Log forwarding confirmed
  • [ ] Parser/field extraction configured
  • [ ] CIM compliance validated
  • [ ] Data model acceleration enabled
  • [ ] Volume within license budget
  • [ ] Retention policy configured
  • [ ] Detection rules enabled/created
  • [ ] Dashboard updated
  • [ ] Documentation completed
  • [ ] SOC team notified
