SIEM Complete Guide: Security Monitoring & Log Analysis 2024

Key Takeaways

SIEM aggregates logs from all sources
Correlation rules detect complex attacks
Log quality is more important than quantity
Use cases must align with threat model

1. What is SIEM?
2. SIEM Architecture
3. Major SIEM Platforms
4. Critical Log Sources
5. Detection Use Cases
6. Query Examples
7. Alert Tuning
8. Best Practices

1. What is SIEM?

Security Information and Event Management (SIEM) combines security information management (SIM) and security event management (SEM). It collects logs from across the enterprise, normalizes them, correlates events, and generates alerts for potential threats.

SIEM Capabilities

Log aggregation: Collect from all sources
Normalization: Standard format for analysis
Correlation: Connect related events
Alerting: Real-time threat notification
Dashboards: Visibility into security posture
Forensics: Historical investigation
Compliance: Audit trails and reporting

2. SIEM Architecture

# Typical SIEM Flow:
Log Sources → Collection → Parsing → Normalization → Storage → Correlation → Alerting

# Components:
# - Collectors/Forwarders: Gather logs
# - Indexer: Store and index data
# - Search Head: Query interface
# - Correlation Engine: Rule processing
# - Case Management: Incident tracking
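
The parsing and normalization stages of the flow above can be sketched in Python. This is only an illustration, not how any particular SIEM implements it: the common schema (timestamp, source, src_ip, user, action) and the JSON field names (epoch, app, client, username, ok) are assumptions made up for the example. The sshd message follows the usual OpenSSH "Failed password" wording.

```python
import json
import re
from datetime import datetime, timezone
from typing import Optional

# Matches OpenSSH messages such as:
# "Failed password for invalid user admin from 203.0.113.5 port 51234 ssh2"
SSH_FAILED = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) from (?P<src_ip>\S+)"
)

def normalize_ssh(line: str, timestamp: datetime) -> Optional[dict]:
    """Map an sshd 'Failed password' message to the (hypothetical) common schema."""
    m = SSH_FAILED.search(line)
    if m is None:
        return None  # not a failed-password message
    return {
        "timestamp": timestamp.astimezone(timezone.utc).isoformat(),
        "source": "sshd",
        "src_ip": m.group("src_ip"),
        "user": m.group("user"),
        "action": "login_failure",
    }

def normalize_json_event(raw: str) -> dict:
    """Map a JSON application event (hypothetical field names) to the same schema."""
    event = json.loads(raw)
    ts = datetime.fromtimestamp(event["epoch"], tz=timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "source": event["app"],
        "src_ip": event["client"],
        "user": event["username"],
        "action": "login_success" if event["ok"] else "login_failure",
    }
```

Once both sources share field names and a UTC timestamp, a single correlation rule can run over events from either one.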

3. Major SIEM Platforms

Splunk

Industry leader, powerful SPL query language, extensive app ecosystem. Enterprise pricing.

Elastic Security (ELK)

Open-source core, Elasticsearch backend, Kibana visualization. Cost-effective at scale.

Microsoft Sentinel

Cloud-native SIEM, Azure integration, KQL queries, SOAR built-in.

4. Critical Log Sources

Windows Event Logs: Security, System, PowerShell
Active Directory: Authentication, group changes
Firewall/IDS: Network traffic, blocks
Endpoint (EDR): Process, file, network events
Web servers: Access logs, errors
Cloud: AWS CloudTrail, Azure Activity Log, Google Cloud Logging
Email gateway: Phishing, spam, blocks
VPN/Proxy: Remote access, web activity

5. Detection Use Cases

Authentication Attacks

# Brute Force Detection (Splunk)
index=windows EventCode=4625 
| stats count by src_ip, user 
| where count > 10

# Password Spraying
index=windows EventCode=4625 
| stats dc(user) as users by src_ip 
| where users > 5
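
For readers who want to see the logic behind the two queries above outside of SPL, here is a minimal Python sketch. It assumes events have already been normalized into dicts with the hypothetical keys event_code, src_ip, and user. The thresholds mirror the queries (more than 10 failures per source and user, more than 5 distinct users per source).

```python
from collections import defaultdict

def detect_auth_attacks(events, fail_threshold=10, spray_threshold=5):
    """Return (brute_force, spraying) computed from failed-logon events (4625).

    brute_force: {(src_ip, user): failure count} where count > fail_threshold
    spraying:    {src_ip: distinct user count} where count > spray_threshold
    """
    failures = defaultdict(int)
    users_by_ip = defaultdict(set)
    for e in events:
        if e["event_code"] != 4625:
            continue  # only failed logons are relevant here
        failures[(e["src_ip"], e["user"])] += 1
        users_by_ip[e["src_ip"]].add(e["user"])
    brute_force = {k: n for k, n in failures.items() if n > fail_threshold}
    spraying = {ip: len(u) for ip, u in users_by_ip.items() if len(u) > spray_threshold}
    return brute_force, spraying
```

The two detections differ only in what is counted: brute force counts events per target, spraying counts distinct targets per source.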

Lateral Movement

# PsExec Detection
index=windows EventCode=7045 ServiceName="PSEXESVC"

# Remote Service Creation (service binary outside System32)
# Wildcards work in search, not in where comparisons
index=windows EventCode=4697 
| search ServiceFileName!="*System32*"

Data Exfiltration

# Large outbound transfers (more than 100 MB total per source/destination pair)
index=firewall action=allowed
| stats sum(bytes_out) as total by src_ip, dest_ip
| where total > 100000000

# DNS tunneling (high query volume)
index=dns query_type=TXT 
| stats count by src_ip 
| where count > 1000
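
One point worth illustrating for transfer-volume detection: applying the threshold to the total per source/destination pair, rather than to each individual event, also catches transfers split into many smaller sessions. The sketch below assumes normalized firewall events with the hypothetical keys action, src_ip, dest_ip, and bytes_out.

```python
from collections import Counter

def large_outbound(events, threshold_bytes=100_000_000):
    """Return {(src_ip, dest_ip): total bytes} for pairs whose allowed
    outbound traffic adds up to more than threshold_bytes."""
    totals = Counter()
    for e in events:
        if e["action"] == "allowed":
            totals[(e["src_ip"], e["dest_ip"])] += e["bytes_out"]
    return {pair: total for pair, total in totals.items() if total > threshold_bytes}
```

In the example data used to check this sketch, four 30 MB sessions to the same destination are flagged as 120 MB in total, although no single session crosses the 100 MB line.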

6. Query Examples

Splunk (SPL)

# Failed logins followed by success
index=windows EventCode=4625 OR EventCode=4624
| transaction user maxspan=5m 
| search EventCode=4625 EventCode=4624

# Process with encoded commands (case-insensitive, since PowerShell parameters ignore case)
index=windows EventCode=4688 
| where match(CommandLine, "(?i)-(enc|e )")

# Suspicious PowerShell
index=windows EventCode=4104 ScriptBlockText="*Invoke-Mimikatz*"
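
The "failed logins followed by success" query above is a sequence correlation: order and time window both matter. A minimal Python sketch of that idea follows. It assumes events are dicts with the hypothetical keys user, event_code, and time (a datetime), and it is a simplification of what the transaction command does, not a reimplementation of it.

```python
from datetime import datetime, timedelta

def failures_then_success(events, window=timedelta(minutes=5), min_failures=1):
    """Return users whose successful logon (4624) was preceded by at least
    min_failures failed logons (4625) within the given window."""
    recent_failures = {}  # user -> timestamps of failures since the last success
    flagged = set()
    for e in sorted(events, key=lambda ev: ev["time"]):
        user, t = e["user"], e["time"]
        if e["event_code"] == 4625:
            recent_failures.setdefault(user, []).append(t)
        elif e["event_code"] == 4624:
            in_window = [f for f in recent_failures.get(user, []) if t - f <= window]
            if len(in_window) >= min_failures:
                flagged.add(user)
            recent_failures[user] = []  # a success resets the failure history
    return flagged
```

Raising min_failures is one way to keep ordinary mistyped passwords from being flagged.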

Elastic Security (KQL: Kibana Query Language)

// Failed Windows logins
event.code: 4625 and winlog.event_data.LogonType: 3

// Process creation
event.category: process and process.name: powershell.exe

Microsoft Sentinel (KQL: Kusto Query Language)

// Brute force detection
SecurityEvent
| where EventID == 4625
| summarize FailedLogins=count() by TargetAccount, IpAddress, bin(TimeGenerated, 5m)
| where FailedLogins > 10

7. Alert Tuning

Baseline normal: Understand legitimate activity
Whitelist known-good: Exclude expected alerts
Adjust thresholds: Balance detection vs noise
Add context: Enrich with asset/user info
Risk scoring: Prioritize high-impact alerts
Feedback loop: Learn from false positives
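
Several of the tuning steps above (excluding known-good sources, enriching with asset context, risk scoring) can be combined in a small triage function. The sketch below is only one possible shape: the alert keys (src_ip, asset, severity), the criticality weights, and the score cut-off are all assumptions chosen for illustration.

```python
def triage(alerts, allowlist, asset_criticality, min_score=50):
    """Drop alerts from allowlisted sources, score the rest as
    severity x asset criticality, and return those at or above
    min_score, highest risk first."""
    scored = []
    for alert in alerts:
        if alert["src_ip"] in allowlist:
            continue  # known-good source, e.g. an approved vulnerability scanner
        weight = asset_criticality.get(alert["asset"], 1)  # unknown assets get weight 1
        score = alert["severity"] * weight
        if score >= min_score:
            scored.append({**alert, "risk_score": score})
    return sorted(scored, key=lambda a: a["risk_score"], reverse=True)
```

Keeping the allowlist and weights as data rather than hard-coding them in rules makes the feedback loop easier: a confirmed false positive becomes a one-line change.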

8. Best Practices

✅ Define clear use cases aligned with threats
✅ Prioritize quality logs over quantity
✅ Tune aggressively to reduce noise
✅ Integrate threat intelligence
✅ Automate response with SOAR
✅ Regular rule reviews and updates
✅ Document playbooks for each alert type

FAQ

How much log storage do I need?

It depends on environment size and retention needs. Start with 90 days of hot storage and one year of cold storage. Expect 50-500 GB/day for a medium enterprise.
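
To turn those figures into capacity numbers, a rough calculation helps. The sketch below assumes the cold tier holds a full 365 days of data, uses decimal units (1 TB = 1000 GB), and takes an optional compression ratio that is purely a placeholder. Real ratios vary by platform and log type.

```python
def storage_estimate_tb(daily_gb, hot_days=90, cold_days=365, compression_ratio=1.0):
    """Return (hot_tb, cold_tb) for a given daily ingest volume in GB.

    compression_ratio is raw size divided by stored size; 1.0 means no compression.
    """
    hot_tb = daily_gb * hot_days / compression_ratio / 1000
    cold_tb = daily_gb * cold_days / compression_ratio / 1000
    return hot_tb, cold_tb

low = storage_estimate_tb(50)    # lower end of the quoted range
high = storage_estimate_tb(500)  # upper end of the quoted range
```

Without compression, 50 GB/day works out to 4.5 TB hot and 18.25 TB cold, and 500 GB/day to 45 TB hot and 182.5 TB cold.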
