Data Collection

Scalable ingestion from anywhere. Zero configuration chaos.

Collect logs from virtually any source: cloud, on-prem, endpoint, or network. Graylog Log Collection gives you full visibility with high-throughput ingestion, real-time normalization, and centralized control via Sidecar. Built to scale with hybrid environments and teams that don’t have time for complexity.

Graylog Data Collection Highlights

Any Source. Any Scale.

From dev clusters to global cloud estates—collect it all without breaking a sweat.

Config Once. Apply Everywhere.

Push logging policies fleet-wide—no more agent wrangling.

Logs That Speak Human

Normalize chaos into clarity for faster investigations and simpler searches.

Graylog Data Collection — A Closer Look

Log Collection adds critical context by enriching logs with user identity, location, and device data and by automatically applying a consistent schema. That means cleaner aggregation, smarter analysis, and clearer insights across performance and security events.

Input Types

To support the diverse nature of log data, Graylog ingests a wide range of input types:

  • Syslog – Standard format for network and Unix systems. Supports UDP & TCP.
  • CEF – Vendor-agnostic format ideal for security data.
  • GELF – Native to Graylog. Supports structured logging over UDP, TCP, and HTTP.
  • Beats – Lightweight Elastic agents for logs and metrics.
  • IPFIX / NetFlow – Monitor network flow and performance.
  • Plain Text – Flexible, no-frills ingestion.
  • Kafka – Real-time log ingestion from distributed systems.
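
As an illustration of the GELF input listed above, here is a minimal sketch of building a GELF 1.1 message and sending it over UDP. It assumes a GELF UDP input is listening on 127.0.0.1 port 12201 (a commonly used port, but an assumption about your setup); the function names and the example custom fields are hypothetical and not part of any Graylog library.

```python
import json
import socket
import time


def build_gelf_message(host, short_message, level=6, **extra_fields):
    """Build a GELF 1.1 payload as a dict.

    Custom fields are prefixed with an underscore, as the GELF format expects.
    """
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 = informational
    }
    for key, value in extra_fields.items():
        message["_" + key] = value
    return message


def send_gelf_udp(message, server="127.0.0.1", port=12201):
    """Send one uncompressed, unchunked GELF message over UDP.

    The server address and port are assumptions; adjust them to your input.
    """
    payload = json.dumps(message).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))
    return len(payload)
```

Calling `send_gelf_udp(build_gelf_message("web-01", "User login failed", level=4, user="alice"))` would send a single warning-level message with a custom `_user` field. Large messages would need chunking or a TCP/HTTP input, which this sketch does not cover.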

Input Types Available in: Graylog Security  |  Graylog Enterprise  |  Graylog Open   —  Compare Plans

*Feature capabilities vary by plan.

Graylog can poll external data sources by configuring a Web API input or using a predefined input. The response—typically in JSON format—is ingested as a log message for parsing, storage, and analysis.

Available polling input types include:

  • HTTP API (JSON) – Poll REST APIs for custom app and SaaS logs
  • AWS Inputs
  • Office 365 & Azure Event Hubs Inputs
  • Microsoft Defender Input – Security alerts, endpoint events, vulnerabilities
  • CrowdStrike Falcon Input – EDR logs, auth events, host data
  • F5 BIG-IP Input – Audit logs, daemon/kernel events, traffic insights
  • Google Workspace Input – Gmail logs, auth, Workspace apps
  • Symantec EDR / SES – Threat alerts, endpoint data, policy violations
  • Salesforce Input – Audit trails, API usage, encryption events
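
To make the polling idea concrete, the sketch below shows one way a JSON response could be turned into a flat set of message fields. It is an illustration only and not Graylog's implementation: the function names, the underscore separator, and the shape of the resulting message are all assumptions.

```python
import json


def flatten_json(value, prefix="", separator="_"):
    """Flatten nested JSON objects and arrays into a single-level dict."""
    fields = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = f"{prefix}{separator}{key}" if prefix else str(key)
            fields.update(flatten_json(child, name, separator))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            name = f"{prefix}{separator}{index}" if prefix else str(index)
            fields.update(flatten_json(child, name, separator))
    else:
        # Scalar leaf: store it under the accumulated field name
        fields[prefix] = value
    return fields


def response_to_message(body, source):
    """Turn one polled JSON response body into a log-message-like dict."""
    fields = flatten_json(json.loads(body))
    return {"source": source, "message": body, "fields": fields}
```

For example, a response body of `{"user": {"name": "alice"}}` would yield a field named `user_name`. A real poller would also need scheduling, authentication, and error handling, which are left out here.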

Input Polling Types Available in: Graylog Security  |  Graylog Enterprise  |  Graylog Open   —  Compare Plans

Graylog supports a wide range of agents to collect log files, event logs, metrics, network traffic, and audit data. With Sidecar, you can centrally configure and manage these agents across your environment, streamlining deployment and reducing manual overhead.

Supported agents include:

  • Filebeat – Forward log files
  • Winlogbeat – Collect Windows Security and Event Logs
  • Auditbeat – Gather data from the Linux audit framework
  • Metricbeat – Capture system and application metrics
  • Packetbeat – Capture network protocol and traffic data
  • NXLog – Collect structured logs from Windows, Linux, and applications

Log Shipping Agents Available in: Graylog Security  |  Graylog Enterprise  |  Graylog Open   —  Compare Plans

Graylog Sidecar lets you deploy and manage configurations for Filebeat, Winlogbeat, NXLog, and others—at scale. It’s a lightweight, scalable way to standardize logging levels across environments without touching every host.

Collector Types
Prebuilt default configurations for Winlogbeat, Filebeat, and Auditbeat.

Auto-Assigned Configuration
Sidecar applies default configs automatically when installed on a host, reducing friction for onboarding.

Assignment Tags
Create reusable profiles to apply different logging levels across environments or host groups.
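
Tag-based assignment can be pictured as a simple matching step between the tags on a host and the tags on each configuration. The sketch below is a conceptual model under that assumption, not Sidecar's actual logic or API; the function name, configuration names, and tags are hypothetical.

```python
def resolve_configurations(host_tags, configurations):
    """Return the names of configurations sharing at least one tag with the host."""
    host_tags = set(host_tags)
    return sorted(
        name for name, tags in configurations.items() if host_tags & set(tags)
    )


# Hypothetical configurations, each labeled with the tags it applies to
CONFIGURATIONS = {
    "filebeat-linux-default": {"linux"},
    "filebeat-nginx-verbose": {"linux", "web"},
    "winlogbeat-default": {"windows"},
}
```

Under this model, a host tagged `linux` and `web` would receive both Filebeat configurations, while a host tagged only `windows` would receive the Winlogbeat one.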

Sidecar for Central Configuration Available in:  Graylog Security  |  Graylog Enterprise  |  Graylog Open  —  Compare Plans

Why Choose Graylog Data Collection

Scalable Ingestion

  • Handles SMB to enterprise-scale log volumes
  • Supports hybrid, on-prem, and cloud-native environments

Unified Management

  • Centrally manage configs across thousands of agents
  • Assign logging levels by tag or host group

Broad Source Coverage

  • Ingest from cloud, EDR, SIEM, firewalls, and more
  • Normalize logs for consistent downstream analysis

Learn More About Data Collection in Graylog

Graylog Data Collection is a centralized, scalable solution for ingesting logs from cloud, on-prem, endpoint, and network sources. It simplifies log collection across complex environments using real-time normalization, flexible input types, and centralized configuration via Sidecar.

Graylog supports log collection from virtually any source, including cloud services (like AWS, Office 365, Google Workspace), endpoint detection and response (EDR) platforms, firewalls, and local or remote servers. It handles structured and unstructured logs through input types like Syslog, GELF, CEF, Beats, NetFlow/IPFIX, Kafka, and plain text.

Graylog is built for scalability—from SMB environments to enterprise-wide deployments. It supports high-throughput ingestion, real-time parsing, and normalization across hybrid environments, ensuring performance without sacrificing control.

Inputs in Graylog define how data is ingested into the system. You can configure inputs like Syslog (UDP/TCP), GELF (HTTP/UDP/TCP), and Beats to capture log data from various sources. Inputs can be managed at the node or cluster level and adjusted via the Graylog web interface.

Graylog supports input polling through HTTP APIs and built-in integrations. It can poll REST APIs or services like AWS, Azure Event Hubs, Microsoft Defender, Salesforce, Google Workspace, and others, and ingest the responses (typically JSON) as log data automatically.

Graylog supports agents like Filebeat (log files), Winlogbeat (Windows Event Logs), Auditbeat (Linux audit framework), Metricbeat (metrics), Packetbeat (network data), and NXLog. These agents can be deployed manually or managed centrally using Graylog Sidecar.

Graylog Sidecar is a lightweight configuration manager that allows centralized deployment of logging configurations across hosts. It supports agents like Winlogbeat, Filebeat, and NXLog and automatically applies logging policies, reducing manual setup and ensuring consistency.

Graylog automatically applies a consistent schema to ingested logs, enriching them with metadata like user identity, device, and location. This enhances downstream analysis and provides clearer insights into performance and security events.

Graylog is designed for hybrid, cloud-native, and on-premises infrastructures. It supports ingestion from distributed systems and can scale horizontally to accommodate growing data needs.

Graylog supports secure data transmission, access control, and logging configuration via tags and role-based management. It integrates with security data sources and tools to provide visibility and control over sensitive event data.