If you have ever built a LEGO set, then you have a general idea of how telemetry works. Telemetry starts with individual data points, just like your LEGO build starts with a box of bricks. In complex IT environments, your security telemetry is spread across different technologies and monitoring tools, just like in a large build your LEGO bricks come separated into smaller, individually numbered bags. In both cases, the individual bricks or data points aren’t special. However, as you follow the LEGO instructions or incorporate analytics into your monitoring, the individual pieces combine to form the overall structure you need.
By understanding what telemetry is and how to use it for security, IT and security teams can use the data that their environments generate to create proactive security programs.
What is telemetry?
Telemetry is the science of measuring something, transmitting the results to a remote location, and then interpreting them. In cybersecurity, telemetry refers to the security data that an organization's systems, networks, applications, and devices generate. Security telemetry is often derived from log data, the information that technologies create about the activities affecting them.
Security telemetry comes from IT and cybersecurity technologies across the environment, including:
- Web applications and application programming interfaces (APIs), like user and performance data
- Network devices, like routers and firewalls
- Identity and access management (IAM) tools
- Databases, including on-premises and cloud locations
- Workstations and mobile devices, like laptops, smartphones, and tablets
Why is telemetry important?
On its own, telemetry is nothing more than raw data. When you collect, parse, normalize, aggregate, and analyze telemetry, the whole becomes greater than the sum of its individual parts. Telemetry enables IT and security teams to improve:
- Performance and efficiency: using analytics for proactive identification of security vulnerabilities or prediction of system maintenance activities
- Risk management: monitoring for security or operational abnormalities that can lead to business interruption and service outages
- Decision-making: using insights to understand current security and operations posture to find areas of improvement and determine future investments
- Threat hunting: aggregating data points to identify indicators of compromise (IoCs) that could reveal advanced persistent threats (APTs) hiding in systems
- Compliance: aggregating and analyzing data to document and report on whether controls function as intended
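As a simple illustration of the threat hunting use case above, aggregated telemetry can be matched against a feed of known indicators of compromise. This is a minimal sketch; the indicator values, field names, and connection records are all hypothetical:

```python
# Match observed connection telemetry against a known IoC list.
# Feed contents and record fields below are illustrative, not real data.
known_iocs = {"203.0.113.7", "evil.example.net"}

connections = [
    {"host": "laptop-12", "dest": "203.0.113.7"},
    {"host": "laptop-12", "dest": "intranet.local"},
    {"host": "server-3", "dest": "evil.example.net"},
]

# Any destination that appears in the IoC feed is a candidate lead.
hits = [c for c in connections if c["dest"] in known_iocs]
for hit in hits:
    print(f"IoC match: {hit['host']} -> {hit['dest']}")
```

In practice the indicator feed would come from threat intelligence sources and the connection records from your aggregated network telemetry, but the core operation remains this kind of lookup.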
What are the types of security telemetry?
Security telemetry refers to the continuous monitoring and analysis of security events within information systems. By collecting detailed information on network traffic, user activities, and system logs, security telemetry enables you to create baselines that define normal behaviors and alert you to anomalous activities that might indicate a potential security incident.
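The baseline-and-anomaly idea described above can be sketched in a few lines. The metric (daily failed-login counts per account) and the three-standard-deviation threshold are illustrative assumptions, not a prescribed method:

```python
from statistics import mean, stdev

def build_baseline(history: list[float]) -> tuple[float, float]:
    """Define 'normal' as the mean and standard deviation of past observations."""
    return mean(history), stdev(history)

def is_anomalous(value: float, baseline: tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma

# Hypothetical daily failed-login counts for one account.
history = [4, 6, 5, 7, 5, 6, 4, 5]
baseline = build_baseline(history)
print(is_anomalous(5, baseline))    # a typical day -> False
print(is_anomalous(250, baseline))  # a burst of failures -> True
```

Real monitoring platforms use far richer models, but the principle is the same: collect telemetry, establish what normal looks like, and alert on deviations.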
Network Telemetry
Network telemetry helps your network monitoring by aggregating data from sources like:
- Firewalls and Next-Generation Firewalls (NGFW)
- Routers
- Switches
- Domain Name System (DNS) servers
These technologies generate data that provides insight into:
- Traffic patterns: inbound and outbound communications
- Latency: request and response times
- Usage: resources and ports accessed
- Health: CPU and memory use and device uptime
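To make the latency and usage insights above concrete, here is a minimal sketch that aggregates hypothetical flow records into per-device summaries. The field names are illustrative assumptions, not any vendor's export format:

```python
from collections import defaultdict

# Hypothetical flow records; real fields vary by device and vendor.
records = [
    {"device": "fw-1", "latency_ms": 12, "dst_port": 443},
    {"device": "fw-1", "latency_ms": 30, "dst_port": 443},
    {"device": "rtr-1", "latency_ms": 8, "dst_port": 53},
]

# Collect latency samples and distinct ports accessed per device.
latency = defaultdict(list)
ports = defaultdict(set)
for r in records:
    latency[r["device"]].append(r["latency_ms"])
    ports[r["device"]].add(r["dst_port"])

for device, samples in latency.items():
    avg = sum(samples) / len(samples)
    print(f"{device}: avg latency {avg:.1f} ms, ports {sorted(ports[device])}")
```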
Endpoint Telemetry
Endpoint telemetry helps you manage devices by aggregating data from sources like:
- Workstations
- Servers
- Mobile Device Management (MDM)
- Endpoint detection and response (EDR)
- Antivirus and antimalware tools
- Vulnerability scanners
These technologies generate data that provides insight into:
- Configurations: updated settings that limit unnecessary functionality
- Vulnerabilities: known security issues that require installation of updates
- Anomalous behavior: programs running that might indicate malware infection
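One simple way to act on the anomalous-behavior insight above is to compare the processes running on an endpoint against a known-good baseline. This is a sketch under that assumption; the process names are hypothetical examples:

```python
# Flag running processes that do not appear in an allowlist baseline.
# Both sets below are illustrative, not real endpoint data.
baseline = {"explorer.exe", "chrome.exe", "svchost.exe"}
running = {"explorer.exe", "chrome.exe", "cryptominer.exe"}

# Anything outside the baseline is a candidate for malware triage.
unexpected = running - baseline
print(sorted(unexpected))  # ['cryptominer.exe']
```

EDR tools perform much more sophisticated behavioral analysis, but allowlist comparison remains a useful first-pass signal.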
Application Telemetry
Application telemetry provides insights about web applications and their connected APIs by aggregating data from sources like:
- Applications and their servers
- Web Application Firewalls (WAF)
- API Gateways
- IAM tools
- Network devices
- API security tools
These technologies generate data that provides insight into:
- User access: who authenticates into applications and whether their access is limited to only what they need for completing job functions
- Credential-based attacks: identification of failed user logins indicating potential security incidents, like credential stuffing attacks
- API vulnerabilities: security weaknesses, like the ones listed in the OWASP API Security Top 10 list
- API attacks: malicious activity targeting API vulnerabilities
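The credential-based attack insight above can be illustrated with a short sketch: credential stuffing often shows one source IP failing logins across many different accounts. The events, field layout, and threshold below are all illustrative assumptions:

```python
from collections import defaultdict

# Hypothetical authentication events: (source_ip, username, outcome).
events = [
    ("198.51.100.9", "alice", "fail"),
    ("198.51.100.9", "bob", "fail"),
    ("198.51.100.9", "carol", "fail"),
    ("203.0.113.4", "alice", "success"),
]

# Track how many DISTINCT accounts each source IP has failed against.
failed_users = defaultdict(set)
for ip, user, outcome in events:
    if outcome == "fail":
        failed_users[ip].add(user)

THRESHOLD = 3  # illustrative cutoff; tune to your environment
suspects = [ip for ip, users in failed_users.items() if len(users) >= THRESHOLD]
print(suspects)  # ['198.51.100.9']
```

Counting distinct accounts rather than raw failures helps distinguish credential stuffing from a single user mistyping their own password.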
Cloud Telemetry
Cloud telemetry provides insights into system performance, resource utilization, and application health by aggregating data from cloud services and the infrastructure that hosts them.
These technologies generate data that provides insight into:
- Misconfigurations: settings that attackers can exploit to achieve their objectives
- Resource and usage costs: memory, CPU, and execution times to understand resource allocation, scaling, and optimization
- Reliability: whether the application's design and architecture maintain availability
- Performance: bottlenecks, latency issues, or resource constraints
- Vulnerabilities: programming errors that create exploitable weaknesses
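A misconfiguration check like the one described above can be sketched as a policy test over resource descriptions. The resource fields here are illustrative assumptions, not any cloud provider's actual API:

```python
# Check hypothetical cloud resource descriptions against a simple policy.
resources = [
    {"name": "logs-bucket", "public": False, "encrypted": True},
    {"name": "backup-bucket", "public": True, "encrypted": False},
]

def findings(resource: dict) -> list[str]:
    """Return a list of policy violations for one resource."""
    issues = []
    if resource["public"]:
        issues.append("publicly accessible")
    if not resource["encrypted"]:
        issues.append("encryption disabled")
    return issues

for r in resources:
    for issue in findings(r):
        print(f"{r['name']}: {issue}")
```

Cloud security posture management tools apply hundreds of such checks continuously; the sketch just shows the shape of the comparison between observed configuration and intended policy.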
Why is security telemetry challenging?
Many companies struggle to manage and correlate security telemetry because their technologies generate overwhelming amounts of data.
High Storage Costs
The high volumes of data that your environment generates can become prohibitively expensive to store. Many organizations struggle with high security information and event management (SIEM) costs, especially as they adopt more cloud-native technologies that generate even more data. At the same time, you likely need to retain some data to meet compliance and retention requirements, which can leave you juggling multiple storage locations.
Data Ingestion Decisions
Additionally, not all data is equally valuable. For example, you may need packet data for a forensics investigation but not for everyday monitoring. High storage costs often force difficult decisions about which data you forward to your security monitoring solution, and those trade-offs can create blind spots.
Different Log Formats
Logs don’t have a standard format, creating challenges when correlating security telemetry to gain insights. Some examples of log formats include:
- Windows event logs: Microsoft’s proprietary format
- JavaScript Object Notation (JSON): highly readable format, often used for structured logging
- Common Event Format (CEF): text-based, extensible open logging and auditing format
To correlate the data that your technologies generate, you need to parse and normalize the logs before you can correlate and analyze them.
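The parse-and-normalize step can be sketched for two of the formats above. The common schema and its field names are illustrative assumptions, and the CEF extension parsing is deliberately naive (it would break on values containing spaces):

```python
import json

def parse_cef(line: str) -> dict:
    """Parse a CEF line (Version|Vendor|Product|Version|SigID|Name|Severity|Extension)."""
    parts = line.split("|", 7)
    event = {"vendor": parts[1], "product": parts[2],
             "name": parts[5], "severity": parts[6]}
    # The extension is space-separated key=value pairs; this naive split
    # assumes no values contain spaces.
    if len(parts) > 7:
        for pair in parts[7].split():
            key, _, value = pair.partition("=")
            event[key] = value
    return event

def normalize(raw: str) -> dict:
    """Map a JSON or CEF event onto one illustrative common schema."""
    if raw.startswith("CEF:"):
        cef = parse_cef(raw)
        return {"source_ip": cef.get("src"), "action": cef.get("name"),
                "severity": cef.get("severity")}
    data = json.loads(raw)
    return {"source_ip": data.get("client_ip"), "action": data.get("event"),
            "severity": str(data.get("level"))}

cef_line = "CEF:0|Acme|FW|1.0|100|blocked|7|src=10.0.0.5 dst=10.0.0.9"
json_line = '{"client_ip": "10.0.0.5", "event": "blocked", "level": 7}'
print(normalize(cef_line) == normalize(json_line))  # True: same normalized record
```

Once events from different sources share one schema, correlation becomes a matter of querying common fields rather than untangling format differences.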
Graylog for Security and Operations: Using Telemetry and Managing Data Effectively
Graylog ensures scalability as your data grows to reduce total cost of ownership (TCO). Our platform’s innovative data tiering and data pipeline management capability facilitates efficient data storage management by automatically organizing data to optimize access and minimize costs without compromising performance.
With frequently accessed data kept on high-performance systems and less active data in more cost-effective storage solutions, you can leverage Graylog Security's built-in content to uplevel your threat detection, investigation, and response (TDIR) processes. Our solution combines MITRE ATT&CK's knowledge base of adversary behavior with vendor-agnostic Sigma rules so you can rapidly respond to incidents, improving key cybersecurity metrics. By combining the power of MITRE ATT&CK and Sigma rules, you can spend less time developing custom cyber content and more time focusing on more critical tasks.
To learn how Graylog can help you cost-effectively optimize your telemetry, contact us today or watch a demo.