Internet of Things (IoT) devices are everywhere you look. From the smartwatch on your wrist to the security cameras protecting your offices, connected IoT devices transmit all kinds of data. However, these compact devices are different from the other technologies your organization uses. Unlike traditional devices, IoT devices lack a standardized set of security capabilities, making them easier for attackers to exploit. Additionally, they are more sensitive, easily dropping off a network when probed, which can create operational, cybersecurity, and physical safety risks depending on how you deploy them.
IoT logging can provide insights, but cloud service providers offer different capabilities for managing those logs. By understanding the IoT logging formats and capabilities across Azure and Amazon Web Services (AWS), you can incorporate the logs into your overarching monitoring program to help manage security and operational risks.
Why is IoT logging important?
IoT logging captures and records events, errors, and performance metrics that the devices generate for insights into:
- System health
- Anomaly detection
- Performance
- Device functionality
With IoT logging, organizations minimize service disruptions while gaining insights and benefits for:
- Attack surface management: Each device and its connected application adds more access points that attackers can exploit.
- Operational efficiency: IoT devices help manage and maintain equipment, like sensors that help identify issues with machines on a manufacturing floor.
- Insights from analytics: IoT devices gather data about customer and business processes, like foot traffic in a retail store.
What types of logs do IoT devices generate?
Although no standard for IoT device logging exists, most devices generate information about:
- Status: Device state, including online, offline, transmitting, or error
- Error: Alert for monitoring purposes that might include error type or locations
- Authentication: Information about user login, including failed logins
- Memory dump: Details about device failures
- Configuration: Device characteristics, like settings or firmware version
What are some challenges with IoT logging?
While IoT event logs are important, many organizations struggle with the challenges that these unique devices create, including:
- Network dropping: Network congestion and poor connections with wireless devices can lead to gaps in log collection.
- High data volumes: IoT devices can generate high volumes of log data that make storing, aggregating, and analyzing difficult.
- Diverse formats: Lack of standard log format makes comparing and analyzing data difficult.
- Minimal hardware resources: Pulling log data from IoT devices can disrupt their connectivity, and IP-based tracking may not work.
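To make the network dropping challenge concrete, here is a minimal sketch of how you might spot gaps in log collection for a single device. The function name `find_log_gaps`, the five-minute threshold, and the sample timestamps are all assumptions made for illustration, not part of any vendor's tooling.

```python
from datetime import datetime, timedelta


def find_log_gaps(timestamps, max_silence):
    """Return (start, end) pairs where consecutive log entries
    are further apart than max_silence."""
    ordered = sorted(timestamps)
    gaps = []
    # Compare each timestamp with the one that follows it.
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_silence:
            gaps.append((earlier, later))
    return gaps


# Hypothetical arrival times of log entries from one device.
seen = [
    datetime(2024, 1, 1, 12, 0),
    datetime(2024, 1, 1, 12, 1),
    datetime(2024, 1, 1, 12, 9),
    datetime(2024, 1, 1, 12, 10),
]

# Assumed threshold: treat more than five minutes of silence as a gap.
gaps = find_log_gaps(seen, timedelta(minutes=5))
for start, end in gaps:
    print(f"no logs between {start.isoformat()} and {end.isoformat()}")
```

A gap like this does not prove the device failed. It only tells you where log collection was interrupted so you can investigate the connection.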
Understanding IoT Log Formats Across Cloud Service Providers
As more organizations adopt IoT devices, cloud service providers (CSPs) offer IoT monitoring solutions. However, the log formats and options differ across Azure and AWS, so you should understand each one. Google Cloud officially retired its IoT Core service in August 2023.
AWS IoT Log Entries
Every component of AWS IoT generates log entries that have an eventType explaining the reason for the log. If you’re using CloudWatch, some examples of useful log events include:
- Connect: Information about a device connecting to the system, includes the client ID, principal ID, protocol used, source IP address, source port
- Disconnect: Information about a device disconnecting from the system, includes client ID, principal ID, protocol used, source IP, source port, reason for disconnecting, details of error
- GetRetainedMessage: Payload and details of a single retained message for a specified topic, includes date and time when AWS IoT stored retained message, protocol used, Quality of Service (QoS) level, subscribed topic name
- ListRetainedMessage: Summary information about retained messages, includes protocol used
- Publish-In: Receipt of message, includes client ID, principal ID, protocol used, whether the message was retained, source IP, source port, topic names
- Publish-Out: Message sent out, includes client ID, principal ID, protocol used, source IP, source port, topic names
- Queued: Message stored for a device whose persistent session disconnected, includes client ID, details of disconnect, protocol used, QoS, topic name
- Subscribe: Client subscribes to topic, includes client ID, principal ID, protocol used, source IP, source port, topic name
- Unsubscribe: Client unsubscribes from a topic, includes protocol used, client ID, principal ID, source IP, source port
- RetrieveOCSPStapleData: Server success/failure retrieving Online Certificate Status Protocol (OCSP) responses, includes failure reason, domain configuration name, connection details, request details, response details
- FunctionExecution: AWS rules engine rule’s SQL query calls an external function, includes client ID, principal ID, resources used, matching rule’s name, topic name
- RuleExecution: AWS IoT rule action triggered, includes client ID, principal ID, resources used, action triggered, matching rule’s name, subscribed topic
- StartingRuleExecution: AWS IoT rules engine started to trigger rule action, includes client ID, principal ID, action triggered, matching rule, subscribed topic
- GetPendingJobExecution: Job execution request, includes client ID, client token, details from Jobs service, protocol, topic name
- ReportFinalJobExecutionCount: Job completion, includes details from Jobs service, job ID
- StartNextPendingJobExecution: Request to start next pending job, includes client ID, client token, details from Jobs service, protocol used, topic used
- UpdateJobExecution: Request to update job execution sent to Jobs service, includes client ID, client token, details from Jobs service, job ID, protocol used, topic used, job execution version
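As a rough illustration of working with these entries, the sketch below counts log entries by eventType and flags Disconnect events that the client did not initiate. It assumes each entry arrives as one JSON object per line with fields such as `eventType`, `clientId`, and `disconnectReason`. The sample entries are made up for this example and are trimmed to a few fields, so check the field names against your own CloudWatch output before relying on them.

```python
import json
from collections import Counter

# Hypothetical, trimmed log entries (one JSON object per line).
SAMPLE_LINES = [
    '{"eventType": "Connect", "status": "Success", "protocol": "MQTT", '
    '"clientId": "sensor-01", "sourceIp": "203.0.113.10", "sourcePort": 13490}',
    '{"eventType": "Disconnect", "status": "Success", "protocol": "MQTT", '
    '"clientId": "sensor-01", "disconnectReason": "CLIENT_INITIATED_DISCONNECT"}',
    '{"eventType": "Disconnect", "status": "Success", "protocol": "MQTT", '
    '"clientId": "sensor-02", "disconnectReason": "CONNECTION_LOST"}',
]


def summarize(lines):
    """Count entries per eventType and collect disconnects
    that were not initiated by the client."""
    counts = Counter()
    unexpected = []
    for line in lines:
        entry = json.loads(line)
        event_type = entry.get("eventType", "Unknown")
        counts[event_type] += 1
        reason = entry.get("disconnectReason")
        if event_type == "Disconnect" and reason != "CLIENT_INITIATED_DISCONNECT":
            unexpected.append((entry.get("clientId"), reason))
    return counts, unexpected


counts, unexpected = summarize(SAMPLE_LINES)
print(dict(counts))
print(unexpected)
```

Grouping by eventType first keeps the logic simple. You could extend the same loop to watch for failed Connect attempts from unfamiliar source IP addresses.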
Azure IoT Hub
Azure IoT Hub is a managed service for communications between IoT devices and their connected applications. If you’re using Azure, the logging looks different from the Windows Event ID monitoring. The IoT hub supports the following categories of logs:
- C2DCommands: cloud-to-device messages sent and received, and message feedback received
- C2DTwinOperations: service-initiated events on device twins
- Configurations: events and errors for Automatic Device Management features
- Connections: device connect and disconnect events that can help identify unauthorized connection attempts or alert to lost device connections
- Device identity operations: errors that occur when creating, updating, or deleting identity registry entries
- Device streams: request-response interactions to individual devices
- Device telemetry: errors at the IoT hub related to telemetry pipeline
- Direct methods: request-response interactions sent to individual devices
- Distributed tracing: tracking correlation IDs for messages carrying the trace context header
- File upload operations: errors at the IoT hub related to file upload functionality
- Jobs operations: job requests to update device twins or invoke direct methods on multiple devices
- Routes: errors that occur during message route evaluation and endpoint health
- Twin queries: query requests for device twins
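For the Connections category, a small sketch like the following could pull out the devices that disconnected with an error. It assumes the logs are exported as a JSON document with a `records` array, where each record carries `category`, `operationName`, `level`, and a `properties` field that may itself be a JSON-encoded string. That layout and the sample values are assumptions for illustration, so verify them against the logs your own IoT hub produces.

```python
import json


def failed_disconnects(payload):
    """Return device IDs for error-level disconnect events
    in the Connections category."""
    devices = []
    for record in json.loads(payload).get("records", []):
        if record.get("category") != "Connections":
            continue
        props = record.get("properties", {})
        # Assumption: properties may arrive as a JSON-encoded string.
        if isinstance(props, str):
            props = json.loads(props)
        is_disconnect = record.get("operationName") == "deviceDisconnect"
        if is_disconnect and record.get("level") == "Error":
            devices.append(props.get("deviceId"))
    return devices


# Hypothetical export containing one connect and one failed disconnect.
sample = json.dumps({"records": [
    {"time": "2024-01-01T12:00:00Z", "category": "Connections",
     "operationName": "deviceConnect", "level": "Information",
     "properties": json.dumps({"deviceId": "sensor-01"})},
    {"time": "2024-01-01T12:05:00Z", "category": "Connections",
     "operationName": "deviceDisconnect", "level": "Error",
     "properties": json.dumps({"deviceId": "sensor-02"})},
]})

print(failed_disconnects(sample))
```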
Within each category, the Azure IoT Hub supports the IoT device resource metrics as named in the REST API. Some examples of these include:
- commands.egress.abandon.success: number of cloud-to-device (C2D) messages that the device abandoned
- commands.egress.complete.success: number of C2D message deliveries that the device successfully completed
- commands.egress.reject.success: number of C2D messages that the device rejected
- methods.failure: count of all failed C2D direct method calls
- methods.success: count of all successful C2D direct method calls
- connectedDeviceCount: number of devices connected to the IoT hub
- endpoints.egress.builtIn.events: number of times IoT hub routing successfully delivered messages to the built-in endpoint
- deviceDataUsage: bytes transferred to and from IoT hub-connected devices
- cancelJob.failure: count of all failed calls to cancel a job
- cancelJob.success: count of all successful calls to cancel a job
- completed: count of all completed jobs
- failed: count of all failed jobs
- RoutingDataSizeInBytesDelivered: total bytes of messages IoT hub delivers to an endpoint
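Paired success and failure counters like these are often easier to read as a rate. The sketch below shows the arithmetic using the metric names as they are listed above. The counts are invented sample values, and the helper `failure_rate` is a hypothetical name rather than part of any Azure SDK.

```python
def failure_rate(success, failure):
    """Return the share of calls that failed, or 0.0 when there were no calls."""
    total = success + failure
    return failure / total if total else 0.0


# Hypothetical metric values collected over one time window.
metrics = {
    "methods.success": 180,
    "methods.failure": 20,
    "cancelJob.success": 9,
    "cancelJob.failure": 1,
}

method_rate = failure_rate(metrics["methods.success"], metrics["methods.failure"])
cancel_rate = failure_rate(metrics["cancelJob.success"], metrics["cancelJob.failure"])

print(f"direct method failure rate: {method_rate:.1%}")
print(f"cancel job failure rate: {cancel_rate:.1%}")
```

A rate makes it easier to set one alert threshold that still works as the number of devices grows.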
Graylog Operations and Security: Aggregate, Correlate, and Analyze IoT Device Logs
With Graylog, you can build a single source of log information that enables observability and visibility across a complex environment. Graylog ingests all log data, no matter what service generates it, then applies a standardized data model so that you can correlate and analyze all events. Since your IT operations and security teams share the same information, they can communicate more effectively.
Further, with Graylog’s lightning-fast search capabilities, your security and IT teams can get the answers they need, even when they’re searching terabytes of data. Purpose-built for modern log analytics, Graylog gives you the two-for-one solution necessary to improve performance and reduce cybersecurity risk. Our cloud-native capabilities and out-of-the-box security content give your teams the ability to collaborate effectively, reducing service downtime and alert fatigue.
To learn how Graylog can help you save money and respond more effectively to issues, contact us today.