Using MITRE ATT&CK for Incident Response Playbooks

A structured approach to incident response enables you to create consistent, repeatable processes. Your incident response playbook defines responsibilities and guides your security team through a list of activities, reducing uncertainty when an incident occurs. The MITRE ATT&CK Framework outlines the tactics and techniques that threat actors use during different stages of an attack.

 

By incorporating MITRE ATT&CK into their incident response playbooks, organizations can apply insights about attacker motivations and objectives to speed up investigation and response.

What is an incident response playbook?

An incident response playbook is a step-by-step guide containing the standard procedures that security teams use to respond to and resolve incidents. The playbook should cover all stages of incident response, including:

  • Preparation
  • Identification/detection
  • Investigation/analysis
  • Containment
  • Remediation
  • Recovery
  • Lessons learned

 

The playbook includes predefined response actions, decision trees, and checklists tailored to various types of incidents, such as malware infections, data breaches, or unauthorized access attempts. While the specifics of the playbook differ from team to team, most contain similar, fundamental information and steps.

Definition of incident

Since an incident triggers the rest of the activities listed in the playbook, the organization first needs to define the type of event that requires IT and security teams to follow the steps. Additionally, IT and security teams need to consider all types of incidents, not just security incidents. For example, an “incident” could include events that impact service quality, like network speed, outside of the security context.

Assigned roles

Assigning roles and responsibilities in the playbook documents expectations before an incident occurs. By ensuring people know and understand their responsibilities, teams can respond to the incident faster. Some key roles and responsibilities might be:

  • Incident manager: individual responsible for overseeing the response activities, including deciding when to bring in more staff or keeping responders focused on restoring service
  • Senior technical responder: individual who works closely with the incident manager and is responsible for developing hypotheses about the incident, identifying changes, and managing the technical response team
  • Communications manager: individual responsible for writing and sending internal and external messages about the incident

Steps and phases

While each incident is unique, the response process typically follows similar activities across each phase. By outlining the processes that responders should follow for each phase, teams can respond faster and reduce the incident’s impact.

 

For example, an organization might have a workflow with the following steps:

  • Detect: systems generate an alert
  • Open ticket: incident manager initiates response and opens a ticket to assign a responder
  • Assess: incident manager reviews the incident to assess potential impact
  • Investigate: responder investigates the incident’s root cause and identifies impacted assets
  • Respond: responder contains the threat
  • Remediate: responder takes actions to fix the security weakness that led to the incident
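The workflow above can be sketched as a small state machine that records what happened in each step. This is a minimal illustration, not a ticketing integration: the `Phase` and `Incident` names, the `advance` method, and the note strings are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    # Steps taken from the example workflow, in order.
    DETECT = 1
    OPEN_TICKET = 2
    ASSESS = 3
    INVESTIGATE = 4
    RESPOND = 5
    REMEDIATE = 6


@dataclass
class Incident:
    title: str
    phase: Phase = Phase.DETECT
    closed: bool = False
    history: list = field(default_factory=list)

    def advance(self, note: str) -> None:
        """Record a note for the current step, then move to the next one."""
        if self.closed:
            raise ValueError("incident is already closed")
        self.history.append((self.phase, note))
        if self.phase is Phase.REMEDIATE:
            # Remediation is the last step in this sketch.
            self.closed = True
        else:
            self.phase = Phase(self.phase.value + 1)


# Hypothetical usage: walk one incident through the first two steps.
incident = Incident("Password spray against VPN")
incident.advance("Alert generated by detection rule")
incident.advance("Ticket opened and responder assigned")
```

Keeping the history as (step, note) pairs gives the post-incident discussion a simple record of what was done at each step.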

 

Depending on the incident, the activities that responders take may look different. For example, the activities that contain a password spray attack might be different from the ones used for responding to a malware attack.

Templates and checklists

Templates and checklists create consistency across activities and communications. Checklists give responders a set of required or expected activities to complete. Some incident response processes that could benefit from checklists include:

  • Triage and investigation
  • Containment and eradication

Templates ensure that communications remain consistent across various incidents. Some templates that could help include:

  • Management reports
  • Customer emails
  • Internal emails

Post-incident discussions

Discussing “lessons learned” is critical to identifying potential areas of improvement. However, the organization should have clear expectations about when and how to engage in them. For example, the incident response playbook might include:

  • Meetings: people involved, timing
  • Metrics: determining success and areas of improvement
  • Next steps: prioritizing activities aligned to areas of improvement

 

What are some common incident response playbook scenarios?

Scenarios outline the security events that would trigger the organization’s incident response.

 

Distributed Denial of Service (DDoS)

A playbook for responding to a DDoS incident might include the following steps:

  • Preparation:
    • Create an asset inventory
    • Establish an escalation and reporting communication strategy
  • Detection:
    • Network activity: packets/second for layers 3, 4, and 7, number of new TCP and UDP flows from clients to endpoints, total number of TCP flows
    • Web application firewall (WAF): allowed requests, blocked requests, total counted requests, passed requests
  • Analysis:
    • Review source of incoming traffic during the event
    • Review protocols, source ports, and TCP flags
  • Containment:
    • Create Web Application Firewall (WAF) rules that match detected behavior
    • Add the conditions to the WAF rules
    • Add the rules to a web Access Control List (ACL) and count the requests matching the rules
    • Monitor counts and block source
  • Eradication:
    • Not applicable
  • Recovery:
    • Not applicable
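The containment steps above follow a count-then-block pattern: match requests against a rule, count the matches per source, and block only the sources whose counts stand out. The sketch below illustrates that logic on plain dictionaries. It does not call any real WAF API; the field names (`src_ip`, `tcp_flags`), the rule, and the threshold are assumptions chosen for the example.

```python
from collections import Counter


def sources_to_block(requests, matches_rule, threshold):
    """Count rule matches per source IP and return sources above the threshold."""
    counts = Counter(r["src_ip"] for r in requests if matches_rule(r))
    return sorted(ip for ip, n in counts.items() if n > threshold)


def is_syn_only(request):
    # Hypothetical rule: bare SYN packets, as seen in a SYN flood.
    return request["tcp_flags"] == "S"


# Hypothetical sample traffic using documentation IP ranges.
sample = (
    [{"src_ip": "203.0.113.5", "tcp_flags": "S"}] * 5
    + [{"src_ip": "198.51.100.7", "tcp_flags": "S"}]
    + [{"src_ip": "198.51.100.7", "tcp_flags": "SA"}] * 4
)

blocked = sources_to_block(sample, is_syn_only, threshold=3)
```

Counting before blocking lets responders confirm that a rule matches the attack traffic, and not legitimate clients, before it starts dropping requests.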

 

Compromised Credentials

A playbook for responding to compromised credentials might include the following steps:

  • Preparation:
    • Implement Identity and Access Management (IAM) best practices
    • Establish an escalation and reporting communication strategy
  • Detection:
    • Unusual user creation
    • Users with more than one access key
    • Unfamiliar roles created or accessed
    • Unusual changes to permissions attached to roles
    • Unrecognized or unauthorized resources added to cloud environment
  • Analysis:
    • Review for unusual activity associated with logins
    • Correlate user ID with suspicious activities, like creating new cloud resources
    • Review user ID to identify the last service accessed
  • Containment:
    • Disable the user account and compromised access key
    • Change the access information
    • Revoke role or roles in any active sessions, application sessions, or role sessions
    • Isolate affected resources by detaching them from other resources and blocking inbound/outbound traffic
  • Eradication:
    • Remove resources created by compromised ID
    • Check for and remove unrecognized services
    • Remove unnecessary permissions related to cloud resources
    • Remove exposed data not necessary for operations
    • Scan for vulnerabilities on public facing resources
  • Recovery:
    • Restore data from known clean backups predating the event
    • Rebuild systems, if necessary, including redeploying from trusted sources
    • Restore appropriate access and permissions
    • Address vulnerabilities
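One of the detection signals above, users with more than one access key, is easy to check once the key inventory has been exported. The sketch below assumes a hypothetical export format, a list of records with `user` and `key_id` fields. How that inventory is obtained depends on the cloud provider and is outside the example.

```python
from collections import defaultdict


def users_with_multiple_keys(access_keys):
    """Group key IDs by user and return only users holding more than one key."""
    keys_by_user = defaultdict(list)
    for key in access_keys:
        keys_by_user[key["user"]].append(key["key_id"])
    return {user: ids for user, ids in keys_by_user.items() if len(ids) > 1}


# Hypothetical inventory export.
inventory = [
    {"user": "alice", "key_id": "key-1"},
    {"user": "bob", "key_id": "key-3"},
    {"user": "alice", "key_id": "key-2"},
]

flagged = users_with_multiple_keys(inventory)
```

A flagged user is a lead to investigate rather than proof of compromise, since a second key can also come from a legitimate key rotation.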

 

Ransomware

 

A playbook for responding to ransomware might include the following steps:

  • Preparation:
    • Create and maintain asset inventory
    • Perform regular vulnerability scans
    • Install and update antivirus on endpoints
    • Perform and verify data backups regularly
    • Apply security updates regularly
    • Disable unnecessary applications and functionalities
    • Establish an escalation and reporting communication strategy
  • Detection:
    • Review network traffic for “spikes” in activity that may indicate data exfiltration
    • Review endpoint detection and response (EDR) log data for suspicious activity
  • Analysis:
    • Analyze log data related to number of bytes for source and destination IP addresses and ports
    • Review for deleted objects, files, filesystems, and data
    • Review for unauthorized activity, like creation of IAM users, policies, roles, or temporary credentials
    • Review API calls for requests to delete objects, files, filesystems, and data
  • Containment:
    • Isolate affected resources
    • Create network access control lists (NACLs) to limit traffic to and from affected resources
    • Add rules to limit traffic by protocol, like HTTP or TCP
    • Add rules to limit traffic source (inbound) or destination (outbound)
  • Eradication:
    • Remove compromised systems from network
    • Identify necessary forensic data
    • Remove compromised Domain Controller metadata from the domain
    • Inspect backups for potential infection
  • Recovery:
    • Restore data from known clean backups predating the event
    • Rebuild systems, if necessary, including redeploying from trusted sources
    • Delete unauthorized users, policies, roles
    • Revoke temporary credentials
    • Create new resources from trusted source
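The analysis step that looks at bytes per source and destination can be approximated by totalling bytes per (source IP, destination IP, destination port) and flagging totals far above the typical value. The sketch below is one simple way to do that, not a tuned detection: the flow record fields, the use of the median as a baseline, and the `factor` of 10 are all assumptions.

```python
from collections import defaultdict
from statistics import median


def unusual_transfers(flows, factor=10.0):
    """Return (src, dst, port) tuples whose byte totals exceed factor x median."""
    totals = defaultdict(int)
    for flow in flows:
        key = (flow["src_ip"], flow["dst_ip"], flow["dst_port"])
        totals[key] += flow["bytes"]
    if not totals:
        return []
    baseline = median(totals.values())
    flagged = [key for key, total in totals.items() if total > factor * baseline]
    # Largest transfers first, so responders see the worst offender at the top.
    return sorted(flagged, key=lambda key: totals[key], reverse=True)


# Hypothetical flow records using documentation IP ranges.
flows = [
    {"src_ip": "192.0.2.10", "dst_ip": "198.51.100.1", "dst_port": 443, "bytes": 1000},
    {"src_ip": "192.0.2.11", "dst_ip": "198.51.100.1", "dst_port": 443, "bytes": 1200},
    {"src_ip": "192.0.2.12", "dst_ip": "198.51.100.1", "dst_port": 443, "bytes": 900},
    {"src_ip": "192.0.2.13", "dst_ip": "198.51.100.1", "dst_port": 443, "bytes": 1100},
    {"src_ip": "192.0.2.14", "dst_ip": "203.0.113.9", "dst_port": 22, "bytes": 25_000_000},
    {"src_ip": "192.0.2.14", "dst_ip": "203.0.113.9", "dst_port": 22, "bytes": 25_000_000},
]

suspects = unusual_transfers(flows)
```

The median is used here because a single large transfer barely moves it, whereas the mean would be pulled upward by the very spike being looked for.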

 

Using ATT&CK in an Incident Response Playbook

Security teams can use ATT&CK to inform the Detection and Analysis sections of an incident playbook.

 

For example, when creating a playbook that responds to a specific incident type, security teams can use ATT&CK to help:

  • Detect an incident: ATT&CK defines the tactics, techniques, and procedures (TTPs) threat actors use during an incident. By building detections, like Sigma rules, around TTPs, security teams can map activity to a type of incident, like a DDoS, compromised credential, or ransomware attack.
  • Compare detected activity to common TTPs: Since TTPs provide insight into the why, what, and how of an attack, security teams can use them to make hypotheses about what threat actors plan to do next. For example, detecting a Phishing technique indicates an objective of Initial Access while Account Manipulation indicates Persistence. Initial Access occurs before Persistence, meaning that the TTP activity provides insight into the attack stage.
  • Perform technical analysis: Correlating anomalous activity with known TTPs provides technical context. TTPs can help security teams identify where in the attack chain the detected activity falls, enabling them to prioritize the next response steps.
  • Correlate events and document timeline: By mapping Tactics and Techniques to the log and event sources, security teams create a knowledge base that they can reference during their response activities. The log and event sources provide insight into the attack stage, giving security teams a way to create timelines for adversary activity.
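The mapping and timeline ideas above can be illustrated with a small lookup table. The sketch below tags detections with an ATT&CK technique and tactic, orders them by time, and reports the furthest tactic reached. The technique IDs and tactic names come from the Enterprise matrix, but the table is a tiny hand-picked subset, each technique is assigned a single tactic for simplicity (Account Manipulation, for instance, also appears under Privilege Escalation), and the event format and function names are hypothetical. The matrix column order is used as a rough proxy for attack stage, which real intrusions do not always follow.

```python
from datetime import datetime

# Enterprise tactics in matrix column order.
TACTIC_ORDER = [
    "Reconnaissance", "Resource Development", "Initial Access", "Execution",
    "Persistence", "Privilege Escalation", "Defense Evasion",
    "Credential Access", "Discovery", "Lateral Movement", "Collection",
    "Command and Control", "Exfiltration", "Impact",
]

# Hand-picked subset: technique ID -> (technique name, one associated tactic).
TECHNIQUES = {
    "T1566": ("Phishing", "Initial Access"),
    "T1098": ("Account Manipulation", "Persistence"),
    "T1110.003": ("Password Spraying", "Credential Access"),
    "T1486": ("Data Encrypted for Impact", "Impact"),
}


def build_timeline(events):
    """Attach technique and tactic names to known events and sort them by time."""
    timeline = []
    for event in events:
        known = TECHNIQUES.get(event["technique_id"])
        if known is None:
            continue  # Skip detections that are not in the lookup table.
        name, tactic = known
        timeline.append({**event, "technique": name, "tactic": tactic})
    return sorted(timeline, key=lambda e: datetime.fromisoformat(e["time"]))


def furthest_stage(events):
    """Return the latest tactic (by matrix order) seen in the events, or None."""
    tactics = [e["tactic"] for e in build_timeline(events)]
    if not tactics:
        return None
    return max(tactics, key=TACTIC_ORDER.index)


# Hypothetical detections, deliberately out of order.
detections = [
    {"time": "2024-05-01T10:15:00", "technique_id": "T1098"},
    {"time": "2024-05-01T09:00:00", "technique_id": "T1566"},
]

timeline = build_timeline(detections)
stage = furthest_stage(detections)
```

In this example the timeline shows Phishing followed by Account Manipulation, so the furthest stage is Persistence, matching the Initial Access to Persistence progression described in the list.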

 

Graylog: Incorporate ATT&CK into Incident Response Processes

With Graylog Security, you can use prebuilt content to map security events to MITRE ATT&CK. By combining Sigma rules and MITRE ATT&CK, you can create high-fidelity alerting rules that enable robust threat detection, lightning-fast investigations, and streamlined threat hunting. For example, with Graylog’s security analytics, you can monitor user activity for anomalous behavior indicating a potential security incident. By mapping this activity to the MITRE ATT&CK Framework, you can detect and investigate adversary attempts at using Valid Accounts to gain Initial Access, mitigating risk by isolating compromised accounts earlier in the attack path and reducing impact.

 

Graylog’s risk scoring capabilities enable you to streamline your threat detection and incident response (TDIR) by aggregating and correlating the severity of the log message and event definitions with the associated asset, reducing alert fatigue and allowing security teams to focus on high-value, high-risk issues.

 

 
