In my Securing Windows environments post, I touched upon logging and SIEM, but I didn't go deeper because the post had already become too lengthy. As a reference, I stated the following:
"... All of the things described up to this point, aim to reduce the attack surface (reducing the risk of successful compromise) and the impact should one occur ... Now, if you think about this, the Penetration testers will be running tools and scripts, which are not normally executed in your environment so that puts you a step further, because it will all be executed on a device that you own. At this point, having and analyzing logs (default Windows logs only are insufficient to trace activity) is fundamental. In fact, in 2019, there should be no excuse for not having a SIEM system, even if it is a free of charge one - ELK stack or Graylog (although it may not have every feature you wish it did)."
Basics
The above refers to utilizing the available logs to perform Threat Hunting. Threat hunting is the human-centric (as opposed to automated detection by an appliance) process of proactively searching data to discover cyber threats. Active Threat Hunting can significantly reduce the time it takes to discover a successful compromise, ideally while the infection is still at an early stage. Each hunt is based on a hypothesis of the overall form "Has <something> happened in my environment?". For each hypothesis we have to:
Identify the specific behavior we want to hunt for
Understand the attack technique behind it
Identify what data (and from which source) we need to detect it
With that being said, on a high level, there are two distinct approaches to Threat hunting:
Detection based on Indicators of Compromise (IOCs) -> Hunting for known bad
Detection based on Attack Techniques (TTPs) -> Hunting for unknown bad
Let's briefly look at those types.
Detection based on IOCs
To be proficient at this, we need to gather IOCs from somewhere. The answer is often "Threat Intelligence", which is data on threats gathered through social media, vendor reports, threat feeds, etc. It can come in the form of IP addresses, domain names, file hashes, TTPs, etc. Some companies worth following that regularly release Threat Intelligence reports are:
FireEye
Verizon
TrustWave
CrowdStrike
Palo Alto Networks
Cylance
F-Secure
As you can imagine, constantly going through various blogs to obtain IOCs can be a time-consuming and daunting task, especially with more vendors entering this sphere. One solution is to create a dashboard and have feeds auto-populate it with data from multiple vendors.
When reading reports or research publications, you should try to gain the most out of it by asking questions such as:
What was the objective, and how was it achieved?
What would it take to detect this activity?
Is this similar to previously known activity?
Other sources for gathering intel are CERTs and platforms such as the Malware Information Sharing Platform (MISP), an open-source software solution for collecting, storing, distributing and sharing cyber security indicators and threat information related to incident analysis and malware analysis.
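To make "hunting for known bad" a bit more concrete, here is a minimal Python sketch of matching a log event against a set of IOCs. The IocSet container and the event keys (dest_ip, query, sha256) are assumptions made purely for illustration; your SIEM or feed format will use its own field names.

```python
from dataclasses import dataclass, field


@dataclass
class IocSet:
    """Known-bad indicators, e.g. loaded from a threat feed (hypothetical container)."""
    ips: set = field(default_factory=set)
    domains: set = field(default_factory=set)
    sha256: set = field(default_factory=set)


def match_iocs(event: dict, iocs: IocSet) -> list:
    """Return (indicator_type, value) pairs from the event that are known bad.

    The event keys used here are illustrative assumptions, not a fixed schema.
    """
    hits = []
    ip = event.get("dest_ip")
    if ip in iocs.ips:
        hits.append(("ip", ip))
    # Normalise DNS names: lower-case and drop a trailing root dot.
    domain = (event.get("query") or "").lower().rstrip(".")
    if domain in iocs.domains:
        hits.append(("domain", domain))
    digest = (event.get("sha256") or "").lower()
    if digest in iocs.sha256:
        hits.append(("sha256", digest))
    return hits
```

Because this only ever finds what is already on the list, it complements rather than replaces the technique-based hunting described next.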
Detection based on Attack Techniques (TTPs)
Here we utilize host, network and memory artifacts to identify certain tactics, techniques and procedures, helping us find threats that are both unknown and undetected by our detection appliances. Clearly, for this we need a feed of different TTPs, and by now probably the most commonly referenced source of TTPs, and for good reason, is MITRE ATT&CK. ATT&CK contains knowledge of various attacks and descriptions of how they manifest, which helps in understanding how and where to detect them. I'll show you later how we can take a technique from ATT&CK, generate a hypothesis from it, perform a hunt and finally create an automated alert.
Before we do any hunting, we need to collect data. When collecting, we should have a purpose based on what we want to find in that data, in order to avoid collecting a mountain of noise-filled logs. Moreover, we need defined Data Governance that ensures data completeness, data consistency and data timeliness. That leaves us with these questions:
What is the data that we have?
How do we ensure it is qualified for our needs?
How do we transform and make it useful to us?
Once we know what we want to collect, we have to decide on a method for transporting the data to our SIEM. The available methods are:
Push – agent on the host forwards log data as it's captured
Pull – remotely connects and collects data at the time of the connection
A mix of the above
Finally, with a clear mind of what we want to collect and how we want to transport it, we face an overall hunting capability problem:
How much of the needed data is available for a hunt?
How much of my environment would I cover with the available data during a hunt?
How far back in time can I search (real-time, historical data or point-in-time search)?
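These questions can be turned into numbers fairly easily. The following is a minimal Python sketch, assuming you can export an asset inventory and the list of hosts that actually report logs; the function name and its inputs are hypothetical.

```python
from datetime import datetime


def hunt_coverage(inventory, reporting_hosts, oldest_event, now):
    """Summarise how much of the environment a hunt covers and how far back it reaches.

    inventory       -- every host we expect to have (assumed asset list)
    reporting_hosts -- hosts that actually send the needed log source
    oldest_event    -- timestamp of the oldest searchable event
    """
    inventory = set(inventory)
    covered = inventory & set(reporting_hosts)
    pct = round(100 * len(covered) / len(inventory), 1) if inventory else 0.0
    return {
        "coverage_pct": pct,
        "missing_hosts": sorted(inventory - covered),
        "lookback_days": (now - oldest_event).days,
    }
```

The missing_hosts list is often the most useful part, since it tells you exactly where a hunt would be blind.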
Alright enough talking ... so, where do I start?
Step 0.0 - Implement as much as you possibly can from what was discussed here. Hardening your environment and limiting the attack surface is unfortunately overlooked by many organizations (Yes, doing this will break things).
Step 0.1 - Begin collecting logs. I recommend at least:
Sysmon
PowerShell logs (ScriptBlock Logging)
AppLocker
Windows Security/System and Application logs
WMI logs
Event Tracing for Windows (mentioned later on, with LDAP example)
Network logs and passive DNS
Step 0.2 - Our focus is abnormal activity and unknown threats. Alright, how do we tell what counts as abnormal? The answer is baselining (I know it is easier said than done, but not doing it at all definitely brings no value). We need to establish and understand what is "known good" in our environment, correlate it with known bad, and generate the delta to look at.
What to baseline? Start with these:
Running Processes
User logons -> where and when, what type of login
Network connections
Services and scheduled tasks
Allowed software to execute
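As a rough illustration of that delta, here is a minimal Python sketch that splits observed items (process names, services, logon sources and so on) into known-bad hits and items never seen in the baseline. The function name and the sample values in the usage note are assumptions for illustration only.

```python
def baseline_delta(known_good, observed, known_bad=frozenset()):
    """Split observed items into known-bad hits and items absent from the baseline."""
    observed = set(observed)
    bad = observed & set(known_bad)
    # Whatever is neither baselined nor already known bad is what a hunter reviews.
    unknown = observed - set(known_good) - bad
    return {"known_bad": sorted(bad), "unknown": sorted(unknown)}
```

For example, with a baseline of {"svchost.exe", "explorer.exe"}, an observation of "updater.exe" would land in the "unknown" bucket for a human to look at.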
Step 1 - Define a Hypothesis (eg. based on a technique from ATT&CK)
Step 2 - Find associated procedures with it
Step 3 - Simulate the attack yourself
Step 4 - Find detection points in the gathered sources after simulation
Step 5 - Set scope (might as well be across your entire environment) before hunting for it. Consider other factors such as network bandwidth that may have an effect on this step.
Thankfully, in 2020, many of the techniques have already been investigated and detection rules suggested; therefore we do not have to simulate everything. A great place for inspiration is the Sigma project with hundreds of detection rules already.
Conferences and blogs are other great places to get inspired by the research of other folks. Notably, the following are highly recommended:
Tom Ueltschi's presentations:
Roberto Rodriguez's blog, which is an absolutely phenomenal source of invaluable information on Threat Hunting.
And .... ACTION!
Alright, assuming we are all set at this point, let's take our rifles and go "kill" some. Here are 4 hunting hypotheses:
Have suspicious parent processes executed PowerShell in my environment?
Have PowerShell Base64 obfuscated commands executed in my environment?
Have workstations had abnormal network connections over SMB?
Have workstations done any suspicious LDAP queries?
1. Have suspicious parent processes executed PowerShell in my environment?
After researching a bit, we identify the following parent processes, which are often flagged as potentially suspicious when they start PowerShell:
mshta.exe
rundll32.exe
regsvr32.exe
services.exe
winword.exe
wmiprvse.exe
powerpnt.exe
excel.exe
msaccess.exe
mspub.exe
visio.exe
outlook.exe
chrome.exe
iexplore.exe
sqlservr.exe
The data that we need to hunt for this is contained in Sysmon Event ID 1 - Process Creation (Windows Event ID 4688 may also be used). Overall, we create the following query in our SIEM (which is ELK for this example):
winlog.event_data.ParentImage:(*mshta.exe OR *rundll32.exe OR *regsvr32.exe OR *services.exe OR *winword.exe OR *wmiprvse.exe OR *powerpnt.exe OR *excel.exe OR *msaccess.exe OR *mspub.exe OR *visio.exe OR *outlook.exe OR *chrome.exe OR *iexplore.exe OR *sqlservr.exe) AND winlog.event_data.Image:*powershell.exe
Running the query to confirm that it works as expected, we get the following match:
As you can imagine, this was executed against test data, so custom tuning to remove false positives specific to your environment may be necessary (allowlist known good based on your baselines!).
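If you export the events and want to apply the same logic outside the SIEM, a minimal Python sketch could look like the one below. It assumes each event is a dict carrying Sysmon's Image and ParentImage fields; the function name and the allowlist parameter are hypothetical and stand in for whatever your baselines say is known good.

```python
import ntpath

# Parent processes that rarely have a legitimate reason to spawn PowerShell.
SUSPICIOUS_PARENTS = {
    "mshta.exe", "rundll32.exe", "regsvr32.exe", "services.exe",
    "winword.exe", "wmiprvse.exe", "powerpnt.exe", "excel.exe",
    "msaccess.exe", "mspub.exe", "visio.exe", "outlook.exe",
    "chrome.exe", "iexplore.exe", "sqlservr.exe",
}


def is_suspicious_powershell_parent(event, allowlist=frozenset()):
    """True if powershell.exe was started by a suspicious, non-allowlisted parent."""
    # ntpath handles Windows-style paths regardless of the OS running this script.
    image = ntpath.basename(event.get("Image", "")).lower()
    parent = ntpath.basename(event.get("ParentImage", "")).lower()
    if image != "powershell.exe":
        return False
    return parent in SUSPICIOUS_PARENTS and parent not in allowlist
```

Matching on the lower-cased file name mirrors the wildcard-prefixed terms in the query, and the allowlist is where environment-specific tuning would go.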
2. Have PowerShell Base64 obfuscated commands executed in my environment?
Taking the same approach as before and researching Base64-obfuscated PowerShell commands, we discover that the bare minimum required for this to work is an argument of "-e" (remember, we are looking at something very specific; for those of you thinking of Unmanaged PowerShell, that will be a separate hunt!). So PowerShell must be started and "-e" passed as an argument. To expand on this, we'll also account for powershell.exe having been renamed to whatever else an adversary may have chosen. A way to detect that is the "Description" field that Sysmon Event ID 1 captures as part of the event data: even when powershell.exe is renamed, the file's description stays the same and still contains "PowerShell". So our hunting query will be formed as (ELK again):
(winlog.event_data.Description:*PowerShell OR winlog.event_data.Image:*powershell.exe) AND winlog.event_data.CommandLine:*-e*
Executing the query gives us the following match:
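Since a wildcard match on "-e" will also catch unrelated arguments, it can help to triage the hits by actually decoding them. Below is a minimal Python sketch that relies on two facts: PowerShell accepts shortened forms of -EncodedCommand (such as -e, -enc and -ec), and the encoded value is Base64 over UTF-16LE text. The function name and the regular expression are my own assumptions, not part of any tool.

```python
import base64
import re

# A flag token followed by something that looks like a Base64 blob.
_FLAG_RE = re.compile(r"(?:^|\s)[-/](\w+)\s+([A-Za-z0-9+/=]{8,})")


def find_encoded_command(command_line):
    """Return the decoded script if an EncodedCommand-style flag is present, else None."""
    for flag, blob in _FLAG_RE.findall(command_line):
        flag = flag.lower()
        # Accept "-ec" and any prefix of "-EncodedCommand" ("-e", "-enc", ...).
        if flag == "ec" or "encodedcommand".startswith(flag):
            try:
                return base64.b64decode(blob, validate=True).decode("utf-16-le")
            except ValueError:
                # Not valid Base64 or not valid UTF-16LE: treat as a non-match.
                continue
    return None
```

Running this over the CommandLine field of the matched events shows what was really executed and quickly separates true hits from arguments such as -ExecutionPolicy.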
3. Have workstations had abnormal network connections over SMB?
Our research for this task leads us to T1105 and T1077 from ATT&CK. Since we are looking for something defined as abnormal, we first need to understand what is normal. For me, any SMB connection between workstations (or one not originating from a Privileged Access Workstation) is exactly that: I would not expect it to occur. There are multiple ways to identify this behavior; our preferred approach for this blog post is to use logs from Zeek (previously known as Bro). We'll specifically look at the "ntlm" logs, in which Zeek tracks authentication over NTLM. So our final query will look for a destination port of 139 or 445 (Splunk query below):
index=zeek sourcetype=zeek_ntlm (id.resp_p=445 OR id.resp_p=139) | table id.resp_h, id.resp_p, id.orig_h, id.orig_p, domainname, success, username
Well, we see successful authentications over SMB with the Administrator account. Normally, this should make you go "whaaa...t?"
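To narrow such results down to the workstation-to-workstation case described above, here is a minimal Python sketch. It assumes Zeek is writing JSON logs (so the keys are id.orig_h, id.resp_h and id.resp_p) and that workstations live in known subnets; the subnet list, the PAW list and the function name are hypothetical.

```python
import ipaddress


def workstation_to_workstation_smb(record, workstation_nets, paw_hosts=frozenset()):
    """True if a Zeek ntlm record is SMB traffic between two workstation addresses.

    workstation_nets -- CIDR strings for workstation subnets (assumed to be known)
    paw_hosts        -- source IPs of Privileged Access Workstations to exclude
    """
    if record.get("id.resp_p") not in (139, 445):
        return False
    src = ipaddress.ip_address(record["id.orig_h"])
    dst = ipaddress.ip_address(record["id.resp_h"])
    if str(src) in paw_hosts:
        return False
    nets = [ipaddress.ip_network(n) for n in workstation_nets]
    return any(src in n for n in nets) and any(dst in n for n in nets)
```

Anything this returns True for is, by my definition of normal, worth a closer look.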
4. Have workstations done any suspicious LDAP queries?
Again, we need to define what suspicious means in this case. By my definition, it is any workstation performing an LDAP query that checks who is a member of an administrative group, since regular users on their workstations should not be doing this.
To look for this, we'll use Event Tracing for Windows (ETW), a logging capability that Windows provides but that is not widely known or used and is somewhat "hidden". FuzzySec released SilkETW, which provides a very easy-to-use command-line interface for enabling different event providers, one of which is LDAP (NOTE: this will log LDAP events even if they are executed in memory only, which gives us great visibility for detecting tools such as SharpHound regardless of whether they are injected and executed from memory or run from disk). Once we have configured SilkETW to collect logs, we'll simply query for "admins" in our ELK instance. A detection example is shown below:
We can continue creating hypotheses and detection rules but I think at this point, the benefit that proactive hunting brings is more than clear.
All of the above is a short taste of eLearnSecurity's upcoming Threat Hunting Professional v2 course, to which I contributed by enriching the previous version with hands-on exercises. The labs range from specific techniques to hunt for, to playgrounds of simulated complete compromises that enable you to practice and tune your detection abilities for various TTPs in Splunk, ELK and directly through PowerShell parsing of logs (quick and dirty).