Cobalt Strike Archives - Cobalt Strike Research and Development

Arsenal Kit Update: Thread Stack Spoofing

As I mentioned in the recent Roadmap Update blog post, we are in the process of expanding the Cobalt Strike development team and ramping up our research activities so that we can release more tools outside of the core product release schedule. We’re also acutely aware of Cobalt Strike’s limitations when it comes to EDR and AV evasion, and our research efforts at the moment aim to make improvements in that area. In that vein, a new tool is now available in the Cobalt Strike Arsenal that adds thread stack spoofing capabilities.

AV and EDR detection mechanisms have been improving over the years and one specific technique that is used is thread stack inspection. This technique determines the legitimacy of a process that is calling a function or an API.

Thread stack spoofing is not a new technique and there are several good examples of this technique that are already available. We’d specifically like to highlight mgeeky’s thread stack spoofer, which works well and was the inspiration for our own implementation. The research team used concepts from mgeeky’s tool and added new concepts and techniques resulting from their own research activities to develop their own unique take on this technique.

Full details on our implementation are included in the readme that accompanies the tool in the Cobalt Strike Arsenal. This information and the tool itself are only available to licensed customers. The Cobalt Strike Arsenal is accessed via a link in Cobalt Strike, or directly here.

There’s Another New Deputy in Town

Things are moving in the Cobalt Strike world…
And they are moving… FAST.

When I started my position with the Cobalt Strike team, I got to meet the team in person in the head office in Eden Prairie, Minnesota.
I can’t say much yet, but the team has been cooking up some cool stuff coming into the next several releases.
I’m pleased to join a team of wonderful individuals that all excel in their own areas of expertise.

So what am I going to be doing here in the mix, you ask?

I’ll be drawing from my own expertise as a Cobalt Strike user, and that of our wonderful community to research, support, and build new features into the product. Some of these might make it into Beacon (or teamserver) itself, others will be released as BOFs or as kits.

There has been a lot of back and forth amongst the team already and I’m very excited to see the features that are already on the roadmap. Unfortunately, I have been sworn into an oath of silence, but fear not, you, the user, will get to see some cool features being added soon! (Something about 5 pair of socks?)

I maintain a relatively active social media presence and am lurking around in a majority of Discord and Slack channels. I have also been known to attend conferences every now and then, be it as an attendee or a speaker. So if you catch me online or IRL, feel free to have a chat!

Hopefully, this post has made you curious about what comes next, and I can’t wait to, together with the team, share new features with the community and our customers.

But for now, sit back, relax, and take part in the wonderful journey as we, as a team, lift Cobalt Strike into a new generation.

Out Of Band Update: Cobalt Strike 4.6.1

Cobalt Strike 4.6.1 is now available. This is an out of band update to fix a few issues that were discovered in the 4.6 release that were reported to be impacting users and for which there was no workaround. This does not affect the 4.7 release, which is still on track to ship this summer.

Website Cloning

Two issues related to website cloning were addressed. An issue was introduced with the 4.6 release that caused all website cloning to fail, and we had a separate backlog issue that caused an error when cloning https websites. Both of these issues have been fixed.

Error When Using rportfwd_local

An issue was reported whereby when using rportfwd_local, any connection that entered the forwarded port caused the Cobalt Strike client to disconnect and reconnect with the teamserver. This issue has been fixed.

Workaround: glibc Dependency Issue

Some users have reported an issue when running on certain (mainly older) Linux distributions that causes the teamserver to fail to start due to a glibc dependency. We are currently looking into ways to update our build process to minimise the impact of this in the 4.7 release. While there is no fix available at the moment, we have documented a workaround. If you are affected by this issue, please refer to the steps in the Cobalt Strike documentation.

We apologise for any problems that these issues may have caused. If you notice any other issues with Cobalt Strike, please refer to the online support page, or report them to our support email address. Licensed users can download version 4.6.1 from the website. To purchase Cobalt Strike or learn more, please contact us.

Cobalt Strike 4.6: The Line In The Sand

Cobalt Strike 4.6 is now available. As I mentioned in the recent Roadmap Update blog post, this isn’t a regular release, as it mostly focuses on security updates. There are also a couple of useful updates for users. A major release is planned for this summer, so this release lays the groundwork for the changes that are coming at that point.

Execute-assembly 1MB Limit Increase

A number of users have been asking for this for quite some time, and the change that we made affects not only execute-assembly, but other tasks (e.g. dllinject) as well. We have added three new settings to the Malleable C2 profile (tasks_max_size, tasks_proxy_max_size and tasks_dns_proxy_max_size) that can be used to control maximum size limits. Note that these settings need to be set prior to team server startup. If the size is increased at a later time, old artifacts will still use the previous size settings and tasks that are too large will be rejected.

Comprehensive information on the new settings can be found in the Cobalt Strike documentation.

Arsenal Kit

We have combined the individual kits in the Cobalt Strike arsenal into a single kit, appropriately known as the Arsenal Kit. Building this kit yields a single aggressor script that can be loaded instead of loading all of the separate kits individually. The kit is controlled by the arsenal_kit_config file which is used to configure the kits that are built with the build_arsenal_kit.sh script.

The Arsenal Kit can be downloaded by licensed users from the Cobalt Strike arsenal.

Security Updates

This is the main focus of the Cobalt Strike 4.6 release. It is a necessary step as it lays the groundwork for our future development efforts.

Product security is nothing new. There has always been anti-proliferation processing in the software and, as discussed in this blog post (published by Raphael Mudge in 2019), we do our due diligence when it comes to screening potential customers and working with law enforcement. I think it is worth pointing out that the processes described by Raphael in that blog post are still processes that are followed at HelpSystems today–specifically:

From time to time, we receive informal requests for technical assistance or records from private entities. Our policy is not to perform analysis for, provide deconfliction services to, or disclose our records to private entities upon informal request.

If we have information relevant to a law enforcement investigation, we comply with valid legal process.

This stance is to avoid frivolous requests and to protect our customer’s information.

We also investigate tips. We can’t usually share information back, but we look into things brought to our attention.

We are also proactive when it comes to searching for Cobalt Strike teamservers out in the wild. This work is carried out by our own, dedicated threat intelligence team and it helps us to improve our product controls. That team also issues takedown requests if cracked copies are found.

Over the past few releases, we have made enhancements to Cobalt Strike’s product security. We intentionally haven’t described product security changes in much detail, but we do take it very seriously. Product security has been and will continue to be a key feature on our roadmap.

The 4.5 release in December 2021 saw changes to product licensing and improvements on the watermarking in the software. Those changes made it significantly more difficult to tamper with the authorization ID and locate the ever-changing hidden watermarks, therefore making it easier for us to trace stolen copies of Cobalt Strike back to specific customers. We have yet to see any credible reports of cracked copies of the 4.5 release being used because of these changes. We have seen what are claimed to be cracked copies of 4.5 being sold, but those have all turned out to be older versions badged as 4.5. By design, if the watermarks in the 4.5 release are tampered with, it will simply no longer work.

The 4.6 release brings a change to how the teamserver is deployed. Rather than a Java .jar archive, the teamserver has been built as a native binary. The client is still shipped as a .jar archive but we also plan to change that at some point as well. You shouldn’t notice anything different about the update process itself, but it is important to note that “cobaltstrike.jar” is now just a container for the team server (“TeamServerImage”) and client (“cobaltstrike-client.jar”), both of which will automatically be extracted during the update process. One thing to bear in mind though is that due to the changes in how Cobalt Strike 4.6 is installed and how it runs, coupled with changes to the download infrastructure to facilitate those changes, any scripts that you might have to automate the update process will likely no longer work and will need to be changed.

What does this mean? For you, moving forward, there is no real change. You can still download, update and use Cobalt Strike in the same way–however, please be aware that in this instance, you will need to download 4.6 directly from the website as the version 4.5 updater is incompatible with this release and will not recognize that an update is available. For us, building the software in this way is another step forward in terms of product security.

This is a line in the sand for us. We needed to make these necessary security enhancements so that we can forge ahead with our new development strategy and deliver more of what matters to our users. Normal service will be resumed with the 4.7 release this summer. Cobalt Strike will be 10 years old then so we’re hoping to do that release justice to mark the occasion properly.

To see a full list of what’s new in Cobalt Strike 4.6, please check out the release notes. Licensed users can download version 4.6 from the website. To purchase Cobalt Strike or learn more, please contact us.

Building Upon a Strong Foundation

In the weeks ahead, Cobalt Strike 4.6 will go live and will be a minor foundational release before we move into our new development model. This release will be less about features and more focused on bolstering security even further. This is all in preparation for a much bigger release later, which will also serve as a celebration of Cobalt Strike’s 10th birthday. As we approach this 10-year anniversary, we’ve also taken the time to reflect on the incredible journey of this product.

Raphael Mudge created and developed Cobalt Strike for many years, entirely on his own. With the acquisition by HelpSystems more than two years ago, additional support came along to bring about some great new features, including the reconnect button, new Aggressor Script hooks, the Sleep Mask Kit, and the User Defined Reflective Loader (UDRL).

Now, with Raphael’s vision always in mind, we have a growing team focused on supporting this solution to bring more stability and flexibility. We’re also dedicating additional resources to research activities, with the goal of creating and releasing new tools into the Community Kit and the Cobalt Strike arsenal. Additionally, we are placing a great deal of emphasis on the security of the product itself in order to prevent misuse by malicious, non-licensed users.

With this increased investment comes additional costs and a pricing change. In appreciation for current Cobalt Strike users and their support of the solution, the change will not affect existing customers. The price of Cobalt Strike for new customers will be $5,900 per user for a one-year license.

The pricing for the Offensive Security – Advanced Bundle of Cobalt Strike and Core Impact will remain the same so you can pair any version of Core Impact—basic, pro, or enterprise—with Cobalt Strike at a reduced cost. Cobalt Strike’s interoperability with Core Impact highlights another one of the advantages of being part of a company with an ever-growing list of cybersecurity offerings. Developers of these products work together to help organizations create a cohesive security strategy that provides full coverage of their environments.

As we continue to evolve with the threat landscape and strengthen Cobalt Strike accordingly, a permanent fixture in our strategy will always be to listen to our customers. Many aspects of our updates are a direct result of customer feedback, so we encourage you to keep being vocal about the features that you most want to see. 

Cobalt Strike Roadmap Update

Historically, Raphael Mudge, the creator of Cobalt Strike, didn’t typically talk about the Cobalt Strike roadmap publicly. He preferred to play his cards close to his chest and only revealed the details about each release when it went live (and he didn’t give much warning about the release date, either). That was his way of building excitement for each release. For the most part we’ve continued that tradition, but I’d like to spend a little time being a bit more transparent about our future development plans, before dropping back into the shadows.

I spent about a year working closely with Raphael after HelpSystems acquired Strategic Cyber, amongst other things being educated on what makes Cobalt Strike so special. One of the many things that he instilled in me is that the fundamental principles of Cobalt Strike are stability and flexibility. He was excited to see a team of experienced, professional software engineers being built around the product to provide the stability and we’ve continued to add flexibility over the past few releases – for example, with the recent sleep mask kit and user defined reflective loader kit. That’s our mantra: Stability and Flexibility.

Raphael also cautioned against adding cutting edge, out of the box evasion techniques to Cobalt Strike. The obvious danger is that once they’re inevitably fingerprinted, we’d get stuck in an endless loop of fixing those issues rather than working on new features. Cobalt Strike’s defaults are easily fingerprinted and that’s by design. The idea is that you bring your own tools and techniques to Cobalt Strike and use those. That’s what makes it unique.

We spend a lot of time engaging with our user community on social media, Slack and Discord, sometimes engaging directly in those threads and sometimes via DM, email or on video calls. I love that aspect of my role. It’s great to get the opportunity to interact directly with people that are using Cobalt Strike and see first-hand what’s working and what isn’t.

We’ve had a lot of feedback recently that some users just don’t have the time to work on their own tools because they’re so busy on engagements. We created the Cobalt Strike Community Kit to act as a central repository of extensions written by our users to make it easier to find useful tools but obviously there are cases where specific tools just don’t exist and you don’t have time to write them yourselves. We don’t want to abandon our core philosophy and start adding out of the box evasion to the core product, but we are making some changes.

Firstly, we are expanding the development team to provide additional capacity. Secondly, and more importantly, we are changing our development cycle so that we can give you your cake AND let you eat it.

Up until now, we have aimed to get at least three releases out per year. We are moving to a model where we will release updates to core Cobalt Strike (Stability and Flexibility) twice per year. One release will be in the Summer, and another in the Winter. You’re confused. I can sense it. “How does reducing the number of releases help?” Well, the second part of the new release schedule is to ramp up research activities and start releasing more tools outside the regular release schedule. What does this mean? The plan is that essentially, in between those core releases (which should contain more features due to the extended development time between them), we’ll be releasing a steady stream of tools into the Community Kit and/or into the Cobalt Strike arsenal. The location of each tool pretty much depends on the type of tool being released and whether we’re releasing the source as well.

There is a caveat to this, though. There is a little short-term pain while we pivot to this new release model. There will be a small, intermediate Cobalt Strike release this Spring (late March or early April) that doesn’t really have a lot of flashy new features for you, our users, but sets the foundation for future releases. We have a much bigger release planned that should ship around July/August to mark Cobalt Strike’s 10-year anniversary.

The future is bright. HelpSystems continues to invest in Cobalt Strike and expand the team around it. We will continue to listen to our users and give you the product and features that you need.

Feature requests can be submitted to [email protected] and I’m always happy to talk to users on social media, Slack and Discord.

Joe’s Transition

My career is taking me in a new and exciting direction, and I am stepping down from my role on the Cobalt Strike team.

I’ve spent the last year helping HelpSystems integrate Cobalt Strike into their processes and shift from a single developer to a team effort. I can honestly say that “Cobalt Strike is in great hands.”

I’ve seen tremendous growth and support from the Cobalt Strike community in the last 12 months. Thank you all for the generous support and outstanding research. Keep pushing forward. 

The team will continue to focus on product stability and on flexibility across the product’s attack chain. I’ve helped pack the roadmap with great features, many of which came directly from community requests.

I’ve been a user of Cobalt Strike for almost ten years now, and it has been an honor to be part of this team. I wish the team good luck and look forward to seeing their influence on professional security testing. 

I may be moving on, but I will continue to be part of the security community. My career has been filled with great opportunities and great people. I am very thankful for the generous support shown by our community.

Make sure to follow  @CoreAdvisories for Cobalt Strike related Twitter announcements. 

Cobalt Strike Training Options

The Cobalt Strike training web page has been updated. https://www.cobaltstrike.com/training/

The training web page lists free courses created by the Cobalt Strike team that provide an overview of the product. It also lists courses offered by trusted 3rd parties. The 3rd party courses use Cobalt Strike to some degree and can be a great way to practice and learn how Cobalt Strike can be used in a realistic environment.

The page will be updated as new courses are added.

Writing Beacon Object Files: Flexible, Stealthy, and Compatible

Our colleagues over at Core Security have been doing great things with Cobalt Strike, making use of it in their own engagements. They wrote up this post on creating Cobalt Strike Beacon Object Files using the MinGW compiler on Linux. It covers several ideas and best practices that will increase the quality of your BOFs.

Flexibility

Compiling to Both Object Files and Executables

While writing a BOF is great, it’s always worth making the code compile to both BOF and EXE.

This provides a lot more options: we could run our capability outside Beacon by just writing the EXE to disk and executing it. We could then convert it into position independent shellcode using donut and run it from memory.

Usually, calling a Windows API from a Beacon Object File would appear as follows:

program.h

WINBASEAPI size_t __cdecl MSVCRT$strnlen(const char *s, size_t maxlen);

program.c

int length = MSVCRT$strnlen(someString, 256);
BeaconPrintf(CALLBACK_OUTPUT, "The variable length is %d.", length);

Makefile

BOFNAME := program
CC_x64 := x86_64-w64-mingw32-gcc
all:
    $(CC_x64) -c source/program.c -o compiled/$(BOFNAME).x64.o -masm=intel -Wall

However, we would like to build both a BOF and an EXE from the same source file. A practical way to do this is conditional compilation, as shown below. In this example, the BOF build passes -DBOF to the compiler, and the code checks for that macro:

Makefile

BOFNAME := program
CC_x64 := x86_64-w64-mingw32-gcc
all:
    $(CC_x64) -c source/program.c -o compiled/$(BOFNAME).x64.o   -masm=intel -Wall -DBOF
    $(CC_x64)    source/program.c -o compiled/$(BOFNAME).x64.exe -masm=intel -Wall

program.h

#ifdef BOF
WINBASEAPI size_t __cdecl MSVCRT$strnlen(const char *s, size_t maxlen);
#define strnlen MSVCRT$strnlen
#endif
#ifdef BOF
#define PRINT(...) do { \
     BeaconPrintf(CALLBACK_OUTPUT, __VA_ARGS__); \
} while (0)
#else
#define PRINT(...) do { \
     fprintf(stdout, __VA_ARGS__); \
     fprintf(stdout, "\n"); \
} while (0)
#endif

program.c

int length = strnlen(someString, 256);
PRINT("The variable length is %d.", length);

Finally, in our program.c file, we would define the “go” (BOF’s entry point) and “main” functions:

program.c

#ifdef BOF
void go(char* args, int length)
{
     // BOF code
}
#else
int main(int argc, char* argv[])
{
    // EXE code
}
#endif

Stealth

Syswhispers2 Integration

syswhispers2 is an awesome implementation of direct syscalls. However, if we take a look under the hood, we can see that it uses a global variable to achieve its objective. Unfortunately, global variables do not work very well with Beacon. This is because Beacon Object Files don’t have a .bss section, which is where global variables are typically stored.

A useful trick, originally suggested by Twitter user @the_bit_diddler, is to move the global variables to the .data section using a compiler directive, as shown below:

syscalls.c (before)

SW2_SYSCALL_LIST SW2_SyscallList;

syscalls.c (after)

SW2_SYSCALL_LIST SW2_SyscallList __attribute__ ((section(".data")));

This small change will allow the use of the syswhispers2 logic in a BOF.
In addition to the global variables change, there are other minor changes that need to be made so that the code of syswhispers2 can compile with MinGW. For example, the format of the API hashes needs to be changed from 0ABCD1234h to 0xABCD1234. The tool InlineWhispers should take care of the rest.
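To make the literal-format change concrete, here is a minimal sketch of a helper that rewrites a MASM-style hexadecimal literal into the 0x form. The function name masm_hex_to_c and its return convention are assumptions made for this illustration; it is not taken from syswhispers2 or InlineWhispers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite "0ABCD1234h" as "0xABCD1234".
   Returns 0 on success, -1 if `in` is not a MASM-style hex literal
   or `out` is too small to hold the result. */
int masm_hex_to_c(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);

    size_t len = strlen(in);
    if (len < 2 || (in[len - 1] != 'h' && in[len - 1] != 'H'))
        return -1;

    size_t digits = len - 1;            /* everything before the suffix */
    for (size_t i = 0; i < digits; i++) {
        if (!isxdigit((unsigned char)in[i]))
            return -1;
    }

    /* MASM literals must start with a decimal digit, hence the padding
       zero. Skip leading zeros but keep at least one digit. */
    size_t start = 0;
    while (start + 1 < digits && in[start] == '0')
        start++;

    int written = snprintf(out, out_size, "0x%.*s",
                           (int)(digits - start), in + start);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

Calling masm_hex_to_c("0ABCD1234h", buf, sizeof buf) with a 16-byte buffer leaves "0xABCD1234" in buf. In practice a tool such as InlineWhispers does this for you; the sketch only shows what the rewrite amounts to.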

Using direct syscalls is a powerful technique to avoid userland hooks. Ironically, using them could get us caught.

There are at least two ways of detecting direct syscalls: dynamic and static.
The dynamic method is simply detecting that a syscall was called from a module that is not ntdll.dll. The static method is to find a syscall instruction by inspecting the program’s code and memory. How can we avoid both these detections? The answer is to call our syscalls from ntdll.dll.

First, we must locate where ntdll.dll is loaded. Luckily, syswhispers2 already has the code to do just that. Then, we can parse its headers and locate the code section.

Hiding the Use of syscalls

Once we know the base address and size of the code section of ntdll.dll, all we need to do is search for the opcodes of the instructions syscall; ret. In x64, the bytes we are looking for are: { 0x0f, 0x05, 0xc3 }.

While it is true that EDRs and other tools hook (overwrite) syscalls in ntdll.dll, they do not hook every syscall stub, so in practice we can expect to find at least one occurrence of these three bytes. We might even find them by chance at a misaligned offset.

Once we find the syscall; ret bytes, we can save the address in a global variable (stored in the .data section). That way, we only need to find it once.
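The search step can be tried in isolation on any byte buffer. The sketch below is a portable stand-in for that step only: the name find_syscall_ret is an assumption for illustration, and it uses memcmp over a caller-supplied buffer rather than walking ntdll.dll in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x64 encoding of `syscall; ret`, as given in the text. */
static const uint8_t SYSCALL_RET[] = { 0x0f, 0x05, 0xc3 };

/* Hypothetical helper: return a pointer to the first `syscall; ret`
   sequence inside [code, code + size), or NULL if there is none. */
const uint8_t *find_syscall_ret(const uint8_t *code, size_t size)
{
    assert(code != NULL);

    if (size < sizeof(SYSCALL_RET))
        return NULL;

    /* The last valid start offset is size - 3, so no read ever goes
       past the end of the buffer. */
    for (size_t i = 0; i + sizeof(SYSCALL_RET) <= size; i++) {
        if (memcmp(code + i, SYSCALL_RET, sizeof(SYSCALL_RET)) == 0)
            return code + i;
    }
    return NULL;
}
```

Because the scan advances one byte at a time, it also finds the pattern at offsets that do not line up with instruction boundaries, which is the "misaligned offset" case mentioned above.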

Everything we have just described can be seen in the following code sequence:

syscalls.c

#ifdef _WIN64
#define PEB_OFFSET 0x60
#define READ_MEMLOC __readgsqword
#else
#define PEB_OFFSET 0x30
#define READ_MEMLOC __readfsdword
#endif

PVOID SyscallAddress __attribute__ ((section(".data"))) = NULL;
 
__attribute__((naked)) void SyscallNotFound(void)
{
    __asm__(" SyscallNotFound: \n\
        mov eax, 0xC0000225 \n\
        ret \n\
    ");
}

PVOID GetSyscallAddress(void)
{
#ifdef _WIN64
    BYTE syscall_code[] = { 0x0f, 0x05, 0xc3 };
#else
    BYTE syscall_code[] = { 0x0f, 0x34, 0xc3 };
#endif

    // Return early if the SyscallAddress is already defined
    if (SyscallAddress)
    {
        // make sure the instructions have not been replaced
        if (!strncmp((PVOID)syscall_code, SyscallAddress, sizeof(syscall_code)))
            return SyscallAddress;
    }
  
    // set the fallback as the default
    SyscallAddress = (PVOID) SyscallNotFound;
 
    // find the address of NTDLL
    PSW2_PEB Peb = (PSW2_PEB)READ_MEMLOC(PEB_OFFSET);
    PSW2_PEB_LDR_DATA Ldr = Peb->Ldr;
    PIMAGE_EXPORT_DIRECTORY ExportDirectory = NULL;
    PVOID DllBase = NULL;
    PVOID BaseOfCode = NULL;
    ULONG32 SizeOfCode = 0;
 
    // Get the DllBase address of NTDLL.dll. NTDLL is not guaranteed to be the second
    // in the list, so it's safer to loop through the full list and find it.
    PSW2_LDR_DATA_TABLE_ENTRY LdrEntry;
    for (LdrEntry = (PSW2_LDR_DATA_TABLE_ENTRY)Ldr->Reserved2[1]; LdrEntry->DllBase != NULL; LdrEntry = (PSW2_LDR_DATA_TABLE_ENTRY)LdrEntry->Reserved1[0])
    {
        DllBase = LdrEntry->DllBase;
        PIMAGE_DOS_HEADER DosHeader = (PIMAGE_DOS_HEADER)DllBase;
        PIMAGE_NT_HEADERS NtHeaders = SW2_RVA2VA(PIMAGE_NT_HEADERS, DllBase, DosHeader->e_lfanew);
        PIMAGE_DATA_DIRECTORY DataDirectory = (PIMAGE_DATA_DIRECTORY)NtHeaders->OptionalHeader.DataDirectory;
        DWORD VirtualAddress = DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
        if (VirtualAddress == 0) continue;
 
        ExportDirectory = SW2_RVA2VA(PIMAGE_EXPORT_DIRECTORY, DllBase, VirtualAddress);
 
        // If this is NTDLL.dll, exit loop.
        PCHAR DllName = SW2_RVA2VA(PCHAR, DllBase, ExportDirectory->Name);
        if ((*(ULONG*)DllName | 0x20202020) != 0x6c64746e) continue;
        if ((*(ULONG*)(DllName + 4) | 0x20202020) == 0x6c642e6c)
        {
            BaseOfCode = SW2_RVA2VA(PVOID, DllBase, NtHeaders->OptionalHeader.BaseOfCode);
            SizeOfCode = NtHeaders->OptionalHeader.SizeOfCode;
            break;
        }
    }
    if (!BaseOfCode || !SizeOfCode)
        return SyscallAddress;
 
    // try to find a 'syscall' instruction inside of NTDLL's code section
  
    PVOID CurrentAddress = BaseOfCode;
    PVOID EndOfCode = SW2_RVA2VA(PVOID, BaseOfCode, SizeOfCode - sizeof(syscall_code));
    while ((ULONG_PTR)CurrentAddress <= (ULONG_PTR)EndOfCode)
    {
        if (!strncmp((PVOID)syscall_code, CurrentAddress, sizeof(syscall_code)))
        {
            // found 'syscall' instruction in ntdll
            SyscallAddress = CurrentAddress;
            return SyscallAddress;
        }
        // increase the current address by one
        CurrentAddress = SW2_RVA2VA(PVOID, CurrentAddress, 1);
    }
    // syscall entry not found, using fallback
    return SyscallAddress;
}

syscalls.h

EXTERN_C PVOID GetSyscallAddress(void);

In the extremely unlikely scenario in which we do not find ANY occurrence of these three bytes in the code section of ntdll.dll, we can instead use our own function: SyscallNotFound. This simply returns STATUS_NOT_FOUND. We could implement a syscall; ret, but keep in mind that we want to avoid having the syscall instruction in our code in order to evade static analysis.
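The module-name test inside GetSyscallAddress packs a case-insensitive comparison into two 32-bit constants. As a rough, portable illustration of that trick, the sketch below builds the same constants from the strings "ntdl" and "l.dl" instead of hard-coding them. The function name is_ntdll_name is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: non-zero if `name` starts with "ntdll.dl",
   ignoring ASCII case. */
int is_ntdll_name(const char *name)
{
    assert(name != NULL);

    if (strlen(name) < 8)
        return 0;

    uint32_t first, second, want_first, want_second;
    memcpy(&first, name, 4);           /* memcpy avoids unaligned reads */
    memcpy(&second, name + 4, 4);
    memcpy(&want_first, "ntdl", 4);    /* 0x6c64746e on little-endian x86 */
    memcpy(&want_second, "l.dl", 4);   /* 0x6c642e6c on little-endian x86 */

    /* OR-ing each byte with 0x20 maps 'A'..'Z' onto 'a'..'z' and leaves
       lower-case letters and '.' unchanged. */
    return (first | 0x20202020u) == want_first &&
           (second | 0x20202020u) == want_second;
}
```

Setting bit 0x20 is not a strict case fold, since a few non-letter bytes also collapse onto the expected values, but for DLL export names that is a reasonable trade for a two-comparison check.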

Once we have the memory address of interest, all we need to do is to modify the assembly of our syscall functions to jump to this memory address:

push rcx ; save volatile registers
push rdx
push r8
push r9
sub rsp, 0x28 ; allocate some space on the stack
call GetSyscallAddress ; call the C function and get the address of the 'syscall' instruction in ntdll.dll
add rsp, 0x28
push rax ; save the address in the stack
sub rsp, 0x28 ; allocate some space on the stack
mov ecx, 0x0123ABCD ; set the syscall hash as the parameter
call SW2_GetSyscallNumber ; get the id of the syscall using syswhispers2
add rsp, 0x28
pop r11 ; store the address of the 'syscall' instruction on r11
pop r9 ; restore the volatile registers
pop r8
pop rdx
pop rcx
mov r10, rcx
jmp r11 ; jump to ntdll.dll and call the syscall from there

And voilà, we use direct syscalls from a valid module (ntdll.dll) without having a syscall instruction in our code 😊.

Stripping the Debug Symbols

While this step is not critical, stripping your binaries is worth the small extra effort. Once stripped, they are not only a lot harder to analyze but also smaller in size.

All we need to do is modify the Makefile to look as follows:

BOFNAME := program
CC_x64 := x86_64-w64-mingw32-gcc
STRIP_x64 := x86_64-w64-mingw32-strip
 
all:
    $(CC_x64) -c program.c -o compiled/$(BOFNAME).x64.o   -masm=intel -Wall -DBOF
    $(STRIP_x64) --strip-unneeded compiled/$(BOFNAME).x64.o
 
    $(CC_x64)    program.c -o compiled/$(BOFNAME).x64.exe -masm=intel -Wall
    $(STRIP_x64) --strip-all compiled/$(BOFNAME).x64.exe

While the EXE does end up being smaller, stripping the BOF doesn’t reduce its size significantly (only around 500 bytes).

Once the debugging symbols are stripped, if the program is compiled without changing the code, the resulting object file and executable will be the same regardless of who compiled it. This means that everyone will get the same object files after compiling it.


Is that a bad thing? Potentially, but only if fingerprinting is a concern. The code could be slightly modified and recompiled. For example, the seed of syswhispers2 could be changed. If the code is run from a Beacon or in memory in the form of shellcode, fingerprinting is much less of a concern, as there is no file on disk to analyze statically.

Compatibility

Supporting x86 might seem hard and pointless, but we shouldn’t limit ourselves and put every 32-bit machine out of our reach. Supporting x86 is a fun challenge and pays off in the end.

Code Logic

We’ll begin by introducing some conditional compilation clauses based on the architecture:

#if _WIN64
// x64 version of some logic
#else
// x86 version of some logic
#endif

If we want to add some code that is exclusive to x64:

#if _WIN64
// some code only for x64
#endif

If we want to add some code that is exclusive to x86:

#ifndef _WIN64
// some code only for x86
#endif

X86 syscall Support

To support syscalls in x86, we will have to deal with a few difficulties that are very manageable.

Function Names Within x86 Assembly

The main issue we encounter when trying to call the C functions SW2_GetSyscallNumber and GetSyscallAddress from x86 inline assembly is the following linker errors:

/usr/lib/gcc/i686-w64-mingw32/11.2.0/../../../../i686-w64-mingw32/bin/ld: /tmp/ccbjuGDN.o:program.c:(.text+0x68): undefined reference to `GetSyscallAddress'

/usr/lib/gcc/i686-w64-mingw32/11.2.0/../../../../i686-w64-mingw32/bin/ld: /tmp/ccbjuGDN.o:program.c:(.text+0x73): undefined reference to `SW2_GetSyscallNumber'

There is some GCC documentation which explains that, in x86 builds, the names of C functions (and variables) are prepended with an underscore at the assembly level. So, in this case, GetSyscallAddress becomes _GetSyscallAddress and SW2_GetSyscallNumber becomes _SW2_GetSyscallNumber.

Instead of calling them with the underscore, we can just adapt their definition to specify their name in assembly, like this:

syscalls.h

EXTERN_C DWORD SW2_GetSyscallNumber(DWORD FunctionHash) asm ("SW2_GetSyscallNumber");
EXTERN_C PVOID GetSyscallAddress(void) asm ("GetSyscallAddress");

We also need to do the same with the definitions for all the syscalls in syscalls.h. For example, here’s how we can modify NtOpenProcess:

syscalls.h (before)

EXTERN_C NTSTATUS NtOpenProcess(
OUT PHANDLE ProcessHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes,
IN PCLIENT_ID ClientId OPTIONAL);

syscalls.h (after)

EXTERN_C NTSTATUS NtOpenProcess(
OUT PHANDLE ProcessHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes,
IN PCLIENT_ID ClientId OPTIONAL) asm ("NtOpenProcess");

Once this is done, the weird x86 naming system should work fine.

Syscalls With Conflicting Types

There are some syscalls that fail to compile in x86, and produce an error message like:

error: conflicting types for ‘NtClose’;

While there are surely others, these syscalls are confirmed to have this issue:

  • NtClose
  • NtQueryInformationProcess
  • NtCreateFile
  • NtQuerySystemInformation
  • NtQueryObject

It appears that in x86, MinGW already has a definition of these functions somewhere. To fix this, we just need to rename the troubling syscalls by prepending an underscore to their name in the x86 version.

program.h

In program.c, we can call these functions normally, without prepending the underscore to their name.
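
The program.h listing is not reproduced here. One way the rename could be expressed is with preprocessor macros, so that call sites keep using the original names. The following is a sketch under that assumption, not necessarily the original listing. The SKETCH_ types, the stub body and the helpers close_handle and ntclose_symbol are hypothetical and exist only so that the snippet builds without the Windows headers.

```c
#include <stddef.h>

/* Hypothetical stand-ins so the sketch builds without <windows.h>. */
typedef long  SKETCH_NTSTATUS;
typedef void *SKETCH_HANDLE;

#ifndef _WIN64
/* Non-x64 builds: map the conflicting name to an underscore-prefixed one. */
#define NtClose _NtClose
#define NTCLOSE_SYMBOL "_NtClose"
#else
#define NTCLOSE_SYMBOL "NtClose"
#endif

/* Stub body for illustration only; the real function performs the syscall. */
SKETCH_NTSTATUS NtClose(SKETCH_HANDLE Handle)
{
    (void)Handle;
    return 0; /* STATUS_SUCCESS */
}

/* Call sites keep the original name; the macro rewrites it where needed. */
SKETCH_NTSTATUS close_handle(SKETCH_HANDLE Handle)
{
    return NtClose(Handle);
}

/* Reports which symbol name this build actually uses. */
const char *ntclose_symbol(void)
{
    return NTCLOSE_SYMBOL;
}
```

The same pattern would be repeated for each of the conflicting syscalls listed above.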

X86 Assembly Code

For the assembly code, we’ll need to update syscalls-asm.h to look as follows:

syscalls-asm.h

Finally, the x86 assembly will look like this:

After all these changes, we have x86 syscall support.

WoW64 Support?

WoW64 stands for Windows on Windows64, the layer that lets 32-bit programs run on 64-bit Windows machines. In WoW64 processes, syscalls are not issued via a syscall or sysenter instruction. Instead, a jump to fs:[0xc0] is performed. Understanding the way this works requires a long explanation, but for the purpose of this article, all we need to know is that it translates syscalls from 32-bit to 64-bit so that the kernel can understand them.

One quick way of “supporting” syscalls on WoW64 processes is to perform the same jump from our code. However, there are a few drawbacks when doing this. First, this is by no means a direct syscall, so EDRs can hook these calls. Additionally, in syscalls that take pointers, we will not be able to reference addresses above the 32-bit range.

Truly supporting direct syscalls for WoW64 processes would require us to transition via a far jmp instruction into 64-bit code, translate the parameters to their 64-bit counterparts, adjust the calling convention, set the stack alignment and more. These actions alone could make up an entire post.

That being said, jumping to fs:[0xc0] is an easy trick and at least we would have some support for WoW64, which might be useful for some scenarios.

To detect if our program is running as a WoW64 process, we’ll define a function called IsWoW64:

syscalls-asm.h

#if _WIN64
#define IsWoW64 IsWoW64
__asm__("IsWoW64: \n\
mov rax, 0 \n\
ret \n\
");
#else
#define IsWoW64 IsWoW64
__asm__("IsWoW64: \n\
mov eax, fs:[0xc0] \n\
test eax, eax \n\
jne wow64 \n\
mov eax, 0 \n\
ret \n\
wow64: \n\
mov eax, 1 \n\
ret \n\
");
#endif

syscalls.h

EXTERN_C BOOL IsWoW64(void) asm ("IsWoW64");

program.c

    if(IsWoW64())
    {
        PRINT("This is a 32-bit process running on a 64-bit machine!\n");
    }

If detection is a concern when running under a WoW64 context, just call IsWoW64() and bail out if it returns true.
This can also be checked in the .cna file in Cobalt Strike:

program.cna

$barch = barch($1);
$is64 = binfo($1, "is64");
if($barch eq "x86" && $is64 == 1)
{
    berror($1, "This program does not support WoW64");
    return;
}

We’ll also need to make a small change to the function GetSyscallAddress in order to set the syscall address to fs:[0xc0] if the process is running under WoW64:

PVOID GetSyscallAddress(void)
{
#ifdef _WIN64
    BYTE syscall_code[] = { 0x0f, 0x05, 0xc3 };
#else
    BYTE syscall_code[] = { 0x0f, 0x34, 0xc3 };
#endif
 
#ifndef _WIN64
    if (IsWoW64())
    {
        // if we are a WoW64 process, jump to WOW32Reserved
        SyscallAddress = (PVOID)READ_MEMLOC(0xc0);
        return SyscallAddress;
    }
#endif
 
    // Return early if the SyscallAddress is already defined
    if (SyscallAddress)
    {
        // make sure the instructions have not been replaced
        if (!strncmp((PVOID)syscall_code, SyscallAddress, sizeof(syscall_code)))
            return SyscallAddress;
    }
 
    // set the fallback as the default
    SyscallAddress = (PVOID)DoSysenter;
    …
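
The integrity check above compares raw instruction bytes rather than text, and a byte-wise comparison such as memcmp expresses that intent directly. The helper below is a standalone sketch of the same check. The names k_syscall_code and stub_is_intact are illustrative and are not part of the original code.

```c
#include <stddef.h>
#include <string.h>

#if _WIN64
/* syscall; ret */
static const unsigned char k_syscall_code[] = { 0x0f, 0x05, 0xc3 };
#else
/* sysenter; ret */
static const unsigned char k_syscall_code[] = { 0x0f, 0x34, 0xc3 };
#endif

/*
 * Returns 1 if the bytes at 'address' still match the expected
 * instruction sequence, and 0 if they differ or the pointer is NULL.
 */
int stub_is_intact(const void *address)
{
    if (address == NULL)
        return 0;
    return memcmp(address, k_syscall_code, sizeof(k_syscall_code)) == 0;
}
```

A cached address would only be reused when stub_is_intact returns 1. Otherwise the lookup continues as in the original function.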

Finally, we’ll update our Makefile to compile for both 64-bit and 32-bit.

Makefile

BOFNAME := program
CC_x64 := x86_64-w64-mingw32-gcc
CC_x86 := i686-w64-mingw32-gcc
STRIP_x64 := x86_64-w64-mingw32-strip
STRIP_x86 := i686-w64-mingw32-strip
 
all:
    $(CC_x64) -c program.c -o compiled/$(BOFNAME).x64.o   -masm=intel -Wall -DBOF
    $(STRIP_x64) --strip-unneeded compiled/$(BOFNAME).x64.o
 
    $(CC_x86) -c program.c -o compiled/$(BOFNAME).x86.o   -masm=intel -Wall -DBOF
    $(STRIP_x86) --strip-unneeded compiled/$(BOFNAME).x86.o

    $(CC_x64)    program.c -o compiled/$(BOFNAME).x64.exe -masm=intel -Wall
    $(STRIP_x64) --strip-all compiled/$(BOFNAME).x64.exe
 
    $(CC_x86)    program.c -o compiled/$(BOFNAME).x86.exe -masm=intel -Wall
    $(STRIP_x86) --strip-all compiled/$(BOFNAME).x86.exe
 
clean:
    rm compiled/$(BOFNAME).*.*

Conclusion

To summarize, this post explored several technical solutions to achieve the following objectives:

  • Create executables as well as BOFs using the same codebase
  • Issue syscalls through ntdll.dll instead of directly from an unknown module
  • Strip executables to make them smaller and harder to analyze
  • Run on both 64-bit and 32-bit
  • Have partial support for syscalls in WoW64

If you want to see an example of all this working together, check out nanodump.

User Defined Reflective Loader (UDRL) Update in Cobalt Strike 4.5

The User Defined Reflective Loader (UDRL) was first introduced in Cobalt Strike 4.4 to allow the creation and use of a custom reflective loader. The community quickly adopted it and pushed its limits. Updates were made in 4.5 to help address some of these limits.

Updates

Increased Size

A new hook, BEACON_RDLL_SIZE, was added to reserve either 5k or 100k of space for your custom loader. This increase will be reflected in your payloads.

Artifact Kit

The artifact kit has been updated to allow customization of sizes to ensure space is available for your loader.

UDRL and Malleable C2 Profile Consideration

When using the default reflective loader and generating Beacons, there are a few settings that affect Beacon’s runtime configuration and the loader, which are automatically handled when the payload is generated. This lets an operator change these settings in the profile, which modifies how the payload is generated and what it looks like in memory.

When using a UDRL, these settings are inserted into Beacon’s runtime configuration. However, because the reflective loader is defined by the user, Cobalt Strike cannot modify it as it does in the default case. It is up to the user to modify their reflective loader accordingly.

For example, if you are using the example loader from the UDRL kit, there is no code in the loader to do any conditional setup based on information stored in the header of the image. Depending on the settings in your Malleable C2 profile, this could cause unexpected runtime issues with Beacon. You will have to deal with the same type of issues when writing your own reflective loader from scratch.

Settings to be Considered

The following are settings to consider when the Malleable C2 profile option stage.sleepmask is set to TRUE:

stage.userwx 

This setting is a Boolean and informs the default loader to use either RWX or RX memory. At runtime, Beacon will either include or not include the .text section for masking. If the setting is set to TRUE, your user defined loader needs to set the protection on the .text section to RWX, otherwise Beacon will crash. If the setting is set to FALSE, your user defined loader should set the protection on the .text section to RX, as the .text section will not be masked.

stage.obfuscate 

This setting is a Boolean and informs the default loader to either copy the header or not copy the header into memory. At runtime Beacon will either include or not include the header section for masking. If the setting is set to TRUE, your UDRL should not copy the header into memory as Beacon will not mask the header section. If the setting is set to FALSE, your user defined loader should copy the header into memory as Beacon will mask the header section. 

Depending on how sophisticated your reflective loader is, you will need to make sure the settings in the Malleable C2 profile work with how the Beacon payload is loaded into memory. With the BEACON_RDLL_GENERATE and BEACON_RDLL_GENERATE_LOCAL aggressor script hooks, you have the opportunity to modify your reflective loader by using the aggressor script pe_* functions.
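
The two rules above can be summarized as a pair of decisions a loader has to make. The sketch below assumes the loader already knows the two profile values. How it learns them is left out, and the function names are hypothetical. The constants mirror the Windows values of PAGE_EXECUTE_READ and PAGE_EXECUTE_READWRITE and are redefined here so the snippet is standalone.

```c
/* Windows memory protection values, redefined so the sketch is standalone. */
#define SKETCH_PAGE_EXECUTE_READ      0x20u
#define SKETCH_PAGE_EXECUTE_READWRITE 0x40u

/*
 * stage.userwx (with stage.sleepmask TRUE):
 *   TRUE  -> Beacon masks .text, so the section must be RWX
 *   FALSE -> .text is not masked, so RX is appropriate
 */
unsigned int text_section_protection(int userwx)
{
    return userwx ? SKETCH_PAGE_EXECUTE_READWRITE : SKETCH_PAGE_EXECUTE_READ;
}

/*
 * stage.obfuscate (with stage.sleepmask TRUE):
 *   TRUE  -> Beacon does not mask the header, so do not copy it
 *   FALSE -> Beacon masks the header, so copy it into memory
 */
int should_copy_header(int obfuscate)
{
    return obfuscate ? 0 : 1;
}
```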

Handling a UDRL over 5k

The following is an example error that indicates your loader is over 5k. You can use the BEACON_RDLL_SIZE hook to increase this space to 100k.

Loader is over 5k

Artifact Kit Considerations

If you are using an artifact kit based on the one provided by Cobalt Strike, but with the default ‘stagesize’ values, the following error is logged, indicating that the larger patched Beacon will not fit in the standard artifacts generated for the kit. You will need to rebuild the artifact kit with larger ‘stagesize’ environment variable definitions.

Artifact kit hook error when the stagesize value is too low

References