Beacon Object Files (BOFs) were introduced in Cobalt Strike 4.1 in 2020. Since their release, BOFs have played a key role in post-exploitation activities, surpassing Reflective DLLs, .NET assemblies, and PowerShell scripts. However, in our experience, many developers struggle with four primary pain points:

  • The limitations of writing BOFs in C
  • Dynamic Function Resolution (DFR)
  • Difficulties with debugging BOFs
  • Unit Testing

In this blog post, we will tackle these difficulties by introducing a Visual Studio BOF template written in C++, which addresses the issues identified above. We aim to help you create more robust BOFs and to avoid unnecessary Beacon crashes on an endpoint.

Many of you asked if you could publicly share the BOFs written with this template. Therefore, we moved the template from Arsenal Kit to GitHub to encourage people to share them. You can find the template here.

Good Old C

Traditionally, BOFs have been written in C. For example, the following BOF queries the name of the primary Domain Controller in an Active Directory domain: 

#include <windows.h>
#include <stdio.h>
#include <dsgetdc.h>
#include "beacon.h"

DECLSPEC_IMPORT DWORD WINAPI NETAPI32$DsGetDcNameA(LPVOID, LPVOID, LPVOID, LPVOID, ULONG, LPVOID);
DECLSPEC_IMPORT DWORD WINAPI NETAPI32$NetApiBufferFree(LPVOID);

void go(char * args, int alen) {
    PDOMAIN_CONTROLLER_INFO pdcInfo = NULL;
    DWORD dwRet = NETAPI32$DsGetDcNameA(NULL, NULL, NULL, NULL, 0, &pdcInfo);

    if (ERROR_SUCCESS == dwRet) {
        BeaconPrintf(CALLBACK_OUTPUT, "%s", pdcInfo->DomainName);
        // Free the buffer only when the call succeeded and allocated it.
        NETAPI32$NetApiBufferFree(pdcInfo);
    }
} 

This example demonstrates some of the existing issues with DFR in BOF development. It necessitates declaring prototypes for imported functions and subsequently calling them using a verbose naming convention.

Unfortunately, in our opinion, C lacks the features required to address these challenges adequately. However, if we can compile an object file that contains the correct entry function (go()) with the correct calling convention, we can write a BOF in any programming language. For example, here is a PoC BOF written in Rust.

Note though that the object file cannot have any dependencies, which usually means we cannot use the programming language’s standard library. Therefore, a BOF written in other languages is not necessarily more feature rich than a BOF written in C. Regardless, we can still take advantage of the built-in features of the language itself to aid BOF development and in this blog post we will focus on built-in functionality within C++.

Using C++ to Write BOFs

We chose C++ for our template primarily because it offers features that help improve DFR declarations. We can also leverage powerful features built into the language, such as templates, classes, and compile-time expressions. Many use cases already exist for these features, such as compile-time string obfuscation, as demonstrated by Adam Yaxley here. An added benefit is that porting the significant number of existing C BOFs to C++ is trivial.
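
As a rough illustration of what compile-time expressions make possible without touching the standard library, the sketch below XOR-encodes a string literal during compilation. It is not Adam Yaxley's implementation: XorString, obfuscate, and kKey are hypothetical names, the fixed key is only for illustration, and the code assumes a C++14 (or later) compiler.

```cpp
// Header-free on purpose: BOFs usually cannot depend on the standard library.
using size_type = decltype(sizeof(0));

constexpr char kKey = 0x5A;  // hypothetical fixed key, for illustration only

template <size_type N>
struct XorString {
    char data[N];

    // Runs at compile time when used to initialize a constexpr object.
    constexpr XorString(const char (&text)[N]) : data{} {
        for (size_type i = 0; i < N; ++i) {
            data[i] = static_cast<char>(text[i] ^ kKey);
        }
    }

    // Decodes into a caller-supplied buffer at run time.
    void decode(char (&out)[N]) const {
        for (size_type i = 0; i < N; ++i) {
            out[i] = static_cast<char>(data[i] ^ kKey);
        }
    }
};

// Deduces the literal's length so callers do not have to spell it out.
template <size_type N>
constexpr XorString<N> obfuscate(const char (&text)[N]) {
    return XorString<N>(text);
}
```

A caller would write something like constexpr auto encoded = obfuscate("netapi32.dll"); and then decode into a local char array of sizeof(encoded.data) bytes just before use. A production version would likely need more care, for example per-string keys.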

Writing BOFs in C++ is nearly as straightforward as in C. However, because C++ encodes additional information into the name of functions, our entry point differs from what Cobalt Strike’s BOF loader expects. For example, after compiling the Domain Controller BOF shown above with the C compiler, the symbol table will contain the name go as shown in dumpbin’s output:

COFF SYMBOL TABLE
... 
00C 00000000 SECT3  notype ()    External     | go
...

In contrast, after compiling the same BOF with Microsoft’s C++ compiler, the symbol table will contain the decorated (mangled) name ?go@@YAX[…], as shown below:

COFF SYMBOL TABLE
... 
00F 00000000 SECT4  notype ()    External     | ?go@@YAXPEADH@Z (void __cdecl go(char *,int))
...

To overcome this difference in naming convention, we can wrap the entry function inside an extern "C" block to force the compiler to use C linkage for that function. For the same reason, the beacon.h header file must also be included within an extern "C" block (as shown below). Otherwise, the Beacon API function names are decorated as well, and the BOF loader does not recognize them.

extern "C" {
    #include "beacon.h"
    void go(char* args, int alen) {
        ... 
    } 
} 

Improving Dynamic Function Resolution

There are two distinct methods for resolving Win32 APIs in BOFs. The first approach is to use the LoadLibraryA/GetModuleHandle and GetProcAddress functions. These are exposed to the BOF by default and can be used to directly resolve the Win32 APIs you wish to invoke. The second approach is to use Beacon’s Dynamic Function Resolution (DFR), which is the more widely used method. With DFR, you provide the module and function names to Beacon, which in turn resolves the Win32 API functions while loading the BOF.

DFR is a practical solution for simplifying BOF writing by offloading the resolution of Win32 APIs to Beacon. Despite its practicality, it has two disadvantages:

  1. DFR declarations are usually very verbose.
  2. When calling Win32 API functions, you must stick to the MODULE$Function naming convention. 

For example, usage of the DsGetDcNameA function requires the following DFR declaration:

DECLSPEC_IMPORT DWORD WINAPI NETAPI32$DsGetDcNameA(LPVOID, LPVOID, LPVOID, LPVOID, ULONG, LPVOID);

Several projects aim to ease the DFR declaration process. For instance, DTM’s Python script automatically generates the necessary DFR declarations, while TrustedSec’s bofdefs.h header file provides declarations for commonly used Win32 APIs. These resources are helpful, but ultimately your code will still contain these lengthy declarations.

In the following sections, we will introduce two novel DFR approaches to aid BOF development: DFR via decltype specifiers and DFR via variable shadowing. Both approaches can now be used for BOF development in our Visual Studio BOF template.

DFR via the decltype Specifier

To simplify DFR, we can utilize the decltype specifier, which was introduced in C++11. Decltype allows us to extract the type of an expression and leverage it in DFR declarations to determine the function’s return type and arguments automatically. By using decltype, we can streamline DFR declarations in our code:

DECLSPEC_IMPORT decltype(DsGetDcNameA) NETAPI32$DsGetDcNameA;

The example above works because the DsGetDcNameA function is already declared in the Windows SDK headers (dsgetdc.h in this case), enabling decltype to determine its type. However, the declaration remains lengthy and consists primarily of static components. Therefore, we can utilize a macro to generate the code above to reduce its length:

#define DFR(module, function) \
DECLSPEC_IMPORT decltype(function) module##$##function;

This macro simplifies the DFR declarations to the following:

DFR(NETAPI32, DsGetDcNameA)
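
To see why the decltype form declares exactly the same function type as a hand-written prototype, here is a portable sketch that compiles without the Windows SDK. AddNumbers and MATHLIB are hypothetical stand-ins for a Win32 API and its module, DECLSPEC_IMPORT is omitted, and the `$` in identifiers relies on an extension that MSVC, GCC, and Clang accept.

```cpp
#include <type_traits>

// Stand-in for a prototype that would normally come from an SDK header.
long AddNumbers(int left, int right);

// Same shape as the DFR macro above, minus DECLSPEC_IMPORT.
#define DFR(module, function) decltype(function) module##$##function;

// Expands to: decltype(AddNumbers) MATHLIB$AddNumbers;
DFR(MATHLIB, AddNumbers)

// The generated declaration has the same type as the hand-written prototype.
static_assert(std::is_same<decltype(MATHLIB$AddNumbers), long(int, int)>::value,
              "decltype copies the return type and the parameters");

// Beacon would resolve the symbol at load time; the sketch defines it instead.
long MATHLIB$AddNumbers(int left, int right) { return left + right; }
```

Because the type is copied from the existing prototype, a change in the header's signature is picked up automatically instead of silently diverging from a hand-written LPVOID-based declaration.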

Nevertheless, there is still room for improvement, as the function still needs to be called using the NETAPI32$DsGetDcNameA nomenclature. This naming convention is less than ideal because it introduces extra steps when reusing existing code. For example, a common mistake is to forget to add the MODULE$ prefix.

One approach to get around this problem is to create a macro that establishes a mapping between the standard Win32 API function names and their corresponding MODULE$Function labels. By employing this technique, the following definition allows us to use DsGetDcNameA, instead of NETAPI32$DsGetDcNameA:

DFR(NETAPI32, DsGetDcNameA)
#define DsGetDcNameA NETAPI32$DsGetDcNameA

Remember that the DFR macro must be used before defining the name mapping macro.
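
The ordering requirement follows from how the preprocessor expands macro arguments. The sketch below shows the working order with hypothetical stand-ins (AddNumbers, MATHLIB) instead of a real Win32 API, and it defines the imported symbol itself because there is no Beacon to resolve it.

```cpp
// Stand-in for an SDK prototype; it is never defined or called directly.
int AddNumbers(int left, int right);

#define DFR(module, function) decltype(function) module##$##function;

// 1. Declare first: decltype(AddNumbers) still sees the SDK prototype.
DFR(MATHLIB, AddNumbers)
// 2. Map second: from here on, AddNumbers is rewritten to MATHLIB$AddNumbers.
#define AddNumbers MATHLIB$AddNumbers

// With the two steps swapped, DFR would expand to
//   decltype(MATHLIB$AddNumbers) MATHLIB$AddNumbers;
// which fails because MATHLIB$AddNumbers has not been declared yet.

// Beacon resolves the import in a real BOF; the sketch defines it.
int MATHLIB$AddNumbers(int left, int right) { return left + right; }

int sum_via_alias() {
    return AddNumbers(2, 3);  // preprocessed into MATHLIB$AddNumbers(2, 3)
}
```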

We can now convert the Domain Controller BOF introduced above to use the DFR macro, which saves us from writing the long DECLSPEC_IMPORT lines:

#include <windows.h>
#include <stdio.h>
#include <dsgetdc.h>
#include <lm.h>
#include "beacon.h"
 
DFR(NETAPI32, DsGetDcNameA)
#define DsGetDcNameA NETAPI32$DsGetDcNameA
DFR(NETAPI32, NetApiBufferFree)
#define NetApiBufferFree NETAPI32$NetApiBufferFree

void go(char * args, int alen) {
    PDOMAIN_CONTROLLER_INFO pdcInfo = NULL;
    DWORD dwRet = DsGetDcNameA(NULL, NULL, NULL, NULL, 0, &pdcInfo);

    if (ERROR_SUCCESS == dwRet) {
        BeaconPrintf(CALLBACK_OUTPUT, "%s", pdcInfo->DomainName);
        // Free the buffer only when the call succeeded and allocated it.
        NetApiBufferFree(pdcInfo);
    }
} 

DFR via Variable Shadowing

The DFR macro approach demonstrated above helps us automate the DECLSPEC_IMPORT declaration generation. However, we still must manually define the Function -> MODULE$Function mapping if we want to use Function() rather than the MODULE$Function naming convention. Unfortunately, a preprocessor macro cannot define another macro, so the DFR macro cannot generate this mapping itself.

However, we can employ an alternative approach by defining a regular function pointer variable that stores the address of the MODULE$Function function. C++ supports variable shadowing, which allows us to declare a DsGetDcNameA variable, as shown below:

decltype(NETAPI32$DsGetDcNameA) *DsGetDcNameA = NETAPI32$DsGetDcNameA;

This approach allows us to call the NETAPI32$DsGetDcNameA function via a function pointer named DsGetDcNameA. Therefore, it removes the need for an extra macro to create the mapping between function names, which was required for the DFR macro.

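This approach has one limitation: the function pointer cannot be defined outside of a code block, because at global scope the name DsGetDcNameA already refers to the function declared in the SDK header and cannot be redeclared as a variable. This means we are forced to define the pointer in every function where we want to call that specific Win32 API.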
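
A minimal, portable sketch of the shadowing trick follows. AddNumbers and MATHLIB are hypothetical stand-ins for a Win32 API and its module, and the imported symbol is defined locally because no Beacon is involved.

```cpp
// Stand-in for an SDK prototype; it is never defined or called directly.
int AddNumbers(int left, int right);

// Stand-in for the import that Beacon would resolve in a real BOF.
int MATHLIB$AddNumbers(int left, int right) { return left + right; }

int sum_via_shadow() {
    // Inside this block, the local pointer hides the global function name.
    decltype(MATHLIB$AddNumbers) *AddNumbers = MATHLIB$AddNumbers;
    return AddNumbers(2, 3);  // indirect call through the local pointer
}

// At namespace scope the same declaration would not compile: AddNumbers is
// already declared there as a function and cannot become a variable.
```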

The template includes the DFR_LOCAL macro to generate DFR declarations via variable shadowing as follows:

#define DFR_LOCAL(module, function) \
DECLSPEC_IMPORT decltype(function) module##$##function; \
decltype(module##$##function) *function = module##$##function;

The domain controller BOF can now be simplified by using the DFR_LOCAL macro, as demonstrated below:

#include <windows.h>
#include <stdio.h>
#include <dsgetdc.h>
#include <lm.h>
#include "beacon.h"

void go(char * args, int alen) { 
    DFR_LOCAL(NETAPI32, DsGetDcNameA);
    DFR_LOCAL(NETAPI32, NetApiBufferFree);

    PDOMAIN_CONTROLLER_INFO pdcInfo = NULL;
    DWORD dwRet = DsGetDcNameA(NULL, NULL, NULL, NULL, 0, &pdcInfo);

    if (ERROR_SUCCESS == dwRet) {
        BeaconPrintf(CALLBACK_OUTPUT, "%s", pdcInfo->DomainName);
        // Free the buffer only when the call succeeded and allocated it.
        NetApiBufferFree(pdcInfo);
    }
} 

The choice of which approach to use is up to your personal preference. That said, it is essential to note a key implementation difference between the two methods. Let us examine the following code to understand this distinction better:

DFR(KERNEL32, GetLastError)
#define GetLastError KERNEL32$GetLastError 

void func1() {  
    DFR_LOCAL(KERNEL32, OpenProcess);
  
    OpenProcess(...);
    GetLastError();
} 

At the time of writing, the latest version of the Microsoft Visual C++ compiler generates the following assembly for the above code:

; DFR_LOCAL(KERNEL32, OpenProcess)
mov     rax, QWORD PTR __imp_void * KERNEL32$OpenProcess; Get the address of the KERNEL32$OpenProcess
mov     QWORD PTR OpenProcess$[rsp], rax ; move the above address to stack 

; OpenProcess(...), argument setting for OpenProcess removed for clarity
call    QWORD PTR OpenProcess$[rsp] ; call KERNEL32$OpenProcess 

; GetLastError() 
call    QWORD PTR __imp_unsigned long KERNEL32$GetLastError(void) 

As expected, the DFR_LOCAL macro utilizes the stack to store the function pointer and the call instruction references the memory location on the stack.

Conversely, the DFR macro declares a function, and the call instruction invokes it through its import entry (__imp_) without first copying the address to the stack. Therefore, it does not add extra code, as DFR_LOCAL does in the code snippet above. However, turning on compiler optimizations reduces the generated assembly for the DFR_LOCAL macro and simplifies it to:

call    QWORD PTR __imp_?KERNEL32$OpenProcess@@YAPEAXKHK@Z

Hence, the two macros generate the same code when compiler optimizations are enabled. You can experiment with the optimization settings for this example here. Note that the template itself has all optimizations disabled by default.

Debugging Unlocked

Debugging BOFs has always been time-consuming, primarily due to the need to run a Beacon for testing purposes. Attaching a debugger to a BOF can be tricky, often forcing developers to rely on print-style debugging techniques.

Several publicly available BOF runners, such as TrustedSec’s COFF Loader and Nettitude’s RunOF, can be used to execute a BOF without the requirement of running Beacon. When working with these runners, it is possible to use the DebugBreak Win32 API or compiler intrinsics to inject an “int 3” and break into a debugger. However, these approaches require debugging at the assembly level, which is suboptimal, especially considering the source code is available.

With our Visual Studio BOF template, you can benefit from the convenience of debugging your BOF code directly within Visual Studio’s built-in debugger. This allows you to work at the source code level without needing to run the BOF through Beacon. To achieve this, we compile the debug version of the BOF as an EXE rather than an object file. However, for this approach to function correctly, it is necessary to define a main() function in the debug build as follows:

#ifdef _DEBUG
int main() {
    go(NULL, 0);
    return 0;
} 
#endif

Executing a BOF as an EXE introduces some differences compared to running it through Beacon, but they are relatively minor (for example, DFR is not required, the memory pages are backed by the EXE, etc.). This approach offers a convenient method for debugging the code and enhances the overall developer experience. Even so, we still recommend occasionally testing the BOF using Beacon to confirm it functions correctly.

As part of compiling a BOF into an EXE, it is essential to consider the DFR mechanism. The primary purpose of DFR is to enable the calling of Win32 API functions from a BOF. However, when compiling the BOF directly into an EXE, DFR becomes unnecessary, as the linker resolves the required Win32 API functions and we can call them directly. To ensure proper functionality in the debug build, it is necessary to redefine the DFR and DFR_LOCAL macros as follows, effectively indicating that MODULE$Function simply means Function:

#define DFR_LOCAL(module, function)
#define DFR(module, function) \
    decltype(function) *module##$##function = function;
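
One way to keep both sets of definitions side by side is a conditional block, sketched below in portable form. SKETCH_DEBUG is a hypothetical switch (the template's debug build would key off _DEBUG), AddNumbers and Twice stand in for Win32 APIs, MATHLIB stands in for a module name, and DECLSPEC_IMPORT is omitted. Only the debug branch is compiled here; the release branch would additionally need Beacon to resolve the imports.

```cpp
#define SKETCH_DEBUG 1  // hypothetical switch; a real build would test _DEBUG

#if SKETCH_DEBUG
// Debug EXE: the linker resolves the API, so MODULE$Function is simply a
// pointer to Function, and the local variant has nothing left to do.
#define DFR(module, function) decltype(function) *module##$##function = function;
#define DFR_LOCAL(module, function)
#else
// Release object file: Beacon resolves MODULE$Function while loading the BOF.
#define DFR(module, function) decltype(function) module##$##function;
#define DFR_LOCAL(module, function)                                \
    decltype(function) module##$##function;                        \
    decltype(module##$##function) *function = module##$##function;
#endif

// Stand-ins for Win32 APIs that the linker can see in the debug build.
int AddNumbers(int left, int right) { return left + right; }
int Twice(int value) { return 2 * value; }

DFR(MATHLIB, AddNumbers)

int bof_logic() {
    DFR_LOCAL(MATHLIB, Twice);               // expands to nothing in debug
    return Twice(MATHLIB$AddNumbers(2, 3));  // same source in both builds
}
```

The point of the design is that the body of bof_logic does not change between the two configurations; only the macro definitions do.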

Argument Packer and the Beacon API

It is often necessary to pass arguments to a BOF. However, our current debug build lacks this capability. To address this limitation, the Visual Studio BOF template includes a mocked version of the argument packer.

The BofData class replicates the argument packing behavior of the bof_pack Aggressor Script function. This enables us to call the BOF’s entry point with custom arguments without Beacon.

bof::mock::BofData data;

// the pack function takes one or more arguments
data.pack<int, short, int, const char*>(6502, 80, 68010, "Hello World");

// alternatively, the << operator can be used to construct the arguments buffer
data << 0xdeadface << L"Hello World";

// raw buffers can be added too
const char buf[] = { 0x41, 0x42, 0x43, 0x44 };
data.addData(buf, sizeof(buf));

go(data.get(), data.size());
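
To give a feel for how such a packer could be put together, here is a small sketch with a variadic pack function and a chained << operator. SketchPacker is a hypothetical stand-in and not the template's BofData class, and the byte layout (native-endian integers, strings prefixed with a 4-byte length) is an assumption made for illustration rather than a description of Beacon's actual argument format.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical packer; the layout below is assumed, not Beacon's wire format.
class SketchPacker {
public:
    SketchPacker& operator<<(std::int32_t value) {
        appendRaw(&value, sizeof(value));
        return *this;
    }

    SketchPacker& operator<<(std::int16_t value) {
        appendRaw(&value, sizeof(value));
        return *this;
    }

    SketchPacker& operator<<(const char* text) {
        // Assumed layout: 4-byte length (terminator included), then the bytes.
        const std::int32_t length =
            static_cast<std::int32_t>(std::strlen(text) + 1);
        *this << length;
        appendRaw(text, static_cast<std::size_t>(length));
        return *this;
    }

    // Appends every argument in order, converted to the type named by the caller.
    template <typename... Types>
    void pack(Types... values) {
        // Braced initializers are evaluated left to right (C++11 and later).
        int expand[] = {0, ((*this << values), 0)...};
        (void)expand;
    }

    char* get() { return buffer_.data(); }
    int size() const { return static_cast<int>(buffer_.size()); }

private:
    void appendRaw(const void* bytes, std::size_t count) {
        const char* first = static_cast<const char*>(bytes);
        buffer_.insert(buffer_.end(), first, first + count);
    }

    std::vector<char> buffer_;
};
```

Naming the types explicitly, as in pack<int, short, const char*>(6502, 80, "hi"), keeps the caller in control of how wide each field is, which mirrors the explicit type list in the example above.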

BOFs often call the various Beacon API functions to perform specific actions and/or send output back to the TeamServer. To replicate a realistic (Beacon) execution environment, the template also provides mocked implementations of the various Beacon API functions. The mocked functions within the Output API print the output to the standard output, ensuring the results are visible. Moreover, all returned output is stored for future examination and analysis.
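
The capture-and-echo behavior described above can be sketched as follows. All names here (CapturedOutput, capturedOutputs, MockBeaconPrintf) are hypothetical and chosen for illustration; the template's real mock API may look different.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// One recorded call: the callback type plus the formatted text.
struct CapturedOutput {
    int type;
    std::string text;

    bool operator==(const CapturedOutput& other) const {
        return type == other.type && text == other.text;
    }
};

// Process-wide store so that tests can inspect everything a BOF printed.
std::vector<CapturedOutput>& capturedOutputs() {
    static std::vector<CapturedOutput> outputs;
    return outputs;
}

// Stand-in for a mocked printf-style output function: format the message,
// echo it to standard output, and record it for later assertions.
extern "C" void MockBeaconPrintf(int type, const char* format, ...) {
    va_list args;
    va_start(args, format);

    // First pass measures the formatted length, second pass writes the text.
    va_list measure;
    va_copy(measure, args);
    const int length = std::vsnprintf(nullptr, 0, format, measure);
    va_end(measure);

    std::string text(length > 0 ? static_cast<std::size_t>(length) : 0, '\0');
    if (length > 0) {
        std::vsnprintf(&text[0], text.size() + 1, format, args);
    }
    va_end(args);

    std::printf("%s\n", text.c_str());
    capturedOutputs().push_back({type, text});
}
```

Storing the output as comparable records is what makes it possible to assert on a BOF's output with ordinary equality checks in a unit test.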

Additionally, the Internal API functions (such as BeaconUseToken, BeaconInjectProcess, etc.) are declared. However, it is important to note that these functions lack the real implementation and only display an error message on the standard error if called. If needed, you can always implement your own mockups for these unimplemented functions or change the current ones to fit your needs better.

All Tests Passed

Testing BOFs at scale or automating the testing process presents a considerable challenge, primarily due to the same issues encountered during the debugging phase. However, including any common unit testing framework in our debug build is straightforward now, enabling us to write test cases to cover the expected behavior of the BOF. On top of that, integrating this project template into a CI/CD pipeline offers a seamless way to test the BOF against various Windows versions. 

The project template offers an additional build target called UnitTest, specifically designed to build BOFs with the GoogleTest framework. Also, the mock library provides a convenient runMocked function that handles the argument packing, execution of the BOF’s entry point, and capturing all generated outputs. To illustrate this, let us say we have the following BOF, which parses one integer from the packed argument buffer and prints it back:

extern "C" { 
    #include "beacon.h"

    void go(char* args, int len) { 
        datap parser; 
        BeaconDataParse(&parser, args, len); 
        int number = BeaconDataInt(&parser); 
        BeaconPrintf(CALLBACK_OUTPUT, "Hello: %i", number); 
    } 
} 

A simple test case for the above BOF could be structured as follows:

TEST(ExampleBofTest, TestCase1) { 
    std::vector<bof::output::OutputEntry> actual =
        bof::runMocked<int>(go, 6502);

    std::vector<bof::output::OutputEntry> expected = {
        {CALLBACK_OUTPUT, "Hello: 6502"}
    };

    ASSERT_EQ(expected, actual);
}

Figure 1. The successful execution of test cases.

Visual Studio Project Template

The Visual Studio BOF template, now hosted on GitHub, combines all of the aforementioned features into a comprehensive solution for developing, debugging, and testing your BOFs. The template consists of three distinct build targets:

  1. Release: Generates production-ready object files for Cobalt Strike, ensuring a deployable version of your BOF.
  2. Debug: Builds a debug executable for the BOF, enabling a straightforward debugging experience to help identify and resolve bugs.
  3. UnitTest: Builds and executes all unit tests, allowing you to validate the functionality and behavior of your BOF through automated testing.

As a note, BOFs are compiled into individual object files. However, regular Visual Studio projects are meant to be compiled and linked into an EXE/DLL or library, and the project cannot be used to output only object files by default. To address this limitation, the project template utilizes Visual Studio’s Makefile project type. This provides a more granular build flow which allows for building only the necessary object files. Under the hood, this project type leverages Microsoft’s nmake to execute the Makefile and handle the build process. 

Additionally, having separate Visual Studio projects for each BOF may seem excessive. To keep things simple, the project template supports multiple BOFs in separate files, with each .cpp file compiled into an individual BOF, as demonstrated in the screenshot below:

Figure 2. An example file structure demonstrating multiple BOFs within a single solution.

This approach eliminates the need for multiple projects and allows grouping similar BOFs under one Visual Studio project.

Lastly, when it comes to debugging, since each BOF is compiled into its own debug executable, it is important to adjust the debug command accordingly. This can be done through the project properties: Configuration Properties -> Debugging -> Command.

Conclusion

In conclusion, BOF development poses its own challenges that complicate debugging and testing. By leveraging the Visual Studio BOF template, we can streamline our BOF development workflow, take advantage of C++’s features, and smoothly debug and test our BOFs without the requirement of Beacon. This project template is available on our GitHub account, together with installation and usage instructions.

Related Work

Visual Studio BOF template by @yas_o_h – We found this project while creating our own. It takes a different approach but still contains useful resources.