Hotpatching has been looming over Windows 11 for a while now, having already shipped on server & cloud deployments. News first broke in March that the first major version to include it would be 24H2, which can now be confirmed with a few minutes of reversing the kernel or ntdll. The feature turns out to be quite extensive and hooks itself into many core functionalities of the OS. As such, it’s bound to break a few programs here and there, and one of the unlucky victims this time was x64dbg.

Last week I was doing some debugging on the 24H2 insider build and realized that the memory map view in x64dbg was broken. Instead of showing a single entry for each section of each module, some modules had their sections inlined in a single entry. For example, memory belonging to kernel32.dll looked like this:

Address           Size              Party     Info
00007FFD053C0000  00000000000C8000  System    kernel32.dll, ".text", "fothk", ".rdata", ".data", ".pdata", ".didat", ".rsrc", ".reloc"

Whereas normally, it would show like this:

Address             Size                Party      Info 
00007FF975E90000    0000000000001000    System     kernel32.dll 
00007FF975E91000    000000000007E000    System      ".text"
00007FF975F0F000    0000000000033000    System      ".rdata" 
00007FF975F42000    0000000000002000    System      ".data"
00007FF975F44000    0000000000006000    System      ".pdata"
00007FF975F4A000    0000000000001000    System      ".didat"
00007FF975F4B000    0000000000001000    System      ".rsrc"
00007FF975F4C000    0000000000001000    System      ".reloc"

After a brief investigation, I found out that the reason it was breaking was an extra page existing at the end of the image region of those modules. The contents of the page didn’t immediately suggest what it’s being used for, so I didn’t initially push my investigation further. But the weirdness didn’t end there, and soon I noticed that I couldn’t do much with this page, as VirtualProtect failed to work on it, returning STATUS_NOT_COMMITTED. However, the page was seen as being committed by x64dbg, as well as other memory inspection tools and debuggers.

This sparked my curiosity and I decided to track down the root cause of this behaviour, as I hadn’t encountered anything like it before. This led me down the rabbit hole of CFG changes on 24H2 and gave birth to this post.

tl;dr


  • With hotpatching on Windows 11 comes SCP, a new feature whose purpose seems to be to provide relocatable, position-independent functions that can later be hooked painlessly into processes and individual modules. Even though the changes encompass more than CFG code, the majority of it seems to be focused on making CFG functions independent of external code & data.
  • The primary change is the implementation of new (but functionally the same) CFG functions in their dedicated sections in ntdll (usermode) and kCFG functions in ntoskrnl (kernel). The sections are copied and fixed up into their own dedicated pages at runtime, and these pages are then mapped into both processes and individual modules which satisfy some conditions related to hotpatching.
  • There don’t seem to be underlying security improvements and the changes are likely focused only on providing compatibility with hotpatching.

preliminaries


This post is not about hotpatching. It is a very extensive feature and reversing it would require more effort than I’ve put in here. The purpose of this post is to share implementation details of CFG in 24H2, including both usermode & kernel. The new implementation is somewhat coupled with hotpatching, so we won’t be able to answer some questions, and some details may remain obscured. Nevertheless, I believe that the post should provide a good overview of the changes and pointers for further research. All debugging & reversing was done on the Windows 11 Pro 26100.268 insider build.

In this post I assume that the reader is acquainted with CFG on Windows & its history and that they have moderate knowledge of Windows NT internals. CFG has a long history on Windows and you can learn more about it from these sources:

The following may also provide some additional context for this post:

classic cfg


Trend Micro’s document I’ve mentioned above provides a very detailed description of the classic CFG implementation on Windows. We’ll summarize the mechanisms involved here:

  • The kernel contains a system-wide bitmap in which each bit denotes whether a range of 8 bytes contains a valid call target. This bitmap is updated each time an image is loaded or unloaded in userspace. To expose the bitmap to userspace, the address of the bitmap is written to ntdll’s exported DllSystemInitBlock.
  • To make use of CFG, a module provides a configuration in its load configuration data directory (IMAGE_LOAD_CONFIG_DIRECTORY), in the format explained in the docs. Additionally, a table of valid call targets is provided within one of the sections.
  • The kernel loader reads GuardCFFunctionTable and GuardCFFunctionCount from the directory, walks through the table, and sets the appropriate bits in the system-wide bitmap.
  • The usermode loader reads GuardCFCheckFunctionPointer & GuardCFDispatchFunctionPointer and patches them to point towards one of the classic CFG functions, e.g. if CFG is enabled for the process, then LdrpValidateUserCallTarget / LdrpDispatchUserCallTarget or the export-suppression counterparts, otherwise the nop functions.
  • Classic CFG functions validate (or validate and dispatch) a call target argument that’s passed in RAX. They use the system-wide bitmap for that, which they access by looking up its address in the DllSystemInitBlock, which was previously written by the kernel.
  • Kernel CFG (kCFG) works in a similar manner, but only if virtualization-based security is enabled. This is because VBS is the only way to truly protect the kCFG bitmap from being easily overwritable.
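
To make the bitmap bookkeeping above a bit more concrete, here is a rough C sketch of how a loader could mark call targets. It is only an illustration under simplifying assumptions: one bit per 8 bytes of address space, a tiny in-memory bitmap, and a table of plain 32-bit RVAs (real tables may carry extra per-entry metadata). The names cfg_mark_target and cfg_mark_table are made up for this example and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: mark one address as a valid call target.
// Assumed model: each 64-bit word covers 512 bytes, each bit covers 8 bytes.
static bool cfg_mark_target(uint64_t *bitmap, size_t words, uint64_t address)
{
    uint64_t index = address >> 9;             // which 64-bit word
    if (index >= words) {
        return false;                          // outside of the toy bitmap
    }
    bitmap[index] |= 1ull << ((address >> 3) & 63);
    return true;
}

// Hypothetical helper: walk a table of RVAs (in the spirit of
// GuardCFFunctionTable / GuardCFFunctionCount) and mark every entry.
// Returns the number of targets that were marked.
static size_t cfg_mark_table(uint64_t *bitmap, size_t words, uint64_t image_base,
                             const uint32_t *rvas, size_t count)
{
    assert(bitmap != NULL && rvas != NULL);
    size_t marked = 0;
    for (size_t i = 0; i < count; ++i) {
        if (cfg_mark_target(bitmap, words, image_base + rvas[i])) {
            ++marked;
        }
    }
    return marked;
}
```

The real bitmap covers the whole user address space and is maintained by the kernel; the sketch only shows the index arithmetic.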

new ntdll sections


The first thing that stands out in the 24H2 build are the new sections in ntdll. There are four of them: SCPCFG, SCPCFGFP, SCPCFGNP, SCPCFGES; and they’re marked as RX, each of them taking only a single page. The structure of contents is the same for each section:

[+0x00] SCPCFG header
    [+0x00][0x04] Offset to dispatch (no es) function
    [+0x04][0x04] Offset to dispatch (es) function
    [+0x08][0x04] Offset to validate (no es) function
    [+0x0C][0x04] Offset to validate (es) function
    [+0x10][0x04] Offset to invalid call handler
    [+0x14][0x04] Offset to rtl function table
    [+0x18][0x28] Unknown // I didn't bother reversing
[+dynamic] Dispatch (no es) function
[+dynamic] Dispatch (es) function
[+dynamic] Validate (no es) function
[+dynamic] Validate (es) function
[+dynamic] Invalid call handler
[+dynamic] Icall handler
[+dynamic] Unwind table
    [+0x00][0x0C] Unwind table stuff
[+dynamic] Rtl function table
    [+0x00][dynamic] array of RUNTIME_FUNCTION entries

For example, the SCPCFG section looks like this:

[+0x00] SCPCFG header
    [+0x00][0x04] = 0x40
    [+0x04][0x04] = 0xC0
    [+0x08][0x04] = 0x140
    [+0x0C][0x04] = 0x1C0
    [+0x10][0x04] = 0x240
    [+0x14][0x04] = 0x2A4
    [+0x18][0x28] = {66 66 66 66 66 66 66 0F 1F 84 00 00 00 00 00 66 66 66 66 66 66 66 0F 1F 84 00 00 00 00 00 66 66 0F 1F 84 00 00 00 00 00}
[+0x40] = ScpCfgDispatchUserCallTarget
[+0xC0] = ScpCfgDispatchUserCallTargetES
[+0x140] = ScpCfgValidateUserCallTarget
[+0x1C0] = ScpCfgValidateUserCallTargetES
[+0x240] = ScpCfgHandleInvalidCallTarget
[+0x280] = ScpCfgICallHandler
[+0x298] = Unwind table
    [+0x00][0x0C] = {19 00 00 00 80 02 00 00 00 00 00 00}
[+0x2A4] = Rtl function table
    [+0x00][0x0C] = {.SectionBegin = 0x00, .SectionEnd = 0x280, .UnwindData = 0x298}
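
The header layout above maps naturally onto a fixed-size struct. The following C sketch is just one way to write it down; the type and field names are invented for illustration (the real names aren't known), and it assumes a little-endian host. The fixed function offsets it checks are the same values shown in the SCPCFG dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical mirror of the 0x40-byte header at the start of each section.
typedef struct ScpCfgHeader {
    uint32_t dispatch_no_es;        // +0x00
    uint32_t dispatch_es;           // +0x04
    uint32_t validate_no_es;        // +0x08
    uint32_t validate_es;           // +0x0C
    uint32_t invalid_call_handler;  // +0x10
    uint32_t rtl_function_table;    // +0x14
    uint8_t  unknown[0x28];         // +0x18, not reversed in the post
} ScpCfgHeader;

static_assert(sizeof(ScpCfgHeader) == 0x40, "header must end where the first function begins");
static_assert(offsetof(ScpCfgHeader, unknown) == 0x18, "unknown bytes start at +0x18");

// Copy the header out of a raw section buffer and check that the first four
// function offsets have the values seen in the dump. Returns 1 on success.
static int scp_cfg_read_header(const uint8_t *section, size_t size, ScpCfgHeader *out)
{
    if (size < sizeof(*out)) {
        return 0;
    }
    memcpy(out, section, sizeof(*out));
    return out->dispatch_no_es == 0x40 && out->dispatch_es == 0xC0 &&
           out->validate_no_es == 0x140 && out->validate_es == 0x1C0;
}
```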

Names of the functions located in all of the sections resemble those of classic CFG functions. Thus it’s not hard to determine the purpose of each function:

  • The validation & dispatch functions are counterparts of the classic CFG validation & dispatch functions.
  • The ES / NO ES distinction is made in regards to export suppression configuration, MS says some stuff about it here.
  • The invalid call handler is the function that’s called from the dispatch function in case the call target is invalid.
  • The icall handler is there to work in conjunction with exception handling, but I’m not entirely sure of the mechanisms involved.

Other than these functions, the sections also contain a function table with an unwind table. We’ll later see what these are used for. There are also 40 unknown bytes at the end of the header that I couldn’t understand, as I didn’t see any code accessing them.

On the surface, the functions between different sections look similar amongst each other, but also very similar to their classic CFG counterparts. The first difference that can be seen is that, where classic CFG functions use the address of the cfg bitmap written to DllSystemInitBlock, the new functions have a hardcoded 0x0123456789ABCDEF. That looks like a placeholder value that’s supposed to be patched by something later. For example:

mov     r11, cs:qword_1801D94F8     <------ bitmap address is stored inside ntdll DllSystemInitBlock in the .mrdata section
mov     r10, rax
shr     r10, 9
mov     r11, [r11+r10*8]
mov     r10, rax
shr     r10, 3
test    al, 0Fh
jnz     short loc_fail
bt      r11, r10
jnb     short loc_fail
jmp     rax
...

ntdll!LdrpDispatchUserCallTarget

mov     r11, 123456789ABCDEFh     <------ hardcoded placeholder value
mov     r10, rax
shr     r10, 9
mov     r11, [r11+r10*8]
mov     r10, rax
shr     r10, 3
test    al, 0Fh
jnz     short loc_fail
bt      r11, r10
jnb     short loc_fail
jmp     rax
...

ntdll!ScpCfgDispatchUserCallTarget
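
For readers who prefer C over asm, the visible part of both listings boils down to the lookup below. This is a sketch of the aligned fast path only: the listings elide what happens at loc_fail, so the sketch simply treats that path as a rejection, which is an assumption. The function name is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical C rendering of the check performed before `jmp rax`.
static bool cfg_check_aligned_target(const uint64_t *bitmap, uint64_t target)
{
    assert(bitmap != NULL);
    uint64_t word = bitmap[target >> 9];   // mov r11, [r11+r10*8] with r10 = rax >> 9
    uint64_t bit  = (target >> 3) & 63;    // bt with a register operand uses the bit index modulo 64
    if ((target & 0x0F) != 0) {            // test al, 0Fh / jnz loc_fail
        return false;                      // assumption: loc_fail is elided in the listing
    }
    return ((word >> bit) & 1) != 0;       // bt r11, r10 / jnb loc_fail
}
```

The only difference between the classic and the SCP variant is where the bitmap pointer comes from: a load from DllSystemInitBlock versus an immediate baked into the instruction.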

There are also meaningful differences between the sections. To demonstrate that, let’s take a look at the implementation of the dispatch function in each of the sections:

jmp     rax

ntdll!ScpCfgDispatchUserCallTarget_Nop (SCPCFGNP)

mov     r11, 123456789ABCDEFh 
mov     r10, rax
shr     r10, 9
mov     r11, [r11+r10*8]
mov     r10, rax
shr     r10, 3
test    al, 0Fh
jnz     short loc_fail
bt      r11, r10
jnb     short loc_fail
jmp     rax
...

ntdll!ScpCfgDispatchUserCallTarget (SCPCFG)

mov     r11, 123456789ABCDEFh
mov     r10, rax
shr     r10, 9
mov     r11, [r11+r10*8]
mov     r10, rax
shr     r10, 3
test    al, 0Fh
jnz     short loc_180160069
bt      r11, r10
jnb     short loc_180160074
jmp     rax
...

ntdll!ScpCfgDispatchUserCallTarget_ES (SCPCFGES)

mov     r11, 123456789ABCDEFh
mov     r11, [r11]
jmp     r11

ntdll!ScpCfgDispatchUserCallTarget_Fptr (SCPCFGFP)

The implementation of the dispatch function in SCPCFG & SCPCFGES is the same, but the other two sections work in a different manner. SCPCFGNP implements a nop function, immediately jumping to the call target held in rax. On the other hand, the implementation in SCPCFGFP jumps to a function pointer that’s read from an address that seems to be unknown at compile time.

With this in mind, it’s not hard to determine the intent behind each section:

  • SCPCFG & SCPCFGES represent regular CFG implementations, having access to a bitmap that stores information on whether an address is a valid call target. SCPCFGES uses export suppression, whereas SCPCFG doesn’t.
  • SCPCFGNP represents lack of CFG. Its implementations are defined to let any call target through. (NP = NOP)
  • SCPCFGFP calls an implementation that’s located elsewhere through a function pointer. (FP = function pointer)

One more thing worth looking into is the invalid call target handler. For all sections except SCPCFGNP, its implementation looks like this:

mov     r11, 123456789ABCDEFh
jmp     r11

ntdll!ScpCfgHandleInvalidCallTarget / ntdll!ScpCfgHandleInvalidCallTarget_ES / ntdll!ScpCfgHandleInvalidCallTarget_Fptr

Meaning that it also is supposed to jump to a yet-to-be-determined place. As we’ll see later, all of the placeholder values are going to be patched by the kernel.

The only references to the new sections from within ntdll are in a new export, RtlpScpCfgNtdllExports, which points towards the headers and ends of the different sections. The export itself is only referenced from the export table.

.rdata:00000001801654A0 RtlpScpCfgNtdllExports
.rdata:00000001801654A0      dq offset ScpCfgHeader_Nop
.rdata:00000001801654A8      dq offset ScpCfgEnd_Nop
.rdata:00000001801654B0      dq offset ScpCfgHeader
.rdata:00000001801654B8      dq offset ScpCfgEnd
.rdata:00000001801654C0      dq offset ScpCfgHeader_ES
.rdata:00000001801654C8      dq offset ScpCfgEnd_ES
.rdata:00000001801654D0      dq offset ScpCfgHeader_Fptr
.rdata:00000001801654D8      dq offset ScpCfgEnd_Fptr
.rdata:00000001801654E0      dq offset LdrpGuardDispatchIcallNoESFptr
.rdata:00000001801654E8      dq offset __guard_dispatch_icall_fptr
.rdata:00000001801654F0      dq offset LdrpGuardCheckIcallNoESFptr
.rdata:00000001801654F8      dq offset __guard_check_icall_fptr
.rdata:0000000180165500      dq offset LdrpHandleInvalidUserCallTarget
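
Read as a data structure, the listing above is four (begin, end) pairs followed by five pointers. A possible C layout is sketched below; the type and field names are hypothetical and only chosen to match the listing, and the helper for computing a section's size is an illustration rather than anything found in ntdll.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical (begin, end) pair describing one SCPCFG section in ntdll.
typedef struct ScpCfgSectionRange {
    uint64_t begin;
    uint64_t end;
} ScpCfgSectionRange;

// Hypothetical layout of the exported table, in the order of the listing.
typedef struct ScpCfgNtdllExports {
    ScpCfgSectionRange sections[4];            // 0 = NP, 1 = SCPCFG, 2 = ES, 3 = FP
    uint64_t dispatch_icall_no_es_fptr;        // LdrpGuardDispatchIcallNoESFptr
    uint64_t dispatch_icall_fptr;              // __guard_dispatch_icall_fptr
    uint64_t check_icall_no_es_fptr;           // LdrpGuardCheckIcallNoESFptr
    uint64_t check_icall_fptr;                 // __guard_check_icall_fptr
    uint64_t handle_invalid_user_call_target;  // LdrpHandleInvalidUserCallTarget
} ScpCfgNtdllExports;

static_assert(sizeof(ScpCfgNtdllExports) == 0x68, "13 qwords: 0x1801654A0..0x180165507");
static_assert(offsetof(ScpCfgNtdllExports, handle_invalid_user_call_target) == 0x60,
              "last field sits at 0x180165500 - 0x1801654A0");

// Size in bytes of section `index`, or 0 if the index or range is invalid.
static uint64_t scp_cfg_section_size(const ScpCfgNtdllExports *exports, unsigned index)
{
    if (index >= 4 || exports->sections[index].end < exports->sections[index].begin) {
        return 0;
    }
    return exports->sections[index].end - exports->sections[index].begin;
}
```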

Even though the sections are present in ntdll, they’re not used by usermode code as such. To understand how everything connects at runtime, we’ll need to dive into kernel code.

kernel initialization


Similarly to classic CFG, much of the logic related to SCPCFG happens in the kernel. Classic CFG is initialized in MiInitializeCfg, and this function remains unchanged between 23H2 and 24H2. SCPCFG is initialized in a different function - MmInitializeImageViewExtension. This takes place during phase 1 initialization and its purpose is to map the four sections from ntdll into the kernel and fix them up so that they can later be dropped anywhere in the userspace and used out-of-the-box. The following steps make up the gist of the process:

  • [MmInitializeImageViewExtension] The kernel obtains a view of the global cfg bitmap for the initial system process by calling MiMapSecurePureReserveView with PsInitialSystemProcess. Internally, this is implemented through MmMapViewOfSectionEx, with the view additionally being secured via MiSecureVad, which prevents the view protection from being changed. This view is then stored in a global variable.
  • [MiInitializeImageViewExtensionCfg] Four new combined pages are allocated, and the corresponding combine blocks are stored into a global array. The pages are initially empty and will be filled in later. We’ll call these SCPCFG pages, as they’ll store data taken from ntdll’s SCPCFG sections.
  • [PsInitializeScpCfgPages / PspLocateNtdllAddressesForScpCfg] The exported RtlpScpCfgNtdllExports is located in ntdll via its export table and used to find the offsets of the four SCPCFG sections in the file. Contents of each section are then copied into the corresponding SCPCFG page and a bunch of sanity checks are performed to ensure integrity of the data.
  • [PspLocateNtdllAddressesForScpCfg] Pointers to ntdll’s SCPCFG functions are stored in a global array (PspNtdllScpFunctions). Here, the pointers don’t point inside any of the new pages, nor to the sections embedded in ntdll. They’re calculated as ntdllBaseAddress + ntdllImageSizeInMemory + fixedOffsetToTheFunction, suggesting that one of the SCPCFG pages will be mapped at the end of ntdll’s userspace image region. We’ll later see that this is indeed the case.
  • [PspFinalizeScpCfgPage] Placeholder values (i.e. 0x0123456789ABCDEF) are replaced in the SCPCFG pages. Particularly, the cfg bitmap address placeholder is replaced with the address of the view to the cfg bitmap obtained in the first step. The placeholder in the invalid call target handler is replaced with the pointer stored in the last field of RtlpScpCfgNtdllExports, i.e. LdrpHandleInvalidUserCallTarget. The placeholder function pointers in the SCPCFGFP section are replaced with the pointers stored in RtlpScpCfgNtdllExports, starting from LdrpGuardDispatchIcallNoESFptr and ending with __guard_check_icall_fptr.
void MmInitializeImageViewExtension(bool doInitialize)
{
    if (doInitialize)
    {
        // this branch is reached during phase 1 initialization
        if (SUCCEEDED(MiMapSecurePureReserveView(PsInitialSystemProcess, g_CFGBitmap, &g_SCPCFGBitmapView, ...)))
        {
            // g_SCPCFGBitmapView now contains a read-only secure view of the CFG bitmap
            ...
            MiInitializeImageViewExtensionCfg(true);
        }
    }
    else
    {
        ... // this branch is reached during pre-initialization and does some minor hotpatching related initialization
    }
}

void MiInitializeImageViewExtensionCfg(bool arg)
{
    // kernel-y stuff like allocating PTEs
    ...
    
    // array of pages that will be used to store SCPCFG sections
    void* scpCfgPages[4];
    
    for (uint32_t sectionIndex = 0; sectionIndex < 4; ++sectionIndex)
    {
        // allocate a page, initialize pfn, make a valid pte
        ...
        void* newPage = ...;
        scpCfgPages[sectionIndex] = newPage;
        
        // allocate a new combine block for the page
        auto combineBlock = MiAllocateCombineBlock(...);
        
        // fill in info in the combine block and map the page
        ...
        
        // write combine block to the array in the global MiState, we'll call this array g_SCPCFGSectionBlocks
        // since arg == 1 during the initialization, the offset being used is 0xD78
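        // slots are pointer-sized (8 bytes): the two arrays at 0xD78 and 0xD98 are 4 * 8 = 0x20 bytes apart
        *(void**)(MiState + 8 * sectionIndex + (arg ? 0xD78 : 0xD98)) = combineBlock;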
    }
    
    // Initialize the newly mapped pages
    PsInitializeScpCfgPages(scpCfgPages, ..., g_SCPCFGBitmapView, ...);
}

NTSTATUS PsInitializeScpCfgPages(void** pages, ..., void* cfgBitmapView, ...)
{
    // Locate ntdll addresses for scpcfg
    RTL_SCP_CFG_NTDLL_EXPORTS sections;
    RTL_SCP_CFG_NTDLL_EXPORTS_ARM64EC arm64EcSections;
    if (SUCCEEDED(PspLocateNtdllAddressesForScpCfg(..., &sections, &arm64EcSections)))
    {
        // Copy each section into the corresponding page
        for (uint32_t sectionIndex = 0; sectionIndex < 4; ++sectionIndex)
        {
            auto section = &sections.headers[sectionIndex];   // offset sectionIndex * 0x10
            memcpy(pages[sectionIndex], section->Begin, section->End - section->Begin);
            ...
        }
        
        // Ensuring function offsets have fixed value in the header, i.e.
        // [0] = 0x40, [1] = 0xC0, [2] = 0x140, [3] = 0x1C0
        ...
        
        // Finalize each mapped section
        for (uint32_t sectionIndex = 0; sectionIndex < 4; ++sectionIndex)
        {
            if (SUCCEEDED(PspFinalizeScpCfgPage(pages[sectionIndex], sectionIndex, cfgBitmapView, &sections)))
            {
                ...
            }
        }
    }
}

NTSTATUS PspLocateNtdllAddressesForScpCfg(..., RTL_SCP_CFG_NTDLL_EXPORTS* outSections, RTL_SCP_CFG_NTDLL_EXPORTS_ARM64EC* outArm64Sections)
{
    // Set arm64 sections to zero (size of the struct, not of the pointer)
    memset(outArm64Sections, 0, sizeof(*outArm64Sections));
    
    // Find the SCPCFG export in ntdll
    if (SUCCEEDED(PspCopyNtdllExport(..., "RtlpScpCfgNtdllExports", outSections, ...)))
    {
        // Sanity checks and some internal remapping which doesn't affect logic much
        ...
    }
    
    // Set the global PspNtdllScpFunctions
    PspNtdllScpFunctions[0] = MmGetScpCfgFunctionOffset(0x140, ntdllImageSize); // validate (no es)
    PspNtdllScpFunctions[1] = MmGetScpCfgFunctionOffset(0x1C0, ntdllImageSize); // validate (es)
    PspNtdllScpFunctions[2] = MmGetScpCfgFunctionOffset(0x40, ntdllImageSize);  // dispatch (no es)
    PspNtdllScpFunctions[3] = MmGetScpCfgFunctionOffset(0xC0, ntdllImageSize);  // dispatch (es)
}

NTSTATUS PspFinalizeScpCfgPage(void* mappedSectionPage, uint32_t sectionIndex, void* cfgBitmapView, RTL_SCP_CFG_NTDLL_EXPORTS* sections)
{
    // note: the +0x02 below skips the two opcode bytes of `mov r11, imm64`, landing on the 8-byte immediate
    if (sectionIndex == 1 || sectionIndex == 2) // branch for SCPCFG and SCPCFGES
    {
        // a bunch of sanity checks to ensure the data in the section looks good
        ...
        
        // replace the placeholder cfg bitmap address in code of each of the first four functions in the section
        for (uint32_t i = 0; i < 4; ++i)
        {
            *(void**)(mappedSectionPage + mappedSectionPage->func_offset[i] + 0x02) = cfgBitmapView;
        }
        
        // replace the placeholder function pointer in ScpCfgHandleInvalidCallTarget et al
        *(void**)(mappedSectionPage + mappedSectionPage->func_offset[4] + 0x02) = sections->Ptr_HandleInvalidUserCallTarget;
    }
    else if (sectionIndex == 3) // branch for SCPCFGFP
    {
        // similar sanity checks
        ...
        
        // Replace the placeholder function ptr in code of each of the first four functions for the FPTR section
        *(void**)(mappedSectionPage + mappedSectionPage->func_offset[0] + 0x02) = sections->Ptr_LdrpGuardDispatchIcallNoESFptr;
        *(void**)(mappedSectionPage + mappedSectionPage->func_offset[1] + 0x02) = sections->Ptr___guard_dispatch_icall_fptr;
        *(void**)(mappedSectionPage + mappedSectionPage->func_offset[2] + 0x02) = sections->Ptr_LdrpGuardCheckIcallNoESFptr;
        *(void**)(mappedSectionPage + mappedSectionPage->func_offset[3] + 0x02) = sections->Ptr___guard_check_icall_fptr;
        
        // And also replace the placeholder function pointer for HandleInvalidUserCallTarget as well
        *(void**)(mappedSectionPage + mappedSectionPage->func_offset[4] + 0x02) = sections->Ptr_HandleInvalidUserCallTarget;
    }
}
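
The fixup itself is easy to reproduce on a plain byte buffer. The sketch below patches the 8-byte immediate of a `mov r11, imm64` instruction (encoded as 49 BB followed by the little-endian immediate) at a given function offset. Checking that the immediate still equals the placeholder before overwriting it is an assumption about what the kernel's sanity checks might cover, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCP_PLACEHOLDER 0x0123456789ABCDEFull

// Read / write a 64-bit little-endian value regardless of host endianness.
static uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; --i) {
        v = (v << 8) | p[i];
    }
    return v;
}

static void write_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; ++i) {
        p[i] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
}

// Hypothetical fixup: expects `mov r11, 0x0123456789ABCDEF` at func_offset
// and replaces the immediate with `value`. Returns false if the bytes at
// that offset don't look like an unpatched placeholder.
static bool scp_patch_placeholder(uint8_t *page, size_t page_size,
                                  uint32_t func_offset, uint64_t value)
{
    assert(page != NULL);
    if (func_offset > page_size || page_size - func_offset < 10) {
        return false;                           // instruction would not fit in the page
    }
    uint8_t *insn = page + func_offset;
    if (insn[0] != 0x49 || insn[1] != 0xBB) {
        return false;                           // not `mov r11, imm64`
    }
    if (read_le64(insn + 2) != SCP_PLACEHOLDER) {
        return false;                           // already patched or unexpected contents
    }
    write_le64(insn + 2, value);                // same +0x02 as in the pseudocode above
    return true;
}
```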

At the end of the initialization, the kernel has the following available:

  • A protected view of the cfg bitmap, stored in a global variable which I called g_SCPCFGBitmapView.
  • Four combined pages containing corresponding SCPCFG section data copied from ntdll, fixed up to make the code functional. The corresponding block of each page is stored in a global array which I called g_SCPCFGSectionBlocks.
  • Four function addresses stored in the global PspNtdllScpFunctions array, which point towards scpcfg functions in a page mapped at the end of ntdll’s userspace image region.

notes:

  • The offsets to the first four functions inside of each section can be defined dynamically, but kernel code asserts that they’re actually fixed, i.e. the offsets must be 0x40, 0xC0, 0x140 and 0x1C0. This can be seen in PsInitializeScpCfgPages.
  • PspLocateNtdllAddressesForScpCfg takes a pointer to a struct named RTL_SCP_CFG_NTDLL_EXPORTS_ARM64EC, however the argument is simply zeroed-out. I haven’t checked the ARM64 version of the kernel, so it’s possible that this is properly filled in over there. Of course, it doesn’t make sense that we’d need ARM64EC information on x64.
  • The function pointers in PspNtdllScpFunctions point towards nothing at the time they’re stored, as nothing has been mapped into the space at the end of ntdll’s image region. One of the SCPCFG pages will be mapped a bit later by the part of code responsible for usermode module linking. The delayed initialization here is not really a problem, as nothing will access the functions before the page is mapped.
  • It is now clear that the SCPCFGFP section is supposed to emulate classic CFG. In the fixup procedure, placeholder function pointers are replaced with the pointers to classic CFG functions. This doesn’t mean that it will always emulate the classic CFG, as the function pointers could easily be overwritten once again, but that doesn’t seem to be done at the moment.
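
As a small worked example of the address arithmetic mentioned in these notes (base address + image size in memory + fixed function offset), here is a C sketch. The function names and the page-size constant are assumptions made for the example; the 0x1000 value matches the single extra page described in this post.

```c
#include <assert.h>
#include <stdint.h>

#define SCP_PAGE_SIZE 0x1000u   // assumed size of the extra page

// Address of the extra page: right after the original image region.
static uint64_t scp_extension_page(uint64_t image_base, uint64_t image_size)
{
    return image_base + image_size;
}

// Address of a function inside the extra page, given its fixed offset
// (0x40, 0xC0, 0x140 or 0x1C0 for the first four functions).
static uint64_t scp_function_address(uint64_t image_base, uint64_t image_size,
                                     uint32_t function_offset)
{
    assert(function_offset < SCP_PAGE_SIZE);
    return scp_extension_page(image_base, image_size) + function_offset;
}
```

For instance, with a made-up base of 0x7FFD00000000 and an image size of 0x200000, the validate (no es) function at offset 0x140 would be expected at 0x7FFD00200140.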

kernel linking


Once the initialization is finished, nothing else is done on a global scale. The SCPCFG pages are now ready and can be mapped into processes during their creation, as well as individual modules. These steps are separated, and we’ll cover them separately as well.

process linking


The entry point that we’re concerned with is MiMapProcessExecutable. This function is called during process allocation in kernel mode, with the call chain being PspAllocateProcess -> MiInitializeProcessAddressSpace -> MiMapProcessExecutable. The following steps are then taken to set up SCPCFG for the process:

  • [MiCfgInitializeProcess] The global cfg bitmap is mapped for the process by using MiMapSecurePureReserveView. g_SCPCFGBitmapView is passed in as the requested view address (the third argument), and if everything goes well (i.e. the process has access to the view), the kernel will simply return the same value. Otherwise, a new view is created. The view is then stored in the process mappings of the process object, at offset 0x3A0.
  • [MiMapAllImageScpPages] The code walks through the list of VADs belonging to the process and checks whether the VAD flags indicate that the VAD is protected as PAGE_EXECUTE_READ and is hotpatchable. If so, the VAD must contain an image/module, so the image is then checked for function override fixups. If it has them, MiMapImageScpCfgPages is called to map the SCPCFG page into the module corresponding to the image. This function deals with an individual module and is also used by the module linking code path, so we’ll cover it later.
NTSTATUS MiCfgInitializeProcess(void* process)
{
    // check process flags & compatibility with cfg (e.g. some architectures are not supported)
    ...
    
    void* cfgBitmapView = g_SCPCFGBitmapView;
    uint32_t mapSize = 0;
    if (FAILED(MiMapSecurePureReserveView(process, g_CFGBitmap, &cfgBitmapView, &mapSize, NULL)))
    {
        // if we fail to map the global existing view to the bitmap, create a new one
        cfgBitmapView = NULL;
        if (FAILED(MiMapSecurePureReserveView(process, g_CFGBitmap, &cfgBitmapView, &mapSize, NULL)))
        {
            // return an error
            ...
        }
    }
    
    // this stores cfgBitmapView into the process mappings
    MiReferenceCfgVad(...);
    
    // special handling for ArmThumb2 and I386
    ...
}

void MiMapAllImageScpPages(void* process)
{
    // set flags to indicate that the process is using scpcfg
    process->someFlags |= 4;
    
    // iterate through image vads in the process and map pages into those which require them
    for (auto vad = MiGetFirstVad(process); vad != NULL; vad = MiGetNextVad(vad))
    {
        // ensure vad is protected with PAGE_EXECUTE_READ and that the hotpatchable indicator is set (mask 0x20 in flags2)
        if (((vad->ProtectionFlags & 0x70) == PAGE_EXECUTE_READ) && ((MiReadVadFlags2(vad) & 0x20) != 0))
        {
            if (MiDoesImageContainFunctionOverrideFixups(vad->Image))
            {
                MiMapImageScpCfgPages(process, vad);
            }
        }
    }
}

notes:

  • After the main part of code inside MiCfgInitializeProcess finishes, there’s an additional part that could potentially be run. If the architecture is I386 (0x14C) or ArmThumb2 (0x1C4), the cfg bitmap is mapped into another view, but with different parameters (I haven’t bothered deciphering what they mean). The new view is then stored at offset 0x3C0 in the process mappings.
  • The conditions required to map the SCPCFG section into an image are related to hotpatching, i.e. the hotpatch indicator flag must be set, and MiDoesImageContainFunctionOverrideFixups must return true. Both of these are tightly coupled with the implementation of hotpatching, so we don’t cover it here. I will mention that the indicator flag is set in MiMapViewOfImageSection when initially mapping the image, but the way its value is calculated is quite complicated and is out of scope.

module linking


The function of interest here is the giant MiMapViewOfImageSection, which is called when an image (module) is mapped into the userspace. The steps taken here are pretty straightforward:

  • [MiMapViewOfImageSection] First, determine if the module is hotpatchable. If so, the size of the image VAD is increased by 0x1000, and an additional page will be available at the end of the image region. If the module is hotpatchable and additionally contains function override fixups, call MiMapImageScpCfgPages to map the SCPCFG page into the additional page within the image region.
  • [MiMapImageScpCfgPages] First, we check if the process that the module is being loaded into is using SCPCFG. If so, PsGetScpCfgPageTypeForProcess is called to determine which page type the process uses. Afterwards, the page is fetched from g_SCPCFGSectionBlocks and mapped into the extra page at the end of the extended image VAD. This is done via MiDecommitPages, which additionally decommits the page.
void MiMapViewOfImageSection(void* imageVad, void* process, ...)
{
    // a bunch of stuff that a book could be written about
    ...
    
    // check if module is hotpatchable and increase the size of image
    if (...)
    {
        flags2 |= 0x20;
        imageSize += g_HardcodedValueEqualTo4096;
    }
    
    // more stuff, during this phase a VAD is created for the image using the imageSize variable
    ...
    auto vad = ...;
    
    // same conditions as in MiMapAllImageScpPages
    if ((flags2 & 0x20) != 0 && MiDoesImageContainFunctionOverrideFixups(imageVad))
    {
        MiMapImageScpCfgPages(process, vad);
    }
}

void MiMapImageScpCfgPages(void* process, void* imageVad)
{
    // check if the process has scpcfg enabled, this is the flag set during MiMapAllImageScpPages
    if ((process->someFlags & 4) != 0)
    {
        // check which scpcfg section the process is using
        const uint32_t scpCfgSectionType = PsGetScpCfgPageTypeForProcess(...);
        if (scpCfgSectionType != 4)
        {
            // fetch combine page from g_SCPCFGSectionBlocks, make a new prototype PTE etc.
            ...
            
            // this call maps the combine page into the address range at the end of the image, but also decommits the address range
            MiDecommitPages(baseAddrOfImage + MiGetImageExtensionBaseAddress(imageVad), ...);
        }
    }
}

notes:

  • The conditions required for an image to contain the SCPCFG page at the end of its region are the same here as they were in MiMapAllImageScpPages - the hotpatch indicator being set and function override fixups being present. This means that system DLLs don’t receive any special treatment and go through the same code paths as regular DLLs.
  • PsGetScpCfgPageTypeForProcess determines which type of SCPCFG page is mapped into the process. The check here is two-fold:
    • Recall that MiCfgInitializeProcess stores its global cfg bitmap view into process mappings at offset 0x3A0. This field is checked in this function and compared against the global g_SCPCFGBitmapView. If the two don’t match, 3 is returned, which corresponds to the SCPCFGFP section. This makes sense if you recall that g_SCPCFGBitmapView is hardcoded into asm in pages representing SCPCFG and SCPCFGES - if that view is not the one that process has, then the functions wouldn’t work properly.
    • On the other hand, if the two match, the export suppression flag is checked for the process to determine if SCPCFG or SCPCFGES is the appropriate section to use.
    • Index 0, aka the SCPCFGNP section, is returned only if a certain flag is unset for the module, which most likely just corresponds to the CFG flag. I haven’t bothered tracking it down.
  • We can now see that, in most cases, SCPCFG / SCPCFGES will be mapped into a module, based on whether export suppression is enabled. The only way for the module to end up using SCPCFGFP (which, as we determined earlier, currently falls back to original CFG) is for the process to fail to map the global view of the cfg bitmap, which seems like an unlikely condition and I didn’t catch it happening. The NOP section would only be used if the module doesn’t support CFG.
  • MiMapImageScpCfgPages is also where the location of the mapped page is determined. As it stands, it will always be mapped to the end of the image region. MiGetImageExtensionBaseAddress is called to determine the userspace address that the page should be mapped to, and this currently returns the sum of the base address of the module and the original size of the image.
  • Finally, as the address range to which the extra section belongs is decommitted, this somewhat explains why VirtualProtect fails on the page. It doesn’t explain why the page can still be seen as committed by other APIs, and I was too lazy to investigate that.
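
The page type selection described in the notes above can be summed up in a few lines of C. This is only a sketch of the decision as described here: the order of the checks, the parameter names and the enum names are assumptions, and the real PsGetScpCfgPageTypeForProcess reads these inputs from process and module state that wasn't fully tracked down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Indices match the order of the sections in the exported table.
typedef enum ScpPageType {
    SCP_PAGE_NOP    = 0,  // SCPCFGNP
    SCP_PAGE_CFG    = 1,  // SCPCFG
    SCP_PAGE_CFG_ES = 2,  // SCPCFGES
    SCP_PAGE_FPTR   = 3   // SCPCFGFP
} ScpPageType;

// Hypothetical model of the selection logic.
static ScpPageType scp_page_type(bool cfg_enabled, uint64_t process_bitmap_view,
                                 uint64_t global_bitmap_view, bool export_suppression)
{
    if (!cfg_enabled) {
        return SCP_PAGE_NOP;       // no CFG: let every call target through
    }
    if (process_bitmap_view != global_bitmap_view) {
        return SCP_PAGE_FPTR;      // hardcoded view would be wrong, fall back to function pointers
    }
    return export_suppression ? SCP_PAGE_CFG_ES : SCP_PAGE_CFG;
}
```

A self-check such as `assert(scp_page_type(true, 0x1000, 0x1000, false) == SCP_PAGE_CFG);` captures the common case where the process shares the global view and export suppression is off.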

system dll block


There’s one more thing that the kernel needs to link for a process. Recall that classic CFG requires that the address of the cfg bitmap is written to DllSystemInitBlock in ntdll to make it known to the userspace. For SCPCFG we don’t need that, as the kernel writes the view of the bitmap directly into the asm code. However, the userspace loader now needs to know which functions it should link to, as this is determined by the kernel. This info is once again written to the init block, now at offsets 0xF0 - 0x120. The fields at these offsets will contain 7 pointers to the corresponding SCPCFG functions, and are written by the kernel in PspPrepareSystemDllInitBlock. PspGetScpCfgFunctions is called to obtain the pointers to the chosen functions. The choice and the action taken depend on a few conditions:

  • At this point, ntdll is already mapped into the process, so the code checks if it already contains a scpcfg section - if not, the function returns zero and nothing is written to the block. If yes, the section is determined by calling PsGetScpCfgPageTypeForProcess on the process.
  • If the determined section type is SCPCFG or SCPCFGES, the values stored in PspNtdllScpFunctions are written.
  • If the determined section type is SCPCFGFP, nothing is written.
void PspPrepareSystemDllInitBlock(...)
{
    // initialization of other variables in the block
    ...
    
    // write scpcfg functions to the block
    void** functions = PspGetScpCfgFunctions(process);
    if (functions)
    {
        *(dllInitBlock + 0xF0) = functions[0];
        *(dllInitBlock + 0xF8) = functions[1];
        *(dllInitBlock + 0x100) = functions[2];
        *(dllInitBlock + 0x108) = functions[3];
        *(dllInitBlock + 0x110) = functions[6];
        *(dllInitBlock + 0x118) = functions[4];
        *(dllInitBlock + 0x120) = functions[5];
    }
}

typedef struct _MEMORY_IMAGE_EXTENSION_INFORMATION
{
    PVOID PageTypeArgs;
    ULONG PageOffset;
    SIZE_T PageSize;
} MEMORY_IMAGE_EXTENSION_INFORMATION;

void** PspGetScpCfgFunctions(void* process)
{
    // check if ntdll has a mapped scpcfg section in its image region, if not return NULL
    auto ntdllBaseAddr = PspSystemDlls[0][4];
    MEMORY_IMAGE_EXTENSION_INFORMATION info;
    if (ZwQueryVirtualMemory(-1, ntdllBaseAddr, 0x0E, &info, 0x18ui64, NULL) == 0xC00000BB)
        return NULL;
    if (info.PageSize == 0)
        return NULL;
    
    // get page type, this time 0 can't be returned, since the last argument is *true*
    const int pageType = PsGetScpCfgPageTypeForProcess(process, ..., true);
    switch (pageType)
    {
        // SCPCFG & SCPCFGES
        case 1:
        case 2:
            return &PspNtdllScpFunctions;
        // SCPCFGFP
        case 3:
            return NULL;
        default:
            return NULL;
    }
}
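
The stores in PspPrepareSystemDllInitBlock are easier to eyeball as a table. The sketch below restates them as (init block offset, source index) pairs and replays them against a plain byte buffer. The names SlotMapping, kScpCfgSlots and FillScpCfgSlots are made up for illustration, and the buffer merely stands in for the real init block.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// One entry per store in the snippet above: destination offset inside the
// init block and index into the array returned by PspGetScpCfgFunctions.
struct SlotMapping {
    std::size_t blockOffset;
    std::size_t sourceIndex;
};

constexpr std::array<SlotMapping, 7> kScpCfgSlots = {{
    {0x0F0, 0}, {0x0F8, 1}, {0x100, 2}, {0x108, 3},
    {0x110, 6},  // note: index 6 lands before indices 4 and 5
    {0x118, 4}, {0x120, 5},
}};

// Replays the stores into `block`. Returns false and writes nothing when
// there is no function array (the NULL case) or the buffer is too small.
bool FillScpCfgSlots(std::uint8_t* block, std::size_t blockSize,
                     const std::uint64_t* functions)
{
    assert(block != nullptr);
    if (functions == nullptr)
        return false;

    for (const SlotMapping& slot : kScpCfgSlots) {
        if (slot.blockOffset + sizeof(std::uint64_t) > blockSize)
            return false;
    }
    for (const SlotMapping& slot : kScpCfgSlots) {
        std::memcpy(block + slot.blockOffset, &functions[slot.sourceIndex],
                    sizeof(std::uint64_t));
    }
    return true;
}
```

Laid out like this, the out-of-order store stands out: the slot at 0x110 is fed from index 6, while 0x118 and 0x120 take indices 4 and 5.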

And it’s time we leave the kernel alone. What’s left is for the usermode loader to finish the job and actually link the indirect call pointers to the implementations in the mapped section.

usermode linking


As is the case for classic CFG, usermode linking is pretty light and done in ntdll!LdrpCfgProcessLoadConfig by the loader:

  • [LdrpCfgProcessLoadConfig] Most of the classic CFG code is still relevant here. IMAGE_LOAD_CONFIG_DIRECTORY of the module is read in order to determine where the indirect call function pointers are stored. For each of the indirect call pointers present in the load config directory, either LdrpCfgCheckRoutineCallback or LdrpCfgDispatchRoutineCallback is called. These functions are supposed to decide if the pointer should be linked to a scpcfg routine or to the classic ntdll routine. This is decided by looking at the corresponding values in DllSystemInitBlock.

For example, LdrpCfgDispatchRoutineCallback looks like this:

void LdrpCfgDispatchRoutineCallback(void** fptr, int flags)
{
  if (LdrControlFlowGuardEnforcedWithExportSuppression() && (flags & IMAGE_GUARD_CF_EXPORT_SUPPRESSION_INFO_PRESENT) != 0)
  {
    if (*(dllInitBlock + 0x108))
        *fptr = *(dllInitBlock + 0x108);    // kernel wrote the ES dispatch function pointer here
    else
        *fptr = LdrpDispatchUserCallTargetES;
  }
  else
  {
    if (*(dllInitBlock + 0x100))
        *fptr = *(dllInitBlock + 0x100);    // kernel wrote the (no ES) dispatch function pointer here
    else
        *fptr = LdrpDispatchUserCallTarget;
  }
}

ntdll!LdrpCfgDispatchRoutineCallback

One thing that sticks out here is that all modules will always link to the same functions, as the function pointers are read from the ntdll init block. As far as the implementation in the kernel goes, if these are available, they will have always been copied from PspNtdllScpFunctions, which means that all loaded modules will initially link to functions in the dedicated page mapped to ntdll’s image region, rather than their own dedicated page. This probably changes in the event that the module needs to be hotpatched, but that’s me speculating, as I didn’t get that far.

extras


One extra thing that the usermode loader does is add a part of the newly mapped section to the system function table by calling RtlAddGrowableFunctionTable. The purpose of this function is to mark certain parts of code as functions in order to properly collect backtraces and dispatch exceptions. This is done in RtlpInsertOrRemoveScpCfgFunctionTable, which is called after the module is mapped & cfg initialized, or alternatively when the module is unloaded. To fetch info on which functions within the section should be added to the table, the function calls ZwQueryVirtualMemory with a new memory information class. Internally, it’s denoted as “image view extension” and its value is 0x0E. The code looks something like this:

typedef struct _MEMORY_IMAGE_EXTENSION_INFORMATION
{
    PVOID PageTypeArgs;
    ULONG PageOffset;
    SIZE_T PageSize;
} MEMORY_IMAGE_EXTENSION_INFORMATION;

NTSTATUS RtlpInsertOrRemoveScpCfgFunctionTable(void* moduleBase, int, bool insertOrRemove)
{
    MEMORY_IMAGE_EXTENSION_INFORMATION info = {};
    if (NT_SUCCESS(NtQueryVirtualMemory(-1, moduleBase, 0x0E, &info, sizeof(info), ...)))
    {
        if (info.PageOffset && info.PageSize)
        {
            void* pageBase = moduleBase + info.PageOffset;
            const uint32_t offsetToRtlTable = *(pageBase + 0x20);
            const uint64_t pageSize = info.PageSize;
            if (insertOrRemove)
            {
                RtlAddGrowableFunctionTable(..., pageBase + offsetToRtlTable, 1, 1, pageBase, pageSize);
            }
            else
            {
                RtlDeleteFunctionTable(pageBase + offsetToRtlTable);
            }
        }
    }
}

ntdll!RtlpInsertOrRemoveScpCfgFunctionTable
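
The address arithmetic in that routine can be checked without touching Windows at all. Below is a rough, portable model: ImageExtensionInfo mirrors the structure shown above, while ScpFunctionTable and LocateScpFunctionTable are hypothetical names, and the page contents are passed in as a plain buffer instead of being read from a live process. The bounds checks are additions for the sketch, not something observed in ntdll.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Mirrors MEMORY_IMAGE_EXTENSION_INFORMATION as given in the snippet above.
struct ImageExtensionInfo {
    std::uint64_t pageTypeArgs;
    std::uint32_t pageOffset;
    std::uint64_t pageSize;
};
static_assert(sizeof(ImageExtensionInfo) == 0x18,
              "matches the 0x18 buffer size passed by the kernel on a 64-bit target");

// What the loader hands to RtlAddGrowableFunctionTable in the snippet.
struct ScpFunctionTable {
    std::uint64_t tableAddress;  // pageBase + offsetToRtlTable
    std::uint64_t rangeBase;     // pageBase
    std::uint64_t rangeSize;     // info.PageSize
};

// `pageBytes` is a copy of the extension page contents.
bool LocateScpFunctionTable(std::uint64_t moduleBase, const ImageExtensionInfo& info,
                            const std::uint8_t* pageBytes, std::size_t pageBytesSize,
                            ScpFunctionTable& out)
{
    assert(pageBytes != nullptr);
    if (info.pageOffset == 0 || info.pageSize == 0)
        return false;                       // module has no extension page
    if (pageBytesSize < 0x20 + sizeof(std::uint32_t))
        return false;                       // sketch-only bounds check

    std::uint32_t offsetToRtlTable = 0;     // 32-bit field at +0x20 in the page
    std::memcpy(&offsetToRtlTable, pageBytes + 0x20, sizeof(offsetToRtlTable));

    const std::uint64_t pageBase = moduleBase + info.pageOffset;
    out.tableAddress = pageBase + offsetToRtlTable;
    out.rangeBase = pageBase;
    out.rangeSize = info.pageSize;
    return true;
}
```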

kernel scpcfg


Kernel CFG has also undergone some changes, which are very similar to the ones we’ve seen in usermode. In ntoskrnl.exe we now have a “KSCP” section, which is formatted in the following manner:

[+0x00] KSCP header
    [+0x00][0x04] = length of the section
    [+0x04][0x58] = offsets to individual functions below, sorted, 4 bytes each
    [+0x5C][0x24] = Unknown
[+0x80] __guard_retpoline_icall_handler
[+0xA0] sub_140B570A0
[+0xC0] __guard_retpoline_switchtable_jump_rax
[+0xE0] __guard_retpoline_switchtable_jump_rcx
[+0x100] __guard_retpoline_switchtable_jump_rdx
[+0x120] __guard_retpoline_switchtable_jump_rbx
[+0x140] __guard_retpoline_switchtable_jump_rsp
[+0x160] __guard_retpoline_switchtable_jump_rbp
[+0x180] __guard_retpoline_switchtable_jump_rsi
[+0x1A0] __guard_retpoline_switchtable_jump_rdi
[+0x1C0] __guard_retpoline_switchtable_jump_r8
[+0x1E0] __guard_retpoline_switchtable_jump_r9
[+0x200] __guard_retpoline_switchtable_jump_r10
[+0x220] __guard_retpoline_switchtable_jump_r11
[+0x240] __guard_retpoline_switchtable_jump_r12
[+0x260] __guard_retpoline_switchtable_jump_r13
[+0x280] __guard_retpoline_switchtable_jump_r14
[+0x2A0] __guard_retpoline_switchtable_jump_r15
[+0x2C0] __guard_retpoline_indirect_cfg_rax
[+0x3C0] __guard_retpoline_exit_indirect_rax
[+0x440] __guard_retpoline_import_r10
[+0x4E0] __guard_retpoline_import_r10_do_retpoline
[+0x520] __guard_retpoline_import_r10_log_event
[+0x580] __guard_retpoline_jump_hpat
[+0x5A0] __guard_retpoline_exit
[+0x780] KscpCfgDispatchUserCallTargetEsSmep
[+0x7E0] KscpCfgDispatchUserCallTargetEsNoSmep
[+0x840] KscpCfgHandleInvalidCallTarget
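
For anyone who wants to dump the header from a copy of the section, here is a minimal parser for the layout listed above. It only relies on what the listing says (a 4-byte length, 0x58 bytes of sorted 4-byte offsets, and code starting at +0x80) and leaves the unknown 0x24 bytes alone. KscpHeader, ParseKscpHeader and OffsetsAreSorted are names invented for this sketch, and a little-endian host is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kOffsetsStart = 0x04;  // offsets array begins here
constexpr std::size_t kOffsetsBytes = 0x58;  // 0x58 / 4 = 22 entries
constexpr std::size_t kHeaderSize   = 0x80;  // first function sits at +0x80

struct KscpHeader {
    std::uint32_t sectionLength;
    std::vector<std::uint32_t> functionOffsets;
};

// Reads the header fields out of a raw copy of the section.
bool ParseKscpHeader(const std::uint8_t* data, std::size_t size, KscpHeader& out)
{
    assert(data != nullptr);
    if (size < kHeaderSize)
        return false;

    std::memcpy(&out.sectionLength, data, sizeof(out.sectionLength));

    out.functionOffsets.clear();
    for (std::size_t pos = 0; pos < kOffsetsBytes; pos += sizeof(std::uint32_t)) {
        std::uint32_t offset = 0;
        std::memcpy(&offset, data + kOffsetsStart + pos, sizeof(offset));
        out.functionOffsets.push_back(offset);
    }
    return true;
}

// The listing describes the offsets as sorted; this checks that claim
// on whatever data was parsed.
bool OffsetsAreSorted(const KscpHeader& header)
{
    for (std::size_t i = 1; i < header.functionOffsets.size(); ++i) {
        if (header.functionOffsets[i] < header.functionOffsets[i - 1])
            return false;
    }
    return true;
}
```

Whether the offsets are relative to the section start, and how they map onto the functions listed above, is not something the sketch tries to answer.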

The structure looks similar to the new sections in ntdll, but the section doesn’t hold only the new CFG functions: the retpoline functions are included as well. Retpoline functions aren’t new and they’ve been present in previous versions of the kernel too, though in a different section - RETPOL. That section is now gone. At the end of KSCP, we have three functions which seem reminiscent of the ones we’ve seen in userspace. However, their implementation is different. For example:

jmp     rax
-------------------------------------------------------------
db 7 dup(0CCh)
-------------------------------------------------------------
mov     r10, rax
shr     r10, 9
mov     r11, [r11+r10*8]
mov     r10, rax
shr     r10, 3
test    al, 0Fh
jnz     short loc_140B577A9
bt      r11, r10
jnb     short loc_140B577C1
jmp     rax
...

ntoskrnl!KscpCfgDispatchUserCallTargetEsSmep

There are no placeholder values, and the instructions themselves look like placeholders, i.e. there’s a weird disconnect between the first instruction and the rest of the function.
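
The visible part of the listing after the padding reads like the usual CFG bitmap lookup. Here is a small C++ model of just those instructions, assuming r11 holds the bitmap base on entry. The unaligned branch (loc_140B577A9) is elided in the listing, so the sketch only reports that it would be taken. The enum, the function name and the bounds guard are inventions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class CfgCheck { Allowed, Rejected, NeedsUnalignedPath };

// Models the aligned-target path of the listing on a caller-supplied bitmap.
CfgCheck CheckCallTarget(const std::uint64_t* bitmap, std::size_t bitmapWords,
                         std::uint64_t target)
{
    assert(bitmap != nullptr);

    // mov r10, rax / shr r10, 9: one 64-bit bitmap word per 512 bytes.
    const std::uint64_t wordIndex = target >> 9;
    if (wordIndex >= bitmapWords)
        return CfgCheck::Rejected;  // sketch-only guard, not in the listing

    // mov r11, [r11+r10*8]
    const std::uint64_t word = bitmap[wordIndex];

    // test al, 0Fh / jnz: targets that are not 16-byte aligned branch away.
    if ((target & 0xF) != 0)
        return CfgCheck::NeedsUnalignedPath;

    // mov r10, rax / shr r10, 3 / bt r11, r10: with a register operand,
    // bt only uses the low 6 bits of the index.
    const unsigned bit = static_cast<unsigned>((target >> 3) & 63);
    return ((word >> bit) & 1) != 0 ? CfgCheck::Allowed   // jmp rax
                                    : CfgCheck::Rejected; // jnb -> failure path
}
```

In other words, each 16-byte slot of address space owns a pair of bits, and an aligned target is accepted when the lower bit of its pair is set.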

initialization


KSCPCFG gets initialized with the rest of KSCP (the latter denoting all of the functions in the section, with the former denoting only the three CFG functions at the end). The initialization happens in the following steps:

  • [MiPrepareScpFixupsForNtAndHal] This is done during the preparation phase of system initialization. We begin by mapping the KSCP section and storing a pointer to it into a global variable, as well as its size in system pages. Let’s call these g_ScpBase and g_ScpSectionSizeInPages. Then we call MiApplyDynamicFixupsToKernelAndHal.
  • [MiApplyDynamicFixupsToKernelAndHal] This function performs some fixups on the retpolines and then calls RtlInitializeKscpCfgFunctions to fix up KscpCfgDispatchUserCallTargetEsSmep and KscpCfgDispatchUserCallTargetEsNoSmep. The fixup code is pretty simple: the beginning of both functions is patched so that the function jumps to __guard_retpoline_icall_handler.

KSCP is further initialized through MiInitializeKernelScp, but not much of note happens to the CFG functions here. The primary goal of this function seems to be to create a function table that can later be accessed through usual interfaces, like RtlpxLookupFunctionTable.

void MiPrepareScpFixupsForNtAndHal(...)
{
    auto kscpSectionDesc = RtlLookupImageSectionByName(..., "KSCP");
    g_ScpBase = kscpSectionDesc->SectionBase;
    g_ScpSectionSizeInPages = PAGE_COUNT(kscpSectionDesc->SectionSize);
    MiApplyDynamicFixupsToKernelAndHal(...);
}

void MiApplyDynamicFixupsToKernelAndHal(...)
{
    // retpoline fixups, etc.
    ...
    
    // fixup kscpcfg
    RtlInitializeKscpCfgFunctions(g_ScpBase, g_ScpSectionSizeInPages * 4096);
}

NTSTATUS RtlInitializeKscpCfgFunctions(void* scpBase, uint32_t scpSize)
{
    // sanity checks on section contents & size
    ...
    
    // patch the cfg functions to jump into __guard_retpoline_icall_handler
    // rel32 is counted from the end of the 5-byte jmp, hence the + 5
    *(scpBase->KscpCfgDispatchUserCallTargetEsSmep) = 0xE9;  // jmp rel32
    *(scpBase->KscpCfgDispatchUserCallTargetEsSmep + 1) = scpBase->__guard_retpoline_icall_handler - (scpBase->KscpCfgDispatchUserCallTargetEsSmep + 5);
    *(scpBase->KscpCfgDispatchUserCallTargetEsNoSmep) = 0xE9;  // jmp rel32
    *(scpBase->KscpCfgDispatchUserCallTargetEsNoSmep + 1) = scpBase->__guard_retpoline_icall_handler - (scpBase->KscpCfgDispatchUserCallTargetEsNoSmep + 5);
}
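
To make the patch concrete, this is roughly what encoding such a jump looks like. The sketch builds the five bytes of a near jmp (opcode 0xE9 followed by a 32-bit displacement) for arbitrary source and destination addresses. On x86-64 the displacement is measured from the end of the instruction, not from its start. EncodeJmpRel32 is a hypothetical helper, not a kernel routine, and it assumes a little-endian host.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kJmpRel32Size = 5;  // 1 opcode byte + 4 displacement bytes

// Returns the bytes of a `jmp rel32` that sits at `from` and lands on `to`.
std::array<std::uint8_t, kJmpRel32Size> EncodeJmpRel32(std::uint64_t from, std::uint64_t to)
{
    // The CPU adds rel32 to the address of the *next* instruction.
    const std::int64_t distance =
        static_cast<std::int64_t>(to) -
        static_cast<std::int64_t>(from + kJmpRel32Size);
    assert(distance >= INT32_MIN && distance <= INT32_MAX);  // must fit in rel32

    const std::int32_t rel32 = static_cast<std::int32_t>(distance);
    std::array<std::uint8_t, kJmpRel32Size> bytes{};
    bytes[0] = 0xE9;
    std::memcpy(&bytes[1], &rel32, sizeof(rel32));
    return bytes;
}
```

Plugging in the section offsets from the KSCP layout, with the jump placed at +0x780 (KscpCfgDispatchUserCallTargetEsSmep) and targeting +0x80 (__guard_retpoline_icall_handler), gives a displacement of 0x80 - 0x785 = -0x705, so the patched bytes would be E9 FB F8 FF FF.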

linking


The only step left is to link the indirect call handlers to KSCPCFG functions each time a driver is loaded. This is done in a familiar place, during MiProcessKernelCfgImageLoadConfig. This function parses the load configuration directory in the exe/driver being loaded, finds the indirect call pointers, and patches them to point towards CFG functions. Before 24H2, the functions being used would be guard_check_icall and guard_dispatch_icall (called guard_check_icall_no_overrides and guard_dispatch_icall_no_overrides in 24H2). In 24H2, this is changed, but only for the dispatch function. Instead of using guard_dispatch_icall_no_overrides, the pointers will be linked to KscpCfgDispatchUserCallTargetEsSmep or KscpCfgDispatchUserCallTargetEsNoSmep, depending on the status of SMEP.

The KSCP section is also additionally mapped for each module that’s loaded into the kernel, which is once again reminiscent of what happens in the userspace. This is done through MiMapKernelScp, which is called during system image loading.

void MiProcessKernelCfgImageLoadConfig(...)
{
    // validation function handling, very simple -> set it to guard_check_icall_no_overrides, no kscpcfg
    ...
    
    // dispatch function handling, has kscpcfg
    void** fptr = *(imageLoadConfig + 0x78);
    *fptr = guard_dispatch_icall_no_overrides;
    if (Mm64BitPhysicalAddress & 1) // SMEP check
        *fptr = KscpCfgDispatchUserCallTargetEsSmep;
    else
        *fptr = KscpCfgDispatchUserCallTargetEsNoSmep;
}

final thoughts


Looking at the bigger picture, the changes do seem intended only for compatibility with hotpatching. We didn’t see any security improvements or optimizations that the new code is supposed to bring. Unfortunately, I didn’t find the motivation to dig further and figure out where exactly the new behaviour comes to shine, but it seems pretty clear that the entire set of changes was done to support hotpatching. This becomes obvious when looking at the kernel changes, as there doesn’t seem to be any additional security benefit from the new behaviour. The mention of “function overrides” in multiple places also seems to point towards there being a possibility to patch CFG functions for a module, or something of the sort.

We did find out what causes the bugs in x64dbg:

  • The extra page at the end of image regions is the SCPCFG page.
  • The page is decommitted by the kernel, which is why VirtualProtect fails. I tinkered around a little bit, but couldn’t find a way to change this from userspace, as VirtualAlloc with MEM_COMMIT doesn’t seem to do anything and the followup VirtualProtect calls fail.

But there are a few questions I would’ve liked to have answered, and hope they’ll be answered in the future:

  • What does “SCP” stand for? My guess would be something like “Standalone Code Page”, but it could be so many other things that it’s probably not worth speculating.
  • Why do hotpatchable modules need their own SCP sections, when they’re going to be linked to ntdll’s SCP section initially?
  • Why is the decommitted usermode page still seen as committed? This one may be really easy, but I didn’t want to go down another rabbit hole.

I hope that more pieces of the puzzle are revealed in the upcoming months, as 24H2 releases to a general audience and we get some more eyes on the kernel code. Ultimately, we won’t know the full story until MS decides to share some information, or somebody reverses the hotpatching machinery. I was not brave enough for that :)