← Two bluetooth vulnerabilities in Windows CVE-2023-24871 - RCE →

In this post, we’ll describe some basics of Bluetooth Low Energy and advertising, and introduce CVE-2023-24871.

[1.0] terminology


Let’s start with some terminology:

  • BLE = Bluetooth Low Energy
  • HCI / Host / Controller - HCI stands for host-controller interface, which is how a host (in this case, bluetooth components on Windows) and controller (a bluetooth device with its firmware) communicate.
  • Adapter = The physical bluetooth device with its firmware.
  • Local device / controller / adapter = One which is vulnerable and belongs to a victim targeted by an attacker.
  • Remote device / controller / adapter = One which the attacker controls in order to trigger a vulnerability.
  • PDU = Protocol Data Unit, which is some data that’s transferred as part of a packet. You can think of PDUs as the “content” part of a bluetooth packet, with a specific format corresponding to the type of the packet.

[3.0] intro


Before we dive into the vulnerability, let’s introduce BLE and advertising. If you’re familiar with how it works, feel free to jump to the description of the vulnerability.

Bluetooth Low Energy is a subset of bluetooth technology that’s aimed at devices which don’t need to transmit a large amount of data in short periods of time. BLE protocols are distinct from classic bluetooth and cannot be used interchangeably, but devices can support both flavours. While classic bluetooth is used for stuff that requires a heavy transmission of data, like streaming music or transferring files, BLE is meant for much lighter stuff that doesn’t require a lot of power, like streaming live heart-rate data from a smartwatch, or sending signals via beacons.

BLE-compatible devices transmit data in multiple different ways, but one of the most basic functions they have is advertising and scanning. Devices can use advertising to broadcast data for multiple different purposes, but the general intention is for the data sent this way to contain information about the device, which actively scanning devices can use to identify and classify the broadcasting device. The information sent usually involves the name of the device, the ID of the manufacturer, the type and capabilities of the device, as well as certain indicators that tell the receiver whether the device can be connected to etc. In short, one can imagine advertising as devices introducing themselves to other devices which are listening for introductions.

Advertising information is sent in the form of BLE packets which contain advertisement PDUs. The lifetime of an advertisement is defined by the following steps:

  • A host that wants to advertise some information sends a set of specific HCI commands to the controller. These commands are meant to set up advertising parameters and enable or disable advertising. In the case of peripheral devices, there may not be a classic host-controller interface and advertising parameters may be hardcoded in the firmware.
  • A host that wants to listen for advertisements sends a specific HCI command to the controller that indicates that scanning should be enabled. The controller is usually passively listening to the advertisements regardless of whether the host wants to receive them; this command simply tells the controller to forward received advertisements to the host.
  • When the host enables advertising, the controller starts broadcasting advertisements periodically in the form of BLE packets with a specific PDU containing advertisement data.
  • A controller that’s scanning for advertisements receives the packets and forwards advertisement data in the form of HCI events to the host.

During its lifetime, advertisement data is transferred in three steps, each time in a different format: first, when the advertising host sets up advertising parameters, one of which is the advertising data; second, when a BLE packet containing the advertising data is transferred between the controllers; and third, when the receiving controller sends an HCI event containing the advertising data to the host. We don’t actually care about the format of the data in the first two steps, as it’s irrelevant for the vulnerability. It’s only important to know how it works conceptually. What’s important for us is the format in which a host receives advertisement data while it’s scanning for it.

There are two HCI events which transfer advertising data to the host: LE Advertising Report and LE Extended Advertising Report. A report in this case represents a single advertisement unit that was received from a specific remote device. Extended advertising is an extension of “normal” advertising and was introduced in Bluetooth 5.0. Its main purpose is to allow a larger amount of advertising data in a single advertising report. The events can hold multiple reports from different devices, as the controller can decide to batch them together and avoid having to send multiple events in a short period of time. The format of the data that these events transfer is similar:

Num_Reports, -- number of advertising reports in the data
Event_Type[i], -- type of the event, used to indicate whether the device can be connected to, if the advertisement was directed etc.
Address_Type[i], -- type of the address of the advertising device, can be public, random, etc.
Address[i], -- the address of the advertising device
Data_Length[i], -- length of the advertising data
Data[i], -- advertising data in bytes
RSSI[i] -- signal strength of the received packet

LE Advertising Report structure

Num_Reports, -- same as above
Event_Type[i], -- same as above
Address_Type[i], -- same as above
Address[i], -- same as above
Primary_PHY[i], -- not important
Secondary_PHY[i], -- not important
Advertising_SID[i], -- not important
TX_Power[i], -- not important
RSSI[i], -- same as above
Periodic_Advertising_Interval[i], -- not important
Direct_Address_Type[i], -- if the advertisement was directed, type of the address the advertisement was directed to
Direct_Address[i], -- address that the advertisement was directed to
Data_Length[i], -- same as above
Data[i] -- same as above

LE Extended Advertising Report structure

Extended advertising reports, while allowing for more data to be transferred, also include some other fields. These are mostly related to the physical transfer of data and are going to be rather unimportant to us, so I didn’t bother explaining them. The biggest difference between the two events is in the maximum length of advertising data that they can contain:

  • Legacy LE advertising reports can contain at most 31 bytes of advertising data, as this is the limit of advertising data that can be transferred in a single PDU when using legacy advertising.
  • Extended LE advertising reports can contain at most 1650 bytes of advertising data. This is because a single extended advertising PDU can transfer a maximum of 254 bytes of advertising data, and multiple sequential PDUs can be chained into a single advertising report. There are, however, some limitations specific to the type of the advertisement, as we’ll see later.
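To make the legacy report layout a bit more concrete, here’s a minimal sketch of how a host could pull a single report out of the event payload. It assumes the fields of each report are laid out back to back in the order listed above (as common host stacks parse them), and the names legacy_adv_report and parse_legacy_report are made up for this post; this is not how any particular stack implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LEGACY_ADV_MAX_DATA 31u /* limit for legacy advertising data */

/* Hypothetical in-memory view of one legacy report (not a Windows type). */
typedef struct {
    uint8_t event_type;
    uint8_t address_type;
    uint8_t address[6];
    uint8_t data_length;
    uint8_t data[LEGACY_ADV_MAX_DATA];
    int8_t  rssi;
} legacy_adv_report;

/* Parses one report starting at buf[*offset] and advances *offset past it.
 * Returns 0 on success, -1 if the report is truncated or too long. */
int parse_legacy_report(const uint8_t *buf, size_t len, size_t *offset,
                        legacy_adv_report *out)
{
    assert(buf != NULL && offset != NULL && out != NULL);
    size_t pos = *offset;

    /* fixed part: event type (1) + address type (1) + address (6) + length (1) */
    if (pos > len || len - pos < 9) return -1;
    out->event_type = buf[pos];
    out->address_type = buf[pos + 1];
    memcpy(out->address, buf + pos + 2, 6);
    out->data_length = buf[pos + 8];
    pos += 9;

    /* variable part: advertising data followed by one RSSI byte */
    if (out->data_length > LEGACY_ADV_MAX_DATA) return -1;
    if (len - pos < (size_t)out->data_length + 1) return -1;
    memcpy(out->data, buf + pos, out->data_length);
    pos += out->data_length;
    out->rssi = (int8_t)buf[pos];

    *offset = pos + 1;
    return 0;
}
```

Calling the function Num_Reports times with the same offset variable walks all reports in an event. Note that every length is checked against the remaining buffer before anything is copied.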

The format of the advertising data itself is fairly simple. It consists of one or more advertising sections, each of which has a specific format:

Length, -- length of the data in the section, including the type field
Type, -- type of the section, indicates what kind of data the section holds
Data -- data in the section

Advertising section structure

For example, a device can transfer its name in an advertising data section. The type of the section would then be 0x09 (“Complete Local Name”), the length of the data would be the length of the name in ASCII (plus a terminating character), and the data itself would hold a null-terminated ASCII string. The list of common advertising section types can be found here (PDF), but the specification allows for manufacturer-specific advertising sections whose format and contents may remain opaque.
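As a rough illustration of the section format, the sketch below walks the length/type/data triplets and returns the data of the first section with a given type. The function name ad_find_section is invented for this post, and the sample bytes in the usage note are just an example payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a pointer to the data of the first section whose type matches
 * wanted_type and stores its length (without the type byte) in *out_data_len.
 * Returns NULL if no such section exists or a section runs past the buffer. */
const uint8_t *ad_find_section(const uint8_t *adv, size_t adv_len,
                               uint8_t wanted_type, uint8_t *out_data_len)
{
    assert(adv != NULL && out_data_len != NULL);
    size_t off = 0;
    while (off < adv_len) {
        uint8_t length = adv[off];      /* covers the type byte plus the data */
        if (length == 0) break;         /* zero length ends the section list */
        if ((size_t)length + 1 > adv_len - off) return NULL; /* truncated */
        if (adv[off + 1] == wanted_type) {
            *out_data_len = (uint8_t)(length - 1);
            return adv + off + 2;
        }
        off += (size_t)length + 1;      /* skip length byte + section body */
    }
    return NULL;
}
```

For example, given the bytes 02 01 06 05 09 54 65 73 74 (a Flags section followed by a Complete Local Name section), looking up type 0x09 yields a pointer to the four bytes "Test".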

Finally, it’s also useful to know what kinds of advertising event types exist and what kinds of addresses the advertising reports can carry. The purpose of the event type is to tell the receiver of the advertisement whether the advertising device can be connected to or scanned, whether the advertisement was directed specifically to this receiver, and some other info. The event types can then be:

  • ADV_IND - the advertising device can be connected to and scanned, and advertising is undirected. This kind of advertising is common in peripheral devices that are looking to connect to any central device, e.g. bluetooth headphones before they’re paired with a central device. The maximum length of advertisement data that can be sent in a single report of this type is 251 bytes.
  • ADV_DIRECT_IND - the advertising device can be connected to, and advertising is directed to a specific receiver. Reusing the previous example, this kind of advertising would be used by bluetooth headphones that are already paired with a central device - by using this kind of advertising, the headphones request to be connected to that specific device. Similarly, the maximum amount of data in a single report is 251 bytes.
  • ADV_NONCONN_IND - the advertising device is neither connectable nor scannable, and the advertising is undirected. This is quite common in beacons, whose only purpose is to broadcast some data to all nearby devices. There are no limitations on the maximum length of advertisement data that can be sent in a single report, i.e. the 1650 bytes limit stands here.
  • ADV_SCAN_IND - same as ADV_NONCONN_IND, but the device can be scanned. This can be useful if the beacon in the previous example wants to transmit information only to interested parties - receiving devices can request a scan upon receiving the advertisement and receive the actual information. This type of report carries no advertisement data.
  • SCAN_RSP - indicates that this advertisement report is a response to a scan request. In the previous example, if a scannable device is scanned, the response will be received in the form of a SCAN_RSP advertisement report. Similarly, no limitation to the maximum length of data in a single report applies.
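The list above can be condensed into a small lookup. The numeric values are the Event_Type codes used by the legacy LE Advertising Report event; the struct and function names are made up for this sketch, and the extended report encodes the same properties differently (as a bit field), which isn’t covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Event_Type values of the legacy LE Advertising Report event. */
enum {
    EVT_ADV_IND         = 0x00,
    EVT_ADV_DIRECT_IND  = 0x01,
    EVT_ADV_SCAN_IND    = 0x02,
    EVT_ADV_NONCONN_IND = 0x03,
    EVT_SCAN_RSP        = 0x04
};

/* Hypothetical summary of what an event type tells the receiver. */
typedef struct {
    bool connectable;
    bool scannable;
    bool directed;
    bool scan_response;
} adv_properties;

/* Fills *out for a known event type; returns false for unknown values. */
bool adv_event_properties(uint8_t event_type, adv_properties *out)
{
    assert(out != NULL);
    adv_properties p = { false, false, false, false };
    switch (event_type) {
    case EVT_ADV_IND:         p.connectable = true; p.scannable = true; break;
    case EVT_ADV_DIRECT_IND:  p.connectable = true; p.directed = true;  break;
    case EVT_ADV_SCAN_IND:    p.scannable = true;                       break;
    case EVT_ADV_NONCONN_IND: /* neither connectable nor scannable */   break;
    case EVT_SCAN_RSP:        p.scan_response = true;                   break;
    default:                  return false;
    }
    *out = p;
    return true;
}
```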

As for addresses, each advertising device will generally also include an address that the receiver should use if they want to communicate back. An address consists of 6 bytes and can be either randomly generated or persistent (and presumed unique). The type of the address can be:

  • Public Device Address - this is a unique MAC address of the device.
  • Random Device Address - this is a temporary random address.
  • Public Identity Address
  • Random Identity Address

This is pretty much all of the information we need to know for now. There’s a lot to say about advertising, but most of it is not relevant to the vulnerabilities at hand. If you’re interested in knowing more outside what’s covered in this post, you can find more info in that blog post I linked above.

[4.0] vulnerability


The Windows Bluetooth stack is quite complex, spanning multiple different drivers, services and user-mode libraries. Microsoft provides a brief overview of the architecture here. I’ll share their representation:

As advertisement data can contain different types of information, advertising reports can end up being parsed in multiple different places. To avoid having to implement parsing procedures everywhere individually, Microsoft uses a static library that’s linked into the modules where this functionality is needed. Two of the functions in this library that play a part in parsing advertisement data are BthLELib_ADValidateEx and BthLeLib_ADValidateBasic:

HRESULT BthLELib_ADValidateEx(const uint8_t* adv_data, uint16_t adv_data_size, BTHLE_AD_SECTION** out_sections, uint8_t* out_num_sections)
{
	// initial validation
	uint8_t num_sections = 0;
	HRESULT validation_res = BthLeLib_ADValidateBasic(adv_data, adv_data_size, &num_sections);
	if (validation_res < 0) return validation_res;
	if (num_sections == 0) return validation_res;
	
	// allocate the array for advertisement section data
	// struct BTHLE_AD_SECTION { uint8_t size; uint8_t type; uint8_t data[0x151]; }
	// sizeof(BTHLE_AD_SECTION) = 0x153
	BTHLE_AD_SECTION* sections = BthLELibAllocatePoolEx(sizeof(BTHLE_AD_SECTION) * num_sections);
	
	// validate data for each section and copy data into the array
	uint16_t section_offset = 0;
	for (uint8_t i = 0; section_offset < adv_data_size; ++i)
	{
		BTHLE_AD_SECTION* section = &sections[i];
		uint8_t section_size = adv_data[section_offset];
		uint8_t section_type = adv_data[section_offset + 1];
		if (section_size == 0) break;
		if (section_size + section_offset + 1 >= adv_data_size) return 0xC000000D;
		
		// write the section size and the section type into the output section
		section->size = section_size;
		section->type = section_type;
		
		// based on section type, do validation and write output section data based on input section data
		switch (section_type)
		{
			...
			// validate section data based on section type
			// this is done by calling BthLELib_ADValidate???, where ??? denotes the type of the section
			if (BthLELib_ADValidate???(...))
			{
				// If validation fails for a section, the function exits
				ExFreePool(sections);
				return 0xC000000D;
			}
			...
			// for some section types, more data is written into the output section
			// e.g. vendor data, signed data sections...
			case ... :
			{
				// but most commonly, data is copied as is into the output section
				uint8_t section_data_offs = ...;
				uint8_t section_data_size = ...;
				memcpy(section->data, adv_data + section_offset + section_data_offs, section_data_size);
			}
		}
		section_offset += section_size + 1;
	}
	*out_sections = sections;
	*out_num_sections = num_sections;
	return 0;
}

HRESULT BthLeLib_ADValidateBasic(const uint8_t* adv_data, uint16_t adv_data_size, uint8_t* out_num_sections)
{
	uint16_t adv_data_offset = 0;
	while (adv_data_offset < adv_data_size)
	{
		uint8_t section_size = adv_data[adv_data_offset];
		if (section_size == 0) break;
		if (section_size + adv_data_offset + 1 >= adv_data_size) return 0xC000000D;

		/**** OVERFLOW HERE ***/
		++(*out_num_sections);
		/**** OVERFLOW HERE ***/

		adv_data_offset += section_size + 1;
	}
	while (adv_data_offset < adv_data_size)
	{
		if (adv_data[adv_data_offset++] != 0) return 0xC000000D;
	}
	return 0;
}

BthLELib_ADValidateEx is the function that external modules call in order to transform advertisement data received in an HCI event into a more suitable format. The function receives a pointer to raw advertisement data and its length, as well as out parameters that indicate the output array of transformed advertisement sections and the length of that array. At its very beginning, BthLeLib_ADValidateBasic is called, which ensures that each of the advertisement sections has the correct length (i.e. it doesn’t extend past the end of the data), but also counts the total number of sections in the data. The calculated count is then used by BthLELib_ADValidateEx to allocate memory for the array of output sections. The rest of the function is focused on parsing specific advertisement sections and copying data from the input buffer into the appropriate format inside the array entry corresponding to the section. While some output data is formatted in a more structured manner, for the vast majority of section types the input data is simply copied into the output section in its raw form.

The vulnerability is located in the code which counts the number of advertisement sections. Since an 8-bit unsigned integer is used for this purpose, having more than 255 sections in the data will make the value of the variable overflow. Once that happens, the count value returned by the function will be lower than the actual number of sections present in the data. The amount of memory allocated for the sections array will be lower than expected, resulting in out-of-bounds writes once the data from individual sections is copied into the memory that’s supposed to belong to the section array.
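The wrap-around is easy to reproduce in isolation. The sketch below is a simplified model of the counting loop, not the Windows code: the function names are made up, and the bounds check is this sketch’s own. It only demonstrates that an 8-bit counter reports the number of sections modulo 256.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of a section counter that uses an 8-bit count.
 * Returns 0 on success, -1 if a section runs past the end of the data. */
int count_sections_u8(const uint8_t *adv, uint16_t adv_size, uint8_t *out_count)
{
    assert(adv != NULL && out_count != NULL);
    uint16_t off = 0;
    *out_count = 0;
    while (off < adv_size) {
        uint8_t section_size = adv[off];
        if (section_size == 0) break;
        if ((uint32_t)off + section_size + 1 > adv_size) return -1;
        ++(*out_count);                 /* wraps from 255 back to 0 */
        off = (uint16_t)(off + section_size + 1);
    }
    return 0;
}

/* Fills buf with n "empty" sections (Length = 0x01, Type = 0x00).
 * buf must hold at least 2 * n bytes. */
void fill_empty_sections(uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        buf[2 * i] = 0x01;
        buf[2 * i + 1] = 0x00;
    }
}
```

With 255 sections the counter still reports 255, but with 257 sections it reports 1, which is exactly the mismatch the rest of this section relies on.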

The simplest example of advertisement data that will trigger the vulnerability is one which has 257 “empty” sections, i.e. each section being:

Length = 0x01
Type = 0x00, -- 0x00 is an invalid / reserved type
Data = []

After exiting BthLeLib_ADValidateBasic, num_sections will be equal to 1, and the amount of memory allocated for the sections array will be 0x153 bytes. Meanwhile, the loop in BthLELib_ADValidateEx will iterate over all 257 sections, copying the length and type of each section way past the end of the allocated memory. In the 2nd iteration of the loop, 0x01 0x00 will be written just after the end of the allocated memory. In the 3rd iteration, the same value will be written at offset 0x153 past the end of the memory, and so on.
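The offsets above follow directly from the entry size. Here’s a small worked calculation, assuming the 0x153-byte entry size from the pseudocode; the helper names are made up, and the special case where the truncated count is 0 (where the function returns early) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTION_ENTRY_SIZE 0x153u /* sizeof(BTHLE_AD_SECTION) per the pseudocode */

/* Bytes allocated when the section count is truncated to 8 bits. */
size_t allocated_bytes(size_t real_sections)
{
    return (size_t)SECTION_ENTRY_SIZE * (uint8_t)real_sections;
}

/* Distance of entry i from the end of the allocation:
 * negative means inside the buffer, 0 means immediately after its end. */
long bytes_past_end(size_t real_sections, size_t i)
{
    assert(i < real_sections);
    return (long)(i * SECTION_ENTRY_SIZE) - (long)allocated_bytes(real_sections);
}
```

For 257 sections, allocated_bytes gives 0x153, entry 1 (the 2nd iteration) starts 0 bytes past the end, and entry 2 (the 3rd iteration) starts 0x153 bytes past the end.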

[4.1] fix


Microsoft fixed the vulnerability by exiting BthLeLib_ADValidateBasic with an error if another section is encountered once *out_num_sections has already reached 255. The code looks something like this after the fix:

HRESULT BthLeLib_ADValidateBasic(const uint8_t* adv_data, uint16_t adv_data_size, uint8_t* out_num_sections)
{
	uint16_t adv_data_offset = 0;
	while (adv_data_offset < adv_data_size)
	{
		uint8_t section_size = adv_data[adv_data_offset];
		if (section_size == 0) break;
		/***** FIX *****/
		if (*out_num_sections == 255) return 0xC000000D;
		/***** FIX *****/
		if (section_size + adv_data_offset + 1 >= adv_data_size) return 0xC000000D;

		++(*out_num_sections);

		adv_data_offset += section_size + 1;
	}
	while (adv_data_offset < adv_data_size)
	{
		if (adv_data[adv_data_offset++] != 0) return 0xC000000D;
	}
	return 0;
}

While this prevents integer overflow, it unnecessarily adds a limitation on the number of advertising sections that can be contained in a single advertising report. Such limitations are not imposed by the standard and this just makes Microsoft non-compliant with it. That said, it’s quite unlikely that anyone would ever need even close to 255 sections in a real life situation, so the limit will probably not make much of a difference.
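For comparison, one hypothetical way to avoid the overflow without capping the number of sections would be to count with a wider type and check the allocation size separately. To be clear, this is not what Microsoft shipped, and the names below are made up; it’s only a sketch of the alternative, reusing the 0x153-byte entry size from the pseudocode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTION_ENTRY_SIZE 0x153u /* sizeof(BTHLE_AD_SECTION) per the pseudocode */

/* Counts sections using size_t, so the count cannot wrap for any input
 * that fits in memory. Returns 0 on success, -1 on a truncated section. */
int count_sections_wide(const uint8_t *adv, size_t adv_size, size_t *out_count)
{
    assert(adv != NULL && out_count != NULL);
    size_t off = 0, count = 0;
    while (off < adv_size) {
        uint8_t section_size = adv[off];
        if (section_size == 0) break;
        if ((size_t)section_size + 1 > adv_size - off) return -1;
        ++count;
        off += (size_t)section_size + 1;
    }
    *out_count = count;
    return 0;
}

/* Computes the size of the output array, refusing if it would overflow. */
int section_array_bytes(size_t count, size_t *out_bytes)
{
    assert(out_bytes != NULL);
    if (count > SIZE_MAX / SECTION_ENTRY_SIZE) return -1;
    *out_bytes = count * SECTION_ENTRY_SIZE;
    return 0;
}
```

With this approach, 257 empty sections produce a count of 257 and an allocation of 257 * 0x153 bytes, so every entry written by the copy loop has memory backing it.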

[4.2] exploitability


As we’ll see later, the vulnerability is very much exploitable. Of course, we assume that the attacker fully controls advertising data, in that it must follow an expected format but its contents can be arbitrary.

  • The attacker has nice control over the amount of allocated data, as they control the number of sections in data and thus the value of num_sections. This means that they can make the allocation fall into almost any heap bucket that they desire.
  • The attacker has almost full control of the data that’s going to be written out-of-bounds. The limits include the fact that writes must start at multiples of 0x153 and that the first two bytes must represent the length and the type of the section. For example, certain sections (like the “Complete Local Name” section) are simply copied like-for-like from the input buffer to the array entry in the output, which allows the attacker full control of the data that’s written out of bounds.
  • In cases of integer overflow like this one, where the overflow leads to a miscalculation of the size of an output buffer, it’s not uncommon that the vulnerability is made useless by the fact that the memory corruption goes “too far”, i.e. it has to iterate over the entire non-overflowed range. However, this can be avoided here by abusing individual section validation. Once the attacker does not want to corrupt memory past some point, they can supply a corrupted section that fails the specific validation designed for that section type. This will make the code exit early, while still maintaining the memory corruption.

[4.3] impact


The vulnerable function is present in four different modules throughout the Windows Bluetooth stack:

  • (1) bthport.sys – a kernel driver that’s at the very bottom of the stack
  • (2) Microsoft.Bluetooth.Service.dll – a module used by the Bluetooth Support Service
  • (3) Windows.Internal.Service.dll – a module used by the Bluetooth Support Service
  • (4) dafBth.dll – a module used by the Device Association Service

The function is used to parse remote advertisement data in modules (1), (2) and (4). In module (3), it’s used to parse advertisement data that’s sent from local programs that are running “upstream” on the stack. As such, the vulnerability can be used as a vector for both RCE and LPE. We’ll cover both of those vectors individually below, as the circumstances surrounding them are pretty different.


This is all we can say about the vulnerability without going into the specific exploitation vector it may be used for.
