Sunday, July 3, 2022

[SOLVED] Does Linux kernel list implementation cause UB?

July 03, 2022 c, linked-list, linux-kernel, list

Issue

Prerequisites:

As per C standard, pointer arithmetics that would yield an invalid pointer, cause undefined behavior.
Linux source code seems to conform with C standard in a desire to be compatible with most architectures.
Linux's list implementation contains the following code(formatting preserved, probably the idea for another question is how to set proper tabulation width using Stackoverflow syntax):

#define list_entry(ptr, type, member) \
    container_of(ptr, type, member)

#define list_next_entry(pos, member) \
    list_entry((pos)->member.next, typeof(*(pos)), member)

#define list_first_entry(ptr, type, member) \
    list_entry((ptr)->next, type, member)

#define list_entry_is_head(pos, head, member)               \
    (&pos->member == (head))

#define list_for_each_entry(pos, head, member)              \
    for (pos = list_first_entry(head, typeof(*pos), member);    \
         !list_entry_is_head(pos, head, member);            \
         pos = list_next_entry(pos, member))

Typical usecase of the aforementioned list implementation is having structure of say type struct A, containing a head for the list of stuctures of type struct B.

Q: Let's assume offsetof(struct B, entry_in_list) > offsetof(struct A, list_head) and the following loop is implemented:

struct A* A_ptr = something_meaningful;
struct B* pos = NULL;
list_for_each_entry(pos, &A_ptr->list_head, entry_in_list) {
  do_something();
}

Then last (before loop exit) evaluation of list_next_entry(pos, member) would extend to:

container_of(A_ptr->list_head, struct B, entry_in_list) = 
 = (char*)A_ptr->list_head - offsetof(struct B, entry_in_list) =
 = (char*)A_ptr + offsetof(struct A, list_head) - offsetof(struct B, entry_in_list)

, which, according to our assumption, would point to area before A struct. Assuming this area does not contain allocated memory, the result of the container_of() macro would be an invalid pointer, thus causing UB(in general case OFC) in Linux. Is this reasoning plausible or am I mistaken somehow?

Or are there some parts of the standard universally considered to not be worth to follow?

Solution

As suspected by OP, the implementation of the list_for_each_entry(pos, head, member) macro depends on undefined behavior in the C language in order for the loop termination condition !list_entry_is_head(pos, head, member) to become false.

Assuming the list is non-empty, then after the final iteration, the third "advancing" expression of the for loop produces a pointer to an invalid typeof(*pos) at an address offsetof(typeof(*pos), member) bytes before the struct list_head pointed to by head. It relies on &pos->member nevertheless comparing equal to head.

Although it depends on undefined behavior, it is hard for the compiler to determine that pos is technically an invalid pointer. As long as both pos and head point within the same flat address space, the Linux kernel manages to get away with this bending of the rules.

The alternative would be for #include <linux/list.h> to not provide the list_for_each_entry(pos, head, member) macro at all, and for code to use the list_for_each(pos, head) and list_entry(ptr, type, member) macros instead (where pos is a struct list_head * and ptr is a type *), but that would typically require extra variables in the code.

Answered By - Ian Abbott

Answer Checked By - Clifford M. (WPSolving Volunteer)

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Sunday, July 3, 2022

[SOLVED] Does Linux kernel list implementation cause UB?

Issue

Solution

Popular Posts

Labels