PaX - structleak

I am rather fascinated with exploit mitigations, especially the ones by PaX. When I first started out in security I learned of PaX quite quickly, and since moving into the binary exploitation space my desire to understand how these mitigations are created and how they work has greatly increased. In light of that, today I am going to look into “STRUCTLEAK”.

Introduction

STRUCTLEAK is a GCC plugin created by the PaX Team. Their decision to write it was prompted by CVE-2013-2141 (more on this CVE shortly). The idea of the mitigation is to zero-initialize data structures that can be copied into userland, which helps prevent uninitialized kernel memory from leaking.

History

As mentioned earlier, PaX’s decision to write a GCC plugin of this nature was prompted by CVE-2013-2141. The bug was in the do_tkill function and let an attacker leak information from the kernel by issuing either a tkill or a tgkill syscall. If we take a look at the patch, we can see the fix was quite a simple one:

/* diff --git a/kernel/signal.c b/kernel/signal.c
 * index dd72567767d96..598dc06be4214 100644
 * --- a/kernel/signal.c
 * +++ b/kernel/signal.c
 * @@ -2948,7 +2948,7 @@ do_send_specific(pid_t tgid, pid_t pid, int sig, struct siginfo *info)
 */

static int do_tkill(pid_t tgid, pid_t pid, int sig)
{
/* - */	struct siginfo info;
/* + */	struct siginfo info = {};

As you can see, the issue occurred because of a simple programming error; the fix amounts to an equals sign and a pair of curly brackets. The missing initializer meant that, when handling signals sent with tkill, uninitialized kernel stack data could be copied out to userland by code such as this:

int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
{
    put_user_ex(ptr_to_compat(from->si_ptr), &to->si_ptr);
}

Errors like this happen all the time; small mistakes that lead to significant security issues are common. The whole goal of PaX, and really of exploit mitigations in general, is to put compile-time or run-time checks in place so that even when such errors occur, their impact is contained.

This post isn’t about that particular CVE, so if you would like to play around with it further yourself, there is a nice PoC available on GitHub. If you want to download the specific patch, you can do so here.

Background

Before we can delve deeper into the implementation of this mitigation, let’s talk briefly about structures in C. A structure groups data together, and each item in a structure is known as a member. The idea behind structures is similar to the idea of classes in other languages.

Let’s use “siginfo” as an example of how structures work. Below we declare a structure “siginfo” with two members, “si_signo” and “si_errno”, both of type int.

struct siginfo {
    int si_signo;
    int si_errno;
};

Now let’s say that we wish to assign a value to a structure member, then we could do that with some code such as this.

struct siginfo info;

info.si_signo = 1;
info.si_errno = 0;

What we have done here is called assignment. We’ve assigned the value 1 to the structure member “si_signo” and the value 0 to the structure member “si_errno”.

There is one concept we missed, however: initialization. In C, if you wish to give a structure member an initial/default value, you can do so with an initializer at the point of declaration. Initializing structures is not required, but it is widely considered best practice. If a local structure is not initialized, its members hold indeterminate values, in practice whatever was previously stored in that stack memory, which leads to bugs and, in some cases, security issues.
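
To make the difference from assignment concrete, here is a minimal userland sketch. The structure is renamed `demo_siginfo` (a hypothetical name, chosen so it cannot clash with the real siginfo in <signal.h>), and `make_zeroed` and `make_partial` are hypothetical helpers written only for this illustration.

```c
/* Hypothetical stand-in for the two-member structure used above. */
struct demo_siginfo {
    int si_signo;
    int si_errno;
};

/* Initialization: every member gets a defined value (zero) at declaration. */
static struct demo_siginfo make_zeroed(void)
{
    struct demo_siginfo info = {0};
    return info;
}

/* Partial initialization: members that are not named are still zeroed. */
static struct demo_siginfo make_partial(int signo)
{
    struct demo_siginfo info = { .si_signo = signo };
    return info;
}
```

Calling `make_partial(9)` yields a structure whose si_signo is 9 and whose si_errno is 0, even though si_errno was never mentioned. A plain `struct demo_siginfo info;` gives no such guarantee.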

Why Does Leaking Occur?

In short, this comes down to how the compiler lays out structures in memory.

There is an issue known as “unaligned memory access”. What this essentially means is that, if you try to read N bytes from an address that is not evenly divisible by N then the result is an unaligned memory access. This is very important, especially in the case of structure members.

To ensure alignment, the compiler automatically adds padding to structures so that each member sits at a suitably aligned offset. The image below is taken from this useful post by katecpp.

The above image represents a structure in memory, where the grey blocks are padding and the remaining blocks are data. The problem with these padding bytes is that they are uninitialized, and uninitialized memory can lead to information leakage.
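
Padding can also be observed directly with sizeof and offsetof. The sketch below uses a hypothetical structure `padded` and two hypothetical helpers; the exact numbers depend on the target ABI, so treat it as an illustration rather than a statement about any particular kernel structure.

```c
#include <stddef.h>

/* Hypothetical structure: a 1-byte member followed by an int member. */
struct padded {
    char tag;
    int  value;
};

/* Number of bytes in the structure that belong to no member. */
static size_t padding_bytes(void)
{
    return sizeof(struct padded) - (sizeof(char) + sizeof(int));
}

/* Offset of `value`; the compiler may push it past `tag` to align it. */
static size_t value_offset(void)
{
    return offsetof(struct padded, value);
}
```

On a typical x86-64 build I would expect `value_offset()` to be 4 and `padding_bytes()` to be 3, meaning three bytes sit between the two members that no assignment ever writes to.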

CVE-2013-2141 is a perfect example of this behaviour. We have a situation where the result of an operation is copied from the kernel to a userspace address, disclosing kernel pointers through those uninitialized blocks.
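
The effect can be simulated safely in userland. The sketch below is not the kernel code path: it fills a byte buffer with a marker value standing in for stale stack contents, writes only the members of a hypothetical structure `reply`, and counts how many marker bytes would still be copied out. The names `reply`, `STALE` and `fill_reply` are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical structure with padding after `tag` on most ABIs. */
struct reply {
    char tag;
    int  value;
};

#define STALE 0xAA  /* marker byte standing in for old stack data */

/*
 * Build a reply in `out` (sizeof(struct reply) bytes) by writing only the
 * members, the way field-by-field assignment would. If `zero_first` is
 * non-zero, the buffer is cleared first, mirroring `struct reply r = {};`.
 * Returns how many marker bytes remain, i.e. would be leaked.
 */
static size_t fill_reply(unsigned char *out, int zero_first)
{
    char tag = 1;
    int value = 2;
    size_t i, stale = 0;

    memset(out, STALE, sizeof(struct reply));   /* simulate stack reuse */
    if (zero_first)
        memset(out, 0, sizeof(struct reply));

    memcpy(out + offsetof(struct reply, tag), &tag, sizeof(tag));
    memcpy(out + offsetof(struct reply, value), &value, sizeof(value));

    for (i = 0; i < sizeof(struct reply); i++)
        if (out[i] == STALE)
            stale++;
    return stale;
}
```

Without the zeroing step, every byte that is not covered by a member still carries the marker; with it, nothing stale is left in the buffer.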

structleak

Now that we understand how structures work and what initialization is, we can begin to understand how structleak works. At a high level, the plugin zero-initializes any local structure that contains a member marked with the __user attribute. In the Linux kernel, the __user attribute denotes userspace pointers and lets the compiler and the developer know that the pointer should not be trusted.

So what is zero initialization? As the name suggests, it is the process of setting the initial value of an object to zero. With this in place, if a programmer makes a mistake such as the one in CVE-2013-2141, the plugin has already zeroed the structure, so there is no stale data left to leak.

Think back to the patch for CVE-2013-2141 (shown again below). All the patch did was zero-initialize the structure before any assignment, so that the leaking behaviour does not occur. Anything the code does not explicitly set now holds zero rather than leftover stack contents.

/* diff --git a/kernel/signal.c b/kernel/signal.c
 * index dd72567767d96..598dc06be4214 100644
 * --- a/kernel/signal.c
 * +++ b/kernel/signal.c
 * @@ -2948,7 +2948,7 @@ do_send_specific(pid_t tgid, pid_t pid, int sig, struct siginfo *info)
 */

static int do_tkill(pid_t tgid, pid_t pid, int sig)
{
/* - */	struct siginfo info;
/* + */	struct siginfo info = {};

If you don’t know much about GCC plugins, I would recommend doing some research on them, specifically on what GIMPLE is. There is a good thread on LWN about writing GCC plugins, which will help you understand how structleak works.

The first thing that structleak needs to do before it can zero initialize structure members is to enumerate all of the local variables:

static unsigned int structleak_execute(void)
{

    tree var;
    unsigned int i;

    FOR_EACH_LOCAL_DECL(cfun, i, var) {
        tree type = TREE_TYPE(var);

        if (!auto_var_in_fn_p(var, current_function_decl))
            continue;

        // only care about structures
        if (TREE_CODE(type) != RECORD_TYPE)
            continue;

        // if type is interesting, examine
        if (TYPE_USERSPACE(type))
            initialize(var);
    }

    return 0;
}

When a structure of an interesting type is found, it is passed to the initialize function so that the target structure is zero-initialized:

static void initialize(tree var)
{
    basic_block bb;
    gimple_stmt_iterator gsi;
    tree initializer;
    gimple init_stmt;

    // build initializer expression
    initializer = build_constructor(TREE_TYPE(var), NULL);

    // build initializer stmt
    init_stmt = gimple_build_assign(var, initializer);
    gsi = gsi_start_bb(ENTRY_BLOCK_PTR->next_bb);
    gsi_insert_before(&gsi, init_stmt, GSI_NEW_STMT);
    update_stmt(init_stmt);
}
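
In source terms, the statement built by initialize behaves roughly as if the variable had been declared with an empty initializer at the top of the function. The userland sketch below illustrates that effect only; it is not the plugin's output. `__user` is defined away here as an assumption (in the kernel it is an annotation), and `struct request` and `handle` are hypothetical names.

```c
#include <stddef.h>

#define __user  /* assumption: empty in userland, an annotation in the kernel */

struct request {
    int          len;
    void __user *buf;   /* the member that makes this type "interesting" */
};

/*
 * The programmer's version would declare `req` with no initializer and
 * only set `len`. The explicit `= {0}` below stands in for the
 * initialization that the plugin inserts at function entry.
 */
static struct request handle(int len)
{
    struct request req = {0};

    req.len = len;      /* req.buf is never assigned by the original code */
    return req;
}
```

With the forced initialization, `handle(16).buf` is NULL instead of whatever pointer-sized value happened to be on the stack.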

This plugin was ported to the Linux kernel in 2017; you can find that commit here.

Final Thoughts

One interesting comment in the Linux kernel port was the mention of targeting other types in the future, not just structures. At some point, I might look into doing that myself, if only as a thought experiment.

References