DYLD — Do You Like Death? (V)
The lifecycle of a Dynamic Loader from its creation to its termination.
This is the fifth article in the series about debugging Dyld-1122 and analyzing its source code. We will start with the EphemeralAllocator function and finish after examining the PersistentAllocator.
Please note that this analysis may contain some errors as I am still learning and working on it alone. No one has checked it for mistakes. Please let me know in the comments or contact me through my social media if you find anything.
Let’s go!
WORKING MAP
As last time, we begin our journey by decompiling Dyld using Hopper.
hopper -e '/usr/lib/dyld'
We are in dyld`start, analysing the Memory Manager. In the last article, I introduced pseudo-code, which you can see below. Based on it, we finished the red rectangle and are currently on EphemeralAllocator:

In this episode, we start with the EphemeralAllocator and finish after creating the PersistentAllocator, which will be our default Allocator.


Dyld GitHub repository:
- Start: EphemeralAllocator in dyld-1122.1, dyldMain.cpp#L1234
- End: ProcessConfig in dyld-1122.1, dyldMain.cpp#L1239
LLDB breakpoints:
# Start - dyld`start+1480
settings set target.env-vars DYLD_IN_CACHE=0
br set -n start -s dyld -R 1480
# END - dyld`start+1532
br set -n start -s dyld -R 1532
The next article will start at the exact point where this one finishes.
START — Linker Standard Library
First, we create ephemeralAllocator as type EphemeralAllocator. In the decompiled code, we can observe the lsl abbreviation:

It stands for Linker Standard Library, which is a temporary substitute for std, since at this stage of loading we have not imported libc yet.

We will deal with both allocators in this article.
Ephemeral Allocator
The code sets up an EphemeralAllocator and initializes a MemoryManager object within it. We will use it to allocate memory for short-lived objects.
Before calling lsl::EphemeralAllocator::EphemeralAllocator() we put the stack address 0x16fdfef40 into the x0 register:

So we can deduce we will operate on this memory on the stack. Let's see what it looks like before calling the allocator:

Now we branch into the constructor and observe that it performs another jump to:

This is the main body of our allocator. We may observe it executes several instructions before calling lsl::Allocator::aligned_alloc:
There is also pacda, which is later authenticated by autda in the mentioned lsl::Allocator::aligned_alloc function. However, as described in the previous article, it is not supported yet on macOS, so it does not work here either.

It looks like lsl::Allocator::aligned_alloc takes three arguments: a pointer to the stack and two immediate values, 0x08 and 0x28:

Below, we can see the source code of lsl::Allocator::aligned_alloc:
We may also read its decompiled pseudo-code version from Hopper:

The arg0 will be our stack pointer (0x16fdfef40), arg1 is the alignment (0x8) and arg2 is the size (0x28). So this function seems to reserve 40 bytes of the memory pointed to by arg0, aligned to an 8-byte boundary. Let's see the stack before executing lsl::Allocator::aligned_alloc:

Then, let's set the breakpoint just before calling AllocationMetadata and inspect the memory again (the green field is our 40 bytes in alignment scope):

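The alignment arithmetic behind these numbers can be illustrated with a small Python sketch. It is not dyld's implementation of lsl::Allocator::aligned_alloc; align_up is a hypothetical helper that only shows the usual power-of-two rounding, applied to the alignment (0x8) and size (0x28) seen in the registers. The unaligned address is an arbitrary value chosen for illustration.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (value + alignment - 1) & ~(alignment - 1)


ALIGNMENT = 0x8   # arg1 observed in the debugger
SIZE = 0x28       # arg2 observed in the debugger

print(SIZE)                                   # 40
print(align_up(SIZE, ALIGNMENT) == SIZE)      # True: 40 is already a multiple of 8
# Arbitrary unaligned address, used only to show the rounding.
print(hex(align_up(0x10019c021, ALIGNMENT)))  # 0x10019c028
```

The size 0x28 is already a multiple of 8, so rounding leaves it unchanged; only an unaligned address or size would be bumped to the next boundary.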
At last, there is the lsl::AllocationMetadata::AllocationMetadata function, which is executed before we get back to dyld`start:

We can also read its decompiled pseudo-code version from Hopper:

Again, we are taking 3 arguments, but this time arg0 has changed to 0x10019c020, while our pointer to the allocator on the stack is in arg1, and arg2 stores the size of memory (0x0030) we will be working on:

After executing AllocationMetadata we store the value 0x000600002dfbfde8 in the memory pointed to by 0x10019c020:

Then, we continue the execution until we are back at dyld`start+1488 and inspect our stack (0x16fdfef40):

x/8gx 0x16fdfef40
Let’s also inspect these pointers on the stack before entering the allocator and after finishing its execution. As we can see, all these pointers except the first one, 0x1000a9fd0, belong to the same memory page, 0x10019c000.
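The same-page observation can be checked with simple masking. The sketch below assumes 16 KiB pages (the usual page size on Apple silicon, which is an assumption about this machine); page_base is a hypothetical helper, and the addresses are the ones from this debugging session.

```python
PAGE_SIZE = 0x4000  # assumption: 16 KiB pages


def page_base(address: int, page_size: int = PAGE_SIZE) -> int:
    """Return the start of the page that contains address."""
    return address & ~(page_size - 1)


for pointer in (0x1000a9fd0, 0x10019c020):
    print(hex(pointer), "->", hex(page_base(pointer)))
# 0x1000a9fd0 -> 0x1000a8000
# 0x10019c020 -> 0x10019c000
```

Only the second pointer falls into the page starting at 0x10019c000, which matches what the memory dump shows.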
So all we need to do is inspect 0x1000a9fd0 and 0x10019c000. For that purpose, we need to set two breakpoints, restart the execution and inspect the pointers on these breakpoints. Below, you can see the full workflow:
# SET BREAKPOINTS
br set -n start -s dyld -R 1484
br set -n start -s dyld -R 1492
run
# INSPECT @ BR 1
x/20gx 0x00000001000a9fd0
x/20gx 0x000000010019c000
c
# INSPECT @ BR 2
x/20gx 0x00000001000a9fd0
x/20gx 0x000000010019c000
Below, we can observe what changed after executing EphemeralAllocator. It seems that all the memory was zeroed out, and only 0x10019c000 points to itself and holds the value 0x400000. This is the amount of memory cleared: 4MB.

We can confirm that the allocator cleared this memory by inspecting it again:
# The last 12 x 8B of 4MB memory:
x/12gx 0x000000010019c000+0x3fffa0
# First 12 x 8B of memory after 4MB:
x/12gx 0x000000010019c000+0x400000
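The offsets used in these two commands follow directly from the region size. A minimal sketch of the arithmetic, assuming the 0x400000-byte region starting at 0x10019c000 observed above (the variable names are only illustrative):

```python
REGION_BASE = 0x10019c000  # start of the region seen in the debugger
REGION_SIZE = 0x400000     # value stored at the start of the region
WORD_SIZE = 8              # x/12gx prints 8-byte words
WORD_COUNT = 12

tail_offset = REGION_SIZE - WORD_COUNT * WORD_SIZE

print(REGION_SIZE // (1024 * 1024), "MiB")  # 4 MiB
print(hex(tail_offset))                     # 0x3fffa0
print(hex(REGION_BASE + tail_offset))       # 0x10059bfa0
print(hex(REGION_BASE + REGION_SIZE))       # 0x10059c000
```

The first dump therefore covers the last 96 bytes inside the region, and the second one starts at the first byte past it.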

All 4MB of memory belongs to Dyld Private Memory:

The question arises of how this memory region was allocated, since we could not find any syscalls in the decompiled code or in the source code in the repository, and without them a new memory region cannot be mapped.
To answer this question, we should debug the whole flow step by step, and we will find out that there is a hidden function, vm_allocate_bytes:

This function utilises vm_allocate and vm_protect, which use syscalls and Mach traps to map a dyld private memory region for our process.
I describe how both of them work in the Persistent Allocator section because they are used there too, and it is easier to follow since we can find the code related to them in the Hopper decompiled code and the Dyld repository source code.
For now, let's just check the sorted memory using vmmap -interleaved to see the entire dyld private memory region created by EphemeralAllocator:

EphemeralAllocator flow.
This way, we finished the task of the first allocator (EphemeralAllocator). Dyld will use this memory region for short-lived object allocations.
After analysis, it appears that the primary task of this allocator was to allocate a new 4MB region of zeroed memory and place its pointer on the stack. It looks like a Dyld “private heap” used for short-lived (ephemeral) data.
Persistent Allocator
Now, we will set up the persistent allocator at line 1236. From the source code, we can deduce that we will operate on the bootStrapMemoryManager, which we set up in the previous article at 0x16fdfee80.

In our decompiled code, we are here:

This corresponds to the source code below from the Dyld repository:

The same code looks like this in our decompiled pseudo-code:

Let’s set a breakpoint and observe what we will use in place of arguments:
br set -n start -s dyld -R 1492

As expected, we are using the 0x16fdfee80 pointer. We proceed with execution by stepping into this function and observing two nested functions:

The first function takes the 0x16fdfee80 pointer and the value 0x40000 as arguments. Its decompiled version can be seen below:
The vm_* functions are part of the Mach virtual memory subsystem.

Let's continue the execution and branch to vm_allocate_bytes. We can see here another 2 nested functions, vm_allocate and vm_protect:
These are used to allocate virtual memory for a given task and set access privilege attributes for this region. The same was used in the Ephemeral Allocator.

Before moving on, it is good to read Mach Overview.
vm_allocate
Let's analyse the first one, vm_allocate. We can find its source code in the XNU repository, and it is shown below:

As we can see, it takes 4 arguments: task, address pointer, size and flags. It returns kern_return_t, which is just an alias for int:

These 4 arguments are stored successively in the x0, x1, x2, and x3 registers:

# dyld`vm_allocate:
0x100013050 <+56>: bl 0x100011588 ; _kernelrpc_mach_vm_allocate_trap
0x10001307c <+100>: bl 0x100015018 ; mig_get_reply_port
0x1000130c8 <+176>: bl 0x10008b274 ; mach_msg2_internal
0x100013150 <+312>: bl 0x10008b4c4 ; mig_dealloc_reply_port
0x100013178 <+352>: bl 0x100011000 ; mach_msg_destroy
Each of these functions relies on a syscall, which is handled in kernel space.
_kernelrpc_mach_vm_allocate_trap
Let's proceed with _kernelrpc_mach_vm_allocate_trap:

- If the task exists and has port send rights, we execute mach_copyin to copy in the memory address from args->addr into the variable addr.
- Then, it uses this addr to allocate zero-filled memory within the task’s virtual memory map using mach_vm_allocate_external.
- Lastly, if the allocation was successful, it copies the pointer to the memory location to the address specified by args->addr using mach_copyout.
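The three steps above can be sketched as a toy model in Python. This is not the XNU implementation: ToyTask, toy_vm_allocate_trap, TOY_FAILURE and the ADDR_SLOT location are hypothetical names and values introduced only to illustrate the copy-in, allocate, copy-out pattern of an in/out address argument. The base address and size are the ones seen later in this debugging session.

```python
KERN_SUCCESS = 0   # real value from <mach/kern_return.h>
TOY_FAILURE = 1    # hypothetical error code used only by this model
ADDR_SLOT = 0x1000 # hypothetical user-space location holding the address


class ToyTask:
    """Hypothetical stand-in for a task and its address space."""

    def __init__(self, first_free_address):
        self.user_words = {}  # address -> value visible to user space
        self.regions = {}     # region base -> zero-filled bytearray
        self.next_free = first_free_address


def toy_vm_allocate_trap(task, addr_pointer, size):
    """Model the three steps: copy in, allocate, copy out."""
    if task is None:
        return TOY_FAILURE
    # Step 1: copy in the requested address from user memory.
    requested = task.user_words.get(addr_pointer, 0)
    # Step 2: allocate zero-filled memory (anywhere if nothing was requested).
    base = requested if requested else task.next_free
    task.regions[base] = bytearray(size)
    task.next_free = base + size
    # Step 3: copy the resulting address back out to user memory.
    task.user_words[addr_pointer] = base
    return KERN_SUCCESS


task = ToyTask(first_free_address=0x1005a0000)
result = toy_vm_allocate_trap(task, ADDR_SLOT, 0x404000)
print(result, hex(task.user_words[ADDR_SLOT]))  # 0 0x1005a0000
```

The point of the model is that the caller passes a pointer to an address, not the address itself, so the kernel can write the chosen location back on success.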
The args structure is shown below:

On entering the function, we can observe that it uses the code number -10 and calls a trap with x0=task, x1=address, x2=size, and x3=flags:

We can use lsmp with -p and the PID of our process to check whether such a task even exists for our process and has send rights:

As we can see, the task exists, and we have the proper right. As the last thing, let's check the memory pointed to by our address before calling the trap:

After executing the syscall, we may observe this memory was populated with a pointer to the allocated 0x404000 bytes of zeroed memory:

What did this syscall change? It mapped a new region of memory for us. This can be observed using vmmap after executing the syscall:
The red rectangle is a region allocated by the Persistent Allocator. We may also observe the green one, which was previously allocated by EphemeralAllocator.

After executing the syscall, we skip the rest of the code and return from vm_allocate to the vm_allocate_bytes function.
vm_protect
Then we proceed to vm_protect, whose source code is shown below:

The function takes 5 arguments, and it changes the protection for the specified size of memory starting from address:

The function utilises _kernelrpc_mach_vm_protect_trap to achieve it:

As before with vm_allocate, we will check what happens to the specified memory region before and after executing this syscall:

As we may observe above, the protections have changed from rw- to ---. Additionally, we can double-check this using vmmap:

After that, we return from vm_protect to the vm_allocate_bytes function and conduct a stack cookie comparison to protect against buffer overflow.
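The rw- and --- strings seen in the vm_protect step are a rendering of a protection bit mask. The sketch below uses the VM_PROT_* values from mach/vm_prot.h; prot_to_string is a hypothetical helper written for this article, not a Mach or vmmap function.

```python
# Values as defined in <mach/vm_prot.h>.
VM_PROT_NONE = 0x00
VM_PROT_READ = 0x01
VM_PROT_WRITE = 0x02
VM_PROT_EXECUTE = 0x04


def prot_to_string(prot: int) -> str:
    """Render a protection mask in the rwx style used by vmmap."""
    flags = ((VM_PROT_READ, "r"), (VM_PROT_WRITE, "w"), (VM_PROT_EXECUTE, "x"))
    return "".join(letter if prot & bit else "-" for bit, letter in flags)


print(prot_to_string(VM_PROT_READ | VM_PROT_WRITE))  # rw-
print(prot_to_string(VM_PROT_NONE))                  # ---
```

Under this reading, the transition observed in the debugger corresponds to the mask going from read plus write to no access at all.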
PersistentAllocator::PersistentAllocator
A check is performed at the end of the vm_allocate_bytes function to ensure no buffer overflow occurred during the allocation.

If the values in x8 and x9 are the same, we execute the instructions until we come across retab, which takes us to persistentAllocator:

Then we store the address 0x1005a0000 and the size 0x40000 at the memory sp points to, 0x16fdfede0:

We proceed to lsl::PersistentAllocator::PersistentAllocator:

In our decompiled code, we can observe it takes the value returned from vm_allocate_bytes as the 1st argument, so it is 0x1005a0000, the starting address of the dyld private memory we just allocated.

In the Dyld repository, we can find the corresponding code:

The result is a fully initialized allocator ready for memory allocation and deallocation operations. It resides at 0x1005a0000, as we may observe:

It seems like this memory region will be treated as a second Dyld “private heap”. This memory pool will be used for further allocations for objects that will persist in memory. I will not analyze this part any further at this moment.
END
In this article, we analysed how the Ephemeral and Persistent Allocators are created and where they reside in the Virtual Memory space.
In the decompiled code, we finished here:

In the debugger, we finished here:

In the Dyld source code, we finished here:
In the following article, we will use our Persistent Allocator to create ProcessConfig and RuntimeState, and we will dive into these functions.
Continued in DYLD — Do You Like Death? (VI)