DYLD — Do You Like Death? (IV)
The lifecycle of a Dynamic Loader from its creation to its termination.
This is the fourth article in the series about debugging Dyld-1122 and analyzing its source code. We will start from the RuntimeLocks declaration in dyldMain.cpp, which is the exact point where we finished the last article.
Please note that this analysis may contain some errors as I am still learning and working on it alone. No one has checked it for mistakes. Please let me know in the comments or contact me through my social media if you find anything.
Let’s go!
WORKING MAP
As last time, we begin our journey by decompiling Dyld with Hopper.
hopper -e '/usr/lib/dyld'
We are in dyld`start after finishing handleDyldInCache. We chose not to follow the restartWithDyldInCache path by setting DYLD_IN_CACHE=0, so we eventually left handleDyldInCache and proceeded with the execution to RuntimeLocks.
In this episode, we will start analysing the Memory Manager, beginning from RuntimeLocks and finishing just before EphemeralAllocator.



Dyld GitHub repository:
- Start: RuntimeLocks in dyld-1122.1, dyldMain.cpp#L1219
- End: EphemeralAllocator in dyld-1122.1, dyldMain.cpp#L1234
LLDB breakpoints:
# Start - dyld`start+1264
settings set target.env-vars DYLD_IN_CACHE=0
br set -n start -s dyld -R 1264
# END - dyld`start+1480
br set -n start -s dyld -R 1480
The next article will start at the exact point where this one finishes.
START — RuntimeLocks
We finished a long branch inside handleDyldInCache and eventually landed one instruction before RuntimeLocks. That instruction adds 0x90 to SP and stores the result in x0. This is a pointer to the kernArgs:

Looking at the source code, in the next few instructions we will initialize an instance of the RuntimeLocks class named locks and set three pointers to null:

Starting from line 1212, we reach SUPPPORT_PRE_LC_MAIN, which is only ON when loading x86_64 binaries, so we fall into the else branch, and in line 1219 we declare an instance of the RuntimeLocks class named locks.

The class members _loadersLock, _notifiersLock, _tlvInfosLock and _apiLock are initialized with the OS_UNFAIR_RECURSIVE_LOCK_INIT value, while allocatorLock and logSerializer are initialized with OS_LOCK_UNFAIR_INIT:
These locks ensure that only one thread can access the protected resource at a time, preventing data races and the corruption they cause.

Then we initialize three pointers: allocator, state, and appMain. In lldb it looks like all members of locks and these pointers are initialized with zeroes.
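To make the zeroed memory easier to picture, here is a minimal sketch. It assumes, as the lldb output suggests, that both initializer macros expand to all-zero values. The types and names (UnfairLockSketch, RuntimeLocksSketch, isAllZero) are hypothetical stand-ins, not Apple's real definitions, and the layouts are guesses made only for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in lock types. ASSUMPTION: the real lock types live in Apple headers
// and may be laid out differently; these exist only for illustration.
struct UnfairLockSketch          { uint32_t opaque; };
struct UnfairRecursiveLockSketch { UnfairLockSketch lock; uint32_t count; };

// ASSUMPTION: both initializers are all-zero, matching what lldb displayed.
constexpr UnfairLockSketch          kUnfairInit{0};
constexpr UnfairRecursiveLockSketch kUnfairRecursiveInit{{0}, 0};

// Mirrors the member names mentioned in the article.
struct RuntimeLocksSketch {
    UnfairRecursiveLockSketch loadersLock   = kUnfairRecursiveInit;
    UnfairRecursiveLockSketch notifiersLock = kUnfairRecursiveInit;
    UnfairRecursiveLockSketch tlvInfosLock  = kUnfairRecursiveInit;
    UnfairRecursiveLockSketch apiLock       = kUnfairRecursiveInit;
    UnfairLockSketch          allocatorLock = kUnfairInit;
    UnfairLockSketch          logSerializer = kUnfairInit;
};

// Returns true when every byte of the object is zero,
// which is what a raw memory dump in a debugger would show.
bool isAllZero(const RuntimeLocksSketch& locks) {
    unsigned char bytes[sizeof(RuntimeLocksSketch)];
    std::memcpy(bytes, &locks, sizeof(bytes));
    for (unsigned char b : bytes) {
        if (b != 0) return false;
    }
    return true;
}
```

Under these assumptions, declaring `RuntimeLocksSketch locks;` and calling `isAllZero(locks)` returns true, so "initialized with the lock macros" and "initialized with zeroes" describe the same bytes.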

Overall, we initialized the RuntimeLocks object with zeros in memory, likely preparing it for subsequent usage within the Dyld runtime environment.
MemoryManager
Now, we will initialize the Memory Manager to allocate the memory, configure the runtime, and load the executable with its dependencies.
We will break down this process into smaller parts as it is quite lengthy.
First, we construct a MemoryManager instance named bootStrapMemoryManager, passing it the apple variables located on the stack with the previously described findApple:

In our decompiled code in Hopper, this starts here:

After getting the pointer to the apple arguments, we run:

Which during runtime in lldb looks like this:

These instructions zero 16 bytes each in the v0 register and at 0x16fdfee70 and 0x16fdfee80, and zero one byte at 0x16fdfeea0:

We iterate over the apple arguments using _simple_getenv to check if a dyld_hw_tpro variable exists and has a non-null value. If that is the case, we set the _tproEnable variable to true:
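The lookup described above can be sketched as a scan over a NULL-terminated array of "key=value" strings. This is only an approximation of the behavior: simpleGetenvSketch and tproEnabledSketch are hypothetical names, not dyld's _simple_getenv or its MemoryManager constructor.

```cpp
#include <cstring>

// Hypothetical helper modeled on the behavior described in the article.
// Returns a pointer to the value after "key=", or nullptr if the key is absent.
const char* simpleGetenvSketch(const char* const vars[], const char* key) {
    const std::size_t keyLen = std::strlen(key);
    for (const char* const* p = vars; *p != nullptr; ++p) {
        // Match "key" followed immediately by '='.
        if (std::strncmp(*p, key, keyLen) == 0 && (*p)[keyLen] == '=') {
            return *p + keyLen + 1;
        }
    }
    return nullptr;
}

// ASSUMPTION: the flag is set whenever the lookup returns a non-null value.
bool tproEnabledSketch(const char* const apple[]) {
    return simpleGetenvSketch(apple, "dyld_hw_tpro") != nullptr;
}
```

With an array such as `{"executable_path=/bin/ls", "dyld_hw_tpro=1", nullptr}` the sketch reports the flag as enabled. Without the dyld_hw_tpro entry it stays disabled, which matches the case analysed here.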

I am not 100% sure, but I think this is only set on system startup:

In our case _tproEnable will not be set, as the dyld_hw_tpro variable was not passed from the kernel in the apple arguments part of kernArgs.

Next, we execute withWritableMemory, which takes a lambda with the [&] capture. I made a lambda_capture_example.cpp to explain how it works.
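For readers without the example at hand, here is a minimal sketch of the same idea. MemoryManagerSketch and runBootstrapSketch are hypothetical names, and the wrapper only toggles a flag rather than changing real memory protections.

```cpp
#include <utility>

// Hypothetical stand-in for a memory manager. It only models the
// "make writable, run the work, restore" shape of the real call.
class MemoryManagerSketch {
public:
    template <typename F>
    void withWritableMemory(F&& work) {
        writable_ = true;           // before work(): open the protected state
        std::forward<F>(work)();    // the caller's lambda body runs here
        writable_ = false;          // after work(): restore protection
    }
    bool writable() const { return writable_; }

private:
    bool writable_ = false;
};

int runBootstrapSketch() {
    MemoryManagerSketch manager;
    int result = 0;
    bool sawWritable = false;

    // [&] captures manager, result and sawWritable by reference, so the
    // lambda can read and modify the caller's local variables directly.
    manager.withWritableMemory([&] {
        sawWritable = manager.writable();
        result = 42;
    });

    // The lambda's writes are visible here, and protection is restored.
    return (sawWritable && !manager.writable()) ? result : -1;
}
```

Calling `runBootstrapSketch()` returns 42: the lambda ran while the manager was in its writable state, and its assignment to `result` reached the enclosing function because of the by-reference capture.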

This code creates a temporary scope where the memory managed by the bootStrapMemoryManager can be modified. Inside the lambda function passed to withWritableMemory, several initialization tasks are performed:

The work() here is our code block from DyldMain, starting at line 1234 and finishing at line 1253. This will be executed in place of work():
More readable pseudo code of the whole process:

This code may look trivial now, but this is memory management stuff. It will be a long journey, but do not worry. We will slice it into parts and make it through.
PRE WORK()
We start in Allocator.h#L165, where we place the this pointer on the stack and protect the address we placed there with PAC:

I will soon write an extensive article about Pointer Authentication on macOS in the Snake&Apple series. For the moment, this is a great source and this one:

The this here is the current instance of the MemoryManager we initialized as bootStrapMemoryManager. On the assembly level we see pacda:

Pointer Authentication Code for Data address (pacda) signs the address in the x16 register using key A, with the value in x17 as the modifier, and then stores the signed result back in x16.

However, when we execute the instruction below, we can observe that it changes nothing in our destination register (x16):

Moving forward to Authenticate Data address (autda), which should authenticate the address in x16 using key A and the value in x17:

The documentation states that:
If the authentication passes, the upper bits of the address are restored to enable subsequent use of the address. If the authentication fails, the upper bits are corrupted and any subsequent use of the address results in a Translation fault.
So, if we tamper with the address stored in x16, autda should fail and corrupt the upper bits, and the next use of the pointer should break the flow. Yet, since pacda has not signed it (leaving the higher bits untouched), this protects nothing.


I changed the register x16 at the autda instruction, proceeded with the execution flow, and observed that it changed nothing. In both cases, the jump at start+1388 is taken, even with the modified x16 register:

Currently, it does not look like it works on macOS Sonoma, and the reason for that is that Pointer Authentication is not fully supported yet.
The pacda instruction should sign the upper 16 bits and autda should authenticate them. In case of any modification, the execution should break.
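The expected sign-then-authenticate behavior can be modeled with a toy example. This is not how the hardware computes a PAC: real implementations use a keyed cryptographic function, while toyMac below is a trivial XOR fold chosen only so the flow is easy to follow. The names, the 48-bit address width, and the key value are all assumptions for illustration.

```cpp
#include <cstdint>

// ASSUMPTION: 48 address bits, with the upper 16 bits free for a signature.
constexpr int      kAddressBits = 48;
constexpr uint64_t kAddressMask = (1ULL << kAddressBits) - 1;

// Toy stand-in for the keyed function. NOT cryptographically meaningful:
// it XOR-folds the mixed 64-bit value into 16 bits.
uint16_t toyMac(uint64_t address, uint64_t modifier, uint64_t key) {
    const uint64_t x = (address & kAddressMask) ^ modifier ^ key;
    return static_cast<uint16_t>(x ^ (x >> 16) ^ (x >> 32) ^ (x >> 48));
}

// Models pacda: place the signature in the upper bits of the pointer.
uint64_t toySign(uint64_t pointer, uint64_t modifier, uint64_t key) {
    const uint64_t address = pointer & kAddressMask;
    const uint64_t mac = toyMac(address, modifier, key);
    return address | (mac << kAddressBits);
}

// Models autda: recompute the signature and compare it with the stored one.
// On success the stripped address is written to `out`. Real hardware would
// corrupt the pointer on failure instead of returning false.
bool toyAuth(uint64_t signedPointer, uint64_t modifier, uint64_t key, uint64_t& out) {
    const uint64_t address = signedPointer & kAddressMask;
    const uint16_t stored = static_cast<uint16_t>(signedPointer >> kAddressBits);
    if (stored != toyMac(address, modifier, key)) {
        return false;
    }
    out = address;
    return true;
}
```

In this model, authenticating an untouched signed pointer succeeds and yields the original address, while flipping any single address bit or using a different modifier makes toyAuth fail. That is the behavior one would expect from pacda and autda when signing is actually in effect.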
Of course, with our modification we will trigger an error anyway, since we used a pointer to invalid memory, which is occupied by PAGEZERO and thus has no permissions (---):

As we proceed we will trigger the EXC_BAD_ACCESS error, so let's restart the process and continue without any modifications.

Does this mean Pointer Authentication does not work?
I think this is dead code, unrelated to __ptrauth_dyld_tpro0. I could not find a reason why it exists, but I tested this part of the code with SIP turned off and arm64e_preview_abi turned on to fully support PAC:

arm64e_preview_abi ON
The assembly code is gone. There is no more pacda, and we can see pacga at start+1316, while normally this instruction is at start+1420:

It looks like dead code on Sonoma because arm64e_preview_abi is turned off by default, so pacda and autda are not enforced. Still, they exist in the code.
Now, we start the assembly code related to our __ptrauth_dyld_tpro0 for real. Because of the dead code above, we are still in the same place, but where exactly in our decompiled pseudo code?
# To get to the loc_100015d24
br set -n start -s dyld -R 1404

First, we protect the pointer to the current instance of the memory manager in x8 with PAC (this time for real) using pacga and store it on the stack:

We can observe that 0x16fdfeeb0 is stored on the stack in a signed form:

We can trace in lldb whenever this value changes using:
watchpoint set expression -- 0x16fdfeea8
However, I could not find the exact point where this signature is used. I think it is in code related to makeWriteable, which is shown later, but it is a wild guess.
The memory manager instance starts with WriteProtectionState, and the value 0x16fdfeeb0 points to the signature and data structure pointers:

In the Dyld source code, this structure is declared as previousState and we can see its members below. The previousState.data is 0x16fdfee80.
I couldn’t find the reason why the signature pointer here is 0x01. I also deduced from start+1468 that the previousState.data is 0x16fdfee80+0x18. Maybe I misinterpreted this part :/

0x16fdfee80 points to NULL

With the 0x16fdfee80 pointer in the x0 register we branch to lockGuard, which corresponds to os_compiler_barrier in the Dyld repository source code:

In our decompiled code, the lock is released after the writeProtect:

It seems like in lockGuard and later writeProtect we will be using our 0x16fdfee80 pointer (previousState.data):

The os_compiler_barrier prevents the compiler from reordering memory operations across it, so operations written before the barrier are emitted before those written after it. It uses atomic_signal_fence. I will not delve into it now but return to the topic during the XNU analysis.
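As a small illustration of what a compiler-only barrier does, the sketch below uses the standard C++ std::atomic_signal_fence. The variables and the publishSketch function are hypothetical and unrelated to dyld's actual state.

```cpp
#include <atomic>

// Hypothetical shared state, for illustration only.
static int payload = 0;
static int ready = 0;

void publishSketch(int value) {
    payload = value;

    // Compiler-only fence: the compiler may not move the store to `payload`
    // below this point, nor the store to `ready` above it. No hardware
    // barrier instruction is emitted, so this alone does not order the
    // stores as seen by other CPU cores.
    std::atomic_signal_fence(std::memory_order_seq_cst);

    ready = 1;
}
```

The fence constrains only the order in which the compiler emits the two stores. Ordering visible to other threads would need a thread fence or atomic operations instead.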
Then, we acquire 0x16fdfee80 from x21 and proceed to the writeProtect:

It seems like in the source code, we approach this line:

Most of the makeWriteable code runs only if tpro is enabled, so we skip almost all of it except line 218:
The only thing that changes is the data member of previousState, which is set to 1 a few lines after the writeProtect call:

0x16fdfee80 + 0x18
Here is the point I do not understand (related to my earlier question of why previousState.signature = 0x01). Why are we using 0x16fdfee80+0x18 instead of 0x16fdfee80? I could not find the answer and gave up after finding out that makeWriteable is almost unused there if tpro is off.
After that we release the lock and this is the end of our “prolog”.
END
After analysing this part of Memory Manager, even though I spent a lot of time on it, I didn’t do it satisfactorily and will probably return to it more than once.
I found many unknowns I need to learn more about. I could not find answers online, in books, or through source code analysis. The questions that bother me the most are: what tpro is, when it is turned on, and what it is used for.
In the decompiled code, we finished here:

In the debugger, we finished here:

In the Dyld source code, we finished here:
In the following article we will further analyse withWritableMemory and dive into the work(). I hope I will do better than with the PRE-WORK part.
Continued in DYLD — Do You Like Death? (V)