<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://youssix.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://youssix.github.io/" rel="alternate" type="text/html" /><updated>2026-05-28T09:19:08+00:00</updated><id>https://youssix.github.io/feed.xml</id><title type="html">Youssix</title><subtitle>Security Research - Low-level systems, virtualization, and reverse engineering</subtitle><entry><title type="html">VMCS by Practice: Notes from Writing a Hypervisor</title><link href="https://youssix.github.io/2026/05/20/vmcs-by-practice/" rel="alternate" type="text/html" title="VMCS by Practice: Notes from Writing a Hypervisor" /><published>2026-05-20T00:00:00+00:00</published><updated>2026-05-20T00:00:00+00:00</updated><id>https://youssix.github.io/2026/05/20/vmcs-by-practice</id><content type="html" xml:base="https://youssix.github.io/2026/05/20/vmcs-by-practice/"><![CDATA[<p>If you start reading Intel VMX documentation seriously, you quickly notice something frustrating: the Intel SDM is extremely precise, but not pedagogical. It’s a reference manual, not a learning path.</p>

<p>Most beginner questions are not about individual fields. They’re about <em>relationships</em> between concepts:</p>

<ul>
  <li>How many VMCS structures actually exist?</li>
  <li>What does “current VMCS” really mean?</li>
  <li>Why does <code class="language-plaintext highlighter-rouge">VMPTRLD</code> fail right after allocation?</li>
  <li>What’s the difference between <code class="language-plaintext highlighter-rouge">VM_EXIT_REASON</code> and <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code>?</li>
  <li>Why do some VM exits feel impossible to debug?</li>
</ul>

<p>This article is not a complete VMX reference. It’s a collection of practical notes and mental models I wish I had earlier while working on a custom hypervisor for learning purposes.</p>

<p>The target reader has already opened Intel SDM Vol. 3C, read the first VMX chapters, maybe looked at projects like HyperPlatform or KVM, and now has concrete debugging questions.</p>

<p><strong>We will focus on:</strong></p>

<ul>
  <li>VMCS lifecycle</li>
  <li>VMCS activation rules</li>
  <li>common beginner mistakes</li>
  <li>VMX debugging flow</li>
  <li>VM exit diagnostics</li>
</ul>

<p><strong>We will not cover:</strong> EPT internals, posted interrupts, nested virtualization, APIC virtualization, advanced scheduling, or production-grade VMM design. Those deserve separate articles.</p>

<hr />

<h2 id="1-how-many-vmcs-exist-and-how-many-are-current">1. How Many VMCS Exist, and How Many Are Current?</h2>

<p>A common beginner misunderstanding is assuming there is “one VMCS per VM”. That is incorrect.</p>

<p><strong>The VMCS is tied to a vCPU, not to the VM itself.</strong></p>

<p>Consider this scenario:</p>

<ul>
  <li>4 VMs</li>
  <li>2 vCPUs per VM</li>
  <li>8 logical processors on the host</li>
</ul>

<p>That means <code class="language-plaintext highlighter-rouge">4 × 2 = 8</code> vCPUs, and because each vCPU needs its own execution context, <strong>8 VMCS regions allocated</strong>.</p>

<p>A mental model:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>VM1:  vCPU0 → VMCS A    VM2:  vCPU0 → VMCS C
      vCPU1 → VMCS B          vCPU1 → VMCS D

VM3:  vCPU0 → VMCS E    VM4:  vCPU0 → VMCS G
      vCPU1 → VMCS F          vCPU1 → VMCS H
</code></pre></div></div>

<p>That tells us how many VMCS structures exist in memory. But Intel VMX introduces another concept: the <strong>current VMCS</strong>.</p>

<p>A VMCS becomes <em>current</em> on a logical processor after <code class="language-plaintext highlighter-rouge">VMPTRLD</code>. This loads the VMCS pointer into the processor’s VMX state. The important rule:</p>

<blockquote>
  <p>A logical processor has at most one current VMCS at a time. Software must ensure that a VMCS is never current on more than one logical processor simultaneously; the CPU does not enforce this for you.</p>
</blockquote>

<p>That second clause is what makes SMP hypervisor work non-trivial.</p>

<h3 id="allocated-vs-current-vs-launched">Allocated vs Current vs Launched</h3>

<p>These three terms are often confused, and the distinction is the single most useful mental model you can build early.</p>

<p><strong>Allocated</strong> — a 4 KB VMCS region exists in memory. Nothing more. The CPU doesn’t know about it.</p>

<p><strong>Current</strong> — <code class="language-plaintext highlighter-rouge">VMPTRLD</code> has been executed for this VMCS on a logical processor. The CPU now associates VMX execution state with that structure. <code class="language-plaintext highlighter-rouge">VMREAD</code> / <code class="language-plaintext highlighter-rouge">VMWRITE</code> operate on the current VMCS.</p>

<p><strong>Launched</strong> — <code class="language-plaintext highlighter-rouge">VMLAUNCH</code> has succeeded at least once on this VMCS. From this point, <code class="language-plaintext highlighter-rouge">VMLAUNCH</code> can no longer be used on it; only <code class="language-plaintext highlighter-rouge">VMRESUME</code>. The CPU internally tracks this launch-state bit.</p>

<p>A VMCS therefore lives in one of several states: <em>clear</em> (just <code class="language-plaintext highlighter-rouge">VMCLEAR</code>’d, which is also how a freshly allocated region must be initialized before first use), <em>current but not launched</em>, or <em>current and launched</em>. <code class="language-plaintext highlighter-rouge">VMCLEAR</code> transitions a VMCS back to the <em>clear</em> state, which means after a clean migration to another LP you use <code class="language-plaintext highlighter-rouge">VMLAUNCH</code> again, not <code class="language-plaintext highlighter-rouge">VMRESUME</code>.</p>
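<p>The lifecycle rules can be captured in a small software model. This is only a sketch for reasoning about the states: the type and function names are hypothetical, nothing here touches real hardware, and the return values simply reuse the <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code> numbers 4 and 5.</p>

```c
#include <stdbool.h>

/* Hypothetical software model of one VMCS; not a hardware interface. */
typedef enum { VMCS_CLEAR, VMCS_LAUNCHED } launch_state_t;

typedef struct {
    launch_state_t launch; /* launch state the CPU tracks internally */
    bool current;          /* true once VMPTRLD made it current */
} vmcs_model_t;

enum {
    MODEL_OK = 0,
    MODEL_ERR_LAUNCH_NONCLEAR = 4,    /* VMLAUNCH with non-clear VMCS */
    MODEL_ERR_RESUME_NONLAUNCHED = 5, /* VMRESUME with non-launched VMCS */
    MODEL_ERR_NO_CURRENT = -1         /* no current VMCS (VMfailInvalid) */
};

/* VMCLEAR: back to the clear state, no longer current. */
static inline void model_vmclear(vmcs_model_t *v) {
    v->launch = VMCS_CLEAR;
    v->current = false;
}

/* VMPTRLD: makes the VMCS current; the launch state is untouched. */
static inline void model_vmptrld(vmcs_model_t *v) {
    v->current = true;
}

/* VMLAUNCH: only legal on a current VMCS in the clear state. */
static inline int model_vmlaunch(vmcs_model_t *v) {
    if (!v->current) return MODEL_ERR_NO_CURRENT;
    if (v->launch != VMCS_CLEAR) return MODEL_ERR_LAUNCH_NONCLEAR;
    v->launch = VMCS_LAUNCHED;
    return MODEL_OK;
}

/* VMRESUME: only legal on a current VMCS that was already launched. */
static inline int model_vmresume(vmcs_model_t *v) {
    if (!v->current) return MODEL_ERR_NO_CURRENT;
    if (v->launch != VMCS_LAUNCHED) return MODEL_ERR_RESUME_NONLAUNCHED;
    return MODEL_OK;
}
```

<p>Walking a vCPU through clear, load, launch, resume, clear, load, launch in this model mirrors what a migration between two LPs has to do on real hardware.</p>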

<h3 id="maximum-current-vmcs-at-once">Maximum Current VMCS at Once</h3>

<p>In our example:</p>

<ul>
  <li>8 vCPUs</li>
  <li>8 logical processors</li>
</ul>

<p>All 8 VMCS structures <em>could</em> be current simultaneously, one per LP:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LP0 → VMCS A    LP4 → VMCS E
LP1 → VMCS B    LP5 → VMCS F
LP2 → VMCS C    LP6 → VMCS G
LP3 → VMCS D    LP7 → VMCS H
</code></pre></div></div>

<p>But if only 2 vCPUs are scheduled at this instant: 8 VMCS allocated, 2 current, 6 sitting inactive in RAM.</p>

<h3 id="vmcs-migration-trap">VMCS Migration Trap</h3>

<p>One of the easiest mistakes in multi-core hypervisor development is reusing a VMCS on another CPU without properly clearing it first.</p>

<p>Before a VMCS can migrate cleanly between logical processors, you must issue <code class="language-plaintext highlighter-rouge">VMCLEAR</code> on the source LP. Otherwise:</p>

<ul>
  <li>the CPU on the source LP may still consider it current</li>
  <li><code class="language-plaintext highlighter-rouge">VMPTRLD</code> on the destination LP can fail</li>
  <li>internal cached state may not be flushed to the in-memory VMCS</li>
</ul>

<p>This is one of the reasons VMCS lifecycle management gets complicated once SMP enters the picture.</p>

<hr />

<h2 id="2-why-does-vmptrld-fail-right-after-allocation">2. Why Does <code class="language-plaintext highlighter-rouge">VMPTRLD</code> Fail Right After Allocation?</h2>

<p>This is one of the most classic VMX beginner failures. You allocate a 4 KB page:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span><span class="o">*</span> <span class="n">vmcs</span> <span class="o">=</span> <span class="n">MmAllocateContiguousMemory</span><span class="p">(</span><span class="mh">0x1000</span><span class="p">,</span> <span class="p">...);</span>
</code></pre></div></div>

<p>Then <code class="language-plaintext highlighter-rouge">VMPTRLD</code> fails immediately with <code class="language-plaintext highlighter-rouge">VMfailInvalid</code>. The usual reason: <strong>you forgot to initialize the VMCS revision identifier</strong>.</p>

<h3 id="the-vmcs-revision-identifier">The VMCS Revision Identifier</h3>

<p>A VMCS region is not just “any aligned page”. Intel requires the first 32 bits of the page to contain a specific value, the <em>VMCS revision identifier</em>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>offset 0x000:
  bits  0-30  → VMCS revision ID
  bit   31    → shadow VMCS indicator (keep at 0 unless using shadow VMCS)
</code></pre></div></div>

<p>Before <code class="language-plaintext highlighter-rouge">VMPTRLD</code>, you must write the revision ID at offset 0. It comes from MSR <code class="language-plaintext highlighter-rouge">IA32_VMX_BASIC</code> (<code class="language-plaintext highlighter-rouge">0x480</code>):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint64_t</span> <span class="n">vmx_basic</span> <span class="o">=</span> <span class="n">__readmsr</span><span class="p">(</span><span class="mh">0x480</span><span class="p">);</span>
<span class="kt">uint32_t</span> <span class="n">revision_id</span> <span class="o">=</span> <span class="p">(</span><span class="kt">uint32_t</span><span class="p">)(</span><span class="n">vmx_basic</span> <span class="o">&amp;</span> <span class="mh">0x7FFFFFFF</span><span class="p">);</span>

<span class="n">memset</span><span class="p">(</span><span class="n">vmcs</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mh">0x1000</span><span class="p">);</span>
<span class="o">*</span><span class="p">(</span><span class="kt">uint32_t</span><span class="o">*</span><span class="p">)</span><span class="n">vmcs</span> <span class="o">=</span> <span class="n">revision_id</span><span class="p">;</span>
</code></pre></div></div>

<p>After that, <code class="language-plaintext highlighter-rouge">VMPTRLD</code> can succeed.</p>

<h3 id="aside-vmxon-region">Aside: VMXON Region</h3>

<p>The VMXON region is a <em>different</em> 4 KB structure used by the <code class="language-plaintext highlighter-rouge">VMXON</code> instruction itself. It is <strong>not</strong> a VMCS, but confusingly it also requires the revision ID written at offset 0 (same MSR, same format). Beginners often allocate one and reuse it incorrectly. Keep them as separate allocations.</p>

<h3 id="physical-vs-virtual-address-trap">Physical vs Virtual Address Trap</h3>

<p><code class="language-plaintext highlighter-rouge">VMPTRLD</code> expects a <em>physical</em> address. Not a virtual address.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Wrong: the variable holds the VMCS virtual address.</span>
<span class="n">__vmx_vmptrld</span><span class="p">(</span><span class="o">&amp;</span><span class="n">vmcs_virtual</span><span class="p">);</span>

<span class="c1">// Correct: the intrinsic takes a pointer to a 64-bit variable holding the physical address.</span>
<span class="n">__vmx_vmptrld</span><span class="p">(</span><span class="o">&amp;</span><span class="n">vmcs_physical</span><span class="p">);</span>
</code></pre></div></div>

<p>The VMCS region lives in physical memory from the CPU’s perspective.</p>

<h3 id="alignment-requirements">Alignment Requirements</h3>

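<p>The VMCS region must be 4 KB aligned. Its size is reported in bits 44:32 of <code class="language-plaintext highlighter-rouge">IA32_VMX_BASIC</code> and never exceeds 4 KB, so allocating one full page is always enough. This is why page-aligned contiguous physical allocation is commonly used. A misaligned pointer alone makes <code class="language-plaintext highlighter-rouge">VMPTRLD</code> fail immediately.</p>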
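<p>A few pure helpers make these rules checkable before you ever execute <code class="language-plaintext highlighter-rouge">VMPTRLD</code>. This is a sketch: the function names are hypothetical, and <code class="language-plaintext highlighter-rouge">phys_addr_bits</code> is assumed to come from your own CPUID query for the physical-address width. Besides the revision ID, <code class="language-plaintext highlighter-rouge">IA32_VMX_BASIC</code> reports in bits 44:32 how many bytes the CPU uses for a VMCS region (at most 4 KB).</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 30:0 of IA32_VMX_BASIC: VMCS revision identifier. */
static inline uint32_t vmx_basic_revision_id(uint64_t vmx_basic) {
    return (uint32_t)(vmx_basic & 0x7FFFFFFFu);
}

/* Bits 44:32 of IA32_VMX_BASIC: bytes used by a VMXON or VMCS region. */
static inline uint32_t vmx_basic_region_size(uint64_t vmx_basic) {
    return (uint32_t)((vmx_basic >> 32) & 0x1FFFu);
}

/* A VMCS pointer must be 4 KB aligned and must not set any bit beyond
 * the physical-address width (phys_addr_bits is caller-supplied). */
static inline bool vmcs_phys_addr_ok(uint64_t phys, unsigned phys_addr_bits) {
    bool aligned = (phys & 0xFFFu) == 0;
    bool in_range = phys_addr_bits >= 64 || (phys >> phys_addr_bits) == 0;
    return aligned && in_range;
}
```

<p>Running the pointer check in a debug build right before the load turns a silent VMX failure into an assertion with a readable message.</p>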

<h3 id="why-intel-designed-it-this-way">Why Intel Designed It This Way</h3>

<p>Unlike AMD’s VMCB, the VMCS format is intentionally opaque. Intel does not expose an official C structure layout. The CPU owns the real internal format, and software interacts through <code class="language-plaintext highlighter-rouge">VMREAD</code> / <code class="language-plaintext highlighter-rouge">VMWRITE</code>.</p>

<p>The revision ID lets Intel evolve internal VMCS layouts across processors while keeping compatibility rules clear. The VMCS page is therefore partly a memory structure, and partly a CPU-managed object. That matters when debugging VMX state issues.</p>

<hr />

<h2 id="3-the-6-regions-of-a-vmcs">3. The 6 Regions of a VMCS</h2>

<p>The VMCS is easier to understand once you stop viewing it as a giant opaque structure and instead split it into functional areas. Intel conceptually divides it into six regions.</p>

<p><strong>1. Guest-State Area.</strong> Guest CPU state: RIP, RSP, CR0/CR3/CR4, segment state, MSRs, etc. On VM entry, the CPU loads guest execution state from here. On VM exit, the CPU saves it back here. This is a <em>transition snapshot</em>, not a live view of the guest CPU.</p>

<p><strong>2. Host-State Area.</strong> Defines the state restored during VM exit: host RIP, host RSP, host CR3, segment selectors. Again, transition state — not a live host CPU mirror.</p>

<p><strong>3. VM-Execution Control Fields.</strong> What should cause VM exits? CPUID intercept, MOV CR3 intercept, EPT enable, MSR bitmaps, exception bitmap, etc.</p>

<p><strong>4. VM-Exit Control Fields.</strong> What happens during the exit transition: host address-space size, save/load debug controls, MSR switching behavior.</p>

<p><strong>5. VM-Entry Control Fields.</strong> What happens during the entry transition: IA-32e guest mode, event injection, MSR loading.</p>

<p><strong>6. VM-Exit Information Fields.</strong> Read-only diagnostic fields filled by the CPU during VM exits: <code class="language-plaintext highlighter-rouge">VM_EXIT_REASON</code>, <code class="language-plaintext highlighter-rouge">EXIT_QUALIFICATION</code>, <code class="language-plaintext highlighter-rouge">VM_EXIT_INTR_INFO</code>. These are critical for debugging — they answer <em>why</em> the VM exited.</p>
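<p>These areas are visible in the field encodings you pass to <code class="language-plaintext highlighter-rouge">VMREAD</code> / <code class="language-plaintext highlighter-rouge">VMWRITE</code>. Bits 11:10 of an encoding give the field type and bits 14:13 give its width; the three control groups share a single type value, so the six areas collapse into four types. The sketch below decodes an encoding; the enum and function names are hypothetical.</p>

```c
#include <stdint.h>

/* Field type, bits 11:10 of a VMCS field encoding. */
typedef enum {
    FIELD_CONTROL = 0,     /* execution, exit and entry controls */
    FIELD_EXIT_INFO = 1,   /* read-only VM-exit information */
    FIELD_GUEST_STATE = 2,
    FIELD_HOST_STATE = 3
} vmcs_field_type_t;

/* Field width, bits 14:13 of a VMCS field encoding. */
typedef enum {
    WIDTH_16 = 0,
    WIDTH_64 = 1,
    WIDTH_32 = 2,
    WIDTH_NATURAL = 3
} vmcs_field_width_t;

static inline vmcs_field_type_t vmcs_field_type(uint32_t enc) {
    return (vmcs_field_type_t)((enc >> 10) & 3u);
}

static inline vmcs_field_width_t vmcs_field_width(uint32_t enc) {
    return (vmcs_field_width_t)((enc >> 13) & 3u);
}

/* Index within the (type, width) group, bits 9:1. */
static inline uint32_t vmcs_field_index(uint32_t enc) {
    return (enc >> 1) & 0x1FFu;
}
```

<p>For example, the guest RIP encoding <code class="language-plaintext highlighter-rouge">0x681E</code> decodes as a natural-width guest-state field, and the exit reason encoding <code class="language-plaintext highlighter-rouge">0x4402</code> as a 32-bit exit-information field.</p>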

<hr />

<h2 id="4-vm_exit_reason-vs-vm_instruction_error">4. <code class="language-plaintext highlighter-rouge">VM_EXIT_REASON</code> vs <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code></h2>

<p>This is probably the single most common conceptual confusion in beginner VMX development. These two fields answer completely different questions.</p>

<h3 id="vm_exit_reason"><code class="language-plaintext highlighter-rouge">VM_EXIT_REASON</code></h3>

<blockquote>
  <p>Why did the guest stop executing?</p>
</blockquote>

<p>The important implication: <strong>the guest was successfully running</strong>, and a VM exit happened afterward.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>VMLAUNCH
  → guest runs
  → VMEXIT occurs
  → hypervisor resumes
  → VM_EXIT_REASON available
</code></pre></div></div>

<p>Typical reasons: CPUID intercept, EPT violation, exception, HLT, I/O instruction, MOV CR3 intercept. You read it after a successful VM exit via <code class="language-plaintext highlighter-rouge">VMREAD(VM_EXIT_REASON)</code>.</p>
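<p>The raw 32-bit value is not just a number: bits 15:0 hold the basic exit reason and bit 31 flags a VM-entry failure. A minimal decoding sketch follows; the enum and helper names are hypothetical, while the numeric codes are the SDM values for the reasons listed above.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* A few basic exit reasons (bits 15:0 of the exit reason field). */
enum {
    EXIT_REASON_EXCEPTION_NMI = 0,
    EXIT_REASON_CPUID = 10,
    EXIT_REASON_HLT = 12,
    EXIT_REASON_CR_ACCESS = 28,
    EXIT_REASON_IO_INSTRUCTION = 30,
    EXIT_REASON_INVALID_GUEST_STATE = 33,
    EXIT_REASON_EPT_VIOLATION = 48
};

/* Basic exit reason: mask off the flag bits in the upper half. */
static inline uint16_t exit_basic_reason(uint32_t raw) {
    return (uint16_t)(raw & 0xFFFFu);
}

/* Bit 31 set: the exit reports a failed VM entry, the guest did not run. */
static inline bool exit_is_entry_failure(uint32_t raw) {
    return ((raw >> 31) & 1u) != 0;
}
```

<p>Comparing the raw value directly against a reason code works until the first entry-failure exit shows up with bit 31 set, so masking first is the safer habit.</p>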

<h3 id="vm_instruction_error"><code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code></h3>

<blockquote>
  <p>Why did a VMX instruction fail?</p>
</blockquote>

<p>This is a completely different concept. It applies to VMX instructions such as <code class="language-plaintext highlighter-rouge">VMLAUNCH</code>, <code class="language-plaintext highlighter-rouge">VMRESUME</code>, <code class="language-plaintext highlighter-rouge">VMPTRLD</code>, and <code class="language-plaintext highlighter-rouge">VMWRITE</code>, any of which can fail before the guest has executed a single instruction.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>VMLAUNCH
  → VMfailValid
  → VM_INSTRUCTION_ERROR explains why
</code></pre></div></div>

<p>This is not a VM exit. This is a VMX API failure.</p>

<h3 id="the-car-analogy">The Car Analogy</h3>

<p><code class="language-plaintext highlighter-rouge">VM_EXIT_REASON</code> — <em>the car was driving; why did it stop?</em> Red light, crash, police stop, engine failure.</p>

<p><code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code> — <em>the car never started; why?</em> Invalid key, dead battery, broken engine.</p>

<h3 id="the-beginner-trap">The Beginner Trap</h3>

<p>A very common mistake:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (VMLAUNCH failed)
    read VM_EXIT_REASON;
</code></pre></div></div>

<p>Conceptually wrong. No VM exit occurred. The guest never ran. The correct field is <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code>.</p>

<h3 id="vmfailinvalid-vs-vmfailvalid"><code class="language-plaintext highlighter-rouge">VMfailInvalid</code> vs <code class="language-plaintext highlighter-rouge">VMfailValid</code></h3>

<p>Another distinction worth knowing.</p>

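<p><strong><code class="language-plaintext highlighter-rouge">VMfailInvalid</code></strong>: the instruction failed and there is no current VMCS in which to record an error number, for example when the very first <code class="language-plaintext highlighter-rouge">VMPTRLD</code> after <code class="language-plaintext highlighter-rouge">VMXON</code> is given an invalid or misaligned pointer. The CPU sets <code class="language-plaintext highlighter-rouge">CF</code>, and there is no valid <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code> to read. (Executing a VMX instruction while VMX is disabled is a different failure: it raises <code class="language-plaintext highlighter-rouge">#UD</code>.)</p>

<p><strong><code class="language-plaintext highlighter-rouge">VMfailValid</code></strong>: the instruction failed while a current VMCS exists, because its parameters or the VMCS state are invalid. The CPU sets <code class="language-plaintext highlighter-rouge">ZF</code>, and <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code> will contain a precise reason.</p>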
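<p>Architecturally the two outcomes are reported through <code class="language-plaintext highlighter-rouge">RFLAGS</code>: <code class="language-plaintext highlighter-rouge">CF = 1</code> means <code class="language-plaintext highlighter-rouge">VMfailInvalid</code>, <code class="language-plaintext highlighter-rouge">ZF = 1</code> means <code class="language-plaintext highlighter-rouge">VMfailValid</code>, and both clear means success. If your assembly stub captures the flags after a VMX instruction, a classifier can look like the sketch below (the type and function names are hypothetical).</p>

```c
#include <stdint.h>

#define RFLAGS_CF (1ull << 0) /* carry flag */
#define RFLAGS_ZF (1ull << 6) /* zero flag */

typedef enum {
    VMX_SUCCEED,      /* CF = 0 and ZF = 0 */
    VMX_FAIL_INVALID, /* CF = 1: no current VMCS, no error number */
    VMX_FAIL_VALID    /* ZF = 1: read VM_INSTRUCTION_ERROR */
} vmx_status_t;

/* Classify the outcome of a VMX instruction from the saved RFLAGS. */
static inline vmx_status_t vmx_status_from_rflags(uint64_t rflags) {
    if (rflags & RFLAGS_CF) return VMX_FAIL_INVALID;
    if (rflags & RFLAGS_ZF) return VMX_FAIL_VALID;
    return VMX_SUCCEED;
}
```

<p>Only the <code class="language-plaintext highlighter-rouge">VMX_FAIL_VALID</code> branch should go on to read <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code>.</p>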

<h3 id="common-vm_instruction_error-codes">Common <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code> Codes</h3>

<p>Worth keeping a printed table near your debugger. The ones that come up most often:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: right">Code</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">4</td>
      <td><code class="language-plaintext highlighter-rouge">VMLAUNCH</code> with non-clear VMCS</td>
    </tr>
    <tr>
      <td style="text-align: right">5</td>
      <td><code class="language-plaintext highlighter-rouge">VMRESUME</code> with non-launched VMCS</td>
    </tr>
    <tr>
      <td style="text-align: right">7</td>
      <td>VM entry with invalid control fields</td>
    </tr>
    <tr>
      <td style="text-align: right">8</td>
      <td>VM entry with invalid host-state field</td>
    </tr>
    <tr>
      <td style="text-align: right">9</td>
      <td><code class="language-plaintext highlighter-rouge">VMPTRLD</code> with invalid physical address</td>
    </tr>
    <tr>
      <td style="text-align: right">10</td>
      <td><code class="language-plaintext highlighter-rouge">VMPTRLD</code> with <code class="language-plaintext highlighter-rouge">VMXON</code> pointer</td>
    </tr>
    <tr>
      <td style="text-align: right">11</td>
      <td><code class="language-plaintext highlighter-rouge">VMPTRLD</code> with incorrect VMCS revision identifier</td>
    </tr>
    <tr>
      <td style="text-align: right">12</td>
      <td><code class="language-plaintext highlighter-rouge">VMREAD</code>/<code class="language-plaintext highlighter-rouge">VMWRITE</code> to unsupported VMCS component</td>
    </tr>
    <tr>
      <td style="text-align: right">13</td>
      <td><code class="language-plaintext highlighter-rouge">VMWRITE</code> to read-only VMCS component</td>
    </tr>
    <tr>
      <td style="text-align: right">26</td>
      <td>VM entry with events blocked by MOV SS</td>
    </tr>
  </tbody>
</table>

<p>Codes 7 and 8 are by far the most common when wiring up a new hypervisor.</p>
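<p>The same table is handy as a lookup function in your logging path. A minimal sketch, covering only the codes listed above (the function name is hypothetical):</p>

```c
/* Map a VM_INSTRUCTION_ERROR number to the text from the table above.
 * Codes outside the table fall through to a generic message. */
static inline const char *vm_instruction_error_str(unsigned code) {
    switch (code) {
    case 4:  return "VMLAUNCH with non-clear VMCS";
    case 5:  return "VMRESUME with non-launched VMCS";
    case 7:  return "VM entry with invalid control fields";
    case 8:  return "VM entry with invalid host-state field";
    case 9:  return "VMPTRLD with invalid physical address";
    case 10: return "VMPTRLD with VMXON pointer";
    case 11: return "VMPTRLD with incorrect VMCS revision identifier";
    case 12: return "VMREAD/VMWRITE to unsupported VMCS component";
    case 13: return "VMWRITE to read-only VMCS component";
    case 26: return "VM entry with events blocked by MOV SS";
    default: return "not in this table, see the SDM error list";
    }
}
```

<p>Printing the string next to the raw number keeps the log searchable against the SDM.</p>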

<h3 id="practical-rule">Practical Rule</h3>

<ul>
  <li>If the guest executed and then exited → read <code class="language-plaintext highlighter-rouge">VM_EXIT_REASON</code>.</li>
  <li>If a VMX instruction itself failed → read <code class="language-plaintext highlighter-rouge">VM_INSTRUCTION_ERROR</code>.</li>
</ul>

<p>This distinction alone removes a massive amount of VMX debugging confusion.</p>

<hr />

<h2 id="5-debugging-exit-reason-0">5. Debugging Exit Reason 0</h2>

<p>One of the most confusing VM exits for beginners:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>EXIT_REASON = 0
EXIT_QUALIFICATION = 0
</code></pre></div></div>

<p>This often looks meaningless. It is not.</p>

<h3 id="exit-reason-0--exception-or-nmi">Exit Reason 0 = Exception or NMI</h3>

<p>A guest exception occurred (or an NMI), and VMX controls caused a VM exit. The key point: <strong><code class="language-plaintext highlighter-rouge">EXIT_QUALIFICATION</code> is usually not the important field here</strong>. The most important field is <code class="language-plaintext highlighter-rouge">VM_EXIT_INTR_INFO</code>.</p>

<h3 id="vm_exit_intr_info"><code class="language-plaintext highlighter-rouge">VM_EXIT_INTR_INFO</code></h3>

<p>This field tells you:</p>

<ul>
  <li>exception vector</li>
  <li>interruption type</li>
  <li>whether an error code exists</li>
  <li><strong>whether the field is valid (bit 31 — check this first)</strong></li>
</ul>

<p>If bit 31 is 0, the rest of the field is meaningless. Beginners often skip this check and chase a phantom vector. Always validate first, then decode.</p>
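<p>The validate-then-decode order can be sketched as follows. The struct and function names are hypothetical; the bit positions are the architectural ones (vector in bits 7:0, interruption type in bits 10:8, error-code-valid in bit 11, valid in bit 31).</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Interruption types that can appear in bits 10:8. */
enum {
    INTR_TYPE_EXTERNAL = 0,
    INTR_TYPE_NMI = 2,
    INTR_TYPE_HW_EXCEPTION = 3,
    INTR_TYPE_SW_EXCEPTION = 6
};

typedef struct {
    bool valid;          /* bit 31 */
    uint8_t vector;      /* bits 7:0 */
    uint8_t type;        /* bits 10:8 */
    bool has_error_code; /* bit 11 */
} intr_info_t;

/* Decode VM_EXIT_INTR_INFO; all other members stay zero when the
 * valid bit is clear, so callers cannot act on a phantom vector. */
static inline intr_info_t decode_intr_info(uint32_t raw) {
    intr_info_t info = {0};
    info.valid = ((raw >> 31) & 1u) != 0;
    if (!info.valid) return info;
    info.vector = (uint8_t)(raw & 0xFFu);
    info.type = (uint8_t)((raw >> 8) & 0x7u);
    info.has_error_code = ((raw >> 11) & 1u) != 0;
    return info;
}
```

<p>A raw value of <code class="language-plaintext highlighter-rouge">0x80000B0E</code>, for instance, decodes to a valid hardware exception with vector 14 and an error code.</p>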

<p>Common vectors:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: right">Vector</th>
      <th>Exception</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">3</td>
      <td><code class="language-plaintext highlighter-rouge">#BP</code> (breakpoint)</td>
    </tr>
    <tr>
      <td style="text-align: right">6</td>
      <td><code class="language-plaintext highlighter-rouge">#UD</code> (invalid opcode)</td>
    </tr>
    <tr>
      <td style="text-align: right">13</td>
      <td><code class="language-plaintext highlighter-rouge">#GP</code> (general protection)</td>
    </tr>
    <tr>
      <td style="text-align: right">14</td>
      <td><code class="language-plaintext highlighter-rouge">#PF</code> (page fault)</td>
    </tr>
  </tbody>
</table>

<h3 id="page-faults-pf-vector-14">Page Faults (<code class="language-plaintext highlighter-rouge">#PF</code>, vector 14)</h3>

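<p><code class="language-plaintext highlighter-rouge">EXIT_QUALIFICATION</code> becomes useful: for a page-fault exit it holds the faulting linear address, the value the guest would otherwise have found in <code class="language-plaintext highlighter-rouge">CR2</code>. The VM exit itself does not update <code class="language-plaintext highlighter-rouge">CR2</code>, and <code class="language-plaintext highlighter-rouge">GUEST_LINEAR_ADDRESS</code> is not defined for this exit, so do not rely on either. Also inspect:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">VM_EXIT_INTR_ERROR_CODE</code> (the page-fault error code)</li>
  <li>guest <code class="language-plaintext highlighter-rouge">RIP</code></li>
</ul>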

<p>Important: a normal guest page fault is <strong>exit reason 0</strong>. An EPT violation is <strong>exit reason 48</strong>. Do not mix them up — they are routed by completely different logic in the CPU.</p>
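<p>For a guest <code class="language-plaintext highlighter-rouge">#PF</code>, the error code delivered with the exception (readable from <code class="language-plaintext highlighter-rouge">VM_EXIT_INTR_ERROR_CODE</code> when bit 11 of <code class="language-plaintext highlighter-rouge">VM_EXIT_INTR_INFO</code> is set) uses the ordinary x86 page-fault layout. A small decoding sketch, with hypothetical struct and function names:</p>

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool present;      /* bit 0: fault on a present page (protection) */
    bool write;        /* bit 1: access was a write */
    bool user;         /* bit 2: access came from user mode */
    bool reserved_bit; /* bit 3: reserved bit set in a paging entry */
    bool instr_fetch;  /* bit 4: access was an instruction fetch */
} pf_error_t;

/* Decode the low five bits of an x86 page-fault error code. */
static inline pf_error_t decode_pf_error(uint32_t ec) {
    pf_error_t e;
    e.present = (ec & (1u << 0)) != 0;
    e.write = (ec & (1u << 1)) != 0;
    e.user = (ec & (1u << 2)) != 0;
    e.reserved_bit = (ec & (1u << 3)) != 0;
    e.instr_fetch = (ec & (1u << 4)) != 0;
    return e;
}
```

<p>An error code of <code class="language-plaintext highlighter-rouge">0x2</code> is a supervisor write to a non-present page, which during bring-up usually points at a guest mapping you forgot to create.</p>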

<h3 id="general-protection-faults-gp-vector-13">General Protection Faults (<code class="language-plaintext highlighter-rouge">#GP</code>, vector 13)</h3>

<p>Common causes when bringing up a hypervisor: invalid segment state, bad CR4 setup, invalid MSRs, non-canonical addresses, broken host-state fields. Priority checks: guest <code class="language-plaintext highlighter-rouge">RIP</code>, guest <code class="language-plaintext highlighter-rouge">CS</code>, guest <code class="language-plaintext highlighter-rouge">CR3</code>, host-state validity.</p>

<h3 id="invalid-opcode-ud-vector-6">Invalid Opcode (<code class="language-plaintext highlighter-rouge">#UD</code>, vector 6)</h3>

<p>Likely causes: unsupported instruction, broken <code class="language-plaintext highlighter-rouge">RIP</code>, invalid execution mode, VMX instruction leaking into the guest, XSAVE/XRSTOR issues.</p>

<h3 id="useful-debug-checklist-for-exit-reason-0">Useful Debug Checklist for Exit Reason 0</h3>

<ol>
  <li>Read <code class="language-plaintext highlighter-rouge">VM_EXIT_INTR_INFO</code>. Check bit 31. Identify the vector.</li>
  <li>Read <code class="language-plaintext highlighter-rouge">GUEST_RIP</code>. Locate the crashing instruction.</li>
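  <li>Read <code class="language-plaintext highlighter-rouge">VM_EXIT_INSTRUCTION_LEN</code> for software exceptions such as <code class="language-plaintext highlighter-rouge">#BP</code>. For hardware exceptions the field is not defined, so decode the bytes at <code class="language-plaintext highlighter-rouge">GUEST_RIP</code> yourself.</li>
  <li>Read guest <code class="language-plaintext highlighter-rouge">CR3</code> / <code class="language-plaintext highlighter-rouge">CS</code> / <code class="language-plaintext highlighter-rouge">RSP</code>. Understand execution context.</li>
  <li>For <code class="language-plaintext highlighter-rouge">#PF</code>, read <code class="language-plaintext highlighter-rouge">EXIT_QUALIFICATION</code> (the faulting linear address) and the interruption error code.</li>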
  <li>Distinguish guest fault from EPT fault. (Big source of confusion.)</li>
</ol>

<hr />

<h2 id="6-suggested-reading-order-for-sdm-vol-3c">6. Suggested Reading Order for SDM Vol. 3C</h2>

<p>If I restarted VMX learning from zero, I would read the Intel SDM in a very different order from how it’s printed. The SDM is structured as a reference manual, not a tutorial.</p>

<p><strong>Start with VMX operation basics.</strong> VMX root vs non-root, VM entry, VM exit, VMCS lifecycle. Without this, the rest feels random.</p>

<p><strong>Then read the VMCS layout chapter.</strong> Focus on guest state, host state, control fields, exit information fields. Do not try to memorize encodings yet — build a mental structure first.</p>

<p><strong>Then read the VM-entry and VM-exit chapters.</strong> These explain what the CPU actually does, in which order, what gets validated, and what can fail. This section alone explains many mysterious crashes.</p>

<p><strong>Keep these tables open while coding:</strong></p>

<ul>
  <li>VM instruction error codes</li>
  <li>VM exit reason codes</li>
  <li>Control-field allowed settings (<code class="language-plaintext highlighter-rouge">IA32_VMX_PINBASED_CTLS</code>, <code class="language-plaintext highlighter-rouge">IA32_VMX_PROCBASED_CTLS</code>, etc.)</li>
</ul>

<p>You will reference them constantly.</p>
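<p>The capability MSRs are meant to be applied mechanically. In each of them the low 32 bits mark controls that must be 1, and the high 32 bits mark controls that may be 1. A minimal sketch of the usual adjustment, with hypothetical function names (on CPUs that advertise the <code class="language-plaintext highlighter-rouge">TRUE</code> variants of these MSRs you would pass those values instead):</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Force the control bits the CPU requires and drop the ones it does
 * not support. cap_msr is the raw 64-bit capability MSR value. */
static inline uint32_t vmx_adjust_controls(uint32_t desired, uint64_t cap_msr) {
    uint32_t must_be_one = (uint32_t)cap_msr;        /* bits 31:0 */
    uint32_t may_be_one = (uint32_t)(cap_msr >> 32); /* bits 63:32 */
    return (desired | must_be_one) & may_be_one;
}

/* True when every requested bit is allowed to be 1 on this CPU. */
static inline bool vmx_controls_supported(uint32_t desired, uint64_t cap_msr) {
    uint32_t may_be_one = (uint32_t)(cap_msr >> 32);
    return (desired & ~may_be_one) == 0;
}
```

<p>Checking support before adjusting is worth the extra call: silently dropping a control you depend on produces a guest that runs but never takes the exits you expect.</p>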

<h3 id="external-references">External References</h3>

<ul>
  <li><strong>HyperPlatform</strong>: very readable educational VMX codebase.</li>
  <li><strong>KVM</strong> (<code class="language-plaintext highlighter-rouge">arch/x86/kvm/vmx/</code>): production-quality reference. Dense, but valuable.</li>
  <li><strong>Daax, Sina Karvandi (Hypervisor From Scratch), and hvpp</strong>: some of the best practical VMX writing and code publicly available.</li>
</ul>

<hr />

<h2 id="conclusion">Conclusion</h2>

<p>VMX becomes much easier once you stop thinking of it as “magic CPU behavior” and instead treat it as state transitions, execution contracts, and strict validation rules.</p>

<p>Most beginner VMX debugging problems are not caused by advanced concepts. They come from:</p>

<ul>
  <li>misunderstanding VMCS lifecycle</li>
  <li>confusing VM exits with VMX instruction failures</li>
  <li>reading the wrong diagnostic field</li>
  <li>not knowing which state is <em>transition</em> state versus <em>live execution</em> state</li>
</ul>

<p>This article focused on those foundations. It did not cover EPT internals, VPID, APIC virtualization, MSR bitmaps, posted interrupts, or nested virtualization — each deserves its own write-up, and EPT will be next.</p>

<p>If you understand VMCS lifecycle, current vs launched VMCS, VM exit diagnostics, and the difference between VM exits and VMX failures, you’ve already eliminated a surprisingly large percentage of early hypervisor debugging pain.</p>]]></content><author><name></name></author><category term="hypervisor" /><category term="vmx" /><category term="intel" /><category term="reverse-engineering" /><category term="low-level" /><summary type="html"><![CDATA[If you start reading Intel VMX documentation seriously, you quickly notice something frustrating: the Intel SDM is extremely precise, but not pedagogical. It’s a reference manual, not a learning path.]]></summary></entry><entry><title type="html">EPT Internals: Understanding Intel’s Second Layer of Paging</title><link href="https://youssix.github.io/2026/05/18/ept-internals/" rel="alternate" type="text/html" title="EPT Internals: Understanding Intel’s Second Layer of Paging" /><published>2026-05-18T00:00:00+00:00</published><updated>2026-05-18T00:00:00+00:00</updated><id>https://youssix.github.io/2026/05/18/ept-internals</id><content type="html" xml:base="https://youssix.github.io/2026/05/18/ept-internals/"><![CDATA[<p>This is the follow-up to <a href="/2026/05/20/vmcs-by-practice/">VMCS by Practice</a>. If that article focused on getting a guest to <em>run</em>, this one focuses on controlling what it <em>sees</em> in memory.</p>

<p>Extended Page Tables (EPT) is Intel’s hardware-assisted memory virtualization. Before EPT, hypervisors had to use shadow page tables: maintaining a parallel set of page tables and trapping every guest page table modification. It worked, but it was slow and complex.</p>

<p>EPT adds a second translation layer directly in hardware. The guest manages its own page tables normally (GVA → GPA), and the CPU automatically translates guest physical addresses to host physical addresses (GPA → HPA) through EPT. No traps needed for normal memory access.</p>

<p>I’ll go through the structures, the walk, the common mistakes, and how to debug all of it.</p>

<hr />

<h2 id="1-the-two-layer-translation-model">1. The Two-Layer Translation Model</h2>

<p>Without virtualization, address translation is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Virtual Address (VA) → Physical Address (PA)
          via CR3 → page tables
</code></pre></div></div>

<p>With EPT enabled, the guest still thinks it controls physical memory, but every “physical” address the guest produces is actually a <em>guest physical address</em> (GPA). The CPU then walks EPT to find the real host physical address:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Guest Virtual Address (GVA)
    → Guest Page Tables (controlled by guest CR3)
    → Guest Physical Address (GPA)
        → EPT Page Tables (controlled by EPTP in VMCS)
        → Host Physical Address (HPA)
</code></pre></div></div>

<p>The critical insight: <strong>the guest page table walk itself goes through EPT</strong>. When the CPU reads a guest PTE, that PTE lives at a GPA, which must be translated through EPT to find the actual memory. This means a single guest virtual address translation can trigger multiple EPT walks.</p>

<h3 id="the-cost-of-nested-translation">The Cost of Nested Translation</h3>

<p>A full 4-level guest page walk with 4-level EPT means up to <strong>24 memory accesses</strong> in the worst case (no TLB hits):</p>

<ul>
  <li>4 guest page table levels, each read at a GPA that needs its own EPT walk</li>
  <li>The final GPA of the data needs one more EPT walk, so 5 EPT walks of 4 levels each</li>
  <li>5 × 4 = 20 EPT accesses + 4 guest table reads = 24 total</li>
</ul>

<p>In practice, TLB caching (especially with VPIDs and EPT-tagged TLB entries) reduces this dramatically. But knowing the worst case explains why EPT misconfigurations hurt so much.</p>

<hr />

<h2 id="2-ept-structure-layout">2. EPT Structure Layout</h2>

<p>EPT uses the same hierarchical structure as regular x86-64 paging: 4 levels, 512 entries per table, 8 bytes per entry.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>EPTP (in VMCS)
  → PML4 (Page Map Level 4)         512 entries, each covers 512 GB
    → PDPT (Page Directory Pointer)  512 entries, each covers 1 GB
      → PD (Page Directory)          512 entries, each covers 2 MB
        → PT (Page Table)            512 entries, each covers 4 KB
</code></pre></div></div>

<p>Each level uses 9 bits of the guest physical address:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GPA bits:
  [47:39] → PML4 index
  [38:30] → PDPT index
  [29:21] → PD index
  [20:12] → PT index
  [11:0]  → page offset
</code></pre></div></div>
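<p>In code, the index extraction is a shift and a 9-bit mask per level. A minimal sketch with hypothetical helper names:</p>

```c
#include <stdint.h>

/* Each EPT level is indexed by 9 bits of the guest physical address. */
static inline unsigned ept_pml4_index(uint64_t gpa) {
    return (unsigned)((gpa >> 39) & 0x1FF);
}

static inline unsigned ept_pdpt_index(uint64_t gpa) {
    return (unsigned)((gpa >> 30) & 0x1FF);
}

static inline unsigned ept_pd_index(uint64_t gpa) {
    return (unsigned)((gpa >> 21) & 0x1FF);
}

static inline unsigned ept_pt_index(uint64_t gpa) {
    return (unsigned)((gpa >> 12) & 0x1FF);
}

/* Byte offset inside the final 4 KB page. */
static inline unsigned ept_page_offset(uint64_t gpa) {
    return (unsigned)(gpa & 0xFFF);
}
```

<p>These are the same shifts as for ordinary 4-level paging; the only difference is that the input is a GPA instead of a virtual address.</p>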

<h3 id="ept-entry-format">EPT Entry Format</h3>

<p>An EPT entry at any level has this basic structure:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Bits 2:0   → Read / Write / Execute permissions
Bits 5:3   → EPT memory type (leaf entries only; 0 = uncacheable, 6 = write-back)
Bit  7     → Large page (1 GB at PDPT level, 2 MB at PD level)
Bits 51:12 → Physical address of next level (or final page frame)
</code></pre></div></div>

<p>The permission bits are the most important part for security research:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: right">Bit</th>
      <th>Permission</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">0</td>
      <td>Read</td>
      <td>Guest can read this memory</td>
    </tr>
    <tr>
      <td style="text-align: right">1</td>
      <td>Write</td>
      <td>Guest can write this memory</td>
    </tr>
    <tr>
      <td style="text-align: right">2</td>
      <td>Execute</td>
      <td>Guest can execute from this memory</td>
    </tr>
  </tbody>
</table>

<p>Clearing all three bits marks the entry as not present, so any access to the range it covers causes an <strong>EPT violation</strong> (VM exit reason 48). This is the foundation of EPT-based memory monitoring.</p>

<h3 id="the-eptp-ept-pointer">The EPTP (EPT Pointer)</h3>

<p>The EPT pointer is stored in the VMCS and tells the CPU where the PML4 table lives:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Bits 2:0   → Memory type for EPT structures (typically 6 = write-back)
Bits 5:3   → EPT page walk length minus 1 (set to 3 for 4-level walk)
Bits 51:12 → Physical address of PML4 table (4 KB aligned)
</code></pre></div></div>

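<p>A common beginner mistake: setting the memory type wrong or forgetting the walk length field. Both make VM entry fail its control-field checks, and the only feedback is a generic VM-instruction error (7, “VM entry with invalid control field(s)”) that does not say which field was wrong.</p>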
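
<p>A small helper makes it harder to forget either field. This is a sketch, not a complete implementation: <code class="language-plaintext highlighter-rouge">make_eptp</code> and the macro names are my own, it hard-codes write-back and a 4-level walk, and it leaves the accessed/dirty enable bit (bit 6) clear. Real code should first confirm in <code class="language-plaintext highlighter-rouge">IA32_VMX_EPT_VPID_CAP</code> that the CPU supports both choices.</p>

```c
#include <stdint.h>

#define EPT_MEMTYPE_WB   6ULL                   /* bits 2:0, write-back */
#define EPTP_WALK_LEN_4  (3ULL << 3)            /* bits 5:3, walk length minus 1 */
#define EPTP_ADDR_MASK   0x000FFFFFFFFFF000ULL  /* bits 51:12 */

/* Build an EPTP value from the physical address of the PML4 table.
 * Returns 0 if the address is not 4 KB aligned. */
uint64_t make_eptp(uint64_t pml4_pa)
{
    if (pml4_pa & 0xFFF)
        return 0;
    return (pml4_pa & EPTP_ADDR_MASK) | EPTP_WALK_LEN_4 | EPT_MEMTYPE_WB;
}
```

<p>With a PML4 at physical <code class="language-plaintext highlighter-rouge">0x1000</code>, the result is <code class="language-plaintext highlighter-rouge">0x101E</code>. If your EPTP ends in anything other than <code class="language-plaintext highlighter-rouge">0x1E</code> (or <code class="language-plaintext highlighter-rouge">0x5E</code> with accessed/dirty flags enabled), look there first.</p>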

<hr />

<h2 id="3-building-an-identity-map">3. Building an Identity Map</h2>

<p>The simplest EPT configuration maps all guest physical memory 1:1 to host physical memory. GPA 0x1000 maps to HPA 0x1000. The guest sees exactly the real physical layout.</p>

<p>This is not useful for isolation, but it is the correct first step when bringing up a hypervisor. Get identity mapping working before attempting anything more complex.</p>

<h3 id="allocation-strategy">Allocation Strategy</h3>

<p>For a system with 4 GB of physical memory:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PML4:  1 table   (covers up to 256 TB)
PDPT:  1 table   (covers up to 512 GB, we need ~4 GB)
PD:    4 tables  (each covers 1 GB, 4 × 1 GB = 4 GB)
PT:    0 tables  (use 2 MB large pages to avoid this level)
</code></pre></div></div>

<p>Using 2 MB large pages simplifies the initial setup significantly. Set bit 7 in the PD entries to enable large pages:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(</span><span class="kt">uint64_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">2048</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>  <span class="c1">// 2048 × 2 MB = 4 GB</span>
    <span class="kt">uint64_t</span> <span class="n">pd_index</span> <span class="o">=</span> <span class="n">i</span> <span class="o">%</span> <span class="mi">512</span><span class="p">;</span>
    <span class="kt">uint64_t</span> <span class="n">pdpt_index</span> <span class="o">=</span> <span class="n">i</span> <span class="o">/</span> <span class="mi">512</span><span class="p">;</span>

    <span class="n">pd_tables</span><span class="p">[</span><span class="n">pdpt_index</span><span class="p">][</span><span class="n">pd_index</span><span class="p">]</span> <span class="o">=</span>
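        <span class="p">(</span><span class="n">i</span> <span class="o">*</span> <span class="mh">0x200000</span><span class="p">)</span>    <span class="c1">// physical address</span>
        <span class="o">|</span> <span class="p">(</span><span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="mi">7</span><span class="p">)</span>        <span class="c1">// large page</span>
        <span class="o">|</span> <span class="p">(</span><span class="mi">6</span> <span class="o">&lt;&lt;</span> <span class="mi">3</span><span class="p">)</span>        <span class="c1">// memory type: write-back (bits 5:3 left at 0 would mean uncacheable)</span>
        <span class="o">|</span> <span class="mh">0x7</span><span class="p">;</span>            <span class="c1">// read + write + execute</span>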
<span class="p">}</span>
</code></pre></div></div>
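
<p>The loop above only fills the PD level. The upper levels still have to point at those tables, so here is a fuller sketch of the same 4 GB layout. Several things are assumptions on my part: the table names, the write-back memory type on every 2 MB page (MMIO ranges would need uncacheable instead), and above all <code class="language-plaintext highlighter-rouge">virt_to_phys</code>, which is a placeholder that pretends virtual and physical addresses are identical. Replace it with whatever your environment actually provides.</p>

```c
#include <stdint.h>

#define EPT_RWX        0x7ULL
#define EPT_LARGE      (1ULL << 7)
#define EPT_MT_WB      (6ULL << 3)
#define EPT_ADDR_MASK  0x000FFFFFFFFFF000ULL

/* One PML4, one PDPT, four PDs: enough for 4 GB with 2 MB pages. */
_Alignas(4096) uint64_t ept_pml4[512];
_Alignas(4096) uint64_t ept_pdpt[512];
_Alignas(4096) uint64_t ept_pd[4][512];

/* ASSUMPTION: placeholder translation, identity only for illustration. */
static uint64_t virt_to_phys(const void *va)
{
    return (uint64_t)(uintptr_t)va;
}

void build_identity_map_4g(void)
{
    /* Non-leaf entries: permissions plus the address of the next table. */
    ept_pml4[0] = (virt_to_phys(ept_pdpt) & EPT_ADDR_MASK) | EPT_RWX;

    for (uint64_t g = 0; g < 4; g++) {
        ept_pdpt[g] = (virt_to_phys(ept_pd[g]) & EPT_ADDR_MASK) | EPT_RWX;

        /* Leaf entries: 2 MB pages mapped 1:1. */
        for (uint64_t i = 0; i < 512; i++) {
            uint64_t hpa = (g * 512 + i) * 0x200000ULL;
            ept_pd[g][i] = hpa | EPT_LARGE | EPT_MT_WB | EPT_RWX;
        }
    }
}
```

<p>Keeping the non-leaf entries fully permissive and restricting access only at the leaf level keeps later permission changes local to one entry.</p>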

<h3 id="common-identity-map-mistakes">Common Identity Map Mistakes</h3>

<p>MTRR interaction will get you. The memory type in EPT entries interacts with MTRR (Memory Type Range Registers). For identity mapping, write-back (6) works for RAM regions, but MMIO regions need uncacheable (0). I spent way too long debugging subtle corruption before realizing this.</p>

<p>MMIO ranges need to be covered too. Device MMIO regions (like the local APIC at <code class="language-plaintext highlighter-rouge">0xFEE00000</code>) must also be mapped. Missing MMIO mappings cause EPT violations when the guest accesses hardware.</p>

<p>Also watch out for physical address width. Not all CPUs support 48-bit physical addresses. Check <code class="language-plaintext highlighter-rouge">CPUID.80000008H:EAX[7:0]</code> for the actual width. Address bits above that width are reserved in EPT entries, and setting them causes an EPT misconfiguration (exit reason 49) when the entry is used.</p>
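
<p>A cheap guard is to validate every entry against the reported width before installing it. The sketch below takes the width as a parameter rather than executing CPUID itself, and both function names are hypothetical:</p>

```c
#include <stdint.h>
#include <stdbool.h>

/* Address bits 12..(maxphyaddr-1) that an EPT entry is allowed to set.
 * Returns 0 for widths that make no sense. */
uint64_t ept_addr_mask(unsigned maxphyaddr)
{
    if (maxphyaddr < 13 || maxphyaddr > 52)
        return 0;
    return ((1ULL << maxphyaddr) - 1) & ~0xFFFULL;
}

/* True if the entry's address field (bits 51:12) stays within the width. */
bool ept_entry_addr_ok(uint64_t entry, unsigned maxphyaddr)
{
    uint64_t field = entry & 0x000FFFFFFFFFF000ULL;
    return (field & ~ept_addr_mask(maxphyaddr)) == 0;
}
```

<p>On a CPU that reports 39 bits, an entry pointing at <code class="language-plaintext highlighter-rouge">0x8000000000</code> fails this check, which is much easier to debug than the exit it would otherwise produce.</p>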

<hr />

<h2 id="4-ept-violations-vm-exit-reason-48">4. EPT Violations (VM Exit Reason 48)</h2>

<p>When the guest accesses memory in a way that violates EPT permissions, the CPU generates a VM exit with reason 48. This is the most important EPT-related exit for security research.</p>

<h3 id="exit-qualification">Exit Qualification</h3>

<p>For EPT violations, <code class="language-plaintext highlighter-rouge">EXIT_QUALIFICATION</code> contains detailed information about what happened:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Bit 0  → caused by data read
Bit 1  → caused by data write
Bit 2  → caused by instruction fetch
Bit 3  → GPA was readable (AND of the read bits of every EPT entry used in the walk)
Bit 4  → GPA was writable
Bit 5  → GPA was executable
Bit 7  → GUEST_LINEAR_ADDRESS field is valid
Bit 8  → (only if bit 7 is set) access was to the final translation of that linear address, not to a guest paging-structure entry during the page walk
</code></pre></div></div>

<h3 id="relevant-vmcs-fields">Relevant VMCS Fields</h3>

<p>After an EPT violation, read these fields:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">GUEST_PHYSICAL_ADDRESS</code> - the GPA that caused the violation</li>
  <li><code class="language-plaintext highlighter-rouge">GUEST_LINEAR_ADDRESS</code> - the GVA the guest was accessing (if bit 7 of qualification is set)</li>
  <li><code class="language-plaintext highlighter-rouge">GUEST_RIP</code> - where the guest was executing</li>
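  <li><code class="language-plaintext highlighter-rouge">VM_EXIT_INSTRUCTION_LEN</code> - length of the faulting instruction; the SDM does not define this field for EPT violations, so treat it as a hint rather than something to rely on</li>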
</ul>
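
<p>I find it easier to decode the qualification once into named fields than to test raw bits all over the handler. A sketch, with a struct layout and names of my own choosing. It follows the SDM layout, in which bit 7 says whether the linear-address field is valid and bits 3 to 5 report the combined permissions of the translation:</p>

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool read, write, fetch;              /* bits 0-2: what the access was */
    bool readable, writable, executable;  /* bits 3-5: what EPT allowed */
    bool linear_valid;                    /* bit 7: GUEST_LINEAR_ADDRESS usable */
    bool final_translation;               /* bit 8: only meaningful if linear_valid */
} ept_violation_t;

/* Unpack the exit qualification of an EPT violation (exit reason 48). */
ept_violation_t decode_ept_violation(uint64_t q)
{
    ept_violation_t v;
    v.read              = (q >> 0) & 1;
    v.write             = (q >> 1) & 1;
    v.fetch             = (q >> 2) & 1;
    v.readable          = (q >> 3) & 1;
    v.writable          = (q >> 4) & 1;
    v.executable        = (q >> 5) & 1;
    v.linear_valid      = (q >> 7) & 1;
    v.final_translation = v.linear_valid && ((q >> 8) & 1);
    return v;
}
```

<p>A qualification of <code class="language-plaintext highlighter-rouge">0x182</code>, for example, decodes to a data write through a valid linear address to a page with no EPT permissions at all.</p>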

<h3 id="ept-violation-vs-page-fault">EPT Violation vs Page Fault</h3>

<p>This is a critical distinction that confuses beginners:</p>

<p><strong>Page fault (exit reason 0, vector 14):</strong> The guest’s <em>own</em> page tables rejected the access. The guest OS would normally handle this via its page fault handler. The hypervisor sees it only if exception bitmap bit 14 is set.</p>

<p><strong>EPT violation (exit reason 48):</strong> The guest’s page tables were fine, but EPT rejected the GPA → HPA translation. The guest OS has no idea this happened. Only the hypervisor sees it.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GVA → guest page tables → GPA → EPT → HPA
         page fault ↑              ↑ EPT violation
</code></pre></div></div>
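
<p>In an exit handler the distinction shows up as two different dispatch paths. A minimal sketch, assuming you have already read the exit reason and the VM-exit interruption information from the VMCS. The enum and function names are illustrative only:</p>

```c
#include <stdint.h>

enum mem_exit_kind {
    EXIT_OTHER,
    EXIT_GUEST_PAGE_FAULT,   /* guest page tables rejected the access */
    EXIT_EPT_VIOLATION,      /* EPT permissions rejected the access */
    EXIT_EPT_MISCONFIG       /* EPT entry itself is malformed */
};

enum mem_exit_kind classify_mem_exit(uint32_t exit_reason, uint32_t intr_info)
{
    switch (exit_reason & 0xFFFF) {   /* low 16 bits: basic exit reason */
    case 0:                           /* exception or NMI */
        /* intr_info: bit 31 valid, bits 10:8 type (3 = hardware exception),
         * bits 7:0 vector (14 = #PF). */
        if ((intr_info & 0x80000000u) &&
            ((intr_info >> 8) & 0x7) == 3 &&
            (intr_info & 0xFF) == 14)
            return EXIT_GUEST_PAGE_FAULT;
        return EXIT_OTHER;
    case 48:
        return EXIT_EPT_VIOLATION;
    case 49:
        return EXIT_EPT_MISCONFIG;
    default:
        return EXIT_OTHER;
    }
}
```

<p>A guest page fault is normally reinjected so the guest OS can handle it. An EPT violation is yours to resolve, and the guest should never learn about it.</p>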

<h3 id="ept-misconfiguration-exit-reason-49">EPT Misconfiguration (Exit Reason 49)</h3>

<p>Different from EPT violations. A misconfiguration means the EPT entry itself is structurally invalid - for example, a write-only page (write=1, read=0) which Intel does not allow. The CPU cannot meaningfully process the entry.</p>

<p>EPT misconfigurations usually indicate a hypervisor bug, not a guest behavior issue. Check your EPT construction logic when you see exit reason 49.</p>
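
<p>Some of these mistakes can be caught before the CPU ever sees the entry. The sketch below checks a leaf entry for the conditions I am sure about: write without read, execute-only on hardware that lacks support for it, and a reserved memory type. It is deliberately incomplete (it ignores reserved address bits and non-leaf rules), and <code class="language-plaintext highlighter-rouge">ept_leaf_is_misconfigured</code> is a hypothetical name:</p>

```c
#include <stdint.h>
#include <stdbool.h>

/* Partial sanity check for a leaf EPT entry.
 * exec_only_supported should come from IA32_VMX_EPT_VPID_CAP bit 0. */
bool ept_leaf_is_misconfigured(uint64_t e, bool exec_only_supported)
{
    bool r = e & 0x1, w = e & 0x2, x = e & 0x4;
    unsigned memtype = (unsigned)((e >> 3) & 0x7);

    if (!r && !w && !x)
        return false;               /* not present: that is a violation, not a misconfig */
    if (w && !r)
        return true;                /* write-only or write+execute without read */
    if (x && !r && !exec_only_supported)
        return true;                /* execute-only without hardware support */
    if (memtype == 2 || memtype == 3 || memtype == 7)
        return true;                /* reserved memory type encodings */
    return false;
}
```

<p>Running every entry through a check like this in debug builds turns an opaque exit reason 49 into an assertion failure at the line that built the bad entry.</p>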

<hr />

<h2 id="5-ept-based-memory-monitoring">5. EPT-Based Memory Monitoring</h2>

<p>The reason EPT matters for security research: it provides transparent memory access control below the operating system.</p>

<h3 id="readwrite-monitoring">Read/Write Monitoring</h3>

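<p>Set an EPT page to execute-only (read=0, write=0, execute=1). Any data read or write by the guest triggers an EPT violation, but code execution continues normally. This requires execute-only support, reported in bit 0 of <code class="language-plaintext highlighter-rouge">IA32_VMX_EPT_VPID_CAP</code>; without it, such an entry is an EPT misconfiguration.</p>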

<p>Use case: monitoring access to sensitive structures without modifying the guest. The guest kernel has no architectural way to observe this monitoring because EPT operates below its privilege level, although the extra VM exits can still show up in timing measurements.</p>

<h3 id="execute-monitoring">Execute Monitoring</h3>

<p>Set an EPT page to read/write but not execute (read=1, write=1, execute=0). Any instruction fetch from that page triggers an EPT violation.</p>

<p>Use case: detecting code execution in data regions, monitoring shellcode, tracking JIT compilation.</p>
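
<p>Both kinds of monitoring come down to rewriting the low three bits of a leaf entry while leaving everything else alone. A sketch with names of my own:</p>

```c
#include <stdint.h>

#define EPT_R  0x1ULL
#define EPT_W  0x2ULL
#define EPT_X  0x4ULL

/* Return a copy of the entry with only its R/W/X bits replaced.
 * Address, memory type and large-page bits are preserved. */
uint64_t ept_with_perms(uint64_t entry, uint64_t perms)
{
    return (entry & ~0x7ULL) | (perms & 0x7ULL);
}
```

<p>So <code class="language-plaintext highlighter-rouge">ept_with_perms(e, EPT_X)</code> arms read/write monitoring and <code class="language-plaintext highlighter-rouge">ept_with_perms(e, EPT_R | EPT_W)</code> arms execute monitoring. After changing a live entry you also need to invalidate cached translations with <code class="language-plaintext highlighter-rouge">INVEPT</code>, otherwise the CPU may keep using the old permissions.</p>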

<h3 id="the-split-tlb-approach">The Split-TLB Approach</h3>

<p>A more advanced technique: maintain two EPT views of the same physical memory.</p>

<ul>
  <li><strong>View A</strong>: read+write, no execute. Used for data access.</li>
  <li><strong>View B</strong>: execute-only, no read/write. Used for code execution.</li>
</ul>

<p>Switch between views on EPT violations. This allows monitoring both code execution and data access to the same memory region, which is useful for detecting self-modifying code or analyzing packed executables.</p>

<p>The complexity cost is significant. Each EPT violation requires determining whether to switch views, and high-frequency switching destroys performance. This is a research technique, not a production approach.</p>
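
<p>The decision logic itself is small. Here is a sketch of the view selection only, driven by the access-type bits of the exit qualification. The enum and function are illustrative, and the actual switch (loading a different EPTP or rewriting the entry, then invalidating) is left out:</p>

```c
#include <stdint.h>

enum ept_view {
    VIEW_A_DATA,   /* read+write, no execute */
    VIEW_B_EXEC    /* execute-only */
};

/* Pick the view that lets the faulting access proceed.
 * qual is the EPT violation exit qualification. */
enum ept_view view_for_violation(uint64_t qual, enum ept_view current)
{
    if (qual & (1ULL << 2))       /* instruction fetch */
        return VIEW_B_EXEC;
    if (qual & 0x3ULL)            /* data read or write */
        return VIEW_A_DATA;
    return current;               /* nothing recognizable: stay put */
}
```

<p>Code that reads its own page (jump tables, inline constants) will bounce between the two views on every access, which is exactly the high-frequency switching problem described above.</p>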

<hr />

<h2 id="6-debugging-ept-issues">6. Debugging EPT Issues</h2>

<p>EPT bugs are hard to debug because the symptoms are indirect. The guest crashes, hangs, or behaves incorrectly, with no obvious connection to EPT.</p>

<h3 id="diagnostic-checklist">Diagnostic Checklist</h3>

<p><strong>Guest triple-faults immediately after VMLAUNCH:</strong></p>
<ul>
  <li>EPT identity map is incomplete. The guest’s first instruction fetch hits an unmapped GPA.</li>
  <li>Check that the guest RIP’s physical page is mapped with execute permission.</li>
</ul>

<p><strong>Guest runs but crashes accessing devices:</strong></p>
<ul>
  <li>MMIO regions are not mapped in EPT.</li>
  <li>Map at minimum: local APIC (<code class="language-plaintext highlighter-rouge">0xFEE00000</code>), IOAPIC, any device the guest uses.</li>
</ul>

<p><strong>Random guest corruption:</strong></p>
<ul>
  <li>Memory type mismatch. EPT entry memory type conflicts with MTRR settings.</li>
  <li>For RAM, use write-back. For MMIO, use uncacheable.</li>
</ul>

<p><strong>Performance is unexpectedly terrible:</strong></p>
<ul>
  <li>Using 4 KB pages where 2 MB pages would work. Every TLB miss is more expensive with smaller pages.</li>
  <li>VPID not enabled. Without VPID, every VM exit/entry flushes TLB entries.</li>
</ul>

<h3 id="ept-walk-validation">EPT Walk Validation</h3>

<p>When debugging, manually walk the EPT for a suspect GPA:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Given GPA = 0x00000000_001F5000:

PML4 index   = (GPA &gt;&gt; 39) &amp; 0x1FF = 0
PDPT index   = (GPA &gt;&gt; 30) &amp; 0x1FF = 0
PD index     = (GPA &gt;&gt; 21) &amp; 0x1FF = 0
PT index     = (GPA &gt;&gt; 12) &amp; 0x1FF = 0x1F5

Walk: EPTP → PML4[0] → PDPT[0] → PD[0] → PT[0x1F5]
</code></pre></div></div>

<p>At each level, verify:</p>
<ol>
  <li>The entry is present (at least one of the read, write, or execute bits is set)</li>
  <li>The physical address in the entry points to a real allocated table</li>
  <li>Permission bits match your intent</li>
</ol>
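
<p>The same walk is worth having as code you can call from a debug path. The sketch below stops at the first non-present entry and reports the level, or returns the leaf entry it found. It assumes a hypothetical <code class="language-plaintext highlighter-rouge">phys_to_virt</code> helper, implemented here as an identity cast purely so the example is self-contained. It does not validate reserved bits.</p>

```c
#include <stdint.h>
#include <stdbool.h>

#define EPT_ADDR_MASK 0x000FFFFFFFFFF000ULL

/* ASSUMPTION: placeholder, identity mapping for illustration only. */
static const uint64_t *phys_to_virt(uint64_t pa)
{
    return (const uint64_t *)(uintptr_t)pa;
}

/* Walk the EPT for gpa.
 * Returns 0 and stores the leaf entry in *leaf_out on success.
 * Otherwise returns the level (4 = PML4 ... 1 = PT) whose entry is not present. */
int ept_walk(uint64_t eptp, uint64_t gpa, uint64_t *leaf_out)
{
    uint64_t table_pa = eptp & EPT_ADDR_MASK;

    for (int level = 4; level >= 1; level--) {
        unsigned idx = (unsigned)((gpa >> (12 + 9 * (level - 1))) & 0x1FF);
        uint64_t e = phys_to_virt(table_pa)[idx];

        if ((e & 0x7) == 0)
            return level;                         /* not present at this level */

        bool large = (level == 3 || level == 2) && (e & (1ULL << 7));
        if (level == 1 || large) {
            *leaf_out = e;                        /* final mapping found */
            return 0;
        }
        table_pa = e & EPT_ADDR_MASK;             /* descend to the next table */
    }
    return 1;                                     /* not reached: level 1 always returns */
}
```

<p>Logging the returned level alongside the faulting GPA usually points straight at the table you forgot to populate.</p>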

<hr />

<h2 id="7-ept-in-defensive-security">7. EPT in Defensive Security</h2>

<p>This is where EPT gets interesting from a defensive perspective. A bunch of modern security products rely on EPT, and knowing how they use it helps you evaluate what they actually protect.</p>

<p>EPT can make kernel code pages non-writable at the hardware level. Any attempt to patch kernel code - a common rootkit technique - triggers an EPT violation. Microsoft’s HVCI relies on this principle.</p>

<p>During incident response, a hypervisor can read guest physical memory through EPT without being subverted by kernel-level rootkits that manipulate the OS’s own page tables. Forensic analysts get a trustworthy view of memory that the OS itself can’t tamper with.</p>

<p>EPT execute-monitoring can also detect when data regions start executing code - a strong indicator of exploitation. EDR products increasingly use hypervisor-based telemetry for exactly this, because it operates below the level where malware can interfere.</p>

<p>Windows Credential Guard uses a separate VTL (Virtual Trust Level) with its own EPT mapping to isolate LSASS secrets. Even if an attacker gains kernel access in VTL0, EPT prevents reading the isolated memory.</p>

<p>All of these rely on the same thing: EPT gives you a monitoring layer that the OS and anything running inside it cannot see or bypass.</p>

<hr />

<h2 id="conclusion">Conclusion</h2>

<p>EPT transforms a hypervisor from “a thing that runs a guest” into “a thing that controls what the guest sees.” The identity map gets things working. Permission manipulation makes things interesting.</p>

<p>Quick recap of what matters:</p>

<ul>
  <li>Two-layer translation: GVA → GPA → HPA, each with its own page tables</li>
  <li>EPT violations (reason 48) are your primary tool for memory monitoring</li>
  <li>EPT violations are not page faults - different translation layer entirely</li>
  <li>Start with identity mapping using 2 MB large pages</li>
  <li>Memory type and MTRR interaction causes the most subtle bugs</li>
  <li>EPT-based monitoring sits below the OS, which is what makes it powerful</li>
</ul>

<p>Next up: the PEB (Process Environment Block) on Windows. Completely different domain, but the same theme of understanding internal structures to do useful security work.</p>]]></content><author><name></name></author><category term="hypervisor" /><category term="ept" /><category term="intel" /><category term="memory" /><category term="low-level" /><summary type="html"><![CDATA[This is the follow-up to VMCS by Practice. If that article focused on getting a guest to run, this one focuses on controlling what it sees in memory.]]></summary></entry><entry><title type="html">PEB Internals: What the Process Environment Block Reveals and Why Defenders Care</title><link href="https://youssix.github.io/2026/05/15/peb-windows-internals/" rel="alternate" type="text/html" title="PEB Internals: What the Process Environment Block Reveals and Why Defenders Care" /><published>2026-05-15T00:00:00+00:00</published><updated>2026-05-15T00:00:00+00:00</updated><id>https://youssix.github.io/2026/05/15/peb-windows-internals</id><content type="html" xml:base="https://youssix.github.io/2026/05/15/peb-windows-internals/"><![CDATA[<p>Every process on Windows has a Process Environment Block (PEB). Most developers never interact with it directly - the Win32 API abstracts everything away. But if you’re doing malware analysis, EDR engineering, or any kind of defensive work, the PEB comes up constantly.</p>

<hr />

<h2 id="1-what-is-the-peb">1. What Is the PEB?</h2>

<p>The PEB is a user-mode structure that the Windows kernel creates for every process. It lives in the process’s own address space, readable from user mode without any system calls. This is by design - many common operations (checking if the debugger is attached, enumerating loaded modules, reading environment variables) need this data without the overhead of a syscall.</p>

<p>The PEB is accessible through the TEB (Thread Environment Block), which itself is pointed to by the <code class="language-plaintext highlighter-rouge">GS</code> segment register on x64:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// x64: TEB is at GS:[0x30], PEB is at TEB+0x60</span>
<span class="n">PEB</span><span class="o">*</span> <span class="n">peb</span> <span class="o">=</span> <span class="p">(</span><span class="n">PEB</span><span class="o">*</span><span class="p">)</span><span class="n">__readgsqword</span><span class="p">(</span><span class="mh">0x60</span><span class="p">);</span>
</code></pre></div></div>

<p>On x86:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// x86: TEB is at FS:[0x18], PEB is at TEB+0x30</span>
<span class="n">PEB</span><span class="o">*</span> <span class="n">peb</span> <span class="o">=</span> <span class="p">(</span><span class="n">PEB</span><span class="o">*</span><span class="p">)</span><span class="n">__readfsdword</span><span class="p">(</span><span class="mh">0x30</span><span class="p">);</span>
</code></pre></div></div>

<h3 id="why-this-matters-for-defense">Why This Matters for Defense</h3>

<p>Because the PEB is in user-mode memory, any code running in the process can read <em>and modify</em> it. This is both a feature and a security concern. Legitimate code reads the PEB for process information. Malware modifies the PEB to hide its tracks.</p>

<hr />

<h2 id="2-key-peb-fields">2. Key PEB Fields</h2>

<p>The PEB is large and version-dependent. These are the fields that matter most for security analysis:</p>

<h3 id="beingdebugged-offset-0x02"><code class="language-plaintext highlighter-rouge">BeingDebugged</code> (offset 0x02)</h3>

<p>A single byte. Set to 1 when a debugger is attached via <code class="language-plaintext highlighter-rouge">DebugActiveProcess</code> or when the process is started under a debugger.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="n">peb</span><span class="o">-&gt;</span><span class="n">BeingDebugged</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// debugger is attached</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is equivalent to calling <code class="language-plaintext highlighter-rouge">IsDebuggerPresent()</code>, which literally just reads this byte. Malware commonly checks this field and alters its behavior. Anti-anti-debug tools typically reset this field to 0 to hide the debugger.</p>

<p>If your EDR sees a process where <code class="language-plaintext highlighter-rouge">BeingDebugged</code> is 0 but debug events are active on the process, something is manipulating the PEB.</p>

<h3 id="ldr-offset-0x18---peb_ldr_data"><code class="language-plaintext highlighter-rouge">Ldr</code> (offset 0x18 on x64, 0x0C on x86) - PEB_LDR_DATA</h3>

<p>Pointer to the loader data structure, which contains three linked lists of loaded modules:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">InLoadOrderModuleList</code> - modules in load order</li>
  <li><code class="language-plaintext highlighter-rouge">InMemoryOrderModuleList</code> - modules in memory address order</li>
  <li><code class="language-plaintext highlighter-rouge">InInitializationOrderModuleList</code> - modules in initialization order</li>
</ul>

<p>Each entry is an <code class="language-plaintext highlighter-rouge">LDR_DATA_TABLE_ENTRY</code> containing:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="k">struct</span> <span class="n">_LDR_DATA_TABLE_ENTRY</span> <span class="p">{</span>
    <span class="n">LIST_ENTRY</span> <span class="n">InLoadOrderLinks</span><span class="p">;</span>
    <span class="n">LIST_ENTRY</span> <span class="n">InMemoryOrderLinks</span><span class="p">;</span>
    <span class="n">LIST_ENTRY</span> <span class="n">InInitializationOrderLinks</span><span class="p">;</span>
    <span class="n">PVOID</span> <span class="n">DllBase</span><span class="p">;</span>               <span class="c1">// base address of the module</span>
    <span class="n">PVOID</span> <span class="n">EntryPoint</span><span class="p">;</span>            <span class="c1">// entry point</span>
    <span class="n">ULONG</span> <span class="n">SizeOfImage</span><span class="p">;</span>           <span class="c1">// size in memory</span>
    <span class="n">UNICODE_STRING</span> <span class="n">FullDllName</span><span class="p">;</span>  <span class="c1">// full path</span>
    <span class="n">UNICODE_STRING</span> <span class="n">BaseDllName</span><span class="p">;</span>  <span class="c1">// just the filename</span>
    <span class="c1">// ... more fields</span>
<span class="p">}</span> <span class="n">LDR_DATA_TABLE_ENTRY</span><span class="p">;</span>
</code></pre></div></div>

<p>Walking these lists is how tools like Process Explorer enumerate loaded DLLs. But malware can unlink entries from these lists to hide injected DLLs. The module is still loaded in memory, but PEB-based enumeration won’t find it.</p>

<h3 id="processparameters-offset-0x20---rtl_user_process_parameters"><code class="language-plaintext highlighter-rouge">ProcessParameters</code> (offset 0x20 on x64, 0x10 on x86) - RTL_USER_PROCESS_PARAMETERS</h3>

<p>Contains the command line, current directory, environment variables, image path, and window information:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">RTL_USER_PROCESS_PARAMETERS</span><span class="o">*</span> <span class="n">params</span> <span class="o">=</span> <span class="n">peb</span><span class="o">-&gt;</span><span class="n">ProcessParameters</span><span class="p">;</span>
<span class="c1">// params-&gt;CommandLine      - full command line</span>
<span class="c1">// params-&gt;ImagePathName    - path to the executable</span>
<span class="c1">// params-&gt;Environment      - environment variable block</span>
<span class="c1">// params-&gt;CurrentDirectory - working directory</span>
</code></pre></div></div>

<p>Malware can modify <code class="language-plaintext highlighter-rouge">ImagePathName</code> to make the process appear to be running from a different location. Comparing PEB <code class="language-plaintext highlighter-rouge">ImagePathName</code> with the actual executable path (from kernel structures) reveals this tampering.</p>

<h3 id="ntglobalflag-offset-0x68-on-x86-0xbc-on-x64"><code class="language-plaintext highlighter-rouge">NtGlobalFlag</code> (offset 0x68 on x86, 0xBC on x64)</h3>

<p>Debug-related flags set by the OS. When a debugger creates a process, certain flags are set:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">FLG_HEAP_ENABLE_TAIL_CHECK</code> (0x10)</li>
  <li><code class="language-plaintext highlighter-rouge">FLG_HEAP_ENABLE_FREE_CHECK</code> (0x20)</li>
  <li><code class="language-plaintext highlighter-rouge">FLG_HEAP_VALIDATE_PARAMETERS</code> (0x40)</li>
</ul>

<p>Combined value when debugging: <code class="language-plaintext highlighter-rouge">0x70</code>.</p>

<p>Malware checks this: if <code class="language-plaintext highlighter-rouge">NtGlobalFlag</code> contains <code class="language-plaintext highlighter-rouge">0x70</code>, a debugger likely started the process. This is a more subtle check than <code class="language-plaintext highlighter-rouge">BeingDebugged</code> and is missed by naive anti-anti-debug tools.</p>
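
<p>The check itself is a single mask comparison. A portable sketch that takes the flag value as an argument, so it can run anywhere. Reading the value out of a live PEB is left to the caller, and the function name is mine:</p>

```c
#include <stdint.h>
#include <stdbool.h>

#define FLG_HEAP_ENABLE_TAIL_CHECK    0x10u
#define FLG_HEAP_ENABLE_FREE_CHECK    0x20u
#define FLG_HEAP_VALIDATE_PARAMETERS  0x40u

/* True if all three debug heap flags are set, which is what a process
 * created under a debugger typically looks like. */
bool nt_global_flag_suggests_debugger(uint32_t nt_global_flag)
{
    const uint32_t mask = FLG_HEAP_ENABLE_TAIL_CHECK |
                          FLG_HEAP_ENABLE_FREE_CHECK |
                          FLG_HEAP_VALIDATE_PARAMETERS;
    return (nt_global_flag & mask) == mask;
}
```

<p>Testing with a mask rather than for equality matters, because other unrelated flags may be set at the same time.</p>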

<hr />

<h2 id="3-peb-based-module-enumeration">3. PEB-Based Module Enumeration</h2>

<p>Walking the PEB loader lists is the standard way to enumerate modules from user mode without calling <code class="language-plaintext highlighter-rouge">EnumProcessModules</code> or <code class="language-plaintext highlighter-rouge">CreateToolhelp32Snapshot</code> - API calls that security tools monitor.</p>

<h3 id="the-walk">The Walk</h3>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">PEB</span><span class="o">*</span> <span class="n">peb</span> <span class="o">=</span> <span class="n">get_peb</span><span class="p">();</span>
<span class="n">PEB_LDR_DATA</span><span class="o">*</span> <span class="n">ldr</span> <span class="o">=</span> <span class="n">peb</span><span class="o">-&gt;</span><span class="n">Ldr</span><span class="p">;</span>
<span class="n">LIST_ENTRY</span><span class="o">*</span> <span class="n">head</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">ldr</span><span class="o">-&gt;</span><span class="n">InLoadOrderModuleList</span><span class="p">;</span>
<span class="n">LIST_ENTRY</span><span class="o">*</span> <span class="n">current</span> <span class="o">=</span> <span class="n">head</span><span class="o">-&gt;</span><span class="n">Flink</span><span class="p">;</span>

<span class="k">while</span> <span class="p">(</span><span class="n">current</span> <span class="o">!=</span> <span class="n">head</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">LDR_DATA_TABLE_ENTRY</span><span class="o">*</span> <span class="n">entry</span> <span class="o">=</span>
        <span class="n">CONTAINING_RECORD</span><span class="p">(</span><span class="n">current</span><span class="p">,</span> <span class="n">LDR_DATA_TABLE_ENTRY</span><span class="p">,</span> <span class="n">InLoadOrderLinks</span><span class="p">);</span>

    <span class="c1">// entry-&gt;BaseDllName.Buffer - module name</span>
    <span class="c1">// entry-&gt;DllBase            - base address</span>
    <span class="c1">// entry-&gt;SizeOfImage        - size</span>

    <span class="n">current</span> <span class="o">=</span> <span class="n">current</span><span class="o">-&gt;</span><span class="n">Flink</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="why-malware-does-this">Why Malware Does This</h3>

<p>Calling <code class="language-plaintext highlighter-rouge">GetModuleHandle</code> or <code class="language-plaintext highlighter-rouge">LoadLibrary</code> goes through the Windows API, which EDR products hook. PEB walking achieves the same result (finding a module base address) without touching any hooked functions.</p>

<p>This is a common pattern in malware:</p>

<ol>
  <li>Walk PEB to find <code class="language-plaintext highlighter-rouge">kernel32.dll</code> base address</li>
  <li>Parse its export table to find <code class="language-plaintext highlighter-rouge">GetProcAddress</code></li>
  <li>Use <code class="language-plaintext highlighter-rouge">GetProcAddress</code> to resolve everything else</li>
</ol>

<p>None of these steps calls an API that an EDR can intercept with a traditional user-mode hook.</p>

<h3 id="how-to-catch-it">How to Catch It</h3>

<p>Even if malware avoids API hooks, ETW providers at the kernel level still see module loads. Comparing ETW module load events with PEB module lists reveals unlinking.</p>

<p>You can also scan process memory for PE headers (<code class="language-plaintext highlighter-rouge">MZ</code> / <code class="language-plaintext highlighter-rouge">PE</code> signatures) and compare against the PEB module list. If there’s a PE header in memory that doesn’t appear in the loader lists, something was unlinked.</p>

<p>Periodically comparing PEB module lists against kernel-side structures (the VAD tree rooted at <code class="language-plaintext highlighter-rouge">EPROCESS.VadRoot</code>) works too. The kernel still tracks the memory regions even after PEB unlinking.</p>

<hr />

<h2 id="4-peb-manipulation-techniques-and-detection">4. PEB Manipulation Techniques and Detection</h2>

<h3 id="module-unlinking">Module Unlinking</h3>

<p>The most common PEB manipulation: removing a <code class="language-plaintext highlighter-rouge">LDR_DATA_TABLE_ENTRY</code> from the three loader lists. After unlinking, the module is still loaded and functional, but will not appear when enumerating modules via the PEB.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">unlink_module</span><span class="p">(</span><span class="n">LDR_DATA_TABLE_ENTRY</span><span class="o">*</span> <span class="n">entry</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">entry</span><span class="o">-&gt;</span><span class="n">InLoadOrderLinks</span><span class="p">.</span><span class="n">Blink</span><span class="o">-&gt;</span><span class="n">Flink</span> <span class="o">=</span> <span class="n">entry</span><span class="o">-&gt;</span><span class="n">InLoadOrderLinks</span><span class="p">.</span><span class="n">Flink</span><span class="p">;</span>
    <span class="n">entry</span><span class="o">-&gt;</span><span class="n">InLoadOrderLinks</span><span class="p">.</span><span class="n">Flink</span><span class="o">-&gt;</span><span class="n">Blink</span> <span class="o">=</span> <span class="n">entry</span><span class="o">-&gt;</span><span class="n">InLoadOrderLinks</span><span class="p">.</span><span class="n">Blink</span><span class="p">;</span>
    <span class="c1">// ... same for the other two lists</span>
<span class="p">}</span>
</code></pre></div></div>

<p>To catch this, compare the PEB module list against:</p>
<ul>
  <li>VAD (Virtual Address Descriptor) tree from kernel mode - memory regions are still tracked</li>
  <li>PE header scanning in the process address space</li>
  <li>ETW module load events (the load was logged before unlinking happened)</li>
</ul>
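
<p>Whichever independent source you pick, the comparison step looks the same: collect image base addresses from that source and flag the ones the loader lists do not mention. A sketch of just that step, with both inputs passed in as plain arrays (how you gather them is outside its scope, and the function name is hypothetical):</p>

```c
#include <stdint.h>
#include <stddef.h>

/* Report base addresses present in 'scanned' (e.g. from a PE header scan)
 * but absent from 'listed' (from the PEB loader lists).
 * Writes up to out_cap results to 'out' and returns the total count found. */
size_t find_unlisted_images(const uint64_t *scanned, size_t n_scanned,
                            const uint64_t *listed, size_t n_listed,
                            uint64_t *out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < n_scanned; i++) {
        int seen = 0;
        for (size_t j = 0; j < n_listed; j++) {
            if (scanned[i] == listed[j]) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            if (found < out_cap)
                out[found] = scanned[i];
            found++;
        }
    }
    return found;
}
```

<p>Treat the output as candidates, not verdicts. Images mapped as data files also carry PE headers without appearing in the loader lists, so each hit still needs a closer look.</p>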

<h3 id="imagepathname-spoofing">ImagePathName Spoofing</h3>

<p>Overwriting <code class="language-plaintext highlighter-rouge">ProcessParameters-&gt;ImagePathName</code> to a different path. Some security tools trust this field to identify what binary is running.</p>

<p>Compare with <code class="language-plaintext highlighter-rouge">EPROCESS.ImageFileName</code> in kernel mode (which holds only a truncated file name, not the full path), or with the <code class="language-plaintext highlighter-rouge">QueryFullProcessImageName</code> result, which reads from kernel structures.</p>

<h3 id="beingdebugged--ntglobalflag-clearing">BeingDebugged / NtGlobalFlag Clearing</h3>

<p>Zeroing these fields defeats any check that reads them. Analysts do this routinely so that a sample running under a debugger takes its normal code path.</p>

<p>From a debugger, compare the expected debug state with the PEB values. Tools like ScyllaHide automate the clearing for analysts, which is useful during malware analysis.</p>

<h3 id="commandline-modification">CommandLine Modification</h3>

<p>Overwriting <code class="language-plaintext highlighter-rouge">ProcessParameters-&gt;CommandLine</code> after process creation. Tools that query the PEB later see the modified version.</p>

<p>Process creation events (Sysmon Event ID 1, ETW) capture the original command line, so comparing them with the current PEB value reveals tampering.</p>

<hr />

<h2 id="5-peb-and-the-heap">5. PEB and the Heap</h2>

<p>The PEB contains pointers to process heaps:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">PVOID</span>  <span class="n">ProcessHeap</span><span class="p">;</span>              <span class="c1">// default process heap</span>
<span class="n">ULONG</span>  <span class="n">NumberOfHeaps</span><span class="p">;</span>            <span class="c1">// total heap count</span>
<span class="n">ULONG</span>  <span class="n">MaximumNumberOfHeaps</span><span class="p">;</span>     <span class="c1">// maximum heap count</span>
<span class="n">PVOID</span><span class="o">*</span> <span class="n">ProcessHeaps</span><span class="p">;</span>             <span class="c1">// array of heap pointers</span>
</code></pre></div></div>

<h3 id="heap-based-debug-detection">Heap-Based Debug Detection</h3>

<p>The process heap (from <code class="language-plaintext highlighter-rouge">GetProcessHeap()</code> or <code class="language-plaintext highlighter-rouge">PEB-&gt;ProcessHeap</code>) has debug-specific flags when a debugger creates the process:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Heap-&gt;Flags</code> should be <code class="language-plaintext highlighter-rouge">HEAP_GROWABLE</code> (0x02) normally</li>
  <li>Under a debugger: additional flags like <code class="language-plaintext highlighter-rouge">HEAP_TAIL_CHECKING_ENABLED</code>, <code class="language-plaintext highlighter-rouge">HEAP_FREE_CHECKING_ENABLED</code></li>
  <li><code class="language-plaintext highlighter-rouge">Heap-&gt;ForceFlags</code> should be 0 normally, non-zero under debugger</li>
</ul>
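<p>As a minimal sketch, the three checks above reduce to a bitmask test. The helper name <code class="language-plaintext highlighter-rouge">heap_flags_look_debugged</code> is hypothetical, and the caller is assumed to have already read <code class="language-plaintext highlighter-rouge">Flags</code> and <code class="language-plaintext highlighter-rouge">ForceFlags</code> out of the heap header. Their offsets depend on Windows version and bitness and are not shown here.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Heap flag values for the flags named in the list above. */
#define HEAP_GROWABLE               0x00000002u
#define HEAP_TAIL_CHECKING_ENABLED  0x00000020u
#define HEAP_FREE_CHECKING_ENABLED  0x00000040u

#define HEAP_DEBUG_MASK (HEAP_TAIL_CHECKING_ENABLED | HEAP_FREE_CHECKING_ENABLED)

/* Hypothetical helper: returns true when the two heap header fields carry
 * the values a debugger-created process typically has. */
bool heap_flags_look_debugged(uint32_t flags, uint32_t force_flags)
{
    /* ForceFlags is expected to be 0 outside a debug heap. */
    if (force_flags != 0)
        return true;

    /* Flags may legitimately contain HEAP_GROWABLE; only the
     * debug-specific bits are treated as a signal. */
    return (flags & HEAP_DEBUG_MASK) != 0;
}
```

<p>An anti-anti-debug plugin has to make this function return false as well as clearing <code class="language-plaintext highlighter-rouge">BeingDebugged</code>, which is why the heap check keeps working after a partial patch.</p>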

<p>This is a more reliable anti-debug check than <code class="language-plaintext highlighter-rouge">BeingDebugged</code> because many anti-anti-debug tools forget to patch heap flags.</p>

<p>This one has bitten me during analysis. You patch <code class="language-plaintext highlighter-rouge">BeingDebugged</code> thinking you’re clean, and the malware still detects you because you forgot about the heap flags.</p>

<hr />

<h2 id="6-peb-across-windows-versions">6. PEB Across Windows Versions</h2>

<p>The PEB grows with each Windows version. Microsoft adds fields but does not remove them, maintaining backward compatibility. Key additions over time:</p>

<ul>
  <li><strong>Windows Vista:</strong> Added <code class="language-plaintext highlighter-rouge">AppCompatFlags</code>, <code class="language-plaintext highlighter-rouge">AppCompatFlagsUser</code></li>
  <li><strong>Windows 8:</strong> Added <code class="language-plaintext highlighter-rouge">AppModelPolicy</code></li>
  <li><strong>Windows 10:</strong> Added <code class="language-plaintext highlighter-rouge">LeapSecondData</code>, <code class="language-plaintext highlighter-rouge">ActiveCodePage</code></li>
  <li><strong>Windows 11:</strong> Various additions for security mitigations</li>
</ul>

<h3 id="practical-implication">Practical Implication</h3>

<p>When writing tools that read the PEB, always verify the Windows version and use the correct structure layout. Using the wrong offsets causes silent reads of wrong fields. I’ve seen tools read garbage for months because of this.</p>

<p>The best approach: use <code class="language-plaintext highlighter-rouge">NtQueryInformationProcess(ProcessBasicInformation)</code> to get the PEB address, then read fields at known offsets rather than relying on a compiled structure definition that might not match the running OS version.</p>
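<p>A rough sketch of the offset-table idea, operating on a byte copy of the PEB that some other code has already read out of the target process. The type and function names are hypothetical. The two fields shown sit at the same offsets across the Windows versions I know of (they differ only by bitness); later fields would need one table entry per build.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical offset table: one instance per supported layout. */
typedef struct {
    size_t being_debugged;  /* UCHAR */
    size_t nt_global_flag;  /* ULONG */
} PebOffsets;

static const PebOffsets PEB_OFFSETS_X86 = { 0x02, 0x68 };
static const PebOffsets PEB_OFFSETS_X64 = { 0x02, 0xBC };

/* Bounds-checked read from the PEB copy, so a short or truncated
 * snapshot yields an error instead of a silent garbage read. */
bool peb_read(const uint8_t *peb, size_t peb_len,
              size_t offset, void *out, size_t size)
{
    if (offset > peb_len || size > peb_len - offset)
        return false;
    memcpy(out, peb + offset, size);
    return true;
}

/* Reads the two debug-related fields using the selected layout. */
bool peb_debug_fields(const uint8_t *peb, size_t peb_len,
                      const PebOffsets *off,
                      uint8_t *being_debugged, uint32_t *nt_global_flag)
{
    return peb_read(peb, peb_len, off->being_debugged,
                    being_debugged, sizeof *being_debugged)
        && peb_read(peb, peb_len, off->nt_global_flag,
                    nt_global_flag, sizeof *nt_global_flag);
}
```

<p>Selecting the table at run time, rather than baking one structure definition into the binary, is what keeps the tool from reading the wrong field when the layout does not match.</p>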

<hr />

<h2 id="7-defensive-tooling-using-peb-analysis">7. Defensive Tooling Using PEB Analysis</h2>

<h3 id="integrity-monitoring">Integrity Monitoring</h3>

<p>A lightweight detection approach: periodically snapshot PEB state and compare:</p>

<ol>
  <li>Module list: any entries added or removed since last check?</li>
  <li><code class="language-plaintext highlighter-rouge">ImagePathName</code>: still matches the real binary path?</li>
  <li><code class="language-plaintext highlighter-rouge">BeingDebugged</code> / <code class="language-plaintext highlighter-rouge">NtGlobalFlag</code>: consistent with actual debug state?</li>
  <li><code class="language-plaintext highlighter-rouge">CommandLine</code>: matches process creation event?</li>
</ol>

<p>Changes to any of these fields outside of expected operations are strong indicators of compromise or tampering.</p>
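<p>The comparison step could look roughly like this. The <code class="language-plaintext highlighter-rouge">PebSnapshot</code> layout and every name in it are assumptions made for illustration: a real collector would fill it from the target process, and the strings would start out as UTF-16 <code class="language-plaintext highlighter-rouge">UNICODE_STRING</code> values rather than the NUL-terminated narrow strings used here.</p>

```c
#include <stdint.h>
#include <string.h>

/* One bit per check in the list above. */
enum {
    PEB_CHANGED_MODULES      = 1,
    PEB_CHANGED_IMAGE_PATH   = 2,
    PEB_CHANGED_DEBUG_STATE  = 4,
    PEB_CHANGED_COMMAND_LINE = 8
};

/* Hypothetical snapshot of the PEB fields worth tracking. */
typedef struct {
    uint32_t module_count;       /* entries in the loader module list */
    uint64_t module_list_hash;   /* hash over module bases and names */
    char     image_path[260];    /* ImagePathName, NUL-terminated */
    char     command_line[512];  /* CommandLine, NUL-terminated */
    uint8_t  being_debugged;
    uint32_t nt_global_flag;
} PebSnapshot;

/* Returns a bitmask of the fields that differ between two snapshots. */
unsigned peb_snapshot_diff(const PebSnapshot *prev, const PebSnapshot *cur)
{
    unsigned changed = 0;

    if (prev->module_count != cur->module_count ||
        prev->module_list_hash != cur->module_list_hash)
        changed |= PEB_CHANGED_MODULES;

    if (strcmp(prev->image_path, cur->image_path) != 0)
        changed |= PEB_CHANGED_IMAGE_PATH;

    if (prev->being_debugged != cur->being_debugged ||
        prev->nt_global_flag != cur->nt_global_flag)
        changed |= PEB_CHANGED_DEBUG_STATE;

    if (strcmp(prev->command_line, cur->command_line) != 0)
        changed |= PEB_CHANGED_COMMAND_LINE;

    return changed;
}
```

<p>A module bit on its own is usually benign, since processes load DLLs all the time. The image path and command line bits are the ones that should almost never fire.</p>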

<h3 id="edr-integration">EDR Integration</h3>

<p>Modern EDRs combine PEB inspection with other telemetry:</p>

<ul>
  <li>PEB module list + VAD scan + ETW events = comprehensive module visibility</li>
  <li>PEB <code class="language-plaintext highlighter-rouge">CommandLine</code> + Sysmon creation event = tamper detection</li>
  <li>PEB heap analysis + debug state = environment fingerprinting detection</li>
</ul>

<h3 id="forensic-analysis">Forensic Analysis</h3>

<p>During incident response, dumping the PEB provides immediate context:</p>

<ul>
  <li>What modules are loaded (and what’s hiding)?</li>
  <li>What was the real command line?</li>
  <li>Is the process environment modified?</li>
  <li>Are there signs of debugger evasion?</li>
</ul>

<p>Tools like Volatility can extract PEB data from memory dumps, making this analysis possible even on dead systems.</p>

<hr />

<h2 id="conclusion">Conclusion</h2>

<p>The PEB is small but it comes up everywhere. Attackers read it to avoid API hooks. They modify it to hide their presence. Defenders read it to catch that modification.</p>

<p>The big points: the PEB is user-mode writable, so always assume it can be tampered with. Module enumeration via PEB is a malware staple because it avoids hooked APIs. Defensive detection works by comparing PEB state against kernel-side ground truth. And debug detection through PEB goes well beyond <code class="language-plaintext highlighter-rouge">IsDebuggerPresent</code> - heap flags and <code class="language-plaintext highlighter-rouge">NtGlobalFlag</code> catch a lot of analysts off guard.</p>

<p>Every detection rule for PEB tampering starts with knowing how the tampering actually works.</p>]]></content><author><name></name></author><category term="windows" /><category term="internals" /><category term="defense" /><category term="malware-analysis" /><category term="low-level" /><summary type="html"><![CDATA[Every process on Windows has a Process Environment Block (PEB). Most developers never interact with it directly - the Win32 API abstracts everything away. But if you’re doing malware analysis, EDR engineering, or any kind of defensive work, the PEB comes up constantly.]]></summary></entry><entry><title type="html">VMT Hooking: How It Works and How to Detect It</title><link href="https://youssix.github.io/2026/05/12/vmt-hooking-detection/" rel="alternate" type="text/html" title="VMT Hooking: How It Works and How to Detect It" /><published>2026-05-12T00:00:00+00:00</published><updated>2026-05-12T00:00:00+00:00</updated><id>https://youssix.github.io/2026/05/12/vmt-hooking-detection</id><content type="html" xml:base="https://youssix.github.io/2026/05/12/vmt-hooking-detection/"><![CDATA[<p>Virtual Method Table (VMT) hooking is one of the oldest and most reliable hooking techniques on Windows. It exploits a fundamental C++ runtime mechanism - the vtable - to redirect virtual function calls without patching any code bytes. No code modification means most integrity scanners miss it entirely.</p>

<hr />

<h2 id="1-c-virtual-method-tables">1. C++ Virtual Method Tables</h2>

<p>Every C++ class with at least one virtual function has a vtable: a static array of function pointers, one per virtual method, in declaration order.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">IRenderer</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">Initialize</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>  <span class="c1">// vtable[0]</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">BeginFrame</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>  <span class="c1">// vtable[1]</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">EndFrame</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>    <span class="c1">// vtable[2]</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">Present</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>     <span class="c1">// vtable[3]</span>
<span class="p">};</span>
</code></pre></div></div>

<p>When an object is instantiated, its first 8 bytes (on x64) are a pointer to the class vtable:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Object in memory:
  +0x00: vtable pointer → [Initialize, BeginFrame, EndFrame, Present]
  +0x08: member data...
</code></pre></div></div>

<p>A virtual call like <code class="language-plaintext highlighter-rouge">renderer-&gt;Present()</code> compiles to:</p>

<pre><code class="language-asm">mov  rax, [rcx]          ; load vtable pointer from object
call [rax + 0x18]        ; call vtable[3] (Present)
</code></pre>

<p>The CPU reads the vtable pointer, indexes into the table, and calls whatever address it finds. Unless a mitigation such as Control Flow Guard instruments the call site, nothing validates that the address is legitimate.</p>
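<p>The same dispatch can be modelled in plain C, which makes the lack of validation easy to see. This is only an illustration of the layout, not what a C++ compiler literally emits, and all names are made up for the example.</p>

```c
typedef struct Renderer Renderer;
typedef int (*RendererMethod)(Renderer *self);

struct Renderer {
    const RendererMethod *vtable;  /* +0x00: what "mov rax, [rcx]" loads */
    int frames;                    /* +0x08 on x64: member data */
};

enum { SLOT_INITIALIZE, SLOT_BEGIN_FRAME, SLOT_END_FRAME, SLOT_PRESENT, SLOT_COUNT };

/* Each method returns its own slot number so the dispatch is observable. */
static int renderer_initialize(Renderer *self)  { self->frames = 0; return SLOT_INITIALIZE; }
static int renderer_begin_frame(Renderer *self) { (void)self; return SLOT_BEGIN_FRAME; }
static int renderer_end_frame(Renderer *self)   { (void)self; return SLOT_END_FRAME; }
static int renderer_present(Renderer *self)     { self->frames++; return SLOT_PRESENT; }

/* One table per class, shared by every instance. */
static const RendererMethod RENDERER_VTABLE[SLOT_COUNT] = {
    renderer_initialize,
    renderer_begin_frame,
    renderer_end_frame,
    renderer_present,
};

void renderer_construct(Renderer *r)
{
    r->vtable = RENDERER_VTABLE;
    r->frames = 0;
}

/* Equivalent of "call [rax + slot * 8]": load the table, index it,
 * and call whatever pointer is stored there. */
int renderer_dispatch(Renderer *r, int slot)
{
    return r->vtable[slot](r);
}
```

<p>Nothing in <code class="language-plaintext highlighter-rouge">renderer_dispatch</code> knows or checks which function it is about to call, and that is the property VMT hooking relies on.</p>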

<hr />

<h2 id="2-how-vmt-hooking-works">2. How VMT Hooking Works</h2>

<p>VMT hooking replaces an entry in the vtable with a pointer to a different function. After the hook, every virtual call to that method on <em>any object using that vtable</em> is redirected.</p>

<h3 id="method-1-direct-vtable-patch">Method 1: Direct Vtable Patch</h3>

<p>Overwrite a single entry in the vtable:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span><span class="o">**</span> <span class="n">vtable</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">void</span><span class="o">***</span><span class="p">)</span><span class="n">target_object</span><span class="p">;</span>

<span class="c1">// Save original</span>
<span class="n">original_Present</span> <span class="o">=</span> <span class="n">vtable</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>

<span class="c1">// The vtable is typically in .rdata (read-only), so change protection first</span>
<span class="n">DWORD</span> <span class="n">old_protect</span><span class="p">;</span>
<span class="n">VirtualProtect</span><span class="p">(</span><span class="o">&amp;</span><span class="n">vtable</span><span class="p">[</span><span class="mi">3</span><span class="p">],</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">),</span> <span class="n">PAGE_READWRITE</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">old_protect</span><span class="p">);</span>

<span class="c1">// Replace entry</span>
<span class="n">vtable</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">hooked_Present</span><span class="p">;</span>

<span class="c1">// Restore protection</span>
<span class="n">VirtualProtect</span><span class="p">(</span><span class="o">&amp;</span><span class="n">vtable</span><span class="p">[</span><span class="mi">3</span><span class="p">],</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">),</span> <span class="n">old_protect</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">old_protect</span><span class="p">);</span>
</code></pre></div></div>

<p><strong>Pros:</strong> Simple, affects all objects sharing this vtable.
<strong>Cons:</strong> Modifies the original vtable in <code class="language-plaintext highlighter-rouge">.rdata</code>, which can be detected by integrity checks.</p>

<h3 id="method-2-vtable-replacement">Method 2: Vtable Replacement</h3>

<p>Allocate a new vtable, copy all entries, modify the target entry, then point the object’s vtable pointer to the new table:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span><span class="o">**</span> <span class="n">original_vtable</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">void</span><span class="o">***</span><span class="p">)</span><span class="n">target_object</span><span class="p">;</span>
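<span class="c1">// count_vtable_entries() is a placeholder: a vtable stores no length, so the count must come from headers or heuristics</span>
<span class="kt">int</span> <span class="n">vtable_size</span> <span class="o">=</span> <span class="n">count_vtable_entries</span><span class="p">(</span><span class="n">original_vtable</span><span class="p">);</span>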

<span class="c1">// Allocate new vtable</span>
<span class="kt">void</span><span class="o">**</span> <span class="n">new_vtable</span> <span class="o">=</span> <span class="p">(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="n">malloc</span><span class="p">(</span><span class="n">vtable_size</span> <span class="o">*</span> <span class="nf">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">));</span>
<span class="n">memcpy</span><span class="p">(</span><span class="n">new_vtable</span><span class="p">,</span> <span class="n">original_vtable</span><span class="p">,</span> <span class="n">vtable_size</span> <span class="o">*</span> <span class="nf">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">));</span>

<span class="c1">// Hook one entry in the copy</span>
<span class="n">new_vtable</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">hooked_Present</span><span class="p">;</span>

<span class="c1">// Point object to new vtable</span>
<span class="o">*</span><span class="p">(</span><span class="kt">void</span><span class="o">***</span><span class="p">)</span><span class="n">target_object</span> <span class="o">=</span> <span class="n">new_vtable</span><span class="p">;</span>
</code></pre></div></div>

<p><strong>Pros:</strong> Original vtable is untouched. Only one object is affected.
<strong>Cons:</strong> The new vtable is in heap memory, which is unusual. Only affects the specific object instance.</p>
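<p>The difference in scope between the two methods is easier to see in a small portable model. This is a simulation in plain C with invented names, not Windows code: the shared table is deliberately writable, standing in for <code class="language-plaintext highlighter-rouge">.rdata</code> after its protection has been changed.</p>

```c
#include <stdlib.h>
#include <string.h>

typedef struct Object Object;
typedef int (*Method)(Object *self);

struct Object {
    Method *vtable;  /* first field, like a C++ object's vptr */
};

enum { SLOT_PRESENT = 3, SLOT_COUNT = 4 };

static int unused_slot(Object *self)      { (void)self; return 0; }
static int original_present(Object *self) { (void)self; return 100; }
static int hooked_present(Object *self)   { (void)self; return 200; }

/* Class-wide table shared by every object of this type. */
static Method CLASS_VTABLE[SLOT_COUNT] = {
    unused_slot, unused_slot, unused_slot, original_present
};

void object_init(Object *obj)  { obj->vtable = CLASS_VTABLE; }
int  call_present(Object *obj) { return obj->vtable[SLOT_PRESENT](obj); }

/* Method 1: patch the shared table in place. Every object is affected. */
void hook_by_patch(Object *obj)
{
    obj->vtable[SLOT_PRESENT] = hooked_present;
}

/* Method 2: give one object a private copy of the table.
 * Returns the copy so the caller can free it, or NULL on failure. */
Method *hook_by_replacement(Object *obj)
{
    Method *copy = malloc(sizeof CLASS_VTABLE);
    if (copy == NULL)
        return NULL;
    memcpy(copy, obj->vtable, sizeof CLASS_VTABLE);
    copy[SLOT_PRESENT] = hooked_present;
    obj->vtable = copy;  /* this pointer now refers to heap memory */
    return copy;
}
```

<p>After <code class="language-plaintext highlighter-rouge">hook_by_replacement</code>, only the one object’s table pointer has changed, and it now points at the heap. After <code class="language-plaintext highlighter-rouge">hook_by_patch</code>, no pointer has moved, but every object that shares the table calls the hook.</p>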

<hr />

<h2 id="3-why-vmt-hooks-are-hard-to-detect">3. Why VMT Hooks Are Hard to Detect</h2>

<p>Inline hooks (patching the first bytes of a function with a <code class="language-plaintext highlighter-rouge">JMP</code>) are well-understood and widely detected:</p>

<ul>
  <li>Code integrity scanners compare function prologues against known-good copies</li>
  <li><code class="language-plaintext highlighter-rouge">JMP</code> or <code class="language-plaintext highlighter-rouge">INT3</code> instructions at function entry are obvious indicators</li>
  <li>ETW and kernel callbacks can detect code modification in certain scenarios</li>
</ul>

<p>VMT hooks avoid all of this:</p>

<ul>
  <li><strong>No code modification.</strong> The function code is untouched. Only a data pointer changes.</li>
  <li><strong>No executable memory changes.</strong> The modification is in a data section or on the heap.</li>
  <li><strong>Legitimate-looking calls.</strong> The CPU’s virtual dispatch mechanism works normally. The call instruction hasn’t changed.</li>
  <li><strong>No unusual instructions.</strong> There is no <code class="language-plaintext highlighter-rouge">JMP</code> to a trampoline, no <code class="language-plaintext highlighter-rouge">INT3</code>, no detour.</li>
</ul>

<p>This is why traditional code integrity scanning doesn’t catch VMT hooks.</p>

<hr />

<h2 id="4-real-world-attack-patterns">4. Real-World Attack Patterns</h2>

<p>Here’s where VMT hooks actually show up in practice.</p>

<h3 id="com-object-hooking">COM Object Hooking</h3>

<p>Windows COM (Component Object Model) is built on vtables. Every COM interface is a vtable. Hooking a COM object’s vtable redirects interface method calls:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>IUnknown vtable:
  [0] QueryInterface
  [1] AddRef
  [2] Release

IDXGISwapChain vtable:
  [0] QueryInterface
  [1] AddRef
  [2] Release
  ...
  [8] Present        ← commonly hooked
</code></pre></div></div>

<p>DXGI hooking via <code class="language-plaintext highlighter-rouge">Present</code> is used by game overlays (Steam, Discord), screen recorders, and performance tools. But the same technique can be used by malware to intercept graphics output or inject visual elements.</p>

<h3 id="browser-com-hooking">Browser COM Hooking</h3>

<p>Browsers expose COM interfaces for automation. Hooking these interfaces allows intercepting web traffic, modifying page content, or stealing credentials - all through vtable manipulation that won’t trigger code integrity alerts.</p>

<h3 id="security-product-bypass">Security Product Bypass</h3>

<p>Some security products expose COM or C++ interfaces that can be hooked via VMT. If a security scanning function is virtual, replacing its vtable entry effectively disables that scan while the product continues to report as healthy.</p>

<hr />

<h2 id="5-detection-strategies">5. Detection Strategies</h2>

<h3 id="strategy-1-vtable-pointer-validation">Strategy 1: Vtable Pointer Validation</h3>

<p>For known objects, verify that the vtable pointer points to the expected <code class="language-plaintext highlighter-rouge">.rdata</code> section of the correct module:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bool</span> <span class="nf">validate_vtable</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span> <span class="n">object</span><span class="p">,</span> <span class="n">HMODULE</span> <span class="n">expected_module</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">void</span><span class="o">**</span> <span class="n">vtable</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">void</span><span class="o">***</span><span class="p">)</span><span class="n">object</span><span class="p">;</span>

    <span class="n">MODULEINFO</span> <span class="n">module_info</span><span class="p">;</span>
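    <span class="c1">// Fail closed: without the module bounds nothing can be validated</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">GetModuleInformation</span><span class="p">(</span><span class="n">GetCurrentProcess</span><span class="p">(),</span> <span class="n">expected_module</span><span class="p">,</span>
                              <span class="o">&amp;</span><span class="n">module_info</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">module_info</span><span class="p">)))</span>
        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>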

    <span class="kt">uintptr_t</span> <span class="n">vtable_addr</span> <span class="o">=</span> <span class="p">(</span><span class="kt">uintptr_t</span><span class="p">)</span><span class="n">vtable</span><span class="p">;</span>
    <span class="kt">uintptr_t</span> <span class="n">module_start</span> <span class="o">=</span> <span class="p">(</span><span class="kt">uintptr_t</span><span class="p">)</span><span class="n">module_info</span><span class="p">.</span><span class="n">lpBaseOfDll</span><span class="p">;</span>
    <span class="kt">uintptr_t</span> <span class="n">module_end</span> <span class="o">=</span> <span class="n">module_start</span> <span class="o">+</span> <span class="n">module_info</span><span class="p">.</span><span class="n">SizeOfImage</span><span class="p">;</span>

    <span class="c1">// Vtable should be within the module's image</span>
    <span class="k">return</span> <span class="p">(</span><span class="n">vtable_addr</span> <span class="o">&gt;=</span> <span class="n">module_start</span> <span class="o">&amp;&amp;</span> <span class="n">vtable_addr</span> <span class="o">&lt;</span> <span class="n">module_end</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If the vtable pointer points to heap memory or an unknown module, the object’s vtable has likely been replaced.</p>

<h3 id="strategy-2-vtable-entry-validation">Strategy 2: Vtable Entry Validation</h3>

<p>Even if the vtable itself is in the correct location, individual entries might be patched. Validate that each entry points to the expected module:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bool</span> <span class="nf">validate_vtable_entry</span><span class="p">(</span><span class="kt">void</span><span class="o">**</span> <span class="n">vtable</span><span class="p">,</span> <span class="kt">int</span> <span class="n">index</span><span class="p">,</span> <span class="n">HMODULE</span> <span class="n">expected_module</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">func_ptr</span> <span class="o">=</span> <span class="n">vtable</span><span class="p">[</span><span class="n">index</span><span class="p">];</span>

    <span class="n">MODULEINFO</span> <span class="n">module_info</span><span class="p">;</span>
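    <span class="c1">// Fail closed: without the module bounds nothing can be validated</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">GetModuleInformation</span><span class="p">(</span><span class="n">GetCurrentProcess</span><span class="p">(),</span> <span class="n">expected_module</span><span class="p">,</span>
                              <span class="o">&amp;</span><span class="n">module_info</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">module_info</span><span class="p">)))</span>
        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>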

    <span class="kt">uintptr_t</span> <span class="n">func_addr</span> <span class="o">=</span> <span class="p">(</span><span class="kt">uintptr_t</span><span class="p">)</span><span class="n">func_ptr</span><span class="p">;</span>
    <span class="kt">uintptr_t</span> <span class="n">module_start</span> <span class="o">=</span> <span class="p">(</span><span class="kt">uintptr_t</span><span class="p">)</span><span class="n">module_info</span><span class="p">.</span><span class="n">lpBaseOfDll</span><span class="p">;</span>
    <span class="kt">uintptr_t</span> <span class="n">module_end</span> <span class="o">=</span> <span class="n">module_start</span> <span class="o">+</span> <span class="n">module_info</span><span class="p">.</span><span class="n">SizeOfImage</span><span class="p">;</span>

    <span class="k">return</span> <span class="p">(</span><span class="n">func_addr</span> <span class="o">&gt;=</span> <span class="n">module_start</span> <span class="o">&amp;&amp;</span> <span class="n">func_addr</span> <span class="o">&lt;</span> <span class="n">module_end</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>An entry pointing outside the expected module is a strong hook indicator.</p>

<h3 id="strategy-3-rdata-integrity-comparison">Strategy 3: .rdata Integrity Comparison</h3>

<p>Compare the in-memory vtable against the on-disk copy of the module:</p>

<ol>
  <li>Load the PE file from disk</li>
  <li>Find the <code class="language-plaintext highlighter-rouge">.rdata</code> section</li>
  <li>Locate the vtable by its known offset or by matching RTTI data</li>
  <li>Compare each entry against the in-memory version, after rebasing the on-disk addresses from the preferred image base to the actual load address</li>
</ol>

<p>Any discrepancy indicates tampering. This is the most thorough approach but requires knowing which vtables to check.</p>
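<p>A sketch of the final comparison step, assuming a 64-bit image and assuming the PE parsing has already produced both arrays. On disk, vtable entries are absolute addresses relative to the preferred image base, so they have to be shifted by the load delta before they can match the relocated copy in memory. The function name and parameters are hypothetical.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Compares on-disk vtable entries with their in-memory counterparts.
 * Returns the index of the first slot that does not match after
 * rebasing, or -1 when every slot matches. */
ptrdiff_t first_patched_slot(const uint64_t *disk_entries,
                             const uint64_t *mem_entries,
                             size_t count,
                             uint64_t preferred_base,
                             uint64_t loaded_base)
{
    /* Unsigned subtraction wraps, so this also works when the module
     * was loaded below its preferred base. */
    uint64_t delta = loaded_base - preferred_base;

    for (size_t i = 0; i < count; i++) {
        if (disk_entries[i] + delta != mem_entries[i])
            return (ptrdiff_t)i;
    }
    return -1;
}
```

<p>Skipping the rebase is a common source of false positives here: with ASLR enabled, every entry would appear patched.</p>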

<h3 id="strategy-4-rtti-run-time-type-information-validation">Strategy 4: RTTI (Run-Time Type Information) Validation</h3>

<p>MSVC stores RTTI data adjacent to the vtable. The slot at index -1, immediately before the first function pointer, points to a <code class="language-plaintext highlighter-rouge">CompleteObjectLocator</code> structure, which contains class hierarchy information.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>vtable[-1] → CompleteObjectLocator
               → TypeDescriptor (class name string)
               → ClassHierarchyDescriptor
</code></pre></div></div>

<p>If the RTTI chain is missing, corrupted, or points to unexpected locations, the vtable has likely been tampered with or replaced. Vtables of polymorphic classes in MSVC builds normally carry valid RTTI, unless the module was compiled with <code class="language-plaintext highlighter-rouge">/GR-</code>.</p>
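<p>A minimal sanity check on the locator might look like the following. The structure layout is an assumption based on public reverse-engineering notes for 64-bit MSVC images, not on official documentation, and the function name is hypothetical. On x64 the locator stores image-relative offsets, including one to itself, which gives a cheap consistency test.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the x64 CompleteObjectLocator (all RVAs are
 * offsets from the image base of the owning module). */
typedef struct {
    uint32_t signature;             /* expected to be 1 on x64 */
    uint32_t offset;                /* offset of this vtable in the object */
    uint32_t cd_offset;             /* constructor displacement offset */
    int32_t  type_descriptor_rva;   /* -> TypeDescriptor */
    int32_t  class_descriptor_rva;  /* -> ClassHierarchyDescriptor */
    int32_t  self_rva;              /* RVA of this locator itself */
} CompleteObjectLocator64;

/* col         : a copy of the locator read from vtable[-1]
 * col_address : the address the locator was read from
 * image_base / image_size : bounds of the expected module */
bool col_looks_valid(const CompleteObjectLocator64 *col,
                     uint64_t col_address,
                     uint64_t image_base,
                     uint32_t image_size)
{
    if (col->signature != 1)
        return false;

    /* The self-reference must resolve back to the expected module. */
    if (col->self_rva <= 0 ||
        col_address - (uint32_t)col->self_rva != image_base)
        return false;

    /* Both descriptors must lie inside the image. */
    if (col->type_descriptor_rva <= 0 ||
        (uint32_t)col->type_descriptor_rva >= image_size)
        return false;
    if (col->class_descriptor_rva <= 0 ||
        (uint32_t)col->class_descriptor_rva >= image_size)
        return false;

    return true;
}
```

<p>A heap-allocated replacement table that was copied starting at slot 0 has no meaningful value at index -1, so a check like this tends to fail on it immediately.</p>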

<h3 id="strategy-5-memory-region-analysis">Strategy 5: Memory Region Analysis</h3>

<p>Vtables belong in <code class="language-plaintext highlighter-rouge">.rdata</code> (read-only initialized data). Check the memory protection of the region containing the vtable:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">MEMORY_BASIC_INFORMATION</span> <span class="n">mbi</span><span class="p">;</span>
<span class="n">VirtualQuery</span><span class="p">(</span><span class="n">vtable</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">mbi</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">mbi</span><span class="p">));</span>

<span class="k">if</span> <span class="p">(</span><span class="n">mbi</span><span class="p">.</span><span class="n">Protect</span> <span class="o">!=</span> <span class="n">PAGE_READONLY</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// .rdata should be PAGE_READONLY</span>
    <span class="c1">// PAGE_READWRITE suggests tampering</span>
<span class="p">}</span>

<span class="k">if</span> <span class="p">(</span><span class="n">mbi</span><span class="p">.</span><span class="n">Type</span> <span class="o">==</span> <span class="n">MEM_PRIVATE</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// vtable is in heap/private memory, not in a module image</span>
    <span class="c1">// strong indicator of vtable replacement</span>
<span class="p">}</span>
</code></pre></div></div>

<hr />

<h2 id="6-monitoring-and-alerting">6. Monitoring and Alerting</h2>

<h3 id="periodic-integrity-scans">Periodic Integrity Scans</h3>

<p>For high-value targets (security-critical COM interfaces, graphics subsystem objects), periodically validate vtable integrity:</p>

<ol>
  <li>Enumerate known critical objects</li>
  <li>Validate vtable pointers and entries against expected modules</li>
  <li>Check RTTI integrity</li>
  <li>Log and alert on deviations</li>
</ol>

<h3 id="etw-integration">ETW Integration</h3>

<p>ETW providers can be configured to monitor for:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">VirtualProtect</code> calls targeting <code class="language-plaintext highlighter-rouge">.rdata</code> sections (needed for direct vtable patches)</li>
  <li>Memory allocation near module image ranges (might indicate vtable replacement setup)</li>
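  <li>Writes to known vtable locations, which ETW cannot observe directly (a plain <code class="language-plaintext highlighter-rouge">memcpy</code> raises no event) and which must be inferred from the protection changes around them</li>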
</ul>

<h3 id="kernel-level-monitoring">Kernel-Level Monitoring</h3>

<p>A kernel driver can:</p>
<ul>
  <li>Monitor page table permission changes for <code class="language-plaintext highlighter-rouge">.rdata</code> pages</li>
  <li>Use hypervisor-based EPT monitoring (see <a href="/2026/05/18/ept-internals/">EPT Internals</a>) to detect writes to vtable memory without depending on user-mode integrity checks</li>
</ul>

<p>This is the strongest detection approach, as it cannot be subverted from user mode.</p>

<hr />

<h2 id="7-vmt-hooking-vs-other-hooking-techniques">7. VMT Hooking vs Other Hooking Techniques</h2>

<table>
  <thead>
    <tr>
      <th>Aspect</th>
      <th>Inline Hook</th>
      <th>IAT Hook</th>
      <th>VMT Hook</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Modifies code</td>
      <td>Yes</td>
      <td>No</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Modifies data</td>
      <td>No</td>
      <td>Yes (IAT)</td>
      <td>Yes (vtable)</td>
    </tr>
    <tr>
      <td>Scope</td>
      <td>All callers</td>
      <td>Import callers only</td>
      <td>Virtual call callers</td>
    </tr>
    <tr>
      <td>Code integrity detection</td>
      <td>Easy</td>
      <td>Medium</td>
      <td>Hard</td>
    </tr>
    <tr>
      <td>Requires C++ target</td>
      <td>No</td>
      <td>No</td>
      <td>Yes</td>
    </tr>
    <tr>
      <td>Detectable by ETW</td>
      <td>Partially</td>
      <td>Partially</td>
      <td>Harder</td>
    </tr>
  </tbody>
</table>

<p>VMT hooking occupies a specific niche: it requires a C++ virtual interface, but in return it is the most difficult hook type to detect with standard code scanning tools. For defenders, this means that code integrity alone is not sufficient - data integrity must also be verified.</p>

<hr />

<h2 id="conclusion">Conclusion</h2>

<p>VMT hooking exploits a core C++ mechanism that was never designed with adversarial use in mind. The vtable is trust-by-convention: the runtime assumes function pointers are legitimate because nothing validates them.</p>

<p>The bottom line: code integrity scanning alone does not catch VMT hooks. You need data integrity checks too. Vtable pointers should point to <code class="language-plaintext highlighter-rouge">.rdata</code> in the expected module, not to heap or unknown regions. RTTI validation helps for MSVC-compiled binaries. COM interfaces are vtable-based and represent a broad attack surface. And if you really want the strongest detection, hypervisor-based EPT monitoring operates below anything user-mode can subvert.</p>

<p>If you’re only scanning for inline hooks, you’re leaving a gap that adversaries know about.</p>]]></content><author><name></name></author><category term="windows" /><category term="hooking" /><category term="detection" /><category term="cpp" /><category term="defense" /><summary type="html"><![CDATA[Virtual Method Table (VMT) hooking is one of the oldest and most reliable hooking techniques on Windows. It exploits a fundamental C++ runtime mechanism - the vtable - to redirect virtual function calls without patching any code bytes. No code modification means most integrity scanners miss it entirely.]]></summary></entry><entry><title type="html">CRT vs NoCRT: How the C Runtime Helps Defenders Catch Injected DLLs</title><link href="https://youssix.github.io/2026/05/10/crt-nocrt-detection/" rel="alternate" type="text/html" title="CRT vs NoCRT: How the C Runtime Helps Defenders Catch Injected DLLs" /><published>2026-05-10T00:00:00+00:00</published><updated>2026-05-10T00:00:00+00:00</updated><id>https://youssix.github.io/2026/05/10/crt-nocrt-detection</id><content type="html" xml:base="https://youssix.github.io/2026/05/10/crt-nocrt-detection/"><![CDATA[<p>When an attacker injects a DLL into a process, one of the first decisions they make - whether they realize it or not - is whether to link the C Runtime Library (CRT). That decision leaves distinct forensic traces that defenders can use to detect the injection.</p>

<hr />

<h2 id="1-what-the-crt-actually-does">1. What the CRT Actually Does</h2>

<p>When you compile a DLL with Visual Studio using the default settings, the C Runtime Library is linked in. The CRT is not just <code class="language-plaintext highlighter-rouge">printf</code> and <code class="language-plaintext highlighter-rouge">malloc</code> - it’s a significant initialization framework that runs before your code.</p>

<p>When a CRT-linked DLL is loaded, this happens before <code class="language-plaintext highlighter-rouge">DllMain</code> executes:</p>

<ol>
  <li><strong>Security cookie initialization</strong> (<code class="language-plaintext highlighter-rouge">__security_init_cookie</code>) - generates a random stack canary value</li>
  <li><strong>CRT heap initialization</strong> - sets up the CRT’s internal heap</li>
  <li><strong>Thread-local storage initialization</strong> - initializes TLS slots</li>
  <li><strong>Atexit/onexit registration</strong> - prepares cleanup handlers</li>
  <li><strong>Floating-point initialization</strong> - configures FPU state</li>
  <li><strong>Global C++ constructor calls</strong> - runs static object constructors (<code class="language-plaintext highlighter-rouge">_initterm</code>)</li>
</ol>

<p>The actual entry point of a CRT-linked DLL is not <code class="language-plaintext highlighter-rouge">DllMain</code> - it is <code class="language-plaintext highlighter-rouge">_DllMainCRTStartup</code>, which does all of the above and then calls your <code class="language-plaintext highlighter-rouge">DllMain</code>.</p>

<h3 id="the-security-cookie">The Security Cookie</h3>

<p>The security cookie (<code class="language-plaintext highlighter-rouge">/GS</code> flag, enabled by default) is the most visible CRT artifact. The function <code class="language-plaintext highlighter-rouge">__security_init_cookie</code> generates a random value at DLL load time and stores it in <code class="language-plaintext highlighter-rouge">__security_cookie</code>. Every function that uses stack buffers places this value on the stack and validates it before returning.</p>

<p>The initialization is easy to spot in a disassembler:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>_DllMainCRTStartup:
    call    __security_init_cookie    ; ← distinctive CRT artifact
    jmp     dllmain_dispatch
</code></pre></div></div>

<p>This single call is one of the most reliable indicators that a DLL was compiled with the CRT.</p>

<hr />

<h2 id="2-crt-linked-dlls-what-defenders-see">2. CRT-Linked DLLs: What Defenders See</h2>

<p>A DLL compiled with the CRT has a recognizable fingerprint. Here’s what to look for.</p>

<h3 id="import-table">Import Table</h3>

<p>CRT-linked DLLs import from CRT libraries. The specific imports depend on the linking mode:</p>

<p><strong>Dynamic CRT (<code class="language-plaintext highlighter-rouge">/MD</code>):</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">vcruntime140.dll</code></li>
  <li><code class="language-plaintext highlighter-rouge">ucrtbase.dll</code>, usually imported through the <code class="language-plaintext highlighter-rouge">api-ms-win-crt-*.dll</code> API set names that the loader resolves to it</li>
  <li>Possibly <code class="language-plaintext highlighter-rouge">msvcp140.dll</code> for C++ standard library</li>
</ul>

<p><strong>Static CRT (<code class="language-plaintext highlighter-rouge">/MT</code>):</strong></p>
<ul>
  <li>No CRT DLL imports (everything is compiled into the binary)</li>
  <li>But the code patterns are still present in the <code class="language-plaintext highlighter-rouge">.text</code> section</li>
</ul>
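
<p>As a rough illustration of the import-table check, the following Python sketch classifies a module from the DLL names in its import directory. The function names and the pattern list are assumptions made for this example, and it expects the names to have been extracted already by whatever PE parser you use.</p>

```python
from fnmatch import fnmatch

# DLL name patterns that indicate a dynamically linked MSVC runtime (/MD).
# The list is illustrative, not exhaustive.
DYNAMIC_CRT_PATTERNS = (
    "vcruntime140*.dll",
    "ucrtbase.dll",
    "api-ms-win-crt-*.dll",
    "msvcp140*.dll",
)


def crt_import_names(imported_dlls):
    """Return the imported DLL names that match a dynamic-CRT pattern."""
    matches = []
    for name in imported_dlls:
        lowered = name.lower()
        if any(fnmatch(lowered, pattern) for pattern in DYNAMIC_CRT_PATTERNS):
            matches.append(name)
    return matches


def classify_crt_linkage(imported_dlls):
    """Classify a module from its import table alone.

    No match is ambiguous: it can mean a static CRT (/MT) or no CRT at all,
    so the caller still has to look for CRT startup code in the image.
    """
    if crt_import_names(imported_dlls):
        return "dynamic-crt"
    return "static-crt-or-nocrt"
```

<p>Because <code class="language-plaintext highlighter-rouge">/MT</code> builds and NoCRT builds both come back as <code class="language-plaintext highlighter-rouge">static-crt-or-nocrt</code>, this check only narrows the question. The entry point and section checks are what separate the two.</p>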

<h3 id="entry-point-pattern">Entry Point Pattern</h3>

<p>The entry point follows a predictable pattern:</p>

<pre><code class="language-asm">; _DllMainCRTStartup
push    rbp
mov     rbp, rsp
sub     rsp, 0x20
call    __security_init_cookie
; ... CRT initialization ...
call    dllmain_dispatch
; ... CRT cleanup ...
</code></pre>

<p>The call to <code class="language-plaintext highlighter-rouge">__security_init_cookie</code> near the entry point is a strong CRT indicator. To gather entropy for the cookie, this function calls <code class="language-plaintext highlighter-rouge">GetSystemTimeAsFileTime</code>, <code class="language-plaintext highlighter-rouge">GetCurrentThreadId</code>, <code class="language-plaintext highlighter-rouge">GetCurrentProcessId</code>, and <code class="language-plaintext highlighter-rouge">QueryPerformanceCounter</code>. Both the call itself and that cluster of API calls are detectable.</p>

<h3 id="rdata-and-data-sections">.rdata and .data Sections</h3>

<p>CRT-linked DLLs contain recognizable globals, references, and strings:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">__security_cookie</code> - the canary value (in <code class="language-plaintext highlighter-rouge">.data</code>, because it is overwritten at load time)</li>
  <li><code class="language-plaintext highlighter-rouge">_onexit_table</code> - atexit cleanup handlers</li>
  <li><code class="language-plaintext highlighter-rouge">__acrt_iob_func</code> references for stdio</li>
  <li>CRT error messages as strings (“runtime error”, “assertion failed”)</li>
</ul>

<h3 id="section-layout">Section Layout</h3>

<p>A typical CRT DLL has well-structured sections:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.text    - code (substantial, includes CRT runtime code)
.rdata   - read-only data, vtables, CRT strings
.data    - writable data, security cookie, global state
.pdata   - exception handling unwind data
.rsrc    - resources (optional)
.reloc   - relocation table
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">.pdata</code> section (exception unwind information) is almost always present in x64 CRT DLLs: the compiler emits unwind data for every non-leaf function, and the CRT startup code itself uses structured exception handling.</p>
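
<p>A minimal sketch of how the section list can be read straight from the PE headers, using only the Python standard library. The helper names are hypothetical, and the parser assumes a well-formed image: it does no bounds checking and would need hardening before being pointed at hostile files.</p>

```python
COFF_HEADER_SIZE = 20
SECTION_HEADER_SIZE = 40


def read_le(data, offset, size):
    """Read an unsigned little-endian integer of `size` bytes."""
    return int.from_bytes(data[offset:offset + size], "little")


def pe_section_names(data):
    """Return the section names from the section table of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    pe_offset = read_le(data, 0x3C, 4)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4
    section_count = read_le(data, coff + 2, 2)   # NumberOfSections
    optional_size = read_le(data, coff + 16, 2)  # SizeOfOptionalHeader
    # The section table starts right after the optional header.
    table = coff + COFF_HEADER_SIZE + optional_size
    names = []
    for index in range(section_count):
        start = table + index * SECTION_HEADER_SIZE
        raw_name = data[start:start + 8]  # 8-byte, NUL-padded name
        names.append(raw_name.rstrip(b"\x00").decode("ascii", errors="replace"))
    return names


def has_pdata_section(data):
    """True if the image has a section named .pdata."""
    return ".pdata" in pe_section_names(data)
```

<p>Section names are only labels and can be renamed, so a stricter check would read the exception directory entry from the optional header instead. The name-based version is kept here because it mirrors the layout listed above.</p>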

<hr />

<h2 id="3-why-attackers-avoid-the-crt">3. Why Attackers Avoid the CRT</h2>

<p>Sophisticated attackers compile DLLs without the CRT for several reasons.</p>

<p>A minimal NoCRT DLL can be 4-8 KB. A DLL with a statically linked CRT (<code class="language-plaintext highlighter-rouge">/MT</code>) typically starts at 50-100 KB. Smaller files are easier to inject, less likely to trigger size-based heuristics, and faster to write into remote process memory.</p>

<p>The CRT also pulls in dozens of API imports, and each import is a potential detection point. A NoCRT DLL can operate with just a handful of functions from <code class="language-plaintext highlighter-rouge">ntdll.dll</code> or <code class="language-plaintext highlighter-rouge">kernel32.dll</code>.</p>

<p>CRT initialization calls multiple API functions that EDR products monitor. Skipping it means the DLL’s entry point runs directly - less telemetry generated. The CRT also brings code the attacker doesn’t need, with its own behavior (heap allocations, TLS operations, exception handlers) that creates noise.</p>

<p>And many detection rules are tuned to CRT-compiled binaries because that’s what most software produces. A NoCRT binary doesn’t match those patterns - which is itself a signal, as we’ll see.</p>

<h3 id="how-nocrt-dlls-are-built">How NoCRT DLLs Are Built</h3>

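<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include &lt;windows.h&gt;</span>

<span class="c1">// NoCRT entry point: this function takes the place of the CRT's _DllMainCRTStartup,</span>
<span class="c1">// so no CRT initialization runs before the code below</span>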
<span class="n">BOOL</span> <span class="n">WINAPI</span> <span class="nf">_DllMainCRTStartup</span><span class="p">(</span><span class="n">HINSTANCE</span> <span class="n">hinstDLL</span><span class="p">,</span> <span class="n">DWORD</span> <span class="n">fdwReason</span><span class="p">,</span> <span class="n">LPVOID</span> <span class="n">lpReserved</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">fdwReason</span> <span class="o">==</span> <span class="n">DLL_PROCESS_ATTACH</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// attacker code runs directly here</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">TRUE</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Built with:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">/NODEFAULTLIB</code> - linker option, no CRT libraries</li>
  <li><code class="language-plaintext highlighter-rouge">/ENTRY:_DllMainCRTStartup</code> - linker option, custom entry point</li>
  <li><code class="language-plaintext highlighter-rouge">/GS-</code> - compiler option, disables the security cookie (its support code lives in the CRT)</li>
  <li>No <code class="language-plaintext highlighter-rouge">printf</code>, <code class="language-plaintext highlighter-rouge">malloc</code>, <code class="language-plaintext highlighter-rouge">new</code> - only direct Win32 or NT API calls</li>
</ul>

<hr />

<h2 id="4-nocrt-dlls-what-defenders-see">4. NoCRT DLLs: What Defenders See</h2>

<p>The absence of CRT artifacts is just as distinctive as their presence.</p>

<h3 id="missing-security-cookie">Missing Security Cookie</h3>

<p>No <code class="language-plaintext highlighter-rouge">__security_init_cookie</code> call at the entry point. No <code class="language-plaintext highlighter-rouge">__security_cookie</code> global. No <code class="language-plaintext highlighter-rouge">__GSHandlerCheck</code> exception handlers. For a DLL that does anything non-trivial, this is unusual.</p>

<p>A DLL with a <code class="language-plaintext highlighter-rouge">.text</code> section larger than 4 KB but no security cookie initialization is suspicious. Developers using MSVC almost never disable <code class="language-plaintext highlighter-rouge">/GS</code> because it’s the default and has negligible performance cost. The main benign exception is a DLL built with a non-MSVC toolchain, which will not contain MSVC’s cookie code at all.</p>
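
<p>One cheap static proxy for this check is to search the image for the default value that MSVC stores in <code class="language-plaintext highlighter-rouge">__security_cookie</code> before initialization. The constants below come from my reading of the MSVC runtime sources rather than from anything stated above, so treat them as an assumption to verify against your own toolchain. The function names are hypothetical.</p>

```python
# Default values MSVC places in __security_cookie before
# __security_init_cookie replaces them at load time (assumed, see lead-in).
DEFAULT_COOKIE_X64 = (0x00002B992DDFA232).to_bytes(8, "little")
DEFAULT_COOKIE_X86 = (0xBB40E64E).to_bytes(4, "little")

TEXT_SIZE_THRESHOLD = 0x1000  # 4 KB, the cutoff used in the text


def has_default_cookie(image_bytes, is_64bit=True):
    """Return True if the image contains the default /GS cookie value."""
    needle = DEFAULT_COOKIE_X64 if is_64bit else DEFAULT_COOKIE_X86
    return needle in image_bytes


def missing_cookie_is_suspicious(image_bytes, text_size, is_64bit=True):
    """Apply the heuristic: a sizeable .text section but no /GS cookie."""
    if text_size > TEXT_SIZE_THRESHOLD:
        return not has_default_cookie(image_bytes, is_64bit)
    return False
```

<p>A byte search like this can misfire in both directions: the 4-byte x86 value can appear by coincidence, and an attacker can plant the constant deliberately. It is a quick triage signal, not a substitute for inspecting the entry point.</p>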

<h3 id="minimal-import-table">Minimal Import Table</h3>

<p>A NoCRT DLL often imports only from <code class="language-plaintext highlighter-rouge">kernel32.dll</code> or <code class="language-plaintext highlighter-rouge">ntdll.dll</code>, with a handful of functions:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kernel32.dll:
    VirtualAlloc
    VirtualProtect
    CreateThread
    LoadLibraryA
    GetProcAddress
</code></pre></div></div>

<p>Or even more minimal, using only <code class="language-plaintext highlighter-rouge">ntdll.dll</code> native API:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ntdll.dll:
    NtAllocateVirtualMemory
    NtProtectVirtualMemory
    NtCreateThreadEx
    LdrLoadDll
</code></pre></div></div>

<p>A DLL that imports exclusively from <code class="language-plaintext highlighter-rouge">ntdll.dll</code> with no CRT imports is highly unusual for legitimate software. Most legitimate DLLs use the Win32 API layer.</p>
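
<p>The import-shape signals above reduce to a few set operations. This is a sketch under the assumption that the import table has already been parsed into a dictionary of DLL name to function names; the function name and the returned keys are made up for illustration.</p>

```python
CORE_DLLS = {"ntdll.dll", "kernel32.dll"}


def import_profile(imports):
    """Summarize an import table given as {dll_name: [function names]}."""
    dll_names = {name.lower() for name in imports}
    function_count = sum(len(functions) for functions in imports.values())
    return {
        "function_count": function_count,
        # Imports exclusively from ntdll.dll (native API only).
        "only_ntdll": dll_names == {"ntdll.dll"},
        # Imports from nothing beyond ntdll.dll and kernel32.dll.
        "only_core_dlls": bool(dll_names) and dll_names.issubset(CORE_DLLS),
    }
```

<p>An empty import table returns <code class="language-plaintext highlighter-rouge">False</code> for both flags. That case is worth handling separately, since a DLL with no imports at all usually resolves its APIs manually at run time.</p>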

<h3 id="tiny-file-size">Tiny File Size</h3>

<p>A NoCRT DLL doing real work can be 4-15 KB. Legitimate DLLs with business logic are almost always larger. The distribution of DLL sizes in a normal process is skewed toward larger files.</p>

<p>Flag unsigned DLLs under 20 KB loaded into processes where the typical module size is much larger.</p>
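
<p>A sketch of that rule, assuming the module list for a process is available as simple tuples. The <code class="language-plaintext highlighter-rouge">ratio</code> parameter is a hypothetical tuning knob for "much larger" and is not a value from this article.</p>

```python
from statistics import median

SMALL_DLL_BYTES = 20 * 1024  # the 20 KB cutoff from the text


def small_unsigned_outliers(modules, ratio=0.25):
    """Return names of unsigned modules that are small in absolute terms
    and small relative to the process's median module size.

    `modules` is a list of (name, size_in_bytes, is_signed) tuples.
    """
    if not modules:
        return []
    typical = median(size for _, size, _ in modules)
    flagged = []
    for name, size, is_signed in modules:
        if is_signed:
            continue
        if size < SMALL_DLL_BYTES and size < typical * ratio:
            flagged.append(name)
    return flagged
```

<p>Comparing against the median rather than the mean keeps one very large module from distorting what counts as typical for the process.</p>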

<h3 id="flat-entry-point">Flat Entry Point</h3>

<p>NoCRT DLLs have a simple entry point that goes directly to attacker logic:</p>

<pre><code class="language-asm">_DllMainCRTStartup:
    cmp     edx, 1          ; DLL_PROCESS_ATTACH
    jne     short return_true
    ; ... immediately does attacker work ...
return_true:
    mov     eax, 1
    ret
</code></pre>

<p>Compare this with a CRT entry point that has initialization calls, exception handling setup, and a structured dispatch to <code class="language-plaintext highlighter-rouge">DllMain</code>. The difference is visible in static analysis.</p>

<h3 id="missing-pdata-section">Missing .pdata Section</h3>

<p>NoCRT DLLs compiled without exception handling often lack a <code class="language-plaintext highlighter-rouge">.pdata</code> section entirely. On x64 Windows, the <code class="language-plaintext highlighter-rouge">.pdata</code> section holds the unwind tables that structured exception handling and stack walking depend on. Its absence means the system cannot unwind through the DLL’s functions, so the DLL has no working SEH support.</p>

<p>An x64 DLL without <code class="language-plaintext highlighter-rouge">.pdata</code> is unusual. Not definitive on its own, but combined with other NoCRT indicators it strengthens the signal.</p>

<hr />

<h2 id="5-the-signature-gap">5. The Signature Gap</h2>

<p>This is the most straightforward detection opportunity.</p>

<p>Legitimate CRT-linked DLLs are almost always signed. Microsoft, Adobe, Google, game studios - every major software vendor signs their DLLs. The CRT itself (<code class="language-plaintext highlighter-rouge">vcruntime140.dll</code>, <code class="language-plaintext highlighter-rouge">ucrtbase.dll</code>) is Microsoft-signed.</p>

<p>An unsigned DLL that uses the CRT is suspicious because:</p>
<ol>
  <li>A developer following a standard build process (which links the CRT by default) would typically also sign release binaries</li>
  <li>Legitimate unsigned DLLs exist (open-source plugins, internal tools) but they are a small population</li>
  <li>An injected DLL is, by definition, not part of the original application - it will not be signed by the application vendor</li>
</ol>

<p>An unsigned DLL loaded into a process where all other DLLs are signed is a strong anomaly signal. If it also uses the CRT, it was compiled with standard tooling but not through a standard release process.</p>

<h3 id="why-crt-makes-unsigned-dlls-easier-to-catch">Why CRT Makes Unsigned DLLs Easier to Catch</h3>

<p>Here’s the thing: <code class="language-plaintext highlighter-rouge">__security_init_cookie</code> has to call <strong>external functions</strong> to gather entropy, including these four:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">GetSystemTimeAsFileTime</code></li>
  <li><code class="language-plaintext highlighter-rouge">QueryPerformanceCounter</code></li>
  <li><code class="language-plaintext highlighter-rouge">GetCurrentProcessId</code></li>
  <li><code class="language-plaintext highlighter-rouge">GetCurrentThreadId</code></li>
</ol>

<p>Every one of these can be hooked by a security product. When the hook fires, the defender inspects the <strong>return address</strong> on the call stack. That return address points back into the calling module - the injected DLL. Walking the call stack reveals the full chain:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GetSystemTimeAsFileTime          ← hooked, EDR gets control
  ← __security_init_cookie       ← return address is inside injected DLL
    ← _DllMainCRTStartup         ← CRT entry point
      ← LdrpCallInitRoutine      ← ntdll loader
</code></pre></div></div>

<p>The return address lands in a memory region. The defender checks: does this region belong to a signed, known module? If the return address resolves to an unsigned image, or to memory that is not backed by any loaded image at all, it is a strong injection indicator.</p>

<p>This is not a problem for legitimate DLLs. A signed module from a known vendor calling these same functions during normal CRT initialization will pass the return address check - the address resolves cleanly to a signed image with a valid certificate chain. The detection specifically targets the gap between “uses standard CRT tooling” and “did not go through a standard signing and distribution process.”</p>
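
<p>The return address check can be sketched as a lookup against the loaded-module list. The tuple layout, verdict strings, and function name here are assumptions for illustration; a real product would take this data from the loader and its own signature cache.</p>

```python
from bisect import bisect_right


def classify_return_address(address, modules):
    """Classify a return address against the loaded-module list.

    `modules` is a list of (base, size, name, is_signed) tuples, assumed
    not to overlap. Returns (verdict, module_name_or_None).
    """
    ordered = sorted(modules)
    bases = [base for base, _, _, _ in ordered]
    # Find the last module whose base is at or below the address.
    index = bisect_right(bases, address) - 1
    if index >= 0:
        base, size, name, is_signed = ordered[index]
        if address < base + size:
            verdict = "signed-image" if is_signed else "unsigned-image"
            return (verdict, name)
    # Not inside any loaded image: private or manually mapped memory.
    return ("unbacked-memory", None)
```

<p>Both <code class="language-plaintext highlighter-rouge">unsigned-image</code> and <code class="language-plaintext highlighter-rouge">unbacked-memory</code> map to the injection indicators described above. Only <code class="language-plaintext highlighter-rouge">signed-image</code> passes.</p>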

<p>Beyond the security cookie, the CRT generates additional telemetry:</p>

<ul>
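  <li>CRT heap initialization calls heap APIs (<code class="language-plaintext highlighter-rouge">HeapCreate</code> or <code class="language-plaintext highlighter-rouge">RtlCreateHeap</code> in older runtimes, <code class="language-plaintext highlighter-rouge">GetProcessHeap</code> in the Universal CRT) - same return address analysis applies</li>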
  <li>TLS callbacks, when present, are run by the loader during load - another point where instrumentation can observe the module</li>
  <li>If dynamic CRT (<code class="language-plaintext highlighter-rouge">/MD</code>), loading the DLL triggers loads of <code class="language-plaintext highlighter-rouge">vcruntime140.dll</code> and <code class="language-plaintext highlighter-rouge">ucrtbase.dll</code> - module load events that EDR monitors</li>
</ul>

<p>An attacker using a NoCRT DLL avoids all of these hook trigger points - but as covered in section 4, the <em>absence</em> of these patterns is also detectable through structural analysis.</p>

<hr />

<h2 id="6-building-a-detection-matrix">6. Building a Detection Matrix</h2>

<p>Combining these signals into a scoring system:</p>

<table>
  <thead>
    <tr>
      <th>Signal</th>
      <th>CRT DLL (suspicious)</th>
      <th>NoCRT DLL (suspicious)</th>
      <th>Weight</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Unsigned</td>
      <td>Strong indicator</td>
      <td>Strong indicator</td>
      <td>High</td>
    </tr>
    <tr>
      <td>No security cookie</td>
      <td>N/A</td>
      <td>Likely</td>
      <td>Medium</td>
    </tr>
    <tr>
      <td>Minimal imports</td>
      <td>Unlikely</td>
      <td>Likely</td>
      <td>Medium</td>
    </tr>
    <tr>
      <td>Small file size (&lt;20 KB)</td>
      <td>Unlikely</td>
      <td>Likely</td>
      <td>Medium</td>
    </tr>
    <tr>
      <td>No <code class="language-plaintext highlighter-rouge">.pdata</code> section</td>
      <td>Unlikely</td>
      <td>Likely</td>
      <td>Low</td>
    </tr>
    <tr>
      <td>ntdll-only imports</td>
      <td>Very unlikely</td>
      <td>Possible</td>
      <td>High</td>
    </tr>
    <tr>
      <td>Not in application manifest</td>
      <td>Strong indicator</td>
      <td>Strong indicator</td>
      <td>High</td>
    </tr>
    <tr>
      <td>Loaded after process init</td>
      <td>Moderate indicator</td>
      <td>Moderate indicator</td>
      <td>Medium</td>
    </tr>
    <tr>
      <td>No version info resource</td>
      <td>Moderate indicator</td>
      <td>Likely</td>
      <td>Low</td>
    </tr>
  </tbody>
</table>

<h3 id="detection-algorithm">Detection Algorithm</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>score = 0

if dll.is_unsigned:
    score += 30

if dll.loaded_after_process_init:
    score += 15

if not dll.has_security_cookie and dll.text_size &gt; 0x1000:
    score += 20  # NoCRT indicator

if dll.import_count &lt; 5:
    score += 15

if dll.file_size &lt; 0x5000:  # 20 KB
    score += 10

if dll.imports_only_ntdll:
    score += 25

if not dll.has_pdata_section:
    score += 5

if not dll.has_version_info:
    score += 5

if score &gt;= 50:
    alert("suspicious DLL injection detected")
</code></pre></div></div>

<p>This is a simplified example. Production EDR systems use more sophisticated scoring with machine learning and behavioral context. But the core signals are the same.</p>
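
<p>To see how the weights interact, the pseudocode above can be made runnable with a small record type. The weights and threshold are the ones already shown; the <code class="language-plaintext highlighter-rouge">DllFacts</code> class and the three sample modules are hypothetical and exist only to exercise the scoring.</p>

```python
from dataclasses import dataclass

ALERT_THRESHOLD = 50


@dataclass
class DllFacts:
    is_unsigned: bool
    loaded_after_process_init: bool
    has_security_cookie: bool
    text_size: int
    import_count: int
    file_size: int
    imports_only_ntdll: bool
    has_pdata_section: bool
    has_version_info: bool


def score_dll(dll):
    """Apply the weights from the pseudocode above and return the total."""
    score = 0
    if dll.is_unsigned:
        score += 30
    if dll.loaded_after_process_init:
        score += 15
    if not dll.has_security_cookie and dll.text_size > 0x1000:
        score += 20  # NoCRT indicator
    if dll.import_count < 5:
        score += 15
    if dll.file_size < 0x5000:  # 20 KB
        score += 10
    if dll.imports_only_ntdll:
        score += 25
    if not dll.has_pdata_section:
        score += 5
    if not dll.has_version_info:
        score += 5
    return score


# Hypothetical NoCRT payload: unsigned, late-loaded, 12 KB, four ntdll imports.
nocrt_payload = DllFacts(True, True, False, 0x2000, 4, 0x3000, True, False, False)
# Hypothetical unsigned CRT payload: late-loaded, 96 KB, no version resource.
crt_payload = DllFacts(True, True, True, 0x10000, 60, 0x18000, False, True, False)
# Hypothetical signed vendor DLL loaded at startup.
vendor_dll = DllFacts(False, False, True, 0x40000, 120, 0x80000, False, True, True)

samples = (("nocrt_payload", nocrt_payload), ("crt_payload", crt_payload), ("vendor_dll", vendor_dll))
for label, facts in samples:
    total = score_dll(facts)
    print(label, total, "ALERT" if total >= ALERT_THRESHOLD else "ok")
```

<p>With these weights the NoCRT payload scores 125 and the signed vendor DLL scores 0. The unsigned CRT payload lands exactly on the threshold at 50 (30 + 15 + 5). Had it carried a version resource it would score 45 and slip under, which shows how sensitive a fixed threshold is to the low-weight signals.</p>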

<hr />

<h2 id="7-etw-and-kernel-level-detection">7. ETW and Kernel-Level Detection</h2>

<h3 id="module-load-events">Module Load Events</h3>

<p>ETW provides <code class="language-plaintext highlighter-rouge">IMAGE_LOAD</code> events whenever a DLL is loaded. Each event includes:</p>

<ul>
  <li>Image file path</li>
  <li>Image base address</li>
  <li>Image size</li>
  <li>Process ID</li>
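  <li>Signing level and signature status (exposed by kernel image-load callbacks through <code class="language-plaintext highlighter-rouge">IMAGE_INFO</code>; availability in ETW events depends on the provider)</li>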
</ul>

<p>Monitoring these events for unsigned images loaded after process initialization is the foundation of DLL injection detection.</p>

<h3 id="thread-creation-events">Thread Creation Events</h3>

<p>DLL injection typically involves creating a remote thread (via <code class="language-plaintext highlighter-rouge">CreateRemoteThread</code> or <code class="language-plaintext highlighter-rouge">NtCreateThreadEx</code>) or queuing an APC to an existing thread. For the thread-creation case, ETW <code class="language-plaintext highlighter-rouge">THREAD_START</code> events capture:</p>

<ul>
  <li>Start address - does it point into a known module?</li>
  <li>Thread creation time relative to process creation</li>
  <li>Calling process (for remote thread creation)</li>
</ul>

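<p>A thread starting at an address that does not belong to any known signed module is a strong injection indicator. Classic <code class="language-plaintext highlighter-rouge">LoadLibrary</code> injection is the exception: the remote thread starts at <code class="language-plaintext highlighter-rouge">LoadLibraryA</code> or <code class="language-plaintext highlighter-rouge">LoadLibraryW</code> inside the signed <code class="language-plaintext highlighter-rouge">kernel32.dll</code>, so a remote thread whose start address is a loader function deserves the same scrutiny.</p>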

<h3 id="combining-telemetry">Combining Telemetry</h3>

<p>The strongest detection comes from correlating events:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">IMAGE_LOAD</code> for an unsigned DLL → timestamp T1</li>
  <li><code class="language-plaintext highlighter-rouge">THREAD_START</code> with start address in that DLL → timestamp T2</li>
  <li>T2 shortly after T1 → high confidence injection</li>
</ol>

<p>If the DLL also matches NoCRT patterns (small, minimal imports, no security cookie), the confidence increases further.</p>
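
<p>The correlation steps above can be sketched as a join over two event lists. The dictionary keys and the two-second window are assumptions chosen for this example, not fields or thresholds defined by any ETW provider.</p>

```python
def correlate_injection(image_loads, thread_starts, window_seconds=2.0):
    """Pair unsigned image loads with threads that start inside them soon after.

    image_loads: dicts with pid, base, size, path, signed, time
    thread_starts: dicts with pid, start_address, time
    Returns a list of (path, start_address, delay_seconds) tuples.
    """
    hits = []
    for load in image_loads:
        if load["signed"]:
            continue
        end = load["base"] + load["size"]
        for thread in thread_starts:
            delay = thread["time"] - load["time"]  # T2 - T1
            same_process = thread["pid"] == load["pid"]
            inside_image = load["base"] <= thread["start_address"] < end
            if same_process and inside_image and 0 <= delay <= window_seconds:
                hits.append((load["path"], thread["start_address"], delay))
    return hits
```

<p>The nested loop is quadratic in the number of events, which is fine for a per-process sketch. A streaming implementation would index recent unsigned loads by process ID and expire them once the window has passed.</p>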

<hr />

<h2 id="8-practical-recommendations-for-defenders">8. Practical Recommendations for Defenders</h2>

<h3 id="for-edr-engineers">For EDR Engineers</h3>

<p>Don’t just check signatures - also look for DLLs signed with revoked or untrusted certificates. Build a baseline of expected DLLs per application; any new DLL that appears in a stable application is worth investigating. Detect CRT absence, not just CRT presence - a DLL with no CRT artifacts doing complex work is more suspicious than one with the CRT. And watch for unexpected <code class="language-plaintext highlighter-rouge">vcruntime140.dll</code> or <code class="language-plaintext highlighter-rouge">ucrtbase.dll</code> loads, which signal something new was injected with CRT linkage.</p>

<h3 id="for-malware-analysts">For Malware Analysts</h3>

<p>Check the entry point first. CRT vs NoCRT is immediately visible from the entry point structure. Examine import table density - NoCRT malware often resolves APIs dynamically after load, so look for <code class="language-plaintext highlighter-rouge">GetProcAddress</code> chains or manual export table walking. And don’t forget about statically linked CRT (<code class="language-plaintext highlighter-rouge">/MT</code>): no CRT imports show up, but the code is still there in the binary.</p>

<h3 id="for-blue-teams">For Blue Teams</h3>

<p>Sysmon Event ID 7 (Image Loaded) with signature status filtering catches unsigned DLLs immediately. WDAC or AppLocker can block unsigned DLLs from loading entirely in high-security environments. Module load auditing with baseline comparison detects any new DLL in monitored processes.</p>
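
<p>Baseline comparison itself is a small set difference. A minimal sketch, assuming the baseline and the observed module paths have already been collected from the load events; the function name is hypothetical.</p>

```python
def new_modules(baseline, observed):
    """Return observed module paths that are absent from the baseline.

    Paths are compared case-insensitively, as Windows paths usually are.
    """
    known = {path.lower() for path in baseline}
    return sorted(path for path in observed if path.lower() not in known)
```

<p>Comparing by path alone misses a DLL that replaces a known file in place, so a sturdier baseline would key on file hash or signer as well.</p>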

<hr />

<h2 id="conclusion">Conclusion</h2>

<p>The CRT is not a security feature - it is a development convenience. But its presence or absence creates a distinctive forensic fingerprint that defenders can use.</p>

<p>An unsigned CRT-linked DLL is easy to catch because the CRT generates initialization telemetry, imports from known CRT libraries, and follows a recognizable structure. Attackers who avoid the CRT to reduce this footprint create a different but equally detectable pattern: minimal imports, no security cookie, tiny file size, and missing standard sections.</p>

<p>For defenders, the lesson is to detect in both directions:</p>
<ul>
  <li><strong>CRT present + unsigned</strong> = amateur or careless injection, catch on telemetry and signature</li>
  <li><strong>CRT absent + unusual characteristics</strong> = deliberate evasion, catch on structural anomalies</li>
</ul>

<p>Neither choice is invisible to a well-instrumented environment.</p>]]></content><author><name></name></author><category term="windows" /><category term="detection" /><category term="defense" /><category term="malware-analysis" /><category term="dll" /><category term="crt" /><summary type="html"><![CDATA[When an attacker injects a DLL into a process, one of the first decisions they make - whether they realize it or not - is whether to link the C Runtime Library (CRT). That decision leaves distinct forensic traces that defenders can use to detect the injection.]]></summary></entry></feed>