A Catalog of Local Windows Kernel-mode Backdoor Techniques


August, 2007

skape & Skywing

mmiller@hick.org & Skywing@valhallalegends.com
Abstract: This paper presents a detailed catalog of techniques that can be

used to create local kernel-mode backdoors on Windows.  These techniques

include function trampolines, descriptor table hooks, model-specific register

hooks, page table modifications, as well as others that have not previously

been described.  The majority of these techniques have been publicly known far

in advance of this paper.  However, at the time of this writing, there appears

to be no detailed single point of reference for many of them.  The intention

of this paper is to provide a solid understanding of the subject of local

kernel-mode backdoors.  This understanding is necessary in order to encourage

the thoughtful discussion of potential countermeasures and perceived

advancements.  In the vein of countermeasures, some additional thoughts are

given to the common misconception that PatchGuard, in its current design, can

be used to prevent kernel-mode rootkits.
1) Introduction
The classic separation of privileges between user-mode and kernel-mode has

been a common feature included in most modern operating systems.  This

separation allows operating systems to make security guarantees relating to

process isolation, kernel-user isolation, kernel-mode integrity, and so on.

These security guarantees are needed in order to prevent a lesser privileged

user-mode process from being able to take control of the system itself.  A

kernel-mode backdoor is one method of bypassing these security restrictions.
There are many different techniques that can be used to backdoor the kernel.

For the purpose of this document, a backdoor will be considered to be

something that provides access to resources that would otherwise normally be

restricted by the kernel.  These resources might include executing code with

kernel-mode privileges, accessing kernel-mode data, disabling security checks,

and so on.  To help further limit the scope of this document, the authors will

focus strictly on techniques that can be used to provide local backdoors into

the kernel on Windows.  In this context, a local backdoor is a backdoor that

does not rely on or make use of a network connection to provide access to

resources.  Instead, local backdoors can be viewed as ways of weakening the

kernel in an effort to provide access to resources from non-privileged

entities, such as user-mode processes.
The majority of the backdoor techniques discussed in this paper have been

written about at length and in great detail in many different publications[20,

8, 12, 18, 19, 21, 25, 26].  The primary goal of this paper is to act as a

point of reference for some of the common, as well as some of the

not-so-common, local kernel-mode backdoor techniques.  The authors have

attempted to include objective measurements for each technique along with a

description of how each technique works.  As a part of defining these

objective measurements, the authors have attempted to research the origins of

some of the more well-known backdoor techniques.  Since many of these

techniques have been used for such a long time, the origins have proven

somewhat challenging to uncover.
The structure of this paper is as follows.  In Section 2, each of the
individual techniques that can be used to provide a local kernel-mode backdoor
is discussed in detail.  A later section provides a brief discussion of
general strategies that might be employed to prevent some of the techniques
that are discussed.  Another section attempts to refute some of the common
arguments against preventing kernel-mode backdoors in and of themselves.
Finally, a separate section attempts to clarify why Microsoft's PatchGuard
should not be considered a security solution with respect to kernel-mode
backdoors.
2) Techniques
To help properly catalog the techniques described in this section, the authors

have attempted to include objective measurements of each technique.  These

measurements are broken down as follows:
- Category
  The authors have chosen to adopt Joanna Rutkowska's malware categorization in

  the interest of pursuing a standardized classification[34].  This model describes

  four types of malware.  Type 0 malware categorizes non-intrusive malware;

  Type I includes malware that modifies things that should otherwise never be

  modified (code segments, MSRs, etc); Type II includes malware that modifies

  things that are designed to be modified (global variables, other data); Type III is not

  within the scope of this document[33, 34].
  In addition to the four malware types described by Rutkowska, the authors

  propose Type IIa which would categorize writable memory that should

  effectively be considered write-once in a given context.  For example, when a

  global DPC is initialized, the DpcRoutine can be considered write-once.  The

  authors consider this to be a derivative of Type II due to the fact that the

  memory remains writable and is less likely to be checked than that of Type I.
- Origin
  If possible, the first known instance of the technique's use or some

  additional background on its origin is given.
- Capabilities
  The capabilities the backdoor offers.  This can be one or more of the

  following: kernel-mode code execution, access to kernel-mode data, access to

  restricted resources.  If a technique allows kernel-mode code execution,

  then it implicitly has all other capabilities listed.
- Considerations
  Any restrictions or special points that must be made about the use of a

  given technique.
- Covertness
  A description of how easily the use of a given technique might be detected.
Since many of the techniques described in this document have been known for

quite some time, the authors have taken a best effort approach to identifying

sources of the original ideas.  In many cases, this has proved to be difficult

or impossible.  For this reason, the authors request that any inaccuracy in

citation be reported so that it may be corrected in future releases of this

paper.
2.1) Image Patches
Perhaps the most obvious approach that can be used to backdoor the kernel

involves the modification of code segments used by the kernel itself.  This

could include modifying the code segments of kernel-mode images like

ntoskrnl.exe, ndis.sys, ntfs.sys, and so on.  By making modifications to these

code segments, it is possible to hijack kernel-mode execution whenever a

hooked function is invoked.  The possibilities surrounding the modification of

code segments are limited only by what the kernel itself is capable of doing.
2.1.1) Function Prologue Hooking
Function hooking is the process of intercepting calls to a given function by

redirecting those calls to an alternative function.  The concept of function

hooking has been around for quite some time and it's unclear who originally

presented the idea.  There are a number of different libraries and papers that

exist which help to facilitate the hooking of functions[21].  With respect to

local kernel-mode backdoors, function hooking is an easy and reliable method

of creating a backdoor.  There are a few different ways in which functions can

be hooked.  One of the most common techniques involves overwriting the

prologue of the function to be hooked with an architecture-specific jump

instruction that transfers control to an alternative function somewhere else

in memory.  This is the approach taken by Microsoft's Detours library.  While

prologue hooks are conceptually simple, there is actually quite a bit of code

needed to implement them properly.
In order to implement a prologue hook in a portable and reliable manner, it is

often necessary to make use of a disassembler that is able to determine the

size, in bytes, of individual instructions.  The reason for this is that in

order to perform the prologue overwrite, the first few bytes of the function

to be hooked must be overwritten by a control transfer instruction (typically

a jump).  On the Intel architecture, control transfer instructions can have

one of three operands: a register, a relative offset, or a memory operand.

Each operand type controls the size of the jump instruction that will be

needed: 2 bytes, 5 bytes, and 6 bytes, respectively.  The disassembler makes

it possible to copy the first n instructions from the function's prologue

prior to performing the overwrite.  The value of n is determined by

disassembling each instruction in the prologue until the number of bytes

disassembled is greater than or equal to the number of bytes that will be

overwritten when hooking the function.
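The byte count described above can be sketched as follows.  This is a minimal illustration, not the method of any particular hooking library: the helper name prologue_copy_size is hypothetical, and the instruction lengths are assumed to come from whatever disassembler is in use.  It returns the number of bytes covered by the first n whole instructions rather than n itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given the lengths of the instructions at the start
 * of a function (as reported by a disassembler), return how many bytes of
 * whole instructions must be saved to make room for a patch of patch_size
 * bytes.  Returns 0 if the listed instructions do not cover the patch.
 */
static size_t prologue_copy_size(const size_t *insn_len, size_t count,
                                 size_t patch_size)
{
    size_t total = 0;
    size_t i;

    assert(insn_len != NULL);

    for (i = 0; i < count; i++) {
        if (total >= patch_size)
            break;              /* enough whole instructions collected */
        total += insn_len[i];
    }

    return (total >= patch_size) ? total : 0;
}
```

For a prologue whose instructions are 1, 2, and 3 bytes long, a 5 byte relative jump requires all three instructions (6 bytes) to be saved, since stopping after the second would split the third instruction.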
The reason the first n instructions must be saved in their entirety is to make

it possible for the original function to be called by the hook function.  In

order to call the original version of the function, a small stub of code must

be generated that will execute the first n instructions of the function's

original prologue followed by a jump to instruction n + 1 in the original

function's body.  This stub of code has the effect of allowing the original

function to be called without it being diverted by the prologue overwrite.

This method of implementing function prologue hooks is used extensively by

Detours and other hooking libraries[21].
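As a rough sketch of the stub described above, the following user-mode code builds one in an ordinary buffer.  The function names and the use of 32-bit addresses are assumptions made for illustration; this is not the implementation used by Detours.  The sketch also assumes the saved instructions are position independent, since a copied relative call or jump would need its displacement adjusted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_REL32_SIZE 5

/* Encode "jmp rel32" (opcode 0xE9) located at address from, targeting to. */
static void encode_jmp_rel32(uint8_t out[JMP_REL32_SIZE],
                             uint32_t from, uint32_t to)
{
    /* The displacement is relative to the end of the 5 byte instruction. */
    uint32_t rel = to - (from + JMP_REL32_SIZE);

    out[0] = 0xE9;
    out[1] = (uint8_t)(rel & 0xff);
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
}

/*
 * Build a call-original stub that will live at stub_addr: the saved
 * prologue bytes followed by a jump back to func_addr + saved_len.
 * Returns the size of the stub in bytes.
 */
static size_t build_stub(uint8_t *stub, uint32_t stub_addr,
                         const uint8_t *saved, size_t saved_len,
                         uint32_t func_addr)
{
    assert(stub != NULL && saved != NULL);

    memcpy(stub, saved, saved_len);
    encode_jmp_rel32(stub + saved_len,
                     stub_addr + (uint32_t)saved_len,
                     func_addr + (uint32_t)saved_len);

    return saved_len + JMP_REL32_SIZE;
}
```

The caller is assumed to provide a stub buffer of at least saved_len + 5 bytes.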
Recent versions of Windows, such as XP SP2 and Vista, include image files that

come with a more elegant way of hooking a function with a function prologue

overwrite.  In fact, these images have been built with a compiler enhancement

that was designed specifically to improve Microsoft's ability to hook its own

functions during runtime.  The enhancement involves creating functions with a

two byte no-op instruction, such as a mov edi, edi, as the first instruction

of a function's prologue.  In addition to having this two byte instruction,

the compiler also prefixes 5 no-op instructions to the function itself.  The

two byte no-op instruction provides the necessary storage for a two byte

relative short jump instruction to be placed on top of it.  The relative short

jump, in turn, can then transfer control into another relative jump

instruction that has been placed in the 5 bytes that were prefixed to the

function itself.  The end result is a more deterministic way of hooking a

function using a prologue overwrite that does not rely on a disassembler.  A

common question is why a two byte no-op instruction was used rather than two

individual no-op instructions.  The answer for this has two parts.  First, a

two byte no-op instruction can be overwritten in an atomic fashion whereas

other prologue overwrites, such as a 5 byte or 6 byte overwrite, cannot.  The

second part has to do with the fact that having a two byte no-op instruction

prevents race conditions associated with any thread executing code from within

the set of bytes that are overwritten when the hook is installed.  This race

condition is common to any type of function prologue overwrite.
To better understand this race condition, consider what might happen if the

prologue of a function had two single byte no-op instructions.  Prior to this

function being hooked, a thread executes the first no-op instruction.  In

between the execution of this first no-op and the second no-op, the function

in question is hooked in the context of a second thread and the first two

bytes are overwritten with the opcodes associated with a relative short jump

instruction, such as 0xeb and 0xf9.  After the prologue overwrite occurs, the

first thread begins executing what was originally the second no-op

instruction.  However, now that the function has been hooked, the no-op

instruction may have been changed from 0x90 to 0xf9.  This may have disastrous

effects depending on the context that the hook is executed in.  While this

race condition may seem unlikely, it is nevertheless feasible and can

therefore directly impact the reliability of any solution that uses prologue

overwrites in order to hook functions.
Category: Type I
Origin: The concept of patching code has ``existed since the dawn of digital

computing''[21].
Capabilities: Kernel-mode code execution
Considerations: The reliability of a function prologue hook is directly

related to the reliability of the disassembler used and the number of bytes

that are overwritten in a function prologue.  If the two byte no-op

instruction is not present, then it is unlikely that a function prologue

overwrite will be able to be multiprocessor safe.  Likewise, if a disassembler

does not accurately count the size of instructions in relation to the actual

processor, then the function prologue hook may fail, leading to an unexpected

crash of the system.  One other point that is worth mentioning is that authors

of hook functions must be careful not to inadvertently introduce instability

issues into the operating system by failing to properly sanitize and check

parameters to the function that is hooked.  There have been many examples

where legitimate software has gone the route of hooking functions without

taking these considerations into account[38].
Covertness: At the time of this writing, the use of function prologue

overwrites is considered to not be covert.  It is trivial for tools, such as

Joanna Rutkowska's System Virginity Verifier[32], to compare the in-memory version

of system images with the on-disk versions in an effort to detect in-memory

alterations.  The Windows Debugger (windbg) will also make an analyst aware of

differences between in-memory code segments and their on-disk counterparts.
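The comparison described above reduces to a byte-wise difference between two copies of a code section.  The sketch below is a simplified illustration with a hypothetical function name and is not how System Virginity Verifier is implemented.  It assumes both buffers already hold the same section with relocations applied consistently, which a real tool would have to arrange before comparing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compare a code section as loaded in memory against the same section as
 * read from disk.  Up to max_out differing offsets are stored in out (which
 * may be NULL); the total number of differing bytes is returned.
 */
static size_t diff_code_section(const uint8_t *in_memory,
                                const uint8_t *on_disk, size_t size,
                                size_t *out, size_t max_out)
{
    size_t count = 0;
    size_t i;

    assert(in_memory != NULL && on_disk != NULL);

    for (i = 0; i < size; i++) {
        if (in_memory[i] != on_disk[i]) {
            if (out != NULL && count < max_out)
                out[count] = i;   /* remember where the images diverge */
            count++;
        }
    }

    return count;
}
```

A prologue overwrite shows up as a short run of differing offsets at the start of the hooked function.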
2.1.2) Disabling SeAccessCheck
In Phrack 55, Greg Hoglund described the benefits of patching nt!SeAccessCheck

so that it never returns access denied[19].  This has the effect of causing access

checks on securable objects to always grant access, regardless of whether or

not the access would normally be granted.  As a result, non-privileged users

can directly access otherwise privileged resources.  This simple modification

does not directly make it possible to execute privileged code, but it does

indirectly facilitate it by allowing non-privileged users to interact with and

modify system processes.
Category: Type I
Origin: Greg Hoglund was the first person to publicly identify this technique

in September, 1999[19].
Capabilities: Access to restricted resources.
Covertness: Like function prologue overwrites, the nt!SeAccessCheck patch can

be detected through differences between the mapped image of ntoskrnl.exe and

the on-disk version.
2.2) Descriptor Tables
The x86 architecture has a number of different descriptor tables that are used

by the processor to handle things like memory management (GDT), interrupt

dispatching (IDT), and so on.  In addition to processor-level descriptor

tables, the Windows operating system itself also includes a number of distinct

software-level descriptor tables, such as the SSDT.  The majority of these

descriptor tables are heavily relied upon by the operating system and

therefore represent a tantalizing target for use in backdoors.  Like the

function hooking technique described in Section 2.1.1, all of the techniques presented in

this subsection have been known about for a significant amount of time.  The

authors have attempted, when possible, to identify the origins of each

technique.
2.2.1) IDT
The Interrupt Descriptor Table (IDT) is a processor-relative structure that is

used when dispatching interrupts.  Interrupts are used by the processor as a

means of interrupting program execution in order to handle an event.

Interrupts can occur as a result of a signal from hardware or as a result of

software asserting an interrupt through the int instruction[23].  The IDT contains

256 descriptors that are associated with the 256 interrupt vectors supported

by the processor.  Each IDT descriptor can be one of three types of gate

descriptors (task, interrupt, trap) which are used to describe where and how

control should be transferred when an interrupt for a particular vector

occurs.  The base address and limit of the IDT are stored in the idtr register

which is populated through the lidt instruction.  The current base address and

limit of the idtr can be read using the sidt instruction.
The concept of an IDT hook has most likely been around since the origin of the

concept of interrupt handling.  In most cases, an IDT hook works by

redirecting the procedure entry point for a given IDT descriptor to an

alternative location.  Conceptually, this is the same process involved in

hooking any function pointer (which is described in more detail later in this paper).  The

difference comes as a result of the specific code necessary to hook an IDT

descriptor.
On the x86 processor, each IDT descriptor is an eight byte data structure.

IDT descriptors that are either an interrupt gate or trap gate descriptor

contain the procedure entry point and code segment selector to be used when

the descriptor's associated interrupt vector is asserted.  In addition to

containing control transfer information, each IDT descriptor also contains

additional flags that further control what actions are taken.  The Windows

kernel describes IDT descriptors using the following structure:
kd> dt _KIDTENTRY

   +0x000 Offset           : Uint2B

   +0x002 Selector         : Uint2B

   +0x004 Access           : Uint2B

   +0x006 ExtendedOffset   : Uint2B
In the above data structure, the Offset field holds the low 16 bits of the

procedure entry point and the ExtendedOffset field holds the high 16 bits.

Using this knowledge, an IDT descriptor could be hooked by redirecting the

procedure entry point to an alternate function.  The following code

illustrates how this can be accomplished:
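#pragma pack(push, 1)

/* Assumed layout, mirroring the _KIDTENTRY structure shown above */
typedef struct _IDT_DESCRIPTOR
{
  USHORT OffsetLow;   /* low 16 bits of the handler (Offset)          */
  USHORT Selector;
  USHORT Access;
  USHORT OffsetHigh;  /* high 16 bits of the handler (ExtendedOffset) */
} IDT_DESCRIPTOR, *PIDT_DESCRIPTOR;

/* Six byte operand of sidt and lidt; it must not contain padding */
typedef struct _IDT
{
  USHORT          Limit;
  PIDT_DESCRIPTOR Descriptors;
} IDT, *PIDT;

#pragma pack(pop)

static NTSTATUS HookIdtEntry(
  IN UCHAR DescriptorIndex,
  IN ULONG_PTR NewHandler,
  OUT PULONG_PTR OriginalHandler OPTIONAL)
{
  PIDT_DESCRIPTOR Descriptor = NULL;
  IDT             Idt;

  /* Read the IDT base and limit of the current processor */
  __asm sidt [Idt]

  Descriptor = &Idt.Descriptors[DescriptorIndex];

  /* OriginalHandler is optional, so only store it when supplied */
  if (OriginalHandler != NULL)
    *OriginalHandler =
      (ULONG_PTR)Descriptor->OffsetLow |
      ((ULONG_PTR)Descriptor->OffsetHigh << 16);

  /* Note that the two halves are not updated atomically */
  Descriptor->OffsetLow  =
    (USHORT)(NewHandler & 0xffff);
  Descriptor->OffsetHigh =
    (USHORT)((NewHandler >> 16) & 0xffff);

  __asm lidt [Idt]

  return STATUS_SUCCESS;
}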
In addition to hooking an individual IDT descriptor, the entire IDT can be

hooked by creating a new table and then setting its information using the lidt

instruction.
Category: Type I; although some portions of the IDT may be legitimately

hooked.
Origin: The IDT hook has its origins in Interrupt Vector Table (IVT) hooks.

In October, 1999, Prasad Dabak et al wrote about IVT hooks[31].  Sadly, they also

seemingly failed to cite their sources.  It's certain that IVT hooks have

existed prior to 1999.  The oldest virus citation the authors could find was

from 1994, but DOS was released in 1981 and it is likely the first IVT hooks

were seen shortly thereafter.  A patent that was filed in December, 1985

entitled Dual operating system computer talks about IVT ``relocation'' in a

manner that suggests IVT hooking of some form.
Capabilities: Kernel-mode code execution.
Covertness: Detection of IDT hooks is often trivial and is a common practice

for rootkit detection tools[32].
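One simple form such a check might take is sketched below as a user-mode model.  It is a coarse heuristic under stated assumptions, not the method of any particular tool: the structure and function names are hypothetical, the kernel image range is supplied by the caller, and vectors with a zero handler are assumed to be unused.  Since some IDT entries may legitimately point outside the kernel image, a flagged entry is only a candidate for closer inspection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the 8 byte x86 layout reported by "dt _KIDTENTRY". */
typedef struct {
    uint16_t Offset;          /* low 16 bits of the handler  */
    uint16_t Selector;
    uint16_t Access;
    uint16_t ExtendedOffset;  /* high 16 bits of the handler */
} IdtEntry;

/* Reassemble the 32-bit procedure entry point from its two halves. */
static uint32_t idt_handler(const IdtEntry *entry)
{
    assert(entry != NULL);
    return ((uint32_t)entry->ExtendedOffset << 16) | entry->Offset;
}

/*
 * Count descriptors whose handler lies outside
 * [image_base, image_base + image_size).
 */
static size_t count_handlers_outside_image(const IdtEntry *idt, size_t count,
                                           uint32_t image_base,
                                           uint32_t image_size)
{
    size_t outside = 0;
    size_t i;

    assert(idt != NULL);

    for (i = 0; i < count; i++) {
        uint32_t handler = idt_handler(&idt[i]);

        if (handler == 0)
            continue;                     /* assumed unused vector */

        /* Unsigned subtraction also catches handlers below image_base. */
        if (handler - image_base >= image_size)
            outside++;
    }

    return outside;
}
```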
2.2.2)  GDT / LDT
The Global Descriptor Table (GDT) and Local Descriptor Table (LDT) are used to

store segment descriptors that describe a view of a system's address space.

Each processor has its own GDT.  Segment descriptors include the base address,

limit, privilege information, and other flags that are used by the processor

when translating a logical address (seg:offset) to a linear address.  Segment

selectors are integers that are used to indirectly reference individual

segment descriptors based on their offset into a given descriptor table.

Software makes use of segment selectors through segment registers, such as CS,

DS, ES, and so on.  More detail about the behavior of segmentation can be

found in the x86 and x64 system programming manuals[1].
In Phrack 55, Greg Hoglund described the potential for abusing conforming code

segments[19].  A conforming code segment, as opposed to a non-conforming code

segment, permits control transfers where CPL is numerically greater than DPL.

However, the CPL is not altered as a result of this type of control transfer.

As such, effective privileges of the caller are not changed.  For this reason,

it's unclear how this could be used to access kernel-mode memory due to the

fact that page protections would still prevent lesser privileged callers from

accessing kernel-mode pages when paging is enabled.
Derek Soeder identified an awesome flaw in 2003 that allowed a user-mode

process to create an expand-down segment descriptor in the calling process'

LDT[40].  An expand-down segment descriptor inverts the meaning of the limit and

base address associated with a segment descriptor.  In this way, the limit

describes the lower limit and the base address describes the upper limit.  The

reason this is useful is due to the fact that when kernel-mode routines

validate addresses passed in from user-mode, they assume flat segments that

start at base address zero.  This is the same thing as assuming that a logical

address is equivalent to a linear address.  However, when expand-down segment

descriptors are used, the linear address will reference a memory location that

can be in stark contrast to the address that's being validated by kernel-mode.

In order to exploit this condition to escalate privileges, all that's

necessary is to identify a system service in kernel-mode that will run with

escalated privileges and make use of segment selectors provided by user-mode

without properly validating them.  Derek gives an example of a MOVS

instruction in the int 0x2e handler.  This trick can be abused in the context

of a local kernel-mode backdoor to provide a way for user-mode code to be able

to read and write kernel-mode memory.
In addition to abusing specific flaws in the way memory can be referenced

through the GDT and LDT, it's also possible to define custom gate descriptors

that would make it possible to call code in kernel-mode from user-mode[23].  One

particularly useful type of gate descriptor, at least in the context of a

backdoor, is a call gate descriptor.  The purpose of a call gate is to allow

lesser privileged code to call more privileged code in a secure fashion[45].  To

abuse this, a backdoor can simply define its own call gate descriptor and then

make use of it to run code in the context of the kernel.
Category: Type IIa; with the exception of the LDT.  The LDT may be better

classified as Type II considering it exposes an API to user-mode that allows

the creation of custom LDT entries (NtSetLdtEntries).
Origin: It's unclear if there were some situational requirements that would be

needed in order to abuse the issue described by Greg Hoglund.  The flaw

identified by Derek Soeder in 2003 was an example of a recurrence of an issue

that was found in older versions of other operating systems, such as Linux.

For example, a mailing list post made by Morten Welinder to LKML in 1996

describes a fix for what appears to be the same type of issue that was

identified by Derek[44].  Creating a custom gate descriptor for use in the context

of a backdoor has been used in the past.  Greg Hoglund described the use of

call gates in the context of a rootkit in 1999[19].
Capabilities: In the case of the expand-down segment descriptor, access to

kernel-mode data is possible.  This can also indirectly lead to kernel-mode

code execution, but it would rely on another backdoor technique.  If a gate

descriptor is abused, direct kernel-mode code execution is possible.
Covertness: It is entirely possible to write code that will detect the

addition or alteration of entries in the GDT or each individual process LDT.

For example, PatchGuard will currently detect alterations to the GDT.
2.2.3) SSDT
The System Service Descriptor Table (SSDT) is used by the Windows kernel when

dispatching system calls.  The SSDT itself is exported in kernel-mode through

the nt!KeServiceDescriptorTable global variable.  This variable contains

information relating to system call tables that have been registered with the

operating system.  In contrast to other operating systems, the Windows kernel

supports the dynamic registration (nt!KeAddSystemServiceTable) of new system

call tables at runtime.  The two most common system call tables are those used

for native and GDI system calls.
In the context of a local kernel-mode backdoor, system calls represent an

obvious target due to the fact that they are implicitly tied to the privilege

boundary that exists between user-mode and kernel-mode.  The act of hooking a

system call handler in kernel-mode makes it possible to expose a privileged

backdoor into the kernel using the operating system's well-defined system call

interface.  Furthermore, hooking system calls makes it possible for the

backdoor to alter data that is seen by user-mode and thus potentially hide its

presence to some degree.
In practice, system calls can be hooked on Windows using two distinct

strategies.  The first strategy involves using generic function hooking

techniques which are described in Section 2.1.1.  The second strategy involves using the

function pointer hooking technique which is described later in this paper.  Using function

pointer hooking involves simply altering the function pointer associated with

a specific system call index by accessing the system call table which contains

the system call that is to be hooked.
The following code shows a very simple illustration of how one might go about

hooking a system call in the native system call table on 32-bit versions of

Windows (system call hooking on 64-bit versions of Windows would require

PatchGuard to be disabled):
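/* Assumed declaration: the first pointer-sized field of the exported
   nt!KeServiceDescriptorTable is the base of the native table */
__declspec(dllimport) PVOID *KeServiceDescriptorTable[];

PVOID HookSystemCall(
  PVOID SystemCallFunction,
  PVOID HookFunction)
{
  /* The index is read from the bytes that follow the first opcode
     byte of the system call stub */
  ULONG SystemCallIndex =
    *(ULONG *)((PCHAR)SystemCallFunction+1);
  PVOID *NativeSystemCallTable =
    KeServiceDescriptorTable[0];
  PVOID OriginalSystemCall =
    NativeSystemCallTable[SystemCallIndex];

  /* This write will fault if the table is marked read-only */
  NativeSystemCallTable[SystemCallIndex] = HookFunction;

  return OriginalSystemCall;
}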
Category: Type I if prologue hook is used.  Type IIa if the function pointer

hook is used.  The SSDT (both native and GDI) should effectively be considered

write-once.
Origin: System call hooking has been used extensively for quite some time.

Since this technique has become so well-known, its actual origins are unclear.

The earliest description the authors could find was from M. B. Jones in a

paper from 1993 entitled Interposition agents: Transparently interposing user

code at the system interface[27].  Jones explains in his section on related work

that he was unable to find any explicit research on the subject of

agent-based interposition prior to his writing.  However, it seems clear that

system calls were being hooked in an ad-hoc fashion far in advance of this

point.  The authors were unable to find many of the papers cited by Jones.

Plaguez appears to be one of the first (Jan, 1998) to publicly illustrate the

usefulness of system call hooking in Linux with a specific eye toward security

in Phrack 52[30].
Capabilities: Kernel-mode code execution.
Considerations: On certain versions of Windows XP, the SSDT is marked as

read-only.  This must be taken into account when attempting to write to the

SSDT across multiple versions of Windows.
Covertness: System call hooks on Windows are very easy to detect.  Comparing

the in-memory SSDTs with the on-disk versions is one of the most common

strategies employed.
2.3) Model-specific Registers
Intel processors support a special category of processor-specific registers

known as Model-specific Registers (MSRs).  MSRs provide software with the

ability to control various hardware and software features.  Unlike other

registers, MSRs are tied to a specific processor model and are not guaranteed

to be supported in future versions of a processor line.  Some of the features

that MSRs offer include enhanced performance monitoring and debugging, among

other things.  Software can read MSRs using the rdmsr instruction and write

MSRs using the wrmsr instruction[23].
This subsection will describe some of the MSRs that may be useful in the

context of a local kernel-mode backdoor.
2.3.1) IA32_SYSENTER_EIP
The Pentium II introduced enhanced support for transitioning between user-mode

and kernel-mode.  This support was provided through the introduction of two

new instructions: sysenter and sysexit.  AMD processors introduced the

comparable syscall and sysret instructions.  When a user-mode application wishes

to transition to kernel-mode, it issues the sysenter instruction.  When the

kernel is ready to return to user-mode, it issues the sysexit instruction.

Unlike the call instruction, the sysenter instruction takes no operands.

Instead, this instruction uses three specific MSRs that are initialized by the

operating system as the target for control transfers[23].
The IA32_SYSENTER_CS (0x174) MSR is used by the processor to set the kernel-mode

CS.  The IA32_SYSENTER_EIP (0x176) MSR contains the virtual address of the

kernel-mode entry point that code should begin executing at once the

transition has completed.  The third MSR, IA32_SYSENTER_ESP (0x175), contains

the virtual address that the stack pointer should be set to.  Of these three

MSRs, IA32_SYSENTER_EIP is the most interesting in terms of its potential for

use in the context of a backdoor.  Setting this MSR to the address of a

function controlled by the backdoor makes it possible for the backdoor to

intercept all system calls after they have trapped into kernel-mode.  This

provides a very powerful vantage point.
For more information on the behavior of the sysenter and sysexit instructions,

the reader should consult both the Intel manuals and John Gulbrandsen's

article[23, 15].
Category: Type I
Origin: This feature is provided for the explicit purpose of allowing an

operating system to control the behavior of the sysenter instruction.  As

such, it is only logical that it can also be applied in the context of a

backdoor.  Kimmo Kasslin mentions a virus from December, 2005 that made use of

MSR hooks[25].  Earlier that year in February, fuzenop from rootkit.com released a

proof of concept[12].
Capabilities: Kernel-mode code execution
Considerations: This technique is restricted by the fact that not all

processors support this MSR.  Furthermore, user-mode processes are not

necessarily required to use it in order to transition into kernel-mode when

performing a system call.  These facts limit the effectiveness of this

technique as it is not guaranteed to work on all machines.
Covertness: Changing the value of the IA32_SYSENTER_EIP MSR can be detected.

For example, PatchGuard currently checks to see if the equivalent AMD64 MSR

has been modified as a part of its polling checks[36].  It is more difficult for

third party vendors to perform this check due to the simple fact that the

default value for this MSR is an unexported symbol named nt!KiFastCallEntry:
kd> rdmsr 176

msr[176] = 00000000`804de6f0

kd> u 00000000`804de6f0

nt!KiFastCallEntry:

804de6f0 b923000000      mov     ecx,23h
Without having symbols, third parties have a more difficult time

distinguishing between a value that is sane and one that is not.
2.4) Page Table Entries
When operating in protected mode, x86 processors support virtualizing the

address space through the use of a feature known as paging.  The paging

feature makes it possible to virtualize the address space by adding a

translation layer between linear addresses and physical addresses.  When paging

is not enabled, linear addresses are equivalent to physical addresses.  To

translate addresses, the processor uses portions of the address being

referenced to index directories and tables that convey flags and physical

address information that describe how the translation should be performed.

The majority of the details on how this translation is performed are outside

of the scope of this document.  If necessary, the reader should consult

section 3.7 of the Intel System Programming Manual[23].  Many other papers in the

references also discuss this topic[41].
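As a small illustration of how portions of an address index the directories and tables, the following C sketch splits a 32-bit linear address into its three components. It assumes non-PAE paging with 4 KB pages (a 10-bit directory index, a 10-bit table index, and a 12-bit page offset); the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t directory_index; /* bits 31..22: selects the PDE */
    uint32_t table_index;     /* bits 21..12: selects the PTE */
    uint32_t page_offset;     /* bits 11..0: byte offset within the page */
} linear_address_parts;

/* Split a linear address for non-PAE paging with 4 KB pages. */
static linear_address_parts split_linear_address(uint32_t linear)
{
    linear_address_parts parts;
    parts.directory_index = linear >> 22;
    parts.table_index = (linear >> 12) & 0x3ffu;
    parts.page_offset = linear & 0xfffu;
    return parts;
}
```

With PAE enabled the split is different (an additional level is introduced and the indexes are narrower), so this sketch does not apply to PAE systems.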
The paging system is particularly interesting due to its potential for abuse

in the context of a backdoor.  When the processor attempts to translate a

linear address, it walks a number of page tables to determine the associated

physical address.  When this occurs, the processor makes a check to ensure

that the task referencing the address has sufficient rights to do so.  This

access check is enforced by checking the User/Supervisor bit of the

Page-Directory Entry (PDE) and Page-Table Entry (PTE) associated with the

page.  If this bit is clear, only the supervisor (privilege level 0) is

allowed to access the page.  If the bit is set, both supervisor and user are

allowed to access the page (this isn't always the case, depending on whether or

not the WP bit is set in CR0).
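The access check described above can be sketched in C as a read-only test over a PDE and PTE pair. This is a simplified model under stated assumptions: 32-bit non-PAE entries, 4 KB pages (the large-page case is ignored), and only the Present and User/Supervisor bits are considered. The macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGING_FLAG_PRESENT 0x1u /* bit 0: entry is valid */
#define PAGING_FLAG_WRITE   0x2u /* bit 1: read/write */
#define PAGING_FLAG_USER    0x4u /* bit 2: user/supervisor */

/* Returns true when both levels are present and both have the
 * User/Supervisor bit set, meaning user-mode may reference the page. */
static bool user_mode_can_access(uint32_t pde, uint32_t pte)
{
    bool present = (pde & PAGING_FLAG_PRESENT) && (pte & PAGING_FLAG_PRESENT);
    bool user = (pde & PAGING_FLAG_USER) && (pte & PAGING_FLAG_USER);
    return present && user;
}
```

The same predicate is what a polling detector would evaluate for pages that are expected to be supervisor-only.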
The implications surrounding this flag should be obvious.  By toggling the

flag in the PDE and PTE associated with an address, a backdoor can gain access

to read or write kernel-mode memory.  This would indirectly make it possible

to gain code execution by making use of one of the other techniques listed in

this document.
Category: Type II
Origin: The modification of PDE and PTE entries has been supported since

hardware paging's inception.  The authors were not able to find an exact

source of the first use of this technique in a backdoor.  There have been a

number of examples in recent years of tools that abuse the supervisor bit in

one way or another[29, 41].  PaX team provided the first documentation of their

PAGEEXEC code in March, 2003.  In January, 1998, Mythrandir mentions the

supervisor bit in phrack 52 but doesn't explicitly call out how it could be

abused[28].
Capabilities: Access to kernel-mode data.
Considerations: Code that attempts to implement this approach would need to

properly support PAE and non-PAE processors on x86 in order to work reliably.

This approach is also extremely dangerous and potentially unreliable depending

on how it interacts with the memory manager.  For example, if pages are not

properly locked into physical memory, they may be pruned and thus any PDE or

PTE modifications would be lost.  This would result in the user-mode process

losing access to a specific page.
Covertness: This approach could be considered fairly covert without the

presence of some tool capable of intercepting PDE or PTE modifications.

Locking pages into physical memory may make it easier to detect in a polling

fashion by walking the set of locked pages and checking to see if their

associated PDE or PTE has been made accessible to user-mode.
2.5) Function Pointers
Function pointers are used extensively by the Windows kernel to indirectly

transfer control of execution from one location to another[18].  Like the

function prologue overwrite described earlier, the act of hooking a function by

altering a function pointer is an easy way to intercept future calls to a

given function.  The difference, however, is that hooking a function by

altering a function pointer will only intercept indirect calls made to the

hooked function through the function pointer.  Though this may seem like a

fairly significant limitation, even these restrictions do not drastically

limit the set of function pointers that can be abused to provide a kernel-mode

backdoor.
The concept itself should be simple enough.  All that's necessary is to modify

the contents of a given function pointer to point at untrusted code.  When the

function is invoked through the function pointer, the untrusted code is

executed instead.  If the untrusted code wishes to be able to call the

function that is being hooked, it can save the address that is stored in the

function pointer prior to overwriting it.  When possible, hooking a function

through a function pointer is a simple and elegant solution that should have

very little impact on the stability of the system (with the obvious exception of

the quality of the replacement function).
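The mechanics can be shown with an ordinary user-mode C program that has nothing Windows-specific in it. In this minimal sketch, all names are hypothetical: g_operation stands in for the function pointer slot, and the replacement forwards to the saved original before altering the result.

```c
#include <stddef.h>

typedef int (*operation_fn)(int);

/* The function that the pointer initially refers to. */
static int original_operation(int x) { return x + 1; }

/* The slot that callers invoke through, and a place to save its old value. */
static operation_fn g_operation = original_operation;
static operation_fn g_saved_operation = NULL;

/* Replacement: calls the saved original, then doubles its result. */
static int replacement_operation(int x)
{
    return g_saved_operation(x) * 2;
}

/* Save the current pointer, then redirect the slot to the replacement. */
static void install_replacement(void)
{
    g_saved_operation = g_operation;
    g_operation = replacement_operation;
}
```

After install_replacement runs, calls made through g_operation reach the replacement, while code that calls original_operation directly is unaffected. This is the limitation noted above: only indirect calls through the pointer are intercepted.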
Regardless of what approach is taken to hook a function, an obvious question

is where the backdoor code associated with a given hook function should be

placed.  There are really only two general memory locations in which the code can

be stored.  It can either be stored in user-mode, which would generally make it

specific to a given process, or kernel-mode, which would make it visible

system wide.  Deciding which of the two locations to use is a matter of

determining the contextual restrictions of the function pointer being

leveraged.  For example, if the function pointer is called through at a raised

IRQL, such as DISPATCH, then it is not possible to store the hook function's

code in pageable memory.  Another example of a restriction is the process

context in which the function pointer is used.  If a function pointer may be

called through in any process context, then there are only a finite number of

locations that the code could be placed in user-mode.  It's important to

understand some of the specific locations that code may be stored in.
Perhaps the most obvious location that can be used to store code that is to

execute in kernel-mode is the kernel pools, such as the PagedPool and

NonPagedPool, which are used to store dynamically allocated memory.  In some

circumstances, it may also be possible to store code in regions of memory that

contain code or data associated with device drivers.  While these few examples

illustrate that there is certainly no shortage of locations in which to store

code, there are a few locations in particular that are worth calling out.
One such location is composed of a single physical page that is shared between

user-mode and kernel-mode.  This physical page is known as SharedUserData and

it is mapped into user-mode as read-only and kernel-mode as read-write.  The

virtual address that this physical page is mapped at is static in both

user-mode (0x7ffe0000) and kernel-mode (0xffdf0000) on all versions of Windows

NT+ (the virtual mappings are no longer executable as of Windows XP SP2;

however, it is entirely possible for a backdoor to alter these page

permissions).  There is also plenty of unused memory within the page that is

allocated for SharedUserData.  The fact that the mapping address is static

makes it a useful location to store small amounts of code without needing to

allocate additional storage from the paged or non-paged pool[24].
Though the SharedUserData mapping is quite useful, there is actually an

alternative location that can be used to store code that is arguably more

covert.  This approach involves overwriting a function pointer with the

address of some code from the virtual mapping of the native DLL, ntdll.dll.

The native DLL is special in that it is the only DLL that is guaranteed to be

mapped into the context of every process, including the System process.  It is

also mapped at the same base address in every process due to assumptions made

by the Windows kernel.  While these are useful qualities, the best reason for

using the ntdll.dll mapping to store code is that doing so makes it possible

to store code in a process-relative fashion.  Understanding how this works in

practice requires some additional explanation.
The native DLL, ntdll.dll, is mapped into the address space of the System

process and subsequent processes during kernel and process initialization,

respectively.  This mapping is performed in kernel-mode by nt!PspMapSystemDll.

One can observe the presence of this mapping in the context of the System

process through a debugger as shown below.  These same basic steps can be

taken to confirm that ntdll.dll is mapped into other processes as well (the

command !vad is used to dump the virtual address descriptor (VAD) tree for a

given process; the tree contains descriptions of memory regions within that

process):
kd> !process 0 0 System

PROCESS 81291660  SessionId: none  Cid: 0004

    Peb: 00000000  ParentCid: 0000

    DirBase: 00039000  ObjectTable: e1000a68

    HandleCount: 256.

    Image: System

kd> !process 81291660

PROCESS 81291660  SessionId: none  Cid: 0004

    Peb: 00000000  ParentCid: 0000

    DirBase: 00039000  ObjectTable: e1000a68

    HandleCount: 256.

    Image: System

    VadRoot 8128f288 Vads 4

...

kd> !vad 8128f288

VAD     level start end   commit

...

81207d98 ( 1) 7c900 7c9af 5 Mapped  Exe

kd> dS poi(poi(81207d98+0x18)+0x24)+0x30

e13591a8  "\WINDOWS\system32\ntdll.dll"
To make use of the ntdll.dll mapping as a location in which to store code, one

must understand the implications of altering the contents of the mapping

itself.  Like all other image mappings, the code pages associated with

ntdll.dll are marked as Copy-on-Write (COW) and are initially shared between

all processes.  When data is written to a page that has been marked with COW,

the kernel allocates a new physical page and copies the contents of the shared

page into the newly allocated page.  This new physical page is then associated

with the virtual page that is being written to.  Any changes made to the new

page are observed only within the context of the process that is making them.

This behavior is why altering the contents of a mapping associated with an

image file does not lead to changes appearing in all process contexts.
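The copy-on-write behavior can be modeled in a few lines of portable C. This is a toy model only, not how the memory manager is implemented: a "process" is reduced to a pointer at the page backing its mapping, the page size is shrunk to 16 bytes, and all names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16

typedef struct {
    unsigned char *page; /* page currently backing this process's mapping */
    int is_private;      /* nonzero once the process owns a private copy */
} process_mapping;

/* Map the shared page into a process; no copy is made at this point. */
static process_mapping map_shared(unsigned char *shared_page)
{
    process_mapping mapping = { shared_page, 0 };
    return mapping;
}

/* Write one byte. The first write copies the shared page into a private
 * page for this process only. Returns 0 on success, -1 on failure. */
static int cow_write(process_mapping *mapping, size_t offset,
                     unsigned char value)
{
    if (offset >= MODEL_PAGE_SIZE)
        return -1;
    if (!mapping->is_private) {
        unsigned char *copy = malloc(MODEL_PAGE_SIZE);
        if (copy == NULL)
            return -1;
        memcpy(copy, mapping->page, MODEL_PAGE_SIZE);
        mapping->page = copy;
        mapping->is_private = 1;
    }
    mapping->page[offset] = value;
    return 0;
}
```

If two mappings are created from the same shared page and only one of them is written to, the other mapping and the shared page keep their original contents, which mirrors the process-relative visibility described above.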
Based on the ability to make process-relative changes to the ntdll.dll

mapping, one is able to store code that will only be used when a function

pointer is called through in the context of a specific process.  When not

called in a specific process context, whatever code exists in the default

mapping of ntdll.dll will be executed.  In order to better understand how this

may work, it makes sense to walk through a concrete example.
In this example, a rootkit has opted to create a backdoor by overwriting the

function pointer that is used when dispatching IRPs using the

IRP_MJ_FLUSH_BUFFERS major function for a specific device object.  The

prototype for the function that handles IRP_MJ_FLUSH_BUFFERS IRPs is shown

below:
NTSTATUS DispatchFlushBuffers(

    IN PDEVICE_OBJECT DeviceObject,

    IN PIRP Irp);
In order to create a context-specific backdoor, the rootkit has chosen to

overwrite the function pointer described above with an address that resides

within ntdll.dll.  By default, the rootkit wants all processes except those

that are aware of the backdoor to simply have a no-operation occur when

IRP_MJ_FLUSH_BUFFERS is sent to the device object.  For processes that are aware

of the backdoor, the rootkit wants arbitrary code execution to occur in

kernel-mode.  To accomplish this, the function pointer should be overwritten

with an address that resides in ntdll.dll that contains a ret 0x8 instruction.

This will simply cause invocations of IRP_MJ_FLUSH_BUFFERS to return (without

completing the IRP).  The location of this ret 0x8 should be in a portion of

code that is rarely executed in user-mode.  For processes that wish to execute

arbitrary code in kernel-mode, it's as simple as altering the code that exists

at the address of the ret 0x8 instruction.  After altering the code, the

process only needs to issue an IRP_MJ_FLUSH_BUFFERS through the FlushFileBuffers

function on the affected device object.  The context-dependent execution of

code is made possible by the fact that, in most cases, IRPs are processed in

the context of the requesting process.
The remainder of this subsection will describe specific function pointers that

may be useful targets for use as backdoors.  The authors have tried to cover

some of the more intriguing examples of function pointers that may be hooked.

Still, it goes without saying that there are many more that have not been

explicitly described.  The authors would be interested to hear about

additional function pointers that have unique and useful properties in the

context of a local kernel-mode backdoor.
2.5.1) Import Address Table
The Import Address Table (IAT) of a PE image is used to store the absolute

virtual addresses of functions that are imported from external PE

images[35].  When a PE image is mapped into virtual memory, the dynamic loader (in

kernel-mode, this is ntoskrnl) takes care of populating the contents of the PE

image's IAT based on the actual virtual address locations of dependent

functions (for the sake of simplicity, bound imports are excluded from this

explanation).  The compiler, in turn, generates code that uses an indirect call

instruction to invoke imported functions.  Each imported function has a

function pointer slot in the IAT.  In this fashion, PE images do not need to

have any preconceived knowledge of where dependent PE images are going to be

mapped in virtual memory.  Instead, this knowledge can be postponed until a

runtime determination is made.
The fundamental step involved in hooking an IAT entry really just boils down

to changing a function pointer.  What distinguishes an IAT hook from other

types of function pointer hooks is the context in which the overwritten

function pointer is called through.  Since each PE image has its own IAT,

any hook that is made to a given IAT will implicitly only affect the

associated PE image.  For example, consider a situation where both foo.sys and

bar.sys import ExAllocatePoolWithTag.  If the IAT entry for

ExAllocatePoolWithTag is hooked in foo.sys, only those calls made from within

foo.sys to ExAllocatePoolWithTag will be affected.  Calls made to the same

function from within bar.sys will be unaffected.  This type of limitation can

actually be a good thing, depending on the underlying motivations for a given

backdoor.
Category: Type I; may legitimately be modified, but should point to expected

values.
Origin: The origin of the first IAT hook is unclear.  In January, 2000, Silvio

described hooking via the ELF PLT which is, in some aspects, functionally

equivalent to the IAT in PE images.
Capabilities: Kernel-mode code execution
Considerations: Assuming the calling restrictions of an IAT hook are

acceptable for a given backdoor, there are no additional considerations that

need to be made.
Covertness: It is possible for modern tools to detect IAT hooks by analyzing

the contents of the IAT of each PE image loaded in kernel-mode.  To detect

discrepancies, a tool need only check to see if the virtual address associated

with each function in the IAT is indeed the same virtual address as exported

by the PE image that contains a dependent function.
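The comparison at the heart of such a tool can be sketched in C over a simplified model. The record layout and function name below are hypothetical, and parsing the import and export directories of the loaded images is assumed to have already produced the two addresses for each import.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;       /* imported function name */
    uintptr_t iat_value;    /* address currently stored in the IAT slot */
    uintptr_t export_value; /* address resolved from the exporting image */
} import_record;

/* Counts IAT slots whose value differs from the exporter's address.
 * If first_mismatch is non-NULL it receives the index of the first
 * differing slot, or count when every slot matches. */
static size_t count_iat_mismatches(const import_record *records, size_t count,
                                   size_t *first_mismatch)
{
    size_t mismatches = 0;
    if (first_mismatch != NULL)
        *first_mismatch = count;
    for (size_t i = 0; i < count; i++) {
        if (records[i].iat_value != records[i].export_value) {
            if (mismatches == 0 && first_mismatch != NULL)
                *first_mismatch = i;
            mismatches++;
        }
    }
    return mismatches;
}
```

A real implementation would likely need to account for cases such as forwarded exports before treating a mismatch as evidence of a hook.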
2.5.2) KiDebugRoutine
The Windows kernel provides an extensive debugging interface to allow the

kernel itself (and third party drivers) to be debugged in a live, interactive

environment (as opposed to after-the-fact, post-mortem crash dump debugging).

This debugging interface is used by a kernel debugger program (kd.exe, or

WinDbg.exe) in order to perform tasks such as inspecting the running state

(including memory, registers, kernel state such as processes and threads, and

the like) of the kernel on-demand.  The debugging interface also provides

facilities for the kernel to report various events of interest to a kernel

debugger, such as exceptions, module load events, debug print output, and a

handful of other state transitions.  As a result, the kernel debugger

interface has ``hooks'' built-in to various parts of the kernel for the

purpose of notifying the kernel debugger of these events.
The far-reaching capabilities of the kernel debugger in combination with the

fact that the kernel debugger interface is (in general) present in a

compatible fashion across all OS builds provides an attractive mechanism that

can be used to gain control of a system.  By subverting KiDebugRoutine to

instead point to a custom callback function, it becomes possible to

surreptitiously gain control at key moments (debug prints, exception

dispatching, kernel module loading are the primary candidates).
The architecture of the kernel debugger event notification interface can be

summed up in terms of a global function pointer (KiDebugRoutine) in the

kernel.  A number of distinct pieces of code, such as the exception dispatcher,

module loader, and so on are designed to call through KiDebugRoutine in order

to notify the kernel debugger of events.  In order to minimize overhead in

scenarios where the kernel debugger is inactive, KiDebugRoutine is typically

set to point to a dummy function, KdpStub, which performs almost no actions

and, for the most part, simply returns immediately to the caller.  However,

when the system is booted with the kernel debugger enabled, KiDebugRoutine may

be set to an alternate function, KdpTrap, which passes the information

supplied by the caller to the remote debugger.
Although enabling or disabling the kernel debugger has traditionally been a

boot-time-only decision, newer OS builds such as Windows Server 2003 and

beyond have some support for transitioning a system from a ``kernel debugger

inactive'' state to a ``kernel debugger active'' state.  As a result, there is

some additional logic now baked into the dummy routine (KdpStub) which can

under some circumstances result in the debugger being activated on-demand.

This results in control being passed to the actual debugger communication

routine (KdpTrap) after an on-demand kernel debugger initialization.  Thus, in

some circumstances, KdpStub will pass control through to KdpTrap.
Additionally, in Windows Server 2003 and later, it is possible to disable the

kernel debugger on the fly.  This may result in KiDebugRoutine being changed

to refer to KdpStub instead of the boot-time-assigned KdpTrap.  This behavior,

combined with the previous points, is meant to show that provided a system is

booted with the kernel debugger enabled it may not be enough to just enforce a

policy that KiDebugRoutine must not change throughout the lifetime of the

system.
Aside from exception dispatching notifications, most debug events find their

way to KiDebugRoutine via interrupt 0x2d, otherwise known as ``DebugService''.

This includes user-mode debug print events as well as kernel mode originated

events (such as kernel module load events).  The trap handler for interrupt

0x2d packages the information supplied to the debug service interrupt into the

format of a special exception that is then dispatched via KiExceptionDispatch

(the normal exception dispatcher path for interrupt-generated exceptions).

This in turn leads to KiDebugRoutine being called as a normal part of the

exception dispatcher's operation.
Category: Type IIa, varies.  Although on previous OS versions KiDebugRoutine

was essentially write-once, recent versions allow limited changes of this

value on the fly while the system is booted.
Origin: At the time of this writing, the authors are not aware of existing

malware using KiDebugRoutine.
Capabilities: Redirecting KiDebugRoutine to point to a caller-controlled

location allows control to be gained during exception dispatching (a very

common occurrence), as well as certain other circumstances (such as module

loading and debug print output).  As an added bonus, because KiDebugRoutine is

integral to the operation of the kernel debugger facility as a whole, it

should be possible to ``filter'' the events received by the kernel debugger by

manipulation of which events are actually passed on to KdpTrap, if a kernel

debugger is enabled.  However, it should be noted that other steps would need

to be taken to prevent a kernel debugger from detecting the presence of code,

such as the interception of the kernel debugger read-memory facilities.
Considerations: Depending on how the system global flags (NtGlobalFlag) are

configured, and whether the system was booted in such a way as to suppress

notification of user mode exceptions to the kernel debugger, exception events

may not always be delivered to KiDebugRoutine.  Also, as KiDebugRoutine is not

exported, it would be necessary to locate it in order to intercept it.

Furthermore, many of the debugger events occur in an arbitrary context, such

that pointing KiDebugRoutine to user mode (except within ntdll space) may be

considered dangerous.  Even while pointing KiDebugRoutine to ntdll, there is

the risk that the system may be brought down as some debugger events may be

reported while the system cannot tolerate paging (e.g. debug prints).  From a

thread-safety perspective, an interlocked exchange on KiDebugRoutine should be

a relatively synchronization-safe operation (however the new callback routine

may never be unmapped from the address space without some means of ensuring

that no callbacks are active).
Covertness: As KiDebugRoutine is a non-exported, writable kernel global, it

has some inherent defenses against simple detection techniques.  However, in

legitimate system operation, there are only two legal values for

KiDebugRoutine: KdpStub, and KdpTrap.  Though neither of these routines is

exported, a combination of detection techniques (such as verifying the

integrity of read only kernel code, and a verification that KiDebugRoutine

refers to a location within an expected code region of the kernel memory

image) may make it easier to locate blatant attacks on KiDebugRoutine.  For

example, simply setting KiDebugRoutine to point to an out-of-kernel location

could be detected with such an approach, as could pointing it elsewhere in the

kernel and then writing to it (either the target location would need to be

outside the normal code region, easily detectable, or normally read-only code

would have to be overwritten, also relatively easily detectable).  Also, all

versions of PatchGuard protect KiDebugRoutine in x64 versions of Windows.

This means that effective exploitation of KiDebugRoutine in the long term on

such systems would require an attacker to deal with PatchGuard.  This is

considered a relatively minor difficulty by the authors.
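The combination of checks described under Covertness can be sketched as a small classifier in C. It assumes the detector has already obtained the current pointer value, the addresses of the two legal routines, and the bounds of the kernel's code region by some other means; all names here are hypothetical.

```c
#include <stdint.h>

typedef enum {
    POINTER_EXPECTED,            /* equals one of the two legal values */
    POINTER_UNEXPECTED_IN_IMAGE, /* inside the code region, but not legal */
    POINTER_OUTSIDE_IMAGE        /* outside the expected code region */
} pointer_verdict;

/* Classify the current value of a pointer that has exactly two legal
 * targets (stub and trap) within a known code region. */
static pointer_verdict classify_debug_routine(uintptr_t current,
                                              uintptr_t stub, uintptr_t trap,
                                              uintptr_t code_base,
                                              uintptr_t code_size)
{
    if (current == stub || current == trap)
        return POINTER_EXPECTED;
    if (current >= code_base && (current - code_base) < code_size)
        return POINTER_UNEXPECTED_IN_IMAGE;
    return POINTER_OUTSIDE_IMAGE;
}
```

The middle verdict corresponds to the case where the pointer has been redirected elsewhere in the kernel image, which would then call for verifying the integrity of the read-only code at that location.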
2.5.3) KTHREAD's SuspendApc
In order to support thread suspension, the Windows kernel includes a KAPC

field named SuspendApc in the KTHREAD structure that is associated with each

thread running on a system.  When thread suspension is requested, the kernel

takes steps to queue the SuspendApc structure to the thread's APC queue.  When

the APC queue is processed, the kernel invokes the APC's NormalRoutine, which

is typically initialized to nt!KiSuspendThread, from the SuspendApc structure

in the context of the thread that is being suspended.  Once nt!KiSuspendThread

completes, the thread is suspended.  The following shows what values the

SuspendApc is typically initialized to:
kd> dt -r1 _KTHREAD 80558c20

...

 +0x16c SuspendApc      : _KAPC

  +0x000 Type           : 18

  +0x002 Size           : 48

  +0x004 Spare0         : 0

  +0x008 Thread         : 0x80558c20 _KTHREAD

  +0x00c ApcListEntry   : _LIST_ENTRY [ 0x0 - 0x0 ]

  +0x014 KernelRoutine  : 0x804fa8a1 nt!KiSuspendNop

  +0x018 RundownRoutine : 0x805139ed nt!PopAttribNop

  +0x01c NormalRoutine  : 0x804fa881 nt!KiSuspendThread

  +0x020 NormalContext  : (null)

  +0x024 SystemArgument1: (null)

  +0x028 SystemArgument2: (null)

  +0x02c ApcStateIndex  : 0 ''

  +0x02d ApcMode        : 0 ''

  +0x02e Inserted       : 0 ''
Since the SuspendApc structure is specific to a given KTHREAD, any

modification made to a thread's SuspendApc.NormalRoutine will affect only that

specific thread.  By modifying the NormalRoutine of the SuspendApc associated

with a given thread, a backdoor can gain arbitrary code execution in

kernel-mode by simply attempting to suspend the thread.  It is trivial for a

user-mode application to trigger the backdoor.  The following sample code

illustrates how a thread might execute arbitrary code in kernel-mode if its

SuspendApc has been modified:
SuspendThread(GetCurrentThread());
The following code gives an example of assembly that implements the technique

described above taking into account the InitialStack insight described in the

considerations below:
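public _RkSetSuspendApcNormalRoutine@4
_RkSetSuspendApcNormalRoutine@4 proc
  assume fs:nothing
  push  edi
  push  esi
  ; Grab the current thread pointer (ecx = 100h, so this reads fs:[124h])
  xor   ecx, ecx
  inc   ch
  mov   esi, fs:[ecx+24h]
  ; Grab KTHREAD.InitialStack (offset 18h) into eax
  lea   esi, [esi+18h]
  lodsd
  xchg  esi, edi
  ; Find StackBase: scan up to 100h dwords for a value equal to InitialStack
  ; (assumes the direction flag is clear).  On a match, edi points just past
  ; StackBase, which is where SuspendApc is assumed to begin.
  repne scasd
  ; Set KTHREAD->SuspendApc.NormalRoutine (KAPC offset 1ch) to the argument;
  ; the previous value is returned in eax
  mov   eax, [esp+0ch]
  xchg  eax, [edi+1ch]
  pop   esi
  pop   edi
  ; The @4 decoration denotes stdcall with one dword argument, so the callee
  ; must pop 4 bytes
  ret   4
_RkSetSuspendApcNormalRoutine@4 endp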
Category: Type IIa
Origin: The authors believe this to be the first public description of this

technique.  Skywing is credited with the idea.  Greg Hoglund mentions abusing

APC queues to execute code, but he does not explicitly call out

SuspendApc[18].
Capabilities: Kernel-mode code execution.
Considerations: This technique is extremely effective.  It provides a simple

way of executing arbitrary code in kernel-mode by simply hijacking the

mechanism used to suspend a specific thread.  There are also some interesting

side effects that are worth mentioning.  Overwriting the SuspendApc's

NormalRoutine makes it so that the thread can no longer be suspended.  Even

better, if the hook function that replaces the NormalRoutine never returns, it

becomes impossible for the thread, and thus the owning process, to be killed

because of the fact that the NormalRoutine is invoked at APC level.  Both of

these side effects are valuable in the context of a rootkit.
One consideration that must be made from the perspective of a backdoor is that

it will be necessary to devise a technique that can be used to locate the

SuspendApc field in the KTHREAD structure across multiple versions of Windows.

Fortunately, there are heuristics that can be used to accomplish this.  In all

versions of Windows analyzed thus far, the SuspendApc field is preceded by the

StackBase field.  It has been confirmed on multiple operating systems that the

StackBase field is equal to the InitialStack field.  The InitialStack field is

located at a reliable offset (0x18) on all versions of Windows checked by the

authors.  Using this knowledge, it is trivial to write some code that scans

the KTHREAD structure on pointer aligned offsets until it encounters a value

that is equal to the InitialStack.  Once a match is found, it is possible to

assume that the SuspendApc immediately follows it.
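The scanning heuristic can be expressed in C against a plain buffer, which is also how a validation tool might locate the field before checking it. The sketch below is an illustration over mock data only: it treats the structure as an array of 32-bit words, takes the 0x18 offset from the text, and uses hypothetical names throughout.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of the InitialStack field when the buffer is viewed as 32-bit words. */
#define INITIAL_STACK_INDEX (0x18 / sizeof(uint32_t))

/* Scans the words that follow InitialStack for one holding the same value
 * (assumed to be StackBase) and returns the byte offset of the field right
 * after it.  Returns 0 if no match is found inside the buffer. */
static size_t find_offset_after_stack_base(const uint32_t *words,
                                           size_t word_count)
{
    if (word_count <= INITIAL_STACK_INDEX + 1)
        return 0;
    uint32_t initial_stack = words[INITIAL_STACK_INDEX];
    for (size_t i = INITIAL_STACK_INDEX + 1; i + 1 < word_count; i++) {
        if (words[i] == initial_stack)
            return (i + 1) * sizeof(uint32_t);
    }
    return 0;
}
```

For a mock buffer in which the word at offset 0x168 duplicates the word at offset 0x18, the function returns 0x16c, which matches the SuspendApc offset shown in the debugger output earlier in this section. The heuristic can misfire if an unrelated field between the two happens to hold the same value.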
Covertness: This technique involves overwriting a function pointer in a

dynamically allocated region of memory that is associated with a specific

thread.  This makes the technique fairly covert, but not impossible to detect.

One method of detecting this technique would be to enumerate the threads in

each process to see if the NormalRoutine of the SuspendApc is set to the

expected value of nt!KiSuspendThread.  It would be challenging for someone

other than Microsoft to implement this safely.  The authors are not aware of

any tool that currently does this.
2.5.4) Create Thread Notify Routine
The Windows kernel provides drivers with the ability to register a callback

that will be notified when threads are created and terminated.  This ability

is provided through the Windows Driver Model (WDM) export

nt!PsSetCreateThreadNotifyRoutine.  When a thread is created or terminated,

the kernel enumerates the list of registered callbacks and notifies them of

the event.
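The register-then-enumerate pattern can be modeled in portable C as follows. This is a user-mode toy model, not the kernel's implementation: the callback signature is simplified, the fixed capacity of 8 is an arbitrary assumption, and every name is hypothetical. A sample callback that simply counts events is included so the model can be exercised.

```c
#include <stddef.h>

#define MODEL_MAX_CALLBACKS 8

typedef void (*thread_notify_fn)(unsigned process_id, unsigned thread_id,
                                 int create);

static thread_notify_fn g_callbacks[MODEL_MAX_CALLBACKS];
static size_t g_callback_count = 0;

/* Sample callback that counts creation and termination events. */
static int g_create_events = 0;
static int g_exit_events = 0;
static void counting_callback(unsigned process_id, unsigned thread_id,
                              int create)
{
    (void)process_id;
    (void)thread_id;
    if (create)
        g_create_events++;
    else
        g_exit_events++;
}

/* Register a callback; returns 0 on success, -1 if fn is NULL or full. */
static int register_thread_notify(thread_notify_fn fn)
{
    if (fn == NULL || g_callback_count >= MODEL_MAX_CALLBACKS)
        return -1;
    g_callbacks[g_callback_count++] = fn;
    return 0;
}

/* Called by the model when a thread is created (create != 0) or
 * terminated (create == 0); notifies every registered callback in order. */
static void notify_thread_event(unsigned process_id, unsigned thread_id,
                                int create)
{
    for (size_t i = 0; i < g_callback_count; i++)
        g_callbacks[i](process_id, thread_id, create);
}
```

Because every registered routine is invoked for every event, a routine added by a backdoor is indistinguishable from a legitimate one at the point of dispatch, which is the basis for the covertness discussion of this technique.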
Category: Type II
Origin: The ability to register a callback that is notified when threads are

created and terminated has been included since the first release of the WDM.
Capabilities: Kernel-mode code execution.
Considerations: This technique is useful because a user-mode process can

control the invocation of the callback by simply creating or terminating a

thread.  Additionally, the callback will be notified in the context of the

process that is creating or terminating the thread.  This makes it possible to

set the callback routine to an address that resides within ntdll.dll.
Covertness: This technique is covert in that it is possible for a backdoor to

blend in with any other registered callbacks.  Without having a known-good

state to compare against, it would be challenging to conclusively state that a

registered callback is associated with a backdoor.  There are some indicators

that could suggest that something is odd, such as if the callback routine

resides in ntdll.dll or if it resides in either the paged or non-paged pool.
2.5.5) Object Type Initializers
The Windows NT kernel uses an object-oriented approach to representing

resources such as files, drivers, devices, processes, threads, and so on.

Each object is categorized by an object type.  This object type categorization

provides a way for the kernel to support common actions that should be applied

to objects of the same type, among other things.  Under this design, each

object is associated with only one object type.  For example, process objects

are associated with the nt!PsProcessType object type.  The structure used to

represent an object type is the OBJECT_TYPE structure which contains a nested

structure named OBJECT_TYPE_INITIALIZER.  It's this second structure that

provides some particularly interesting fields that can be used in a backdoor.
As one might expect, the fields of most interest are function pointers.  These

function pointers, if non-null, are called by the kernel at certain points

during the lifetime of an object that is associated with a particular object

type.  The following debugger output shows the function pointer fields:
kd> dt nt!_OBJECT_TYPE_INITIALIZER

...

   +0x02c DumpProcedure    : Ptr32

   +0x030 OpenProcedure    : Ptr32

   +0x034 CloseProcedure   : Ptr32

   +0x038 DeleteProcedure  : Ptr32

   +0x03c ParseProcedure   : Ptr32

   +0x040 SecurityProcedure : Ptr32

   +0x044 QueryNameProcedure : Ptr32

   +0x048 OkayToCloseProcedure : Ptr32
Two fairly easy to understand procedures are OpenProcedure and CloseProcedure.

These function pointers are called when an object of a given type is opened

and closed, respectively.  This gives the object type initializer a chance to

perform some common operation on an instance of an object type.  In the case

of a backdoor, this exposes a mechanism through which arbitrary code could be

executed in kernel-mode whenever an object of a given type is opened or

closed.
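The way per-type procedures are consulted can be illustrated with a reduced C model. Only two of the procedure slots are modeled, the signatures are simplified, and all names are hypothetical; the point is just that a non-null pointer stored in the type is called for every object of that type.

```c
#include <stddef.h>

typedef void (*object_procedure_fn)(void *object);

/* Reduced stand-in for the nested initializer structure. */
typedef struct {
    object_procedure_fn open_procedure;
    object_procedure_fn close_procedure;
} type_initializer_model;

/* Reduced stand-in for an object type. */
typedef struct {
    const char *name;
    type_initializer_model type_info;
} object_type_model;

/* Sample procedures that count how often they are invoked. */
static int g_open_calls = 0;
static int g_close_calls = 0;
static void count_open(void *object) { (void)object; g_open_calls++; }
static void count_close(void *object) { (void)object; g_close_calls++; }

/* Opening an object calls the type's OpenProcedure if one is set. */
static void model_open_object(const object_type_model *type, void *object)
{
    if (type->type_info.open_procedure != NULL)
        type->type_info.open_procedure(object);
}

/* Closing an object calls the type's CloseProcedure if one is set. */
static void model_close_object(const object_type_model *type, void *object)
{
    if (type->type_info.close_procedure != NULL)
        type->type_info.close_procedure(object);
}
```

A validation tool working against a known-good state would compare each of these slots, per object type, with the values recorded on a clean system.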
Category: Type IIa
Origin: Matt Conover gave an excellent presentation on how object type

initializers can be used to detect rootkits at XCon 2005[8].  Conversely, they

can also be used to backdoor the system.  The authors are not aware of public

examples prior to Conover's presentation.  Greg Hoglund also mentions this

type of approach[18] in June, 2006.
Capabilities: Kernel-mode code execution.
Considerations: There are no unique considerations involved in the use of this

technique.
Covertness: This technique can be detected by tools designed to validate the

state of object type initializers against a known-good state.  Currently, the

authors are not aware of any tools that perform this type of check.
2.5.6) PsInvertedFunctionTable
With the introduction of Windows for x64, significant changes were made to how

exceptions are processed with respect to how exceptions operate in x86

versions of Windows.  On x86 versions of Windows, exception handlers were

essentially demand-registered at runtime by routines with exception handlers

(more of a code-based exception registration mechanism).  On x64 versions of

Windows, the exception registration path is accomplished using a more

data-driven model.  Specifically, exception handling (and especially unwind

handling) is now driven by metadata attached to each PE image (known as the

``exception directory''), which describes the relationship between routines

and their exception handlers, what the exception handler function pointer(s)

for each region of a routine are, and how to unwind each routine's machine

state in a completely data-driven fashion.
While there are significant advantages to having exception and unwind

dispatching accomplished using a data-driven model, there is a potential

performance penalty over the x86 method (which consisted of a linked list of

exception and unwind handlers registered at a known location,