a simple app embedded CPU emulator to intercept all memory reads/writes

Hello,

I need to classify, to some degree, and analyze all the data allocated
and used by one program at run time.
The application is a large one and comes in binary form only
(Windows XP+), so no source code is available.

Typically, about 20K-60K dynamic (that is, heap) allocations
exist at any time.
Most of them are C++ objects.

I have succeeded in injecting a DLL into the target process at
startup, so I can overwrite
the program's first to-be-executed instructions with a jump to a
custom routine: the entry point
of the CPU emulator.

Why do I think I need an emulator here? Interpreting every
instruction in the execution flow with a
CPU simulator lets me easily track all memory read/write operations,
and this is exactly what
I need to analyze all of the program's dynamically allocated data blocks.
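
The idea can be illustrated with a deliberately tiny interpreter. This
is only a sketch under loud assumptions: the machine below is a
hypothetical toy (4 registers, 256 bytes of memory, fixed 3-byte
instructions), NOT x86, and the names Cpu, mem_read, mem_write and run
are invented for illustration. What it shows is the structure: every
data access goes through two hook functions, so logging or classifying
accesses happens in one place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical toy instruction set (not x86). Each instruction is
 * three bytes: opcode, operand a, operand b. */
enum { OP_HALT, OP_LOADI, OP_LOAD, OP_STORE, OP_ADD };

typedef struct {
    uint8_t  reg[4];        /* general-purpose registers */
    uint8_t  mem[256];      /* flat data memory */
    size_t   pc;            /* byte offset of the next instruction */
    unsigned reads, writes; /* counters maintained by the hooks */
} Cpu;

/* Every data read performed by the guest passes through here. */
static uint8_t mem_read(Cpu *c, uint8_t addr) {
    c->reads++;
    printf("read  [%3u] -> %u\n", (unsigned)addr, (unsigned)c->mem[addr]);
    return c->mem[addr];
}

/* Every data write performed by the guest passes through here. */
static void mem_write(Cpu *c, uint8_t addr, uint8_t value) {
    c->writes++;
    printf("write [%3u] <- %u\n", (unsigned)addr, (unsigned)value);
    c->mem[addr] = value;
}

/* Fetch, decode and execute until OP_HALT or the end of the code. */
static void run(Cpu *c, const uint8_t *code, size_t len) {
    while (c->pc + 3 <= len) {
        uint8_t op = code[c->pc];
        uint8_t a  = code[c->pc + 1];
        uint8_t b  = code[c->pc + 2];
        c->pc += 3;
        switch (op) {
        case OP_HALT:  return;
        case OP_LOADI: c->reg[a & 3] = b;                     break;
        case OP_LOAD:  c->reg[a & 3] = mem_read(c, b);        break;
        case OP_STORE: mem_write(c, b, c->reg[a & 3]);        break;
        case OP_ADD:   c->reg[a & 3] =
                           (uint8_t)(c->reg[a & 3] + c->reg[b & 3]);
                       break;
        default:       assert(!"unknown opcode");             return;
        }
    }
}
```

Note that instruction fetches bypass the hooks here; a real x86
interpreter would also have to decode variable-length instructions,
prefixes and addressing modes, which this sketch does not attempt.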

So I need to implement a simple CPU instruction simulator. Currently
I am at the very beginning
and trying to predict the difficulties I will face and the hard
things I will have to deal with. I am
going to list them in the numbered list below with some general
thoughts; please comment on them.

1) The majority of CPU instructions affect only the CPU state (its
registers) and the memory
state (on a memory write). I see no problems in emulating them.

2) Then there are special instructions like "out" which I absolutely
can't emulate, so I should execute them natively.
Some instructions expect certain registers to be initialized with
input arguments, so I have to
temporarily switch my emulator to 'native mode', set up the input
registers, execute the instruction as-is
(hoping there are no side effects such as a change of EIP), and then
switch back to emulation mode. Are
there any problems with such an approach?

  2.1) What about instructions which must be executed natively and
       update EIP as a result? Do
       such instructions exist, and if so, how can they be handled?

3) The application runs multiple threads, so I should hook the thread
start routine (Windows: CreateThread)
to point it to the CPU emulator instead. Currently I see no difficulties
with this.
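
For points 2 and 2.1, one possible way to organize the decision is a
small classifier that looks at the opcode bytes before dispatching.
This is a minimal sketch, not a complete list: the three categories and
the names InsnClass and classify are assumptions made for illustration,
prefix bytes are not handled, and which instructions really need native
execution depends on the emulator. The opcode values themselves are the
standard x86 encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    INSN_EMULATE,     /* interpreted entirely in software */
    INSN_NATIVE,      /* run as-is with guest registers loaded; the
                         instruction simply falls through to the next */
    INSN_NATIVE_XFER  /* run natively, but control leaves the emulator
                         (for example into the kernel), so the return
                         point must be arranged beforehand */
} InsnClass;

/* Classify the instruction starting at p; avail is the number of
 * readable bytes. Assumes no prefix bytes precede the opcode. */
static InsnClass classify(const uint8_t *p, size_t avail) {
    assert(p != NULL);
    if (avail == 0)
        return INSN_EMULATE;

    switch (p[0]) {
    case 0xE4: case 0xE5: case 0xEC: case 0xED:   /* IN  */
    case 0xE6: case 0xE7: case 0xEE: case 0xEF:   /* OUT */
        return INSN_NATIVE;
    case 0xCC:                                    /* INT3     */
    case 0xCD:                                    /* INT imm8 */
        return INSN_NATIVE_XFER;
    case 0x0F:                                    /* two-byte opcodes */
        if (avail < 2)
            return INSN_EMULATE;
        switch (p[1]) {
        case 0x31:                                /* RDTSC    */
        case 0xA2:                                /* CPUID    */
            return INSN_NATIVE;
        case 0x34:                                /* SYSENTER */
            return INSN_NATIVE_XFER;
        default:
            break;
        }
        break;
    default:
        break;
    }
    return INSN_EMULATE;
}
```

The third category corresponds to question 2.1: software interrupts and
SYSENTER hand control to the operating system, so the emulator cannot
just single-step over them and expect EIP to land on the next
instruction without some preparation.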

Have I overlooked anything else? Your input is highly
welcome!

Thank you, Dmitry

p.s. sorry for my somewhat bad English - I am doing my best with it

Reply spamtrap2 (1628) 3/8/2007 10:38:35 PM

"dmitry sychov" <spamtrap@crayne.org> wrote in message 
news:1173393515.178068.281320@8g2000cwh.googlegroups.com...
> Hello,
>
> I need to classify, to some degree, and analyze all the data allocated
> and used by one program at run time.
> The application is a large one and comes in binary form only
> (Windows XP+), so no source code is available.
>
<snip>

>
> Have I overlooked anything else? Your input is highly
> welcome!
>

if I were attempting something like this, here is what I would do:
I would write a custom launcher (like, an app haxorer);
this launcher starts up, and then uses a custom loader to load the app into 
memory;
at this point, it would attempt to identify and patch certain 
functions (it is very helpful if the exe still has a symbol 
table, ...);
the app is then run, with the patched functions.

now, what gets patched depends on what is being done.

for analyzing memory usage, likely one would want to patch the memory 
allocation functions (malloc, free, realloc, ...). the patched versions 
could then allocate from a specially prepared heap, likely guarded by a 
write barrier (set the pages inaccessible, and handle 
accesses via a custom pagefault handler).
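
a rough sketch of the bookkeeping half of this, in plain C. the names 
(Block, tracked_malloc, tracked_free, find_block) are made up for 
illustration, and it assumes the patched entry points simply forward to 
the real allocator; the page-protection part and the fault handler are 
left out. the point is only that a table of live blocks lets a fault 
handler map a faulting address back to the allocation it belongs to:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One record per live allocation (hypothetical layout). */
typedef struct Block {
    void         *base;  /* start address returned to the app */
    size_t        size;  /* requested size in bytes */
    unsigned long hits;  /* accesses attributed to this block */
    struct Block *next;
} Block;

static Block *live_blocks = NULL;

/* Replacement for malloc: allocate, then record the block. */
void *tracked_malloc(size_t size) {
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    Block *b = malloc(sizeof *b);
    if (b == NULL) {
        free(p);
        return NULL;
    }
    b->base = p;
    b->size = size;
    b->hits = 0;
    b->next = live_blocks;
    live_blocks = b;
    return p;
}

/* Replacement for free: drop the record (if any), then free. */
void tracked_free(void *p) {
    if (p == NULL)
        return;
    for (Block **pp = &live_blocks; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->base == p) {
            Block *dead = *pp;
            *pp = dead->next;
            free(dead);
            break;
        }
    }
    free(p);
}

/* Map any address inside a live block back to its record,
 * or return NULL if the address is not in a tracked block. */
Block *find_block(const void *addr) {
    uintptr_t a = (uintptr_t)addr;
    for (Block *b = live_blocks; b != NULL; b = b->next) {
        uintptr_t lo = (uintptr_t)b->base;
        if (a >= lo && a - lo < b->size)
            return b;
    }
    return NULL;
}
```

a linked list is obviously too slow for 20K-60K live blocks if every 
access is looked up; a sorted array or a tree keyed by base address 
would be the likely replacement.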

and so on...

this could be much less work to implement than really any kind of emulator, 
but, depending on what you are doing, it may or may not be fine-grained or 
accurate enough.


a similar technique has been used in garbage collectors before as a means of 
allowing the app to continue running during garbage collection (well, I have 
typically always used non-incremental mark/sweep collectors, but yeah...).

dunno if this helps any.


> Thank you, Dmitry
>
> p.s. sorry for my somewhat bad English - I am doing my best with it
>


Reply cr88192 3/9/2007 9:19:37 AM

