C decompiler/disassembler?


I recently downloaded a program (*.EXE) written in C, and I discovered
it had a few bugs. The original author of the program is apparently
unreachable, so I figure it falls on me to try and crack into the
program and fix the bugs. Does anyone know of a good C decompiler or
disassembler I can use to do this, that will read an EXE program as
input and generate its source code?

Brandon Taylor
Reply DMn2004404 7/26/2010 8:21:33 PM

DMn2004404 <dmn2004404@gmail.com> writes:
> I recently downloaded a program (*.EXE) written in C, and I discovered
> it had a few bugs. The original author of the program is apparently
> unreachable, so I figure it falls on me to try and crack into the
> program and fix the bugs. Does anyone know of a good C decompiler or
> disassembler I can use to do this, that will read an EXE program as
> input and generate its source code?

It's theoretically possible, given an executable file, to generate
C source code that will at least have the same behavior, if not
re-generate an identical executable if compiled.  Do not expect
any such generated C source code to be readable or maintainable.

It's not theoretically possible to regenerate the original C
source file from an executable.  Too much information, including
most identifiers, is discarded during compilation.  It's like
unscrambling an egg.
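
A minimal sketch of the point about discarded identifiers, assuming a typical optimizing compiler. The names `sum_prices`, `total_t`, and `sub_401000` are invented for illustration. Both functions compute the same sum, and a compiler may well emit the same or nearly the same machine code for each, so nothing in the executable says which one the author wrote.

```c
#include <stddef.h>

/* What the original author might have written. */
typedef long total_t;

total_t sum_prices(const int *prices, size_t count)
{
    total_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += prices[i];
    return total;
}

/* What a decompiler might plausibly hand back: invented names,
 * no typedef, and a different loop form with the same behavior. */
long sub_401000(const int *a1, size_t a2)
{
    long v1 = 0;
    const int *v2 = a1;
    while (a2-- != 0)
        v1 += *v2++;
    return v1;
}
```

Calling either function on the same array gives the same result; only the readability differs.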

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Nokia
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
Reply kst-u (21474) 7/26/2010 10:21:07 PM


Keith Thompson wrote:
> DMn2004404 <dmn2004404@gmail.com> writes:
>> I recently downloaded a program (*.EXE) written in C, and I discovered
>> it had a few bugs. The original author of the program is apparently
>> unreachable, so I figure it falls on me to try and crack into the
>> program and fix the bugs. Does anyone know of a good C decompiler or
>> disassembler I can use to do this, that will read an EXE program as
>> input and generate its source code?
> 
> It's theoretically possible, given an executable file, to generate
> C source code that will at least have the same behavior, if not
> re-generate an identical executable if compiled.  Do not expect
> any such generated C source code to be readable or maintainable.

Let me see if I can narrow the question to be topical and practical.
[my browser only sometimes lets me paste as quotation; how annoying:]
> 
> 
> #include <stdio.h>
> #include <math.h>
> 
> // C will be caller
> 
> int main() 
> {
> 
> double d, pi;
> d = atan(1.0);
> pi = (4.0)*d;
> printf("pi is %f\n", pi);
> 
> return 0;
> }
> 
> // gcc  -std=c99 -Wall -Wextra c_pi1.c -o out

If you were to pick apart the executable resulting from *this* program, 
with access to the same compiler and running on the same platform, how 
would you go about it?
> 
> It's not theoretically possible to regenerate the original C
> source file from an executable.  Too much information, including
> most identifiers, is discarded during compilation.  It's like
> unscrambling an egg.
> 

and there's an infinite number of programs that can produce the same 
executable.
-- 
Uno
Reply Uno 7/26/2010 10:47:30 PM

On 7/26/2010 6:21 PM, Keith Thompson wrote:
> DMn2004404<dmn2004404@gmail.com>  writes:
[...]
>> program and fix the bugs. Does anyone know of a good C decompiler or
>> disassembler I can use to do this, that will read an EXE program as
>> input and generate its source code?
[...]
> It's not theoretically possible to regenerate the original C
> source file from an executable.  Too much information, including
> most identifiers, is discarded during compilation.  It's like
> unscrambling an egg.

I believe the expression is something like "you can't turn hamburger back 
into a cow."  :-)

-- 
Kenneth Brody
Reply kenbrody (1860) 7/27/2010 1:07:44 PM

On 7/27/2010 9:07 AM, Kenneth Brody wrote:
> On 7/26/2010 6:21 PM, Keith Thompson wrote:
>> DMn2004404<dmn2004404@gmail.com> writes:
> [...]
>>> program and fix the bugs. Does anyone know of a good C decompiler or
>>> disassembler I can use to do this, that will read an EXE program as
>>> input and generate its source code?
> [...]
>> It's not theoretically possible to regenerate the original C
>> source file from an executable. Too much information, including
>> most identifiers, is discarded during compilation. It's like
>> unscrambling an egg.
>
> I believe the expression is something like "you can't turn hamburger
> back into a cow." :-)
>

http://www.itee.uq.edu.au/~cristina/dcc.html

From the docs: "Dcc has a fundamental implementation flaw that limits 
it to about 30KB of input binary program, i.e. it currently handles toy 
programs only!  The problem is that pointers are kept in many places; 
many of these pointers point to elements of arrays. The arrays are all 
of variable size; the realloc system call can and will change the 
virtual addresses of these arrays, thus invalidating the pointers. 
Because of this, results are unpredictable as soon as one array is resized."





-- 
Billy Mays
http://www.jpgdump.com <- My attempt at humor.
Reply noway17 (176) 7/27/2010 1:25:03 PM

>I believe the expression is something like "you can't turn hamburger back 
>into a cow."  :-)

That operation has been broken (in the sense of "breaking the code"),
given the existence of DNA, and assuming you're not talking about
*cooked* hamburger (which, as I understand it from various "CSI"
TV shows, destroys the DNA).  Of course, it's still more economical
to raise another cow rather than attempting to clone one.

Executables do not contain anywhere near the hints provided by DNA.
You may or may not lose:
	- auto variable names (if no debug symbols)
	- types (if no debug symbols) (if ints and longs are 4 bytes
	  each, you can't tell which was intended, and then port
	  the program to a system where that isn't true).  You also
	  can't tell for sure whether loading 2 bytes of variable
	  is due to the variable being two bytes, or it's an
	  optimization of variable&65535.
	- structures.  It's difficult to tell whether two variables
	  close to each other are part of the same structure or just
	  different variables
	- function names (if no symbols)
	- Macros.  Calls to things like getc() may expand to weird stuff.
	  Also, the constant 1 might be SEEK_CUR, EXIT_FAILURE, SIGHUP,
	  SIG_IGN, or something else.
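
Two of the items above can be sketched in C. This is only an illustration: the function names are made up, and whether SEEK_CUR and EXIT_FAILURE really share a value depends on the platform (both are 1 on many common ones, but the standard does not promise it).

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Keeps the low 16 bits with an explicit mask. */
unsigned long low16_by_mask(unsigned long value)
{
    return value & 65535UL;
}

/* Keeps the low 16 bits by storing into a 16-bit variable.
 * Conversion to an unsigned type is reduction modulo 65536,
 * so the result matches the mask above. */
unsigned long low16_by_type(unsigned long value)
{
    uint16_t narrow = (uint16_t)value;
    return narrow;
}

/* Two unrelated macros; returns nonzero if they happen to share
 * a value on this platform, in which case the compiled code
 * cannot reveal which one the source used. */
int seek_cur_looks_like_exit_failure(void)
{
    return SEEK_CUR == EXIT_FAILURE;
}
```

Someone reading the disassembly sees a 16-bit load or a literal 1 either way; the intent behind it is gone.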

If you had any idea about decompiling the program so you can make
changes, or so you can port it to another platform, expect to put
in a lot of effort.  Re-writing from scratch after determining the
behavior of the existing program might be more economical.

Reply gordonb 7/27/2010 7:46:32 PM

On 2010-07-27, Gordon Burditt <gordonb.trd82@burditt.org> wrote:
>>I believe the expression is something like "you can't turn hamburger back 
>>into a cow."  :-)
>
> That operation has been broken (in the sense of "breaking the code"),
> given the existence of DNA, and assuming you're not talking about
> *cooked* hamburger (which, as I understand it from various "CSI"
> TV shows, destroys the DNA).  Of course, it's still more economical
> to raise another cow rather than attempting to clone one.

You also do not get the same cow that you had before since the new cow will
not have the memories and experiences that the old cow did.  You will
likely get something that is quite similar with similar tendencies; but,
some things are also likely to be different.

> Executables do not contain anywhere near the hints provided by DNA.
> You may or may not lose:
> 	- auto variable names (if no debug symbols)
> 	- types (if no debug symbols) (if ints and longs are 4 bytes
> 	  each, you can't tell which was intended, and then port
> 	  the program to a system where that isn't true).  You also
> 	  can't tell for sure whether loading 2 bytes of variable
> 	  is due to the variable being two bytes, or it's an
> 	  optimization of variable&65535.
> 	- structures.  It's difficult to tell whether two variables
> 	  close to each other are part of the same structure or just
> 	  different variables
> 	- function names (if no symbols)
> 	- Macros.  Calls to things like getc() may expand to weird stuff.
> 	  Also, the constant 1 might be SEEK_CUR, EXIT_FAILURE, SIGHUP,
> 	  SIG_IGN, or something else.

Indeed, there is no such thing as a decompiler; but there are
disassemblers, used for reverse engineering, that can provide guesses at
information that is missing from the binary.  Depending on how optimized the
program is, they can sometimes identify things like loop constructs and
notice data that is always used and manipulated together.  The generated
assembly code includes the detected meta-information in its comments, and
the tool automatically generates placeholder variable names for whatever
types it is able to detect, which you can search and replace as you figure
out their functionality.  They still are not going to get you anywhere near
the original C source; but the information that they provide makes it
much easier to figure out what a program is doing and how.
Reply usernet1 (158) 7/27/2010 9:35:48 PM

On 2010-07-26, DMn2004404 <dmn2004404@gmail.com> wrote:
> I recently downloaded a program (*.EXE) written in C, and I discovered
> it had a few bugs. The original author of the program is apparently
> unreachable, so I figure it falls on me to try and crack into the
> program and fix the bugs. Does anyone know of a good C decompiler or
> disassembler I can use to do this, that will read an EXE program as
> input and generate its source code?

1. If you know how to do what the program does, it will be *much* easier
	simply to write it from scratch, as you are unlikely to be able to
	get the original source back through disassembly.

2. If you don't know how the program works and you cannot find any
	documentation that allows you to implement the functionality
	yourself, then as a very last resort, you can try to reverse
	engineer the program.  There are disassemblers that can try to
	guess about some of the structure and data of the program. (As well
	as packet sniffers, hex data viewers, system call tracers, etc that
	help you infer what the program is doing without having to decode
	the actual source.)  These tools fall under the domain of reverse
	engineering and you will be better off searching for reverse
	engineering tools and asking in reverse engineering groups
	directly.  You might find some information in security groups as
	finding security vulnerabilities often entails reverse
	engineering.
Reply usernet1 (158) 7/27/2010 9:49:14 PM

On 26 July, 23:47, Uno <merrilljen...@q.com> wrote:
> Keith Thompson wrote:
> > DMn2004404 <dmn2004...@gmail.com> writes:
>
>
> >> I recently downloaded a program (*.EXE) written in C, and I discovered
> >> it had a few bugs. The original author of the program is apparently
> >> unreachable, so I figure it falls on me to try and crack into the
> >> program and fix the bugs. Does anyone know of a good C decompiler or
> >> disassembler I can use to do this, that will read an EXE program as
> >> input and generate its source code?

nothing will generate "its source code". Compilation is information
lossy. You may be able to generate "some corresponding source code",
hopefully of a readable nature.

> > It's theoretically possible, given an executable file, to generate
> > C source code that will at least have the same behavior, if not
> > re-generate an identical executable if compiled.  Do not expect
> > any such generated C source code to be readable or maintainable.

<snip>

> > #include <stdio.h>
> > #include <math.h>
>
> > // C will be caller
>
> > int main()
> > {
>
> > double d, pi;
> > d = atan(1.0);
> > pi = (4.0)*d;
> > printf("pi is %f\n", pi);
>
> > return 0;
> > }
>
> > // gcc  -std=c99 -Wall -Wextra c_pi1.c -o out
>
> If you were to pick apart the executable resulting from *this* program,
> with access to the same compiler and running on the same platform, how
> would you go about it?

I fail to see your point. Do you mean "if I were the writer of a
decompiler how would I pick apart the executable"? Are you expecting a
lot of optimisation or something?

What would be wrong with

#include <stdio.h>

int main (void)
{
    printf ("pi is %f\n", 3.14159265);
    return 0;
}

or is spotting #include <stdio.h> too smart for a hypothetical
decompiler?
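
A small sketch of why the shorter program would be an acceptable decompilation, assuming the only thing that matters is the printed output. The helper name `format_pi` is invented here. With `%f` printing six decimal places, the truncated constant and a fuller approximation of 4*atan(1) produce the same text.

```c
#include <stdio.h>

/* Formats a value the way the program in this thread prints it.
 * Returns the number of characters snprintf would have written. */
int format_pi(double value, char *out, size_t size)
{
    return snprintf(out, size, "pi is %f\n", value);
}
```

Passing 3.14159265 or 3.14159265358979 yields the same string, so an observer of the program's output cannot tell the two sources apart.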

> > It's not theoretically possible to regenerate the original C
> > source file from an executable.  Too much information, including
> > most identifiers, is discarded during compilation.  It's like
> > unscrambling an egg.
>
> and there's an infinite number of programs that can produce the same
> executable.

so what?


--

"High Integrity Software: The SPARK Approach to Safety and Security"
Customers interested in this title may also be interested in:
"Windows XP Home"
                        (Amazon)
Reply nick_keighley_nospam (4574) 7/28/2010 7:31:30 AM

Nick Keighley wrote:
> On 26 July, 23:47, Uno <merrilljen...@q.com> wrote:
>> Keith Thompson wrote:
>>> DMn2004404 <dmn2004...@gmail.com> writes:
>>
>>>> I recently downloaded a program (*.EXE) written in C, and I discovered
>>>> it had a few bugs. The original author of the program is apparently
>>>> unreachable, so I figure it falls on me to try and crack into the
>>>> program and fix the bugs. Does anyone know of a good C decompiler or
>>>> disassembler I can use to do this, that will read an EXE program as
>>>> input and generate its source code?
> 
> nothing will generate "its source code".

...unless it's a quine.

-- 
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
Reply Richard 7/28/2010 7:32:20 AM

On 27 July, 20:46, gordonb.tr...@burditt.org (Gordon Burditt) wrote:
>
>
> >I believe the expression is something like "you can't turn hamburger back
> >into a cow."  :-)
>
> That operation has been broken (in the sense of "breaking the code"),
> given the existence of DNA,

you might be surprised at the limitations of cloning...


> [...]  Of course, it's still more economical
> to raise another cow rather than attempting to clone one.
>
> Executables do not contain anywhere near the hints provided by DNA.

I think you might be surprised how little there is in the DNA!

Consider CC the cloned cat. Her genetic donor was a calico, but CC was
marked completely differently from the donor. This is because coat
markings are partly epigenetic.


> You may or may not lose:
>         - auto variable names (if no debug symbols)
>         - types (if no debug symbols) (if ints and longs are 4 bytes
>           each, you can't tell which was intended, and then port
>           the program to a system where that isn't true).  You also
>           can't tell for sure whether loading 2 bytes of variable
>           is due to the variable being two bytes, or it's an
>           optimization of variable&65535.
>         - structures.  It's difficult to tell whether two variables
>           close to each other are part of the same structure or just
>           different variables
>         - function names (if no symbols)
>         - Macros.  Calls to things like getc() may expand to weird stuff.
>           Also, the constant 1 might be SEEK_CUR, EXIT_FAILURE, SIGHUP,
>           SIG_IGN, or something else.
>
> If you had any idea about decompiling the program so you can make
> changes, or so you can port it to another platform, expect to put
> in a lot of effort.  Re-writing from scratch after determining the
> behavior of the existing program might be more economical.

Reply nick_keighley_nospam (4574) 7/28/2010 7:37:09 AM

On 27 July, 22:35, Tim Harig <user...@ilthio.net> wrote:

> Indeed, there is no such thing as a decompiler;

there are things that claim to be decompilers

http://boomerang.sourceforge.net/



Reply Nick 7/28/2010 7:41:01 AM

On 7/27/2010 5:35 PM, Tim Harig wrote:
> On 2010-07-27, Gordon Burditt<gordonb.trd82@burditt.org>  wrote:
>>> I believe the expression is something like "you can't turn hamburger back
>>> into a cow."  :-)
>>
>> That operation has been broken (in the sense of "breaking the code"),
>> given the existence of DNA, and assuming you're not talking about
>> *cooked* hamburger (which, as I understand it from various "CSI"
>> TV shows, destroys the DNA).  Of course, it's still more economical
>> to raise another cow rather than attempting to clone one.
>
> You also do not get the same cow that you had before since the new cow will
> not have the memories and experiences that the old cow did.  You will
> likely get something that is quite similar with similar tendencies; but,
> some things are also likely to be different.
[...]

Question to all our C/Bovine experts out there.  If a cow were to give birth 
to identical twins, would the markings be identical as well?  I would assume 
not (like human identical twins' fingerprints), and I would assume that the 
clone's markings would be different as well.

-- 
Kenneth Brody
Reply Kenneth 7/28/2010 3:37:48 PM

"Nick Keighley" <nick_keighley_nospam@hotmail.com> wrote in message 
news:49d820f1-40ce-43d5-a96e-4415339f75a6@f42g2000yqn.googlegroups.com...
> On 27 July, 22:35, Tim Harig <user...@ilthio.net> wrote:
>
>> Indeed, there is no such thing as a decompiler;
>
> there are things that claim to be decompilers
>
> http://boomerang.sourceforge.net/
>
>
>

Or how about HexRays, one of the 'better' x86 decompilers out there 
(ridiculously expensive though). But manual disassembly will always produce 
better results.

Anyway, depending on how much debug info is left intact inside the 
executable, one could reproduce the original source down to the correct line 
number of the source file.

Though typically, the best you could do is just recreate a 'functional 
equivalent' of the original. 


Reply jchl 7/29/2010 12:23:16 AM

On Jul 28, 8:23 pm, "jchl" <i...@c.com> wrote:
> "Nick Keighley" <nick_keighley_nos...@hotmail.com> wrote in message
>
> news:49d820f1-40ce-43d5-a96e-4415339f75a6@f42g2000yqn.googlegroups.com...
>
> > On 27 July, 22:35, Tim Harig <user...@ilthio.net> wrote:
>
> >> Indeed, there is no such thing as a decompiler;
>
> > there are things that claim to be decompilers
>
> >http://boomerang.sourceforge.net/
>
> Or how about HexRays, one of the 'better' x86 decompilers out there
> (ridiculously expensive though). But manual disassembly will always produce
> better results.
>
> Anyway, depending on how much debug info is left intact inside the
> executable, one could reproduce the original source down to the correct line
> number of the source file.
>
> Though typically, the best you could do is just recreate a 'functional
> equivalent' of the original.

Question: Was your .EXE compiled with the debugging option ON? Does anyone
know how to check that? In theory it should be easy to extract code (maybe
not exactly as written, but at least something close) out of an .EXE
compiled with debugging ON.
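
One rough way to check, sketched in C under the assumption that the file is a Windows PE executable already read into memory. The names `pe_hints` and `pe_debug_hints` are hypothetical; the offsets and the 0x0200 flag (IMAGE_FILE_DEBUG_STRIPPED) follow the published PE/COFF layout. The result is only a hint: a debug directory often just points at a separate PDB file that you may not have.

```c
#include <stddef.h>
#include <stdint.h>

/* Hints about debug information found in a PE (.EXE) header. */
struct pe_hints {
    int has_coff_symbols; /* COFF symbol table present */
    int debug_stripped;   /* IMAGE_FILE_DEBUG_STRIPPED flag is set */
    int has_debug_dir;    /* debug data directory entry is non-empty */
};

/* Little-endian readers, independent of host byte order. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf does not look like a PE image. */
int pe_debug_hints(const unsigned char *buf, size_t len, struct pe_hints *out)
{
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;

    size_t pe = rd32(buf + 0x3C); /* e_lfanew: offset of "PE\0\0" */
    if (pe > len || len - pe < 24)
        return -1;
    if (buf[pe] != 'P' || buf[pe + 1] != 'E' ||
        buf[pe + 2] != 0 || buf[pe + 3] != 0)
        return -1;

    const unsigned char *coff = buf + pe + 4; /* 20-byte COFF header */
    uint16_t opt_size = rd16(coff + 16);      /* SizeOfOptionalHeader */

    out->has_coff_symbols = rd32(coff + 12) != 0;        /* NumberOfSymbols */
    out->debug_stripped = (rd16(coff + 18) & 0x0200) != 0;
    out->has_debug_dir = 0;

    size_t opt = pe + 24; /* start of the optional header */
    if (opt_size >= 2 && len - opt >= opt_size) {
        uint16_t magic = rd16(buf + opt);
        /* Data directories start at 96 (PE32) or 112 (PE32+). */
        size_t dirs = magic == 0x20B ? 112 : magic == 0x10B ? 96 : 0;
        if (dirs != 0 && opt_size >= dirs + 7 * 8) {
            uint32_t count = rd32(buf + opt + dirs - 4);
            uint32_t size = rd32(buf + opt + dirs + 6 * 8 + 4);
            out->has_debug_dir = count > 6 && size != 0; /* index 6 = debug */
        }
    }
    return 0;
}
```

To use it, read the whole .EXE into a buffer with fread() and pass the buffer and its length. If all three fields come back zero, or debug_stripped is set, expect nothing better than raw disassembly.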

Option: As explained, it is hard to regenerate the code from an EXE
built without debugging info. Another feasible option might be to find
similar software that doesn't have these issues.

Option: It is even more feasible if you can specify the inputs and outputs
of the program so that someone could rewrite it. And even if you get the
code out of the buggy software, someone will still need to understand and
fix it, right?

Reply Sachin 7/29/2010 1:36:28 AM

In message <a574f59e-5c91-44ad-8007-b2606b9d1dd0@w30g2000yqw.googlegroups.com>,
DMn2004404 <dmn2004404@gmail.com> writes
>I recently downloaded a program (*.EXE) written in C, and I discovered
>it had a few bugs. The original author of the program is apparently
>unreachable, so I figure it falls on me to try and crack into the
>program and fix the bugs. Does anyone know of a good C decompiler or
>disassembler I can use to do this, that will read an EXE program as
>input and generate its source code?
>
>Brandon Taylor

To some extent it depends where in the world you are and the license
attached to the application.  The license may not permit reverse
engineering and neither may the law where you are unless you have
explicit permission...





-- 
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England     /\/\/\/\/
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/



Reply chris32 (3350) 8/2/2010 11:00:13 AM

15 Replies
672 Views
