Hello,
I have recently been trying to see how some simple C programs translate
into assembly, but my efforts have led to endless segfaults after
reassembling them.
So I started with a simple hello world program:
int main(){
printf("Hello, World!\n");
}
To my surprise, "ndisasm -b 32" returned an asm file that was 30000+
lines long.
So I thought, hmm, that isn't right,
and I did the simple hello world with nasm:
section .data
msg db "Hello, World!","$"
len equ $ - msg
section .code
global _start
_start:
mov edx,len
mov ecx,msg
mov ebx,1
mov eax,4
int 21h
mov eax,1
mov ebx,0
int 21h
That also returned code several thousand lines long (mostly 'add [eax], al').
Anyway, I was wondering if there is a disassembler whose output, when
reassembled, works like the original program.
I also get stuff that doesn't work right, like 'loopne 0x154f38'.
I also tried udis, but that worked less well than ndisasm.
-dylan
thiesd, 7/26/2008 5:35:39 PM

"thiesd" <spamtrap@crayne.org> wrote in message
news:g6fn9h$f6o$1@misc-cct.server.rpi.edu...
> Hello,
> I have been recently trying to see how some simple c programs translate
> into assembly but my efforts have led to endless segfaults after
> reassembling them
>
> so i started with a simple hello world program
> int main(){
> printf("Hello, World!\n");
> }
> which to my surprise "ndisasm -b 32" returned a asm file that was 30000+
> lines long
Depending on the compiler, it will most likely add all of the startup code,
including the PE header (for Windows), etc. Also, see below for more.
> so i thought hmm that wasn't right
> so i did the simple hello world with nasm
> section .data
> msg db "Hello, World!","$"
> len equ $ - msg
> section .code
> global _start
>
> _start:
> mov edx,len
> mov ecx,msg
> mov ebx,1
> mov eax,4
> int 21h
>
> mov eax,1
> mov ebx,0
> int 21h
The code looks like Linux, but the int 21h looks like DOS.
I haven't done any Linux programming, so I may be wrong here,
but shouldn't the int 21h be int 80h?
> also returned code with several thousands of line (mostly 'add [eax], al')
If you will notice, the binary form for add [eax],al is 00 00.
Most likely, nasm is adding padding to the end of the section. I don't
know much about nasm, but this would be my guess. Now, if a section
is 4096 bytes, and it takes 2 bytes per add [eax],al, then you would
have 2048 lines. Not several thousand, but a couple thousand nonetheless.
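To put numbers on that estimate, here is a rough Python sketch. It assumes that every pair of consecutive 00 bytes is decoded as one two-byte 'add [eax],al', which is how a linear disassembler would normally treat zero fill; the function name is made up for illustration, and a real tool may fold an odd leftover zero into the following instruction.

```python
def zero_pair_lines(blob: bytes) -> int:
    """Estimate how many 'add [eax],al' lines the zero bytes in blob produce.

    Assumption: within each run of 00 bytes, every pair decodes as one
    two-byte instruction (opcode 00, ModRM 00). An odd leftover byte is
    ignored here.
    """
    lines = 0
    run = 0
    for byte in blob:
        if byte == 0:
            run += 1
        else:
            lines += run // 2
            run = 0
    return lines + run // 2


# A 4096-byte block of pure zero padding, as in the estimate above.
print(zero_pair_lines(bytes(4096)))
```

Running the same function over the real output file would show how much of the listing is just zero fill.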
> any way i was wondering if there was a disassembler that if you reassemble
> it it works like the first program
A disassembler simply takes the binary bytes and translates them
to the corresponding mnemonics. It doesn't know the difference between
data or code.
From your post, I get that you want a "smart" disassembler. There are
a few debuggers that do a better job, but the better the job, the more
expensive the tool.
> also i get stuff that doesn't work right like 'loopne 0x154f38'
What is wrong with loopne 0x154f38 ?
It is the same as loopnz 0x154f38, which decrements ecx and then loops
if ecx != 0 and the zero flag is clear.
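For anyone who wants to see that rule spelled out, here is a small Python model of a single LOOPNE/LOOPNZ step in 32-bit mode. The function names are invented for this sketch. The second helper covers one possible reason such a line fails after reassembly: the instruction only encodes a signed 8-bit displacement, so the target has to lie within -128..+127 bytes of the next instruction.

```python
from typing import Tuple


def loopne_step(ecx: int, zf: bool) -> Tuple[int, bool]:
    """Model one LOOPNE: return (new_ecx, branch_taken).

    ECX is decremented first (the decrement does not touch the flags);
    the branch is taken only if the new ECX is non-zero and ZF is clear.
    """
    new_ecx = (ecx - 1) & 0xFFFFFFFF
    return new_ecx, (new_ecx != 0 and not zf)


def fits_rel8(next_ip: int, target: int) -> bool:
    """True if target is reachable with a signed 8-bit displacement."""
    return -128 <= target - next_ip <= 127


print(loopne_step(2, False))           # count not exhausted, ZF clear
print(loopne_step(1, False))           # count reaches zero
print(fits_rel8(0x154F40, 0x154F38))   # short backward jump
print(fits_rel8(0x100, 0x154F38))      # target far away after relocation
```

If the disassembled code is reassembled at a different address while the operand stays an absolute number, the last case is the one to watch for.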
> i also tried udis but that worked less than ndisasm
I have never tried udis, so I have no comment here. Maybe someone
else has and can comment.
Ben
Benjamin, 7/26/2008 7:58:07 PM

On 26-Jul-2008, "Benjamin David Lunt" <spamtrap@crayne.org> wrote:
> > section .data
> > msg db "Hello, World!","$"
> > len equ $ - msg
> > section .code
> > global _start
> >
> > _start:
> > mov edx,len
> > mov ecx,msg
> > mov ebx,1
> > mov eax,4
> > int 21h
> >
> > mov eax,1
> > mov ebx,0
> > int 21h
>
> The code looks like Linux, but the int 21h looks like DOS.
> I haven't done any Linux programming, so I may be wrong here,
> but shouldn't the int 21h be int 80h?
That is DOS code ^_^ but yeah, you are right: to work in Linux it needs to be
80h (I've tried on both OSes).
I have been messing around with different methods and found that objdump (I
think it is a Linux tool, but it also comes with djgpp) gives the best results so
far, but its output requires some cleanup.
I can figure out the trivial stuff like "test.o: \n\n\t Disassembly of
`_main':", but there is some syntax that I cannot understand, such
as "call 4004e0 <_start+0x90>". Well, I understand it, but I don't know how
to make it assemble correctly.
Dylan, 7/26/2008 8:33:22 PM

thiesd wrote:
> Hello,
> I have been recently trying to see how some simple c programs translate
> into assembly but my efforts have led to endless segfaults after
> reassembling them
>
> so i started with a simple hello world program
> int main(){
> printf("Hello, World!\n");
> }
> which to my surprise "ndisasm -b 32" returned a asm file that was 30000+
> lines long
> so i thought hmm that wasn't right
:)
Linked with the "--static" switch? Well, anyway, there's a lot of
"cruft" in a C-generated file (IME). Giving ndisasm a long and
complicated command line may help (RTFM)... "gcc -S" or "objdump -d" may
produce more useful results (although not in Nasm syntax).
> so i did the simple hello world with nasm
> section .data
> msg db "Hello, World!","$"
> len equ $ - msg
> section .code
> global _start
>
> _start:
> mov edx,len
> mov ecx,msg
> mov ebx,1
> mov eax,4
> int 21h
Say what???
> mov eax,1
> mov ebx,0
> int 21h
I suppose this is a "posto", and the real file is int 80h? If not... I
have *no* idea! :)
> also returned code with several thousands of line (mostly 'add [eax], al')
Lots of zero-padding... (giving ld the "-s" switch, or better yet,
"strip -R.comment myfile" will help some)
> any way i was wondering if there was a disassembler that if you reassemble
> it it works like the first program
Try Jeff Owens' "asmsrc":
http://linuxasmtools.net/
No promises - a "perfect" disassembly (of "any arbitrary file") is
theoretically "impossible", but at least "asmsrc" is intended to do what
you want. It *does* work (imperfectly) on your example file (with the
int 21h's "promoted" to int 80h)...
;Input file: hw.src
(no, Jeff, the input file was "hw", *this* file is "hw.src"... Close
enough for asm :)
;Dynamic Libraries found: no
;Lib startup code wrapper found: no
;Symbol table found: yes
;Debug symbols found: no
;static load file
; Compile with: nasm -felf xxxx.asm -o xxxx.o
; ld xxxx.o -o xxxx
; (xxxx = filename)
global _start
[section .text]
_start:
mov eax,04H
mov ebx,01H
mov ecx,msg
mov edx,0DH
int byte 080H
mov eax,01H
int byte 080H
msg:
dec eax
db "ello, world"
db 00Ah
Dunno where the "dec eax" came from - asmsrc *does* seem to understand
that it's doing data (my error - I put "msg" in .text - works right with
it in .data)... Works as intended anyway. Good Luck!!!
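A likely explanation for the stray "dec eax", offered as a guess rather than a statement about how asmsrc works: in 32-bit mode the one-byte opcodes 0x40-0x47 are inc r32 and 0x48-0x4F are dec r32, and 0x48 is also ASCII 'H'. With msg placed in .text, the first byte of the string can be decoded as an instruction before the tool switches to data. The Python sketch below (helper name invented here) shows the overlap.

```python
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def one_byte_inc_dec(byte: int):
    """Decode the 32-bit-mode one-byte INC/DEC opcodes 0x40-0x4F, else None."""
    if 0x40 <= byte <= 0x47:
        return "inc " + REGS32[byte - 0x40]
    if 0x48 <= byte <= 0x4F:
        return "dec " + REGS32[byte - 0x48]
    return None


# First byte of the message as it appears in the listing above.
first = b"Hello, world\n"[0]
print(hex(first), one_byte_inc_dec(first))
```

This is the general hazard of code and data sharing a section: a disassembler cannot tell from the bytes alone which one it is looking at.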
Best,
Frank
Frank, 7/26/2008 8:46:50 PM

On Sat, 26 Jul 2008 12:58:07 -0700
"Benjamin David Lunt" <spamtrap@crayne.org> wrote:
> The code looks like Linux, but the int 21h looks like DOS.
> I haven't done any Linux programming, so I may be wrong here,
> but shouldn't the int 21h be int 80h?
Indeed it should.
> Most likely, nasm is adding padding to the end of the section. I
> don't know much about nasm, but this would be my guess
NASM itself adds an ELF header before the user code, and a lot of
system information following it, but it does not pad out the section(s).
However, if one is disassembling the executable module, then one must
also consider what the linker has done.
--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html
Charles, 7/26/2008 8:58:30 PM

On Sat, 26 Jul 2008 17:35:39 GMT
thiesd <spamtrap@crayne.org> wrote:
> any way i was wondering if there was a disassembler that if you
> reassemble it it works like the first program
The NASM disassembler is designed for files in binary format, and needs
some guidance from the user about where the user code begins and ends
in other formats. As it appears that you are using Linux, read the man
pages for readelf and objdump.
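As a rough illustration of the kind of guidance meant here, the Python sketch below reads a little-endian 32-bit ELF image and works out where the entry point sits in the file, which is the sort of number one could then hand to ndisasm as a skip count and origin. The function name is made up for this example, it only looks at the program headers, and it assumes the entry point lies inside a PT_LOAD segment; readelf reports the same information more completely.

```python
import struct

ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")  # 52-byte ELF32 file header
ELF32_PHDR = struct.Struct("<IIIIIIII")            # 32-byte program header
PT_LOAD = 1


def entry_file_offset(image: bytes):
    """Return (entry_vaddr, file_offset_of_entry) for an ELF32 LSB image."""
    if len(image) < ELF32_HEADER.size:
        raise ValueError("too short to be an ELF32 image")
    fields = ELF32_HEADER.unpack_from(image, 0)
    ident, e_entry, e_phoff = fields[0], fields[4], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]
    # ident[4] is EI_CLASS (1 = 32-bit), ident[5] is EI_DATA (1 = little-endian)
    if ident[:4] != b"\x7fELF" or ident[4] != 1 or ident[5] != 1:
        raise ValueError("not a little-endian ELF32 image")
    for i in range(e_phnum):
        p_type, p_offset, p_vaddr, _paddr, p_filesz, _memsz, _flags, _align = (
            ELF32_PHDR.unpack_from(image, e_phoff + i * e_phentsize)
        )
        if p_type == PT_LOAD and p_vaddr <= e_entry < p_vaddr + p_filesz:
            return e_entry, p_offset + (e_entry - p_vaddr)
    raise ValueError("entry point is not inside any PT_LOAD segment")
```

With the two numbers in hand, something along the lines of "ndisasm -b 32 -e <file_offset> -o <entry_vaddr> hw" should start disassembling at _start instead of at the ELF header; check the ndisasm documentation for the exact number formats it accepts.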
--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html
Charles, 7/26/2008 9:02:57 PM

"Charles Crayne" <spamtrap@crayne.org> wrote in message
news:20080726135830.39e8d8d6@thor.crayne.org...
> On Sat, 26 Jul 2008 12:58:07 -0700
> "Benjamin David Lunt" <spamtrap@crayne.org> wrote:
>
>> The code looks like Linux, but the int 21h looks like DOS.
>> I haven't done any Linux programming, so I may be wrong here,
>> but shouldn't the int 21h be int 80h?
>
> Indeed it should.
>
>> Most likely, nasm is adding padding to the end of the section. I
>> don't know much about nasm, but this would be my guess
>
> NASM itself adds an ELF header before the user code, and a lot of
> system information following it, but it does not pad out the section(s).
> However, if one is disassembling the executable module, then one must
> also consider what the linker has done.
Thanks Chuck,
I knew that. I have been working too much with binary only
assembly modules. I haven't used a linker with my assembly
in years. :-)
Thanks,
Ben
Benjamin, 7/26/2008 9:10:22 PM

On Sat, 26 Jul 2008 20:33:22 GMT
"Dylan Thies" <spamtrap@crayne.org> wrote:
> i can figure out the trivial stuff like "test.o: \n\n\t Disassembly of
> `_main':" like stuff but there is some syntax that i cannot
> understand such as "call 4004e0 <_start+0x90>", well i understand
> it but i don't know how to make it assemble correctly
Unless the original compilation or assembly was done with debug
information requested, local labels will be lost, and you will have to
replace them. So, to make "call 4004e0 <_start+0x90>" assemble
correctly you need to change it to something like "call L4004e0", and
also insert this same label in front of the instruction which starts
at the address which is 90 hex bytes after the label _start. Of course,
a more meaningful label would be even better.
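The relabeling step described above is mechanical enough to script. Here is a minimal Python sketch, assuming operands of the form "ADDR <symbol+offset>" as in the quoted objdump line; the function name and the "L" + address naming scheme are just the convention suggested in this post, not anything objdump provides.

```python
import re

# Matches operands such as "4004e0 <_start+0x90>".
TARGET = re.compile(r"\b([0-9a-fA-F]+) <[^>]+>")


def relabel(lines):
    """Rewrite 'ADDR <sym+off>' operands to 'LADDR'.

    Returns (new_lines, labels), where labels is the sorted list of label
    names that still have to be placed in front of the instruction at each
    corresponding address.
    """
    labels = set()

    def repl(match):
        label = "L" + match.group(1).lower()
        labels.add(label)
        return label

    new_lines = [TARGET.sub(repl, line) for line in lines]
    return new_lines, sorted(labels)


listing = ["call 4004e0 <_start+0x90>", "ret"]
print(relabel(listing))
```

The returned label list is the to-do list: each name must still be inserted by hand (or by a second pass keyed on the address column) at the matching instruction.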
--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html
Charles, 7/26/2008 11:43:40 PM

I figured out most of what I wanted to know now.
It turns out the
> call 4004e0 <_start+0x90>
stuff was actually objdump trying to disassemble stuff that didn't need to
be disassembled.
I figured this out when I looked at the hex of the "message:" label.
Funny thing is, it was "Hello, World!" (not the exact part of the code I quoted,
but the whole section).
Anyway, thanks for the insight into some of the other programs available for
disassembling.
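A quick way to spot this kind of thing before reading the disassembly is to scan the raw bytes for runs of printable ASCII, much as the strings utility does. The Python sketch below is only an illustration; the function name and the minimum run length of 4 are arbitrary choices.

```python
import re


def printable_runs(blob: bytes, min_len: int = 4):
    """Return (offset, text) for each run of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(blob)]


# Five non-text bytes, then a string, then two more non-text bytes.
sample = b"\xb8\x04\x00\x00\x00Hello, World!\n\xcd\x80"
print(printable_runs(sample))
```

Any offset reported here that falls inside a "code" listing is a good candidate for data that was disassembled by mistake.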
-Dylan
Dylan, 7/27/2008 1:52:50 AM

Hello again,
Is there a way to get output like readelf's and objdump's for binary files?
Also, is there such a tool that runs on Win XP?
-Dylan
Dylan, 7/27/2008 7:41:45 AM

On Jul 27, 12:41 am, "Dylan Thies" <spamt...@crayne.org> wrote:
> hello again,
> Is there a way to get the out put like readelf and objdump on binary files
> also is there such that runs in win xp
>
> -Dylan
For windows, try using IDA Pro. There is a free version available.
I'm not familiar with the elf file format, so I cannot help you there.
bwaichu, 7/28/2008 6:53:47 AM

"Dylan Thies" <spamtrap@crayne.org> wrote:
>
>Is there a way to get the out put like readelf and objdump on binary files
>also is there such that runs in win xp
Assuming you have Visual Studio, you can get a first attempt by:
link /dump /disasm xxx.obj
It isn't as helpful as products like IDA.
--
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.
Tim, 7/29/2008 6:03:29 AM

On Tue, 29 Jul 2008 06:03:29 GMT, Tim Roberts <spamtrap@crayne.org>
wrote:
>"Dylan Thies" <spamtrap@crayne.org> wrote:
>>
>>Is there a way to get the out put like readelf and objdump on binary files
>>also is there such that runs in win xp
>
>Assuming you have Visual Studio, you can get a first attempt by:
> link /dump /disasm xxx.obj
>
>It isn't as helpful as products like IDA.
Almost useless, IMO. :-)
--
ArarghMail807 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html
To reply by email, remove the extra stuff from the reply address.
ArarghMail807NOSPAM, 7/29/2008 7:14:41 AM

On Tue, 29 Jul 2008 02:14:41 -0500, ArarghMail807NOSPAM
<spamtrap@crayne.org> wrote:
>On Tue, 29 Jul 2008 06:03:29 GMT, Tim Roberts <spamtrap@crayne.org>
>wrote:
>
>>"Dylan Thies" <spamtrap@crayne.org> wrote:
>>>
>>>Is there a way to get the out put like readelf and objdump on binary files
>>>also is there such that runs in win xp
>>
>>Assuming you have Visual Studio, you can get a first attempt by:
>> link /dump /disasm xxx.obj
>>
>>It isn't as helpful as products like IDA.
>
>Almost useless, IMO. :-)
The first, not IDA. :-)
--
ArarghMail807 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html
To reply by email, remove the extra stuff from the reply address.
ArarghMail807NOSPAM, 7/29/2008 8:43:02 AM

OK, so in my wanderings of the web and thanks to y'all, I think
(therefore I might be?)
that for Linux objdump/readelf are amazing, and I wish they worked for non-ELF
objects,
and as far as Windows goes, IPA pro is really useful; it has its quirks but is
very useful.
So I want to thank you all for your help, and for people new to
disassembly:
IPA pro + for windows users
objdump/readelf ++ for linux users
(^.^)
-Dylan
Dylan, 7/29/2008 4:55:09 PM

On Jul 29, 12:14 am, ArarghMail807NOSPAM <spamt...@crayne.org> wrote:
> On Tue, 29 Jul 2008 06:03:29 GMT, Tim Roberts <spamt...@crayne.org>
> wrote:
>
> >"Dylan Thies" <spamt...@crayne.org> wrote:
>
> >>Is there a way to get the out put like readelf and objdump on binary files
> >>also is there such that runs in win xp
>
> >Assuming you have Visual Studio, you can get a first attempt by:
> > link /dump /disasm xxx.obj
>
> >It isn't as helpful as products like IDA.
>
> Almost useless, IMO. :-)
If you do that on an EXE for which you have a PDB file with symbols,
it may be useful. :)
Alex
Alexei, 7/30/2008 7:36:51 AM

"Dylan Thies" <spamtrap@crayne.org> wrote:
>
>so I want to thank you all for your help, and for people new to
>dissassembly
>IPA pro + for windows users
>objdump/readelf ++ for linux users
Although I am thoroughly in favor of the liberal application of IPA to
programming projects, you meant IDA here.
(IPA = India pale ale)
--
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.
Tim, 7/31/2008 5:58:57 AM

Yeah that's what I meant, it was a late night (up for 36+ hours)(<working on
fixing that bad habit)
but yeah sometimes I think that a little ale could make the hex make more
sense well at least look better (^_^) probably a little less bulky too
-Dylan
Dylan, 7/31/2008 4:19:08 PM