Odd problem with int 0x21, ah = 0x4A



I have the following applet:

	BITS 16
	ORG 0x100
	mov	ax, ss
	mov	dx, es
	sub	ax, dx
	mov	bx, sp
	shr	bx, 4
	add	bx, ax
	mov	ah, 0x4A
	int	0x21
	ret

In the debugger, all segment registers point to the same segment of
memory, which is OK. After the call, DS gets trashed but everything else looks
OK. I checked Ralf Brown's Interrupt List, but it doesn't
mention anything about DS getting trashed. Any ideas why?
-- 
Tactical Nuclear Kittens

Reply alex.buell (44) 4/28/2011 2:18:19 PM

alex.buell wrote:

|I have the following applet:

|BITS 16
|ORG 0x100

|mov ax, ss
|mov dx, es
|sub ax, dx
|mov bx, sp
|shr bx, 4
|add bx, ax
|mov ah, 0x4A
|int 0x21
|ret

|In the debugger, all segment registers all point to the same segment of
|memory which is OK. After the call, DS gets trashed but everything looks
|OK. I checked with the Ralf Brown's interrupt lists but it doesn't
|mention anything about DS getting trashed. Any ideas why?

DS isn't involved, nor is it prone to change after this call, but:

RBIL also says about INT 21h/AH=4Ah:
CF set on error
AX = error code (07h,08h,09h) (see #01680 at AH=59h/BX=0000h)
BX = maximum paragraphs available for specified memory block

I'm not sure what you're trying to achieve with the code snippet above.
It isn't a good idea to tell DOS that you won't use the assigned
stack anymore without having set up your own stack first.

Any hardware or software IRQ/INT (including INT 21h) uses the current stack
in real mode, and also in PM/VM as long as no task switch occurs.

So your tool may have pushed DS before your call and cannot
restore it because of the stack change?
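
The ordering concern above can be sketched roughly like this. It is only a minimal illustration, assuming a tiny .COM program built with "nasm -f bin"; the labels stack_area and stack_top and the 256-byte stack size are hypothetical and not taken from the original code.

```nasm
	BITS 16
	ORG 0x100

start:
	; Assumption: on entry to a .COM program ES = DS = SS = CS = PSP segment.
	mov	sp, stack_top		; first move the stack inside the area we keep
	mov	bx, stack_top + 15	; end of the kept area, rounded up...
	shr	bx, 4			; ...and converted from bytes to paragraphs
	mov	ah, 0x4A		; DOS: resize the memory block at ES
	int	0x21
	jc	failed			; CF set: AX = error code, BX = max paragraphs

	mov	ax, 0x4C00		; exit with return code 0
	int	0x21

failed:
	mov	ax, 0x4C01		; exit with return code 1
	int	0x21

	align	16, db 0
stack_area:
	times	256 db 0		; hypothetical 256-byte private stack
stack_top:
```

Because SP is moved before the resize, the stack that INT 21h itself uses stays inside the block that is kept.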
__
wolfgang


Reply nowhere583 (184) 4/28/2011 5:30:31 PM


On Thu, 2011-04-28 at 19:30 +0200, wolfgang kern wrote:
> alex.buell wrote:
>
> |I have the following applet:
>
> |BITS 16
> |ORG 0x100
>
> |mov ax, ss
> |mov dx, es
> |sub ax, dx
> |mov bx, sp
> |shr bx, 4
> |add bx, ax
> |mov ah, 0x4A
> |int 0x21
> |ret
>
> |In the debugger, all segment registers all point to the same segment of
> |memory which is OK. After the call, DS gets trashed but everything looks
> |OK. I checked with the Ralf Brown's interrupt lists but it doesn't
> |mention anything about DS getting trashed. Any ideas why?
>
> DS isn't involved nor prone to change after this, but...:
>
> RBIL also say about INT21-4A:
> CF set on error
> AX = error code (07h,08h,09h) (see #01680 at AH=59h/BX=0000h)
> BX = maximum paragraphs available for specified memory block
>
> I'm not sure what you try to achieve with your code snip above.
> It wont be a good idea to tell the DOS that you wont use the assigned
> stack anymore without having set up your own stack before that.
>
> Any hardware or software IRQ/INT (incl.INT21h) will use the current stack
> as long in RealMode and also in PM/VM as long no taskswitch occure.
>
> So your tool may have pushed DS in front of your attempt and cannot
> restore it because of the stack-change ?

I'm converting the following: http://www.japheth.de/JWasm/Dos64.html to
NASM format, generating a binary image directly without linking. That's
why I'm having problems getting it to work!

	;.x64p
	BITS 16
	ORG 0x100

_TEXT16 segment use16

	jmp	start16
GDTR	dw 4 * 8 - 1
	dd GDT
IDTR	dw 256 * 16 - 1
	dd 0
nullidt dd 0x3FF
	dd 0

	align 8
GDT	dq 0
	dw 0xFFFF, 0, 0x9A00, 0x0AF
	dw 0xFFFF, 0, 0x9A00, 0x000
	dw 0xFFFF, 0, 0x9200, 0x000

wPICMask dw 0

start16:
	push 	cs
	pop	ds
	mov	ax, cs
	movzx	eax, ax
	shl	eax, 4
	add	dword [GDTR + 2], eax
	mov	word [GDT + 2 * 8 + 2], ax
	mov	word [GDT + 3 * 8 + 2], ax
	shr	eax, 16
	mov	byte [GDT + 2 * 8 + 4], al
	mov	byte [GDT + 3 * 8 + 4], al

	mov	ax, ss
	mov	dx, es
	sub	ax, dx
	mov	bx, sp
	shr	bx, 4
	add	bx, ax
	mov	ah, 0x4A
	int	0x21
	push	cs
	pop	es
	mov	ax, ss
	mov	dx, cs
	sub	ax, dx
	shl	ax, 4
	add	ax, sp
	push	ds
	pop	ss
	mov	sp, ax

	push	cs
	pop	ds

	smsw	ax
	test	al, 1
	jz	skip1

	mov	dx, err1
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err1:	db	"Mode is V86. Need REAL mode to switch to LONG mode.", 13, 10, '$'

skip1:
	xor	edx, edx
	mov	eax, 0x80000000
	cpuid
	test 	edx, 0x20000000
	jnz	skip2

	mov	dx, err2
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err2:	db	"No 64 bit cpu detected.", 13, 10, '$'

skip2:
	mov	bx, 0x1000
	mov	ah, 0x48
	int	0x21
	jnc	skip3
	mov	dx, err3
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err3:	db	"Out of memory", 13, 10, '$'

skip3:
	add 	ax, 0x100 - 1
	mov	al, 0
	mov	es, ax

	sub	di, di
	mov	cx, 4096
	sub	eax, eax
	rep	stosd

	sub	di, di
	mov	ax, es
	movzx	eax, ax
	shl	eax, 4
	mov	cr3, eax

	lea	edx, [eax + 0x5000]
	mov	dword [IDTR + 2], edx

	or	eax, 111b
	add	eax, 0x1000
	mov	[es:di + 0x0000], eax
	add	eax, 0x1000
	mov	[es:di + 0x1000], eax
	add	eax, 0x1000
	mov	[es:di + 0x2000], eax
	mov	di, 0x3000
	mov	eax, 0 + 111b
	mov	cx, 256

.skip4:
	stosd
	add	di, 4
	add	eax, 0x1000
	loop	.skip4

	mov	bx, _TEXT
	movzx	ebx, bx
	shl	ebx, 4
	add 	[llg], ebx

	mov	di, 0x5000
	mov	cx, 32
	mov	edx, exception
	add	edx, ebx

make_exc_gates:
	mov	eax, edx
	stosw
	mov	ax, 8
	stosw
	mov	ax, 0x8E00
	stosd
	xor	eax, eax
	stosd
	stosd
	add	edx, 4
	loop	make_exc_gates
	mov	cx, 256 - 32

make_int_gates:
	mov	eax, interrupt
	add	eax, ebx
	stosw
	mov	ax, 8
	stosw
	mov	ax, 0x8E00
	stosd
	xor	eax, eax
	stosd
	stosd
	loop	make_int_gates

	mov	di, 0x5000
	mov	eax, ebx
	add	eax, clock
	mov	[es:di + 0x80 * 16 + 0], ax
	shr	eax, 16
	mov	[es:di + 0x80 * 16 + 6], ax

	mov	eax, ebx
	add	eax, keyboard
	mov	[es:di + 0x81 * 16 + 0], ax
	shr	eax, 16
	mov	[es:di + 0x81 * 16 + 6], ax

	pushf
	pop	ax
	and	ah, 0xBF
	push	ax
	popf

	cli
	in	al, 0xA1
	mov	ah, al
	in	al, 0x21
	mov	[wPICMask], ax
	mov	al, 10001b
	out	0x20, al
	mov	al, 10001b
	out	0xA0, al
	mov	al, 0x80
	out	0x21, al
	mov	al, 0x88
	out	0xA1, al
	mov	al, 100b
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov 	al, 1
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	al, 11111100b
	out	0x21, al
	in	al, 0xA1
	mov	al, 11111111b
	out	0xA1, al

	mov	eax, cr4
	or	eax, 1 << 5
	mov	cr4, eax

	mov	ecx, 0xC0000080

	rdmsr
	or	eax, 1 << 8
	wrmsr

	lgdt	[GDTR]
	lidt	[IDTR]

	mov	cx, ss
	movzx	ecx, cx
	shl	ecx, 4
	add	ecx, esp

	mov	eax, cr0
	or	eax, 0x80000001
	mov	cr0, eax

	db	0x66, 0xEA	; jmp 0x8:0x0
llg	dd	long_start
	dw	8

backtoreal:
	mov	eax, cr0
	and	eax, 0x7FFFFFF
	mov	cr0, eax
	mov	ecx, 0xC000080
	rdmsr
	and	ax, !1
	wrmsr
	mov	ax, 24
	mov	ss, ax
	mov	ds, ax
	mov	es, ax
	mov	eax, cr0
	and	al, 0xFE
	mov	cr0, eax

	db	0xEA
	dw	$+4
	dw	_TEXT16
	mov	ax, STACK
	mov	ss, ax
	mov	sp, 4096
	push	cs
	pop	ds
	lidt	[nullidt]
	mov	eax, cr4
	and	al, !0x20
	mov	cr4, eax

	mov	al, 10001b
	out	0x20, al
	mov	al, 10001b
	out	0xA0, al
	mov	al, 0x08
	out	0x21, al
	mov	al, 0x70
	out	0xA1, al
	mov	al, 100b
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov	al, 1
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	ax, [wPICMask]
	out	0x21, al
	mov	al, ah
	out	0xA1, al
	sti

	mov	ax, 0x4C00
	int	0x21

	BITS 64

_TEXT segment use64

long_start:
	xor	eax, eax
	mov	ss, eax
	mov	esp, ecx
	sti

	call	WriteStrX
	db	'Hello 64bit', 10, 0

nextcmd:
	mov	r8b, 0

nocmd:
	cmp	r8b, 0
	jz	nocmd
	cmp	r8b, 1
	jz	esc_pressed
	cmp	r8b, 0x13
	jz	r_pressed
	call	WriteStrX
	db 	'Unknown key ', 0
	mov	al, r8b
	call	WriteB
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

r_pressed:
	call	WriteStrX
	db	10, "cr0=", 0
	mov	rax, cr0
	call	WriteQW
	call	WriteStrX
	db	10, "cr2=", 0
	mov	rax, cr2
	call	WriteQW
	call	WriteStrX
	db	10, "cr3=", 0
	mov	rax, cr3
	call	WriteQW
	call	WriteStrX
	db	10, "cr4=", 0
	mov	rax, cr4
	call	WriteQW
	call	WriteStrX
	db	10, "cr8=", 0
	mov	rax, cr8
	call	WriteQW
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

esc_pressed:
	jmp	[bv]
bv:	dd	backtoreal
	dw	16

scroll_screen:
	cld
	mov	rdi, rsi
	movzx	rax, word [rbp + 0x4a]
	push	rax
	lea	rsi, [rsi + 2 * rax]
	mov	cl, [rbp + 0x84]
	mul	cl
	mov	rcx, rax
	rep	movsw
	pop	rcx
	mov	ax, 0x0720
	rep	stosw
	ret

WriteChr:
	push	rbp
	push	rdi
	push	rsi
	push	rbx
	push 	rcx
	push	rdx
	push	rax
	mov	rdi, 0xB8000
	mov	rbp, 0x400
	cmp	byte [rbp + 0x63], 0xB4
	jnz	.skip5
	xor	di, di

.skip5:
	movzx	rbx, word [rbp + 0x4E]
	add	rdi, rbx
	movzx	rbx, byte [rbp + 0x62]
	mov	rsi, rdi
	movzx	rcx, byte [rbx * 2 + rbp + 0x50 + 1]
	movzx	rax, word [rbp + 0x4A]
	mul	rcx
	movzx	rdx, byte [rbx + 2 + rbp + 0x50]
	add	rax, rdx
	mov	dh, cl
	lea	rdi, [rdi + rax * 2]
	mov	al, [rsp]
	cmp	al, 10
	jz	.newline
	mov	[rdi], al
	mov	byte [rdi + 1], 7
	inc	dl
	cmp	dl, byte [rbp + 0x4a]
	jb	.skip6

.newline:
	mov	dl, 0
	inc	dh
	cmp	dh, byte [rbp + 0x84]
	jbe	.skip6
	dec	dh
	call	scroll_screen

.skip6:
	mov	[rbx * 2 + rbp + 0x50], dx
	pop	rax
	pop	rdx
	pop	rcx
	pop	rbx
	pop	rsi
	pop	rdi
	pop	rbp
	ret

WriteStr:
	push	rsi
	mov	rsi, rdx
	cld

.skip7:
	lodsb
	and	al, al
	jz	.skip8
	call	WriteChr
	jmp	.skip7

.skip8:
	pop	rsi
	ret

WriteStrX:
	push	rsi
	mov	rsi, [rsp + 8]
	cld

.skip9:
	lodsb
	and	al, al
	jz	.skip10
	call	WriteChr
	jmp	.skip9

.skip10:
	mov	[rsp + 8], rsi
	pop	rsi
	ret

WriteQW:
	push	rax
	shr	rax, 32
	call	WriteQW
	pop	rax

WriteDW:
	push	rax
	shr	rax, 16
	call	WriteW
	pop	rax

WriteW:
	push	rax
	shr	rax, 8
	call	WriteB
	pop	rax

WriteB:
	push	rax
	shr	rax, 4
	call	WriteNb
	pop	rax

WriteNb:
	and	al, 0x0F
	add	al, '0'
	cmp	al, '9'
	jbe	.skip11
	add	al, 7

.skip11:
	jmp	WriteChr

exception:
%assign excno 0
%rep	32
	push	excno
	jmp	.skip12
%assign excno excno + 1
%endrep

.skip12:
	call	WriteStrX
	db	10, "Exception ", 0
	pop	rax
	call	WriteB
	call	WriteStrX
	db	" errcode=", 0
	mov	rax, [rsp + 0]
	call	WriteQW
	call	WriteStrX
	db	" rip=", 0
	mov	rax, [rsp + 8]
	call	WriteQW
	call	WriteStrX
	db	10, 0

.skip13:
	jmp $

clock:
	push 	rbp
	mov	rbp, 0x400
	inc	dword [rbp + 0x6C]
	pop	rbp

interrupt:
	push	rax
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

keyboard:
	push	rax
	in	al, 0x60
	test	al, 0x80
	jnz	.skip14
	mov	r8b, al

.skip14:
	in	al, 0x61
	out	0x61, al
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

STACK segment use16

	resb 4096



-- 
Tactical Nuclear Kittens

Reply alex.buell (44) 4/28/2011 6:25:57 PM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:m2oo88-leq.ln1@nntp.local.net...
>
> I'm converting the following: http://www.japheth.de/JWasm/Dos64.html to
> NASM format, generating a binary image directly without linking. That's
> why I'm having problems getting it to work!
>

Great!

You understand that "JWasm -mz" produces an .exe, i.e., DOS64.EXE from
DOS64.asm.  Yes?  Are you producing a .com?  AFAIK, NASM cannot produce MZ
executables via a command line parameter.

I cut the first 48 bytes off of DOS64.EXE and commented out "jmp start16" in
your code so I could compare the .obj's.  That's where they seemed to match up.

1)
; ORG 0x100
ORG 0x0

2)
;jmp start16

3)
;nullidt dd 0x3ff
nullidt dw 0x3ff

4)
;push cs
;pop ds

5)
;err1: db "Mode is V86. Need REAL mode to switch to LONG mode.", 13, 10, '$'
err1: db "Mode is V86. Need REAL mode to switch to LONG mode!", 13, 10, '$'

6)
;err2: db "No 64 bit cpu detected.", 13, 10, '$'
err2: db "No 64bit cpu detected.", 13, 10, '$'

7)
offsets prior/after to make_exc_gates are different:
 _TEXT
 exception
 interrupt
 clock

8)

Something wrong around esc_pressed.
Disassemblies are off by 3 bytes.
I only see one byte difference here.
So, 2 bytes elsewhere...
The 0A00 is "db 10,0" in r_pressed
The FC is cld in scroll_screen.

Japheth's version:
00000000  E8E9000000        call dword 0xee
00000005  E8CA000000        call dword 0xd4
0000000A  0A00              or al,[rax]
0000000C  E958FFFFFF        jmp dword 0xffffffffffffff69
00000011  FF2D00000000      jmp dword far [rel 0x17]
00000017  97                xchg eax,edi
00000018  0200              add al,[rax]
0000001A  0010              add [rax],dl
0000001C  00FC              add ah,bh
0000001E  488BFE            mov rdi,rsi
00000021  480FB6454A        movzx rax,byte [rbp+0x4a]
00000026  50                push rax

Your version:
00000000  E8F0000000        call dword 0xf5
00000005  E8D1000000        call dword 0xdb
0000000A  0A00              or al,[rax]
0000000C  E958FFFFFF        jmp dword 0xffffffffffffff69
00000011  FF2425D4030000    jmp qword near [0x3d4]
00000018  97                xchg eax,edi
00000019  0200              add al,[rax]
0000001B  0010              add [rax],dl
0000001D  00FC              add ah,bh
0000001F  4889F7            mov rdi,rsi
00000022  480FB7454A        movzx rax,word [rbp+0x4a]
00000027  50                push rax


Sorry, that's as far as I got...  Got to go.  Maybe more later.


Rod Pemberton


Reply do_not_have7664 (117) 4/28/2011 11:41:39 PM

On Thu, 2011-04-28 at 19:41 -0400, Rod Pemberton wrote:

> You understand that "JWasm -mz" produces an .exe, i.e., DOS64.EXE from
> DOS64.asm.  Yes?  Are you producing a .com?  AFAIK, NASM cannot produce MZ
> executables via a command line parameter.

Ah, that does explain a lot. I was making a tiny COM, not an EXE;
I didn't think of that.

> I cut the first 48 bytes off of DOS64.EXE and comment out "jmp start16" in
> your code so comparing .obj's.  That's where they seemed to match up.
>
> 1)
> ; ORG 0x100
> ORG 0x0
>
> 2)
> ;jmp start16
>
> 3)
> ;nullidt dd 0x3ff
> nullidt dw 0x3ff
>
> 4)
> ;push cs
> ;pop ds

I added that to restore DS, as it seems to get trashed during the int 0x21
call with ah = 0x4A. I'm not sure why that happens, but restoring DS sorts it
out.

> 5)
> ;err1: db "Mode is V86. Need REAL mode to switch to LONG mode.", 13, 10, '$'
> err1: db "Mode is V86. Need REAL mode to switch to LONG mode!", 13, 10, '$'
>
> 6)
> ;err2: db "No 64 bit cpu detected.", 13, 10, '$'
> err2: db "No 64bit cpu detected.", 13, 10, '$'
>
> 7)
> offsets prior/after to make_exc_gates are different:
>  _TEXT
>  exception
>  interrupt
>  clock
>
> 8)
>
> Something wrong around esc_pressed.
> Disassemblies are off by 3 bytes.
> I only see one byte difference here.
> So, 2 bytes elsewhere...
> The 0A00 is "db 10,0" in r_pressed
> The FC is cld in scroll_screen.
>
> Japheth's version:
> 00000011  FF2D00000000      jmp dword far [rel 0x17]

> Your version:
> 00000011  FF2425D4030000    jmp qword near [0x3d4]

That would explain the differences.

> Sorry, that's as far as I got...  Got to go.  Maybe more later.

Not to worry, it's more than enough to get on with, thanks for taking a
look at it!

-- 
Tactical Nuclear Kittens

Reply alex.buell (44) 4/29/2011 12:47:44 AM

Rod Pemberton wrote:
> "Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
> message news:m2oo88-leq.ln1@nntp.local.net...
>> I'm converting the following: http://www.japheth.de/JWasm/Dos64.html to
>> NASM format, generating a binary image directly without linking. That's
>> why I'm having problems getting it to work!
>>
> 
> Great!
> 
> You understand that "JWasm -mz" produces an .exe, i.e., DOS64.EXE from
> DOS64.asm.  Yes?  Are you producing a .com?  AFAIK, NASM cannot produce MZ
> executables via a command line parameter.

Actually, you can:

http://www.nasm.us/xdoc/2.09.08/html/nasmdoc8.html#section-8.1.2

but I doubt if it is useful...

[probably useful observations snipped]

[original post]

	BITS 16
	ORG 0x100

	mov	ax, ss
	mov	dx, es
	sub	ax, dx
	mov	bx, sp
	shr	bx, 4
	add	bx, ax
	mov	ah, 0x4A
	int	0x21
	ret


We know that when dos loads a .com file, all segregs are the same, so ax 
is going to be zero. sp will be 0FFFEh - dos will give you a full 64k 
(unless there's less available) and will "push word 0". So "shr bx, 4" 
is going to truncate this value to 0FFFh. At this point, Wolfgang's 
concern could come into play - we've freed part of the stack, and could 
"lose" ds in the middle of an interrupt...

Since we're not trying to "shrink" the file (which would involve moving 
the stack), we know how many paragraphs we want to keep. Just "mov bx, 
1000h" should work properly! We may be able to skip both the "resize" 
and the "malloc" (it's all "our" memory, we don't really have to "ask" - 
a 64-bit machine with not enough memory is... improbable)
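
As a rough sketch of the arithmetic above: rounding SP up to a whole paragraph, instead of truncating it, also yields 1000h when SP is 0FFFEh. This assumes a .COM program in which SS equals ES, so the "ss - es" term of the original code is zero; the label "failed" is hypothetical.

```nasm
	BITS 16
	ORG 0x100

start:
	mov	bx, sp		; e.g. 0xFFFE in a full 64 KiB .COM segment
	add	bx, 15		; round up to a paragraph; may carry out of 16 bits
	rcr	bx, 1		; pull the carry back in: BX = (SP + 15) / 2
	shr	bx, 3		; BX = (SP + 15) / 16, i.e. 0x1000 for SP = 0xFFFE
	mov	ah, 0x4A	; DOS: resize the memory block at ES (the PSP)
	int	0x21
	jc	failed		; CF set: AX = error code, BX = max paragraphs

	mov	ax, 0x4C00	; exit with return code 0
	int	0x21

failed:
	mov	ax, 0x4C01	; exit with return code 1
	int	0x21
```

The rcr step matters only because 0FFFEh + 15 does not fit in 16 bits; for smaller SP values it behaves like a plain shift.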

When we get to:
[Japheth's code]
	mov	bx, _TEXT

we may be in trouble. That isn't going to work in Nasm's "-f bin" mode, 
I'm pretty sure. Maybe we can use cs... I'm a little lost at this point. 
I'll study Japheth's code some more...

I'm not going to be able to test this, but maybe I can get as far as "no 
64-bit cpu detected"... Meanwhile, "go for it", you guys! It would be a 
great example to have for Nasm!

Best,
Frank
Reply fbkotler9831 (124) 4/29/2011 1:26:40 AM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:heep88-4mv.ln1@nntp.local.net...
> > Japheth's version:
> > 00000011  FF2D00000000      jmp dword far [rel 0x17]
>
> > Your version:
> > 00000011  FF2425D4030000    jmp qword near [0x3d4]
>

8)
esc_pressed:
; jmp [bv]
jmp dword far [rel bv]
bv: dd backtoreal
dw 16

Or, maybe you need a "default rel" for the use64 section?
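
For comparison, the "default rel" variant might look like the sketch below. It is untested here; the expectation that it produces the same FF 2D encoding as Japheth's build rests only on the disassembly quoted above, and the zero offset in bv is a placeholder.

```nasm
	BITS 64
	default rel			; plain [label] operands become RIP-relative

esc_pressed:
	jmp	dword far [bv]		; far jump through a 16:32 pointer in memory
bv:	dd	0			; 32-bit offset (placeholder value)
	dw	16			; selector, as in the listing
```

With "default rel" in effect, the explicit "rel" keyword is no longer needed on each memory operand in the 64-bit part.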


Rod Pemberton


Reply do_not_have7664 (117) 4/29/2011 4:20:00 AM

On Fri, 2011-04-29 at 00:20 -0400, Rod Pemberton wrote:
> > > Japheth's version:
> > > 00000011  FF2D00000000      jmp dword far [rel 0x17]
> >
> > > Your version:
> > > 00000011  FF2425D4030000    jmp qword near [0x3d4]
> >
>
> 8)
> esc_pressed:
> ; jmp [bv]
> jmp dword far [rel bv]
> bv: dd backtoreal
> dw 16

No, I think that's what's needed. Thanks!
-- 
Tactical Nuclear Kittens

Reply alex.buell (44) 4/29/2011 10:58:41 AM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:28iq88-d33.ln1@nntp.local.net...
> On Fri, 2011-04-29 at 00:20 -0400, Rod Pemberton wrote:
> > > > Japheth's version:
> > > > 00000011  FF2D00000000      jmp dword far [rel 0x17]
> > >
> > > > Your version:
> > > > 00000011  FF2425D4030000    jmp qword near [0x3d4]
> > >
>
> > 8)
> > esc_pressed:
> > ; jmp [bv]
> > jmp dword far [rel bv]
> > bv: dd backtoreal
> > dw 16
>
> No, I think that's what's needed. Thanks!
>

But...  You need more!  :-)

9)
There are four 00 bytes inserted between 16-bit and 64-bit segment in
Japheth's version, but not in yours...

  BITS 64

align 8, db 0 ; new

  _TEXT segment use64

10)
; db 'Unknown key ', 0
db 'unknown key ', 0

11)
two changes in Writechr

; mov rdi, 0xB8000
mov rdi, dword 0xB8000

; mov rbp, 0x400
mov rbp, dword 0x400

12)
one change in clock

; mov rbp, 0x400
mov rbp, dword 0x400


OK, those changes up through 11 give me the same length (1,503 bytes) and
very close disassemblies.  That does not mean they are sync'd exactly.
Maybe, I'll go through the other differences another day.  Most look like
equivalent instruction encodings, but a few look like different byte
sequences, maybe different offsets.

HTH,


Rod Pemberton



Reply do_not_have7664 (117) 4/29/2011 6:59:31 PM

Rod Pemberton wrote:
...
> But...  You need more!  :-)

Yeah, maybe...

> 9)
> There are four 00 bytes inserted between 16-bit and 64-bit segment in
> Japheth's version, but not in yours...
> 
>   BITS 64
> 
> align 8, db 0 ; new
> 
>   _TEXT segment use64

This is because the above statement isn't proper Nasm syntax. Nasm 
creates a segment *named* "use64"! "_TEXT" is apparently a label, to 
Nasm, but not the name of a segment - I don't think we can use segment 
names like this in a .com file, anyway.

Something you might find useful is to put "[map all dos64.map]" at the 
top of the file. This will show what Nasm is doing with your "segments", 
and where Nasm thinks the labels go. I don't think it's anywhere near 
right, but I'm nowhere near figuring out how to make it right...

Am I correct in imagining that we set the base of our descriptors to the 
linear address where dos loaded us? That might make things simpler 
(offsets would be the same in all parts of the code). We might be able 
to cope with a zero base by doing...

section text64 vstart=0 ; or wherever we're loaded

... but I'm not sure how we'd find our way back to "backtoreal" if we do
that...

I've only begun to look at the code, and there's a *lot* of it I don't 
understand! In some ways, I think it can be "simplified" by being a .com 
file, but having "named segments" that can be used as symbols is handy, 
too. In "-f obj" we can do:

mov ax, data
mov ds, ax

... but it won't work in "-f bin"! Not as-intended, anyway. I think we
can work around it...

db 0xEA
dw $ + 4
dw _TEXT16

for example, is going to need to be patched at run-time. I think an MZ 
file does this at load-time, right?
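
A run-time patch of that kind could be sketched as follows, assuming a .COM image where CS = DS on entry. The labels rm_seg and rm_entry are made up for the illustration, and the mode switching that would sit between the patch and the jump is left out.

```nasm
	BITS 16
	ORG 0x100

start:
	mov	[rm_seg], cs		; run-time fixup: store the load segment
	; ... mode switching would happen here ...
	db	0xEA			; opcode for jmp far ptr16:16
	dw	rm_entry		; offset part
rm_seg:	dw	0			; segment part, patched above
rm_entry:
	mov	ax, 0x4C00		; exit with return code 0
	int	0x21
```

This does by hand what the DOS loader does for each entry in an MZ relocation table.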

Nasm thinks "_TEXT16" is a label at 0x100...

I wonder if it would be worth looking up Tomasz's Fasm code?

Best,
Frank

Reply fbkotler9831 (124) 4/29/2011 10:15:16 PM

On Fri, 2011-04-29 at 14:59 -0400, Rod Pemberton wrote:

> OK, those changes up through 11 gives me the same length (1,503 bytes) and
> very close disassemblies.  That does not mean they are sync'd exactly.
> Maybe, I'll go through the other differences another day.  Most look like
> equivalent instruction encodings, but a few look like different byte
> sequences, maybe different offsets.

OK, I did some more screwing around with the NASM conversion. Here are
my current findings:

(hacked exebin.mac file, sets up entry point, diff it with original to
see why)
; -*- nasm -*-
; NASM macro file to allow the `bin' output format to generate
; simple .EXE files by constructing the EXE header by hand.
; Adapted from a contribution by Yann Guidon <whygee_corp@hol.fr>

%define EXE_stack_size EXE_realstacksize

; modified to take entry point as argument
%macro EXE_begin 1
	  ORG 0E0h
	  section .text

header_start:
	  db 4Dh,5Ah		; EXE file signature
	  dw EXE_allocsize % 512
	  dw (EXE_allocsize + 511) / 512
	  dw 0			; relocation information: none
	  dw (header_end-header_start) / 16 ; header size in paragraphs
	  dw (EXE_absssize + EXE_realstacksize) / 16 ; min extra mem
	  dw (EXE_absssize + EXE_realstacksize) / 16 ; max extra mem
	  dw -0x10			; Initial SS (before fixup)
	  dw EXE_endbss + EXE_realstacksize ; Initial SP (1K DPMI+1K STACK)
	  dw 0			; (no) Checksum
	  dw %1			; Initial IP - start just after the header
	  dw -0x10 		; Initial CS (before fixup)
	  dw 0			; file offset to relocation table: none
	  dw 0			; (no overlay)
	  align 16,db 0
header_end:

EXE_startcode:
	  section .data
EXE_startdata:
	  section .bss
EXE_startbss:
%endmacro

%macro EXE_stack 1
EXE_realstacksize equ %1
%define EXE_stack_size EXE_bogusstacksize ; defeat EQU in EXE_end
%endmacro

%macro EXE_end 0
	  section .text
EXE_endcode:
	  section .data
EXE_enddata:
	  section .bss
	  alignb 4
EXE_endbss:

EXE_acodesize equ (EXE_endcode-EXE_startcode+3) & (~3)
EXE_datasize equ EXE_enddata-EXE_startdata
EXE_absssize equ (EXE_endbss-EXE_startbss+3) & (~3)
EXE_allocsize equ EXE_acodesize + EXE_datasize

EXE_stack_size equ 0x800	; default if nothing else was used
%endmacro

(and latest dos64.asm)
%include 'exebin.mac'

	EXE_begin start16

	BITS 16

_TEXT16 segment use16

	;jmp	start16

	section .text

GDTR	dw 4 * 8 - 1
	dd GDT
IDTR	dw 256 * 16 - 1
	dd 0
nullidt dw 0x3FF
	dd 0

	align 8
GDT	dq 0
	dw 0xFFFF, 0, 0x9A00, 0x0AF
	dw 0xFFFF, 0, 0x9A00, 0x000
	dw 0xFFFF, 0, 0x9200, 0x000

wPICMask dw 0

start16:
	push 	cs
	pop	ds
	mov	ax, cs
	movzx	eax, ax
	shl	eax, 4
	add	dword [GDTR + 2], eax
	mov	word [GDT + 2 * 8 + 2], ax
	mov	word [GDT + 3 * 8 + 2], ax
	shr	eax, 16
	mov	byte [GDT + 2 * 8 + 4], al
	mov	byte [GDT + 3 * 8 + 4], al

	mov	ax, ss
	mov	dx, es
	sub	ax, dx
	mov	bx, sp
	shr	bx, 4
	add	bx, ax
	mov	ah, 0x4A
	int	0x21
	push	cs
	pop	es
	mov	ax, ss
	mov	dx, cs
	sub	ax, dx
	shl	ax, 4
	add	ax, sp
	push	ds
	pop	ss
	mov	sp, ax

	;push	cs
	;pop	ds

	smsw	ax
	test	al, 1
	jz	skip1

	mov	dx, err1
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err1:	db	"Mode is V86. Need REAL mode to switch to LONG mode!", 13, 10, '$'

skip1:
	xor	edx, edx
	mov	eax, 0x80000000
	cpuid
	test 	edx, 0x20000000
	jnz	skip2

	mov	dx, err2
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err2:	db	"No 64 bit cpu detected.", 13, 10, '$'

skip2:
	mov	bx, 0x1000
	mov	ah, 0x48
	int	0x21
	jnc	skip3
	mov	dx, err3
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err3:	db	"Out of memory", 13, 10, '$'

skip3:
	add 	ax, 0x100 - 1
	mov	al, 0
	mov	es, ax

	sub	di, di
	mov	cx, 4096
	sub	eax, eax
	rep	stosd

	sub	di, di
	mov	ax, es
	movzx	eax, ax
	shl	eax, 4
	mov	cr3, eax

	lea	edx, [eax + 0x5000]
	mov	dword [IDTR + 2], edx

	or	eax, 111b
	add	eax, 0x1000
	mov	[es:di + 0x0000], eax
	add	eax, 0x1000
	mov	[es:di + 0x1000], eax
	add	eax, 0x1000
	mov	[es:di + 0x2000], eax
	mov	di, 0x3000
	mov	eax, 0 + 111b
	mov	cx, 256

.skip4:
	stosd
	add	di, 4
	add	eax, 0x1000
	loop	.skip4

	mov	bx, _TEXT
	movzx	ebx, bx
	shl	ebx, 4
	add 	[llg], ebx

	mov	di, 0x5000
	mov	cx, 32
	mov	edx, exception
	add	edx, ebx

make_exc_gates:
	mov	eax, edx
	stosw
	mov	ax, 8
	stosw
	mov	ax, 0x8E00
	stosd
	xor	eax, eax
	stosd
	stosd
	add	edx, 4
	loop	make_exc_gates
	mov	cx, 256 - 32

make_int_gates:
	mov	eax, interrupt
	add	eax, ebx
	stosw
	mov	ax, 8
	stosw
	mov	ax, 0x8E00
	stosd
	xor	eax, eax
	stosd
	stosd
	loop	make_int_gates

	mov	di, 0x5000
	mov	eax, ebx
	add	eax, clock
	mov	[es:di + 0x80 * 16 + 0], ax
	shr	eax, 16
	mov	[es:di + 0x80 * 16 + 6], ax

	mov	eax, ebx
	add	eax, keyboard
	mov	[es:di + 0x81 * 16 + 0], ax
	shr	eax, 16
	mov	[es:di + 0x81 * 16 + 6], ax

	pushf
	pop	ax
	and	ah, 0xBF
	push	ax
	popf

	cli
	in	al, 0xA1
	mov	ah, al
	in	al, 0x21
	mov	[wPICMask], ax
	mov	al, 10001b
	out	0x20, al
	mov	al, 10001b
	out	0xA0, al
	mov	al, 0x80
	out	0x21, al
	mov	al, 0x88
	out	0xA1, al
	mov	al, 100b
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov 	al, 1
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	al, 11111100b
	out	0x21, al
	in	al, 0xA1
	mov	al, 11111111b
	out	0xA1, al

	mov	eax, cr4
	or	eax, 1 << 5
	mov	cr4, eax

	mov	ecx, 0xC0000080

	rdmsr
	or	eax, 1 << 8
	wrmsr

	lgdt	[GDTR]
	lidt	[IDTR]

	mov	cx, ss
	movzx	ecx, cx
	shl	ecx, 4
	add	ecx, esp

	mov	eax, cr0
	or	eax, 0x80000001
	mov	cr0, eax

	db	0x66, 0xEA	; jmp 0x8:0x0
llg	dd	long_start
	dw	8

backtoreal:
	mov	eax, cr0
	and	eax, 0x7FFFFFF
	mov	cr0, eax
	mov	ecx, 0xC000080
	rdmsr
	and	ax, !1
	wrmsr
	mov	ax, 24
	mov	ss, ax
	mov	ds, ax
	mov	es, ax
	mov	eax, cr0
	and	al, 0xFE
	mov	cr0, eax

	db	0xEA
	dw	$+4
	dw	_TEXT16
	mov	ax, STACK
	mov	ss, ax
	mov	sp, 4096
	push	cs
	pop	ds
	lidt	[nullidt]
	mov	eax, cr4
	and	al, !0x20
	mov	cr4, eax

	mov	al, 10001b
	out	0x20, al
	mov	al, 10001b
	out	0xA0, al
	mov	al, 0x08
	out	0x21, al
	mov	al, 0x70
	out	0xA1, al
	mov	al, 100b
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov	al, 1
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	ax, [wPICMask]
	out	0x21, al
	mov	al, ah
	out	0xA1, al
	sti

	mov	ax, 0x4C00
	int	0x21

	BITS 64

	align 8, db 0

_TEXT segment use64

long_start:
	xor	eax, eax
	mov	ss, eax
	mov	esp, ecx
	sti

	call	WriteStrX
	db	'Hello 64bit', 10, 0

nextcmd:
	mov	r8b, 0

nocmd:
	cmp	r8b, 0
	jz	nocmd
	cmp	r8b, 1
	jz	esc_pressed
	cmp	r8b, 0x13
	jz	r_pressed
	call	WriteStrX
	db 	'Unknown key ', 0
	mov	al, r8b
	call	WriteB
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

r_pressed:
	call	WriteStrX
	db	10, "cr0=", 0
	mov	rax, cr0
	call	WriteQW
	call	WriteStrX
	db	10, "cr2=", 0
	mov	rax, cr2
	call	WriteQW
	call	WriteStrX
	db	10, "cr3=", 0
	mov	rax, cr3
	call	WriteQW
	call	WriteStrX
	db	10, "cr4=", 0
	mov	rax, cr4
	call	WriteQW
	call	WriteStrX
	db	10, "cr8=", 0
	mov	rax, cr8
	call	WriteQW
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

esc_pressed:
	jmp	dword far [rel bv]
bv:	dd	backtoreal
	dw	16

scroll_screen:
	cld
	mov	rdi, rsi
	movzx	rax, word [rbp + 0x4a]
	push	rax
	lea	rsi, [rsi + 2 * rax]
	mov	cl, [rbp + 0x84]
	mul	cl
	mov	rcx, rax
	rep	movsw
	pop	rcx
	mov	ax, 0x0720
	rep	stosw
	ret

WriteChr:
	push	rbp
	push	rdi
	push	rsi
	push	rbx
	push 	rcx
	push	rdx
	push	rax
	mov	rdi, dword 0xB8000
	mov	rbp, dword 0x400
	cmp	byte [rbp + 0x63], 0xB4
	jnz	.skip5
	xor	di, di

.skip5:
	movzx	rbx, word [rbp + 0x4E]
	add	rdi, rbx
	movzx	rbx, byte [rbp + 0x62]
	mov	rsi, rdi
	movzx	rcx, byte [rbx * 2 + rbp + 0x50 + 1]
	movzx	rax, word [rbp + 0x4A]
	mul	rcx
	movzx	rdx, byte [rbx + 2 + rbp + 0x50]
	add	rax, rdx
	mov	dh, cl
	lea	rdi, [rdi + rax * 2]
	mov	al, [rsp]
	cmp	al, 10
	jz	.newline
	mov	[rdi], al
	mov	byte [rdi + 1], 7
	inc	dl
	cmp	dl, byte [rbp + 0x4a]
	jb	.skip6

.newline:
	mov	dl, 0
	inc	dh
	cmp	dh, byte [rbp + 0x84]
	jbe	.skip6
	dec	dh
	call	scroll_screen

.skip6:
	mov	[rbx * 2 + rbp + 0x50], dx
	pop	rax
	pop	rdx
	pop	rcx
	pop	rbx
	pop	rsi
	pop	rdi
	pop	rbp
	ret

WriteStr:
	push	rsi
	mov	rsi, rdx
	cld

.skip7:
	lodsb
	and	al, al
	jz	.skip8
	call	WriteChr
	jmp	.skip7

.skip8:
	pop	rsi
	ret

WriteStrX:
	push	rsi
	mov	rsi, [rsp + 8]
	cld

.skip9:
	lodsb
	and	al, al
	jz	.skip10
	call	WriteChr
	jmp	.skip9

.skip10:
	mov	[rsp + 8], rsi
	pop	rsi
	ret

WriteQW:
	push	rax
	shr	rax, 32
	call	WriteQW
	pop	rax

WriteDW:
	push	rax
	shr	rax, 16
	call	WriteW
	pop	rax

WriteW:
	push	rax
	shr	rax, 8
	call	WriteB
	pop	rax

WriteB:
	push	rax
	shr	rax, 4
	call	WriteNb
	pop	rax

WriteNb:
	and	al, 0x0F
	add	al, '0'
	cmp	al, '9'
	jbe	.skip11
	add	al, 7

.skip11:
	jmp	WriteChr

exception:
%assign excno 0
%rep	32
	push	excno
	jmp	.skip12
%assign excno excno + 1
%endrep

.skip12:
	call	WriteStrX
	db	10, "Exception ", 0
	pop	rax
	call	WriteB
	call	WriteStrX
	db	" errcode=", 0
	mov	rax, [rsp + 0]
	call	WriteQW
	call	WriteStrX
	db	" rip=", 0
	mov	rax, [rsp + 8]
	call	WriteQW
	call	WriteStrX
	db	10, 0

.skip13:
	jmp $

clock:
	push 	rbp
	mov	rbp, dword 0x400
	inc	dword [rbp + 0x6C]
	pop	rbp

interrupt:
	push	rax
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

keyboard:
	push	rax
	in	al, 0x60
	test	al, 0x80
	jnz	.skip14
	mov	r8b, al

.skip14:
	in	al, 0x61
	out	0x61, al
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

	section .bss

STACK segment use16
	EXE_stack 4096
	EXE_end

It still blows up though :)
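
Separately from the crash, a few constants in the backtoreal part of the listing look like transcription slips. They would only matter once Esc is pressed, so they probably don't explain a failure at startup. The fragment below is a sketch of just the affected instructions, not a complete routine. It relies on NASM's "!" being logical negation (so "!1" and "!0x20" both evaluate to 0) while "~" is the bitwise complement.

```nasm
	BITS 16

	mov	eax, cr0
	and	eax, 0x7FFFFFFF		; clear CR0.PG (bit 31); listing has 0x7FFFFFF
	mov	cr0, eax

	mov	ecx, 0xC0000080		; EFER, the same MSR number used when enabling
	rdmsr
	and	eax, ~(1 << 8)		; clear only EFER.LME (bit 8)
	wrmsr

	mov	eax, cr4
	and	al, ~0x20		; clear only CR4.PAE (bit 5)
	mov	cr4, eax
```

The MSR number is the one most likely to bite: 0xC000080 is a different register from the 0xC0000080 used earlier in the same listing.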
-- 
Tactical Nuclear Kittens

Reply alex.buell (44) 4/29/2011 10:43:35 PM

Freedom on the Oceans wrote:

...
> It still blows up though :)

You aren't getting here, but this looks like a typo:

WriteQW:
	push	rax
	shr	rax, 32
	call	WriteQW
	pop	rax

Just thought I'd mention it, while I spotted it...
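
For reference, here is how the fall-through chain would presumably read with the call fixed. WriteChr is replaced by a stub so the fragment assembles on its own, and the local label .digit is a made-up name.

```nasm
	BITS 64

WriteQW:			; print RAX as 16 hex digits
	push	rax
	shr	rax, 32
	call	WriteDW		; high dword first (the listing calls WriteQW here)
	pop	rax		; then fall through for the low dword

WriteDW:
	push	rax
	shr	rax, 16
	call	WriteW
	pop	rax

WriteW:
	push	rax
	shr	rax, 8
	call	WriteB
	pop	rax

WriteB:
	push	rax
	shr	rax, 4
	call	WriteNb
	pop	rax

WriteNb:
	and	al, 0x0F	; keep one nibble
	add	al, '0'
	cmp	al, '9'
	jbe	.digit
	add	al, 7		; map 10..15 to 'A'..'F'
.digit:
	jmp	WriteChr

WriteChr:
	ret			; stub standing in for the real routine
```

As posted, WriteQW calls itself before anything is popped, so it would recurse until the stack runs out.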

[map all dos64.map] ; trust me.

Best,
Frank

Reply fbkotler9831 (124) 4/30/2011 3:05:59 AM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:ohrr88-kvq.ln1@nntp.local.net...
> On Fri, 2011-04-29 at 14:59 -0400, Rod Pemberton wrote:
>
> > OK, those changes up through 11 gives me the same length (1,503 bytes)
> > and very close disassemblies.  That does not mean they are sync'd
> > exactly. Maybe, I'll go through the other differences another day.
> > Most look like equivalent instruction encodings, but a few look like
> > different byte sequences, maybe different offsets.
>
> OK, I did some more screwing around with the NASM conversion. I present
> my current findings here;
>

No offense, but I think you're rushing a bit.  I'd suggest getting the base
.obj's identical first, or at least 100% equivalent.  Once they are
identical or equivalent in terms of the bytes, then you can patch on a
.EXE header and *it will work*, because Japheth's works.  There are 96 byte
differences left.  With a quick skim, most appear to me to be equivalent
instruction encodings.  They'll all have to be checked though.  So, only 16
bytes appear to be "wrong" to me.  Six that concern me are right near the
start of the .obj.  I'm not sure what they are yet or how they got there.
They don't appear to be part of the source code.  JWasm linker patch-up
values for .EXE?

Frank found another error with WriteQW instead of WriteDW.  Some of the
offsets that I mentioned were incorrect and misuse of "segment" lines as
Frank noted will probably be identified if you follow Frank's advice on
using "[map all dos64.map]".  I've not used that feature.

> It still blows up though :)
>

Yep...

Once they are identical in binary, or 100% equivalent, you can "cut-n-paste"
the first 48 bytes from Japheth's version compiled with JWasm onto yours.
If your version works, all is good in the .obj part.  Then, move on to
getting NASM .EXE code to work.


Rod Pemberton



Reply do_not_have7664 (117) 4/30/2011 6:30:05 AM

On Sat, 2011-04-30 at 02:30 -0400, Rod Pemberton wrote:

> No offense, but I think you're rushing a bit.  I'd suggest getting the base
> .obj's identical first, or at least 100% equivalent.  Once they are
> identical or equivalent in terms of the bytes, then you can patch on a
> .EXE header and *it will work*, because Japheth's works.  There are 96 byte
> differences left.  With a quick skim, most appear to me to be equivalent
> instruction encodings.  They'll all have to be checked though.  So, only 16
> bytes appear to be "wrong" to me.  Six that concern me are right near the
> start of the .obj.  I'm not sure what they are yet or how they got there.
> They don't appear to be part of the source code.  JWasm linker patch-up
> values for .EXE?

As it turns out it's easier to get it looking identical binary-wise, so
I've backed up my work to a different file, and reverted the changes.

I've also tacked on the .exe header as a list of db constants to make it
easier to do a diff. I found a program called vbindiff which can take
both the original exe and my exe and do a binary diff, with all
different bytes highlighted in red.

As it turns out, they're quite close to being identical.

> Frank found another error with WriteQW instead of WriteDW.  Some of the
> offsets that I mentioned were incorrect and misuse of "segment" lines as
> Frank noted will probably be identified if you follow Frank's advice on
> using "[map all dos64.map]".  I've not used that feature.

That one's been fixed.

> > It still blows up though :)
> >
>
> Yep...
>
> Once they are identical in binary, or 100% equivalent, you can "cut-n-paste"
> the first 48 bytes from Japheth's version compiled with JWasm onto yours.
> If your version works, all is good in the .obj part.  Then, move on to
> getting NASM .EXE code to work.

That's the plan, thanks
--
Tactical Nuclear Kittens

Reply alex.buell (44) 4/30/2011 11:07:35 AM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:q47t88-jqi.ln1@nntp.local.net...
>
> I found a program called vbindiff which can take
> both the original exe and my exe and do a binary diff, with all
> different bytes highlighted in red.
>

If it helps, I was using these tools:

  DOS's FC (file compare) program with /B parameter to do binary differences
  NDISASM, DOS version, with -e parameter (hex value) to skip to locations
at or around the differences
  DOS EDIT to find locations where instructions matched those in disassembly
  NASM, DOS version, to recompile your version
  QED to cut 48 bytes from JWasm version (ancient, small file, DOS text
editor with hex editing support)
  GNU LESS from DJGPP project to pause and scroll NDISASM's and FC's output

All are readily available except QED.  You'll probably have to locate
another text editor that supports hex editing to get the .EXE header from
JWasm compiled version.  DOS's COPY with /B parameter can be used to combine
the .EXE header from JWasm version and your .obj.
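
For anyone doing this from Linux rather than DOS, the FC /B step can be
approximated in a few lines of Python.  This is only a sketch: the two byte
strings are made-up stand-ins for the JWasm and NASM images (in practice you
would read the real files with open(path, "rb").read()), and diff_offsets and
group_runs are hypothetical helper names, not part of any tool mentioned here.

```python
def diff_offsets(a: bytes, b: bytes):
    """Yield (offset, byte_a, byte_b) for every position where the inputs differ."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            yield offset, x, y


def group_runs(offsets):
    """Collapse ascending offsets into inclusive (start, end) runs."""
    runs = []
    for off in offsets:
        if runs and off == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], off)   # extend the current run
        else:
            runs.append((off, off))         # start a new run
    return runs


# Made-up stand-ins for the two assembled images (assumption, not real data).
jwasm_image = bytes([0x2E, 0x8B, 0xC0, 0x2E, 0x8B, 0xC0, 0x00, 0x00])
nasm_image = bytes([0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x00, 0x00])

diffs = list(diff_offsets(jwasm_image, nasm_image))
runs = group_runs(off for off, _, _ in diffs)

# Print each run in hex, as a range when it covers more than one byte.
for start, end in runs:
    print(f"{start:X}-{end:X}" if end > start else f"{start:X}")
```

Note that zip() stops at the shorter input, so a difference in file length has
to be checked separately.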


Rod Pemberton


Reply do_not_have7664 (117) 4/30/2011 7:59:12 PM

On Sat, 2011-04-30 at 15:59 -0400, Rod Pemberton wrote:
> > I found a program called vbindiff which can take
> > both the original exe and my exe and do a binary diff, with all
> > different bytes highlighted in red.
> >
>
> If it helps, I was using these tools:
>
>   DOS's FC (file compare) program with /B parameter to do binary differences
>   NDISASM, DOS version, with -e parameter (hex value) to skip to locations
> at or around the differences
>   DOS EDIT to find locations where instructions matched those in disassembly
>   NASM, DOS version, to recompile your version
>   QED to cut 48 bytes from JWasm version (ancient, small file, DOS text
> editor with hex editing support)
>   GNU LESS from DJGPP project to pause and scroll NDISASM's and FC's output
>
> All are readily available except QED.  You'll probably have to locate
> another text editor that supports hex editing to get the .EXE header from
> JWasm compiled version.  DOS's COPY with /B parameter can be used to combine
> the .EXE header from JWasm version and your .obj.

I'm cross assembling from Linux to DOS, and testing on amd64 with
VirtualBox. Seems to work quite well. Japheth's original version works
just fine with virtualisation enabled in VirtualBox.

Just something to note: HIMEM/EMM386 will crash with an EMM386 exception
in VirtualBox if virtualisation is enabled, so I don't boot DOS with
these two drivers loaded.

The six bytes that you noted at the beginning appear to be generated
through the 'align 8' directive used. NASM inserts six 0x90 bytes,
whilst JWASM inserts two sets of 0x2E 0x8B 0xC0 to align the entry
point. I'm really not sure why JWASM does that.

I'll just paste the latest version of my work so far:

	; 48 byte header

	db	0x4D, 0x5A, 0x0F, 0x00, 0x04, 0x00, 0x03, 0x00
	db	0x03, 0x00, 0x01, 0x01, 0xFF, 0xFF, 0x5E, 0x00
	db	0x00, 0x10, 0x00, 0x00, 0x3A, 0x00, 0x00, 0x00
	db	0x1E, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8F, 0x01
	db	0x00, 0x00, 0xC4, 0x02, 0x00, 0x00, 0xC7, 0x02
	db	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00

;%include 'exebin.mac'

	ORG	-0x30 	; takes into account the header tacked on

;	EXE_begin start16

	BITS	16

_TEXT16 segment use16

GDTR	dw	4 * 8 - 1
	dd	GDT
IDTR	dw	256 * 16 - 1
	dd	0
nullidt dw	0x3FF
	dd	0

	align	8
GDT	dq	0
	dw	0xFFFF, 0, 0x9A00, 0x0AF
	dw	0xFFFF, 0, 0x9A00, 0x000
	dw	0xFFFF, 0, 0x9200, 0x000

wPICMask dw	0

start16:
	push 	cs
	pop	ds
	mov	ax, cs
	movzx	eax, ax
	shl	eax, 4
	add	dword [GDTR + 2], eax
	mov	word [GDT + 2 * 8 + 2], ax
	mov	word [GDT + 3 * 8 + 2], ax
	shr	eax, 16
	mov	byte [GDT + 2 * 8 + 4], al
	mov	byte [GDT + 3 * 8 + 4], al

	mov	ax, ss
	mov	dx, es
	sub	ax, dx
	mov	bx, sp
	shr	bx, 4
	add	bx, ax
	mov	ah, 0x4A
	int	0x21
	push	cs
	pop	es
	mov	ax, ss
	mov	dx, cs
	sub	ax, dx
	shl	ax, 4
	add	ax, sp
	push	ds
	pop	ss
	mov	sp, ax

	smsw	ax
	test	al, 1
	jz	skip1

	mov	dx, err1
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err1:	db	"Mode is V86. Need REAL mode to switch to LONG mode!", 13, 10, '$'

skip1:
	xor	edx, edx
	mov	eax, 0x80000001
	cpuid
	test 	edx, 0x20000000
	jnz	skip2

	mov	dx, err2
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err2:	db	"No 64bit cpu detected.", 13, 10, '$'

skip2:
	mov	bx, 0x1000
	mov	ah, 0x48
	int	0x21
	jnc	skip3
	mov	dx, err3
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err3:	db	"Out of memory", 13, 10, '$'

skip3:
	add 	ax, 0x100 - 1
	mov	al, 0
	mov	es, ax

	sub	di, di
	mov	cx, 4096
	sub	eax, eax
	rep	stosd

	sub	di, di
	mov	ax, es
	movzx	eax, ax
	shl	eax, 4
	mov	cr3, eax

	lea	edx, [eax + 0x5000]
	mov	dword [IDTR + 2], edx

	or	eax, 111b
	add	eax, 0x1000
	mov	[es:di + 0x0000], eax
	add	eax, 0x1000
	mov	[es:di + 0x1000], eax
	add	eax, 0x1000
	mov	[es:di + 0x2000], eax
	mov	di, 0x3000
	mov	eax, 0 + 111b
	mov	cx, 256

.skip4:
	stosd
	add	di, 4
	add	eax, 0x1000
	loop	.skip4

	mov	bx, _TEXT
	movzx	ebx, bx
	shl	ebx, 4
	add 	[llg], ebx

	mov	di, 0x5000
	mov	cx, 32
	mov	edx, exception
	add	edx, ebx

make_exc_gates:
	mov	eax, edx
	stosw
	mov	ax, 8
	stosw
	mov	ax, 0x8E00
	stosd
	xor	eax, eax
	stosd
	stosd
	add	edx, 4
	loop	make_exc_gates
	mov	cx, 256 - 32

make_int_gates:
	mov	eax, interrupt
	add	eax, ebx
	stosw
	mov	ax, 8
	stosw
	mov	ax, 0x8E00
	stosd
	xor	eax, eax
	stosd
	stosd
	loop	make_int_gates

	mov	di, 0x5000
	mov	eax, ebx
	add	eax, clock
	mov	[es:di + 0x80 * 16 + 0], ax
	shr	eax, 16
	mov	[es:di + 0x80 * 16 + 6], ax

	mov	eax, ebx
	add	eax, keyboard
	mov	[es:di + 0x81 * 16 + 0], ax
	shr	eax, 16
	mov	[es:di + 0x81 * 16 + 6], ax

	pushf
	pop	ax
	and	ah, 0xBF
	push	ax
	popf

	cli
	in	al, 0xA1
	mov	ah, al
	in	al, 0x21
	mov	[wPICMask], ax
	mov	al, 10001b
	out	0x20, al
	mov	al, 10001b
	out	0xA0, al
	mov	al, 0x80
	out	0x21, al
	mov	al, 0x88
	out	0xA1, al
	mov	al, 100b
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov 	al, 1
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	al, 11111100b
	out	0x21, al
	in	al, 0xA1
	mov	al, 11111111b
	out	0xA1, al

	mov	eax, cr4
	or	eax, 1 << 5
	mov	cr4, eax

	mov	ecx, 0xC0000080

	rdmsr
	or	eax, 1 << 8
	wrmsr

	lgdt	[GDTR]
	lidt	[IDTR]

	mov	cx, ss
	movzx	ecx, cx
	shl	ecx, 4
	add	ecx, esp

	mov	eax, cr0
	or	eax, 0x80000001
	mov	cr0, eax

	db	0x66, 0xEA	; jmp 0x8:0x0
llg	dd	long_start
	dw	8

backtoreal:
	mov	eax, cr0
	and	eax, 0x7FFFFFF
	mov	cr0, eax
	mov	ecx, 0xC000080
	rdmsr
	and	ax, !1
	wrmsr
	mov	ax, 24
	mov	ss, ax
	mov	ds, ax
	mov	es, ax
	mov	eax, cr0
	and	al, 0xFE
	mov	cr0, eax

	db	0xEA
	dw	$+4
	dw	_TEXT16
	mov	ax, STACK
	mov	ss, ax
	mov	sp, 4096
	push	cs
	pop	ds
	lidt	[nullidt]
	mov	eax, cr4
	and	al, !0x20
	mov	cr4, eax

	mov	al, 10001b
	out	0x20, al
	mov	al, 10001b
	out	0xA0, al
	mov	al, 0x08
	out	0x21, al
	mov	al, 0x70
	out	0xA1, al
	mov	al, 100b
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov	al, 1
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	ax, [wPICMask]
	out	0x21, al
	mov	al, ah
	out	0xA1, al
	sti

	mov	ax, 0x4C00
	int	0x21

	BITS 64

	align 8, db 0

_TEXT segment use64

long_start:
	xor	eax, eax
	mov	ss, eax
	mov	esp, ecx
	sti

	call	WriteStrX
	db	'Hello 64bit', 10, 0

nextcmd:
	mov	r8b, 0

nocmd:
	cmp	r8b, 0
	jz	nocmd
	cmp	r8b, 1
	jz	esc_pressed
	cmp	r8b, 0x13
	jz	r_pressed
	call	WriteStrX
	db 	'Unknown key ', 0
	mov	al, r8b
	call	WriteB
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

r_pressed:
	call	WriteStrX
	db	10, "cr0=", 0
	mov	rax, cr0
	call	WriteQW
	call	WriteStrX
	db	10, "cr2=", 0
	mov	rax, cr2
	call	WriteQW
	call	WriteStrX
	db	10, "cr3=", 0
	mov	rax, cr3
	call	WriteQW
	call	WriteStrX
	db	10, "cr4=", 0
	mov	rax, cr4
	call	WriteQW
	call	WriteStrX
	db	10, "cr8=", 0
	mov	rax, cr8
	call	WriteQW
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

esc_pressed:
	jmp	dword far [rel bv]
bv:	dd	backtoreal
	dw	16

scroll_screen:
	cld
	mov	rdi, rsi
	movzx	rax, word [rbp + 0x4a]
	push	rax
	lea	rsi, [rsi + 2 * rax]
	mov	cl, [rbp + 0x84]
	mul	cl
	mov	rcx, rax
	rep	movsw
	pop	rcx
	mov	ax, 0x0720
	rep	stosw
	ret

WriteChr:
	push	rbp
	push	rdi
	push	rsi
	push	rbx
	push 	rcx
	push	rdx
	push	rax
	mov	rdi, dword 0xB8000
	mov	rbp, dword 0x400
	cmp	byte [rbp + 0x63], 0xB4
	jnz	.skip5
	xor	di, di

.skip5:
	movzx	rbx, word [rbp + 0x4E]
	add	rdi, rbx
	movzx	rbx, byte [rbp + 0x62]
	mov	rsi, rdi
	movzx	rcx, byte [rbx * 2 + rbp + 0x50 + 1]
	movzx	rax, word [rbp + 0x4A]
	mul	rcx
	movzx	rdx, byte [rbx + 2 + rbp + 0x50]
	add	rax, rdx
	mov	dh, cl
	lea	rdi, [rdi + rax * 2]
	mov	al, [rsp]
	cmp	al, 10
	jz	.newline
	mov	[rdi], al
	mov	byte [rdi + 1], 7
	inc	dl
	cmp	dl, byte [rbp + 0x4a]
	jb	.skip6

.newline:
	mov	dl, 0
	inc	dh
	cmp	dh, byte [rbp + 0x84]
	jbe	.skip6
	dec	dh
	call	scroll_screen

.skip6:
	mov	[rbx * 2 + rbp + 0x50], dx
	pop	rax
	pop	rdx
	pop	rcx
	pop	rbx
	pop	rsi
	pop	rdi
	pop	rbp
	ret

WriteStr:
	push	rsi
	mov	rsi, rdx
	cld

.skip7:
	lodsb
	and	al, al
	jz	.skip8
	call	WriteChr
	jmp	.skip7

.skip8:
	pop	rsi
	ret

WriteStrX:
	push	rsi
	mov	rsi, [rsp + 8]
	cld

.skip9:
	lodsb
	and	al, al
	jz	.skip10
	call	WriteChr
	jmp	.skip9

.skip10:
	mov	[rsp + 8], rsi
	pop	rsi
	ret

WriteQW:
	push	rax
	shr	rax, 32
	call	WriteDW
	pop	rax

WriteDW:
	push	rax
	shr	rax, 16
	call	WriteW
	pop	rax

WriteW:
	push	rax
	shr	rax, 8
	call	WriteB
	pop	rax

WriteB:
	push	rax
	shr	rax, 4
	call	WriteNb
	pop	rax

WriteNb:
	and	al, 0x0F
	add	al, '0'
	cmp	al, '9'
	jbe	.skip11
	add	al, 7

.skip11:
	jmp	WriteChr

exception:
%assign excno 0
%rep	32
	push	excno
	jmp	.skip12
%assign excno excno + 1
%endrep

.skip12:
	call	WriteStrX
	db	10, "Exception ", 0
	pop	rax
	call	WriteB
	call	WriteStrX
	db	" errcode=", 0
	mov	rax, [rsp + 0]
	call	WriteQW
	call	WriteStrX
	db	" rip=", 0
	mov	rax, [rsp + 8]
	call	WriteQW
	call	WriteStrX
	db	10, 0

.skip13:
	jmp $

clock:
	push 	rbp
	mov	rbp, dword 0x400
	inc	dword [rbp + 0x6C]
	pop	rbp

interrupt:
	push	rax
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

keyboard:
	push	rax
	in	al, 0x60
	test	al, 0x80
	jnz	.skip14
	mov	r8b, al

.skip14:
	in	al, 0x61
	out	0x61, al
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

STACK segment use16
	resb 	0

;	EXE_stack 4096
;	EXE_end

--
Tactical Nuclear Kittens

Reply alex.buell (44) 4/30/2011 9:33:00 PM

Freedom on the Oceans wrote:

...
> _TEXT16 segment use16

Not Nasm syntax! This is not doing what Masm/Tasm/Jwasm would do with
it. (Makes a section/segment named "use16". "_TEXT16" is a label in the
default ".text" section.)

segment _TEXT16 use16

... *almost* Nasm syntax... The only "attributes" to section/segment
names (in "-f bin") are "align=??", "start=??", and "vstart=??". Nasm 
warns that it is ignoring "use16". But Nasm does not make a "symbol" out 
of section names in "-f bin" mode, so the assembly errors out when you 
try to use it. :(

I don't understand why you guys are trying to use such a convoluted 
method to build this. Why not assemble it with "-f obj" and link it to 
an MZ, if that's what you want? Oh, no JWlink in Linux? (Japheth 
apparently supplies only a binary?) Could you link it in the same 
"virtual dos" where you're testing it? I've got Alink running in Linux, 
if that would help. (might not be "identical" - linkers have some 
flexibility...)

Or... why not make it a .com file? It would require some changes - 
startup conditions are different - but a lot of the "segreg shuffle" 
that Japheth does is to bring his MZ into a more ".com-like" state, 
unless I'm mistaken... Should be simpler, but we aren't going to be able 
to use labels like "_TEXT16", "_TEXT", and "STACK". We know - at runtime 
- what our load segment is, so we should be able to patch the places 
these are used. Am I thinking straight about this?

I'm going to have to learn some new stuff to understand this code, as 
well as "review" some stuff my memory's fuzzy on. (The "Nasm version" 
should be better commented, ideally) I think he's patching the "base" of 
descriptors to our linear address, right? Then in "new memory", he makes 
page tables(?), some "exe gates"(?), and some "int gates"(?). Then he 
clears a bit in the flags... (nmi?) Does something with the PIC (matches 
it to the IDT we built?) Does something with cr4. What's that value 
we're putting in ecx? Does something with msr(?) Loads GDT and IDT 
(we've patched the addresses - these need to be linear, right?). Then, 
we overwrite that funny value we put in ecx - this ecx is used to set 
esp after the jump to 64-bit code. We set cr0 to PMODE and make a far 
jump.  This part really puzzles me! ss is set to zero! In 32-bit code, 
this would be an "invalid descriptor". It's okay in 64-bit? We took a 
long jump to a descriptor, but ss is zero. What's the "base" on ss? At 
this point, I'm pretty lost!

I haven't been able to get really "focused" on this, as yet, but I'm 
"thinking about it"...

Best,
Frank
Reply fbkotler9831 (124) 5/1/2011 5:45:35 AM

"Frank Kotler" <fbkotler@nospicedham.myfairpoint.net> wrote in message
news:ipisb5$p1s$1@speranza.aioe.org...
> Freedom on the Oceans wrote:
>
> I don't understand why you guys are trying to use such a
> convoluted method to build this.
>

You guys?  Hmm...  I take it Frank is frustrated ... ?

Yes, he needs to fix the segment stuff.  I'm not sure how to do that.  I
would probably create a label and eliminate the other MASM-like cruft.
I.e., one big segment...  It's just a big binary blob.  I take it that that
may be a problem if a linker is used?

On his last post, he didn't state whether it's working yet or not.  I'm
assuming it's not.  The segment stuff not being done correctly as you noted
is likely one of the remaining reasons for that.  I.e., wrong offsets in
those 16 bytes that looked "wrong" to me.

IMO, I'm not using a convoluted method.  I was just trying to get the main
(.obj) portion of both assembly programs to match when compiled.  That's how
I tell that assemblies are equivalent.  How do you?  It seems his NASM
version is close to being correct, even if some stuff is incorrect for now.

> Why not assemble it with "-f obj" and link it to
> an MZ, if that's what you want?
>

... and ...

> Or... why not make it a .com file?
>

IMO, there is nothing wrong with either...  Once he knows the NASM assembly
is correct, he can .EXE it, .COM it, or tweak it to his heart's content.
AIUI, it's not working yet, so doing either of those is premature, IMO.

> ... Japheth does [all this "stuff" in his code] ...
>

I haven't tried to figure out Japheth's code.  Maybe Alex has.  For me, that
would just add to the confusion.  ISTM that an equivalent binary in NASM
assembly is the first step.  There shouldn't really be any need for
"porting" or rewriting since x86 binary is x86 binary.  It doesn't matter if
NASM code or if JWasm code produced it.  If they're identical, they're
identical.

I may get back to checking out stuff in a day or two.


Rod Pemberton


Reply do_not_have7664 (117) 5/1/2011 7:32:24 AM

On Sun, 2011-05-01 at 03:32 -0400, Rod Pemberton wrote:
> "Frank Kotler" <fbkotler@nospicedham.myfairpoint.net> wrote in message
> news:ipisb5$p1s$1@speranza.aioe.org...
> > Freedom on the Oceans wrote:
> >
> > I don't understand why you guys are trying to use such a
> > convoluted method to build this.
> >
>
> You guys?  Hmm...  I take it Frank is frustrated ... ?

:)

> Yes, he needs to fix the segment stuff.  I'm not sure how to do that.  I
> would probably create a label and eliminate the other MASM-like cruft.
> I.e., one big segment...  It's just a big binary blob.  I take it that that
> may be a problem if a linker is used?

Or port alink 1.6 to Linux, which I have just *redone* as I'd already
ported it back in 2005, but it wasn't quite right. The latest patch for
it has just been posted.

> On his last post, he didn't state whether it's working yet or not.  I'm
> assuming it's not.  The segment stuff not being done correctly as you noted
> is likely one of the remaining reasons for that.  I.e., wrong offsets in
> those 16 bytes that looked "wrong" to me.

It looks correct but doesn't run. As far as I've tested it, it works
until it's time to enable long mode. vbindiff picks out a lot of
differences between the JWASM and NASM versions because they emit
different encodings for the opcodes used (Why??? It used to be much
easier in the old Z80/6502 days!)

> I haven't tried to figure out Japheth's code.  Maybe Alex has.  For me, that
> would just add to the confusion.  ISTM that an equivalent binary in NASM
> assembly is the first step.  There shouldn't really be any need for
> "porting" or rewriting since x86 binary is x86 binary.  It doesn't matter if
> NASM code or if JWasm code produced it.  If they're identical, they're
> identical.

This is the next step, figuring what does what. Thankfully the JWASM
version is nicely documented.

> I may get back to checking out stuff in a day or two.

Thanks, it's been a challenging project!
--
Tactical Nuclear Kittens

Reply alex.buell (44) 5/1/2011 8:46:42 AM

Frank Kotler wrote:
> Freedom on the Oceans wrote:

[...]
I too have been working on a 64-32-16 switch for a while ...
but I can start from my own RM16/PM32 environment, and I have no issues
with a file system because I write my binaries directly onto the hard disk,
remembering only the size in sectors and the LBA number (just 16+48 bits).

I'm still fiddling with the "best" organisation of shared data fields
so as not to lose too much time on any mode switches.

> Or... why not make it a .com file? ....

I'd make it a self-relocating .bin to include wherever desired.

> I'm going to have to learn some new stuff to understand this code, as well 
> as "review" some stuff my memory's fuzzy on. (The "Nasm version" should be 
> better commented, ideally) I think he's patching the "base" of descriptors 
> to our linear address, right? Then in "new memory", he makes page 
> tables(?), some "exe gates"(?), and some "int gates"(?).

I read it as EXC.gates :)  nothing else than INT00..1F
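
To make the gate layout concrete, here is a rough Python model of what the
make_exc_gates loop in Alex's listing appears to store for each entry: one
16-byte long-mode interrupt gate.  The helper name make_gate64 and the base
address are invented for illustration; the selector 8, the 0x8E00 type word
and the 4-byte spacing of the exception stubs are read off the listing.

```python
import struct


def make_gate64(handler: int, selector: int = 8, type_attr: int = 0x8E00) -> bytes:
    """Pack one 16-byte long-mode interrupt gate (little-endian)."""
    return struct.pack(
        "<HHHHII",
        handler & 0xFFFF,              # offset bits 0..15   (first stosw)
        selector,                      # code selector        (second stosw)
        type_attr,                     # IST=0, P=1, DPL=0, type 0xE
        (handler >> 16) & 0xFFFF,      # offset bits 16..31 (upper half of the stosd)
        (handler >> 32) & 0xFFFFFFFF,  # offset bits 32..63
        0,                             # reserved
    )


# Hypothetical linear address of the `exception` label (assumption).
exception_base = 0x12340

# 32 exception gates, each stub 4 bytes after the previous one.
idt = b"".join(make_gate64(exception_base + 4 * n) for n in range(32))

first = struct.unpack("<HHHHII", idt[:16])
print([hex(field) for field in first])
```

The listing then patches entries 0x80 and 0x81 in the same table with the
clock and keyboard handlers, which would just be two more make_gate64 calls
in this model.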

> Then he clears a bit in the flags... (nmi?)

No, bit 14 is the NT bit (nested task bit), rarely set while in DOS anyway...

> Does something with the PIC (matches it to the IDT we built?)

I didn't check in detail; it seems to mask all of them and redirect them?

>Does something with cr4.

Sets bit 5, PAE (required for long mode; it also enables 2MB pages)

> What's that value we're putting in ecx? Does something with msr(?)

RDMSR/WRMSR register address is given in ECX,
C000_0080 aka 'EFER', b8(EFER) = Enable LONG.
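
The raw numbers in the listing line up with these bits.  A small Python
sketch, with constant names chosen here for readability (the listing itself
uses plain numbers, and enter_long_mode is a made-up helper):

```python
CR0_PE = 1 << 0          # protected mode enable
CR0_PG = 1 << 31         # paging enable
CR4_PAE = 1 << 5         # physical address extension
MSR_EFER = 0xC0000080    # MSR index loaded into ECX before RDMSR/WRMSR
EFER_LME = 1 << 8        # long mode enable


def enter_long_mode(cr0: int, cr4: int, efer: int):
    """Apply the same three ORs the listing performs before the far jump."""
    return cr0 | CR0_PG | CR0_PE, cr4 | CR4_PAE, efer | EFER_LME


# The combined CR0 mask matches the immediate used in the listing.
print(hex(CR0_PG | CR0_PE))

# Arbitrary starting values, for illustration only.
new_cr0, new_cr4, new_efer = enter_long_mode(0x10, 0x0, 0x0)
print(hex(new_cr0), hex(new_cr4), hex(new_efer))
```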

> Loads GDT and IDT (we've patched the addresses - these need to be linear, 
> right?). Then, we overwrite that funny value we put in ecx - this ecx is 
> used to set esp after the jump to 64-bit code. We set cr0 to PMODE and 
> make a far jump.  This part really puzzles me! ss is set to zero! In 
> 32-bit code, this would be an "invalid descriptor". It's okay in 64-bit? 
> We took a long jump to a descriptor, but ss is zero. What's the "base" on 
> ss? At this point, I'm pretty lost!

:) it takes some time to fully understand this 'not so new' 64-bit world!

In 64-bit mode there is no base other than zero available for DS, ES and SS;
only FS and GS can have a non-zero base, see the FS/GS base MSRs. CS
descriptors also ignore the base and limit fields, among other bits, while
in 64-bit mode.

Everything is 'near', almost unprotected, and so perhaps a bit faster,
if only I could make 64-bit work without this damned slow paging...

But data and code descriptors still exist in long mode as well, because their
base and limits are used in compatibility mode.

> I haven't been able to get really "focused" on this, as yet, but I'm 
> "thinking about it"...

I only skimmed Alex's code in a similarly quick way, so I didn't check
whether it can work at all. But almost everything required seems to be there:
address conversion, 32-bit GDT, page setup, 64bit IDT entries, ...
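
As far as the listing can be read, the page setup is a chain of three
single-entry tables at offsets 0x0000, 0x1000 and 0x2000 of the 4 KB-aligned
block, ending in one page table at 0x3000 that identity-maps the first 1 MB
in 4 KB pages.  A Python model under that assumption; build_tables is a
hypothetical name and the base address is arbitrary:

```python
import struct

FLAGS = 0b111  # present | writable | user, the "111b" in the listing


def build_tables(base: int) -> bytearray:
    """Return a 16 KB image of the four tables, assuming `base` is 4 KB aligned."""
    mem = bytearray(0x4000)
    # Entry 0 of each upper-level table points at the next table, 0x1000 further on.
    for level in range(3):
        next_table = base + (level + 1) * 0x1000
        struct.pack_into("<Q", mem, level * 0x1000, next_table | FLAGS)
    # 256 page-table entries of 8 bytes each: page n maps linear address n * 4 KB.
    for n in range(256):
        struct.pack_into("<Q", mem, 0x3000 + n * 8, (n * 0x1000) | FLAGS)
    return mem


tables = build_tables(0x20000)   # arbitrary example base
for offset in (0x0000, 0x1000, 0x2000, 0x3000, 0x3000 + 255 * 8):
    (entry,) = struct.unpack_from("<Q", tables, offset)
    print(f"{offset:04X}: {entry:016X}")
```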

I may have read it too fast, but I didn't see a register-content save/restore,
nor a check for the option of having long-mode data and code above 4GB.
__
wolfgang


Reply nowhere583 (184) 5/1/2011 11:43:52 AM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:j8jv88-c1h.ln1@nntp.local.net...
>
> > I [may] get back to checking out stuff in a day or two.
>
> Thanks, it's been a challenging project!
>

I've got more fixes, but not all of them.  I didn't check to see if you
found them in your last post...

Ok, I went through and checked all the differences.  There were an
additional 16 places that are incorrect.  I marked the ones I found but
currently have no fix for as "no fix".  There are 7 of those.  Those
offsets will need to be fixed.  FYI, some instructions were "byte" form
instead of "word" form according to NDISASM.  Japheth's code says WORD
though.

So, that fixes 9 places and leaves 7 places to look at: segments or their
labels, offsets of various labels.

12-17 BAD
possibly an issue with align
 90's vs  2e 8b c0 2e 8b c0

5f-60 OK
61-62 OK
66-67 OK
72-73 OK
77-78 OK
7B-7C OK
C6 OK

CA BAD
in skip1:, value is plus one
; mov eax, 0x80000000
mov eax, 0x80000001

128 OK
12E OK
133 OK
159 OK
15A OK
163 OK
164 OK
16F OK
170 OK

18F-190 BAD
below .skip4, offset is wrong, no fix
mov bx, _TEXT

1A6-1A7 BAD
below .skip4, offset is wrong, no fix
mov edx, exception

1AC-1AD OK
1AE-1AF OK
1BB OK

1CC-1CD BAD
in make_int_gates, offset is wrong, no fix
mov eax, interrupt

1D1-1D2 OK
1DE OK
1EA-1EB OK

1EE-1EF BAD
below make_int_gates, offset is wrong, no fix
add eax, clock

201-202 OK

205-206 BAD
below make_int_gates, offset is wrong, no fix
add eax, keyboard

221-222 OK
281-282 OK

291-292 BAD
just before backtoreal, offset is wrong: 0x310 vs 0x0, no fix
db 0x66, 0xEA; jmp 0x8:0x0
llg dd long_start
dw 8

29F BAD
in backtoreal - missing F
; and eax, 0x7FFFFFF
and eax, 0x7FFFFFFF

2A8 BAD
in backtoreal - missing 0
; mov ecx, 0xC000080
mov ecx, 0xC0000080

2AB OK

2AC-2AD BAD
in backtoreal, wrong not
; and ax, !1
and ax,~1

2C7-2C8 BAD
in backtoreal, offset is wrong, no fix
dw _TEXT16

2D9 BAD
in backtoreal, wrong not
; and al, !0x20
and al, ~0x20

302-303 OK
310 OK
314-315 OK
354-355 OK
3DF-3E0 OK

3E3 BAD
in scroll_screen, byte not word
; movzx rax, word [rbp + 0x4a]
movzx rax, byte [rbp + 0x4a]

3F4-3F5 OK
41E OK

422 BAD
in .skip5, byte not word
; movzx rbx, word [rbp + 0x4e]
movzx rbx, byte [rbp + 0x4e]

426-427 OK
42E-42F OK

438 BAD
in .skip5, byte not word
; movzx rax, word [rbp + 0x4a]
movzx rax, byte [rbp + 0x4a]

442-44e BAD
in .skip5, mul not add
; movzx rdx, byte [rbx + 2 + rbp + 0x50]
movzx rdx, byte [rbx * 2 + rbp + 0x50]

445-446 OK
447-448 OK
483-484 OK
487 OK
49C OK
5D1-5D2 OK
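
Several of the fixes above are easy to sanity-check numerically.  The sketch
below assumes that `!` in a NASM expression is logical negation while `~` is
the bitwise complement (that is how the NASM manual describes them); the
function names are made up for the example.

```python
def logical_not(value: int) -> int:
    """Model of a logical NOT: 1 if the operand is zero, otherwise 0."""
    return int(value == 0)


def bitwise_not(value: int, bits: int) -> int:
    """Bitwise complement truncated to an operand of `bits` bits."""
    return ~value & ((1 << bits) - 1)


print(hex(logical_not(1)))            # 0x0:  `and ax, !1` would AND with zero
print(hex(bitwise_not(1, 16)))        # 0xfffe: what `and ax, ~1` uses
print(hex(bitwise_not(0x20, 8)))      # 0xdf: what `and al, ~0x20` uses
print(hex(bitwise_not(1 << 31, 32)))  # 0x7fffffff: the mask that clears only bit 31

# The seven-digit constant has only 27 bits set, so ANDing with it
# would also clear bits 27 to 30, not just bit 31.
print(bin(0x7FFFFFF).count("1"))      # 27
```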


Rod Pemberton


Reply do_not_have7664 (117) 5/1/2011 9:23:13 PM

"Rod Pemberton" <do_not_have@nospicedham.noavailemail.cmm> wrote in message
news:ipj20t$4sg$1@speranza.aioe.org...
> "Frank Kotler" <fbkotler@nospicedham.myfairpoint.net> wrote in message
> news:ipisb5$p1s$1@speranza.aioe.org...
>
> > Why not assemble it with "-f obj" and link it to
> > an MZ, if that's what you want?
> >
>
> ... and ...
>
> > Or... why not make it a .com file?
> >
>
> IMO, there is nothing wrong with either...
>

I was wrong.  Apparently, NASM doesn't like *either* one.

I decided to see if I could fix the "segment" issue.  I'm not familiar with
NASM's "segment" or "section" keywords.  AFAICT, it seems that the "segment"
keyword *DOES NOT WORK* with .bin, even though NASM documentation from many
versions says it should work with .bin.  Supposedly, there is even enhanced
support for .bin.  (Some NASMDOC's have "6.1.2 `bin' Extensions to the
`SECTION' directive")  It is supposed to define a label for the section.
Unfortunately, 0.98.39, 2.06rc8, 2.08rc9, and 2.09rc6, *DO NOT* seem to
recognize this label.  Is this correct?  I must've done something
incorrect...

; nasm -f bin
BITS 16
segment asdf

mov eax, asdf
mov ebx, [asdf]
mov eax, seg asdf
dw asdf

error: symbol `asdf' undefined
error: symbol `asdf' undefined
error: symbol `asdf' undefined
error: symbol `asdf' undefined

So, I decided to try .obj, i.e., "nasm -f obj", since this too is supposed
to support sections or segments.  It seems to like the 16-bit section of
DOS64.asm.  But, it appears that "-f obj" with NASM won't generate 64-bit
code for a .obj:

  "dos64.asm:308:error: obj output format does not support 64-bit code"

If segments or sections are not available with .bin, and .obj won't do the
64-bit code, how is this thing ever going to completely compile with NASM?

Unless someone knows how that stuff is supposed to work, I guess we're going
to have to fake segments using .bin since it compiles 64-bit...

It seems to me to be ridiculous that one cannot emit 64-bit code in an .obj
or that sections are not supported for a .com.  It's not up to an assembler
to decide what I can or cannot do.  It should do it.


Rod Pemberton


Reply do_not_have7664 (117) 5/2/2011 12:32:31 AM

On May 1, 7:32 pm, "Rod Pemberton"
<do_not_h...@nospicedham.noavailemail.cmm> wrote:
> "Rod Pemberton" <do_not_h...@nospicedham.noavailemail.cmm> wrote in message
>
> news:ipj20t$4sg$1@speranza.aioe.org...
>
> > "Frank Kotler" <fbkot...@nospicedham.myfairpoint.net> wrote in message
> >news:ipisb5$p1s$1@speranza.aioe.org...
>
> > > Why not assemble it with "-f obj" and link it to
> > > an MZ, if that's what you want?
>
> > ... and ...
>
> > > Or... why not make it a .com file?
>
> > IMO, there is nothing wrong with either...
>
> I was wrong.  Apparently, NASM doesn't like *either* one.
>
> I decided to see if I could fix the "segment" issue.  I'm not familiar with
> NASM's "segment" or "section" keywords.  AFAICT, it seems that the "segment"
> keyword *DOES NOT WORK* with .bin, even though NASM documentation from many
> versions says it should work with .bin.  Supposedly, there is even enhanced
> support for .bin.  (Some NASMDOC's have "6.1.2 `bin' Extensions to the
> `SECTION' directive")  It is supposed to define a label for the section.
> Unfortunately, 0.98.39, 2.06rc8, 2.08rc9, and 2.09rc6, *DO NOT* seem to
> recognize this label.  Is this correct?  I must've done something
> incorrect...
>
> ; nasm -f bin
> BITS 16
> segment asdf
>
> mov eax, asdf
> mov ebx, [asdf]
> mov eax, seg asdf
> dw asdf
>
> error: symbol `asdf' undefined
> error: symbol `asdf' undefined
> error: symbol `asdf' undefined
> error: symbol `asdf' undefined
>

Yes, asdf is an undefined symbol in the above, so..

 [MAP ALL asdf.MAP]

; nasm -f bin
BITS 16

 [segment asdf]

mov eax, asdf
mov ebx, [asdf]
mov eax, seg asdf

asdf:  dw asdf

;; -=eof=-

ASDF.NSM:10: error: binary output format does not support segment base references

NASM version 0.98.38 compiled on Sep 12 2003

----
Yet -f bin is quite flexible and you can certainly do segments with
it.

http://www.project-fbin.hostoi.com/nasmhints/binspace/binexe/EXETMPLT.NSM.HTML

and,

http://www.project-fbin.hostoi.com/nasmhints/binspace/dheader/DHEADER.NSM.html


The first example is an .exe, the second is a .com which manages its
own sections, but these seem outdated, I'll have to look for newer
examples, esp. for the exe (a codeview friendly version, meaning it
expects segments ordered as DS=ES,CS,SS ).

The exe header allows for defining nobits sections, so you can
designate a memory block to be reserved and appended to the exe image;
the loader does this for you instead of your using int 21h AH=4Ah.  The
thing with -f bin is that it doesn't create fixup records, so you
don't maintain modules as .obj's for linking; you maintain modules as asm
sources and include them in a monolithic assembly, skipping the link
phase to go right to the final image.
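
As a cross-check on the header fields, the 48 bytes Alex posted as db
constants can be decoded with Python's struct module.  This is a sketch: the
field names follow the usual MZ header naming and are labels picked here, not
something taken from the thread.

```python
import struct

# The 48 header bytes exactly as posted in the listing.
header = bytes([
    0x4D, 0x5A, 0x0F, 0x00, 0x04, 0x00, 0x03, 0x00,
    0x03, 0x00, 0x01, 0x01, 0xFF, 0xFF, 0x5E, 0x00,
    0x00, 0x10, 0x00, 0x00, 0x3A, 0x00, 0x00, 0x00,
    0x1E, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8F, 0x01,
    0x00, 0x00, 0xC4, 0x02, 0x00, 0x00, 0xC7, 0x02,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
])

FIELDS = ("magic", "last_page_bytes", "pages", "reloc_count", "header_paras",
          "min_alloc", "max_alloc", "ss", "sp", "checksum", "ip", "cs",
          "reloc_offset", "overlay")
mz = dict(zip(FIELDS, struct.unpack_from("<2s13H", header)))

# Relocation entries are (offset, segment) word pairs starting at reloc_offset.
relocs = [
    struct.unpack_from("<HH", header, mz["reloc_offset"] + 4 * i)
    for i in range(mz["reloc_count"])
]

# File size from the page fields; a last_page_bytes of 0 would mean a full page.
file_size = mz["pages"] * 512
if mz["last_page_bytes"]:
    file_size -= 512 - mz["last_page_bytes"]
image_size = file_size - mz["header_paras"] * 16

print(f"entry CS:IP  = {mz['cs']:04X}:{mz['ip']:04X}")
print(f"stack SS:SP  = {mz['ss']:04X}:{mz['sp']:04X}")
print(f"min_alloc    = {mz['min_alloc']:#06x} paragraphs")
print("relocations  =", ", ".join(f"{seg:04X}:{off:04X}" for off, seg in relocs))
print(f"image size   = {image_size} bytes")
```

The min_alloc field is the "reserved and appended" memory mentioned above, and
the image size this yields (1,503 bytes) agrees with the length Rod reported
earlier in the thread.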

fwiw,

Steve


> So, I decided to try .obj, i.e., "nasm -f obj", since this too is supposed
> to support sections or segments.  It seems to like the 16-bit section of
> DOS64.asm.  But, it appears that "-f obj" with NASM won't generate 64-bit
> code for a .obj:
>
>   "dos64.asm:308:error: obj output format does not support 64-bit code"
>
> If segments or sections are not available with .bin, and .obj won't do the
> 64-bit code, how is this thing ever going to completely compile with NASM?
>
> Unless someone knows how that stuff is supposed to work, I guess we're going
> to have to fake segments using .bin since it compiles 64-bit...
>
> It seems to me to be ridiculous that one cannot emit 64-bit code in an .obj
> or that sections are not supported for a .com.  It's not up to an assembler
> to decide what I can or cannot do.  It should do it.
>
> Rod Pemberton

Reply s_dubrovich7206 (19) 5/2/2011 4:33:48 AM

Rod Pemberton wrote:
> "Frank Kotler" <fbkotler@nospicedham.myfairpoint.net> wrote in message
> news:ipisb5$p1s$1@speranza.aioe.org...
>> Freedom on the Oceans wrote:
>>
>> I don't understand why you guys are trying to use such a
>> convoluted method to build this.
>>
> 
> You guys?  Hmm...  I take it Frank is frustrated ... ?

Yes! Frustrated with myself for not being able to "get into it". Not 
with "you guys". I'm delighted "you guys" are working on this!

> Yes, he needs to fix the segment stuff.  I'm not sure how to do that.  I
> would probably create a label and eliminate the other MASM-like cruft.
> I.e., one big segment...  It's just a big binary blob.  I take it that that
> may be a problem if a linker is used?

I think so, yes. An MZ is a rather special "binary blob" and it is 
loaded differently than a .com.

> On his last post, he didn't state whether it's working yet or not.  I'm
> assuming it's not.

I assume we'll be able to hear Alex's jubilation even with the computer 
off! :)

> The segment stuff not being done correctly as you noted
> is likely one of the remaining reasons for that.  I.e., wrong offsets in
> those 16 bytes that looked "wrong" to me.

Yeah, I think so.

> IMO, I'm not using a convoluted method.

Well, I was referring to using the macros to "fake" an MZ header in "-f 
bin" mode, mostly. Now that I've "gotten into it" enough to read over 
Japheth's code... it is nicely commented. Clears up a lot of my 
puzzlement - but big thanks to Wolfgang for the clues! Apparently it is 
built with "jwasm -mz"... No mention of a linker! So perhaps using those 
macros isn't all that "convoluted"...

> I was just trying to get the main
> (.obj) portion of both assembly programs to match when compiled.  That's how
> I tell that assemblies are equivalent.  How do you?

Mostly, I don't. I *have* attempted to produce identical binaries, but 
as you point out, "alternative encodings", etc. make it more hassle than 
it's worth (in most cases... IMO...). I'm usually content with "works 
the same".

> It seems his NASM
> version is close to being correct, even if some stuff is incorrect for now.
> 
>> Why not assemble it with "-f obj" and link it to
>> an MZ, if that's what you want?

As you point out in a later message, Nasm won't do it, that's why. I did 
not know that - but I'm not surprised, now that I think of it. OMF is 
essentially a 16-bit format, but there are 32-bit "extensions". Nasm 
does a horrible job with the 32-bit extensions, so maybe just as well it 
doesn't attempt 64-bit! There must be 64-bit extensions, as well, since 
Jwasm does it!

> ... and ...
> 
>> Or... why not make it a .com file?
>>
> 
> IMO, there is nothing wrong with either...  Once he knows the NASM assembly
> is correct, he can .EXE it, .COM it,

But don't forget that startup conditions are different!

> or tweak it to his heart's content.
> AIUI, it's not working yet, so doing either of those is premature, IMO.
> 
>> ... Japheth does [all this "stuff" in his code] ...
>>
> 
> I haven't tried to figure out Japheth's code.

If/when you do, read the original!

> Maybe Alex has.

Probably. He's stripped a lot of Japheth's comments in what he posted, 
which is a shame, IMO.

> For me, that
> would just add to the confusion.  ISTM that an equivalent binary in NASM
> assembly is the first step.

Well, that's one way to approach it. Another way would be to understand 
the code first, and shoot for "equivalent functionality", rather than 
"identical binaries".

> There shouldn't really be any need for
> "porting" or rewriting since x86 binary is x86 binary.

A .com is the same as an MZ? A PE? ELF?

> It doesn't matter if
> NASM code or if JWasm code produced it.  If they're identical, they're
> identical.

True enough!

> I [may] get back to checking out stuff in a day or two.

Okay - I see you did...

I've attempted a "simplification" of Japheth's/Alex's code, intended to 
be a .com file right from the start. I skip the "resize memory block" 
and "malloc" and just ASSume there's some memory there I can use. I 
haven't even tested it in dos - it'll only tell me I haven't got a 
64-bit CPU (if it gets that far). If anyone should choose to test it - 
in case of smoke, pull the plug!
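
The "assume there's some memory" step can be sketched on its own as below. This is only a reading of the arithmetic in the listing (which folds both additions into a single `add bx, 0x10FF`), written out step by step; nothing here checks that the memory is actually free, and it has not been run on real hardware.

```nasm
; Sketch only: restates the page-alignment arithmetic, untested.
	BITS 16
	ORG 0x100

	mov	bx, cs		; paragraph (segment) the .com was loaded at
	add	bx, 0x1000	; skip 0x1000 paragraphs = the 64 KiB of this segment
	add	bx, 0x00FF	; round up ...
	mov	bl, 0		; ... to a multiple of 0x100 paragraphs = 4 KiB
	mov	es, bx		; es:0 now starts on a 4 KiB page boundary
	ret			; back to DOS through the int 0x20 at PSP:0
```

Page tables must start on a 4 KiB boundary, which is why the segment value is rounded to a multiple of 0x100 paragraphs before it is turned into a linear address for CR3.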

Best,
Frank

; from Japheth
;--- DOS program which switches to long-mode and back.
;--- Note: requires at least JWasm v2.
;--- Also: needs a 64bit cpu in real-mode to run.
;--- Parts of the source are based on samples supplied by
;--- sinsi and Tomasz Grysztar in the FASM forum.
;--- To create the binary enter:
;---  JWasm -mz DOS64.asm

; based on Alex Buell's "translation" to Nasm
; attempt to make a Nasm .com file - fbk
; with assistance from Rod Pemberton, Wolfgang Kern,...
; nasm -f bin -o dos64n2.com dos64n2.asm
;
; UNTESTED!!!

[map all dos64n2.map]

	BITS 16
	ORG 0x100

section .text

	jmp	start16
	
GDTR	dw 4 * 8 - 1
	dd GDT
IDTR	dw 256 * 16 - 1
	dd 0
nullidt dw 0x3FF
	dd 0

	align 8
GDT	dq 0
	dw 0xFFFF, 0, 0x9A00, 0x0AF
	dw 0xFFFF, 0, 0x9A00, 0x000
	dw 0xFFFF, 0, 0x9200, 0x000

wPICMask dw 0

start16:

	mov	eax, cs

	mov [loadseg], ax
	mov [loadseg2], ax

	shl	eax, 4 ; make it a linear address
	add	dword [GDTR + 2], eax

	mov	word [GDT + 2 * 8 + 2], ax
	mov	word [GDT + 3 * 8 + 2], ax
	shr	eax, 16
	mov	byte [GDT + 2 * 8 + 4], al
	mov	byte [GDT + 3 * 8 + 4], al

	smsw	ax
	test	al, 1
	jz	skip1

	mov	dx, err1
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err1:	db	"Mode is V86. Need REAL mode to "
         db  "switch to LONG mode.", 13, 10, '$'

skip1:
	xor	edx, edx
	mov	eax, 0x80000001	; extended feature leaf; EDX bit 29 = long mode
	cpuid
	test 	edx, 0x20000000
	jnz	skip2

	mov	dx, err2
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C
	int	0x21

err2:	db	"No 64 bit cpu detected.", 13, 10, '$'

skip2:
	mov bx, cs
	add	bx, 0x10FF
	mov bl, 0
	mov es, bx

	sub	di, di
	mov	cx, 4096
	sub	eax, eax
	rep	stosd

	sub	di, di
	mov	ax, es
	movzx	eax, ax
	shl	eax, 4
	mov	cr3, eax

	lea	edx, [eax + 0x5000]
	mov	dword [IDTR + 2], edx

	or	eax, 111b
	add	eax, 0x1000
	mov	[es:di + 0x0000], eax
	add	eax, 0x1000
	mov	[es:di + 0x1000], eax
	add	eax, 0x1000
	mov	[es:di + 0x2000], eax
	mov	di, 0x3000
	mov	eax, 0 + 111b
	mov	cx, 256

.skip4:
	stosd
	add	di, 4
	add	eax, 0x1000
	loop	.skip4

	mov	ebx, cs
	shl	ebx, 4
	add 	[llg], ebx

	mov	di, 0x5000
	mov	cx, 32
	mov	edx, exception
	add	edx, ebx

make_exc_gates:
	mov	eax, edx
	stosw
	mov	ax, 8
	stosw
	mov	ax, 0x8E00
	stosd
	xor	eax, eax
	stosd
	stosd
	add	edx, 4
	loop	make_exc_gates
	mov	cx, 256 - 32

make_int_gates:
	mov	eax, interrupt
	add	eax, ebx
	stosw
	mov	ax, 8
	stosw
	mov	ax, 0x8E00
	stosd
	xor	eax, eax
	stosd
	stosd
	loop	make_int_gates

	mov	di, 0x5000
	mov	eax, ebx
	add	eax, clock
	mov	[es:di + 0x80 * 16 + 0], ax
	shr	eax, 16
	mov	[es:di + 0x80 * 16 + 6], ax

	mov	eax, ebx
	add	eax, keyboard
	mov	[es:di + 0x81 * 16 + 0], ax
	shr	eax, 16
	mov	[es:di + 0x81 * 16 + 6], ax

; clear NT flag (thanks Wolfgang!)
	pushf
	pop	ax
	and	ah, 0xBF
	push	ax
	popf

	cli
	in	al, 0xA1
	mov	ah, al
	in	al, 0x21
	mov	[wPICMask], ax
	mov	al, 10001b
	out	0x20, al
	mov	al, 10001b
	out	0xA0, al
	mov	al, 0x80
	out	0x21, al
	mov	al, 0x88
	out	0xA1, al
	mov	al, 100b
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov 	al, 1
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	al, 11111100b
	out	0x21, al
	in	al, 0xA1
	mov	al, 11111111b
	out	0xA1, al

; enable PAE (thanks, Wolfgang!)
	mov	eax, cr4
	or	eax, 1 << 5
	mov	cr4, eax

; enable long mode (thanks, Wolfgang!)
	mov	ecx, 0xC0000080

	rdmsr
	or	eax, 1 << 8
	wrmsr

	lgdt	[GDTR]
	lidt	[IDTR]

	mov	ecx, ss ; same as cs
	shl	ecx, 4
	add	ecx, esp ; save to use after the jump

	mov	eax, cr0
	or	eax, 0x80000001
	mov	cr0, eax

	db	0x66, 0xEA	; jmp 0x8:0x0
llg	dd	long_start
	dw	8

backtoreal:
	mov	eax, cr0
	and	eax, 0x7FFFFFFF	; clear CR0.PG (bit 31)
	mov	cr0, eax
	mov	ecx, 0xC0000080	; EFER MSR

	rdmsr
	and	ah, ~1		; clear EFER.LME (bit 8)
	wrmsr

	mov	ax, 24
	mov	ss, ax
	mov	ds, ax
	mov	es, ax
	mov	eax, cr0
	and	al, 0xFE
	mov	cr0, eax

	db	0xEA
	dw	$+4
loadseg:	dw 0 ; overwrite at runtime
	mov	eax, cs
	mov	ss, ax
	mov ds, ax
	mov es, ax
	shl eax, 4	; segment * 16 = linear base of this segment
	
	sub	esp, eax
	lidt	[nullidt]
	mov	eax, cr4
	and	al, ~0x20
	mov	cr4, eax

	mov	al, 10001b
	out	0x20, al
	mov	al, 10001b
	out	0xA0, al
	mov	al, 0x08
	out	0x21, al
	mov	al, 0x70
	out	0xA1, al
	mov	al, 100b
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov	al, 1
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	ax, [wPICMask]
	out	0x21, al
	mov	al, ah
	out	0xA1, al
	sti

	mov	ax, 0x4C00
	int	0x21

;-------------------------
BITS 64

long_start:
	xor	eax, eax
	mov	ss, eax
	mov	esp, ecx
	sti

	call	WriteStrX
	db	'Hello 64bit', 10, 0

nextcmd:
	mov	r8b, 0

nocmd:
	cmp	r8b, 0
	jz	nocmd
	cmp	r8b, 1
	jz	esc_pressed
	cmp	r8b, 0x13
	jz	r_pressed
	call	WriteStrX
	db 	'Unknown key ', 0
	mov	al, r8b
	call	WriteB
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

r_pressed:
	call	WriteStrX
	db	10, "cr0=", 0
	mov	rax, cr0
	call	WriteQW
	call	WriteStrX
	db	10, "cr2=", 0
	mov	rax, cr2
	call	WriteQW
	call	WriteStrX
	db	10, "cr3=", 0
	mov	rax, cr3
	call	WriteQW
	call	WriteStrX
	db	10, "cr4=", 0
	mov	rax, cr4
	call	WriteQW
	call	WriteStrX
	db	10, "cr8=", 0
	mov	rax, cr8
	call	WriteQW
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

esc_pressed:
	jmp far	[bv]
bv:	dd	backtoreal
loadseg2:	dw	0 ; overwrite at runtime

scroll_screen:
	cld
	mov	rdi, rsi
	movzx	rax, word [rbp + 0x4a]
	push	rax
	lea	rsi, [rsi + 2 * rax]
	mov	cl, [rbp + 0x84]
	mul	cl
	mov	rcx, rax
	rep	movsw
	pop	rcx
	mov	ax, 0x0720
	rep	stosw
	ret

WriteChr:
	push	rbp
	push	rdi
	push	rsi
	push	rbx
	push 	rcx
	push	rdx
	push	rax
	mov	rdi, 0xB8000
	mov	rbp, 0x400
	cmp	byte [rbp + 0x63], 0xB4
	jnz	.skip5
	xor	di, di

.skip5:
	movzx	rbx, word [rbp + 0x4E]
	add	rdi, rbx
	movzx	rbx, byte [rbp + 0x62]
	mov	rsi, rdi
	movzx	rcx, byte [rbx * 2 + rbp + 0x50 + 1]
	movzx	rax, word [rbp + 0x4A]
	mul	rcx
	movzx	rdx, byte [rbx * 2 + rbp + 0x50]
	add	rax, rdx
	mov	dh, cl
	lea	rdi, [rdi + rax * 2]
	mov	al, [rsp]
	cmp	al, 10
	jz	.newline
	mov	[rdi], al
	mov	byte [rdi + 1], 7
	inc	dl
	cmp	dl, byte [rbp + 0x4a]
	jb	.skip6

.newline:
	mov	dl, 0
	inc	dh
	cmp	dh, byte [rbp + 0x84]
	jbe	.skip6
	dec	dh
	call	scroll_screen

.skip6:
	mov	[rbx * 2 + rbp + 0x50], dx
	pop	rax
	pop	rdx
	pop	rcx
	pop	rbx
	pop	rsi
	pop	rdi
	pop	rbp
	ret

WriteStr:
	push	rsi
	mov	rsi, rdx
	cld

.skip7:
	lodsb
	and	al, al
	jz	.skip8
	call	WriteChr
	jmp	.skip7

.skip8:
	pop	rsi
	ret

WriteStrX:
	push	rsi
	mov	rsi, [rsp + 8]
	cld

.skip9:
	lodsb
	and	al, al
	jz	.skip10
	call	WriteChr
	jmp	.skip9

.skip10:
	mov	[rsp + 8], rsi
	pop	rsi
	ret

WriteQW:
	push	rax
	shr	rax, 32
	call	WriteDW
	pop	rax

WriteDW:
	push	rax
	shr	rax, 16
	call	WriteW
	pop	rax

WriteW:
	push	rax
	shr	rax, 8
	call	WriteB
	pop	rax

WriteB:
	push	rax
	shr	rax, 4
	call	WriteNb
	pop	rax

WriteNb:
	and	al, 0x0F
	add	al, '0'
	cmp	al, '9'
	jbe	.skip11
	add	al, 7

.skip11:
	jmp	WriteChr

exception:
%assign excno 0
%rep	32
	push	excno
	jmp	.skip12
%assign excno excno + 1
%endrep

.skip12:
	call	WriteStrX
	db	10, "Exception ", 0
	pop	rax
	call	WriteB
	call	WriteStrX
	db 	" errcode=", 0
	mov	rax, [rsp + 0]
	call	WriteQW
	call	WriteStrX
	db	" rip=", 0
	mov	rax, [rsp + 8]
	call	WriteQW
	call	WriteStrX
	db	10, 0

.skip13:
	jmp $

clock:
	push 	rbp
	mov	rbp, 0x400
	inc	dword [rbp + 0x6C]
	pop	rbp

interrupt:
	push	rax
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

keyboard:
	push	rax
	in	al, 0x60
	test	al, 0x80
	jnz	.skip14
	mov	r8b, al

.skip14:
	in	al, 0x61
	out	0x61, al
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq
;-----------------------
0
Reply fbkotler9831 (124) 5/2/2011 7:45:23 AM

On Sun, 2011-05-01 at 20:32 -0400, Rod Pemberton wrote:

[ above snipped as other poster has replied to it]

> So, I decided to try .obj, i.e., "nasm -f .obj", since this too is supposed
> to support sections or segments.  It seems to like the 16-bit section of
> DOS64.asm.  But, it appears that "-f obj" with NASM won't generate 64-bit
> code for a .obj:
>
>   "dos64.asm:308:error: obj output format does not support 64-bit code"
>
> If segments or sections are not available with .bin, and .obj won't do the
> 64-bit code, how is this thing ever going to completely compile with NASM?
>
> Unless someone knows how that stuff is supposed to work, I guess we're going
> to have to fake segments using .bin since it compiles 64-bit...
>
> It seems to me to be ridiculous that one cannot emit 64-bit code in an .obj
> or that sections are not supported for a .com.  It's not up to an assembler
> to decide what I can or cannot do.  It should do it.

Time to file a bug with the NASM developers? Seems to me that there are
two different bugs: inability to generate segments in binaries and
inability to emit 64 bit code in MSDOS object files.
--
Tactical Nuclear Kittens

0
Reply alex.buell (44) 5/2/2011 9:01:08 AM

"Frank Kotler" <fbkotler@nospicedham.myfairpoint.net> wrote in message
news:iplnnq$sl1$1@speranza.aioe.org...
>  [SNIP]
>
> I've attempted a "simplification" of Japheth's/Alex's code, intended to
> be a .com file right from the start. I skip the "resize memory block"
> and "malloc" and just ASSume there's some memory there I can use. I
> haven't even tested it in dos - it'll only tell me I haven't got a
> 64-bit CPU (if it gets that far). If anyone should choose to test it -
> in case of smoke, pull the plug!
>
> ; from Japheth
> ;--- DOS program which switches to long-mode and back.
> ;--- Note: requires at least JWasm v2.
> ;--- Also: needs a 64bit cpu in real-mode to run.
> ;--- Parts of the source are based on samples supplied by
> ;--- sinsi and Tomasz Grysztar in the FASM forum.
> ;--- To create the binary enter:
> ;---  JWasm -mz DOS64.asm
>
> ; based on Alex Buell's "translation" to Nasm
> ; attempt to make a Nasm .com file - fbk
> ; with assistance from Rod Pemberton, Wolfgang Kern,...
> ; nasm -f bin -o dos64n2.com dos64n2.asm
> ;
> ; UNTESTED!!!
>
> [SNIP]

Sorry, that code is a no go for me.

I compiled with NASM 2.09rc6 with -f bin.

1) Win98SE dosbox
1443 bytes
Echoes numbers for keystrokes for a while.  No indication if in 64-bit mode.

2) RM MS-DOS v7.10 clean boot
*FAILS*
nasm: fatal: assertion addr <= s->start failed at output/outbin.c:1462

3) RM MS-DOS v7.10 after running some DJGPP programs (DPMI host loaded?...)
1699 bytes
- 512 0 bytes at the beginning
- deleting leaves 25 byte differences
- byte values are off by one
Echoes numbers for keystrokes for a while.  No indication if in 64-bit mode.

4) RM MS-DOS v7.10 run DPMI host first, CWSDPMI
-same as 2)
-attempt to find out why 3) compiled...

Uh, I have to assume the DOS version of NASM 2.09rc6 is botched.  Should I
try a newer version?


Rod Pemberton


0
Reply do_not_have7664 (117) 5/2/2011 7:27:45 PM

"Rod Pemberton" <do_not_have@nospicedham.noavailemail.cmm> wrote in message
news:ipn0a2$75h$1@speranza.aioe.org...
>

FYI, 2.10rc5 doesn't exhibit this failure:

> 2) RM MS-DOS v7.10 clean boot
> *FAILS*
> nasm: fatal: assertion addr <= s->start failed at output/outbin.c:1462
>
> [snip]
> Uh, I have to assume the DOS version of NASM 2.09rc6 is botched.  Should I
> try a newer version?
>

Rod Pemberton


0
Reply do_not_have7664 (117) 5/2/2011 8:59:15 PM

Rod Pemberton wrote:
> "Frank Kotler" <fbkotler@nospicedham.myfairpoint.net> wrote in message
> news:iplnnq$sl1$1@speranza.aioe.org...
>>  [SNIP]
>> ; UNTESTED!!!
>>
>> [SNIP]
> 
> Sorry, that code is a no go for me.

Thanks for trying it, Rod!

> I compiled with NASM 2.09rc6 with -f bin.
> 
> 1) Win98SE dosbox
> 1443 bytes

Interesting. I'm getting 5593 bytes. I don't see any big blocks of zeros...

> Echoes numbers for keystrokes for a while.  No indication if in 64-bit mode.

Echoes numbers is good. What's it do after "a while"? I'm happy if it 
doesn't reboot immediately!  Does it dump registers if you hit 'r'? Does 
it exit if you hit ESC? Funny that string isn't showing up. Well, 
there's a bug in it... probably more than one. Not a bad start for 
"winging it"...

> 2) RM MS-DOS v7.10 clean boot
> *FAILS*
> nasm: fatal: assertion addr <= s->start failed at output/outbin.c:1462

Now *that* looks like a Nasm bug!

> 3) RM MS-DOS v7.10 after running some DJGPP programs (DPMI host loaded?...)
> 1699 bytes

Interesting...

> - 512 0 bytes at the beginning

Right at the beginning, or at the "align" directive? There was a known 
bug in some versions in "-f bin" mode, where a few "align" directives 
would produce a much larger file than expected. I think it was adding in 
the origin every time(?). I *think* it's been fixed.

> - deleting leaves 25 byte differences
> - byte values are off by one

Odd...

> Echoes numbers for keystrokes for a while.  No indication if in 64-bit mode.
> 
> 4) RM MS-DOS v7.10 run DPMI host first, CWSDPMI
> -same as 2)
> -attempt to find out why 3) compiled...
> 
> Uh, I have to assume the DOS version of NASM 2.09rc6 is botched.  Should I
> try a newer version?

If you're up for it, please do! We're up to 2.09.08. That "align bug" 
should be fixed. You might still hit that "assertion" bug. That may have 
been fixed, too - Cyrill's been doing a lot of work on it!

The 2.10xx versions have got a reworked preprocessor to support 
recursive macros. The idea being support for OOP, I understand. I find 
the idea of "information hiding" in asm pretty [expletive deleted], so I 
don't think it'll interest me, but we need testers for that, too!

I've gotta upgrade my own version of Nasm, too! I'm still running 
2.10rc2-20101108... There's a down side to Nasm not being "dead" after 
all! Testing every single update gets to be a PITA, even for me. We have 
an ever-growing "test suite", but it's the slightly "oddball" projects 
that find most of the bugs. So we need testers! Everybody that's "up for 
it", jump in!

http://www.nasm.us/pub/nasm/releasebuilds/

Or day-to-day "snapshots" here:

http://www.nasm.us/pub/nasm/snapshots/

Thanks again for trying it! Actually, there's a 64-bit machine within 
arm's length of me (if I stretch) - the roommate's. Not running dos. I 
could put my HD in it, but... it doesn't support IDE (I've just been 
informed), just SATA. However... There's an ex-girlfriend's machine 
kicking around that he's supposed to be repairing (reboots partway 
through the Windows install, apparently). If that's a "Windows problem" 
rather than a "hardware problem", I may be able to get dos running on 
that and test it myself. I'll have to "get into it" again. Not right now...

Best,
Frank
0
Reply fbkotler9831 (124) 5/2/2011 9:19:56 PM

On Mon, 2011-05-02 at 15:27 -0400, Rod Pemberton wrote:

> Echoes numbers for keystrokes for a while.  No indication if in 64-bit
> mode.

Looks like the program has succeeded in setting up the interrupt vector
in long mode to intercept the keyboard IRQ. Does pressing 'r' dump out
some registers?
--
Tactical Nuclear Kittens

0
Reply alex.buell (44) 5/2/2011 9:23:17 PM

Rod Pemberton wrote:
> "Rod Pemberton" <do_not_have@nospicedham.noavailemail.cmm> wrote in message
> news:ipn0a2$75h$1@speranza.aioe.org...
> 
> FYI, 2.10rc5 doesn't exhibit this failure:

Excellent! Thanks!

Best,
Frank

0
Reply fbkotler9831 (124) 5/2/2011 9:26:03 PM

Hi!

I've finally gone through my code and commented exactly what it does.
Two things have become readily apparent.

1. Japheth has built it as an .EXE.
2. As it is an .EXE, he has to jump through some hoops to make it tiny
model, as EXEs can have more than one code or data segment.

I have a nasty suspicion that he has confused 16 bit segments with 32/64
bit segments, hence the reason why his code looks really weird. :) To
clear things up a bit, I'd better rename the 32/64 'segments' as
'selectors', which is the correct term!
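
To make the segment/selector distinction concrete, here is a minimal sketch based on the four-entry GDT used in the listings in this thread. The `SEL_*` names are hypothetical, chosen only for this example; the values 8, 16 and 24 are the ones the posted code already uses. The fragment assembles with `nasm -f bin`, but it is meant to be read rather than run.

```nasm
; Sketch only: the SEL_* names are hypothetical, not from the posted code.
; A selector is (GDT index * 8) | table indicator | privilege level;
; with TI = 0 and RPL = 0 it is simply index * 8.
SEL_CODE64	equ	1 * 8	; GDT entry 1: 64-bit code, target of "jmp 8:..."
SEL_CODE16	equ	2 * 8	; GDT entry 2: 16-bit code, used on the way back
SEL_DATA16	equ	3 * 8	; GDT entry 3: 16-bit data, the "mov ax, 24"

	BITS 16
	mov	ax, SEL_DATA16	; protected mode: an index into the GDT
	mov	ds, ax
	mov	ax, 0xB800	; real mode: a paragraph number, base = 0xB800 * 16
	mov	es, ax
```

The same register holds either kind of value; which interpretation applies depends only on the CPU mode at the moment the segment register is loaded.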

I'll just finish commenting the rest of the code, make a copy of it and
then throw away the code that's not needed for a .COM and see if it
works.

Within the next few days I'm going to lose my access to an amd64 box to
test it on, and I will have to put the code away until the next time I get
access (but I'll publish what I've done so far).

This program has a lot of possibilities; one could turn it into a 64 bit
DOS extender.
--
Tactical Nuclear Kittens

0
Reply alex.buell (44) 5/2/2011 9:57:15 PM

On Mon, 2011-05-02 at 17:19 -0400, Frank Kotler wrote:

> I've gotta upgrade my own version of Nasm, too! I'm still running
> 2.10rc2-20101108... There's a down side to Nasm not being "dead" after
> all! Testing every single update gets to be a PITA, even for me. We have
> an ever-growing "test suite", but it's the slightly "oddball" projects
> that find most of the bugs. So we need testers! Everybody that's "up for
> it", jump in!

Did my patch for building NASM under MSDOS ever get accepted into the main
line? I sent it in a couple of years back (2.04.x?). IIRC some files
needed renaming to fit within the 8.3 filename format mandated by DOS.
--
Tactical Nuclear Kittens

0
Reply alex.buell (44) 5/2/2011 10:06:42 PM

"Frank Kotler" <fbkotler@nospicedham.myfairpoint.net> wrote in message
news:ipn7f3$qha$1@speranza.aioe.org...
> Rod Pemberton wrote:
>
> > I compiled with NASM 2.09rc6 with -f bin.
> >
> > 1) Win98SE dosbox
> > 1443 bytes
>
> Interesting. I'm getting 5593 bytes. I don't see any big blocks of
> zeros...
>

Wow!  Is that -f bin ?

> > Echoes numbers for keystrokes for a while.  No indication if in 64-bit
> > mode.
>
> Echoes numbers is good. What's it do after "a while"?
>

Two stopped.  One kept on going.

> Does it dump registers if you hit 'r'?

It prints 1 0 0 0 0 each on a separate line.  (All three versions.)

> Does it exit if you hit ESC?

It prints 'A' and locks up.  (All three versions.)

> > 3) RM MS-DOS v7.10 after running some DJGPP programs
> > 1699 bytes
>
> Interesting...
>
> > - 512 0 bytes at the beginning
>
> Right at the beginning, or at the "align" directive?
>

Right at the front. 0x00 to 0xFF are 0x00.


Rod Pemberton



0
Reply do_not_have7664 (117) 5/2/2011 10:27:41 PM

"Rod Pemberton" <do_not_have@nospicedham.noavailemail.cmm> wrote in message
news:ipkimh$55s$1@speranza.aioe.org...
> "Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
> message news:j8jv88-c1h.ln1@nntp.local.net...
> >
> > > I [may] get back to checking out stuff in a day or two.
> >
> > Thanks, it's been a challenging project!
> >
>
> I've got more fixes, but not all of them.  I didn't check to see if you
> found them in your last post...
>
> Ok, I went through and checked all the differences.  There were an
> additional 16 places that are incorrect.  I marked the ones I found but
> currently have no fix for as "no fix".  There are 7 of those.  Those
> offsets will need to be fixed.  FYI, some instructions were "byte" form
> instead of "word" form according to NDISASM.  Japheth's code says WORD
> though.
>
> So, that fixes 9 places and leaves 7 places to look at: segments or their
> labels, offsets of various labels.
>


I'm down to two offsets being incorrect....  With Frank's and Steve's
suggestions and examples, I fixed 5 of the 7 remaining offsets that were
incorrect.  They need _TEXT subtracted from them and SECTION directives
for the segments, as cleared up by MAP directive.  I'll post it once it's
working.

These two lines are the remaining problems (IMO):

  mov bx, _TEXT  ; below .skip4

  dw _TEXT16 ; in backtoreal

We need SEG _TEXT or _TEXT>>4, and likewise for _TEXT16.  E.g., I'm getting
0x310 for _TEXT instead of 0x31, i.e., 0x310>>4.

I need to figure out how to shift a label's value right by 4 for two of them.
This should be SEG, or maybe WRT, but SEG isn't supported in -f bin, apparently.
I tried a few things with equ's, %define, %assign.  There is always an error
so far...  I.e., can't do that to a scalar, or symbol not defined yet, etc.
It looks like Steve is doing something similar with equ's, but that didn't
want to work for me with newer NASM's...
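
One possible way around the "can't shift a label" problem, sketched under the assumption that the section base is already a fixed number: put that number in an `equ` constant and derive both the `vstart` and the paragraph value from it. `TEXT_BASE` is a hypothetical name for this example. Since it is a plain numeric constant rather than a relocatable label, NASM should accept `>>` on it in `-f bin`. This only keeps the hard-coded value in one place; it does not make NASM work out where the section lands.

```nasm
; Sketch only: TEXT_BASE is a hypothetical constant, not from the thread.
	BITS 16
TEXT_BASE	equ	0x310		; byte offset where the 64-bit part starts

SECTION .text
	mov	bx, TEXT_BASE >> 4	; 0x31: a plain number, so >> is allowed

	BITS 64
SECTION .text2 vstart=TEXT_BASE align=8
_TEXT:
	ret
```

Offsets inside the second section can still be written as `label - _TEXT`, because both symbols live in the same section and their difference is a plain number.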


Rod Pemberton



0
Reply do_not_have7664 (117) 5/2/2011 10:35:05 PM

"Rod Pemberton" <do_not_have@nospicedham.noavailemail.cmm> wrote in message
news:ipnb9c$41o$1@speranza.aioe.org...
> I'm down to two offsets being incorrect....  With Frank's and Steve's
> suggestions and examples, I fixed 5 of the 7 remaining offsets that were
> incorrect.  They need _TEXT subtracted from them and SECTION directives
> for the segments, as cleared up by MAP directive.  I'll post it once it's
> working.
>
> These two lines are the remaining problems (IMO):
>
>   mov bx, _TEXT  ; below .skip4
>
>   dw _TEXT16 ; in backtoreal
>
> We need SEG _TEXT or _TEXT>>4, and likewise for _TEXT16.  E.g., getting
> 0x310 for _TEXT instead of 0x31, i.e., 0x310>>4.
>
> I need to figure out how to shift a label's value right by 4 for two of
> them.  This should be SEG, or maybe WRT, but SEG isn't supported in -f bin,
> apparently.  I tried a few things with equ's, %define, %assign.  There is
> always an error so far...  I.e., can't do that to a scalar or symbol not
> defined yet, etc.
> It looks like Steve is doing something similar with equ's, but that didn't
> want to work for me with newer NASM's...
>

I made a mistake with that.  It's

  mov ax, STACK

not

  dw _TEXT16

that's the problem.


It works.  It's not pretty.  I decided to fix the code at those two spots.
One went well.  The other location doesn't seem to want to have different
code there...  Self-modifying?  I don't know what's wrong.  So, except for
one of the two locations, the .obj portion is binary equivalent.  The
other had to be hard coded.  Search for "work around".  In the process, I
deleted the source (very rare...) and recovered most of it.  So, I hope I
didn't miss any fixes in the reworked section.  I pasted in the .exe header
the way Alex did it.  Steve posted a link to a .exe header, for those
interested.  So, it functions, but it needs somebody to work on it to clean
it up.
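
For anyone reading the pasted header, the same 48 bytes can be written as words with one field per line. This is a sketch that assumes the conventional MZ header layout; the field descriptions in the comments are the usual ones for that layout and do not come from the thread. The byte values are exactly those of the `db` lines in the listing.

```nasm
; Sketch only: the 48-byte header from the listing, one field per line.
	db	'M', 'Z'	; 0x00 signature
	dw	0x000F		; 0x02 bytes used in the last 512-byte page
	dw	0x0004		; 0x04 number of 512-byte pages in the file
	dw	0x0003		; 0x06 number of relocation entries
	dw	0x0003		; 0x08 header size in paragraphs (3 * 16 = 48)
	dw	0x0101		; 0x0A minimum extra paragraphs
	dw	0xFFFF		; 0x0C maximum extra paragraphs
	dw	0x005E		; 0x0E initial SS, relative to the load segment
	dw	0x1000		; 0x10 initial SP
	dw	0x0000		; 0x12 checksum
	dw	0x003A		; 0x14 initial IP
	dw	0x0000		; 0x16 initial CS, relative to the load segment
	dw	0x001E		; 0x18 file offset of the relocation table
	dw	0x0000		; 0x1A overlay number
	dw	0x0000		; 0x1C filler up to the relocation table
	dw	0x018F, 0x0000	; 0x1E relocation 1: offset, segment
	dw	0x02C4, 0x0000	; 0x22 relocation 2: offset, segment
	dw	0x02C7, 0x0000	; 0x26 relocation 3: offset, segment
	times 6 db 0		; 0x2A pad the header to 48 bytes
```

If the header is honored as written, the DOS loader adds the load segment to the words at image offsets 0x18F, 0x2C4 and 0x2C7 before the program starts. A hard-coded value sitting at one of those offsets would therefore differ at run time from what is in the file, which might be what looked like self-modifying code at the second work-around.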


;nasm -f bin
[map all dos64.map]

; 48 byte header

db 0x4D, 0x5A, 0x0F, 0x00, 0x04, 0x00, 0x03, 0x00
db 0x03, 0x00, 0x01, 0x01, 0xFF, 0xFF, 0x5E, 0x00
db 0x00, 0x10, 0x00, 0x00, 0x3A, 0x00, 0x00, 0x00
db 0x1E, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8F, 0x01
db 0x00, 0x00, 0xC4, 0x02, 0x00, 0x00, 0xC7, 0x02
db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00

ORG -0x30 ; takes into account the header tacked on

BITS 16
SECTION .text
_TEXT16:

GDTR dw 4 * 8 - 1
dd GDT
IDTR dw 256 * 16 - 1
dd 0
nullidt dw 0x3FF
dd 0

align 8
GDT dq 0
dw 0xFFFF, 0, 0x9A00, 0x0AF
dw 0xFFFF, 0, 0x9A00, 0x000
dw 0xFFFF, 0, 0x9200, 0x000

wPICMask dw 0

start16:
push cs
pop ds
mov ax, cs
movzx eax, ax
shl eax, 4
add dword [GDTR + 2], eax
mov word [GDT + 2 * 8 + 2], ax
mov word [GDT + 3 * 8 + 2], ax
shr eax, 16
mov byte [GDT + 2 * 8 + 4], al
mov byte [GDT + 3 * 8 + 4], al

mov ax, ss
mov dx, es
sub ax, dx
mov bx, sp
shr bx, 4
add bx, ax
mov ah, 0x4A
int 0x21
push cs
pop es
mov ax, ss
mov dx, cs
sub ax, dx
shl ax, 4
add ax, sp
push ds
pop ss
mov sp, ax

smsw ax
test al, 1
jz skip1

mov dx, err1
mov ah, 9
int 0x21
mov ah, 0x4C
int 0x21

err1: db "Mode is V86. Need REAL mode to switch to LONG mode!", 13, 10,'$'

skip1:
xor edx, edx
mov eax, 0x80000001
cpuid
test edx, 0x20000000
jnz skip2

mov dx, err2
mov ah, 9
int 0x21
mov ah, 0x4C
int 0x21

err2: db "No 64bit cpu detected.", 13, 10, '$'

skip2:
mov bx, 0x1000
mov ah, 0x48
int 0x21
jnc skip3
mov dx, err3
mov ah, 9
int 0x21
mov ah, 0x4C
int 0x21

err3: db "Out of memory", 13, 10, '$'

skip3:
add ax, 0x100 - 1
mov al, 0
mov es, ax

sub di, di
mov cx, 4096
sub eax, eax
rep stosd

sub di, di
mov ax, es
movzx eax, ax
shl eax, 4
mov cr3, eax

lea edx, [eax + 0x5000]
mov dword [IDTR + 2], edx

or eax, 111b
add eax, 0x1000
mov [es:di + 0x0000], eax
add eax, 0x1000
mov [es:di + 0x1000], eax
add eax, 0x1000
mov [es:di + 0x2000], eax
mov di, 0x3000
mov eax, 0 + 111b
mov cx, 256

.skip4:
stosd
add di, 4
add eax, 0x1000
loop .skip4

;NASM -f bin work around for unsupported SEG
;mov bx, _TEXT
mov bx, 0x31
movzx ebx, bx
shl ebx, 4
add [llg], ebx

mov di, 0x5000
mov cx, 32
mov edx, exception-_TEXT
add edx, ebx

make_exc_gates:
mov eax, edx
stosw
mov ax, 8
stosw
mov ax, 0x8E00
stosd
xor eax, eax
stosd
stosd
add edx, 4
loop make_exc_gates
mov cx, 256 - 32

make_int_gates:
mov eax, interrupt-_TEXT
add eax, ebx
stosw
mov ax, 8
stosw
mov ax, 0x8E00
stosd
xor eax, eax
stosd
stosd
loop make_int_gates

mov di, 0x5000
mov eax, ebx
add eax, clock-_TEXT
mov [es:di + 0x80 * 16 + 0], ax
shr eax, 16
mov [es:di + 0x80 * 16 + 6], ax

mov eax, ebx
add eax, keyboard-_TEXT
mov [es:di + 0x81 * 16 + 0], ax
shr eax, 16
mov [es:di + 0x81 * 16 + 6], ax

pushf
pop ax
and ah, 0xBF
push ax
popf

cli
in al, 0xA1
mov ah, al
in al, 0x21
mov [wPICMask], ax
mov al, 10001b
out 0x20, al
mov al, 10001b
out 0xA0, al
mov al, 0x80
out 0x21, al
mov al, 0x88
out 0xA1, al
mov al, 100b
out 0x21, al
mov al, 2
out 0xA1, al
mov al, 1
out 0x21, al
out 0xA1, al
in al, 0x21
mov al, 11111100b
out 0x21, al
in al, 0xA1
mov al, 11111111b
out 0xA1, al

mov eax, cr4
or eax, 1 << 5
mov cr4, eax

mov ecx, 0xC0000080

rdmsr
or eax, 1 << 8
wrmsr

lgdt [GDTR]
lidt [IDTR]

mov cx, ss
movzx ecx, cx
shl ecx, 4
add ecx, esp

mov eax, cr0
or eax, 0x80000001
mov cr0, eax

db 0x66, 0xEA ; jmp 0x8:0x0
llg dd long_start-_TEXT
dw 8

backtoreal:
mov eax, cr0
and eax, 0x7FFFFFFF
mov cr0, eax
mov ecx, 0xC0000080
rdmsr
and ax, ~1
wrmsr
mov ax, 24
mov ss, ax
mov ds, ax
mov es, ax
mov eax, cr0
and al, 0xFE
mov cr0, eax

db 0xEA
dw $+4
dw _TEXT16
mov ax, STACK
;NASM -f bin work around for unsupported SEG
 shr ax, 4
mov ss, ax
mov sp, 4096
push cs
pop ds
lidt [nullidt]
mov eax, cr4
and al, ~0x20
mov cr4, eax

mov al, 10001b
out 0x20, al
mov al, 10001b
out 0xA0, al
mov al, 0x08
out 0x21, al
mov al, 0x70
out 0xA1, al
mov al, 100b
out 0x21, al
mov al, 2
out 0xA1, al
mov al, 1
out 0x21, al
out 0xA1, al
in al, 0x21
mov ax, [wPICMask]
out 0x21, al
mov al, ah
out 0xA1, al
sti

mov ax, 0x4C00
int 0x21


BITS 64
SECTION .text2 vstart=0x310 align=8
_TEXT:

long_start:
xor eax, eax
mov ss, eax
mov esp, ecx
sti

call WriteStrX
db 'Hello 64bit', 10, 0

nextcmd:
mov r8b, 0

nocmd:
cmp r8b, 0
jz nocmd
cmp r8b, 1
jz esc_pressed
cmp r8b, 0x13
jz r_pressed
call WriteStrX
db 'unknown key ', 0
mov al, r8b
call WriteB
call WriteStrX
db 10, 0
jmp nextcmd

r_pressed:
call WriteStrX
db 10, "cr0=", 0
mov rax, cr0
call WriteQW
call WriteStrX
db 10, "cr2=", 0
mov rax, cr2
call WriteQW
call WriteStrX
db 10, "cr3=", 0
mov rax, cr3
call WriteQW
call WriteStrX
db 10, "cr4=", 0
mov rax, cr4
call WriteQW
call WriteStrX
db 10, "cr8=", 0
mov rax, cr8
call WriteQW
call WriteStrX
db 10, 0
jmp nextcmd

esc_pressed:
jmp dword far [rel bv]
bv: dd backtoreal
dw 16

scroll_screen:
cld
mov rdi, rsi
movzx rax, byte [rbp + 0x4a]
push rax
lea rsi, [rsi + 2 * rax]
mov cl, [rbp + 0x84]
mul cl
mov rcx, rax
rep movsw
pop rcx
mov ax, 0x0720
rep stosw
ret

WriteChr:
push rbp
push rdi
push rsi
push rbx
push rcx
push rdx
push rax
mov rdi, dword 0xB8000
mov rbp, dword 0x400
cmp byte [rbp + 0x63], 0xB4
jnz .skip5
xor di, di

.skip5:
movzx rbx, byte [rbp + 0x4E]
add rdi, rbx
movzx rbx, byte [rbp + 0x62]
mov rsi, rdi
movzx rcx, byte [rbx * 2 + rbp + 0x50 + 1]
movzx rax, byte [rbp + 0x4A]
mul rcx
movzx rdx, byte [rbx * 2 + rbp + 0x50]
add rax, rdx
mov dh, cl
lea rdi, [rdi + rax * 2]
mov al, [rsp]
cmp al, 10
jz .newline
mov [rdi], al
mov byte [rdi + 1], 7
inc dl
cmp dl, byte [rbp + 0x4a]
jb .skip6

.newline:
mov dl, 0
inc dh
cmp dh, byte [rbp + 0x84]
jbe .skip6
dec dh
call scroll_screen

.skip6:
mov [rbx * 2 + rbp + 0x50], dx
pop rax
pop rdx
pop rcx
pop rbx
pop rsi
pop rdi
pop rbp
ret

WriteStr:
push rsi
mov rsi, rdx
cld

.skip7:
lodsb
and al, al
jz .skip8
call WriteChr
jmp .skip7

.skip8:
pop rsi
ret

WriteStrX:
push rsi
mov rsi, [rsp + 8]
cld

.skip9:
lodsb
and al, al
jz .skip10
call WriteChr
jmp .skip9

.skip10:
mov [rsp + 8], rsi
pop rsi
ret

WriteQW:
push rax
shr rax, 32
call WriteDW
pop rax

WriteDW:
push rax
shr rax, 16
call WriteW
pop rax

WriteW:
push rax
shr rax, 8
call WriteB
pop rax

WriteB:
push rax
shr rax, 4
call WriteNb
pop rax

WriteNb:
and al, 0x0F
add al, '0'
cmp al, '9'
jbe .skip11
add al, 7

.skip11:
jmp WriteChr

exception:
%assign excno 0
%rep 32
push excno
jmp .skip12
%assign excno excno + 1
%endrep

.skip12:
call WriteStrX
db 10, "Exception ", 0
pop rax
call WriteB
call WriteStrX
db " errcode=", 0
mov rax, [rsp + 0]
call WriteQW
call WriteStrX
db " rip=", 0
mov rax, [rsp + 8]
call WriteQW
call WriteStrX
db 10, 0

.skip13:
jmp $

clock:
push rbp
mov rbp, dword 0x400
inc dword [rbp + 0x6C]
pop rbp

interrupt:
push rax
mov al, 0x20
out 0x20, al
pop rax
iretq

keyboard:
push rax
in al, 0x60
test al, 0x80
jnz .skip14
mov r8b, al

.skip14:
in al, 0x61
out 0x61, al
mov al, 0x20
out 0x20, al
pop rax
iretq

SECTION .bss start=0x5e0 align=8
STACK:


HTH,


Rod Pemberton


0
Reply do_not_have7664 (117) 5/3/2011 1:06:53 AM

On Mon, 2011-05-02 at 21:06 -0400, Rod Pemberton wrote:

> It works.  It's not pretty.  I decided to fix the code at those two spots.
> One went well.  The other location doesn't seem to want to have different
> code there...  Self-modifying?  I don't know what's wrong.  So, except for
> the one of the two locations, the .obj portion is binary equivalent.  The
> other had to be hard coded.  Search for "work around".  In the process, I
> deleted the source (very rare...) and recovered most of it.  So, I hope I
> didn't miss any fixes in the reworked section.  I pasted in the .exe header
> the way Alex did it.  Steve posted a link to a .exe header, for those
> interested.  So, it functions, but it needs somebody to work on it to clean
> it up.

Yes, your version works very well. I've also finished commenting my
code.

Here is my non-working version:
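	; 48 byte header

	db	0x4D, 0x5A, 0x0F, 0x00, 0x04, 0x00, 0x03, 0x00	; 'MZ', 0x0F bytes on last page, 4 pages, 3 relocations
	db	0x03, 0x00, 0x01, 0x01, 0xFF, 0xFF, 0x5E, 0x00	; 3 header paragraphs, min 0x0101 / max 0xFFFF extra paragraphs, SS = 0x005E
	db	0x00, 0x10, 0x00, 0x00, 0x3A, 0x00, 0x00, 0x00	; SP = 0x1000, checksum 0, IP = 0x003A, CS = 0
	db	0x1E, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8F, 0x01	; relocation table at 0x1E, overlay 0, reserved word, relocation 1 ...
	db	0x00, 0x00, 0xC4, 0x02, 0x00, 0x00, 0xC7, 0x02	; ... = 0000:018F, relocation 2 = 0000:02C4, relocation 3 ...
	db	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00	; ... = 0000:02C7, then padding up to 48 bytes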

;%include 'exebin.mac'

	ORG	-0x30 	; takes into account the header tacked on

;	EXE_begin start16

	BITS	16

_TEXT16 segment use16
						; Global Descriptors Table Register
GDTR	dw	4 * 8 - 1			; GDTR size
	dd	GDT				; Linear address where GDT is stored

						; Interrupt Descriptors Table Register
IDTR	dw	256 * 16 - 1			; IDTR size
	dd	0				; Linear address where IDT is kept
nullidt dw	0x3FF
	dd	0

	align	8
GDT	dq	0				; Null descriptor
	dw	0xFFFF, 0, 0x9A00, 0x0AF	; 64 bit code descriptor
	dw	0xFFFF, 0, 0x9A00, 0x000	; Compatibility mode code descriptor
	dw	0xFFFF, 0, 0x9200, 0x000	; Compatibility mode data descriptor

wPICMask dw	0				; used for saving/restoring PIC masks

start16:
	push 	cs
	pop	ds
	mov	ax, cs
	movzx	eax, ax
	shl	eax, 4
	add	dword [GDTR + 2], eax		; convert offset to linear address
	mov	word [GDT + 2 * 8 + 2], ax
	mov	word [GDT + 3 * 8 + 2], ax
	shr	eax, 16
	mov	byte [GDT + 2 * 8 + 4], al
	mov	byte [GDT + 3 * 8 + 4], al

	mov	ax, ss				; stack segment address
	mov	dx, es				; extra segment address
	sub	ax, dx
	mov	bx, sp				; stack pointer
	shr	bx, 4				; shift to the right four times
	add	bx, ax				; adds result of ax to bx

	mov	ah, 0x4A			; frees unused memory
	int	0x21

	push	cs				; copies code segment into extra segment
	pop	es
	mov	ax, ss				; stack segment address
	mov	dx, cs				; code segment address
	sub	ax, dx
	shl	ax, 4				; shift to the left four times
	add	ax, sp				; adds sp to ax
	push	ds				; copies data segment into stack segment
	pop	ss
	mov	sp, ax				; creates a TINY model (CS=SS=DS=ES)

	smsw	ax				; Are we running in V86?
	test	al, 1
	jz	skip1				; Yay, we're in real mode

	mov	dx, err1			; We're in V86 mode, it's byebye
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C			; Hand over to DOS
	int	0x21

err1:	db	"Mode is V86. Need REAL mode to switch to LONG mode!", 13, 10, '$'

skip1:
	xor	edx, edx
	mov	eax, 0x80000001
	cpuid
	test 	edx, 0x20000000			; Is LONG mode supported?
	jnz	skip2				; Yay, we've got a 64 bit processor!

	mov	dx, err2			; We're 32 bit, it's byebye
	mov	ah, 9
	int	0x21

	mov	ah, 0x4C			; Hand over to DOS
	int	0x21

err2:	db	"No 64bit cpu detected.", 13, 10, '$'

skip2:
	mov	bx, 0x1000			; Allocate 4096 paragraphs
	mov	ah, 0x48
	int	0x21

	jnc	skip3				; Yay, we got what we wanted
	mov	dx, err3
	mov	ah, 9
	int	0x21

	mov	ah, 0x4C			; Hand over to DOS
	int	0x21

err3:	db	"Out of memory", 13, 10, '$'

skip3:
	add 	ax, 0x100 - 1			; Realigns to page boundary (adds 255 to it)
	mov	al, 0				; Zeros out low 8 bits of register
	mov	es, ax				; Sets extra segment to address of allocated block

	sub	di, di				; Sets DI to zero
	mov	cx, 4096			; Sets CX to 4096
	sub	eax, eax			; Zeros out EAX
	rep	stosd				; Clears 4 pages' worth (16KB) in ES:DI

	sub	di, di				; Clears DI
	mov	ax, es				; Copies extra segment into AX
	movzx	eax, ax				; Extends into EAX
	shl	eax, 4				; Shift to left four times
	mov	cr3, eax			; Loads CR3 with pointer to the top-level page table (PML4)

	lea	edx, [eax + 0x5000]		; Load EDX with address in EAX plus 20KB (20,480 bytes)
	mov	dword [IDTR + 2], edx		; Stores linear address into IDTR + 2

	or	eax, 111b			; Sets low 3 bits to 1
	add	eax, 0x1000			; Adds 4096 to result
	mov	[es:di + 0x0000], eax		; Stores result into [ES:DI] (first PDP table)
	add	eax, 0x1000			; Add 4096 to result
	mov	[es:di + 0x1000], eax		; Stores result into [ES:DI + 4096] (first page directory)
	add	eax, 0x1000			; add 4096 to result
	mov	[es:di + 0x2000], eax		; Stores result into [ES:DI + 8192] (first page table)
	mov	di, 0x3000			; Sets DI to 12288 (12KB) (address of first page table)
	mov	eax, 0 + 111b			; Sets EAX to 7 (why?)
	mov	cx, 256

.skip4:
	stosd					; Stores EAX into [ES:DI], DI incremented 4 bytes
	add	di, 4				; Increment DI by 4 (skips the high dword of the 8-byte entry)
	add	eax, 0x1000			; Adds 4096 to EAX
	loop	.skip4				; Is CX still not zero yet?

	mov	bx, _TEXT			; Load BX with address of _TEXT
	movzx	ebx, bx				; Extends to EBX
	shl	ebx, 4				; Shift to the left four times
	add	[llg], ebx			; Adds the linear base in EBX to the far-jump offset at llg
						; *** SELF MODIFYING CODE ***

	mov	di, 0x5000			; Loads DI with 0x5000 (20KB or 20,480 bytes)
	mov	cx, 32				; Loads CX with 32
	mov	edx, exception			; Load EDX with pointer to exception
	add	edx, ebx			; Adds EBX to EDX

make_exc_gates:
	mov	eax, edx			; Loads EAX with EDX
	stosw					; Stores AX into [ES:DI], DI incremented 2 bytes
	mov	ax, 8				; Loads AX with 8
	stosw					; Stores AX into [ES:DI], DI incremented 2 bytes
	mov	ax, 0x8E00			; Loads AX (why this value?)
	stosd					; Stores EAX into [ES:DI], DI incremented 4 bytes
	xor	eax, eax			; Zeros out EAX
	stosd					; Stores 0 into [ES:DI], DI incremented 4 bytes
	stosd					; As above
	add	edx, 4				; Adds 4 to EDX
	loop	make_exc_gates			; Is CX still not zero yet?

	mov	cx, 256 - 32			; Loads CX with 224

make_int_gates:
	mov	eax, interrupt			; Loads EAX with pointer to interrupt
	add	eax, ebx			; Adds address of _TEXT in EBX to EAX
	stosw					; Stores AX into [ES:DI], DI incremented 2 bytes
	mov	ax, 8				; Loads AX with 8
	stosw					; Stores AX into [ES:DI], DI incremented 2 bytes
	mov	ax, 0x8E00			; Loads AX (why this value?)
	stosd					; Stores EAX into [ES:DI], DI incremented 4 bytes
	xor	eax, eax			; Zeros out EAX
	stosd					; Stores 0 into [ES:DI], DI incremented 4 bytes
	stosd					; As above
	loop	make_int_gates			; Is CX still not zero yet?

	mov	di, 0x5000			; Loads DI with 0x5000 (20KB or 20,480 bytes)
	mov	eax, ebx			; Loads EAX with address of _TEXT in EAX
	add	eax, clock			; Adds offset to clock to EAX
	mov	[es:di + 0x80 * 16 + 0], ax	; Pointer to IRQ 0 handler stored
	shr	eax, 16				; Shift to the right 16 times as a linear address
	mov	[es:di + 0x80 * 16 + 6], ax	; Pointer to IRQ 0 handler stored

	mov	eax, ebx			; Loads EAX with address of _TEXT in EAX
	add	eax, keyboard			; Adds offset to keyboard to EAX
	mov	[es:di + 0x81 * 16 + 0], ax	; Pointer to IRQ 1 handler stored
	shr	eax, 16				; Shift to the right 16 times as a linear address
	mov	[es:di + 0x81 * 16 + 6], ax	; Pointer to IRQ 1 handler stored

	pushf					; Clears the NT flag in readiness for interrupt handling
	pop	ax
	and	ah, 0xBF
	push	ax
	popf

	cli					; Disable interrupts
	in	al, 0xA1
	mov	ah, al
	in	al, 0x21
	mov	[wPICMask], ax			; Copies PIC contents to restore later
	mov	al, 10001b			; Initialise PIC 1
	out	0x20, al
	mov	al, 10001b			; Initialise PIC 2
	out	0xA0, al
	mov	al, 0x80			; IRQ 0-7 handles interrupt 0x80 .. 0x87
	out	0x21, al
	mov	al, 0x88			; IRQ 8-15 handles interrupt 0x88 .. 0x8F
	out	0xA1, al
	mov	al, 100b			; slave to IRQ 2
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov	al, 1				; ICW4: 8086 mode (not an EOI)
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	al, 11111100b			; Only enable clock and keyboard IRQs
	out	0x21, al
	in	al, 0xA1
	mov	al, 11111111b
	out	0xA1, al

	mov	eax, cr4
	or	eax, 1 << 5
	mov	cr4, eax			; Enables PAE (CR4 bit 5) as a prelude to LONG mode

	mov	ecx, 0xC0000080			; EFER MSR

	rdmsr
	or	eax, 1 << 8			; Enables LONG mode
	wrmsr

	lgdt	[GDTR]				; Loads GDT
	lidt	[IDTR]				; Loads IDT

	mov	cx, ss				; Loads CX with address of SS segment
	movzx	ecx, cx				; Extends to ECX
	shl	ecx, 4				; Shift to left four times
	add	ecx, esp

	mov	eax, cr0
	or	eax, 0x80000001			; Enable paging and protected mode
	mov	cr0, eax

	db	0x66, 0xEA			; jmp far 0x8:0x0
llg	dd	long_start			; (64 bit routine)
	dw	8

backtoreal:
	mov	eax, cr0
	and	eax, 0x7FFFFFFF			; Disable paging (clears CR0.PG only)
	mov	cr0, eax

	mov	ecx, 0xC0000080			; EFER MSR

	rdmsr
	and	ax, ~1				; Meant to disable LONG mode, but this clears bit 0 (SCE); EFER.LME is bit 8 (and ah, ~1)
	wrmsr

	mov	ax, 24				; Loads AX with 24 - why this value?
	mov	ss, ax
	mov	ds, ax
	mov	es, ax				; sets SS=DS=ES with 24

	mov	eax, cr0
	and	al, 0xFE			; Clears CR0.PE: back to REAL mode
	mov	cr0, eax

	db	0xEA				; jmp _TEXT16:$+4
	dw	$+4
	dw	_TEXT16

	mov	ax, STACK			; Reinitialise SS with STACK
	mov	ss, ax
	mov	sp, 4096			; Sets up stack pointer

	push	cs
	pop	ds				; DS=CS

	lidt	[nullidt]			; Restores the real-mode IVT (base 0, limit 0x3FF)

	mov	eax, cr4
	and	al, ~0x20			; Disables PAE (CR4 bit 5) as a prelude to REAL mode
	mov	cr4, eax

	mov	al, 10001b			; Initialise PIC 1
	out	0x20, al
	mov	al, 10001b			; Initialise PIC 2
	out	0xA0, al
	mov	al, 0x08			; IRQ 0 - 7 now handles interrupts 0x08 .. 0x0F
	out	0x21, al
	mov	al, 0x70			; IRQ 8 - 15 now handles interrupts 0x70 ... 0x77
	out	0xA1, al
	mov	al, 100b			; slaved to IRQ 2
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov	al, 1				; ICW4: 8086 mode (not an EOI)
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	ax, [wPICMask]			; Restore original PIC
	out	0x21, al
	mov	al, ah
	out	0xA1, al
	sti					; Enable interrupts

	mov	ax, 0x4C00			; Hands over to DOS
	int	0x21

	BITS	64

	align	8, db 0

_TEXT	segment use64

	; 64 bit code - rbx must preserve linear address of _TEXT

long_start:
	xor	eax, eax
	mov	ss, eax
	mov	esp, ecx
	sti					; Enable interrupts

	call	WriteStrX			; Yay, LONG mode!
	db	'Hello 64bit', 10, 0

nextcmd:
	mov	r8b, 0

nocmd:
	cmp	r8b, 0
	jz	nocmd
	cmp	r8b, 1
	jz	esc_pressed
	cmp	r8b, 0x13
	jz	r_pressed
	call	WriteStrX
	db 	'Unknown key ', 0
	mov	al, r8b
	call	WriteB
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

r_pressed:
	call	WriteStrX
	db	10, "cr0=", 0
	mov	rax, cr0
	call	WriteQW
	call	WriteStrX
	db	10, "cr2=", 0
	mov	rax, cr2
	call	WriteQW
	call	WriteStrX
	db	10, "cr3=", 0
	mov	rax, cr3
	call	WriteQW
	call	WriteStrX
	db	10, "cr4=", 0
	mov	rax, cr4
	call	WriteQW
	call	WriteStrX
	db	10, "cr8=", 0
	mov	rax, cr8
	call	WriteQW
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

esc_pressed:
	jmp	dword far [rel bv]
bv:	dd	backtoreal
	dw	16

scroll_screen:
	cld
	mov	rdi, rsi
	movzx	rax, byte [rbp + 0x4a]
	push	rax
	lea	rsi, [rsi + 2 * rax]
	mov	cl, [rbp + 0x84]
	mul	cl
	mov	rcx, rax
	rep	movsw
	pop	rcx
	mov	ax, 0x0720
	rep	stosw
	ret

WriteChr:
	push	rbp
	push	rdi
	push	rsi
	push	rbx
	push 	rcx
	push	rdx
	push	rax
	mov	rdi, dword 0xB8000
	mov	rbp, dword 0x400
	cmp	byte [rbp + 0x63], 0xB4
	jnz	.skip5
	xor	di, di

.skip5:
	movzx	rbx, byte [rbp + 0x4E]
	add	rdi, rbx
	movzx	rbx, byte [rbp + 0x62]
	mov	rsi, rdi
	movzx	rcx, byte [rbx * 2 + rbp + 0x50 + 1]
	movzx	rax, word [rbp + 0x4A]
	mul	rcx
	movzx	rdx, byte [rbx * 2 + rbp + 0x50]
	add	rax, rdx
	mov	dh, cl
	lea	rdi, [rdi + rax * 2]
	mov	al, [rsp]
	cmp	al, 10
	jz	.newline
	mov	[rdi], al
	mov	byte [rdi + 1], 7
	inc	dl
	cmp	dl, byte [rbp + 0x4a]
	jb	.skip6

.newline:
	mov	dl, 0
	inc	dh
	cmp	dh, byte [rbp + 0x84]
	jbe	.skip6
	dec	dh
	call	scroll_screen

.skip6:
	mov	[rbx * 2 + rbp + 0x50], dx
	pop	rax
	pop	rdx
	pop	rcx
	pop	rbx
	pop	rsi
	pop	rdi
	pop	rbp
	ret

WriteStr:
	push	rsi
	mov	rsi, rdx
	cld

.skip7:
	lodsb
	and	al, al
	jz	.skip8
	call	WriteChr
	jmp	.skip7

.skip8:
	pop	rsi
	ret

WriteStrX:
	push	rsi
	mov	rsi, [rsp + 8]
	cld

.skip9:
	lodsb
	and	al, al
	jz	.skip10
	call	WriteChr
	jmp	.skip9

.skip10:
	mov	[rsp + 8], rsi
	pop	rsi
	ret

WriteQW:
	push	rax
	shr	rax, 32
	call	WriteDW
	pop	rax

WriteDW:
	push	rax
	shr	rax, 16
	call	WriteW
	pop	rax

WriteW:
	push	rax
	shr	rax, 8
	call	WriteB
	pop	rax

WriteB:
	push	rax
	shr	rax, 4
	call	WriteNb
	pop	rax

WriteNb:
	and	al, 0x0F
	add	al, '0'
	cmp	al, '9'
	jbe	.skip11
	add	al, 7

.skip11:
	jmp	WriteChr

exception:
%assign excno 0
%rep	32
	push	excno
	jmp	.skip12
%assign excno excno + 1
%endrep

.skip12:
	call	WriteStrX
	db	10, "Exception ", 0
	pop	rax
	call	WriteB
	call	WriteStrX
	db 	" errcode=", 0
	mov	rax, [rsp + 0]
	call	WriteQW
	call	WriteStrX
	db	" rip=", 0
	mov	rax, [rsp + 8]
	call	WriteQW
	call	WriteStrX
	db	10, 0

.skip13:
	jmp $

clock:
	push 	rbp
	mov	rbp, dword 0x400
	inc	dword [rbp + 0x6C]
	pop	rbp

interrupt:
	push	rax
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

keyboard:
	push	rax
	in	al, 0x60
	test	al, 0x80
	jnz	.skip14
	mov	r8b, al

.skip14:
	in	al, 0x61
	out	0x61, al
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

STACK segment use16
	resb 	0

;	EXE_stack 4096
;	EXE_end

(this is yours with my comments!)

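	; 48 byte header

	db	0x4D, 0x5A, 0x0F, 0x00, 0x04, 0x00, 0x03, 0x00	; 'MZ', 0x0F bytes on last page, 4 pages, 3 relocations
	db	0x03, 0x00, 0x01, 0x01, 0xFF, 0xFF, 0x5E, 0x00	; 3 header paragraphs, min 0x0101 / max 0xFFFF extra paragraphs, SS = 0x005E
	db	0x00, 0x10, 0x00, 0x00, 0x3A, 0x00, 0x00, 0x00	; SP = 0x1000, checksum 0, IP = 0x003A, CS = 0
	db	0x1E, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8F, 0x01	; relocation table at 0x1E, overlay 0, reserved word, relocation 1 ...
	db	0x00, 0x00, 0xC4, 0x02, 0x00, 0x00, 0xC7, 0x02	; ... = 0000:018F, relocation 2 = 0000:02C4, relocation 3 ...
	db	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00	; ... = 0000:02C7, then padding up to 48 bytes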

;%include 'exebin.mac'

	ORG	-0x30 	; takes into account the header tacked on

;	EXE_begin start16

	BITS	16

	SECTION	.text
_TEXT16:
						; Global Descriptors Table Register
GDTR	dw	4 * 8 - 1			; GDTR size
	dd	GDT				; Linear address where GDT is stored

						; Interrupt Descriptors Table Register
IDTR	dw	256 * 16 - 1			; IDTR size
	dd	0				; Linear address where IDT is kept
nullidt dw	0x3FF
	dd	0

	align	8
GDT	dq	0				; Null descriptor
	dw	0xFFFF, 0, 0x9A00, 0x0AF	; 64 bit code descriptor
	dw	0xFFFF, 0, 0x9A00, 0x000	; Compatibility mode code descriptor
	dw	0xFFFF, 0, 0x9200, 0x000	; Compatibility mode data descriptor

wPICMask dw	0				; used for saving/restoring PIC masks

start16:
	push 	cs
	pop	ds
	mov	ax, cs
	movzx	eax, ax
	shl	eax, 4
	add	dword [GDTR + 2], eax		; convert offset to linear address
	mov	word [GDT + 2 * 8 + 2], ax
	mov	word [GDT + 3 * 8 + 2], ax
	shr	eax, 16
	mov	byte [GDT + 2 * 8 + 4], al
	mov	byte [GDT + 3 * 8 + 4], al

	mov	ax, ss				; stack segment address
	mov	dx, es				; extra segment address
	sub	ax, dx
	mov	bx, sp				; stack pointer
	shr	bx, 4				; shift to the right four times
	add	bx, ax				; adds result of ax to bx

	mov	ah, 0x4A			; frees unused memory
	int	0x21

	push	cs				; copies code segment into extra segment
	pop	es
	mov	ax, ss				; stack segment address
	mov	dx, cs				; code segment address
	sub	ax, dx
	shl	ax, 4				; shift to the left four times
	add	ax, sp				; adds sp to ax
	push	ds				; copies data segment into stack segment
	pop	ss
	mov	sp, ax				; creates a TINY model (CS=SS=DS=ES)

	smsw	ax				; Are we running in V86?
	test	al, 1
	jz	skip1				; Yay, we're in real mode

	mov	dx, err1			; We're in V86 mode, it's byebye
	mov	ah, 9
	int	0x21
	mov	ah, 0x4C			; Hand over to DOS
	int	0x21

err1:	db	"Mode is V86. Need REAL mode to switch to LONG mode!", 13, 10, '$'

skip1:
	xor	edx, edx
	mov	eax, 0x80000001
	cpuid
	test 	edx, 0x20000000			; Is LONG mode supported?
	jnz	skip2				; Yay, we've got a 64 bit processor!

	mov	dx, err2			; We're 32 bit, it's byebye
	mov	ah, 9
	int	0x21

	mov	ah, 0x4C			; Hand over to DOS
	int	0x21

err2:	db	"No 64bit cpu detected.", 13, 10, '$'

skip2:
	mov	bx, 0x1000			; Allocate 4096 paragraphs
	mov	ah, 0x48
	int	0x21

	jnc	skip3				; Yay, we got what we wanted
	mov	dx, err3
	mov	ah, 9
	int	0x21

	mov	ah, 0x4C			; Hand over to DOS
	int	0x21

err3:	db	"Out of memory", 13, 10, '$'

skip3:
	add 	ax, 0x100 - 1			; Realigns to page boundary (adds 255 to it)
	mov	al, 0				; Zeros out low 8 bits of register
	mov	es, ax				; Sets extra segment to address of allocated block

	sub	di, di				; Sets DI to zero
	mov	cx, 4096			; Sets CX to 4096
	sub	eax, eax			; Zeros out EAX
	rep	stosd				; Clears 4 pages' worth (16KB) in ES:DI

	sub	di, di				; Clears DI
	mov	ax, es				; Copies extra segment into AX
	movzx	eax, ax				; Extends into EAX
	shl	eax, 4				; Shift to left four times
	mov	cr3, eax			; Loads CR3 with pointer to the top-level page table (PML4)

	lea	edx, [eax + 0x5000]		; Load EDX with address in EAX plus 20KB (20,480 bytes)
	mov	dword [IDTR + 2], edx		; Stores linear address into IDTR + 2

	or	eax, 111b			; Sets low 3 bits to 1
	add	eax, 0x1000			; Adds 4096 to result
	mov	[es:di + 0x0000], eax		; Stores result into [ES:DI] (first PDP table)
	add	eax, 0x1000			; Add 4096 to result
	mov	[es:di + 0x1000], eax		; Stores result into [ES:DI + 4096] (first page directory)
	add	eax, 0x1000			; add 4096 to result
	mov	[es:di + 0x2000], eax		; Stores result into [ES:DI + 8192] (first page table)
	mov	di, 0x3000			; Sets DI to 12288 (12KB) (address of first page table)
	mov	eax, 0 + 111b			; Sets EAX to 7 (why?)
	mov	cx, 256

.skip4:
	stosd					; Stores EAX into [ES:DI], DI incremented 4 bytes
	add	di, 4				; Increment DI by 4 (skips the high dword of the 8-byte entry)
	add	eax, 0x1000			; Adds 4096 to EAX
	loop	.skip4				; Is CX still not zero yet?

	mov	bx, 0x31			; Load BX with address of _TEXT
	movzx	ebx, bx				; Extends to EBX
	shl	ebx, 4				; Shift to the left four times
	add	[llg], ebx			; Adds the linear base in EBX to the far-jump offset at llg
						; *** SELF MODIFYING CODE ***

	mov	di, 0x5000			; Loads DI with 0x5000 (20KB or 20,480 bytes)
	mov	cx, 32				; Loads CX with 32
	mov	edx, exception - _TEXT		; Load EDX with pointer to exception
	add	edx, ebx			; Adds EBX to EDX

make_exc_gates:
	mov	eax, edx			; Loads EAX with EDX
	stosw					; Stores AX into [ES:DI], DI incremented 2 bytes
	mov	ax, 8				; Loads AX with 8
	stosw					; Stores AX into [ES:DI], DI incremented 2 bytes
	mov	ax, 0x8E00			; Loads AX (why this value?)
	stosd					; Stores EAX into [ES:DI], DI incremented 4 bytes
	xor	eax, eax			; Zeros out EAX
	stosd					; Stores 0 into [ES:DI], DI incremented 4 bytes
	stosd					; As above
	add	edx, 4				; Adds 4 to EDX
	loop	make_exc_gates			; Is CX still not zero yet?

	mov	cx, 256 - 32			; Loads CX with 224

make_int_gates:
	mov	eax, interrupt - _TEXT		; Loads EAX with pointer to interrupt
	add	eax, ebx			; Adds address of _TEXT in EBX to EAX
	stosw					; Stores AX into [ES:DI], DI incremented 2 bytes
	mov	ax, 8				; Loads AX with 8
	stosw					; Stores AX into [ES:DI], DI incremented 2 bytes
	mov	ax, 0x8E00			; Loads AX (why this value?)
	stosd					; Stores EAX into [ES:DI], DI incremented 4 bytes
	xor	eax, eax			; Zeros out EAX
	stosd					; Stores 0 into [ES:DI], DI incremented 4 bytes
	stosd					; As above
	loop	make_int_gates			; Is CX still not zero yet?

	mov	di, 0x5000			; Loads DI with 0x5000 (20KB or 20,480 bytes)
	mov	eax, ebx			; Loads EAX with address of _TEXT in EAX
	add	eax, clock - _TEXT		; Adds offset to clock to EAX
	mov	[es:di + 0x80 * 16 + 0], ax	; Pointer to IRQ 0 handler stored
	shr	eax, 16				; Shift to the right 16 times as a linear address
	mov	[es:di + 0x80 * 16 + 6], ax	; Pointer to IRQ 0 handler stored

	mov	eax, ebx			; Loads EAX with address of _TEXT in EAX
	add	eax, keyboard - _TEXT		; Adds offset to keyboard to EAX
	mov	[es:di + 0x81 * 16 + 0], ax	; Pointer to IRQ 1 handler stored
	shr	eax, 16				; Shift to the right 16 times as a linear address
	mov	[es:di + 0x81 * 16 + 6], ax	; Pointer to IRQ 1 handler stored

	pushf					; Clears the NT flag in readiness for interrupt handling
	pop	ax
	and	ah, 0xBF
	push	ax
	popf

	cli					; Disable interrupts
	in	al, 0xA1
	mov	ah, al
	in	al, 0x21
	mov	[wPICMask], ax			; Copies PIC contents to restore later
	mov	al, 10001b			; Initialise PIC 1
	out	0x20, al
	mov	al, 10001b			; Initialise PIC 2
	out	0xA0, al
	mov	al, 0x80			; IRQ 0-7 handles interrupt 0x80 .. 0x87
	out	0x21, al
	mov	al, 0x88			; IRQ 8-15 handles interrupt 0x88 .. 0x8F
	out	0xA1, al
	mov	al, 100b			; slave to IRQ 2
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov	al, 1				; ICW4: 8086 mode (not an EOI)
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	al, 11111100b			; Only enable clock and keyboard IRQs
	out	0x21, al
	in	al, 0xA1
	mov	al, 11111111b
	out	0xA1, al

	mov	eax, cr4
	or	eax, 1 << 5
	mov	cr4, eax			; Enables PAE (CR4 bit 5) as a prelude to LONG mode

	mov	ecx, 0xC0000080			; EFER MSR

	rdmsr
	or	eax, 1 << 8			; Enables LONG mode
	wrmsr

	lgdt	[GDTR]				; Loads GDT
	lidt	[IDTR]				; Loads IDT

	mov	cx, ss				; Loads CX with address of SS segment
	movzx	ecx, cx				; Extends to ECX
	shl	ecx, 4				; Shift to left four times
	add	ecx, esp

	mov	eax, cr0
	or	eax, 0x80000001			; Enable paging and protected mode
	mov	cr0, eax

	db	0x66, 0xEA			; jmp far 0x8:0x0
llg	dd	long_start - _TEXT		; (64 bit routine)
	dw	8

backtoreal:
	mov	eax, cr0
	and	eax, 0x7FFFFFFF			; Disable paging (clears CR0.PG only)
	mov	cr0, eax

	mov	ecx, 0xC0000080			; EFER MSR

	rdmsr
	and	ax, ~1				; Meant to disable LONG mode, but this clears bit 0 (SCE); EFER.LME is bit 8 (and ah, ~1)
	wrmsr

	mov	ax, 24				; Loads AX with 24 - why this value?
	mov	ss, ax
	mov	ds, ax
	mov	es, ax				; sets SS=DS=ES with 24

	mov	eax, cr0
	and	al, 0xFE			; Clears CR0.PE: back to REAL mode
	mov	cr0, eax

	db	0xEA				; jmp _TEXT16:$+4
	dw	$+4
	dw	_TEXT16

	mov	ax, STACK			; Reinitialise SS with STACK
	shr	ax, 4				; Converts to a segment address
	mov	ss, ax
	mov	sp, 4096			; Sets up stack pointer

	push	cs
	pop	ds				; DS=CS

	lidt	[nullidt]			; Restores the real-mode IVT (base 0, limit 0x3FF)

	mov	eax, cr4
	and	al, ~0x20			; Disables PAE (CR4 bit 5) as a prelude to REAL mode
	mov	cr4, eax

	mov	al, 10001b			; Initialise PIC 1
	out	0x20, al
	mov	al, 10001b			; Initialise PIC 2
	out	0xA0, al
	mov	al, 0x08			; IRQ 0 - 7 now handles interrupts 0x08 .. 0x0F
	out	0x21, al
	mov	al, 0x70			; IRQ 8 - 15 now handles interrupts 0x70 ... 0x77
	out	0xA1, al
	mov	al, 100b			; slaved to IRQ 2
	out	0x21, al
	mov	al, 2
	out	0xA1, al
	mov	al, 1				; ICW4: 8086 mode (not an EOI)
	out	0x21, al
	out	0xA1, al
	in	al, 0x21
	mov	ax, [wPICMask]			; Restore original PIC
	out	0x21, al
	mov	al, ah
	out	0xA1, al
	sti					; Enable interrupts

	mov	ax, 0x4C00			; Hands over to DOS
	int	0x21

	BITS	64

	SECTION	.text2 vstart=0x310 align=8
_TEXT:

	; 64 bit code - rbx must preserve linear address of _TEXT

long_start:
	xor	eax, eax
	mov	ss, eax
	mov	esp, ecx
	sti					; Enable interrupts

	call	WriteStrX			; Yay, LONG mode!
	db	'Hello 64bit', 10, 0

nextcmd:
	mov	r8b, 0

nocmd:
	cmp	r8b, 0
	jz	nocmd
	cmp	r8b, 1
	jz	esc_pressed
	cmp	r8b, 0x13
	jz	r_pressed
	call	WriteStrX
	db 	'Unknown key ', 0
	mov	al, r8b
	call	WriteB
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

r_pressed:
	call	WriteStrX
	db	10, "cr0=", 0
	mov	rax, cr0
	call	WriteQW
	call	WriteStrX
	db	10, "cr2=", 0
	mov	rax, cr2
	call	WriteQW
	call	WriteStrX
	db	10, "cr3=", 0
	mov	rax, cr3
	call	WriteQW
	call	WriteStrX
	db	10, "cr4=", 0
	mov	rax, cr4
	call	WriteQW
	call	WriteStrX
	db	10, "cr8=", 0
	mov	rax, cr8
	call	WriteQW
	call	WriteStrX
	db	10, 0
	jmp	nextcmd

esc_pressed:
	jmp	dword far [rel bv]
bv:	dd	backtoreal
	dw	16

scroll_screen:
	cld
	mov	rdi, rsi
	movzx	rax, byte [rbp + 0x4a]
	push	rax
	lea	rsi, [rsi + 2 * rax]
	mov	cl, [rbp + 0x84]
	mul	cl
	mov	rcx, rax
	rep	movsw
	pop	rcx
	mov	ax, 0x0720
	rep	stosw
	ret

WriteChr:
	push	rbp
	push	rdi
	push	rsi
	push	rbx
	push 	rcx
	push	rdx
	push	rax
	mov	rdi, dword 0xB8000
	mov	rbp, dword 0x400
	cmp	byte [rbp + 0x63], 0xB4
	jnz	.skip5
	xor	di, di

.skip5:
	movzx	rbx, byte [rbp + 0x4E]
	add	rdi, rbx
	movzx	rbx, byte [rbp + 0x62]
	mov	rsi, rdi
	movzx	rcx, byte [rbx * 2 + rbp + 0x50 + 1]
	movzx	rax, word [rbp + 0x4A]
	mul	rcx
	movzx	rdx, byte [rbx * 2 + rbp + 0x50]
	add	rax, rdx
	mov	dh, cl
	lea	rdi, [rdi + rax * 2]
	mov	al, [rsp]
	cmp	al, 10
	jz	.newline
	mov	[rdi], al
	mov	byte [rdi + 1], 7
	inc	dl
	cmp	dl, byte [rbp + 0x4a]
	jb	.skip6

.newline:
	mov	dl, 0
	inc	dh
	cmp	dh, byte [rbp + 0x84]
	jbe	.skip6
	dec	dh
	call	scroll_screen

.skip6:
	mov	[rbx * 2 + rbp + 0x50], dx
	pop	rax
	pop	rdx
	pop	rcx
	pop	rbx
	pop	rsi
	pop	rdi
	pop	rbp
	ret

WriteStr:
	push	rsi
	mov	rsi, rdx
	cld

.skip7:
	lodsb
	and	al, al
	jz	.skip8
	call	WriteChr
	jmp	.skip7

.skip8:
	pop	rsi
	ret

WriteStrX:
	push	rsi
	mov	rsi, [rsp + 8]
	cld

.skip9:
	lodsb
	and	al, al
	jz	.skip10
	call	WriteChr
	jmp	.skip9

.skip10:
	mov	[rsp + 8], rsi
	pop	rsi
	ret

WriteQW:
	push	rax
	shr	rax, 32
	call	WriteDW
	pop	rax

WriteDW:
	push	rax
	shr	rax, 16
	call	WriteW
	pop	rax

WriteW:
	push	rax
	shr	rax, 8
	call	WriteB
	pop	rax

WriteB:
	push	rax
	shr	rax, 4
	call	WriteNb
	pop	rax

WriteNb:
	and	al, 0x0F
	add	al, '0'
	cmp	al, '9'
	jbe	.skip11
	add	al, 7

.skip11:
	jmp	WriteChr

exception:
%assign excno 0
%rep	32
	push	excno
	jmp	.skip12
%assign excno excno + 1
%endrep

.skip12:
	call	WriteStrX
	db	10, "Exception ", 0
	pop	rax
	call	WriteB
	call	WriteStrX
	db 	" errcode=", 0
	mov	rax, [rsp + 0]
	call	WriteQW
	call	WriteStrX
	db	" rip=", 0
	mov	rax, [rsp + 8]
	call	WriteQW
	call	WriteStrX
	db	10, 0

.skip13:
	jmp $

clock:
	push 	rbp
	mov	rbp, dword 0x400
	inc	dword [rbp + 0x6C]
	pop	rbp

interrupt:
	push	rax
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

keyboard:
	push	rax
	in	al, 0x60
	test	al, 0x80
	jnz	.skip14
	mov	r8b, al

.skip14:
	in	al, 0x61
	out	0x61, al
	mov	al, 0x20
	out	0x20, al
	pop	rax
	iretq

	SECTION .bss start=0x5e0 align=8
STACK:
	resb 	0

;	EXE_stack 4096
;	EXE_end
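
The three values queried with "why?" in the comments above can be read off
the x86-64 paging and descriptor layouts. A small sketch, using symbolic
names that are made up here and appear in neither listing: 7 is the present,
writable and user bits of a page-table entry; 0x8E00 is the attribute word
of a present, DPL 0, 64-bit interrupt gate; and 24 is the selector of GDT
entry 3 (3 * 8), the 16-bit data descriptor with a 64 KB limit that is
loaded into SS, DS and ES before dropping back to real mode.

```nasm
; Hypothetical symbolic names; none of them appear in the posted source.
PTE_PRESENT	equ	1 << 0		; page is present
PTE_WRITABLE	equ	1 << 1		; page may be written
PTE_USER	equ	1 << 2		; page is accessible from user mode
PTE_FLAGS	equ	PTE_PRESENT | PTE_WRITABLE | PTE_USER

GATE_PRESENT	equ	1 << 15		; P bit of the gate attribute word
GATE_TYPE_INT	equ	0xE << 8	; type 0xE = interrupt gate, DPL 0
GATE_ATTR	equ	GATE_PRESENT | GATE_TYPE_INT

SEL_CODE64	equ	1 * 8		; GDT entry 1: 64 bit code descriptor
SEL_CODE16	equ	2 * 8		; GDT entry 2: compatibility mode code
SEL_DATA16	equ	3 * 8		; GDT entry 3: compatibility mode data

; Assemble-time check against the literals used in the listing.
%if PTE_FLAGS != 111b || GATE_ATTR != 0x8E00 || SEL_DATA16 != 24
%error "constants do not match the values used in the listing"
%endif
```

Assembling this on its own with `nasm -f bin` produces no output bytes; the
`%if` block only fails the build if one of the names stops matching its
literal.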

Now I'm going to turn both of those into a proper .COM file and strip out
some code that won't be needed.
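
A minimal sketch of what the start of such a .COM file could look like,
assuming the usual .COM entry state (CS=DS=ES=SS=PSP, with DOS having given
the program all free memory). The label names and the 1 KB stack size are
placeholders, not taken from the code above. The stack is moved inside the
program before INT 21h/AH=4Ah shrinks the block, so nothing pushed later
lands in memory that has been handed back to DOS.

```nasm
	BITS	16
	ORG	0x100			; a .COM image is loaded at PSP:0x100

start:
	mov	sp, stack_top		; switch to a stack inside the part we keep
	mov	bx, stack_top + 15	; bytes needed, counted from the PSP, rounded up
	shr	bx, 4			; ... converted to paragraphs (186 or later)
	mov	ah, 0x4A		; resize the memory block at ES (still the PSP)
	int	0x21
	jc	.fail			; on error AX = code, BX = largest size available

	mov	ax, 0x4C00		; exit with return code 0
	int	0x21

.fail:
	mov	ax, 0x4C01		; exit with return code 1
	int	0x21

	SECTION	.bss align=16
	resb	1024			; hypothetical 1 KB stack
stack_top:
```

Because SP no longer points at the zero word DOS pushed at entry, the sketch
leaves through INT 21h/AH=4Ch rather than a plain `ret`.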
-- 
Tactical Nuclear Kittens

0
Reply alex.buell (44) 5/3/2011 10:48:11 AM

Freedom on the Oceans wrote:
> 
> On Mon, 2011-05-02 at 21:06 -0400, Rod Pemberton wrote:
> 
> > It works.  It's not pretty.  I decided to fix the code at those two spots.

I didn't follow this thread. Is the problem to convert
the wasm source to nasm source?

Here is a NASM version which generates an identical binary (with the
exception of some alternative encodings). You will need the include file
mac.inc to teach NASM the correct operand order and proper register
names ( http://www.bitlib.de/mac.inc ). I use NASM version 0.98.39, so
the 64 bit code is just included as binary bytes.
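
As a rough illustration of what "included as binary bytes" means, the short
`interrupt` handler from the earlier listings could be written with plain
`db` lines like this. The bytes are hand-encoded here for the example and
are not copied from the poster's file.

```nasm
	BITS	16			; no 64-bit support is needed to emit raw bytes

; Hand-encoded equivalent of the 'interrupt' handler (illustration only).
interrupt:
	db	0x50			; push rax
	db	0xB0, 0x20		; mov al, 0x20
	db	0xE6, 0x20		; out 0x20, al
	db	0x58			; pop rax
	db	0x48, 0xCF		; iretq (REX.W prefix + iret)
```

The cost of this approach is that labels inside the 64 bit part are no
longer visible to the assembler, so any offsets into it have to be kept in
step by hand.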


        ; nasm -O99  -f bin -o test test.asm

        %include "mac.inc"

        seg 16
        section .text vstart=0

        dc.b    'MZ'            ; Magic number
        dc.w    dosfilesize%($200) ; Bytes on last page of file (0->512)
        dc.w    (dosfilesize-1)/512+1           
                                ; Pages in file (Page=512 byte)
        dc.w    nr_reloc        ; Relocations (nr of entries)
        dc.w    doshead_size/16 ; Size of header in paragraphs (16 byte)
        dc.w    BSS_size        ; Minimum extra paragraphs needed
        dc.w    $0ffff          ; Maximum extra paragraphs needed
        dc.w    STACK           ; Initial (relative) SS value (ss=load_adr+nr)
        dc.w    dosstack        ; Initial SP value
        dc.w    0               ; Checksum
        dc.w    dosmain         ; Initial IP value
        dc.w    0               ; Initial (relative) CS value (cs=load_adr+nr)
        dc.w    reloc           ; File address of relocation table
        dc.w    0               ; Overlay number
        dc.w    0               ; Reserved words

reloc:  dc.l    reloc1 
        dc.l    reloc2 
        dc.l    reloc3 

nr_reloc equ ($-reloc)/4
        align 16

doshead_size equ $-$$

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

        TEXT16_ equ 0
        seg 16
        section .text16 vstart=0

                        ; Global Descriptors Table Register
GDTR_:  dc.w    4*8-1   ; limit of GDT (size minus one)
        dc.l    GDT     ; linear address of GDT

                        ; Interrupt Descriptor Table Register
IDTR_:  dc.w    256*16-1; limit of IDT (size minus one)
        dc.l    0       ; linear address of IDT
nullidt:dc.w    $3FF
        dc.l    0
  
;       align 8
        dc.b    $02e,$8b,$0c0,$2e,$8b,$0c0  ; wasm doesn't use NULL bytes for align

GDT:    dc.w    $0000,0,$0000,$000  ; null descriptor
        dc.w    $0FFFF,0,$9A00,$0AF  ; 64-bit code descriptor
        dc.w    $0FFFF,0,$9A00,$000  ; compatibility mode code descriptor
        dc.w    $0FFFF,0,$9200,$000  ; compatibility mode data descriptor

wPICMask: dc.w  0       ; variable to save/restore PIC masks



dosmain:move.w  s6,-[sp]    
        move.w  [sp]+,s0    
        move.w  s6,r0       
        movu.wl r0,r0       
        lsl.l   4,r0     
        add.l   r0,[GDTR_+2]     ; convert offset to linear address
        move.w  r0,[GDT+2*8+2]   
        move.w  r0,[GDT+3*8+2]   
        lsr.l   16,r0     
        move.b  r0,[GDT+2*8+4]    
        move.b  r0,[GDT+3*8+4]
   
        move.w  s7,r0       
        move.w  s1,r1       
        sub.w   r1,r0       
        move.w  r7,r3       
        lsr.w   4,r3     
        add.w   r0,r3       
        move.b  $4a,m0     
        trap    $21            ; free unused memory   
        move.w  s6,-[sp]    
        move.w  [sp]+,s1    
        move.w  s7,r0       
        move.w  s6,r1       
        sub.w   r1,r0       
        lsl.w   4,r0     
        add.w   r7,r0       
        move.w  s0,-[sp]    
        move.w  [sp]+,s7    
        move.w  r0,r7           ; make a TINY model, CS=SS=DS=ES
             
        move.w  cr0,r0       
        tst.b   1,r0     
        beq.b   .10       
        move.w  .err1,r1   
        move.b  $09,m0     
        trap    $21        
        move.b  $4c,m0     
        trap    $21        

.err1:  dc.b "Mode is V86. Need REAL mode to switch to LONG mode!",13,10,'$'

.10:

        eor.l   r1,r1                
        move.l  $80000001,r0   ; test if long-mode is supported     
        cpuid
        tst.l   $20000000,r1
        bne.b   .20
                      
        move.w  .err2,r1            
        move.b  $09,m0              
        trap    $21                 
        move.b  $4c,m0              
        trap    $21  

.err2: dc.b "No 64bit cpu detected.",13,10,'$'
               
.20:    move.w  $1000,r3            
        move.b  $48,m0              
        trap    $21                 
        bcc.b   .30
        move.w  .err3,r1            
        move.b  $09,m0              
        trap    $21                 
        move.b  $4c,m0              
        trap    $21  

.err3: dc.b "Out of memory",13,10,'$'

.30:   add.w    $100-1,r0      ; align to page boundary        
       move.b   $00,r0              
       move.w   r0,s1                


;--- setup page directories and tables

        sub.w   r6,r6                  
        move.w  4096,r2              
        sub.l   r0,r0                  
        rep_r2  move.l  r0,[r6.w]+-{s1}        ; clear 4 pages
        sub.w   r6,r6                  
        move.w  s1,r0                  
        movu.wl r0,r0                  
        lsl.l   4,r0                
        move.l  r0,cr3          ; load page-map level-4 base
        lea.l   [r0+$5000],r1       
        move.l  r1,[IDTR_+2]
                     
        orq.l   111b,r0                
        add.l   $1000,r0          
        move.l  r0,[s1:r6.w]           ; first PDP table       
        add.l   $1000,r0          
        move.l  r0,[s1:r6.w+$1000]      ; first page directory 
        add.l   $1000,r0          
        move.l  r0,[s1:r6.w+$2000]      ; first page table 
        move.w  $3000,r6               ; address of first page table 
        move.l  0+111b,r0          
        move.w  256,r2                 ; number of pages to map (1 MB)

.40:    move.l  r0,[r6.w]+-{s1}        
        addq.w  4,r6                
        add.l   $1000,r0          
        dbf.w   r2,.40

;--- setup ebx/rbx with linear address of _TEXT

        reloc1  equ $+1             
        move.w  TEXT_,r3         
        movu.wl r3,r3                  
        lsl.l   4,r3                
        add.l   r3,[llg]                  


;--- create IDT

        move.w  $5000,r6              
        move.w  32,r2              
        move.l  exception,r1      
        add.l   r3,r1                  

make_exc_gates:
        move.l  r1,r0                  
        move.w  r0,[r6.w]+-{s1}        
        move.w  8,r0              
        move.w  r0,[r6.w]+-{s1}        
        move.w  $8e00,r0              
        move.l  r0,[r6.w]+-{s1}        
        eor.l   r0,r0                  
        move.l  r0,[r6.w]+-{s1}        
        move.l  r0,[r6.w]+-{s1}        
        addq.l  4,r1                
        dbf.w   r2, make_exc_gates            

        move.w  256-32,r2

make_int_gates:
        move.l  interrupt,r0       
        add.l   r3,r0                  
        move.w  r0,[r6.w]+-{s1}        
        move.w  8,r0              
        move.w  r0,[r6.w]+-{s1}        
        move.w  $8e00,r0              
        move.l  r0,[r6.w]+-{s1}        
        eor.l   r0,r0                  
        move.l  r0,[r6.w]+-{s1}        
        move.l  r0,[r6.w]+-{s1}        
        dbf.w   r2, make_int_gates

             
        move.w  $5000,r6              
        move.l  r3,r0                  
        add.l   clock,r0     
        move.w  r0,[s1:r6.w+$80*16+0]   ; set IRQ 0 handle  
        lsr.l   16,r0                
        move.w  r0,[s1:r6.w+$80*16+6]   
        
        move.l  r3,r0                  
        add.l   keyboard,r0          
        move.w  r0,[s1:r6.w+$81*16+0]   
        lsr.l   16,r0                
        move.w  r0,[s1:r6.w+$81*16+6]     

;--- clear NT flag

        move.w  sr,-[sp]               
        move.w  [sp]+,r0               
        and.b   $0bf,m0                
        move.w  r0,-[sp]               
        move.w  [sp]+,sr    

;--- reprogram PIC: change IRQ 0-7 to INT 80h-87h, IRQ 8-15 to INT 88h-8Fh
           
        bclr.w  9,sr                  
        in.b    $0a1,r0                
        move.b  r0,m0                  
        in.b    $21,r0                
        move.w  r0,[wPICMask]               
        move.b  10001b,r0      ; begin PIC 1 initialization            
        out.b   r0,$20                
        move.b  10001b,r0      ; begin PIC 2 initialization         
        out.b   r0,0xa0                
        move.b  $80,r0         ; IRQ 0-7: interrupts 80h-87h         
        out.b   r0,$21                
        move.b  $88,r0         ; IRQ 8-15: interrupts 88h-8Fh        
        out.b   r0,$0a1                
        move.b  100b,r0        ; slave connected to IRQ2        
        out.b   r0,$21                
        move.b  2,r0                
        out.b   r0,$0a1                
        move.b  1,r0           ; Intel environment, manual EOI         
        out.b   r0,$21                
        out.b   r0,$0a1                
        in.b    $21,r0                
        move.b  11111100b,r0   ; enable only clock and keyboard IRQ           
        out.b   r0,$21                
        in.b    $0a1,r0                
        move.b  11111111b,r0                
        out.b   r0,$0a1  
                    
        move.l   cr4,r0                 
        orq.l    1<<5,r0                
        move.l   r0,cr4  ; enable physical-address extensions (PAE)

               
        move.l  $0c0000080,r2           ; EFER MSR
           
        rdmsr
        or.l    1<<8,r0        ; enable long mode                 
        wrmsr
      
        move.w  [GDTR_],gdtr               
        move.w  [IDTR_],idtr
                   
        move.w  s7,r2                  
        movu.wl r2,r2           ; get base of SS                 
        lsl.l   4,r2                
        add.l   r7,r2
                        
        move.l  cr0,r0                 
        or.l    $80000001,r0          
        move.l  r0,cr0          ; enable paging + pmode
       
        llg equ $+2             
        jmp.wl  $0008,long_start        ; jmp 0008:oooooooo   

;--- switch back to real-mode and exit

backtoreal:
     
        move.l  cr0,r0                 
        and.l   $7fffffff,r0           ; disable paging      
        move.l  r0,cr0                 
        move.l  $0c0000080,r2           ; EFER MSR
      
        rdmsr
        and.b   ~1,m0          ; disable long mode (EFER.LME=0)                 
        wrmsr
        
        move.w  24,r0    
        move.w  r0,s7                  
        move.w  r0,s0                  
        move.w  r0,s1                  
        move.l  cr0,r0                 
        and.b   $0fe,r0                
        move.l  r0,cr0          ; back to real mode
                   
        reloc2 equ $+3
        jmp.ww  $0000,_60    
_60:
        reloc3 equ $+1
        move.w  STACK,r0              
        move.w  r0,s7                  
        move.w  4096,r7              
        move.w  s6,-[sp]               
        move.w  [sp]+,s0               
        move.w  [nullidt],idtr             
      
      
        move.l  cr4,r0                 
        and.b   ~$20,r0                
        move.l  r0,cr4  ; disable physical-address extensions
              
        move.b  10001b,r0      ; begin PIC 1 initialization            
        out.b   r0,$20                
        move.b  10001b,r0      ; begin PIC 2 initialization           
        out.b   r0,$0a0                
        move.b  $08,r0         ; IRQ 0-7: back to ints 8h-Fh         
        out.b   r0,$21                
        move.b  $70,r0         ; IRQ 8-15: back to ints 70h-77h        
        out.b   r0,$0a1                
        move.b  100b,r0        ; slave connected to IRQ2        
        out.b   r0,$21                
        move.b  2,r0                
        out.b   r0,$0a1                
        move.b  1,r0           ; Intel environment, manual EOI     
        out.b   r0,$21                
        out.b   r0,$0a1                
        in.b    $21,r0                
        move.w  [wPICMask],r0     ; restore PIC masks           
        out.b   r0,$21                
        move.b  m0,r0                  
        out.b   r0,$0a1                
        bset.w  9,sr                  
        move.w  $4c00,r0              
        trap    $21   

TEXT16_size equ ($-$$+15)/16*16

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;--- here's the 64bit code segment.
;--- since 64bit code is always flat but the DOS mz format is segmented,
;--- there are restrictions, because the assembler doesn't know the
;--- linear address where the 64bit segment will be loaded:
;--- + direct addressing with constants isn't possible (mov [0B8000h],rax)
;---   since the rip-relative address will be calculated wrong.
;--- + 64bit offsets (mov rax, offset <var>) must be adjusted by the linear
;---   address where the 64bit segment was loaded (is in rbx).
;---
;--- rbx must preserve linear address of _TEXT

        align 16
        TEXT_ equ TEXT16_ + ($-$$)/16
;       seg 64
        section .text_ vstart=0


long_start:

dc.b $033,$0C0,$08E,$0D0,$08B,$0E1,$0FB,$0E8,$078,$001,$000,$000,$048,$065,$06C,$06C
dc.b $06F,$020,$036,$034,$062,$069,$074,$00A,$000,$041,$0B0,$000,$041,$080,$0F8,$000
dc.b $074,$0FA,$041,$080,$0F8,$001,$00F,$084,$095,$000,$000,$000,$041,$080,$0F8,$013
dc.b $074,$023,$0E8,$04D,$001,$000,$000,$075,$06E,$06B,$06E,$06F,$077,$06E,$020,$06B
dc.b $065,$079,$020,$000,$041,$08A,$0C0,$0E8,$073,$001,$000,$000,$0E8,$033,$001,$000
dc.b $000,$00A,$000,$0EB,$0C4,$0E8,$02A,$001,$000,$000,$00A,$063,$072,$030,$03D,$000
dc.b $00F,$020,$0C0,$0E8,$036,$001,$000,$000,$0E8,$017,$001,$000,$000,$00A,$063,$072
dc.b $032,$03D,$000,$00F,$020,$0D0,$0E8,$023,$001,$000,$000,$0E8,$004,$001,$000,$000
dc.b $00A,$063,$072,$033,$03D,$000,$00F,$020,$0D8,$0E8,$010,$001,$000,$000,$0E8,$0F1
dc.b $000,$000,$000,$00A,$063,$072,$034,$03D,$000,$00F,$020,$0E0,$0E8,$0FD,$000,$000
dc.b $000,$0E8,$0DE,$000,$000,$000,$00A,$063,$072,$038,$03D,$000,$044,$00F,$020,$0C0
dc.b $0E8,$0E9,$000,$000,$000,$0E8,$0CA,$000,$000,$000,$00A,$000,$0E9,$058,$0FF,$0FF
dc.b $0FF,$0FF,$02D,$000,$000,$000,$000,$097,$002,$000,$000,$010,$000,$0FC,$048,$08B
dc.b $0FE,$048,$00F,$0B7,$045,$04A,$050,$048,$08D,$034,$046,$08A,$08D,$084,$000,$000
dc.b $000,$0F6,$0E1,$048,$08B,$0C8,$0F3,$066,$0A5,$059,$066,$0B8,$020,$007,$0F3,$066
dc.b $0AB,$0C3,$055,$057,$056,$053,$051,$052,$050,$048,$0C7,$0C7,$000,$080,$00B,$000
dc.b $048,$0C7,$0C5,$000,$004,$000,$000,$080,$07D,$063,$0B4,$075,$003,$066,$033,$0FF
dc.b $048,$00F,$0B7,$05D,$04E,$048,$003,$0FB,$048,$00F,$0B6,$05D,$062,$048,$08B,$0F7
dc.b $048,$00F,$0B6,$04C,$05D,$051,$048,$00F,$0B7,$045,$04A,$048,$0F7,$0E1,$048,$00F
dc.b $0B6,$054,$05D,$050,$048,$003,$0C2,$08A,$0F1,$048,$08D,$03C,$047,$08A,$004,$024
dc.b $03C,$00A,$074,$00D,$088,$007,$0C6,$047,$001,$007,$0FE,$0C2,$03A,$055,$04A,$072
dc.b $013,$0B2,$000,$0FE,$0C6,$03A,$0B5,$084,$000,$000,$000,$076,$007,$0FE,$0CE,$0E8
dc.b $069,$0FF,$0FF,$0FF,$066,$089,$054,$05D,$050,$058,$05A,$059,$05B,$05E,$05F,$05D
dc.b $0C3,$056,$048,$08B,$0F2,$0FC,$0AC,$022,$0C0,$074,$007,$0E8,$072,$0FF,$0FF,$0FF
dc.b $0EB,$0F4,$05E,$0C3,$056,$048,$08B,$074,$024,$008,$0FC,$0AC,$022,$0C0,$074,$007
dc.b $0E8,$05D,$0FF,$0FF,$0FF,$0EB,$0F4,$048,$089,$074,$024,$008,$05E,$0C3,$050,$048
dc.b $0C1,$0E8,$020,$0E8,$001,$000,$000,$000,$058,$050,$048,$0C1,$0E8,$010,$0E8,$001
dc.b $000,$000,$000,$058,$050,$048,$0C1,$0E8,$008,$0E8,$001,$000,$000,$000,$058,$050
dc.b $048,$0C1,$0E8,$004,$0E8,$001,$000,$000,$000,$058,$024,$00F,$004,$030,$03C,$039
dc.b $076,$002,$004,$007,$0E9,$019,$0FF,$0FF,$0FF

exception:
dc.b $06A,$000,$0EB,$07C,$06A,$001,$0EB
dc.b $078,$06A,$002,$0EB,$074,$06A,$003,$0EB,$070,$06A,$004,$0EB,$06C,$06A,$005,$0EB
dc.b $068,$06A,$006,$0EB,$064,$06A,$007,$0EB,$060,$06A,$008,$0EB,$05C,$06A,$009,$0EB
dc.b $058,$06A,$00A,$0EB,$054,$06A,$00B,$0EB,$050,$06A,$00C,$0EB,$04C,$06A,$00D,$0EB
dc.b $048,$06A,$00E,$0EB,$044,$06A,$00F,$0EB,$040,$06A,$010,$0EB,$03C,$06A,$011,$0EB
dc.b $038,$06A,$012,$0EB,$034,$06A,$013,$0EB,$030,$06A,$014,$0EB,$02C,$06A,$015,$0EB
dc.b $028,$06A,$016,$0EB,$024,$06A,$017,$0EB,$020,$06A,$018,$0EB,$01C,$06A,$019,$0EB
dc.b $018,$06A,$01A,$0EB,$014,$06A,$01B,$0EB,$010,$06A,$01C,$0EB,$00C,$06A,$01D,$0EB
dc.b $008,$06A,$01E,$0EB,$004,$06A,$01F,$0EB,$000,$0E8,$026,$0FF,$0FF,$0FF,$00A,$045
dc.b $078,$063,$065,$070,$074,$069,$06F,$06E,$020,$000,$058,$0E8,$04F,$0FF,$0FF,$0FF
dc.b $0E8,$00F,$0FF,$0FF,$0FF,$020,$065,$072,$072,$063,$06F,$064,$065,$03D,$000,$048
dc.b $08B,$004,$024,$0E8,$016,$0FF,$0FF,$0FF,$0E8,$0F7,$0FE,$0FF,$0FF,$020,$072,$069
dc.b $070,$03D,$000,$048,$08B,$044,$024,$008,$0E8,$001,$0FF,$0FF,$0FF,$0E8,$0E2,$0FE
dc.b $0FF,$0FF,$00A,$000,$0EB,$0FE

clock:
dc.b $055,$048,$0C7,$0C5,$000,$004,$000,$000,$0FF,$045,$06C,$05D

interrupt:
dc.b $050,$0B0,$020,$0E6,$020,$058,$048,$0CF

keyboard:
dc.b $050,$0E4,$060,$0A8,$080,$075
dc.b $003,$044,$08A,$0C0,$0E4,$061,$0E6,$061,$0B0,$020,$0E6,$020,$058,$048,$0CF

TEXT_size equ $-$$

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

dosfilesize equ doshead_size + TEXT16_size + TEXT_size

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;--- 4k stack, used in both modes

STACK equ TEXT_+($-$$+15)/16    
section .bss vstart=0

     blk.b 4096

dosstack equ $ 

BSS_size equ ($-$$+15)/16 + 1   ; why +1 ????
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Reply klee6431 (4) 5/3/2011 4:10:15 PM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:b43598-vld.ln1@nntp.local.net...
> On Mon, 2011-05-02 at 21:06 -0400, Rod Pemberton wrote:
>
> > I decided to fix the code at those two spots.
> > One went well.  The other location doesn't seem
> > to want to have different
> > code there...  [...]  I don't know what's wrong.
>
> Yes, your version works very well, also I've finished
> commenting my code.
>

BTW, Herbert, in his version, showed us how to fix those two problem
locations using 'equ' and 'vstart' etc.  It also allowed me to fix the
subtract _TEXT values.  So, it's clean now.  Also, that code I said was
self-modifying, probably wasn't...  Also, you missed one 'byte' instead of
'word' fix.  Anyway, I'm posting a reworked version.  I also added your
comments to it.  Some lines will wrap.  So, you'll only need to add the
comments since you posted, and unwrap a few lines.  It might be nice to
have a version with Japheth's comments too.

HTH,


Rod Pemberton


;nasm -f bin
[map all dos64.map]

SECTION .text vstart=0x0
; 48 byte header

db 0x4D, 0x5A, 0x0F, 0x00, 0x04, 0x00, 0x03, 0x00
db 0x03, 0x00, 0x01, 0x01, 0xFF, 0xFF, 0x5E, 0x00
db 0x00, 0x10, 0x00, 0x00, 0x3A, 0x00, 0x00, 0x00
db 0x1E, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8F, 0x01
db 0x00, 0x00, 0xC4, 0x02, 0x00, 0x00, 0xC7, 0x02
db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00

;%include 'exebin.mac'

ORG -0x30 ; takes into account the header tacked on

; EXE_begin start16

BITS 16

_TEXT16 equ 0
SECTION .text1 vstart=0x0

; Global Descriptors Table Register
GDTR dw 4 * 8 - 1 ; GDTR size
dd GDT ; Linear address where GDT is stored

; Interrupt Descriptors Table Register
IDTR dw 256 * 16 - 1 ; IDTR size
dd 0 ; Linear address where IDT is kept
nullidt dw 0x3FF
dd 0

align 8
GDT dq 0 ; Null descriptor
dw 0xFFFF, 0, 0x9A00, 0x0AF ; 64 bit code descriptor
dw 0xFFFF, 0, 0x9A00, 0x000 ; Compatibility mode code descriptor
dw 0xFFFF, 0, 0x9200, 0x000 ; Compatibility mode data descriptor

wPICMask dw 0 ; used for saving/restoring PIC masks

start16:
push cs
pop ds
mov ax, cs
movzx eax, ax
shl eax, 4
add dword [GDTR + 2], eax ; convert offset to linear address
mov word [GDT + 2 * 8 + 2], ax
mov word [GDT + 3 * 8 + 2], ax
shr eax, 16
mov byte [GDT + 2 * 8 + 4], al
mov byte [GDT + 3 * 8 + 4], al

mov ax, ss ; stack segment address
mov dx, es ; extra segment address
sub ax, dx
mov bx, sp ; stack pointer
shr bx, 4 ; shift to the right four times
add bx, ax ; adds result of ax to bx

mov ah, 0x4A ; frees unused memory
int 0x21

push cs ; copies code segment into extra segment
pop es
mov ax, ss ; stack segment address
mov dx, cs ; code segment address
sub ax, dx
shl ax, 4 ; shift to the left four times
add ax, sp ; adds result of ax to sp
push ds ; copies data segment into stack segment
pop ss
mov sp, ax ; creates a TINY model (CS=SS=DS=ES)

smsw ax ; Are we running in V86?
test al, 1
jz skip1 ; Yay, we're in real mode

mov dx, err1 ; We're in V86 mode, it's byebye
mov ah, 9
int 0x21

mov ah, 0x4C ; Hand over to DOS
int 0x21

err1: db "Mode is V86. Need REAL mode to switch to LONG mode!", 13, 10,'$'

skip1:
xor edx, edx
mov eax, 0x80000001
cpuid
test edx, 0x20000000 ; Is LONG mode supported?
jnz skip2 ; Yay, we've got a 64 bit processor!

mov dx, err2 ; We're 32 bit, it's byebye
mov ah, 9
int 0x21

mov ah, 0x4C ; Hand over to DOS
int 0x21

err2: db "No 64bit cpu detected.", 13, 10, '$'

skip2:
mov bx, 0x1000 ; Allocate 4096 paragraphs
mov ah, 0x48
int 0x21

jnc skip3 ; Yay, we got what we wanted
mov dx, err3
mov ah, 9
int 0x21

mov ah, 0x4C ; Hand over to DOS
int 0x21

err3: db "Out of memory", 13, 10, '$'

skip3:
add ax, 0x100 - 1 ; Realigns to page boundary (adds 255 to it)
mov al, 0 ; Zeros out low 8 bits of register
mov es, ax ; Sets extra segment to address of allocated block

sub di, di ; Sets DI to zero
mov cx, 4096 ; Sets CX to 4096
sub eax, eax ; Zeros out EAX
rep stosd ; Clears 4 pages' worth (16KB) in ES:DI

sub di, di ; Clears DI
mov ax, es ; Copies extra segment into AX
movzx eax, ax ; Extends into EAX
shl eax, 4 ; Shift to left four times
mov cr3, eax ; Loads CR3 with the linear address of the PML4 (top-level page table)

lea edx, [eax + 0x5000] ; Load EDX with address in EAX plus 20KB (20,480 bytes)
mov dword [IDTR + 2], edx ; Stores linear address into IDTR + 2

or eax, 111b ; Sets low 3 bits to 1 (present, writable, user-accessible)
add eax, 0x1000 ; Adds 4096 to result
mov [es:di + 0x0000], eax ; Stores result into [ES:DI] (first PDP table)
add eax, 0x1000 ; Add 4096 to result
mov [es:di + 0x1000], eax ; Stores result into [ES:DI + 4096] (first page directory)
add eax, 0x1000 ; add 4096 to result
mov [es:di + 0x2000], eax ; Stores result into [ES:DI + 8192] (first page table)
mov di, 0x3000 ; Sets DI to 12288 (12KB) (address of first page table)
mov eax, 0 + 111b ; Physical address 0 plus flags: present, writable, user-accessible
mov cx, 256 ; Number of 4KB pages to map (1MB)

.skip4:
stosd ; Stores EAX into [ES:DI], DI incremented 4 bytes
add di, 4 ; Skips the upper 4 bytes of the 8-byte entry (left zero)
add eax, 0x1000 ; Adds 4096 to EAX
loop .skip4 ; Is CX still not zero yet?

mov bx, _TEXT ; Load BX with address of _TEXT
movzx ebx, bx ; Extends to EBX
shl ebx, 4 ; Shift to the left four times
add [llg], ebx ; Adds the linear base of _TEXT to the far-jump offset stored at llg

mov di, 0x5000 ; Loads DI with 0x5000 (20KB or 20,480 bytes)
mov cx, 32 ; Loads CX with 32
mov edx, exception ; Load EDX with pointer to exception
add edx, ebx ; Adds EBX to EDX

make_exc_gates:
mov eax, edx ; Loads EAX with EDX
stosw ; Stores AX into [ES:DI], DI incremented 2 bytes
mov ax, 8 ; Loads AX with 8
stosw ; Stores AX into [ES:DI], DI incremented 2 bytes
mov ax, 0x8E00 ; Gate attributes: present, DPL 0, type 0xE (interrupt gate), IST 0
stosd ; Stores EAX into [ES:DI], DI incremented 4 bytes
xor eax, eax ; Zeros out EAX
stosd ; Stores 0 into [ES:DI], DI incremented 4 bytes
stosd ; As above
add edx, 4 ; Adds 4 to EDX
loop make_exc_gates ; Is CX still not zero yet?

mov cx, 256 - 32 ; Loads CX with 224

make_int_gates:
mov eax, interrupt ; Loads EAX with pointer to interrupt
add eax, ebx ; Adds address of _TEXT in EBX to EAX
stosw ; Stores AX into [ES:DI], DI incremented 2 bytes
mov ax, 8 ; Loads AX with 8
stosw ; Stores AX into [ES:DI], DI incremented 2 bytes
mov ax, 0x8E00 ; Gate attributes: present, DPL 0, type 0xE (interrupt gate), IST 0
stosd ; Stores EAX into [ES:DI], DI incremented 4 bytes
xor eax, eax ; Zeros out EAX
stosd ; Stores 0 into [ES:DI], DI incremented 4 bytes
stosd ; As above
loop make_int_gates ; Is CX still not zero yet?

mov di, 0x5000 ; Loads DI with 0x5000 (20KB or 20,480 bytes)
mov eax, ebx ; Loads EAX with address of _TEXT in EAX
add eax, clock ; Adds offset to clock to EAX
mov [es:di + 0x80 * 16 + 0], ax ; IRQ 0 gate (vector 0x80): handler offset bits 0-15
shr eax, 16 ; Moves handler offset bits 16-31 into AX
mov [es:di + 0x80 * 16 + 6], ax ; IRQ 0 gate: handler offset bits 16-31

mov eax, ebx ; Loads EAX with address of _TEXT in EAX
add eax, keyboard ; Adds offset to keyboard to EAX
mov [es:di + 0x81 * 16 + 0], ax ; IRQ 1 gate (vector 0x81): handler offset bits 0-15
shr eax, 16 ; Moves handler offset bits 16-31 into AX
mov [es:di + 0x81 * 16 + 6], ax ; IRQ 1 gate: handler offset bits 16-31

pushf ; Clears the NT flag in readiness for interrupt handling
pop ax
and ah, 0xBF
push ax
popf

cli ; Disable interrupts
in al, 0xA1
mov ah, al
in al, 0x21
mov [wPICMask], ax ; Copies PIC contents to restore later
mov al, 10001b ; Initialise PIC 1
out 0x20, al
mov al, 10001b ; Initialise PIC 2
out 0xA0, al
mov al, 0x80 ; IRQ 0-7 handles interrupt 0x80 .. 0x87
out 0x21, al
mov al, 0x88 ; IRQ 8-15 handles interrupt 0x88 .. 0x8F
out 0xA1, al
mov al, 100b ; slave to IRQ 2
out 0x21, al
mov al, 2
out 0xA1, al
mov al, 1 ; ICW4: 8086 mode, manual EOI
out 0x21, al
out 0xA1, al
in al, 0x21
mov al, 11111100b ; Only enable clock and keyboard IRQs
out 0x21, al
in al, 0xA1
mov al, 11111111b
out 0xA1, al

mov eax, cr4
or eax, 1 << 5
mov cr4, eax ; Enables PAE (CR4 bit 5), required before LONG mode

mov ecx, 0xC0000080 ; EFER MSR

rdmsr
or eax, 1 << 8 ; Enables LONG mode
wrmsr

lgdt [GDTR] ; Loads GDT
lidt [IDTR] ; Loads IDT

mov cx, ss ; Loads CX with address of SS segment
movzx ecx, cx ; Extends to ECX
shl ecx, 4 ; Shift to left four times
add ecx, esp

mov eax, cr0
or eax, 0x80000001 ; Enable paging and protected mode
mov cr0, eax

db 0x66, 0xEA ; jmp far 0x8:0x0
llg dd long_start ; (64 bit routine)
dw 8

backtoreal:
mov eax, cr0
and eax, 0x7FFFFFFF ; Disable paging (clears CR0.PG)
mov cr0, eax

mov ecx, 0xC0000080 ; EFER MSR

rdmsr
and ah, ~1 ; Disables LONG mode (clears EFER.LME, bit 8 of EAX)
wrmsr

mov ax, 24 ; Selector 24 (0x18): the fourth GDT entry, a 16-bit data descriptor
mov ss, ax
mov ds, ax
mov es, ax ; sets SS=DS=ES with 24

mov eax, cr0
and al, 0xFE ; Clears CR0.PE: back to REAL mode
mov cr0, eax

db 0xEA ; jmp _TEXT16:$+4
dw $+4
dw _TEXT16

mov ax, STACK ; Reinitalise SS with STACK
mov ss, ax
mov sp, 4096 ; Sets up stack pointer

push cs
pop ds ; DS=CS

lidt [nullidt] ; Loads IDT with null descriptor

mov eax, cr4
and al, ~0x20 ; Disables PAE again for REAL mode
mov cr4, eax

mov al, 10001b ; Initialise PIC 1
out 0x20, al
mov al, 10001b ; Initialise PIC 2
out 0xA0, al
mov al, 0x08 ; IRQ 0 - 7 now handles interrupts 0x08 .. 0x0F
out 0x21, al
mov al, 0x70 ; IRQ 8 - 15 now handles interrupts 0x70 ... 0x77
out 0xA1, al
mov al, 100b ; slaved to IRQ 2
out 0x21, al
mov al, 2
out 0xA1, al
mov al, 1 ; ICW4: 8086 mode, manual EOI
out 0x21, al
out 0xA1, al
in al, 0x21
mov ax, [wPICMask] ; Restore original PIC
out 0x21, al
mov al, ah
out 0xA1, al
sti ; Enable interrupts

mov ax, 0x4C00 ; Hands over to DOS
int 0x21


BITS 64

_TEXT equ _TEXT16 + ($-$$+15)/16
SECTION .text2 vstart=0x0 align=8

; 64 bit code - rbx must preserve linear address of _TEXT

long_start:
xor eax, eax
mov ss, eax
mov esp, ecx
sti ; Enable interrupts

call WriteStrX ; Yay, LONG mode!
db 'Hello 64bit', 10, 0

nextcmd:
mov r8b, 0

nocmd:
cmp r8b, 0
jz nocmd
cmp r8b, 1
jz esc_pressed
cmp r8b, 0x13
jz r_pressed
call WriteStrX
db 'unknown key ', 0
mov al, r8b
call WriteB
call WriteStrX
db 10, 0
jmp nextcmd

r_pressed:
call WriteStrX
db 10, "cr0=", 0
mov rax, cr0
call WriteQW
call WriteStrX
db 10, "cr2=", 0
mov rax, cr2
call WriteQW
call WriteStrX
db 10, "cr3=", 0
mov rax, cr3
call WriteQW
call WriteStrX
db 10, "cr4=", 0
mov rax, cr4
call WriteQW
call WriteStrX
db 10, "cr8=", 0
mov rax, cr8
call WriteQW
call WriteStrX
db 10, 0
jmp nextcmd

esc_pressed:
jmp dword far [rel bv]
bv: dd backtoreal
dw 16

scroll_screen:
cld
mov rdi, rsi
movzx rax, word [rbp + 0x4a]
push rax
lea rsi, [rsi + 2 * rax]
mov cl, [rbp + 0x84]
mul cl
mov rcx, rax
rep movsw
pop rcx
mov ax, 0x0720
rep stosw
ret

WriteChr:
push rbp
push rdi
push rsi
push rbx
push rcx
push rdx
push rax
mov rdi, dword 0xB8000
mov rbp, dword 0x400
cmp byte [rbp + 0x63], 0xB4
jnz .skip5
xor di, di

.skip5:
movzx rbx, word [rbp + 0x4E]
add rdi, rbx
movzx rbx, byte [rbp + 0x62]
mov rsi, rdi
movzx rcx, byte [rbx * 2 + rbp + 0x50 + 1]
movzx rax, word [rbp + 0x4A]
mul rcx
movzx rdx, byte [rbx * 2 + rbp + 0x50]
add rax, rdx
mov dh, cl
lea rdi, [rdi + rax * 2]
mov al, [rsp]
cmp al, 10
jz .newline
mov [rdi], al
mov byte [rdi + 1], 7
inc dl
cmp dl, byte [rbp + 0x4a]
jb .skip6

.newline:
mov dl, 0
inc dh
cmp dh, byte [rbp + 0x84]
jbe .skip6
dec dh
call scroll_screen

.skip6:
mov [rbx * 2 + rbp + 0x50], dx
pop rax
pop rdx
pop rcx
pop rbx
pop rsi
pop rdi
pop rbp
ret

WriteStr:
push rsi
mov rsi, rdx
cld

.skip7:
lodsb
and al, al
jz .skip8
call WriteChr
jmp .skip7

.skip8:
pop rsi
ret

WriteStrX:
push rsi
mov rsi, [rsp + 8]
cld

.skip9:
lodsb
and al, al
jz .skip10
call WriteChr
jmp .skip9

.skip10:
mov [rsp + 8], rsi
pop rsi
ret

WriteQW:
push rax
shr rax, 32
call WriteDW
pop rax

WriteDW:
push rax
shr rax, 16
call WriteW
pop rax

WriteW:
push rax
shr rax, 8
call WriteB
pop rax

WriteB:
push rax
shr rax, 4
call WriteNb
pop rax

WriteNb:
and al, 0x0F
add al, '0'
cmp al, '9'
jbe .skip11
add al, 7

.skip11:
jmp WriteChr

exception:
%assign excno 0
%rep 32
push excno
jmp .skip12
%assign excno excno + 1
%endrep

.skip12:
call WriteStrX
db 10, "Exception ", 0
pop rax
call WriteB
call WriteStrX
db " errcode=", 0
mov rax, [rsp + 0]
call WriteQW
call WriteStrX
db " rip=", 0
mov rax, [rsp + 8]
call WriteQW
call WriteStrX
db 10, 0

.skip13:
jmp $

clock:
push rbp
mov rbp, dword 0x400
inc dword [rbp + 0x6C]
pop rbp

interrupt:
push rax
mov al, 0x20
out 0x20, al
pop rax
iretq

keyboard:
push rax
in al, 0x60
test al, 0x80
jnz .skip14
mov r8b, al

.skip14:
in al, 0x61
out 0x61, al
mov al, 0x20
out 0x20, al
pop rax
iretq

STACK equ _TEXT + ($-$$+15)/16
SECTION .bss vstart=0x0


Reply do_not_have7664 (117) 5/4/2011 6:15:07 AM

On Wed, 2011-05-04 at 02:15 -0400, Rod Pemberton wrote:
> BTW, Herbert, in his version, showed us how to fix those two problem
> locations using 'equ' and 'vstart' etc.  It also allowed me to fix the
> subtract _TEXT values.  So, it's clean now.  Also, that code I said was
> self-modifying, probably wasn't...  Also, you missed one 'byte' instead of
> 'word' fix.  Anyway, I'm posting a reworked version.  I also added your
> comments to it.  Some lines will wrap.  So, you'll only need to add the
> comments since you posted, and unwrap a few lines.  It might be nice to
> have a version with Japheth's comments too.

Your code works perfectly. My version is a pure .COM version. At one
point I had it working, but then I changed something (I can't remember
what), and it hasn't worked since. It must be something really simple;
I just have to find it.

I have been testing it with 386SWAT but it crashes totally under
VirtualBox. Any other advanced debugger I can try? There must be a
debugger out there that can debug through the switch from real mode to
protected/long mode.
--
Tactical Nuclear Kittens

Reply alex.buell (44) 5/4/2011 1:25:38 PM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:jn0898-evf.ln1@nntp.local.net...
>
> I have been testing it with 386SWAT
...

> Any other advanced debugger I can try? There must be a
> debugger out there that can debug through the switch from
> real mode to protected/long mode.
>

Uh, wow...  386SWAT is the debugger that claims to do that.

You could also look at Japheth's debxxxf.  Japheth says "it may" do a bunch
of stuff like that:
http://www.japheth.de/


Rod Pemberton
PS. added alt.lang.asm, alt.os.development


Reply do_not_have7664 (117) 5/4/2011 6:56:54 PM

On Wed, 2011-05-04 at 14:56 -0400, Rod Pemberton wrote:
> > Any other advanced debugger I can try? There must be a
> > debugger out there that can debug through the switch from
> > real mode to protected/long mode
>
> Uh, wow...  386SWAT is the debugger that it claims to do that.
>
> You could also look at Japheth's debxxxf.  Japheth says "it may" do a
> bunch of stuff like that:
> http://www.japheth.de/

Thanks, taking a look at that. I have now stopped work on dos64 as I no
longer have use of the amd64 box for testing. However, I've now started
working on dos32, a version that will do pretty much the same thing in
32 bit protected mode.

Right now, I'm trying to get it to flip into protected mode and write a
character to the screen.
--
Tactical Nuclear Kittens

Reply alex.buell (44) 5/4/2011 8:14:32 PM

"Freedom on the Oceans" <alex.buell@nospicedham.munted.org.uk> wrote in
message news:amo898-prf.ln1@nntp.local.net...
>
> I've now started working on dos32, a version that will do
> pretty much the same thing in 32 bit protected mode.
>
> Right now, I'm trying to get it to flip into protected mode
> and write a character to the screen.
>

That's much easier.  Disable interrupts.  Set up a GDT with NULL, code, and
data descriptors.  Set up the code and data descriptors as 4GB segments with
base address zero, one typed as code and one as data.  Read CR0, set CR0.PE,
and write CR0 back.  That enables PM.  Do a far jump to the next instruction.
That activates PM by loading CS with your code selector and flushing the
prefetch queue.  Load the other segment registers (DS, ES, FS, GS, SS) with
the data selector.  That's all that's needed.  Optionally, you can set up an
IDT for interrupts and enable interrupts, set up and activate paging, set
ESP for a stack, add PM-to-RM and RM-to-PM switches or v86 mode, disable NMI,
reset the PICs, add a Grub MB header, etc.  There are lots of good examples of
this on the 'net or in alt.os.development archives for bootloaders, for OS
startups, for DOS, for Linux, in C, in asm, etc.
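
The flat descriptors described above can be sketched in NASM as follows. This is only an illustration: the macro name GDT_ENTRY and the labels are hypothetical, and the macro only works for constant base values, since NASM cannot shift or mask a relocatable label.

```nasm
; Builds one 8-byte segment descriptor.
; Parameters: base (32-bit constant), limit (20-bit), access byte, flags nibble
%macro GDT_ENTRY 4
        dw      (%2) & 0xFFFF                   ; limit bits 0..15
        dw      (%1) & 0xFFFF                   ; base bits 0..15
        db      ((%1) >> 16) & 0xFF             ; base bits 16..23
        db      %3                              ; access byte (present, DPL, type)
        db      (((%4) & 0x0F) << 4) | (((%2) >> 16) & 0x0F)  ; flags, limit bits 16..19
        db      ((%1) >> 24) & 0xFF             ; base bits 24..31
%endmacro

gdt:
        GDT_ENTRY 0, 0, 0, 0                    ; selector 0x00: NULL descriptor
        ; flags 1100b = 4KB granularity + 32-bit default size, so limit 0xFFFFF means 4GB
        GDT_ENTRY 0, 0xFFFFF, 10011010b, 1100b  ; selector 0x08: code, execute/read
        GDT_ENTRY 0, 0xFFFFF, 10010010b, 1100b  ; selector 0x10: data, read/write
gdt_end:
```

Each flat entry assembles to the byte sequence FF FF 00 00 00, the access byte, CF, 00, which is the usual hand-written form of such a descriptor.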


Rod Pemberton




Reply do_not_have7664 (117) 5/4/2011 10:30:47 PM

"Freedom on the Oceans" wrote in message
> However, I've now started working on dos32, a version that
> will do pretty much the same thing in 32 bit protected mode.

> Right now, I'm trying to get it to flip into protected mode
> and write a character to the screen.

Deja vu!


Mike Gonta
look and see - many look but few see

http://aeBIOS.com



Reply mikegonta3947 (9) 5/4/2011 10:32:56 PM

On Wed, 2011-05-04 at 18:30 -0400, Rod Pemberton wrote:
> > Right now, I'm trying to get it to flip into protected mode
> > and write a character to the screen.
> >
>
> That's much easier.  Disable interrupts.  Setup a GDT with NULL, code, and
> data descriptors.  Setup the code and data descriptors with 4GB segment with
> base address zero and for code or data, respectively. Get CR0.  Set CR0.PE.
> Save CR0.  That enables PM.  Do a far jump to the next instruction.  That
> activates PM by setting CS selector to your code selector, and flushing the
> instruction cache.  Setup the other selectors (DS, ES, FS, GS, SS) to the
> data selector.  That's all that's needed.  Optionally, you can setup an IDT
> for interrupts and enable interrupts, or setup and activate paging, or set
> ESP for a stack, PM-to-RM and RM-to-PM switches, or v86 mode, disable NMI,
> reset PICs, add a Grub MB header, etc.  There are lots of good examples of
> this on the 'net or in alt.os.development archives for bootloaders, for OS
> startups, for DOS, for Linux, in C, in asm, etc.

That is exactly what I've coded, yet it hangs. There must be something
I'm totally missing.

	ORG	0x100
	BITS	16

	jmp	start16

gdtinfo:
	dw	gdt_end - gdt - 1
	dd	gdt

gdt:
	dq	0			; always unused
gdt_code:
	dw	0xFFFF
	dw	0
	db	0
	db	10011010b
	db	11001111b
	db 	0
gdt_data:
	dw	0xFFFF
	dw	0
	db	0
	db	10010010b
	db	11001111b
	db	0
gdt_end:

start16:
	smsw	ax
	test	al, 1
	jz	.skip1

	mov	dx, .err1
	mov	ah, 9
	int	0x21

	mov	ah, 0x4C
	int	0x21

.err1:	db 	"Mode is V86. Need REAL mode to switch to protected mode!", 13, 10, '$'

.skip1:
	cli
	lgdt 	[gdtinfo]

	mov	eax, cr0
	or	al, 1
	mov	cr0, eax		; enters protected mode

	jmp	0x08:protected_mode 	; flush pipeline and go 32 bits

	BITS	32

protected_mode:
	mov	eax, 0x10
	mov	ds, ax
	mov	es, ax
	mov	fs, ax
	mov	gs, ax
	mov	ss, ax

	mov	eax, 0xB8000
	mov	word [ds:eax], 0x0F01

	sti

loop:	hlt
	jmp	loop


--
Tactical Nuclear Kittens

Reply alex.buell (44) 5/4/2011 11:21:12 PM

Freedom on the Oceans wrote:

...
> gdtinfo:
> 	dw	gdt_end - gdt - 1
> 	dd	gdt

This needs to be a linear address.

mov eax, cs
shl eax, 4
add [gdtinfo + 2], eax

... somewhere before you lgdt. And I don't think you can sti until
you've got an idt set up...
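
For reference, here is a hedged sketch of the whole switch with that fix-up applied. It makes some assumptions that are not in the posted code: it runs as a .COM file in real mode (so CS = DS = SS, and the V86 check is left out for brevity), the labels pm_target and .hang are made up for the example, and interrupts stay disabled because there is no IDT. With base-zero descriptors the far-jump target has to be a linear address as well, and the jump needs an operand-size prefix to carry a 32-bit offset, so the sketch patches that too.

```nasm
        ORG     0x100
        BITS    16

        jmp     short start16

gdtinfo:
        dw      gdt_end - gdt - 1       ; GDT limit
        dd      gdt                     ; segment offset for now, fixed up below

gdt:
        dq      0                                              ; null descriptor
        db      0xFF, 0xFF, 0, 0, 0, 10011010b, 11001111b, 0   ; 0x08: flat code
        db      0xFF, 0xFF, 0, 0, 0, 10010010b, 11001111b, 0   ; 0x10: flat data
gdt_end:

start16:
        xor     eax, eax
        mov     ax, cs
        shl     eax, 4                  ; EAX = linear base of this segment
        add     [gdtinfo + 2], eax      ; GDT base becomes a linear address
        add     [pm_target], eax        ; so does the far-jump offset
        movzx   ebx, sp
        add     ebx, eax                ; EBX = linear stack top (SS = CS in a .COM)

        cli                             ; interrupts stay off: there is no IDT
        lgdt    [gdtinfo]
        mov     eax, cr0
        or      al, 1
        mov     cr0, eax                ; set CR0.PE

        db      0x66, 0xEA              ; far jump with a 32-bit offset
pm_target:
        dd      protected_mode          ; offset, made linear at run time
        dw      0x08                    ; code selector

        BITS    32
protected_mode:
        mov     ax, 0x10                ; data selector
        mov     ds, ax
        mov     es, ax
        mov     fs, ax
        mov     gs, ax
        mov     ss, ax
        mov     esp, ebx                ; keep using the old stack via the flat segment

        mov     word [0xB8000], 0x0F01  ; character 0x01, attribute 0x0F, top-left cell
.hang:
        hlt
        jmp     .hang
```

The xor/mov pair is used instead of a 32-bit move from CS so that the upper half of EAX is known to be zero before the shift.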

Best,
Frank
Reply fbkotler9831 (124) 5/5/2011 1:50:07 AM

Frank Kotler figured:

> Freedom on the Oceans wrote:

> ...
>> gdtinfo:
>> dw gdt_end - gdt - 1
>> dd gdt

> This needs to be a linear address.

> mov eax, cs
> shl eax, 4
> add [gdtinfo + 2], eax

> ... somewhere before you lgdt. And I don't think you can sti until you've
> got an idt set up...

Yes, and GDT entries should be aligned on 8-byte boundaries.
This 'jmp start16' might be assembled as a two-byte instruction and so
(together with the six-byte GDTR copy) meet this requirement by coincidence?

I just use the otherwise redundant null entry to hold the GDTR copy
and the remaining two bytes to store the real-mode backlink segment.
Got three flies with one strike then!
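
A minimal NASM sketch of that layout, assuming a .COM-style program where DS = CS; the labels rm_seg and setup_gdt are hypothetical names chosen for the example:

```nasm
        BITS    16

        align   8
gdt:                                    ; entry 0 is never used by the CPU, so:
        dw      gdt_end - gdt - 1       ; bytes 0-1: GDTR limit
        dd      gdt                     ; bytes 2-5: GDTR base (offset until fixed up)
rm_seg: dw      0                       ; bytes 6-7: real-mode segment backlink
        db      0xFF, 0xFF, 0, 0, 0, 10011010b, 11001111b, 0   ; selector 0x08: code
        db      0xFF, 0xFF, 0, 0, 0, 10010010b, 11001111b, 0   ; selector 0x10: data
gdt_end:

; Call once, in real mode, before switching.
setup_gdt:
        mov     [rm_seg], cs            ; remember the segment to come back to
        xor     eax, eax
        mov     ax, cs
        shl     eax, 4                  ; EAX = linear base of this segment
        add     [gdt + 2], eax          ; base field becomes a linear address
        lgdt    [gdt]                   ; the null entry doubles as the lgdt operand
        ret
```

Because the fix-up adds to the base field in place, calling setup_gdt a second time would add the segment base twice.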
__
wolfgang


Reply nowhere583 (184) 5/5/2011 8:46:05 AM

wolfgang kern wrote:
> Frank Kotler figured:
> 
>> Freedom on the Oceans wrote:
> 
>> ...
>>> gdtinfo:
>>> dw gdt_end - gdt - 1
>>> dd gdt
> 
>> This needs to be a linear address.
> 
>> mov eax, cs
>> shl eax, 4
>> add [gdtinfo + 2], eax
> 
>> ... somewhere before you lgdt. And I don't think you can sti until you've
>> got an idt set up...
> 
> Yes, and GDT entries need to be aligned on 8-byte boundaries.

Oops, forgot that one. This could be a "problem" with certain versions 
of Nasm. I forget exactly where it started and ended, but some versions 
produce an insane number of zeros at an "align" directive. Fortunately...

> This 'jmp start16' might assemble to a two-byte instruction and so,
> together with the six-byte GDTR copy, meet that requirement by coincidence?

Happy coincidence, in any case. Good old 0.98.39 might assemble that as 
a "near" jump (three bytes), though...

Won't hurt to specify it as "jmp short start16", in any case.

> I just use the otherwise redundant Null-entry to hold the GDTR-copy

That's still redundant in 64-bit code, then? I was concerned to see 
Japheth set ss to zero...

> and the remaining two bytes to store the realmode backlink segment.
> Got three flies in one strike then!

Good swattin'! :)

Best,
Frank
Reply fbkotler9831 (124) 5/5/2011 11:31:32 AM

Hi,

On May 2, 5:06 pm, Freedom on the Oceans
<alex.bu...@nospicedham.munted.org.uk> wrote:
> On Mon, 2011-05-02 at 17:19 -0400, Frank Kotler wrote:
> > I've gotta upgrade my own version of Nasm, too! I'm still running
> > 2.10rc2-20101108... There's a down side to Nasm not being "dead" after
> > all! Testing every single update gets to be a PITA, even for me.
> >
> Did my patch for building NASM under MSDOS ever get accepted into main
> line - sent it in a couple of years back (2.04.x?) IIRC some files
> needed renaming to fit within the 8.3 filename format mandated by DOS.
> --
> Tactical Nuclear Kittens

I'm no NASM developer, but since it's right up my alley (and nobody
else answered yet, probably lost among all the other messages), lemme
say this ...

I think I remember you mentioning this a few months back. I think I
pointed you to my .BAT "patch" that let DJGPP compile 2.08.01 in pure
DOS. So, if they ever did let it compile in pure DOS previously, I'm
unaware of it. Part of the problem was the switch to Autotools, which
means you have to "./configure && make", which is a pain (needs
various things), esp. to generate config.h, and that needs LFNs (and
I'm not totally convinced even DOSLFN would work, my limited tests
weren't encouraging). I never updated the .BAT since then, too lazy /
busy, didn't see a huge need. Nobody else really cares about DOS
compatibility, honestly.

On another more encouraging note, they did fix it where the
openwcom.mak file would build for DOS target, at least. It had a few
issues though, but (finally) they patched in my (trivial) changes a
few months ago so that it did indeed work to build *on* DOS *for* DOS.
I have not tried native DOS with the latest version (32-bit cpu
disconnected, which disorganized me), but it *should* work, in theory.
Hmmm, I keep forgetting about DOSEMU, but I don't typically use Linux
much, at least not for DOS development (not as comfortable there).
Bah, I'm so disorganized.  :-(

NASM 2.x needs C99 and at least 32-bit ints, but I don't know of any
other compilers for DOS that have been proven to work besides DJGPP
(GCC) and OpenWatcom. Perhaps CC386 would, but I've not tried (and not
optimistic, should I be??).

P.S. Go to http://www.bttr-software.de/forum/ if you want Japheth's
attention.
Reply rugxulo3735 (22) 5/5/2011 5:46:18 PM

Frank Kotler asked:
...
>> I just use the otherwise redundant Null-entry to hold the GDTR-copy

> That's still redundant in 64-bit code, then? I was concerned to see 
> Japheth set ss to zero...

Perhaps an attempt to produce an exception as trap for mode change.

As mentioned earlier in my other reply, DS ES SS have no meaning
during 64-bit mode and are treated as zero-based regardless of
their contents.
But data descriptors become valid again in compatibility mode (the
legacy sub-mode of long mode).

__
wolfgang
 


Reply nowhere583 (184) 5/5/2011 9:31:56 PM
