AMD64 environment

I came across a thread the other day in comp.lang.c.  It referred to
AMD64's new RIP-relative addressing mode.  For some reason I had it
stuck in my brain that this was only the target of branch instructions
and not data.  I don't even understand why I had that thought, but
nonetheless there it is.

In that thread, On Wednesday, December 14, 2016 at 4:01:41 PM UTC-5,
Robert Wessel wrote:
> RIP for data addresses was added in x86-64.  IP relative addressing
> for branches has been there since the 8086.

I was curious.  How does AMD64's operating environment work?  Does this
new RIP-relative addressing assume a selector like CS:, DS:, or SS:?
Or does it work in a flat mode where CS: = DS: = ES: = SS:?  Or ... ?

When the code exists:

    mov   rax,data[rip]

Would this ultimately access CS:[RIP+offset data]?  Or some other
selector?  Or ... ?

Thank you in advance.

Best regards,
Rick C. Hodgin
Rick
12/20/2016 1:22:57 PM
comp.lang.asm.x86

On 20/12/2016 13:22, Rick C. Hodgin wrote:
> I came across a thread the other day in comp.lang.c.  It referred to
> AMD64's new RIP-relative addressing mode.

"New" is a matter of perspective, as it is 13 years old now.

>  For some reason I had it
> stuck in my brain that this was only the target of branch instructions
> and not data.  I don't even understand why I had that thought, but
> nonetheless there it is.
> 
> In that thread, On Wednesday, December 14, 2016 at 4:01:41 PM UTC-5,
> Robert Wessel wrote:
>> RIP for data addresses was added in x86-64.  IP relative addressing
>> for branches has been there since the 8086.
> 
> I was curious.  How does AMD64's operating environment work?

There is a specific encoding of ModRM/SIB which uses %rip as a base
register.  All other memory operand construction is the same.
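
As a rough illustration of that encoding, here is a minimal Python
sketch that decodes just one instruction form, mov rax,[rip+disp32]
(bytes 48 8B 05 followed by a 32-bit displacement).  In 64-bit mode a
ModRM byte with mod=00 and r/m=101 selects RIP-relative addressing.
The function name decode_rip_relative and the sample address are made
up for the example; this is not a general disassembler.

```python
import struct

def decode_rip_relative(code: bytes, address: int) -> int:
    """Return the effective address of 'mov rax, [rip+disp32]'.

    Only handles the 7-byte form: REX.W (0x48), opcode 0x8B, ModRM, disp32.
    """
    if len(code) < 7 or code[0] != 0x48 or code[1] != 0x8B:
        raise ValueError("not the REX.W + 8B form this sketch handles")
    modrm = code[2]
    mod = modrm >> 6        # top two bits of ModRM
    rm = modrm & 0b111      # low three bits of ModRM
    if mod != 0b00 or rm != 0b101:
        raise ValueError("ModRM does not select RIP-relative addressing")
    # disp32 is a signed little-endian value right after the ModRM byte.
    (disp32,) = struct.unpack_from("<i", code, 3)
    next_rip = address + 7  # RIP points at the *next* instruction
    return (next_rip + disp32) & 0xFFFFFFFFFFFFFFFF

# Hypothetical instruction located at 0x401000 with displacement 0x2000.
insn = bytes([0x48, 0x8B, 0x05]) + struct.pack("<i", 0x2000)
target = decode_rip_relative(insn, 0x401000)
print(hex(target))
```

Note that the displacement is added to the address of the following
instruction, not to the address of the mov itself.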

> Does this new RIP-relative addressing assume a selector like CS:, DS:, or SS:?
> Or does it work in a flat mode where CS: = DS: = ES: = SS:?  Or ... ?

It is just a memory operand, and will use the appropriate implicit
selector, or explicitly-overridden selector.

> 
> When the code exists:
> 
>     mov   rax,data[rip]
> 
> Would this ultimately access CS:[RIP+offset data]?  Or some other
> selector?  Or ... ?

%ds, because it is a plain data reference.  You can use explicit segment
overrides to change this if you wish.

~Andrew
Andrew
12/20/2016 1:47:17 PM
On Tuesday, December 20, 2016 at 8:54:18 AM UTC-5, Andrew Cooper wrote:
> On 20/12/2016 13:22, Rick C. Hodgin wrote:
> > I came across a thread the other day in comp.lang.c.  It referred to
> > AMD64's new RIP-relative addressing mode.
> 
> "New" is a matter of perspective, as it is 13 years old now.

I was thinking "new ability added to the x86 instruction set."

> >  For some reason I had it
> > stuck in my brain that this was only the target of branch instructions
> > and not data.  I don't even understand why I had that thought, but
> > nonetheless there it is.
> > 
> > In that thread, On Wednesday, December 14, 2016 at 4:01:41 PM UTC-5,
> > Robert Wessel wrote:
> >> RIP for data addresses was added in x86-64.  IP relative addressing
> >> for branches has been there since the 8086.
> > 
> > I was curious.  How does AMD64's operating environment work?
> 
> There is a specific encoding of ModRM/SIB which uses %rip as a base
> register.  All other memory operand construction is the same.
> 
> > Does this new RIP-relative addressing assume a selector like CS:, DS:, or SS:?
> > Or does it work in a flat mode where CS: = DS: = ES: = SS:?  Or ... ?
> 
> It is just a memory operand, and will use the appropriate implicit
> selector, or explicitly-overridden selector.
> 
> > 
> > When the code exists:
> > 
> >     mov   rax,data[rip]
> > 
> > Would this ultimately access CS:[RIP+offset data]?  Or some other
> > selector?  Or ... ?
> 
> %ds, because it is a plain data reference.  You can use explicit segment
> overrides to change this if you wish.

Wouldn't that require that you then use the same selector for CS: and
DS:/ES:?  Or, at least selectors with the same base offset?  Or, as
I've read about LONG mode, is there a 64-bit mode which introduces a
flat model where they are all the same?

Best regards,
Rick C. Hodgin
Rick
12/20/2016 1:58:49 PM
"Rick C. Hodgin" <rick.c.hodgin@nospicedham.gmail.com> writes:
>On Tuesday, December 20, 2016 at 8:54:18 AM UTC-5, Andrew Cooper wrote:
>> On 20/12/2016 13:22, Rick C. Hodgin wrote:
>> > When the code exists:
>> > 
>> >     mov   rax,data[rip]
>> > 
>> > Would this ultimately access CS:[RIP+offset data]?  Or some other
>> > selector?  Or ... ?
>> 
>> %ds, because it is a plain data reference.  You can use explicit segment
>> overrides to change this if you wish.
>
>Wouldn't that require that you then use the same selector for CS: and
>DS:/ES:?  Or, at least selectors with the same base offset?  Or, as
>I've read about LONG mode, is there a 64-bit mode which introduces a
>flat model where they are all the same?

In 64-bit mode, CS.base is treated as 0 (the contents of CS.base are
ignored), and ES, DS, and SS are ignored completely, using a base of
0.  So 64-bit mode specifies a flat model, and of course RIP-relative
addressing mode is defined to work in that setting.

FS and GS bases are not ignored and actually used, but everything else
about FS and GS is ignored (or not there; the base is 64 bits long,
and there is no space for anything else).  These are typically used
for thread-local storage.
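
The rule described above can be modelled in a few lines.  This is only
a sketch of the address arithmetic, assuming the behaviour just stated
(CS, DS, ES, and SS contribute a base of zero; FS and GS contribute
their base); the function name linear_address and the sample FS base
are invented for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def linear_address(effective: int, segment: str = "ds",
                   fs_base: int = 0, gs_base: int = 0) -> int:
    """Toy model of 64-bit mode: only FS and GS add a segment base."""
    seg = segment.lower()
    if seg in ("cs", "ds", "es", "ss"):
        base = 0            # base is treated as zero in 64-bit mode
    elif seg == "fs":
        base = fs_base
    elif seg == "gs":
        base = gs_base
    else:
        raise ValueError(f"unknown segment {segment!r}")
    return (base + effective) & MASK64

# A plain data reference and a CS-overridden one land on the same address.
print(hex(linear_address(0x403007, "ds")))
print(hex(linear_address(0x403007, "cs")))
# A hypothetical thread-local access through FS with a made-up base.
print(hex(linear_address(0x28, "fs", fs_base=0x7F0000001000)))
```

Under this model the original question about CS: versus DS: stops
mattering, since both resolve to the same linear address.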

- anton
-- 
M. Anton Ertl                    Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
anton
12/20/2016 2:20:29 PM
On 20/12/2016 13:58, Rick C. Hodgin wrote:
> On Tuesday, December 20, 2016 at 8:54:18 AM UTC-5, Andrew Cooper wrote:
>> On 20/12/2016 13:22, Rick C. Hodgin wrote:
>>> When the code exists:
>>>
>>>     mov   rax,data[rip]
>>>
>>> Would this ultimately access CS:[RIP+offset data]?  Or some other
>>> selector?  Or ... ?
>>
>> %ds, because it is a plain data reference.  You can use explicit segment
>> overrides to change this if you wish.
> 
> Wouldn't that require that you then use the same selector for CS: and
> DS:/ES:?

Not specifically

> Or, at least selectors with the same base offset?

Not specifically

> Or, as I've read about LONG mode, is there a 64-bit mode which introduces a
> flat model where they are all the same?

Long mode deprecates most aspects of segmentation other than:

* CS attributes
* SS attributes
* FS and GS bases

Therefore, most things are flat and the only useful overrides are %fs
and %gs, typically used to implement thread-local-storage.

It isn't very useful to combine %rip-relative addressing with %fs/%gs
overrides because it would require an unnecessarily complicated
compiler/linker to generate the correct interactions, but it is
certainly possible.

~Andrew
Andrew
12/20/2016 2:39:20 PM
On Tuesday, December 20, 2016 at 9:39:23 AM UTC-5, Anton Ertl wrote:
> "Rick C. Hodgin" <rick.c.hodgin@nospicedham.gmail.com> writes:
> >On Tuesday, December 20, 2016 at 8:54:18 AM UTC-5, Andrew Cooper wrote:
> >> On 20/12/2016 13:22, Rick C. Hodgin wrote:
> >> > When the code exists:
> >> > 
> >> >     mov   rax,data[rip]
> >> > 
> >> > Would this ultimately access CS:[RIP+offset data]?  Or some other
> >> > selector?  Or ... ?
> >> 
> >> %ds, because it is a plain data reference.  You can use explicit segment
> >> overrides to change this if you wish.
> >
> >Wouldn't that require that you then use the same selector for CS: and
> >DS:/ES:?  Or, at least selectors with the same base offset?  Or, as
> >I've read about LONG mode, is there a 64-bit mode which introduces a
> >flat model where they are all the same?
> 
> In 64-bit mode, CS.base is treated as 0 (the contents of CS.base are
> ignored), and ES, DS, and SS are ignored completely, using a base of
> 0.  So 64-bit mode specifies a flat model, and of course RIP-relative
> addressing mode is defined to work in that setting.

Does this mean all 64-bit binaries are built as bricks that fit into
memory wherever they happen to fit, and then everything in the static
global memory is always referenced as an offset relative to wherever
RIP happens to be at the current time within that brick?

I assume RIP-relative addressing is not used for stack operations, or
for dynamically allocated memory at runtime?

> FS and GS bases are not ignored and actually used, but everything else
> about FS and GS is ignored (or not there; the base is 64 bits long,
> and there is no space for anything else).  These are typically used
> for thread-local storage.

I remember reading about new uses for FS and GS, but never knew what
they were.

My mind is still thinking in a type of "extension of i386 model" mode,
where selectors are increased to 64-bits, but operationally it's
generally the same except for the REX prefixes.  Interesting.

Best regards,
Rick C. Hodgin
Rick
12/20/2016 2:46:13 PM
"Rick C. Hodgin" wrote:

> I came across a thread the other day in comp.lang.c.  It referred to
> AMD64's new RIP-relative addressing mode.

The "RIP-relative addressing mode" is also standard on Intel x64
and not just AMD64.


Mike Gonta
look and see - many look but few see

http://mikegonta.com


Mike
12/20/2016 4:16:07 PM
On Tuesday, December 20, 2016 at 11:24:34 AM UTC-5, Mike Gonta wrote:
> "Rick C. Hodgin" wrote:
> 
> > I came across a thread the other day in comp.lang.c.  It referred to
> > AMD64's new RIP-relative addressing mode.
> 
> The "RIP-relative addressing mode" is also standard on Intel x64
> and not just AMD64.

Well of course.  But it was first defined by AMD.  Intel rejected the
AMD64 architecture for quite some time due to their Itanium product
line before finally agreeing to support it.  It followed in various
flavors and names until later being finally settled upon.

An interesting naming history:  IA-32 (x86), IA-64 (Itanium) and then
Intel x64, which is smaller than x86. :-) LOL.

Best regards,
Rick C. Hodgin
Rick
12/20/2016 4:34:14 PM
On Tue, 20 Dec 2016 08:34:14 -0800 (PST)
"Rick C. Hodgin" <rick.c.hodgin@nospicedham.gmail.com> wrote:

> On Tuesday, December 20, 2016 at 11:24:34 AM UTC-5, Mike Gonta wrote:
> > "Rick C. Hodgin" wrote:
> >   
> > > I came across a thread the other day in comp.lang.c.  It referred
> > > to AMD64's new RIP-relative addressing mode.  
> > 
> > The "RIP-relative addressing mode" is also standard on Intel x64
> > and not just AMD64.  
> 
> Well of course.  But it was first defined by AMD.  Intel rejected the
> AMD64 architecture for quite some time due to their Itanium product
> line before finally agreeing to support it.  It followed in various
> flavors and names until later being finally settled upon.
> 
> An interesting naming history:  IA-32 (x86), IA-64 (Itanium) and then
> Intel x64, which is smaller than x86. :-) LOL.
> 
> Best regards,
> Rick C. Hodgin

It is amd64, but Intel used some other name, (some enhanced memory
technology or so).

-- 
press any key to continue or any other to quit...
Melzzzzz
12/20/2016 4:46:31 PM
On 20/12/2016 14:20, Anton Ertl wrote:
> "Rick C. Hodgin" <rick.c.hodgin@nospicedham.gmail.com> writes:
>> On Tuesday, December 20, 2016 at 8:54:18 AM UTC-5, Andrew Cooper wrote:
>>> On 20/12/2016 13:22, Rick C. Hodgin wrote:
>>>> When the code exists:
>>>>
>>>>     mov   rax,data[rip]
>>>>
>>>> Would this ultimately access CS:[RIP+offset data]?  Or some other
>>>> selector?  Or ... ?
>>>
>>> %ds, because it is a plain data reference.  You can use explicit segment
>>> overrides to change this if you wish.
>>
>> Wouldn't that require that you then use the same selector for CS: and
>> DS:/ES:?  Or, at least selectors with the same base offset?  Or, as
>> I've read about LONG mode, is there a 64-bit mode which introduces a
>> flat model where they are all the same?
> 
> In 64-bit mode, CS.base is treated as 0 (the contents of CS.base are
> ignored), and ES, DS, and SS are ignored completely

Almost.  %ss.dpl is the single most important piece of information
available in register state (more commonly known as current privilege
level), and is very definitely still around in 64bit.

%cs.dpl must not be used to determine privilege, due to conforming
code segments (something which no-one was sad to see AMD kill in 64bit
mode).

~Andrew

Andrew
12/20/2016 4:56:47 PM
Melzzzzz <mel@nospicedham.zzzzz.com> writes:
>On Tue, 20 Dec 2016 08:34:14 -0800 (PST)
>"Rick C. Hodgin" <rick.c.hodgin@nospicedham.gmail.com> wrote:
>> An interesting naming history:  IA-32 (x86), IA-64 (Itanium) and then
>> Intel x64, which is smaller than x86. :-) LOL.

x64 is what Microsoft calls it.

>It is amd64, but Intel used some other name, (some enhanced memory
>technology or so).

AMD first called it x86-64, then switched to AMD64 (paralleling Intel,
which first called its 32-bit architecture i386, and later IA-32).
When Intel first picked AMD64 up, they were still hyping IA-64, so
their first name for AMD64 was IA32e; later they switched to EM64T,
and since 2006 to Intel 64; not sure if that was before or after
Microsoft first using x64, but x64 makes sense in this context: it
does not matter whether x=AMD or x=Intel.

- anton
-- 
M. Anton Ertl                    Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
anton
12/20/2016 5:13:17 PM
"Rick C. Hodgin" <rick.c.hodgin@nospicedham.gmail.com> writes:
>Does this mean all 64-bit binaries are built as bricks that fit into
>memory wherever they happen to fit, and then everything in the static
>global memory is always referenced as an offset relative to wherever
>RIP happens to be at the current time within that brick?

Position-independent code is possible (and a motivation for
introducing RIP-relative addressing), but not required.

- anton
-- 
M. Anton Ertl                    Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
anton
12/20/2016 5:29:31 PM
On Tuesday, December 20, 2016 at 12:39:45 PM UTC-5, Anton Ertl wrote:
> Melzzzzz <mel@nospicedham.zzzzz.com> writes:
> >On Tue, 20 Dec 2016 08:34:14 -0800 (PST)
> >"Rick C. Hodgin" <rick.c.hodgin@nospicedham.gmail.com> wrote:
> >> An interesting naming history:  IA-32 (x86), IA-64 (Itanium) and then
> >> Intel x64, which is smaller than x86. :-) LOL.
> 
> x64 is what Microsoft calls it.

That's surprising from "Wintel." :-)

> >It is amd64, but Intel used some other name, (some enhanced memory
> >technology or so).
> 
> AMD first called it x86-64, then switched to AMD64 (paralleling Intel,
> which first called its 32-bit architecture i386, and later IA-32).
> When Intel first picked AMD64 up, they were still hyping IA-64, so
> their first name for AMD64 was IA32e; later they switched to EM64T,
> and since 2006 to Intel 64; not sure if that was before or after
> Microsoft first using x64, but x64 makes sense in this context: it
> does not matter whether x=AMD or x=Intel.

"A rose by any other name would wither and die."
    -- Peter O'Toole's character Alan Swann from "My Favorite Year."

Another great line from that movie:

    "I HAD ONE LINE!! ... I forgot it."

LOL!

Best regards,
Rick C. Hodgin
Rick
12/20/2016 5:48:25 PM
On Tuesday, December 20, 2016 at 12:39:46 PM UTC-5, Anton Ertl wrote:
> "Rick C. Hodgin" <rick.c.hodgin@nospicedham.gmail.com> writes:
> >Does this mean all 64-bit binaries are built as bricks that fit into
> >memory wherever they happen to fit, and then everything in the static
> >global memory is always referenced as an offset relative to wherever
> >RIP happens to be at the current time within that brick?
> 
> Position-independent code is possible (and a motivation for
> introducing RIP-relative addressing), but not required.

It makes sense.  And I'll be honest, your explanation here about x64
is the first time that the nomenclature has actually struck a chord
of understanding within me as to why it was used.

I appreciate this exchange today.  My knowledge of x64 has probably
doubled. :-)

Thank you all!

Best regards,
Rick C. Hodgin
Rick
12/20/2016 6:05:50 PM
Rick C. Hodgin asked:

>I came across a thread the other day in comp.lang.c.  It referred to
> AMD64's new RIP-relative addressing mode.  For some reason I had it
> stuck in my brain that this was only the target of branch instructions
> and not data.  I don't even understand why I had that thought, but
> nonetheless there it is.
 
> In that thread, On Wednesday, December 14, 2016 at 4:01:41 PM UTC-5,
> Robert Wessel wrote:
>> RIP for data addresses was added in x86-64.  IP relative addressing
>> for branches has been there since the 8086.
 
> I was curious.  How does AMD64's operating environment work?  Does this
> new RIP-relative addressing assume a selector like CS:, DS:, or SS:?
> Or does it work in a flat mode where CS: = DS: = ES: = SS:?  Or ... ?
> 
> When the code exists:
> 
>    mov   rax,data[rip]
> 
> Would this ultimately access CS:[RIP+offset data]?  Or some other
> selector?  Or ... ?

Long Mode (aka x86-64) ignores DS, ES, SS, and CS, as these four are
treated as having base zero and unlimited range; only attributes (PL)
can be set for CS.

So yes, any reference with [RIP] acts almost the same way as a relative
branch calculation, and as I see it this is limited to +/-2GB.
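
To make the +/-2GB limit concrete, here is a small Python sketch that
computes the signed 32-bit displacement needed to reach a target from
the address of the next instruction, and rejects targets outside the
window.  The name rip_displacement and the addresses are hypothetical.

```python
DISP32_MIN = -(1 << 31)       # lowest value a signed 32-bit field holds
DISP32_MAX = (1 << 31) - 1    # highest value a signed 32-bit field holds

def rip_displacement(next_rip: int, target: int) -> int:
    """Return the disp32 that reaches target, or raise if out of range."""
    disp = target - next_rip
    if not DISP32_MIN <= disp <= DISP32_MAX:
        raise OverflowError("target is outside the +/-2 GiB window")
    return disp

# Reachable: data a few KiB past the code.
print(hex(rip_displacement(0x401007, 0x403007)))

# Not reachable: data 4 GiB away from the code.
try:
    rip_displacement(0x401007, 0x401007 + (1 << 32))
except OverflowError as exc:
    print("rejected:", exc)
```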
__
wolfgang 
wolfgang
12/20/2016 6:06:08 PM
RIP - This is the full instruction pointer and should be used instead
of EIP (which will be inaccurate if the address space is larger than
4 GiB, which may happen even with 4 GiB or less of RAM).
But, offsets are limited to 32 bits.  This means that only a 4GB window
into the potential 64-bit address space can be accessed from a given
base value.  This is mainly an issue when accessing static global data.
It is standard to access this data using PC-relative addressing using
RIP.
page 3 from https://www.lri.fr/~filliatr/ens/compil/x86-64.pdf

Does this help you?
Cătălin George Feștilă
12/21/2016 9:52:41 AM
On Wednesday, December 21, 2016 at 12:42:06 PM UTC-5, Cătălin George
Feștilă wrote:
> RIP - This is the full instruction pointer and should be used instead
> of EIP (which will be inaccurate if the address space is larger than
> 4 GiB, which may happen even with 4 GiB or less of RAM).
> But, offsets are limited to 32 bits.  This means that only a 4GB window
> into the potential 64-bit address space can be accessed from a given
> base value.  This is mainly an issue when accessing static global data.
> It is standard to access this data using PC-relative addressing using
> RIP.
> page 3 from https://www.lri.fr/~filliatr/ens/compil/x86-64.pdf
> 
> Does this help you?

Yes.  Thank you, Cătălin.

Best regards,
Rick C. Hodgin
Rick
12/21/2016 5:46:38 PM