Byte data interchangeability between endian machines

I have a request to locate a standard utility that would run on either
(or both) Intel Linux and SPARC Solaris with which a data management
staff member could pass a data file and have the data bytes swapped
between big-endian and little-endian format.

Is anyone aware of such a utility?
-- 
<URL: http://wiki.tcl.tk/ > In God we trust.
Even if explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.
<URL: mailto:lvirden@yahoo.com > <URL: http://www.purl.org/NET/lvirden/ >
Reply lvirden272 (610) 12/4/2003 6:44:43 PM

lvirden@yahoo.com writes:

> I have a request to locate a standard utility that would run on either
> (or both) Intel Linux and SPARC Solaris with which a data management
> staff member could pass a data file and have the data bytes swapped
> between big-endian and little-endian format.
>
> Is anyone aware of such a utility?

"byte data" doesn't have endian problems.

I assume you are trying to move binary data containing integers, shorts,
floats, and other things affected by endianness.

I don't know of a utility, but that's what the XDR
protocol was designed for.

Maybe you want to try to clarify your question...
Reply Dan 12/4/2003 7:14:28 PM


On Thu, 04 Dec 2003 18:44:43 +0000, lvirden wrote:

> 
> I have a request to locate a standard utility that would run on either
> (or both) Intel Linux and SPARC Solaris with which a data management
> staff member could pass a data file and have the data bytes swapped
> between big-endian and little-endian format.
> 
> Is anyone aware of such a utility?

man dd   (see the swab option)

Reply Steve 12/4/2003 8:14:13 PM

In article <bqnvar$1u0$1@srv38.cas.org>,  <lvirden@yahoo.com> wrote:
>
>I have a request to locate a standard utility that would run on either
>(or both) Intel Linux and SPARC Solaris with which a data management
>staff member could pass a data file and have the data bytes swapped
>between big-endian and little-endian format.
>
>Is anyone aware of such a utility?

Any such utility would have to be specially crafted to suit your
particular data files.

Far better to use a transfer format which is endian-neutral.
E.g. text.

HTH
John
-- 
We had a woodhenge here once but it rotted.
Reply newstmp 12/4/2003 8:17:37 PM

<lvirden@yahoo.com> wrote in message news:bqnvar$1u0$1@srv38.cas.org...
>
> I have a request to locate a standard utility that would run on either
> (or both) Intel Linux and SPARC Solaris with which a data management
> staff member could pass a data file and have the data bytes swapped
> between big-endian and little-endian format.
>
> Is anyone aware of such a utility?

If the data you are trying to bring over is binary data, with a mix of ints,
shorts, bytes, chars, etc., I can't think of a utility that will work.  For
instance, say the data can be represented by the following C struct:

    struct demo {
        int   a;        /* Assume 32-bit int for demo */
        short b;        /* 16 bits */
        short c;        /* Another 16 bits */
    };

Given the 8 bytes, numbered 0, 1, 2, 3, 4, 5, 6, 7, to swap them the
byte order would become
   3, 2, 1, 0, 5, 4, 7, 6

If the structure was 4 shorts, then the resultant conversion would be:
   1, 0, 3, 2, 5, 4, 7, 6

This is based on my experience converting binary data from a VAX to Sun.  It
assumes that the byte packing is the same on each.

Maybe there is a utility out there that lets you define the structure and have
it convert the file.  For us, we have a function that we keep up to date as the
structure changes over time.  Fortunately, we only have a single structure that
we have this issue with.
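
The two byte patterns above can be reproduced with a small layout-driven
routine. What follows is only a sketch in C under stated assumptions: the
names reverse_field and swap_record are made up for illustration, the record
has no padding between fields, and every field is a plain integer whose bytes
are simply reversed. It is not the function mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the bytes of one field in place. */
void reverse_field(unsigned char *p, size_t width)
{
    size_t i;
    for (i = 0; i < width / 2; i++) {
        unsigned char tmp = p[i];
        p[i] = p[width - 1 - i];
        p[width - 1 - i] = tmp;
    }
}

/*
 * Byte-swap one record whose fields have the given widths in bytes.
 * Hypothetical helper: assumes the fields are packed with no padding.
 * Returns the number of bytes the record occupies.
 */
size_t swap_record(unsigned char *rec, const size_t *widths, size_t nfields)
{
    size_t off = 0, i;
    assert(rec != NULL && widths != NULL);
    for (i = 0; i < nfields; i++) {
        reverse_field(rec + off, widths[i]);
        off += widths[i];
    }
    return off;
}
```

Called with field widths {4, 2, 2} on bytes 0..7 it produces
3, 2, 1, 0, 5, 4, 7, 6; with {2, 2, 2, 2} it produces 1, 0, 3, 2, 5, 4, 7, 6.
The widths have to be passed in because nothing in the data itself says where
one field ends and the next begins.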

Brad



Reply Brad 12/4/2003 9:58:54 PM

In comp.unix.solaris Steve Wampler <swampler@noao.edu> wrote:
> On Thu, 04 Dec 2003 18:44:43 +0000, lvirden wrote:
>> I have a request to locate a standard utility that would run on
>> either (or both) Intel Linux and SPARC Solaris with which a data
>> management staff member could pass a data file and have the data
>> bytes swapped between big-endian and little-endian format.
>>
>> Is anyone aware of such a utility?

> man dd   (see the swab option)

Aren't we starting to mix two different things here? There is big
versus little endian, (which IIRC is bit ordering) and then there is
byte swapped.  I think swab does the latter.  I didn't see (at first
glance) an option in a dd manpage that suggested it would do the
former.

rick jones
-- 
Process shall set you free from the need for rational thought. 
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to raj in cup.hp.com  but NOT BOTH...
Reply Rick 12/5/2003 12:40:51 AM

On Fri, 5 Dec 2003, Rick Jones wrote:

> Aren't we starting to mix two different things here? There is big
> versus little endian, (which IIRC is bit ordering) and then there is

Nope; the most significant bit of all bytes, regardless of the
endianness of the architecture, is always bit 7 (I think it's safe
to assume 8-bit bytes these days.  :-).

-- 
Rich Teer, SCNA, SCSA                               .  *   * . * .* .
                                                     .   *   .   .*
President,                                          * .  . /\ ( .  . *
Rite Online Inc.                                     . .  / .\   . * .
                                                    .*.  / *  \  . .
                                                      . /*   o \     .
Voice: +1 (250) 979-1638                            *   '''||'''   .
URL: http://www.rite-online.net                     ******************
Reply Rich 12/5/2003 12:58:15 AM

Rich Teer <rich.teer@rite-group.com> writes:
>On Fri, 5 Dec 2003, Rick Jones wrote:

>> Aren't we starting to mix two different things here? There is big
>> versus little endian, (which IIRC is bit ordering) and then there is

>Nope; the most significant bit of all bytes, regardless of the
>endianness of the architecture, is always bit 7 (I think it's safe
>to assume 8 bit bytes these days.  :-).

That depends on whether you number the bits from least significant to
most significant (typical for ascii character strings), or from most
significant to least significant (typical for ebcdic character
strings).

Reply Neil 12/5/2003 1:29:34 AM

"Neil W Rickert" <rickert+nn@cs.niu.edu> wrote in message
news:bqon1u$91g$1@husk.cso.niu.edu...
> Rich Teer <rich.teer@rite-group.com> writes:
> >On Fri, 5 Dec 2003, Rick Jones wrote:
>
> >> Aren't we starting to mix two different things here? There is big
> >> versus little endian, (which IIRC is bit ordering) and then there is
>
> >Nope; the most significant bit of all bytes, regardless of the
> >endianness of the architecture, is always bit 7 (I think it's safe
> >to assume 8 bit bytes these days.  :-).
>
> That depends on whether you number the bits from least significant to
> most significant (typical for ascii character strings), or from most
> significant to least significant (typical for ebcdic character
> strings).
>

Back to conversion with bits between a SPARC and a VAX.  The VAX C and Ada
compilers allocate bits right to left, least significant bit of the byte
first, whereas on a Sun, it is the most significant bit first.  So, in going
from VAX to/from Sun, you have to reverse the order of the bits for the parts
of the struct that are bit elements!
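
As a rough illustration of what reversing bit order means, here is a sketch
in C. The name reverse_bits8 is hypothetical, and the routine only mirrors
the eight bits of a single byte. How a given compiler lays out bit-fields is
implementation-defined, so a real conversion would still need to know which
bytes of the struct hold bit elements and how they are packed.

```c
#include <assert.h>
#include <limits.h>

/*
 * Return b with its bits mirrored: bit 0 trades places with bit 7,
 * bit 1 with bit 6, and so on. Assumes 8-bit bytes.
 */
unsigned char reverse_bits8(unsigned char b)
{
    unsigned char r = 0;
    int i;
    assert(CHAR_BIT == 8);
    for (i = 0; i < 8; i++) {
        r = (unsigned char)((r << 1) | (b & 1u)); /* append lowest bit of b */
        b >>= 1;
    }
    return r;
}
```

Applying the routine twice gives back the original byte, so the same helper
serves for both directions of a transfer.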

Brad



Reply Brad 12/5/2003 3:09:37 AM

Neil W Rickert <rickert+nn@cs.niu.edu> writes:

> Rich Teer <rich.teer@rite-group.com> writes:
> >On Fri, 5 Dec 2003, Rick Jones wrote:
> 
> >> Aren't we starting to mix two different things here? There is big
> >> versus little endian, (which IIRC is bit ordering) and then there is
> 
> >Nope; the most significant bit of all bytes, regardless of the
> >endianness of the architecture, is always bit 7 (I think it's safe
> >to assume 8 bit bytes these days.  :-).
> 
> That depends on whether you number the bits from least significant to
> most significant (typical for ascii character strings), or from most
> significant to least significant (typical for ebcdic character
> strings).

And in addition to the numbering of the bits, for interchangeability
between machines you need to take into account whether bit 7 is
transmitted first or bit 0.

 

-- 
__Pascal_Bourguignon__                              .  *   * . * .* .
http://www.informatimago.com/                        .   *   .   .*
                                                    * .  . /\ ( .  . *
Living free in Alaska or in Siberia, a               . .  / .\   . * .
grizzli's life expectancy is 35 years,              .*.  / *  \  . .
but no more than 8 years in captivity.                . /*   o \     .
http://www.theadvocates.org/                        *   '''||'''   .
                                                    ******************
Reply Pascal 12/5/2003 10:28:50 AM

lvirden@yahoo.com wrote in message news:<bqnvar$1u0$1@srv38.cas.org>...
> I have a request to locate a standard utility that would run on either
> (or both) Intel Linux and SPARC Solaris with which a data management
> staff member could pass a data file and have the data bytes swapped
> between big-endian and little-endian format.
> 
> Is anyone aware of such a utility?

I'm not aware of anything that will do this, although it would be
trivial to write something to do it **IF** the data ***always***
consisted of 2-byte integers, or always consisted of 4 byte integers.
But if the data is a mix of 2, 4 and 8 byte data types, in an
essentially random order, I think you have a serious problem.

You might like to look at a thread started by me on comp.unix.cray
(but which I copied to one of the Sun or Solaris groups)

http://groups.google.com/groups?dq=&hl=en&lr=&ie=UTF-8&threadm=bbdc5d83.0309040657.11503d0c%40posting.google.com&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26group%3Dcomp.unix.cray

I was having problems writing data (I wanted 2 and 4-byte data types
in little endian format) for a program of mine:

http://atlc.sourceforge.net

on a machine (a Cray Y-MP) that had:

sizeof(char)=1
sizeof(short)=8
sizeof(int)=8
sizeof(long)=8

An excellent solution was suggested by Eric Sosman at Sun. This solved
my problem, as I could then write 2 and 4 byte integers on a machine
with only 8-byte shorts. However, Eric's solution had an interesting
side-effect, that *might* be useful to you.

Using Eric's solution, 1-, 2- and 4-byte data types were always
interpreted correctly, whether the data was written on an Intel
processor or a SPARC processor, and whichever of the two read it back.
(In fact, I don't know what the endianness of
that Cray is.) There was no need to worry about what processor was
used to write the data, or what processor was used to read the data. I
could have used Intel, SPARC, Cray, or any other processor. There were
no separate routines to convert from one to the other; it just
happened. Very convenient for me indeed.
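
Eric's code is not reproduced in this thread, so the following is a guess at
the general technique rather than his actual solution. One common way to get
the behaviour described is to build the file bytes with shifts and masks, so
that the result depends only on the value and never on how the host stores
integers or how wide its short or int happens to be. The names put_le32,
get_le32, put_le16 and get_le16 are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Store the low 32 bits of v in out[0..3], least significant byte first. */
void put_le32(unsigned char *out, unsigned long v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    out[2] = (unsigned char)((v >> 16) & 0xFFu);
    out[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Rebuild a value from 4 bytes stored least significant byte first. */
unsigned long get_le32(const unsigned char *in)
{
    assert(in != NULL);
    return (unsigned long)in[0]
         | ((unsigned long)in[1] << 8)
         | ((unsigned long)in[2] << 16)
         | ((unsigned long)in[3] << 24);
}

/* Same idea for 16-bit quantities. */
void put_le16(unsigned char *out, unsigned int v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
}

unsigned int get_le16(const unsigned char *in)
{
    assert(in != NULL);
    return (unsigned int)in[0] | ((unsigned int)in[1] << 8);
}
```

The writer would fwrite() the small byte buffers instead of the integers
themselves, and the reader would do the reverse. Because shifts operate on
values, not on memory layout, the same source works unchanged on a host with
8-byte shorts.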

Of course that does not answer your question, but it might mean that
with a bit of re-coding of the program that reads and writes the data
file, you can read and write the data file in a platform-independent
manner.

Another option, if you know the exact format of the data types, i.e.

char (1-byte)
int (4-byte)
short (2-byte)
double (8-byte)
etc.

is to write a conversion routine yourself. If longs are written, on
a 32-bit Intel machine they will be 4 bytes, rather than the 8 bytes
of a 64-bit SPARC build. The easiest way around that is to not use longs,
or to replace them with 'long long'. There might be an option to convert
longs to 8 bytes on the compiler run on the Intel machine.

Sorry I can't answer your question, but perhaps that will give you
some ideas. I can't see how any general purpose program can possibly
work unless it knows the size of the data types you wrote. It's not
possible for it to know if the first 4 bytes of your data file consist
of 4 characters, two shorts, or one int.

Dr. David Kirkby.
Reply see_my_signature_for 12/5/2003 11:44:59 AM

Dr. David Kirkby writes:
> I'm not aware of anything that will do this, although it would be trivial
> to write something to do it **IF** the data ***always*** consisted of
> 2-byte integers...

dd with the 'swab' keyword will do exactly that.
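
For reference, the operation behind the swab conversion (invoked as
something like "dd if=in of=out conv=swab") is an exchange of each adjacent
pair of bytes. A minimal sketch of that operation in C follows; swap_pairs is
a made-up name, and leaving a trailing odd byte untouched is a choice made
for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Exchange every adjacent pair of bytes in place.
 * If len is odd, the final byte is left where it is.
 */
void swap_pairs(unsigned char *buf, size_t len)
{
    size_t i;
    assert(buf != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2) {
        unsigned char tmp = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
}
```

This converts a file made up entirely of 2-byte integers. Anything wider, or
any mix of widths, comes out scrambled.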
-- 
John Hasler
john@dhh.gt.org (John Hasler)
Dancing Horse Hill
Elmwood, WI
Reply john4584 (1601) 12/5/2003 2:00:33 PM

On Fri, 05 Dec 2003 00:58:15 GMT, Rich Teer <rich.teer@rite-group.com> wrote:
> On Fri, 5 Dec 2003, Rick Jones wrote:
> 
>> Aren't we starting to mix two different things here? There is big
>> versus little endian, (which IIRC is bit ordering) and then there is
> 
> Nope; the most significant bit of all bytes, regardless of the
> endianness of the architecture, is always bit 7 (I think it's safe
> to assume 8 bit bytes these days.  :-).

Motorola's hardware docs for their PowerPC-based microcontrollers number
the bits of a byte with bit 7 being the least significant.  They are at least
consistent, as they also have bit 15 or 31 as the lsb for longer types.

I don't know why they do it this way.  Just to be contrary?  The m68k
books I have are done the conventional way so it isn't just an endian
issue.

It is odd, though, to see schematics with chips hooked up "backwards" so
that they will work as their docs specify.


-- 
 -| Bob Hauck
 -| To Whom You Are Speaking
 -| http://www.haucks.org/
Reply Bob 12/5/2003 2:08:15 PM

On 2003-12-05, Bob Hauck <postmaster@localhost.localdomain> wrote:

> I don't know why they do it this way. 

Because that's the way IBM has been doing it for hundreds of
years, and the PPC is an IBM design.

> Just to be contrary?

Contrary to what?  When IBM started to number bits that way
40-some years ago, there wasn't anything to which to be
contrary.

-- 
Grant Edwards                   grante             Yow!  "THE LITTLE PINK
                                  at               FLESH SISTERS," I saw them
                               visi.com            at th' FLUROESCENT BULB
                                                   MAKERS CONVENTION...
Reply grante (5411) 12/5/2003 3:28:57 PM

On Fri, 05 Dec 2003 00:40:51 +0000, Rick Jones wrote:

> Aren't we starting to mix two different things here? There is big
> versus little endian, (which IIRC is bit ordering) and then there is
> byte swapped.  I think swab does the latter.  I didn't see (at first
> glance) an option in a dd manpage that suggested it would do the
> former.

No, big-versus-little endianness is a byte order issue.  See:

    http://info.astrian.net/jargon/terms/b/big-endian.html

Also (in response to comments by others), the order in which
the bits/bytes/whatever are transmitted is irrelevant; what matters is
the order in the resulting file (the original statement wasn't
concerned about on-the-fly reordering).  Nor does the numbering order
of the bits matter (except in bit arrays), just the ordering
of the significance.  I know of no 'modern' machine that orders
bit significance in a byte from lowest-to-highest (so that
byte 01000000 is 2 instead of 128).  Someone else may, however.

That being said, dd's swab only swaps pairs of 8-bit bytes, so
if the data is anything other than 16-bit values it won't help.
(And if the data is of mixed bit-lengths, others have already
pointed out that no standard utility is likely to have any
hope of handling the data.)

On the other hand, it's pretty easy to grow your own, check
the man page on ntohl, ntohs, htonl, and htons.
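
A home-grown filter for the simplest case, a file consisting only of 4-byte
words, might look like the sketch below. The name swap32_stream is
hypothetical, and treating a leftover partial word as an error is a choice
made here. Note that ntohl and htonl do nothing on a big-endian host, so a
tool that must always swap reverses the bytes explicitly instead of calling
them.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Copy `in` to `out`, reversing the bytes of every 4-byte word.
 * Returns 0 on success, -1 on a read or write error or if the
 * input length is not a multiple of 4.
 */
int swap32_stream(FILE *in, FILE *out)
{
    unsigned char w[4];
    size_t n;
    assert(in != NULL && out != NULL);
    while ((n = fread(w, 1, sizeof w, in)) == sizeof w) {
        unsigned char r[4];
        r[0] = w[3];
        r[1] = w[2];
        r[2] = w[1];
        r[3] = w[0];
        if (fwrite(r, 1, sizeof r, out) != sizeof r)
            return -1;
    }
    /* n is now the size of the final short read: 0 means a clean end */
    return (n == 0 && !ferror(in)) ? 0 : -1;
}
```

A main() that simply returns the result of swap32_stream(stdin, stdout)
turns this into a command-line filter that behaves the same on Linux and
Solaris.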

I have a utility that handles any byte reordering that can
be expressed in multiples of 256-bytes or less (which is
sufficient for my needs), but it's written in a language most
people haven't heard of (Unicon).

Reply Steve 12/5/2003 4:53:29 PM

On 05 Dec 2003 15:28:57 GMT, Grant Edwards <grante@visi.com> wrote:
> On 2003-12-05, Bob Hauck <postmaster@localhost.localdomain> wrote:
> 
>> I don't know why they do it this way. 
> 
> Because that's the way IBM has been doing it for hundreds of
> years, and the PPC is an IBM design.

Ah, hadn't thought of that.  It explains why the 68K line had it the way
I'm used to.

 
> Contrary to what?

Contrary to the perfection of the 68000 series of course!


-- 
 -| Bob Hauck
 -| To Whom You Are Speaking
 -| http://www.haucks.org/
Reply postmaster6 (1752) 12/5/2003 5:08:04 PM

On Fri, 05 Dec 2003 09:53:29 -0700, Steve Wampler wrote:

> (so that byte 01000000 is 2 instead of 128

Sigh.  Hopefully, they see it as 64.  (Why I use computers
to do math...)

Reply Steve 12/5/2003 5:42:54 PM

In article <Pine.SOL.4.58.0312041655420.20332@zaphod.rite-group.com>,
Rich Teer  <rich.teer@rite-group.com> wrote:
>On Fri, 5 Dec 2003, Rick Jones wrote:
>
>> Aren't we starting to mix two different things here? There is big
>> versus little endian, (which IIRC is bit ordering) and then there is
>
>Nope; the most significant bit of all bytes, regardless of the
>endianness of the architecture, is always bit 7

Not true.  There are cases of bits being numbered either way around.
You are right however in saying this is irrelevant to the big-endian/
little-endian debate.

What "big-endian" means is that multi-byte values are stored in memory
with the most significant byte first, whilst "little-endian" means the
least significant byte is stored first.
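
That definition can be checked directly by looking at how the host lays out
a known value in memory. A small sketch in C, with host_is_big_endian as a
made-up name and C99's uint32_t assumed to be available:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Return 1 if this host stores the most significant byte of a
 * multi-byte value at the lowest address, 0 otherwise.
 */
int host_is_big_endian(void)
{
    uint32_t value = 0x01020304u;
    unsigned char bytes[sizeof value];
    assert(sizeof value == 4);
    memcpy(bytes, &value, sizeof value);  /* view the value as raw bytes */
    return bytes[0] == 0x01;              /* 0x01 is the most significant byte */
}
```

On a SPARC the first byte in memory is 0x01, on an Intel machine it is 0x04.
The bits inside each byte play no part in the test.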

It's quite meaningless to talk about the order of bits within a byte.

John
-- 
We had a woodhenge here once but it rotted.
Reply newstmp 12/6/2003 12:27:03 PM

In article <87he0fblst.fsf@thalassa.informatimago.com>,
Pascal Bourguignon  <spam@thalassa.informatimago.com> wrote:
[snip]
>And in addition to the numbering of the bits, for interchangeability
>between machines you need to take into account whether bit 7 is
>transmitted first or bit 0.

No you don't.  For all common forms of data connection this is entirely
standardized.

John
-- 
We had a woodhenge here once but it rotted.
Reply newstmp 12/6/2003 12:28:32 PM

newstmp@sinodun.org.uk (John Winters) writes:

>In article <87he0fblst.fsf@thalassa.informatimago.com>,
>Pascal Bourguignon  <spam@thalassa.informatimago.com> wrote:
>[snip]
>>And in addition to the numbering of the bits, for interchangeability
>>between machines you need to take into account whether bit 7 is
>>transmitted first or bit 0.

>No you don't.  For all common forms of data connection this is entirely
>standardized.

Indeed; it may surprise some that Ethernet frames are sent least
significant bit first.  (The broadcast bit is the first bit sent; it's
the lsb of the first byte.)

Casper
-- 
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Reply Casper 12/6/2003 1:06:51 PM

In article <bqshun$3ls$1@bennet.home.linuxemporium.co.uk>, John Winters wrote:

> What "big-endian" means is that multi-byte values are stored in memory
> with the most significant byte first, whilst "little-endian" means the
> least significant byte is stored first.
> 
> It's quite meaningless to talk about the order of bits within a byte.

Unless the bits are transferred one at a time, like say with Ethernet or a
UART, or a disk read-write head, or CAN, or I2C, or SPI, or ....

-- 
Grant Edwards                   grante             Yow!  Did YOU find a
                                  at               DIGITAL WATCH in YOUR box
                               visi.com            of VELVEETA?
Reply Grant 12/6/2003 10:46:24 PM

Grant Edwards writes:
> Unless the bits are transferred one at a time, like say with Ethernet or
> a UART, or a disk read-write head, or CAN, or I2C, or SPI, or ....

But then the bit order is determined by the peripheral hardware, not the
cpu.
-- 
John Hasler
john@dhh.gt.org (John Hasler)
Dancing Horse Hill
Elmwood, WI
Reply john4584 (1601) 12/6/2003 11:36:46 PM

John Hasler <john@dhh.gt.org> writes:

> Grant Edwards writes:
> > Unless the bits are transferred one at a time, like say with Ethernet or
> > a UART, or a disk read-write head, or CAN, or I2C, or SPI, or ....
> 
> But then the bit order is determined by the peripheral hardware, not the
> cpu.

Not always. The original Woz Machine was primitive enough to let the
CPU output and input the bits itself.

-- 
__Pascal_Bourguignon__                              .  *   * . * .* .
http://www.informatimago.com/                        .   *   .   .*
                                                    * .  . /\  ()  . *
Living free in Alaska or in Siberia, a               . .  / .\   . * .
grizzli's life expectancy is 35 years,              .*.  / *  \  . .
but no more than 8 years in captivity.                . /*   o \     .
http://www.theadvocates.org/                        *   '''||'''   .
SCO Spam-magnet: postmaster@sco.com                 ******************
Reply spam173 (586) 12/7/2003 6:42:54 PM

Pascal Bourguignon wrote:

> John Hasler <john@dhh.gt.org> writes:
> 
>> Grant Edwards writes:
>> > Unless the bits are transferred one at a time, like say with Ethernet
>> > or a UART, or a disk read-write head, or CAN, or I2C, or SPI, or ....
>> 
>> But then the bit order is determined by the peripheral hardware, not the
>> cpu.
> 
> Not always. The  original Woz Machine was primitive  enough to let the
> CPU output and input the bits itself.
> 

One way to make things accessible is to use the network-ordering code. In C,
the calls are

ntohs Network To Host Short
htons Host To Network Short
ntohl Network To Host Long
htonl Host To Network Long

This is done in the fastest mode possible for the CPU by the C libraries.
The byte order for networks is big-endian (since most of the machines that
worked on the internet backbone were big-endian).  x86 is little-endian, but
PPC, HP and Sun RISC machines, etc. are big-endian.
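
A short sketch of how the network-order calls can be used to write and read
a 32-bit value portably. htonl and ntohl are the real POSIX functions from
<arpa/inet.h>; the wrappers store_net32 and load_net32 are hypothetical names
introduced only for this example.

```c
#include <assert.h>
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Write v to out[0..3] in network (big-endian) byte order. */
void store_net32(unsigned char *out, uint32_t v)
{
    uint32_t n = htonl(v);
    assert(out != NULL);
    memcpy(out, &n, sizeof n);
}

/* Read a network-order value from in[0..3] and return it in host order. */
uint32_t load_net32(const unsigned char *in)
{
    uint32_t n;
    assert(in != NULL);
    memcpy(&n, in, sizeof n);
    return ntohl(n);
}
```

If every program that touches the file goes through wrappers like these, the
bytes on disk are the same on x86 and SPARC and no separate swapping utility
is needed. This only helps when the programs can be changed, and it covers
16- and 32-bit integers only.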
Reply mark.hackett (31) 12/7/2003 9:32:22 PM

23 Replies