I am working in 16-bit 808x assembler (yes, on an old 808x CPU) and
adding two 8-bit numbers to each other and saturating the result.
Here is the best I could come up with:
; AL and DL have unsigned 8-bit values to add
; DL is actually extended into DX (ie. 00xxh)
cbw ; Make AL a signed word
add dx,ax ; Do word sized add, into DX
jns @@HIGH ; jump if result not signed/negative
xor dx,dx ; Underflowed; make 0
@@HIGH:
or dh,dh ; Did it overflow?
jz @@OK ; if not, we're ok
mov dx,00FFh ; Overflowed; force our saturation value (FFh)
@@OK:
This is part of an ADPCM inner loop.
Is this about as fast as I can expect? Or am I missing an obvious
technique? (I am not an 808x assembler expert)
|
|
0
|
|
|
|
Reply
|
Jim
|
4/10/2008 9:59:55 PM |
|
In article <e9aa9511-4087-485f-822d-6dad776bce9a@c65g2000hsa.googlegroups.com>,
Jim Leonard <spamtrap@crayne.org> writes:
> I am working in 16-bit 808x assembler (yes, on an old 808x CPU) and
> adding two 8-bit numbers to each other and saturating the result.
> Here is the best I could come up with:
>
> ; AL and DL have unsigned 8-bit values to add
And the result should be unsigned also I suppose. In that case you
only need to test for overflow. Unsigned overflow sets the carry
flag. Extend the carry flag to all ones (FFh) for overflow and
zero for the normal case. Then it goes something like this:
add dl,al ; do byte size add
sbb dh,dh ; carry -> FFh no carry -> 0
or dl,dh ; saturate
; xor dh.dh ; Want word size result???
|
|
0
|
|
|
|
Reply
|
Dick
|
4/11/2008 2:32:55 AM
|
|
Jim Leonard wrote:
> I am working in 16-bit 808x assembler (yes, on an old 808x CPU) and
> adding two 8-bit numbers to each other and saturating the result.
> Here is the best I could come up with:
>
> ; AL and DL have unsigned 8-bit values to add
> ; DL is actually extended into DX (ie. 00xxh)
> cbw ; Make AL a signed word
> add dx,ax ; Do word sized add, into DX
> jns @@HIGH ; jump if result not signed/negative
> xor dx,dx ; Underflowed; make 0
> @@HIGH:
> or dh,dh ; Did it overflow?
> jz @@OK ; if not, we're ok
> mov dx,00FFh ; Overflowed; force our saturation value (FFh)
> @@OK:
>
> This is part of an ADPCM inner loop.
>
> Is this about as fast as I can expect? Or am I missing an obvious
> technique? (I am not an 808x assembler expert)
What's wrong with the obvious approach?
add dl,al ; Carry set if it overflows!
sbb al,al ; AL = 0xff if overflow
or dl,al ; Turns any overflow into 0xff
OK?
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
4/11/2008 7:41:42 AM
|
|
On Apr 11, 2:41 am, Terje Mathisen <spamt...@crayne.org> wrote:
> Jim Leonard wrote:
> > I am working in 16-bit 808x assembler (yes, on an old 808x CPU) and
> > adding two 8-bit numbers to each other and saturating the result.
> > Here is the best I could come up with:
>
> > ; AL and DL have unsigned 8-bit values to add
> > ; DL is actually extended into DX (ie. 00xxh)
> > cbw ; Make AL a signed word
> > add dx,ax ; Do word sized add, into DX
> > jns @@HIGH ; jump if result not signed/negative
> > xor dx,dx ; Underflowed; make 0
> > @@HIGH:
> > or dh,dh ; Did it overflow?
> > jz @@OK ; if not, we're ok
> > mov dx,00FFh ; Overflowed; force our saturation value (FFh)
> > @@OK:
>
> > This is part of an ADPCM inner loop.
>
> > Is this about as fast as I can expect? Or am I missing an obvious
> > technique? (I am not an 808x assembler expert)
>
> What's wrong with the obvious approach?
>
> add dl,al ; Carry set if it overflows!
> sbb al,al ; AL = 0xff if overflow
> or dl,al ; Turns any overflow into 0xff
>
> OK?
Yes -- I was missing the forest for the trees, again. Thanks! (and
thanks to Mr. Wesseling as well).
|
|
0
|
|
|
|
Reply
|
Jim
|
4/11/2008 6:05:19 PM
|
|
I wouldn't do it that way.
I would add the two supposedly unsigned 8-bit integers and test the 9
th bit for 1, which if present, the addition has overflowed.
Note that any non-overflow result of 255 is for your purposes
indistinguishable from an overflow,
So you just add, test 9th bit, jump around or load or OR 255h.
|
|
0
|
|
|
|
Reply
|
Terence
|
4/11/2008 11:40:10 PM
|
|
On Apr 12, 9:40�am, Terence <spamt...@crayne.org> wrote:
> I wouldn't do it that way.
> I would add the two supposedly unsigned 8-bit integers and test the 9
> th bit for 1, which if present, the addition has overflowed.
>
> Note that any non-overflow result of 255 is for your purposes
> indistinguishable from an overflow,
> So you just add, test 9th bit, jump around or load or OR 255h.
Make that last "255h" into 255 or else 0FFh! (Sigh!)
|
|
0
|
|
|
|
Reply
|
Terence
|
4/12/2008 6:28:22 AM
|
|
Terence wrote:
>> I wouldn't do it that way.
>> I would add the two supposedly unsigned 8-bit integers and test the 9
>> th bit for 1, which if present, the addition has overflowed.
>> Note that any non-overflow result of 255 is for your purposes
>> indistinguishable from an overflow,
>> So you just add, test 9th bit, jump around or load or OR 255h.
> Make that last "255h" into 255 or else 0FFh! (Sigh!)
ok. :)
First I thought: 'what's wrong by using the Carry-flag ? ',
but now I may see the flaw in the whole story.
The result can turn out to be FF(h) without an overflow,
so using FF(h) as overflow indication may lead to errors.
;just use carry-flag as indication:
ADD DL,AL
JC error
;or expand it to a true result
MOV DH,0 ;
ADD DL,AL ;also possible: MOVZX DX,AL
ADC DH,0 ;DH NZ meens overflow but DX holds true value
;or use just bit0 of DH for overflow (a bit shorter/faster):
ADD DL,AL
SETC DH ;DH NZ means overflow
;or if DL should just be clamped to FF(h):
ADD DL,AL
SETC DH ;becomes 0 or 1
NEG DH ;becomes 0 or FF
OR DL,DH ;DL keeps value or becomes FF(h)
SETcc can avoid branches and gain speed by this,
but it will not work on older CPUs.
(my example suffers from partial register stalls and
dependencies, so it should by used within other pairing
instructions for full speed),
__
wolfgang
|
|
0
|
|
|
|
Reply
|
Wolfgang
|
4/12/2008 10:51:54 AM
|
|
Terence <spamtrap@crayne.org> wrote in part:
> On Apr 12, 9:40�am, Terence <spamt...@crayne.org> wrote:
>> I wouldn't do it that way. I would add the two supposedly
>> unsigned 8-bit integers and test the 9 th bit for 1,
>> which if present, the addition has overflowed.
CF is also set with two byte adds.
>> Note that any non-overflow result of 255 is for your
>> purposes indistinguishable from an overflow, So you just
>> add, test 9th bit, jump around or load or OR 255h.
> Make that last "255h" into 255 or else 0FFh! (Sigh!)
Well, the OP did specify an old 808x which gives a more
complex answer:
Terje's clever SBB/OR is fastest on modern CPUs by a
great deal since there are no poorly predictable branches.
Your JC saves one instruction (two byte-transfers),
but only on the non-saturate case. For the saturate case,
it costs two more instr ( MOV/JMP ) [4 bytes].
Optimum will depend on the data.
-- Robert
|
|
0
|
|
|
|
Reply
|
Robert
|
4/12/2008 7:29:18 PM
|
|
On 11 Apr, 08:41, Terje Mathisen <spamt...@crayne.org> wrote:
> What's wrong with the obvious approach?
>
> � add al,dl � � ; Carry set if it overflows!
> � sbb al,al � � ; AL = 0xff if overflow
> � or dl,al � � �; Turns any overflow into 0xff
Very neat. Now, what about the signed case (saturating at +127 or
-128)? The best I've managed is:
add al,dl ; Signed addition
jno ok ; Jump if no overflow
sbb al,al ; al is 0x00 or 0xff
xor al,127 ; Saturate to -128 or +127
ok:
Richard.
http://www.rtrussell.co.uk/
To reply by email change 'news' to my forename.
|
|
0
|
|
|
|
Reply
|
Richard
|
4/12/2008 8:52:03 PM
|
|
On Apr 13, 5:29�am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
> CF is also set with two byte adds.
But this never occurs with adding one byte to one bye in two-byte
registers (with zero bits to the left). The objective is to count to
no more than 254 since 255 itself is the overflow signal.
|
|
0
|
|
|
|
Reply
|
Terence
|
4/13/2008 12:03:07 AM
|
|
"Terence" <spamtrap@crayne.org> wrote in message
news:55e84590-158c-4042-b75c-1bf284404b9d@p25g2000pri.googlegroups.com...
> On Apr 13, 5:29 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
>
> > CF is also set with two byte adds.
>
> But this never occurs with adding one byte to one bye in two-byte
> registers (with zero bits to the left). The objective is to count to
> no more than 254 since 255 itself is the overflow signal.
>
Do you like these two changes to "Terje's solution"?
inc al
adc dl,al ; Carry set if it overflows!
sbb al,al ; AL = 0xff if overflow
or dl,al ; Turns any overflow into 0xff
Rod Pemberton
|
|
0
|
|
|
|
Reply
|
Rod
|
4/13/2008 3:06:56 AM
|
|
"Terence" <spamtrap@crayne.org> wrote in message
news:55e84590-158c-4042-b75c-1bf284404b9d@p25g2000pri.googlegroups.com...
> On Apr 13, 5:29 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
>
> > CF is also set with two byte adds.
>
> But this never occurs with adding one byte to one bye in two-byte
> registers (with zero bits to the left). The objective is to count to
> no more than 254 since 255 itself is the overflow signal.
>
Oh, you didn't like the prior two changes? (inc doesn't update carry...)
Do you like these two changes (instead) to "Terje's solution"?
add al, 1
adc dl,al ; Carry set if it overflows!
sbb al,al ; AL = 0xff if overflow
or dl,al ; Turns any overflow into 0xff
Rod Pemberton
|
|
0
|
|
|
|
Reply
|
Rod
|
4/13/2008 3:10:57 AM
|
|
Terence wrote:
> On Apr 13, 5:29 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
>
>> CF is also set with two byte adds.
>
> But this never occurs with adding one byte to one bye in two-byte
> registers (with zero bits to the left). The objective is to count to
> no more than 254 since 255 itself is the overflow signal.
That's a red herring:
It really doesn't matter if you say:
if ((sum = a+b) > 254) sum = 255;
or if you write:
if ((sum = a+b) > 255) sum = 255;
Please notice that as long as the sum is above 254, saturation will
always return 255, and at that point there is no way to determine if
this was the result of saturation after overflow, or simply because the
sum actually was 255!
OK?
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
4/13/2008 1:00:06 PM
|
|
Richard Russell wrote:
> On 11 Apr, 08:41, Terje Mathisen <spamt...@crayne.org> wrote:
>> What's wrong with the obvious approach?
>>
>> add al,dl ; Carry set if it overflows!
>> sbb al,al ; AL = 0xff if overflow
>> or dl,al ; Turns any overflow into 0xff
>
> Very neat. Now, what about the signed case (saturating at +127 or
> -128)? The best I've managed is:
>
> add al,dl ; Signed addition
> jno ok ; Jump if no overflow
> sbb al,al ; al is 0x00 or 0xff
> xor al,127 ; Saturate to -128 or +127
> ok:
That is pretty neat code, I like it!
The generic way to handle arbitrary clamping values is something like this:
;; if ( x < MIN) x = MIN;
;; else if (x > MAX) x = MAX;
cmp eax,MIN ; Carry set if (eax < MIN)
sbb ebx,ebx ; EBX = -1 if underflow
cmp eax,MAX+1 ; Carry clear if (eax > MAX)
adc ebx,ebx ; Merge carry with previous result
At this point there are 4 possible combinations:
second 0 second 1
first 0 OVFL OK
first 1 impossible UNDFL
These four corresponds to EBX = 0, 1, -2, -1 which means that it is
possible to store the original value into a four-element table, then use
EBX to retrieve either the same value back or the clamped results:
cmp eax,MIN ; Carry set if (eax < MIN)
sbb ebx,ebx ; EBX = -1 if underflow
cmp eax,MAX+1 ; Carry clear if (eax > MAX)
adc ebx,ebx ; Merge carry with previous result
mov clamp_table[4], eax
mov eax,clamp_table[ebx*4+8]
This code will only be faster than the naive approach if clamped values
are both common and non-predictable, if this isn't true then it is much
better to simply use the generic C-style if/else if/else approach!
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
4/13/2008 1:31:59 PM
|
|
"Terje Mathisen" <spamtrap@crayne.org> wrote in message
news:rrSdnYmBeZ1Knp_VnZ2dneKdnZydnZ2d@giganews.com...
> Terence wrote:
> > On Apr 13, 5:29 am, Robert Redelmeier <red...@ev1.net.invalid> wrote:
> >
> >> CF is also set with two byte adds.
> >
> > But this never occurs with adding one byte to one bye in two-byte
> > registers (with zero bits to the left). The objective is to count to
> > no more than 254 since 255 itself is the overflow signal.
>
> That's a red herring:
>
> It really doesn't matter if you say:
>
> if ((sum = a+b) > 254) sum = 255;
>
> or if you write:
>
> if ((sum = a+b) > 255) sum = 255;
>
> Please notice that as long as the sum is above 254, saturation will
> always return 255, and at that point there is no way to determine if
> this was the result of saturation after overflow, or simply because the
> sum actually was 255!
>
> OK?
>
Let's plug some values into the code you posted:
> add dl,al ; Carry set if it overflows!
> sbb al,al ; AL = 0xff if overflow
> or dl,al ; Turns any overflow into 0xff
if (al+dl) = 0, result al=0, dl=0
if 1< (al+dl) < 254, result al=0, dl=dl<>0
if (al+dl) = 255, result al=0, dl=dl=0xff
if 255 < (al+dl), result al=0xff, dl=0xff
dl is 0xff at sum > 254, but al is 0xff at sum > 255 ...
> Please notice that as long as the sum is above 254, saturation will
> always return 255,
False, if one uses al. True, if one uses dl.
> and at that point there is no way to determine if
> this was the result of saturation after overflow, or simply because the
> sum actually was 255!
False, if one uses al. True, if one uses dl.
If Jim needs to distinguish 255 from overflow as Terence noted, perhaps this
is sufficient:
add dl,al ; Carry set if it overflows!
sbb dl,dl ; DL = 0xff if overflow
if (al+dl) = 0, result dl=0, al=al
if 1< (al+dl) < 254, result dl=0, al=al
if (al+dl) = 255, result dl=0, al=al
if 255 < (al+dl), result dl=0xff, al=al
Rod Pemberton
|
|
0
|
|
|
|
Reply
|
Rod
|
4/13/2008 3:27:29 PM
|
|
On Apr 13, 8:27 am, "Rod Pemberton" <spamt...@crayne.org> wrote:
> "Terje Mathisen" <spamt...@crayne.org> wrote in message
....
While we're at it, how about complementing this with a similar
solution for signed saturation (-128 to +127) of signed bytes for the
fun of it? :)
Alex
|
|
0
|
|
|
|
Reply
|
Alexei
|
4/14/2008 12:19:25 AM
|
|
"Alexei A. Frounze" <spamtrap@crayne.org> wrote in message
news:58afc87d-dcb4-4688-97d6-be2c044c8828@s13g2000prd.googlegroups.com...
> On Apr 13, 8:27 am, "Rod Pemberton" <spamt...@crayne.org> wrote:
> > "Terje Mathisen" <spamt...@crayne.org> wrote in message
> ...
>
> While we're at it, how about complementing this with a similar
> solution for signed saturation (-128 to +127) of signed bytes for the
> fun of it? :)
>
??? How's your server?
http://groups.google.com/group/comp.lang.asm.x86/msg/36af00ed6b8201ae
http://groups.google.com/group/comp.lang.asm.x86/msg/b994b1ae492f9833
RP
|
|
0
|
|
|
|
Reply
|
Rod
|
4/14/2008 6:41:38 AM
|
|
On Apr 13, 11:41 pm, "Rod Pemberton" <spamt...@crayne.org> wrote:
> "Alexei A. Frounze" <spamt...@crayne.org> wrote in messagenews:58afc87d-dcb4-4688-97d6-be2c044c8828@s13g2000prd.googlegroups.com...
>
> > On Apr 13, 8:27 am, "Rod Pemberton" <spamt...@crayne.org> wrote:
> > > "Terje Mathisen" <spamt...@crayne.org> wrote in message
> > ...
>
> > While we're at it, how about complementing this with a similar
> > solution for signed saturation (-128 to +127) of signed bytes for the
> > fun of it? :)
>
> ??? How's your server?
>
> http://groups.google.com/group/comp.lang.asm.x86/msg/36af00ed6b8201aehttp://groups.google.com/group/comp.lang.asm.x86/msg/b994b1ae492f9833
>
> RP
Missed the posts. Still I was wondering if someone came up with a
compact method not involving jumps and LUTs.
Alex
|
|
0
|
|
|
|
Reply
|
Alexei
|
4/14/2008 10:39:12 AM
|
|
On Apr 13, 10:27 am, "Rod Pemberton" <spamt...@crayne.org> wrote:
> If Jim needs to distinguish 255 from overflow as Terence noted, perhaps this
> is sufficient:
I do not need to distinguish 255 from overflow.
The quick background for this is that it is the clamping function for
adding a signed 8-bit integer to an unsigned 8-bit integer. The
function is an adapted ADPCM routine; whereas most ADPCM works on
signed 16-bit samples, I have a need for it to work on unsigned 8-bit
samples.
The inner loop of the routine has a previous unsigned sample, and a
new signed sample, and needs to add them together to get the new
sample. The result needs to clamp at both 0 and 255.
With this in mind, what is the best solution, assuming the previous
unsigned sample is in DL and the new signed sample is in AL?
|
|
0
|
|
|
|
Reply
|
Jim
|
4/14/2008 1:58:12 PM
|
|
In article <7f1061af-6435-47c0-b2e3-38b64cb1f044@a22g2000hsc.googlegroups.com>
spamtrap@crayne.org "Jim Leonard " writes:
> On Apr 13, 10:27 am, "Rod Pemberton" <spamt...@crayne.org> wrote:
> > If Jim needs to distinguish 255 from overflow as Terence noted, perhaps this
> > is sufficient:
>
> I do not need to distinguish 255 from overflow.
>
> The quick background for this is that it is the clamping function for
> adding a signed 8-bit integer to an unsigned 8-bit integer. The
> function is an adapted ADPCM routine; whereas most ADPCM works on
> signed 16-bit samples, I have a need for it to work on unsigned 8-bit
> samples.
> The inner loop of the routine has a previous unsigned sample, and a
> new signed sample, and needs to add them together to get the new
> sample. The result needs to clamp at both 0 and 255.
>
> With this in mind, what is the best solution, assuming the previous
> unsigned sample is in DL and the new signed sample is in AL?
Just to be absolutely clear about the problem (and without any
attempt at optimising the code), can you confirm that the below
is what you're after?
(unsigned sample in DL, signed "update" in AL)
sub dh, dh ;DH = 0
cbw ;sign-extend AL to AX
add dx, ax ;signed addition
Now:
if DH == 0, nothing to do: ans in DL
if DH == FF, -ve overflow so clamp to 0: set DL = 0
if DH == 1, +ve overflow so clamp to FF: set DL = FF
Is that correct?
Pete
--
"We have not inherited the earth from our ancestors,
we have borrowed it from our descendants."
|
|
0
|
|
|
|
Reply
|
pete
|
4/15/2008 5:11:19 AM
|
|
On Apr 15, 12:11 am, pete <spamt...@crayne.org> wrote:
> Just to be absolutely clear about the problem (and without any
> attempt at optimizing the code), can you confirm that the below
> is what you're after?
>
> (unsigned sample in DL, signed "update" in AL)
>
> sub dh, dh ;DH = 0
> cbw ;sign-extend AL to AX
> add dx, ax ;signed addition
> Now:
> if DH == 0, nothing to do: ans in DL
> if DH == FF, -ve overflow so clamp to 0: set DL = 0
> if DH == 1, +ve overflow so clamp to FF: set DL = FF
>
> Is that correct?
Yes, that's exactly correct. My beginner's attempt at applying the
clamping has been done with JMPs, which are really costly in an inner
loop (especially on my lowly 808x) and I feel that I'm missing
something obvious.
Was Terje's solution:
add dl,al ; Carry set if it overflows!
sbb al,al ; AL = 0xff if overflow
or dl,al ; Turns any overflow into 0xff
....the right one? I see overflow handling but not underflow...
|
|
0
|
|
|
|
Reply
|
Jim
|
4/15/2008 2:30:34 PM
|
|
In article <9c1c6a51-9572-4ffb-9546-b94987c1720f@l42g2000hsc.googlegroups.com>
spamtrap@crayne.org "Jim Leonard " writes:
> On Apr 15, 12:11 am, pete <spamt...@crayne.org> wrote:
> > Just to be absolutely clear about the problem (and without any
> > attempt at optimizing the code), can you confirm that the below
> > is what you're after?
> >
> > (unsigned sample in DL, signed "update" in AL)
> >
> > sub dh, dh ;DH = 0
> > cbw ;sign-extend AL to AX
> > add dx, ax ;signed addition
> > Now:
> > if DH == 0, nothing to do: ans in DL
> > if DH == FF, -ve overflow so clamp to 0: set DL = 0
> > if DH == 1, +ve overflow so clamp to FF: set DL = FF
> >
> > Is that correct?
>
> Yes, that's exactly correct. My beginner's attempt at applying the
> clamping has been done with JMPs, which are really costly in an inner
> loop (especially on my lowly 808x) and I feel that I'm missing
> something obvious.
>
> Was Terje's solution:
>
> add dl,al ; Carry set if it overflows!
> sbb al,al ; AL = 0xff if overflow
> or dl,al ; Turns any overflow into 0xff
>
> ...the right one? I see overflow handling but not underflow...
Well, I'd guess that Terje has more expertise in his little
finger that I in my entire body :-) but below is the best I can
come up with -- still a jmp in there though...
; unsigned sample in DL, signed "update" in AL
sub dh, dh
cbw
add dx, ax
or dh, dh
jz done
; either DH == FF (-ve overflow) or DH == 1 (+ve overflow)
sar dh, 1 ;now DH is 0 or FF, but the wrong way round
not dh ;invert it
mov dl, dh
done:
Perhaps (certainly!) someone can improve on that?
Pete
--
"We have not inherited the earth from our ancestors,
we have borrowed it from our descendants."
|
|
0
|
|
|
|
Reply
|
pete
|
4/17/2008 5:05:43 PM
|
|
pete wrote:
> In article <9c1c6a51-9572-4ffb-9546-b94987c1720f@l42g2000hsc.googlegroups.com>
>> Was Terje's solution:
>>
>> add dl,al ; Carry set if it overflows!
>> sbb al,al ; AL = 0xff if overflow
>> or dl,al ; Turns any overflow into 0xff
>>
>> ...the right one? I see overflow handling but not underflow...
That didn't even try to handle a unsigned/signed mix.
>
> Well, I'd guess that Terje has more expertise in his little
This only means I have more scars and have made more mistakes than most
of the rest of you.
> finger that I in my entire body :-) but below is the best I can
> come up with -- still a jmp in there though...
>
> ; unsigned sample in DL, signed "update" in AL
> sub dh, dh
> cbw
> add dx, ax
> or dh, dh
> jz done
> ; either DH == FF (-ve overflow) or DH == 1 (+ve overflow)
> sar dh, 1 ;now DH is 0 or FF, but the wrong way round
> not dh ;invert it
> mov dl, dh
> done:
What about converting the problem (sort of) to an unsigned case?
; The two first instructions can be skipped if the high
; halfs are known to be zero
and dx,255
and ax,255
; Move the signed update into the [0-255] range:
add ax,128
add ax,dx ; Result must be in [0..510] range, of which
; 128..383 is in range
; The trivial way to clamp this is to adjust & branch:
sub ax,128
jc underflow
cmp ax,256
jnc overflow
Can we do this differently? Sure, but only if clamping is both common
and non-predictable:
sub ax,128
mov dx,0xff00
cmovc al,dl
cmp ax,256
cmovnc al,dh
How about doing it without conditional moves?
sub ax,128
sbb dx,dx ; -1 if underflow
cmp ax,256
sbb bx,bx
xor dx,-1
and ax,dx
or ax,bx
Anyway, unless clamping is very common, it is hard to beat the branch
predictors!
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
4/17/2008 6:14:08 PM
|
|
On 17 Apr, 20:14, Terje Mathisen <spamt...@crayne.org> wrote:
> pete wrote:
> >> Was Terje's solution:
>
> >> � add dl,al � � ; Carry set if it overflows!
> >> � sbb al,al � � ; AL = 0xff if overflow
> >> � or dl,al � � �; Turns any overflow into 0xff
>
> >> ...the right one? �I see overflow handling but not underflow...
>
[...]
>
> Anyway, unless clamping is very common, it is hard to beat the branch
> predictors!
>
I don't think the 808x CPU has a branch predictor. I guess one could
just add up the cycles used for each instruction on such an old CPU
and come up with the best performing code for various statistics of
the data. It should be pretty much deterministic (does it even use
cache memory?).
BR,
Danjel McGougan
|
|
0
|
|
|
|
Reply
|
Danjel
|
4/18/2008 9:31:30 AM
|
|
Danjel McGougan wrote:
> On 17 Apr, 20:14, Terje Mathisen <spamt...@crayne.org> wrote:
>> pete wrote:
>>>> Was Terje's solution:
>>>> add dl,al ; Carry set if it overflows!
>>>> sbb al,al ; AL = 0xff if overflow
>>>> or dl,al ; Turns any overflow into 0xff
>>>> ...the right one? I see overflow handling but not underflow...
> [...]
>> Anyway, unless clamping is very common, it is hard to beat the branch
>> predictors!
>>
>
> I don't think the 808x CPU has a branch predictor. I guess one could
Ouch, I forgot about the 8088/8086 target!
> just add up the cycles used for each instruction on such an old CPU
> and come up with the best performing code for various statistics of
> the data. It should be pretty much deterministic (does it even use
> cache memory?).
The only cache at all is a 6/8 byte prefetch queue for instruction
bytes, but on 8088 that queue is very often empty since it has to
compete with all memory accesses.
The performance is easy to calculate though:
Without taken branches:
4 cycles per byte touched, code & data.
Taken branches were something like 4-8 cycles afair, might be wrong.
; Input: AL = signed byte, DL = unsigned byte
; Output: AL unsigned clamped byte
xor dx,dx
next: ; bytes
mov dl,[bx+si] ; 2+1
lodsb ; 1+1
cbw ; 1 -128 to 127
add ax,dx ; 2 -128 to 383
js underflow ; 2
test ah,ah ; 2
jnz overflow ; 2
store:
stosb ; 1+1 Store clamped result
jmp next ; 2+t
This looks like 14 bytes total, i.e. about 56 cycles for a nonclamped
result.
underflow:
xor ax,ax ; 2
stosb ; 1+1
jmp next ; 2+t
Underflow adds a taken branch but get rid of 2 code bytes, so we're
talking about +/- zero extra cost.
overflow:
mov al,255 ; 3
stosb ; 1+1
jmp next ; 2+t
Overflow requires 3 extra code bytes and a taken branch, i.e.
approximately 16+ cycles.
If most samples turn out to be in range, then we should simplify the
code to optimize this path even further:
xor dx,dx
next: ; bytes
mov dl,[bx+si] ; 2+1
lodsb ; 1+1
cbw ; 1 -128 to 127
add ax,dx ; 2 -128 to 383
test ah,0ffh
jnz clamp
stosb
jmp next
clamp:
js underflow
mov al,255
stosb
jmp next
underflow:
xor ax,ax
stosb
jmp next
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
4/18/2008 6:21:01 PM
|
|
On the real 8088 IBM PC the cycle count is total bytes in the
instruction and data being referenced times the wait states. The PC
had so many waits on memory access that only a few instructions such
as multiply and divide would take more CPU cycles than the fetch.
On Fri, 18 Apr 2008 02:31:30 -0700 (PDT), Danjel McGougan
<spamtrap@crayne.org> wrote:
>On 17 Apr, 20:14, Terje Mathisen <spamt...@crayne.org> wrote:
>> pete wrote:
>> >> Was Terje's solution:
>>
>> >> � add dl,al � � ; Carry set if it overflows!
>> >> � sbb al,al � � ; AL = 0xff if overflow
>> >> � or dl,al � � �; Turns any overflow into 0xff
>>
>> >> ...the right one? �I see overflow handling but not underflow...
>>
>[...]
>>
>> Anyway, unless clamping is very common, it is hard to beat the branch
>> predictors!
>>
>
>I don't think the 808x CPU has a branch predictor. I guess one could
>just add up the cycles used for each instruction on such an old CPU
>and come up with the best performing code for various statistics of
>the data. It should be pretty much deterministic (does it even use
>cache memory?).
>
>BR,
>Danjel McGougan
|
|
0
|
|
|
|
Reply
|
dave
|
4/18/2008 8:57:11 PM
|
|
On Apr 18, 3:57 pm, dave <spamt...@crayne.org> wrote:
> On the real 8088 IBM PC the cycle count is total bytes in the
> instruction and data being referenced times the wait states. The PC
> had so many waits on memory access that only a few instructions such
> as multiply and divide would take more CPU cycles than the fetch.
....but they still had to execute. Just because an opcode takes 4 or 8
cycles to fetch doesn't mean its execution time is magically erased.
Since the (tiny) prefetch queue can be full after a slow operation,
optimization still matters.
I originally asked about the topic because a jump clears the prefetch
queue, and was trying to find a way to clamp without jumping. I have
to ask, though: The penalty on my CPU for jumping is 1. clearing the
prefetch queue and 2. 17 cycles. I will have to look at Terje's
suggestions and see if the solutions he provided actually beat that,
which may have been what he was referring to when he wrote "unless
clamping is very common, it is hard to beat the branch
predictors!"
|
|
0
|
|
|
|
Reply
|
Jim
|
4/21/2008 2:23:39 PM
|
|
On Apr 18, 1:21 pm, Terje Mathisen <spamt...@crayne.org> wrote:
> The only cache at all is a 6/8 byte prefetch queue for instruction
> bytes, but on 8088 that queue is very often empty since it has to
> compete with all memory accesses.
4 bytes, actually, on 8088. It's 6 on 8086 and NEC V20/30.
> Taken branches were something like 4-8 cycles afair, might be wrong.
17, actually. That's where the detective work comes in; to "beat" the
penalty of a jump, the alternate solution has to be 16 cycles or less
(including opcode fetch as you noted).
I think, based on what you wrote, that 17 cycles is not that big a
deal when alternate solutions would most likely be longer/slower.
> If most samples turn out to be in range, then we should simplify the
> code to optimize this path even further:
"Most" is subjective, but in an ADPCM decoder (what this code is for)
I would agree that the ratio of needing clamping to not needing is
about 1:8, so yes, most samples will be in range.
> xor dx,dx
> next: ; bytes
> mov dl,[bx+si] ; 2+1
> lodsb ; 1+1
> cbw ; 1 -128 to 127
> add ax,dx ; 2 -128 to 383
> test ah,0ffh
> jnz clamp
> stosb
> jmp next
>
> clamp:
> js underflow
> mov al,255
> stosb
> jmp next
>
> underflow:
> xor ax,ax
> stosb
> jmp next
>
> Terje
Looks like a winner; thanks for all your help!
|
|
0
|
|
|
|
Reply
|
Jim
|
4/21/2008 2:33:56 PM
|
|
Jim Leonard wrote:
> On Apr 18, 1:21 pm, Terje Mathisen <spamt...@crayne.org> wrote:
>> The only cache at all is a 6/8 byte prefetch queue for instruction
>> bytes, but on 8088 that queue is very often empty since it has to
>> compete with all memory accesses.
>
> 4 bytes, actually, on 8088. It's 6 on 8086 and NEC V20/30.
>
>> Taken branches were something like 4-8 cycles afair, might be wrong.
>
> 17, actually. That's where the detective work comes in; to "beat" the
> penalty of a jump, the alternate solution has to be 16 cycles or less
> (including opcode fetch as you noted).
>
> I think, based on what you wrote, that 17 cycles is not that big a
> deal when alternate solutions would most likely be longer/slower.
>
>> If most samples turn out to be in range, then we should simplify the
>> code to optimize this path even further:
>
> "Most" is subjective, but in an ADPCM decoder (what this code is for)
> I would agree that the ratio of needing clamping to not needing is
> about 1:8, so yes, most samples will be in range.
>
>> xor dx,dx
>> next: ; bytes
>> mov dl,[bx+si] ; 2+1
>> lodsb ; 1+1
>> cbw ; 1 -128 to 127
>> add ax,dx ; 2 -128 to 383
>> test ah,0ffh
>> jnz clamp
>> stosb
>> jmp next
>>
>> clamp:
>> js underflow
>> mov al,255
>> stosb
>> jmp next
>>
>> underflow:
>> xor ax,ax
>> stosb
>> jmp next
>>
>> Terje
>
> Looks like a winner; thanks for all your help!
>
Might it be "better" (at least smaller code) to place the stosb after
the "next" label? but then you'd need a setup JMP to start,
[and a "cx=0" test at some point!]
xor dx,dx
jmp short start
next: ; bytes
stosb
start:
mov dl,[bx+si] ; 2+1
lodsb ; 1+1
cbw ; 1 -128 to 127
add ax,dx ; 2 -128 to 383
test ah,0ffh
jz next
clamp:
js underflow
mov al,255
jmp next
underflow:
xor ax,ax
jmp next
|
|
0
|
|
|
|
Reply
|
Esra
|
4/21/2008 8:07:15 PM
|
|
Hello. You can use carry to generate a saturation mask. The idea
revolves around this concept:
0 - 1 = -1 (all bits set)
0 - 0 = 0
add al, bl ; this is ur addition
xor cl, cl ; zero d cl register
sbb cl, cl ; cl = cl - cl - carry = 0 - 0 - carry = 0 - carry
or al, cl ; this will saturate; the cl = 0xff if carry was set.. else
it is 0x00
w00t++ w00t++
Saturating w/o branching is child's play; enjoy!
|
|
0
|
|
|
|
Reply
|
aku
|
4/22/2008 7:38:05 AM
|
|
Esra Sdrawkcab wrote:
> Might it be "better" (at least smaller code) to place the stosb after
> the "next" label? but then you'd need a setup JMP to start,
> [and a "cx=0" test at some point!]
>
> xor dx,dx
> jmp short start
>
> next: ; bytes
> stosb
> start:
> mov dl,[bx+si] ; 2+1
> lodsb ; 1+1
> cbw ; 1 -128 to 127
> add ax,dx ; 2 -128 to 383
> test ah,0ffh
> jz next
Yes indeed, this is a very common/useful optimization, I didn't do this
simply because I didn't know what the surrounding code looked like, I
just wrote a minimal wrapper around the core clamping code.
BTW, there is a very sneaky way to avoid that initial jump:
Instead of
jmp short start
next:
stosb
start:
you can instead write:
mov al,0AAh
org $-1 ; Back up one byte into the immediate constant above
next:
stosb
I.e. on the first iteration we do a dummy two-byte instruction with a
one-byte immediate, which we then make sure contains the STOSB opcode byte.
The gain is that we save one opcode byte (MOV instead of JMP SHORT) and
a taken branch, so the actual cost of rotating the loop like this is 8
cycles on the first iteration, instead of 17, and each iteration saves
two opcode bytes or 8 cycles.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
4/22/2008 6:01:06 PM
|
|
aku ankka wrote:
> Hello. You can use carry to generate a saturation mask. The idea
> revolves around this concept:
>
> 0 - 1 = -1 (all bits set)
> 0 - 0 = 0
>
> add al, bl ; this is ur addition
> xor cl, cl ; zero d cl register
This XOR will overwrite the flags from the previous ADD!
INC/DEC are the only arithmetic opcodes which doesn't modify the carry flag.
> sbb cl, cl ; cl = cl - cl - carry = 0 - 0 - carry = 0 - carry
> or al, cl ; this will saturate; the cl = 0xff if carry was set.. else
> it is 0x00
>
> w00t++ w00t++
>
> Saturating w/o branching is child's play; enjoy!
Children often make simple mistakes. :-)
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
4/22/2008 6:03:00 PM
|
|
On Apr 22, 9:03�pm, Terje Mathisen <spamt...@crayne.org> wrote:
> aku ankka wrote:
> > Hello. You can use carry to generate a saturation mask. The idea
> > revolves around this concept:
>
> > 0 - 1 = -1 (all bits set)
> > 0 - 0 = 0
>
> > add al, bl ; this is ur addition
> > xor cl, cl ; zero d cl register
>
> This XOR will overwrite the flags from the previous ADD!
> INC/DEC are the only arithmetic opcodes which doesn't modify the carry flag.
>
> > sbb cl, cl ; cl = cl - cl - carry = 0 - 0 - carry = 0 - carry
> > or �al, cl ; this will saturate; the cl = 0xff if carry was set.. else
> > it is 0x00
>
> > w00t++ w00t++
>
> > Saturating w/o branching is child's play; enjoy!
>
> Children often make simple mistakes. :-)
>
> Terje
>
> --
> - <Terje.Mathi...@hda.hydro.com>
> "almost all programming can be viewed as an exercise in caching"
Yea; d xor was unnecessary, I was about to post follow-up but didn't
see dat post o mine because of moderation lag. But since u tried to to
correct tha sh1t, here's what I wanted to post about five seconds
after hitting d send button:
a - a - carry = 0 - carry
In other words the register doesn't need to be zeroed to begin wid.
I'm baffled u didn't see dat one.. I did but wanted to quote miself so
there u go.. thxbye
|
|
0
|
|
|
|
Reply
|
aku
|
4/23/2008 6:43:47 AM
|
|
Terje Mathisen wrote:
> you can instead write:
>
> mov al,0AAh
> org $-1 ; Back up one byte into the immediate constant above
> next:
> stosb
>
> I.e. on the first iteration we do a dummy two-byte instruction with a
> one-byte immediate, which we then make sure contains the STOSB opcode byte.
>
> The gain is that we save one opcode byte (MOV instead of JMP SHORT) and
> a taken branch, so the actual cost of rotating the loop like this is 8
> cycles on the first iteration, instead of 17, and each iteration saves
> two opcode bytes or 8 cycles.
>
Neat.
|
|
0
|
|
|
|
Reply
|
Esra
|
4/23/2008 8:20:41 AM
|
|
On Apr 22, 1:01 pm, Terje Mathisen <spamt...@crayne.org> wrote:
> BTW, there is a very sneaky way to avoid that initial jump:
>
> Instead of
>
> jmp short start
> next:
> stosb
> start:
>
> you can instead write:
>
> mov al,0AAh
> org $-1 ; Back up one byte into the immediate constant above
> next:
> stosb
>
> I.e. on the first iteration we do a dummy two-byte instruction with a
> one-byte immediate, which we then make sure contains the STOSB opcode byte.
>
> The gain is that we save one opcode byte (MOV instead of JMP SHORT) and
> a taken branch, so the actual cost of rotating the loop like this is 8
> cycles on the first iteration, instead of 17, and each iteration saves
> two opcode bytes or 8 cycles.
I hate to seem more daft than I already am, but after re-reading the
above for a week I still don't get the trick. I thought ORG was for
specifying where the assembler assumes the binary code will get
loaded; how does that save two opcode bytes per iteration?
|
|
0
|
|
|
|
Reply
|
Jim
|
5/1/2008 8:13:06 PM
|
|
In article <25740676-bb54-4871-b103-d3b7135e4f28@j22g2000hsf.googlegroups.com>
spamtrap@crayne.org "Jim Leonard" writes:
> On Apr 22, 1:01 pm, Terje Mathisen <spamt...@crayne.org> wrote:
> > BTW, there is a very sneaky way to avoid that initial jump:
> >
> > Instead of
> >
> > jmp short start
> > next:
> > stosb
> > start:
> >
> > you can instead write:
> >
> > mov al,0AAh
> > org $-1 ; Back up one byte into the immediate constant above
> > next:
> > stosb
> >
> > I.e. on the first iteration we do a dummy two-byte instruction with a
> > one-byte immediate, which we then make sure contains the STOSB opcode byte.
> >
> > The gain is that we save one opcode byte (MOV instead of JMP SHORT) and
> > a taken branch, so the actual cost of rotating the loop like this is 8
> > cycles on the first iteration, instead of 17, and each iteration saves
> > two opcode bytes or 8 cycles.
>
> I hate to seem more daft than I already am, but after re-reading the
> above for a week I still don't get the trick.
The "trick" is to locate the next: label in the middle of the
instruction 'mov al,imm8' -- the 'org $-1' winds back the program
counter one location and assembles the STOSB opcode in the
location occupied by the imm8 value of the previous instruction.
This results in a benign 'mov al,0AAh' first time through the
code (benign in the sense that AL will soon after be loaded with
the proper value) but when looping back to the next: label the
code will find a STOSB (i.e. jumps into the middle of the mov
instruction).
> I thought ORG was for
> specifying where the assembler assumes the binary code will get
> loaded; how does that save two opcode bytes per iteration?
This saving comes from having a single STOSB at the "top" of the
loop; the alternative earlier loop had 3 STOSB instructions (one
in each branch) -- hence the saving of 2 x STOSB.
Very neat and goes some way to justify my earlier comment about
Terje's little finger ;-)
Pete
Incidentally, I've used the 'org $-1' technique in the past for
self-modifying code to jump to alternative addresses (like a
switch() in C). Something like
jmp short +2
org $-1
disp db ?
proc1: some code
ret
proc2: more code etc...
ret
Then, at some point before the jmp you just load up the
displacement byte (disp) with 0, (proc2-proc1), (proc3-proc1),
etc. according to which section of code you want to jump to.
Not much use on modern processors I guess, but it worked well on
the 8086 and z80!
--
"We have not inherited the earth from our ancestors,
we have borrowed it from our descendants."
|
|
0
|
|
|
|
Reply
|
pete
|
5/2/2008 4:38:05 AM
|
|
Jim Leonard wrote:
> On Apr 22, 1:01 pm, Terje Mathisen <spamt...@crayne.org> wrote:
>> The gain is that we save one opcode byte (MOV instead of JMP SHORT) and
>> a taken branch, so the actual cost of rotating the loop like this is 8
>> cycles on the first iteration, instead of 17, and each iteration saves
>> two opcode bytes or 8 cycles.
>
> I hate to seem more daft than I already am, but after re-reading the
> above for a week I still don't get the trick. I thought ORG was for
> specifying where the assembler assumes the binary code will get
> loaded; how does that save two opcode bytes per iteration?
Pete wrote a good explanation, using ORG together with $ is for relative
instead of absolute addresses.
The two bytes saved per iteration was the unconditional jump from the
bottom of the loop, getting rid of those required the loop to be
rotated, i.e. moving the STOSB to the top, and getting rid of the
initial jump past this opcode was the reason for my dummy immediate
instruction.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
5/2/2008 7:11:22 AM
|
|
Terje Mathisen wrote:
>
> BTW, there is a very sneaky way to avoid that initial jump:
>
> Instead of
>
> jmp short start
> next:
> stosb
> start:
>
> you can instead write:
>
> mov al,0AAh
> org $-1 ; Back up one byte into the immediate constant above
> next:
> stosb
>
> I.e. on the first iteration we do a dummy two-byte instruction with a
> one-byte immediate, which we then make sure contains the STOSB opcode byte.
>
It's worth noting that this will cause an L1 cache miss on some platforms.
-hpa
|
|
0
|
|
|
|
Reply
|
H
|
6/2/2008 4:43:53 PM
|
|
H. Peter Anvin wrote:
> Terje Mathisen wrote:
>>
>> BTW, there is a very sneaky way to avoid that initial jump:
>>
>> Instead of
>>
>> jmp short start
>> next:
>> stosb
>> start:
>>
>> you can instead write:
>>
>> mov al,0AAh
>> org $-1 ; Back up one byte into the immediate constant above
>> next:
>> stosb
>>
>> I.e. on the first iteration we do a dummy two-byte instruction with a
>> one-byte immediate, which we then make sure contains the STOSB opcode
>> byte.
>>
>
> It's worth noting that this will cause an L1 cache miss on some platforms.
Which was why I did note it in one of my posts, and returned here when
it was clear that the target was an original 808x, which really didn't
care about stuff like cache coherence or partially decoded instructions. :-)
The trick was a win on every x86 up to and including the 486 which was
the last time I used it.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
|
|
0
|
|
|
|
Reply
|
Terje
|
6/2/2008 5:53:52 PM
|
|
|
38 Replies
84 Views
(page loaded in 0.316 seconds)
Similiar Articles: MZDDE toolbox question - comp.soft-sys.matlabHello everyone, ignore part of my first message I added the line zGetRefresh ... If any one knows of a more elegant way to do this please let me know? > Furthermore, the ... Problem with "Send Event" Script Step - comp.databases.filemaker ...However, this seems to be a real Kludge and there must be a more elegant way to accomplish ... devel ... work from the Send Event script step. You don't need to add "cmd /c ... Recursive interpolation function - comp.databases.filemaker ...... you may use contatore=0 as first call), but I'd like to know if there's a more elegant way ... will be second serie of values (y) inside the calculations, and I must add a ... Serialization in Matlab - comp.soft-sys.matlabBut surely there has to be > a more elegant way of serializing the data between these two? ... Matlab - comp.soft-sys.matlab... is there some magic sauce that Java could add ... smf: setting debug level in named? - comp.unix.solaris... bunch of stuff but none of it looks like a way to ... but it seems like changing that ought to be more graceful. ... feel we need this changed is because it adds a level of ... IDL reading grib data on 64bit-machine - comp.lang.idl-pvwave ...... the GRIB routine from the code library was the more elegant ... To run it add the flag -32 to the idl command. I don ... Best way to Read BMP image and write data (in Hex ... finding a quote in a text string - comp.databases.filemaker ...Is there any way to search for a double ... Still, more software usefulness for the ... the string to find text in between any kind of double quotes ... what do I need to add ... How to identify a process which exhibits a transient burst of ...... second, you need to sample at least > at twice or more ... I can't think of an elegant way to catch it. Two options. ... Subject: how to probe newly added PCI NIC? Subject: How ... Balance point of a solid - comp.soft-sys.math.mathematica ...Any thoughts or ways to approach this problem appreciated. ... Cayley-Menger, but Ray's seems somehow more elegant ... Sacramento, CA - comp.dsp They will pass any law that adds ... The "USER" environment variable - comp.unix.solaris... then USER=`whoami` export USER fi It's not elegant ... in sshd_config (X11 fordwarding won't work then any more ... Is there any way to change environment variable of a ... 1-wire to USB converter that can use 1820s directly - comp.home ...... adapter which drives strings of 1820s the same way that device of Pete's does, or add that to ... the PVR for the long haul, and is a lot simpler and more elegant than any ... Signed multiplication - comp.lang.vhdl... at it's heart, is nothing more than a sum of all of the powers in any given number. ... 1101 x 0111) All that happens is you add ... Unless you saturate to most positive number ... efficiency in awk - comp.lang.awkIsn't including the ``library'' > functions by multiple -f's more elegant ... efficiency in awk - comp.lang.awk In gawk, the standard way to do it is to add -f /usr ... shutdown vs. reboot - comp.unix.solaris... to think that it does indeed effect a graceful shutdown. Can anyone confirm one way ... Reboot is more like pulling the plug and I ... sparc it's probably no big deal, add some ... speed issues traversing the DOM. - comp.lang.javascript... to get a better solution, and hopefully a more elegant ... boxes, but it was too ugly..so the need to add more ... // not the more efficient way to detect, blah blah, yeah I know ... Unsaturated Solutions, Saturated Solutions, Supersaturated ...For all solutions the solute must be added to the solvent, NOT the other way around ... It can't saturate any more, and you may or may not see some extra ... CHEMTUTOR SOLUTIONS... the solvent no longer can dissolve any more ... The salt adds ions to the liquid that ... and the areas of lower charge are more soluble in the fat of the cream. In this way ... 7/17/2012 7:18:47 PM
|