hi,
i'm beginner with Mathlab, and my current job is just to rewrite a mathlab program with C/C++ language.
During this work, i have rewrite several mathlab function and check my implementation vs Mathlab output using a precision of 1e-6.
this criteria work well until the end of the program, because result of computation which are 'double' are dump into a file as 'int32'.
I use a round in both implementation (mathlab & C++), but sometimes there are some variations at more than 1e-6 which lead to rounding error.
As i cannot "really" increase or decrease my compiler and FPU accuracy, i think i can control the mathlab accuracy to fit with C++/compiler/ia32 constraint.
Have you a way to do this ?
NB: i use Mathlab R2010a and my compiler is Visual Studio 2008 (floating optimization are disable through /fp:precise).
All help would be welcome.
Thanks.
|
|
0
|
|
|
|
Reply
|
Alex
|
5/20/2010 11:32:03 AM |
|
On 20 Mai, 13:32, "Alex Buisson" <alex.buis...@gmail.com> wrote:
> hi,
> i'm beginner with Mathlab, and my current job is just to rewrite a mathlab program with C/C++ language.
>
> During this work, i have rewrite several mathlab function and check my implementation vs Mathlab output using a precision of 1e-6.
The differences you see are due to your lack of skills in numerical
computing. Writing numerical software that works well is an art that
requires years of deliberate studies, training and practice to
master.
Don't expect to reproduce the results from matlab without investing
the several years worth of effort required to learn numerical
programming.
Rune
|
|
0
|
|
|
|
Reply
|
Rune
|
5/20/2010 11:43:37 AM
|
|
Dear Alex!
> During this work, i have rewrite several mathlab function and check my implementation vs Mathlab output using a precision of 1e-6.
>
> this criteria work well until the end of the program, because result of computation which are 'double' are dump into a file as 'int32'.
After Rune's charming advice to study the problem for some years, I'd suggest to dumb DOUBLE variables as DOUBLEs - rounding to int32 wastes the computational accuracy.
The "precision of 1e-6" is not an exact description. If you want to check the caclulation of SIN and COS, an *absolute* precision of 1e-6 is very rough. If you compare the result of 10^31.1 the absolute precision of 1e-6 is meaningless. Can you give us some more details?
Jan
|
|
0
|
|
|
|
Reply
|
Jan
|
5/20/2010 1:22:04 PM
|
|
On 20 Mai, 15:22, "Jan Simon" <matlab.THIS_Y...@nMINUSsimon.de> wrote:
> Dear Alex!
>
> > During this work, i have rewrite several mathlab function and check my =
implementation vs Mathlab output using a precision of 1e-6.
>
> > this criteria work well until the end of the program, because result of=
computation which are 'double' are dump into a file as 'int32'.
>
> After Rune's charming advice to study the problem for some years,
Only a fool would ridicule that advice.
> I'd suggest to dumb DOUBLE variables as DOUBLEs =A0- rounding to int32 wa=
stes the computational accuracy.
Sure. But assuming the OP knows what he is doing and why he wants
INTs (I know...), there is little more to do than actually learning
how to implement numerics. Which brings us straight back to square 2.
Rune
|
|
0
|
|
|
|
Reply
|
Rune
|
5/20/2010 1:40:31 PM
|
|
Dear Rune!
> > > During this work, i have rewrite several mathlab function and check my implementation vs Mathlab output using a precision of 1e-6.
> > > this criteria work well until the end of the program, because result of computation which are 'double' are dump into a file as 'int32'.
> >
> > After Rune's charming advice to study the problem for some years,
>
> Only a fool would ridicule that advice.
The OP "thinks he can control the mathlab accuracy to fit with C++/compiler/ia32 constraint". He will need a more practical advice than starting to study numerics for years: You cannot control the accuracy of Matlab. If two (correct) implementations of an algorithm reply different values, this is caused by the limited precision of floating point calculations.
E.g. look at Matlab's ASIN, which uses the C-implementation from FDLIBM (www.netlib.org). But if you compile the same source with BCC, OWC, LCC, MSVC compilers, you get slightly different results!
> Which brings us straight back to square 2.
Square 2 is 4 - no difference if it is written as DOUBLE or INT32.
Jan
|
|
0
|
|
|
|
Reply
|
Jan
|
5/20/2010 7:22:19 PM
|
|
On 20 Mai, 21:22, "Jan Simon" <matlab.THIS_Y...@nMINUSsimon.de> wrote:
> Dear Rune!
>
> > > > During this work, i have rewrite several mathlab function and check my implementation vs Mathlab output using a precision of 1e-6.
> > > > this criteria work well until the end of the program, because result of computation which are 'double' are dump into a file as 'int32'.
>
> > > After Rune's charming advice to study the problem for some years,
>
> > Only a fool would ridicule that advice.
>
> The OP "thinks he can control the mathlab accuracy to fit with C++/compiler/ia32 constraint".
The OP has no clue what he is talking about.
> He will need a more practical advice than starting to study numerics for years:
Like it or not, but that's what he needs to do if he wants to
replicate matlab's numerical results with some accuracy.
> You cannot control the accuracy of Matlab. If two (correct) implementations of an algorithm reply different values, this is caused by the limited precision of floating point calculations.
Sure. And those of us who actually have studied numerics for some
non-trivial amount of time would know that there are a multitude
of tricks, methods and strategies one would need to be accutely
aware of to get within those couple of orders of magnitude of
the correct result.
Matlab is based on LINPACK. I used to read the companion literature
to that library. You get the big picture quite quickly by reading
a little bit about finite-precision binary formats and one book on
linear algebra (the Golub - van Loan book). From there on, the
remaining >99% of the literature treated the question about squeezing
another couple of significant digits or bits out of the computations.
Rune
|
|
0
|
|
|
|
Reply
|
Rune
|
5/20/2010 8:25:43 PM
|
|
Thanks to your answers,
I just want to say that if my experience with mathlab is very small, i have a huge (i hope) knowledge in programming (more than 10 years) in video coding, and that i'm n'ot discoreing the floatting point problems.
I agree that using 1e-6 as a precision threshold to perform my validation is'nt enough.
I have made some change to check my C/C++ accuracy again 2*DLB_EPSILON (from the standard math lib : 2.2204460492503131e-016 /* smallest such that 1.0+DBL_EPSILON != 1.0 */).
And in the same time i take a look to mathlab EPS and how to change my simple round instruction to a custom rounding at a given number of digits.
Thanks guys!
|
|
0
|
|
|
|
Reply
|
Alex
|
5/24/2010 6:53:03 AM
|
|
On 24 Mai, 08:53, "Alex Buisson" <alex.buis...@gmail.com> wrote:
> Thanks to your answers,
>
> I just want to say that if my experience with mathlab is very small, i have a huge (i hope) knowledge in programming (more than 10 years) in video coding, and that i'm n'ot discoreing the floatting point problems.
I am not sure you are fully aware what you are heading into.
Try the script below.
It generates a vector of random data, and sums the numbers
in different orders. From an analytic standpoint the sums
should be equal, since in exact arithmetics addition commutes
a + b = b + a
but with finite precision arithmetics that no longer is true:
a + b =/= b + a.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
N = 100000;
x0 = randn(N,1);
s0 = 0;
for n=1:N
s0 = s0+x0(n);
end
x1 = sort(x0);
s1 = 0;
for n=1:N
s1 = s1+x1(n);
end
x2 = flipud(x1);
s2 = 0;
for n=1:N
s2 = s2 + x2(n);
end
e01 = abs(s0 - s1);
e02 = abs(s0 - s2);
e12 = abs(s1 - s2);
fprintf('s0 = %g \n',s0);
fprintf('s1 = %g \n',s1);
fprintf('s2 = %g \n',s2);
fprintf('e01 = %g \n',e01);
fprintf('e02 = %g \n',e02);
fprintf('e12 = %g \n',e12);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Output on my screen:
s0 = 185.971
s1 = 185.971
s2 = 185.971
e01 = 2.45421e-010
e02 = 2.57046e-010
e12 = 1.16245e-011
So even an as simple and trivial operation as summing
a sequence of numbers introduces errors, depending on
the order in which the numbers are summed. The effect
is significant with a surprisingly low number of terms;
just try to run the same script a few times with N = 5,
N = 10, or N =15.
The difference between a good and a poor implementation
of the same numerical algorithm lies in the managment
of these kinds of errors, and the control of the resulting
damage.
It can be proved that the sum s1 above, sorting the numbers
and summing them in ascending order, is the optimum strategy
to minimuze the error: The sum is, at any one time, dominated
by the largest error term yet added. Make sure to always
add the smallest term to minimize the effect on the sum.
However, this optimum strategy is obtained at a sigificant
runtime cost.
There are stuff one can do to prevent these kinds of errors
to creeping in (like sorting the numbers in the example above).
The trick is to know *both* what to do *and* when to do it:
- How badly do any errors introduced at *this* stage mess
up the end result? Maybe the damage is small enough to
be deemed acceptable, and can go unhandled. But the rule
of thumb is that one does what little one can, at any one
stage.
- What can be done about it? The answer to this one might
very well be 'nothing.'
Again, this is stuff that takes years to learn and master.
Do not underestimate the task ahead of you.
Rune
|
|
0
|
|
|
|
Reply
|
Rune
|
5/24/2010 10:00:29 AM
|
|
Dear Rune!
> N = 100000;
> x0 = randn(N,1);
>
> s0 = 0;
> for n=1:N
> s0 = s0+x0(n);
> end
>
> x1 = sort(x0);
> s1 = 0;
> for n=1:N
> s1 = s1+x1(n);
> end
> It can be proved that the sum s1 above, sorting the numbers
> and summing them in ascending order, is the optimum strategy
> to minimuze the error: The sum is, at any one time, dominated
> by the largest error term yet added. Make sure to always
> add the smallest term to minimize the effect on the sum.
I disagree. Sorting normally distributed values decreases the accuracy of a sum: The first hals of the sum accumulates the nagative values. The temporary result is a large negative number. Then some small positive numbers are added - without influencing the sum due to the round off!
Sorting normally distributed values according to their absolute value is more accurate:
format long g
x0 = randn(1e5,1);
s0 = sum(x0)
s1 = sum(sort(x0))
[dum, index] = sort(abs(x0));
s2 = sum(x0(index))
==> Compare with sum calculated with quadruple precision:
http://www.mathworks.com/matlabcentral/fileexchange/26800
s3 = XSum(x0)
See e.g.:
sum([1e20, 1, -1e20])
Standard sorting can help, if all values are positive.
@OP:
If you talk about such a omplicated field as floating point properties, the terminology gets really important. So:
- "Matlab" is not called "mathlab".
- I still cannot see, if your "1e-6" means an absolute or relative accuracy.
- The "precision" is equivalent to the internally used number of bits to represent the floating point variables - 64 for DOUBLE, 32 for SINGLE. In addition the processor can store intermediate values with 80 bits also, see:
feature('setprecision', 64) % 80 bit
feature('setprecision', 53) % 64 bit
Therefore using the term "1e-6 precision" is meaningless. But it is obvious, that you mean the "accuracy".
Kind regards, Jan
|
|
0
|
|
|
|
Reply
|
Jan
|
5/24/2010 2:51:06 PM
|
|
On 24 Mai, 16:51, "Jan Simon" <matlab.THIS_Y...@nMINUSsimon.de> wrote:
> Dear Rune!
>
>
>
>
>
> > N =A0=3D 100000;
> > x0 =3D randn(N,1);
>
> > s0 =3D 0;
> > for n=3D1:N
> > =A0 =A0 s0 =3D s0+x0(n);
> > end
>
> > x1 =3D sort(x0);
> > s1 =3D 0;
> > for n=3D1:N
> > =A0 =A0 s1 =3D s1+x1(n);
> > end
> > It can be proved that the sum s1 above, sorting the numbers
> > and summing them in ascending order, is the optimum strategy
> > to minimuze the error: The sum is, at any one time, dominated
> > by the largest error term yet added. Make sure to always
> > add the smallest term to minimize the effect on the sum.
>
> I disagree. Sorting normally distributed values decreases the accuracy of=
a sum: The first hals of the sum accumulates the nagative values. The temp=
orary result is a large negative number. Then some small positive numbers a=
re added - without influencing the sum due to the round off!
>
> Sorting normally distributed values according to their absolute value is =
more accurate:
Agreed.
Rune
|
|
0
|
|
|
|
Reply
|
Rune
|
5/24/2010 2:57:33 PM
|
|
"Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de> wrote in message <hte3oq$t98$1@fred.mathworks.com>...
>
>
> I disagree. Sorting normally distributed values decreases the accuracy of a sum: The first hals of the sum accumulates the nagative values. The temporary result is a large negative number. Then some small positive numbers are added - without influencing the sum due to the round off!
>
> Sorting normally distributed values according to their absolute value is more accurate:
Ah I see many people who takes care of summing accurately the noise (e.g., normally distributed). LOL
Bruno
|
|
0
|
|
|
|
Reply
|
Bruno
|
5/24/2010 10:21:04 PM
|
|
Jan Simon wrote:
> Sorting normally distributed values according to their absolute value is
> more accurate:
That's the approach I normally think of, but then I get into thinking,
"Ah, but suppose I have a negative number and a positive number whose
magnitudes are nearly equal; would I not increase precision by
"canceling" those two numbers pairwise rather than just sorting by
magnitude? At about this point I get lost in figuring out what the best
approach is.
|
|
0
|
|
|
|
Reply
|
Walter
|
5/24/2010 11:12:17 PM
|
|
Walter Roberson <roberson@hushmail.com> wrote in message <lJDKn.12285$7d5.1065@newsfe17.iad>...
> Jan Simon wrote:
>
> > Sorting normally distributed values according to their absolute value is
> > more accurate:
>
> That's the approach I normally think of, but then I get into thinking,
> "Ah, but suppose I have a negative number and a positive number whose
> magnitudes are nearly equal; would I not increase precision by
> "canceling" those two numbers pairwise rather than just sorting by
> magnitude? At about this point I get lost in figuring out what the best
> approach is.
Yes Walter, sorting is the first reflex people often think of. But no one use it, it has limitation when applied on *real* signal (not noise), not even mention the costly sorting step. Some of the serious approaches are Knuths, Rump's, Li's and al (Xblas implementation).
Note that Matlab more resent version does necessary not sum in the order of the input array. The only way to control the order is write the Mex file, or use an existing library (like Xsum by Jan).
Bruno
|
|
0
|
|
|
|
Reply
|
Bruno
|
5/25/2010 6:11:04 AM
|
|
"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <hteu4g$kn4$1@fred.mathworks.com>...
> "Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de> wrote in message <hte3oq$t98$1@fred.mathworks.com>...
>
> Ah I see many people who takes care of summing accurately the noise (e.g., normally distributed). LOL
>
> Bruno
Yes, I agree. This discussion so far seems to be missing one important thing, the accuracy of the original data. Take this simple list:
-1 1e-20 1
The obvious "best" sum is 1e-20, right? You can see that with your own eyes. You can get it by judicious sorting of the data first to put it in this order and then doing the sum in that order:
-1 1 1e20
But that leaves out the entire discussion of the accuracy of the original data, which you can easily get at with eps of the largest magnitude number:
eps(1) = 2.2204e-016
So one can go through a lot of effort to use judicious reordering/sorting schemes to get the "best" answer for the sum based on the inputs, but it may not be worth it. e.g., the above "best" answer for the sum based on the inputs is 1e-20, but it is only accurate to about 4e-16, so have we really gained anything as far as the accuracy of the result by the reordering this particular list? Is the answer 1e-20 really better than an answer of 0 one would get by summing the list in the original order? Well, this depends. If the original data was generated by some process that produced exact results in the floating point data, then maybe the sum of 1e-20 really *is* better and has more meaning. But if the original data was generated by some process that did not do this (which is usually the case), then you would have to say that the sum answer of 1e-20 really isn't any better than a sum
answer of 0, the result one would get with a simple sum of the data in the original order.
Bottom line is one needs to understand the accuracy of the original data. Go ahead and use the "best" scheme if your original data warrants it, but first understand the trades you are making with extra code complexity, execution time, and what you are really buying as far as the "accuracy" of the result is concerned.
James Tursa
|
|
0
|
|
|
|
Reply
|
James
|
5/25/2010 3:25:20 PM
|
|
On 25 Mai, 17:25, "James Tursa"
<aclassyguy_with_a_k_not_...@hotmail.com> wrote:
> Bottom line is one needs to understand the accuracy of the original data.=
Go ahead and use the "best" scheme if your original data warrants it, but =
first understand the trades you are making with extra code complexity, exec=
ution time, and what you are really buying as far as the "accuracy" of the =
result is concerned.
You are right, assuming that the data are 'measurements' of
some kind. However, these accuracy issues have a habit of turning
up in the midst of numerical computations.
Assume your analytic expression ends up somehwere along the
lines of
y =3D exp(x + 1) - exp(x - 1)
where x is 'large'. The end result of this computation needs
not be very large, but the intermediate results are such that
several significant digits are lost, which in turn tend to
seriously mess up subsequent computations that rely on y
above.
This kind of thing happened in what is known as Tompson and
Haskell's method for modelling acoustic media, which was
developed some time in the 1950s. Because of the large
magnitudes of the intermediate results, what looked as a
simple and convenient modelling method gained the reputation
as a totally unpredictable numerical beast.
Several attempts have been made to come up with numerically
stable methods to do the computations, and all of the
successful ones (notably by Henrik Schmidt in the '80s and
Sven Ivansson in the '90s) relied on carefully and cautiously
re-formulating and re-ordering the individual terms that needed
to be computed.
Rune
|
|
0
|
|
|
|
Reply
|
Rune
|
5/25/2010 4:56:35 PM
|
|
Dear James, dear OP!
> Yes, I agree. This discussion so far seems to be missing one important thing, the accuracy of the original data. Take this simple list:
>
> -1 1e-20 1
>
> The obvious "best" sum is 1e-20, right?
You hit the point! 1e-20 is the mathematically correct answer. But usually mathematics with floating point numbers are used on a computer to model physical measurements. A well implemented model has to consider the measurement errors, so actually this sum must be calculated as e.g. (if the values have been measured with a relative error of 1e-16 - unlikely for real world problems...):
{-1+-1e15} + {1e-20+-1e-35} + {1+-1e-15}
Here the curly braces means a variation of the measurement values. The result would be an interval!
E.g. for the numerical integration of a function, the sensitivities to the initial values and parameters have to be calculated also - e.g. by a sufficiently small variation of the inputs- otherwise the resulting number is more or less meaningless.
A trustworthy comparison of numerical algorithms means a check, if the resulting intervals overlap considering the accuracy (or precision?!) of the input data.
INTLAB written by Rump is a very well implemented interval library written in Matlab:
http://www.ti3.tu-harburg.de/~rump/intlab/
Kind regards, Jan
|
|
0
|
|
|
|
Reply
|
Jan
|
5/26/2010 9:55:07 AM
|
|
|
15 Replies
275 Views
(page loaded in 0.186 seconds)
|