I'm working with a large complex array. After some calculations, I need to split it up and work separately with the real and imaginary parts. Unfortunately, it doesn't look like the copy-on-write behavior of MATLAB works with the real and imaginary parts of the array, even though (I think) the complex data is stored in memory as separate real and imaginary parts. So the following does not prevent allocation of two new arrays when the data is split into real and imaginary parts:
% (test data – 1000x1000 complex array)
w = normrnd(0,1,1000,1000) + 1i*normrnd(0,1,1000,1000);
% (calculations with w)
% split w into real and imaginary parts
re = real(w);
im = imag(w);
clear w
% (work on re and im separately)
Does anyone know of a way to separate complex data into real and imaginary parts without allocating new memory? The copy-on-write behavior is great for memory optimization with non-complex data. If complex data really is stored in memory as separate real and imaginary parts with a combined header, it seems straightforward to allow copy-on-write on these arrays.
John | 11/15/2010 5:52:03 PM

"John Barber" <johnpbarber@REMOVEyahooTHIS.com> wrote in message <ibrs03$89q$1@fred.mathworks.com>...
>
> Does anyone know of a way to separate complex data into real and imaginary parts without allocating new memory?
It should be straightforward to write a MEX routine that will do this. I don't know of any way in M-code...
Matt | 11/15/2010 6:28:06 PM

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibru3m$so2$1@fred.mathworks.com>...
>
> It should be straightforward to write a MEX routine that will do this. I don't know of any way in M-code...
I'm not so sure. It is not quite as straightforward as you might think. With the MEX API of the latest MATLAB version, mxSetPr might crash if the input pointer points to memory occupied by another mxArray! One needs to reverse engineer the mxArray and set its internal structure manually, as I did in my INPLACEARRAY package on the FEX.
Bruno
Bruno | 11/15/2010 6:52:04 PM

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <ibrvgj$5ch$1@fred.mathworks.com>...
> "Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibru3m$so2$1@fred.mathworks.com>...
>
> >
> > It should be straightforward to write a MEX routine that will do this. I don't know of any way in M-code...
>
> I'm not so sure. It is not quite straight-forward as you might think. For the Mex APIs of latest Matlab version, mxSetPr might crash if the input pointer is pointed on the memory occupied by another mxArray! One needs reverse engineering and set manually the internal structure of mxArray as I did in my INPLACEARRAY package on FEX.
>
> Bruno
A tricky problem indeed. I do not believe the MATLAB sharing scheme allows the sharing of the real & imaginary parts separately. The CrossLink field of the mxArray header just points to the other sharing mxArray header, not to the pr & pi data pointers. So I don't think this can be done at the MATLAB level. But the following mex scheme (as yet untested) *might* work:
- Make sure the original variable is unshared.
- Pass the variable to a mex routine (gets passed by reference).
Inside the mex routine:
- Create a shared data copy of the input.
- Use mexEvalString to clear the original variable from the MATLAB workspace. This should make the mex shared data copy of the variable an unshared data copy (I hope). It will be a bit tricky to get the name of the passed in variable for use in the clear command, but it can be done (or simply pass in the name as another argument).
- Create two empty mxArray real variables.
- Attach the real & imaginary parts of the combined mxArray separately to the two real mxArray variables and set the sizes accordingly.
- Detach & null out the pr & pi pointers of the combined mxArray.
- Use mxDestroyArray on the combined mxArray. (since pr & pi pointers no longer point to the data memory, it won't get freed).
- Return the two real mxArrays back to the MATLAB workspace.
If I get some time later on today I might try this out.
James Tursa
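
The detach-and-reattach steps in the list above can be modelled outside MATLAB with a toy header struct. This is only a sketch under stated assumptions: `ToyArray`, `toy_split` and `toy_destroy` are hypothetical names, not part of the MEX API, and the real mxArray header is opaque and laid out differently. The point it illustrates is that once the source header no longer points at the data, destroying it cannot free the blocks that the two real arrays now own.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an array header: dimensions plus separate
   pointers to the real (pr) and imaginary (pi) data blocks. */
typedef struct {
    size_t m, n;
    double *pr;
    double *pi;
} ToyArray;

/* Move src's two data blocks into two real-only headers without copying.
   Afterwards src is an empty 0x0 array that owns no data. */
static void toy_split(ToyArray *src, ToyArray *re, ToyArray *im)
{
    re->m = src->m;   re->n = src->n;
    re->pr = src->pr; re->pi = NULL;

    im->m = src->m;   im->n = src->n;
    im->pr = src->pi; im->pi = NULL;

    /* Detach: the source must forget the blocks before it is destroyed. */
    src->pr = NULL;   src->pi = NULL;
    src->m = 0;       src->n = 0;
}

/* Free whatever data a header still owns; free(NULL) is a no-op. */
static void toy_destroy(ToyArray *a)
{
    free(a->pr);
    free(a->pi);
    a->pr = NULL;
    a->pi = NULL;
    a->m = 0;
    a->n = 0;
}
```

Calling `toy_destroy` on the source after `toy_split` frees nothing, which mirrors the "null out pr & pi, then destroy" steps.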
James | 11/15/2010 7:18:04 PM

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibs11c$i9e$1@fred.mathworks.com>...
> "Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <ibrvgj$5ch$1@fred.mathworks.com>...
> > "Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibru3m$so2$1@fred.mathworks.com>...
> >
> > >
> > > It should be straightforward to write a MEX routine that will do this. I don't know of any way in M-code...
> >
> > I'm not so sure. It is not quite straight-forward as you might think. For the Mex APIs of latest Matlab version, mxSetPr might crash if the input pointer is pointed on the memory occupied by another mxArray! One needs reverse engineering and set manually the internal structure of mxArray as I did in my INPLACEARRAY package on FEX.
> >
> > Bruno
>
> A tricky problem indeed. I do not believe the MATLAB sharing scheme allows the sharing of the real & imaginary parts separately. The CrossLink field of the mxArray header just points to the other sharing mxArray header, not to the pr & pi data pointers. So I don't think this can be done at the MATLAB level. But the following mex scheme (as yet untested) *might* work:
>
> - Make sure the original variable is unshared.
> - Pass the variable to a mex routine (gets passed by reference).
> Inside the mex routine:
> - Create a shared data copy of the input.
> - Used mexEvalString to clear the original variable from the MATLAB workspace. This should make the mex shared data copy of the variable an unshared data copy (I hope). It will be a bit tricky to get the name of the passed in variable for use in the clear command, but it can be done (or simply pass in the name as another argument).
> - Create two empty mxArray real variables.
> - Attach the real & imaginary parts of the combined mxArray separately to the two real mxArray variables and set the sizes accordingly.
> - Detach & null out the pr & pi pointers of the combined mxArray.
> - Use mxDestroyArray on the combined mxArray. (since pr & pi pointers no longer point to the data memory, it won't get freed).
> - Return the two real mxArrays back to the MATLAB workspace.
>
> If I get some time later on today I might try this out.
>
>
> James Tursa
After a bit more thought I don't think the shared data copy part of the mex routine is necessary. Since the input is pass-by-reference one should be able to just modify the input mxArray directly to detach and reattach the pr & pi data memory before clearing it.
James Tursa
James | 11/15/2010 7:23:04 PM

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibs1ao$8ar$1@fred.mathworks.com>...
>
> After a bit more thought I don't think the shared data copy part of the mex routine is necessary. Since the input is pass-by-reference one should be able to just modify the input mxArray directly to detach and reattach the pr & pi data memory before clearing it.
>
> James Tursa
Well, here it is. At least it didn't bomb. CAUTION: This is bare bones with absolutely no argument checking etc ... it is just a proof of concept. I will work on a robust version later.
James Tursa
------------------------------------------------------------------------------------------------------
// File: split_real_imag.c
// Programmer: James Tursa
// Purpose: Splits a complex double input into two real outputs without
//          a data copy and clears the input variable.
// Syntax: [R I] = split_real_imag(A,'A');
//         A   = input array (will be cleared!)
//         'A' = the name of A
//         R   = the real part as a real variable
//         I   = the imaginary part as a real variable
#include <stdio.h>   // sprintf
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    char *name;
    char clearname[100] = {"clear "};
    mwSize dims[2] = {0,0};
    mxArray *A = (mxArray *) prhs[0]; // cast away const: the input is modified in place

    // Build the "clear <name>" command. No length check: the name must fit
    // in the 93 characters left in the buffer.
    name = mxArrayToString(prhs[1]);
    sprintf(clearname+6,"%s",name);
    mxFree(name);

    // Two empty real outputs that take over the existing data blocks.
    plhs[0] = mxCreateDoubleMatrix(0, 0, mxREAL);
    plhs[1] = mxCreateDoubleMatrix(0, 0, mxREAL);
    mxSetPr(plhs[0],mxGetPr(A));
    mxSetPr(plhs[1],mxGetPi(A));
    mxSetDimensions(plhs[0],mxGetDimensions(A),mxGetNumberOfDimensions(A));
    mxSetDimensions(plhs[1],mxGetDimensions(A),mxGetNumberOfDimensions(A));

    // Detach the data from the input so that clearing it frees nothing.
    mxSetPr(A,NULL);
    mxSetPi(A,NULL);
    mxSetDimensions(A,dims,2);
    mexEvalString(clearname);
}
------------------------------------------------------------------------------------------------------
>> mex split_real_imag.c
>> format debug
>> A = rand(2)+rand(2)*i
A =
Structure address = 44c1190
m = 2
n = 2
pr = d5d9e10
pi = d5da1d0
0.8147 + 0.6324i 0.1270 + 0.2785i
0.9058 + 0.0975i 0.9134 + 0.5469i
>> [B C] = split_real_imag(A,'A')
B =
Structure address = 44cc3d8
m = 2
n = 2
pr = d5d9e10
pi = 0
0.8147 0.1270
0.9058 0.9134
C =
Structure address = 44cc480
m = 2
n = 2
pr = d5da1d0
pi = 0
0.6324 0.2785
0.0975 0.5469
>> whos
Name Size Bytes Class Attributes
B 2x2 32 double
C 2x2 32 double
>> clear all
>> whos
James | 11/15/2010 7:42:03 PM

Just for fun, below is the bare bones of an object-oriented alternative. The classdef down at the bottom is for an object that stores its real and imaginary parts in separate object properties. It is constructed as follows,
>> ob=myClass(1,2)
ob =
1.0000 + 2.0000i
However, manipulations to the real and imaginary parts made using dot-indexing syntax will obey copy-on-write rules.
>> ob.im=5, im=ob.im, %No memory alloc.
ob =
1.0000 + 5.0000i
im =
5
I've defined a few arithmetic operations as well,
>> ob+1, ob.*ob
ans =
2.0000 + 5.0000i
ans =
-24.0000 +10.0000i
Again, though, it's just bare bones....
%%%%% Put in myClass.m
classdef myClass
    properties
        re; im;
    end % properties
    methods
        function obj = myClass(re,im)
            obj.re = re; obj.im = im;
        end
        function out = double(obj)
            out = obj.re + 1i*obj.im; % 1i cannot be shadowed by a variable named i
        end
        function display(obj)
            T = evalc('double(obj)'); disp(strrep(T,'ans',inputname(1)))
        end
        function out = plus(L,R)
            [L,R] = preProc(L,R);
            out = myClass( L.re+R.re , L.im+R.im);
        end
        function out = minus(L,R)
            [L,R] = preProc(L,R);
            out = myClass( L.re-R.re , L.im-R.im);
        end
        function out = times(L,R)
            [L,R] = preProc(L,R);
            out = myClass(L.re.*R.re - L.im.*R.im , L.re.*R.im + L.im.*R.re);
        end
    end % methods
end % class
function varargout = preProc(varargin)
% Convert plain numeric inputs to myClass so the operators can mix types.
varargout = varargin;
for ii = 1:length(varargin)
    if isnumeric(varargin{ii})
        varargout{ii} = myClass(real(varargin{ii}),imag(varargin{ii}));
    end
end
end
Matt | 11/15/2010 7:56:03 PM

All,
Thanks for the quick replies. I was curious if there was a non-MEX way to do it.
James,
You beat me to it as far as writing the code. Will you post this to the file exchange when finished? Seems like others would find it useful.
-John
John | 11/15/2010 8:05:06 PM

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibs2eb$ngt$1@fred.mathworks.com>...
>
> Well, here it is. At least it didn't bomb.
Warning James, I played with your code a few times (with and without arrays sharing data with the input A). It did not bomb instantaneously, but it did bomb (MATLAB crash) much later.
Bruno
Bruno | 11/15/2010 8:18:03 PM

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <ibs4hr$iig$1@fred.mathworks.com>...
> "James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibs2eb$ngt$1@fred.mathworks.com>...
>
> >
> > Well, here it is. At least it didn't bomb.
>
> Warning James, I play with your code few times (with and without sharing arrays with the input A). I did not bomb instantaneously, but it did bomb (Matlab crash) much later.
>
> Bruno
I will need to play with it some more as well. Offhand I don't see any reason that it would bomb provided that the input array is not shared. If it *is* shared, even with ans, then I would expect it to bomb. I will port the sharing detection code over from my MTIMESX routine (actually a version not submitted yet) to detect sharing at the front end and disallow the split in that case and see how that works.
James Tursa
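
The sharing check described above can be illustrated with a toy model. This is a sketch only: `ToyHeader`, `toy_share`, `toy_share_count` and `toy_can_split` are hypothetical names, and the circular-list layout is an assumption made for illustration. The thread only says that the CrossLink field points to another sharing header; the actual mxArray header is undocumented.

```c
#include <stddef.h>

/* Hypothetical header: a data pointer plus a link to the next header that
   shares the same data. Headers sharing one block are assumed to form a
   circular list; an unshared header has a NULL link. */
typedef struct ToyHeader {
    double *pr;
    struct ToyHeader *crosslink;
} ToyHeader;

/* Make copy a shared-data copy of orig (no data is duplicated). */
static void toy_share(ToyHeader *orig, ToyHeader *copy)
{
    copy->pr = orig->pr;
    if (orig->crosslink == NULL) {
        /* First sharer: the two headers point at each other. */
        orig->crosslink = copy;
        copy->crosslink = orig;
    } else {
        /* Insert copy into the existing ring right after orig. */
        copy->crosslink = orig->crosslink;
        orig->crosslink = copy;
    }
}

/* Count the headers that point at the same data as h, including h. */
static size_t toy_share_count(const ToyHeader *h)
{
    size_t count = 1;
    const ToyHeader *p;
    for (p = h->crosslink; p != NULL && p != h; p = p->crosslink)
        count++;
    return count;
}

/* Splitting is only safe when no other header references the data. */
static int toy_can_split(const ToyHeader *h)
{
    return toy_share_count(h) == 1;
}
```

A front-end guard of this kind would refuse the split whenever `toy_can_split` returns 0, which is the "disallow the split in that case" behavior described in the post.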
James | 11/15/2010 8:41:03 PM

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibs5sv$ihd$1@fred.mathworks.com>...
> If it *is* shared, even with ans, then I would expect it to bomb.
Yes, I tested with an A that is shared. I guess in that case the same memory block is freed twice, and MATLAB bombs later.
Bruno
Bruno | 11/15/2010 8:56:03 PM

"John Barber" <johnpbarber@REMOVEyahooTHIS.com> wrote in message <ibrs03$89q$1@fred.mathworks.com>...
>
> Does anyone know of a way to separate complex data into real and imaginary parts without allocating new memory? The copy-on-write behavior is great for memory optimization with non-complex data.
================
Might there be a way to write a MEX function that lets you pass an arbitrary function handle and apply that function to the real/imaginary part of the array only? For example, if you wanted to sum over just the real part of the array, ignoring the imaginary part, you would call
S=realfun(@sum,X) %sums the real part of X
I'm guessing that this is the motivation behind John's question.
Otherwise, I can't see why it would be useful to isolate the real and imaginary parts with copy-on-write semantics. Why would you want to obtain a shared copy of the real/imaginary part as a separate array, when copy-on-write prevents you from making any changes to it?
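
The realfun idea can be sketched in plain C, where a callback is handed only one of the two data blocks. The names `ToyComplex`, `toy_reducer`, `toy_sum`, `toy_realfun` and `toy_imagfun` are hypothetical and are not part of any MATLAB or MEX API. In this toy the callback receives a raw pointer, so nothing can be shared behind its back; in MATLAB the callee would receive an mxArray, which is a different situation.

```c
#include <stddef.h>

/* Hypothetical complex array with split storage. */
typedef struct {
    size_t numel;
    double *pr;   /* real part */
    double *pi;   /* imaginary part */
} ToyComplex;

/* A function that reduces a real data block to a single value. */
typedef double (*toy_reducer)(const double *data, size_t numel);

/* Example reducer: sum of all elements. */
static double toy_sum(const double *data, size_t numel)
{
    double s = 0.0;
    size_t k;
    for (k = 0; k < numel; k++)
        s += data[k];
    return s;
}

/* Apply fun to the real part only; no data is copied. */
static double toy_realfun(toy_reducer fun, const ToyComplex *x)
{
    return fun(x->pr, x->numel);
}

/* Apply fun to the imaginary part only; no data is copied. */
static double toy_imagfun(toy_reducer fun, const ToyComplex *x)
{
    return fun(x->pi, x->numel);
}
```

With these definitions, `toy_realfun(toy_sum, &x)` plays the role of `realfun(@sum,X)` in the post.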
Matt | 11/15/2010 9:05:31 PM

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibs7ar$nf8$1@fred.mathworks.com>...
> "John Barber" <johnpbarber@REMOVEyahooTHIS.com> wrote in message <ibrs03$89q$1@fred.mathworks.com>...
> >
> > Does anyone know of a way to separate complex data into real and imaginary parts without allocating new memory? The copy-on-write behavior is great for memory optimization with non-complex data.
> ================
>
> Might there be a way to write a MEX function that lets you pass an arbitrary function handle and apply that function to the real/imaginary part of the array only? For example, if you wanted to sum over just the real part of the array, ignoring the imaginary part, you would call
>
> S=realfun(@sum,X) %sums the real part of X
>
> I'm guessing that this is the motivation behind John's question.
> Otherwise, I can't see why it would be useful to isolate the real and imaginary parts with copy-on-write semantics. Why would you want to obtain a shared copy of the real/imaginary part as a separate array, when copy-on-write prevents you from making any changes to it?
Matt, you raise a good point. At some point I suppose we will get a post asking the reverse question ... how do you combine two real arrays into one complex array while at the same time clearing the two inputs and without a data copy? I could write that one as well I suppose.
But your higher level view is an interesting one. The only way I can see it working is if the function in question is not allowed to make a shared data copy of the input. That is because the function would have to be passed an illegal piecemeal copy of the input to work on, and any shared data copy of that illegal piecemeal copy would result in a MATLAB crash downstream. I don't know how to police this directly in the mex routine, so there would have to be a fair amount of trust on the part of the user to not do this. That being said, there *appears* to be a mechanism in place, at least for mexCallMATLAB calls, that does just that. For example, if you call a function that routinely returns a shared data copy (e.g., reshape) from a mex routine using mexCallMATLAB, you will *not* get a shared data copy but will instead get a deep copy. So this apparent behavior (I haven't researched this much yet) may in fact make this functionality feasible and somewhat robust in a mex routine.
James Tursa
James | 11/15/2010 9:58:03 PM

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <ibs6p3$hce$1@fred.mathworks.com>...
> "James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibs5sv$ihd$1@fred.mathworks.com>...
> > If it *is* shared, even with ans, then I would expect it to bomb.
>
> Yes I test A that is shared. I guess in that case the same memory block is free twice and Matlab is bombed later then.
>
> Bruno
I have put in a shared check and as of yet I am not able to break it or crash MATLAB. So maybe it is worthwhile to submit to the FEX after some clean up. Probably the reverse functionality as well.
James Tursa
James | 11/15/2010 10:00:06 PM

On 10-11-15 03:58 PM, James Tursa wrote:
> Matt, you raise a good point. At some point I suppose we will get a post
> asking the reverse question ... how do you combine two real arrays into
> one complex array while at the same time clearing the two inputs and
> without a data copy? I could write that one as well I suppose.
To what extent does complex() do that, I wonder?
In the case of such a function, if the second real array (the one to go in the
complex slot) was all zeroes, then normally matlab would detect that and would
eliminate the complex part, returning a double with no complex memory slot.
I've often thought that that must be a pain for lower-level functions. Anyhow,
it raises the question of whether the same functionality would be desirable
(or necessary, for internal reasons?) for this kind of splicing function
proposed ?
Walter | 11/15/2010 10:11:11 PM

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsadb$gcq$1@fred.mathworks.com>...
>
> The only way it could work that I see it is if the function in question was not allowed to make a shared data copy of the input. That is because the function would have to be passed an illegal piecemeal copy of the input to work on, and any data shared copy of that illegal piecemeal copy would result in a MATLAB crash downstream.
=======
Not sure what you mean by a "piecemeal copy". For a realfun(@fun,...) routine for example, wouldn't you just temporarily set the Pi of the input to NULL? And if this is what you mean by a piecemeal copy, what conflict would a data shared copy create? Any data-shared copy in the workspace of @fun would be unable to make any changes to this piecemeal copy before it went out of scope. Any attempted change would result in the creation of deep copy by copy-on-write rules (assuming @fun was not a user-created MEX of course).
Matt | 11/15/2010 10:33:03 PM

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibscev$1to$1@fred.mathworks.com>...
> "James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsadb$gcq$1@fred.mathworks.com>...
> >
> > The only way it could work that I see it is if the function in question was not allowed to make a shared data copy of the input. That is because the function would have to be passed an illegal piecemeal copy of the input to work on, and any data shared copy of that illegal piecemeal copy would result in a MATLAB crash downstream.
> =======
>
> Not sure what you mean by a "piecemeal copy". For a realfun(@fun,...) routine for example, wouldn't you just temporarily set the Pi of the input to NULL?
Yes, that is an example of what I would call an illegal piecemeal copy.
> And if this is what you mean by a piecemeal copy, what conflict would a data shared copy create? Any data-shared copy in the workspace of @fun would be unable to make any changes to this piecemeal copy before it went out of scope.
Suppose @fun creates a shared data copy of the input off to the side (e.g., in a global variable). Then @fun returns. Now you have two variables at the MATLAB level that are shared, one of them has pr and pi data and the other one only has pr data. What problems might that cause? Quite frankly I don't know. Maybe my "crash" assumption was a bit hasty and it would be OK as you surmise, but I will have to think about it some more. Certainly if you detach the pi pointer and call @fun and get a failure then you would get a memory leak and a permanently altered input variable with mexCallMATLAB. You would have to use mexCallMATLABWithTrap (not available in earlier versions of MATLAB) to make sure you could piece things back together again in the mex routine.
> Any attempted change would result in the creation of deep copy by copy-on-write rules (assuming @fun was not a user-created MEX of course).
James Tursa
James | 11/15/2010 10:59:04 PM

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibs7ar$nf8$1@fred.mathworks.com>...
> "John Barber" <johnpbarber@REMOVEyahooTHIS.com> wrote in message <ibrs03$89q$1@fred.mathworks.com>...
> >
> > Does anyone know of a way to separate complex data into real and imaginary parts without allocating new memory? The copy-on-write behavior is great for memory optimization with non-complex data.
> ================
>
> Might there be a way to write a MEX function that lets you pass an arbitrary function handle and apply that function to the real/imaginary part of the array only? For example, if you wanted to sum over just the real part of the array, ignoring the imaginary part, you would call
>
> S=realfun(@sum,X) %sums the real part of X
>
> I'm guessing that this is the motivation behind John's question.
> Otherwise, I can't see why it would be useful to isolate the real and imaginary parts with copy-on-write semantics. Why would you want to obtain a shared copy of the real/imaginary part as a separate array, when copy-on-write prevents you from making any changes to it?
Matt,
That is an interesting idea to apply an arbitrary function handle to only one part of the data. However, what I am looking for is closer to what James' MEX code accomplishes. That is, take a complex array and without allocating additional memory, split it up into two real arrays. That way I can go on and process them separately with no additional memory footprint. In my case I don't need to recombine them into complex data, so destroying the original complex array is no problem.
I can see how the reverse functionality would be useful as well - combine two arrays into a complex array without consuming memory. Complex() does not function this way.
I think that my use of the term copy-on-write in the original post was misleading. The intention was to take advantage of MATLAB's pass-by-value semantics combined with its copy-on-write behavior to split a complex array more memory-efficiently. This trick is easy to do with real data. It can be useful for renaming a variable, and I imagine this behavior is also closely related to the ability to do in-place operations and functions.
b = a;
clear a
% do stuff with b. This effectively renames a variable without ever making a second copy of it in memory.
John | 11/15/2010 11:21:03 PM

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsdvo$9he$1@fred.mathworks.com>...
>
> Suppose @fun creates a shared data copy of the input off to the side (e.g., in a global variable). Then @fun returns. Now you have two variables at the MATLAB level that are shared, one of them has pr and pi data and the other one only has pr data. What problems might that cause? Quite frankly I don't know. Maybe my "crash" assumption was a bit hasty and it would be OK as you surmise, but I will have to think about it some more.
OK, I have thought about it some more and one problem is memory leaking. Suppose you have one variable that has pr and pi and it is sharing with another variable that only has pr. Then you clear the first variable. MATLAB thinks that both parts are shared (there is no partial sharing as far as MATLAB memory manager is concerned), so it frees the header of the first variable then removes its address from the CrossLink sharing list. It leaves the pr and pi memory alone because it thinks both parts are shared. Now you have lost the pi pointer forever and the memory gets leaked since the second variable knows nothing about this memory. So it may not cause a crash but it will I think lead to memory leaks and at some point a MATLAB failure if the leakage is severe enough.
James Tursa
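
The leak scenario above can be reproduced in a small simulation. This is a toy model, not MATLAB's memory manager: `ToyVar`, `toy_alloc`, `toy_free`, `toy_clear` and the counter `toy_live_blocks` are hypothetical, and the model handles only two sharing variables. It assumes, as the post describes, that clearing a shared variable unlinks its header and leaves all data blocks alone.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical variable header with split complex storage. */
typedef struct ToyVar {
    double *pr;
    double *pi;
    struct ToyVar *crosslink;   /* NULL when unshared */
} ToyVar;

/* Number of data blocks allocated and not yet freed. */
static int toy_live_blocks = 0;

static double *toy_alloc(size_t numel)
{
    toy_live_blocks++;
    return calloc(numel, sizeof(double));
}

static void toy_free(double *p)
{
    if (p != NULL) {
        toy_live_blocks--;
        free(p);
    }
}

/* Clear a variable. If it is shared, only unlink the header and assume the
   partner still needs every data block; otherwise free the data. */
static void toy_clear(ToyVar *v)
{
    if (v->crosslink != NULL) {
        v->crosslink->crosslink = NULL;   /* partner becomes unshared */
        v->crosslink = NULL;
    } else {
        toy_free(v->pr);
        toy_free(v->pi);
    }
    v->pr = NULL;
    v->pi = NULL;
}
```

If a complex variable shares with a real-only variable and the complex one is cleared first, its pi block is never freed: the real-only partner frees pr when it is cleared later, but nothing holds the pi pointer any more.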
James | 11/15/2010 11:42:04 PM

"John Barber" <johnpbarber@REMOVEyahooTHIS.com> wrote in message <ibsf8v$2cn$1@fred.mathworks.com>...
>
> That is an interesting idea to apply an arbitrary function handle to only one part of the data. However, what I am looking for is closer to what James' MEX code accomplishes. That is, take a complex array and without allocating additional memory, split it up into two real arrays. That way I can go on and process them separately with no additional memory footprint.
============
But for that to be useful, you would have to be sure that the original complex array is not shared. Otherwise, you're not really saving anything. As soon as you made any changes to re or im (the results of the split), new memory blocks for this changed data would be required and you would end up consuming the same memory footprint regardless.
Matt | 11/16/2010 2:50:29 PM

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibu5nl$9ec$1@fred.mathworks.com>...
>
> But for that to be useful, you would have to be sure that the original complex array is not shared. Otherwise, you're not really saving anything. As soon as you made any changes to re or im (the results of the split), new memory blocks for this changed data would be required and you would end up consuming the same memory footprint regardless.
That's correct - this is only useful for unshared arrays (and, as has been mentioned, can _only_ be done with unshared arrays to prevent memory leaks or other problems, hence the need for some MEX code to test for multiple variables pointing to the same mxArray.)
In my case, I know that the original array is not shared. I am currently doing:
re = real(w);
im = imag(w);
clear w
%(go on and calculate using re and im separately)
During the interval between 'im=imag(w);' and 'clear w', there are two copies of the data in memory. Because the mxArrays used by w are unshared, there is no problem clearing the variable 'w' after creating re and im, but the spike in memory use is not good. Because re and im use new, unshared mxArrays, they can be modified without regard to each other or other variables. What James's code accomplishes is to implement all three of these lines of code _without_ the temporary spike in memory use. That is, it makes the two data blocks that were formerly jointly associated with w become solely associated with re and im respectively, sacrificing the workspace variable w in the process.
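
The size of that spike is simple arithmetic, worked here for the 1000x1000 test array from the first post. The helper names `real_bytes`, `peak_bytes_with_copy` and `peak_bytes_in_place` are hypothetical, and the figures count data blocks only (headers are ignored) and assume 8-byte doubles, consistent with the 32 bytes that whos reports for a 2x2 double earlier in the thread.

```c
#include <stddef.h>

/* Bytes held by one real double array of numel elements. */
static size_t real_bytes(size_t numel)
{
    return numel * sizeof(double);
}

/* Peak data bytes for: re = real(w); im = imag(w); clear w
   Just before the clear, w (two blocks) coexists with re and im. */
static size_t peak_bytes_with_copy(size_t numel)
{
    return 4 * real_bytes(numel);
}

/* Peak data bytes when the two blocks are handed over instead of copied:
   only the original two blocks ever exist. */
static size_t peak_bytes_in_place(size_t numel)
{
    return 2 * real_bytes(numel);
}
```

For 1,000,000 elements this gives a peak of 32,000,000 bytes with the copy versus 16,000,000 bytes in place, so the copying version briefly needs twice the memory.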
The same issue applies when creating a complex array from two real arrays. Using z=complex(a,b) creates a new complex array with copies of a and b in new mxArrays. a and b are still available for further processing. If you are willing to sacrifice a and b (and if their mxArrays are not shared with other variables), you can create z without having to allocate two new mxArrays, which can be a make-or-break for some situations.
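
The reverse operation can be sketched with the same kind of toy header. As before, this is a hedged model and not MEX code: `ToyArray` and `toy_combine` are hypothetical names, and the sketch assumes both inputs are unshared and purely real.

```c
#include <stddef.h>

/* Hypothetical array header with split complex storage. */
typedef struct {
    size_t m, n;
    double *pr;
    double *pi;
} ToyArray;

/* Build z = a + 1i*b by taking over the data blocks of a and b.
   Returns 0 on success, -1 if the sizes differ or an input is not
   purely real. On success a and b are left as empty 0x0 arrays. */
static int toy_combine(ToyArray *a, ToyArray *b, ToyArray *z)
{
    if (a->m != b->m || a->n != b->n || a->pi != NULL || b->pi != NULL)
        return -1;

    z->m = a->m;
    z->n = a->n;
    z->pr = a->pr;   /* real part comes from a */
    z->pi = b->pr;   /* imaginary part comes from b */

    a->pr = NULL; a->m = 0; a->n = 0;
    b->pr = NULL; b->m = 0; b->n = 0;
    return 0;
}
```

No new data block is allocated: z ends up pointing at exactly the memory that a and b used to own.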
John | 11/16/2010 3:26:03 PM

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsggc$kim$1@fred.mathworks.com>...
>
>
> OK, I have thought about it some more and one problem is memory leaking. Suppose you have one variable that has pr and pi and it is sharing with another variable that only has pr. Then you clear the first variable.
========
I'm not seeing how that scenario would arise. You are passing to @fun a real variable only, i.e., a piecemeal copy of what I think you're calling the first variable, one whose Pr points to either the real or imaginary part of the first variable. The inner workings of @fun have no access to the entirety of the first variable and should have no opportunity to clear it.
Matt | 11/16/2010 3:42:03 PM

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibu8ob$p6i$1@fred.mathworks.com>...
> "James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsggc$kim$1@fred.mathworks.com>...
> >
> >
> > OK, I have thought about it some more and one problem is memory leaking. Suppose you have one variable that has pr and pi and it is sharing with another variable that only has pr. Then you clear the first variable.
> ========
>
> I'm not seeing how that scenario would arise. You are passing to @fun a real variable only, i.e., a piecemeal copy of what I think you're calling the first variable, one whose Pr points to either the real or imaginary part of the first variable. The inner workings of @fun have no access to the entirety of the first variable and should have no opportunity to clear it.
You pass @fun a real variable, yes. Then @fun creates and stores a shared copy of that real variable. Then @fun returns to the mex routine. Then the mex routine attaches the imaginary part back onto the aforementioned input array. Now you have a real array sharing with a complex array and a potential memory leak situation. I am using your scenario where you temporarily NULL the pi pointer and save the actual value off to the side while the @fun call is made. But really, it doesn't matter how the illegal real array is constructed. If it or a shared copy of it ever gets back to MATLAB then the potential for the memory leak is there.
James Tursa
James | 11/16/2010 4:19:03 PM

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsggc$kim$1@fred.mathworks.com>...
> "James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsdvo$9he$1@fred.mathworks.com>...
> >
> > Suppose @fun creates a shared data copy of the input off to the side (e.g., in a global variable). Then @fun returns. Now you have two variables at the MATLAB level that are shared, one of them has pr and pi data and the other one only has pr data. What problems might that cause? Quite frankly I don't know. Maybe my "crash" assumption was a bit hasty and it would be OK as you surmise, but I will have to think about it some more.
>
> OK, I have thought about it some more and one problem is memory leaking. Suppose you have one variable that has pr and pi and it is sharing with another variable that only has pr. Then you clear the first variable.
==============
OK, you were talking about the situation where the piecemeal copy made it back to the MATLAB level. So, yes, I see the problem that global variables could cause.
To prevent that, I guess it would be good to have a subclass of mxArray that could not be shared... Something with a different copy constructor perhaps?
Matt | 11/16/2010 4:24:04 PM

"John Barber" <johnpbarber@REMOVEyahooTHIS.com> wrote in message <ibs3pi$piq$1@fred.mathworks.com>...
> All,
> Thanks for the quick replies. I was curious if there was a non-MEX way to do it.
>
> James,
> You beat me to it as far as writing the code. Will you post this to the file exchange when finished? Seems like others would find it useful.
>
> -John
OK, I posted the code to the FEX. It has been polished up and allows for any numeric type, not just double (however, I just realized that I forgot to include a sparse check, so I will have to fix that and resubmit a new version). You can find it here:
http://www.mathworks.com/matlabcentral/fileexchange/29420-extract-real-imaginary-parts-without-data-copy
I changed the functionality somewhat from my earlier posted example. Instead of clearing the input array I empty it instead. That simplifies the code a bit (I don't have to put in special code to go find the name) and it allows the code to work on cell or field elements. The code also allows the input array to be shared if the 'shared' directive is used, in which case *all* of the variables that are shared with the input variable are emptied as well. I am not sure what use it is to use this function with a group of shared variables and end up having all of them emptied at once, but the functionality was easy to implement since the sharing detection code had to be there anyway so I included it. The default is no sharing allowed.
I will work on the reverse function and some better documentation and upload a version with all of that in the next few days.
James Tursa
James | 11/16/2010 4:33:04 PM

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <ibub74$7ic$1@fred.mathworks.com>...
> "James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsggc$kim$1@fred.mathworks.com>...
> > "James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibsdvo$9he$1@fred.mathworks.com>...
> > >
> > > Suppose @fun creates a shared data copy of the input off to the side (e.g., in a global variable). Then @fun returns. Now you have two variables at the MATLAB level that are shared, one of them has pr and pi data and the other one only has pr data. What problems might that cause? Quite frankly I don't know. Maybe my "crash" assumption was a bit hasty and it would be OK as you surmise, but I will have to think about it some more.
> >
> > OK, I have thought about it some more and one problem is memory leaking. Suppose you have one variable that has pr and pi and it is sharing with another variable that only has pr. Then you clear the first variable.
> ==============
>
> OK, you were talking about the situation where the piecemeal copy made it back to the MATLAB level. So, yes, I see the problem that global variables could cause.
Yes, exactly.
> To prevent that, I guess it would be good to have a subclass of mxArray that could not be shared... Something with a different copy constructor perhaps?
Certainly in a mex routine you can control that. But at the MATLAB level you cannot, and that is the rub.
James Tursa
James | 11/16/2010 4:36:04 PM

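The leak scenario discussed above can be illustrated with a toy bookkeeping model in plain M-code. This is only a sketch under stated assumptions: the struct fields, the sharers counter, and the rule that clearing a shared variable frees no data are hypothetical stand-ins, not MATLAB's actual memory manager.

```matlab
% Toy bookkeeping model of the leak. Illustrative only: the field names and
% the clearing rule below are assumptions, not MATLAB's real memory manager.
allocated = containers.Map({'prBlock', 'piBlock'}, {true, true});  % data blocks still on the heap

v1 = struct('pr', 'prBlock', 'pi', 'piBlock');   % complex variable: points at pr and pi
v2 = struct('pr', 'prBlock', 'pi', '');          % shares only the pr block with v1
sharers = 2;                                     % variables linked as shared copies

% Clear v1: it is shared, so (by assumption) only its header is dropped
% and none of its data is freed.
sharers = sharers - 1;
clear v1

% Clear v2: it is now the last sharer, so it frees the data it points to,
% which is the pr block only.
sharers = sharers - 1;
if sharers == 0
    allocated(v2.pr) = false;
    if ~isempty(v2.pi)
        allocated(v2.pi) = false;
    end
end
clear v2

leaked = allocated('piBlock');                   % still marked allocated: nothing freed the pi block
```

In this model the pi block stays marked as allocated after both variables are gone, which is the kind of leak a partially shared complex variable could cause.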
"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ibubo0$ct0$1@fred.mathworks.com>...
>
> OK, I posted the code to the FEX.
>
> http://www.mathworks.com/matlabcentral/fileexchange/29420-extract-real-imaginary-parts-without-data-copy
Oops ... not quite ready yet. I pulled it to fix some problems and will upload tomorrow.
James Tursa
James | 11/16/2010 4:50:05 PM

What would really be nice would be for the MathWorks folks to enable this type of functionality in MATLAB itself. I imagine it would take some changes to their memory manager to keep track of writes to the pr and pi arrays separately. In effect, this would enable pass-by-value semantics and copy-on-write behavior for the real and imaginary mxArrays individually. They would also need to make sure that real(), imag(), and complex() were compatible with this behavior.
This would open up a lot of possibilities with both memory and possible speed improvements. I can't think of any drawbacks (other than having to modify MATLAB's memory manager, of course).
Two examples:
% (generate complex array C)
% split into real and imaginary parts:
x = real(C);
y = imag(C);
clear C
% (operate on x and y separately with no memory penalty)
% generate real arrays A and B
C = complex(A,B);
clear A B
% (operate on C as a complex array with no memory penalty)
I'm not sure if there would be a way to do an operation on just the real or imaginary part individually, as was mentioned earlier in the thread. There would either need to be a wrapper function to do it, or some strange new syntax like the following:
real(C) = real(C)*2;
Of course, if real(), imag() and complex() behaved as I proposed above, it would be easy enough to do the following without any memory allocation:
% (generate complex array C)
% split into real and imaginary parts:
x = real(C);
y = imag(C);
clear C
% operate on x and y separately:
x = x*2;
% recombine x and y into a complex array:
C = complex(x,y);
clear x y
% (back to operating on C as a complex array)
Any thoughts? I've submitted this as a feature request. If anyone else thinks this behavior would be useful, please let the MathWorks folks know.
John | 11/18/2010 9:09:04 PM

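The wrapper-function option mentioned above could be sketched as follows. The names onReal and onImag are hypothetical, and with current MATLAB behavior this only tidies the syntax: real(), imag(), and complex() still allocate new arrays, so no memory is saved.

```matlab
% Hypothetical helpers: apply a function f to only one part of a complex array.
onReal = @(f, C) complex(f(real(C)), imag(C));
onImag = @(f, C) complex(real(C), f(imag(C)));

C = [1+2i, 3+4i];
D = onReal(@(x) x*2, C);     % doubles the real part, leaves the imaginary part alone
E = onImag(@(y) -y, C);      % negates the imaginary part (same result as conj(C))
```

Using complex() rather than x + 1i*y keeps the result complex even when the imaginary part happens to be all zeros.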
"John Barber" <johnpbarber@REMOVEyahooTHIS.com> wrote in message <ic44lg$im0$1@fred.mathworks.com>...
>
> What would really be nice would be for the MathWorks folks to enable this type of functionality in MATLAB itself. I imagine it would take some changes to their memory manager to keep track of writes to the pr and pi arrays separately. In effect, this would enable pass-by-value semantics and copy-on-write behavior for the real and imaginary mxArrays individually. They would also need to make sure that real(), imag(), and complex() were compatible with this behavior.
>
> This would open up a lot of possibilities with both memory and possible speed improvements. I can't think of any drawbacks (other than having to modify MATLAB's memory manager, of course).
(snip)
> Any thoughts? I've submitted this as a feature request. If anyone else thinks this behavior would be useful, please let the MathWorks folks know.
For me it is just an interesting programming exercise. I have never needed this capability in the projects I typically work on. For TMW, they would probably have to add another CrossLinkImag field to their mxArray header. Not a very hard thing to do in and of itself but now you are changing a fundamental variable header structure that MATLAB has had pretty much the same for several years. This enhancement might not be worth it to them.
As a related side issue, does anyone know what happened to the name field of the mxArray header? For much older versions of MATLAB this used to be a char array. Then it changed to a char * and pointed to a string with the name. In the latest versions of MATLAB this field is simply NULL ... no pointer to the name anymore. Has this been moved or has it been eliminated? I suspect the latter since I can't find a name pointer anywhere nearby in the header. So perhaps the names are kept off to the side somewhere in a workspace name list of some type separate from the mxArray structure itself.
James Tursa
James | 11/18/2010 10:02:04 PM

"John Barber" <johnpbarber@REMOVEyahooTHIS.com> wrote in message <ic44lg$im0$1@fred.mathworks.com>...
> What would really be nice would be for the MathWorks folks to enable this type of functionality in MATLAB itself. I imagine it would take some changes to their memory manager to keep track of writes to the pr and pi arrays separately. In effect, this would enable pass-by-value semantics and copy-on-write behavior for the real and imaginary mxArrays individually. They would also need to make sure that real(), imag(), and complex() were compatible with this behavior.
>
The enhancement request should not stop at real/imag data; it could be much more general.
Right now, data are considered shared only in a very restrictive sense:
- the real parts of both arrays are the same size and stored at the same address
To me it seems they could extend the definition of sharing more generally, like this:
- when the memory of the real part and imaginary part of one array *overlaps* with the memory of the other array.
With this extension, if I'm not mistaken, the same cross-link and copy-on-write principles would still work, and it would open up a whole world of memory savings and avoided copying. For example, as you have suggested:
A = real(C)
B = imag(C)
Or, even more important, something like this:
A = C(:,:,k)
Right now each of these three instructions copies the whole piece of memory, when this could be avoided.
Just a thought for the people at Mathworks.
Bruno
Bruno | 11/18/2010 10:35:03 PM

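A page such as C(:,:,k) is a natural candidate for sharing because MATLAB stores arrays in column-major order, so the page occupies one contiguous run of elements. Below is a small check of that, plus a hypothetical overlap test on (start, length) ranges of the kind a generalized sharing rule would need. The name rangesOverlap is illustrative, not an existing function, and the ranges are element offsets rather than real addresses.

```matlab
% Check that the page C(:,:,k) is one contiguous run in column-major storage.
M = 4; N = 3; P = 5; k = 2;
C = reshape(1:M*N*P, M, N, P);           % test data; each value equals its linear index

first = sub2ind(size(C), 1, 1, k);       % linear index of C(1,1,k)
last  = sub2ind(size(C), M, N, k);       % linear index of C(M,N,k)
page  = C(:, :, k);

isContiguous = (last - first + 1 == M*N) && isequal(C(first:last).', page(:));

% Hypothetical overlap test on half-open ranges [start, start+len).
rangesOverlap = @(a0, aLen, b0, bLen) (a0 < b0 + bLen) && (b0 < a0 + aLen);

pageVsWhole = rangesOverlap(first, M*N, 1, numel(C));      % page k lies inside C
pageVsNext  = rangesOverlap(first, M*N, last + 1, M*N);    % page k vs. page k+1
```

Here the page overlaps the whole array but not the neighboring page, so under the proposed rule only the first pair would be treated as shared.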
"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <ic47os$e55$1@fred.mathworks.com>...
> For me it is just an interesting programming exercise. I have never needed this capability in the projects I typically work on. For TMW, they would probably have to add another CrossLinkImag field to their mxArray header. Not a very hard thing to do in and of itself but now you are changing a fundamental variable header structure that MATLAB has had pretty much the same for several years. This enhancement might not be worth it to them.
>
> As a related side issue, does anyone know what happened to the name field of the mxArray header? For much older versions of MATLAB this used to be a char array. Then it changed to a char * and pointed to a string with the name. In the latest versions of MATLAB this field is simply NULL ... no pointer to the name anymore. Has this been moved or has it been eliminated? I suspect the latter since I can't find a name pointer anywhere nearby in the header. So perhaps the names are kept off to the side somewhere in a workspace name list of some type separate from the mxArray structure itself.
>
> James Tursa
I agree with you that the Mathworks folks would need a compelling reason to implement this. Certainly there will be the occasional case where this is a make-or-break need for someone - if there isn't enough memory to make a copy, you can't separate real from imaginary. I'd guess that for intermediate-sized arrays where the memory limit isn't an issue, this would still result in a substantial speedup by avoiding the memory allocations that currently happen. I wonder if TMW has any sort of usage data on calls to various functions in the at-large MATLAB user base. If lots of people use real()/imag() and complex(), maybe it becomes worth it to them to make the change. If not, you've shown that it isn't too hard to do with MEX files, so someone with a compelling need can always go that route.
Regardless of TMW's response, thanks for your efforts on this "interesting programming exercise". I look forward to seeing your code on the File Exchange.
-John
John | 11/18/2010 11:50:07 PM

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <ic49mn$ktq$1@fred.mathworks.com>...
>
> The enhancement request should not stop at real/imag data; it could be much more general.
>
> Right now, data are considered shared only in a very restrictive sense:
> - the real parts of both arrays are the same size and stored at the same address
>
> To me it seems they could extend the definition of sharing more generally, like this:
> - when the memory of the real part and imaginary part of one array *overlaps* with the memory of the other array.
>
> With this extension, if I'm not mistaken, the same cross-link and copy-on-write principles would still work, and it would open up a whole world of memory savings and avoided copying. For example, as you have suggested:
>
> A = real(C)
> B = imag(C)
>
> Or, even more important, something like this:
>
> A = C(:,:,k)
>
> Right now each of these three instructions copies the whole piece of memory, when this could be avoided.
>
> Just a thought for the people at Mathworks.
>
> Bruno
Bruno,
You raise a very good point - memory sharing could be much more general. I imagine that at some point in generalizing the concept, the complexity would start to go against the philosophy of MATLAB. That is, they would have to expose the entire memory manager to the user, with all of the accompanying costs and benefits.
Your third example is an interesting one. I've had the thought that some sort of 'page' system for 3-dimensional arrays would be handy - each page gets its own (not necessarily contiguous with other pages) mxArray. It would be great for things like color image data, usually an MxNx3 array. For image processing operations on single colors, indexing is a hassle, and the memory allocations needed to break the data out into separate arrays for processing can become both a speed and out-of-memory issue.
(How long before someone chimes in and tells us to just start using C/C++ if we're so concerned with this kind of stuff?)
-John
John | 11/19/2010 12:03:05 AM

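Something close to the 'page' idea can be approximated today with a cell array, since each cell holds its own array. A minimal sketch, assuming a small random array as a stand-in for real image data; note that building the cell array copies the data once, so this helps with indexing convenience rather than with memory.

```matlab
img = rand(4, 5, 3);                         % stand-in for an MxNx3 color image
pages = squeeze(num2cell(img, [1 2]));       % 3x1 cell; each cell holds one MxN color plane

pages{1} = pages{1} * 0.5;                   % process a single color without 3-D indexing
img2 = cat(3, pages{:});                     % reassemble when the full array is needed again
```

After the split, an operation on one color plane touches only that plane's array, which is the behavior the page system would provide without the initial copy.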
On 10-11-18 05:50 PM, John Barber wrote:
> I agree with you that the Mathworks folks would need a compelling
> reason to implement this. Certainly there will be the occasional case
> where this is a make-or-break need for someone - if there isn't enough
> memory to make a copy, you can't separate real from imaginary.
It isn't clear to me that Mathworks is (or even should be) overly concerned
about making deep changes in order to _potentially_ reduce memory use on 32
bit systems. For 64 bit systems, just throw more memory at the problem, and as
32 bit systems appear to be dying off...
Walter | 11/19/2010 12:19:55 AM

Walter Roberson <roberson@hushmail.com> wrote in message <ic4frk$8r6$1@canopus.cc.umanitoba.ca>...
> On 10-11-18 05:50 PM, John Barber wrote:
>
> > I agree with you that the Mathworks folks would need a compelling
> > reason to implement this. Certainly there will be the occasional case
> > where this is a make-or-break need for someone - if there isn't enough
> > memory to make a copy, you can't separate real from imaginary.
>
> It isn't clear to me that Mathworks is (or even should be) overly concerned
> about making deep changes in order to _potentially_ reduce memory use on 32
> bit systems. For 64 bit systems, just throw more memory at the problem, and as
> 32 bit systems appear to be dying off...
Very good point, and I doubt that this enhancement would make the cut. Still, even for systems with plenty of physical memory, the time spent copying data from one array to another can add up rather quickly for some situations.
John | 11/19/2010 1:47:05 AM

On 18/11/10 7:47 PM, John Barber wrote:
> Still, even for systems with plenty of physical memory, the time spent
> copying data from one array to another can add up rather quickly for
> some situations.
True, memory _sizes_ are increasing far faster than memory access
_speed_, so solving problems by throwing memory at them can easily
result in a reduction in efficiency. You are quite right that copying
can add up greatly.
It would be interesting if at some point it was possible for the
programmer to hint about memory trade-offs. For example, if you know
that X and Y are going to be operated on element-wise and that they are
both going to be large, then it could be useful to hint that you didn't
want them to clash in cache access, perhaps even with some sort of
specification of the pattern of co-access that will be most common.
With the 32 bit version memory is allocated where there is a fit for it,
possibly even with a "first fit" algorithm rather than block-allocator
(otherwise fragmentation wouldn't be as much of a problem), with there
being no mechanism that I have been able to discern to avoid cache
conflicts.
This kind of hinting could get a bit syntactically difficult, though,
considering the pervasive automatic allocation of memory for variables,
coupled with the semantics that when a variable is assigned to its
previous properties are overwritten (with an odd exception having to do
with the symbolic toolbox and "assumptions").
It seems to me that for any sufficiently large variable that is being
selectively overwritten, it could potentially be worth-while to
create a "shadow" array that indicates which parts of the array have
changed and the associated new values; when the array was being read
from, the shadow would be consulted first to determine whether the
indices had been touched, and if so the data would be read out of the
shadow, with untouched data read out of the original array. After some
fraction of the original array had been shadowed, it would become
worthwhile to rewrite the entire array, but before that point, the
access to the shadowing information would (by design) be less expensive
than copying the entire array. Of course, in-place operations on a
non-shared array would not bother to create a shadow: this kind of
optimization would be used to save some DMA during operations that would
otherwise require copy-on-write to break sharing.
This kind of internal semantics would, though, require some kind of flag
to be associated with compiled functions as to whether they knew about
the new style of arrays or instead needed the shadow to be resolved, for
backwards compatibility. The functionality would not come for free --
but if it saves copying a gigabyte around to alter a kilobyte or 100,
then it could potentially be worth-while.
Walter | 11/19/2010 3:33:17 AM

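The "shadow" array idea described above can be sketched at the M-code level with a map from linear index to new value. This is a toy model under stated assumptions: the variable names and the 25% flush threshold are made up for illustration, and a real implementation would live inside MATLAB's array storage rather than in user code.

```matlab
% Toy "shadow array" overlay. Illustrative only; this is not how MATLAB stores arrays.
base = zeros(1, 8);                      % original data, left untouched while shadowed
shadow = containers.Map('KeyType', 'double', 'ValueType', 'double');  % linear index -> new value
flushFraction = 0.25;                    % assumed threshold for rewriting the whole array

% Writes go to the shadow only, so the base array is never copied.
writes = [2 3.5; 5 -1; 2 7];             % [index value] rows; index 2 is written twice
for r = 1:size(writes, 1)
    shadow(writes(r, 1)) = writes(r, 2);
end

% Reads consult the shadow first and fall back to the original data.
queries = [2 5 6];
out = zeros(size(queries));
for q = 1:numel(queries)
    k = queries(q);
    if isKey(shadow, k)
        out(q) = shadow(k);
    else
        out(q) = base(k);
    end
end

% One more write pushes the shadowed fraction past the threshold, so resolve it.
shadow(7) = 9;
if double(shadow.Count) > flushFraction * numel(base)
    idx = cell2mat(keys(shadow));             % touched indices
    base(idx) = cell2mat(values(shadow));     % write the new values back in one pass
    shadow = containers.Map('KeyType', 'double', 'ValueType', 'double');
end
```

The trade-off is visible even in this small form: every read pays for a lookup in the shadow, which only makes sense while the shadow stays small relative to the array it covers.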