Bootstrapping multivariate data

Hi all,

I am looking for a MATLAB function that performs non-parametric bootstrapping of multivariate data. For instance, I have an MxN matrix of sample data, where M is the dimension of the random vector (multivariate data) and N is the number of observations. I want to generate (resample) bootstrap data from this initial multivariate data. Does anyone know of such a function?

Thank you very much for your help.

Best regards, 
CT DO 
Reply CT 8/17/2010 5:59:05 AM

MATLAB does not have a pre-built function for multivariate data. However, you can find code on the File Exchange; the function is called 'bstrag'.
Reply Rogelio 8/17/2010 6:33:42 AM

On 8/17/2010 1:59 AM, CT wrote:
> I am searching for a Matlab function that can do the non-parametric
> bootstrapping of multivariate data. For instance, I have a matrix of
> sample data (MxN) where M is the dimension of the random vector
> (multivariate data), and N is the number of observation. I want to
> generate (resample) bootstrap data from this initial multivariate data.
> Does anyone knows this function?

If you have access to the Statistics Toolbox, the BOOTSTRP function does 
what you are asking.  It is here:

<http://www.mathworks.com/access/helpdesk/help/toolbox/stat /bootstrp.html>
Reply Peter 8/17/2010 12:34:34 PM

Peter Perkins <Peter.Perkins@MathRemoveThisWorks.com> wrote in message <i4dvkq$8ns$2@fred.mathworks.com>...
> If you have access to the Statistics Toolbox, the BOOTSTRP function does 
> what you are asking.  it is here:
> 
> <http://www.mathworks.com/access/helpdesk/help/toolbox/stat /bootstrp.html>

Isn't this just:

X(:,ceil(rand(1,N)*N))

where X is the sample matrix?
Reply Simon 8/17/2010 2:18:05 PM

Thank you all for your replies. I'll try your suggestions and let you know the results.

CT DO
Reply cong-thanh.do (12) 8/17/2010 4:31:11 PM

On 8/17/2010 10:18 AM, Simon Preston wrote:
>> <http://www.mathworks.com/access/helpdesk/help/toolbox/stat
>> /bootstrp.html>

Sorry, for some reason that link was missing an "s"
<http://www.mathworks.com/access/helpdesk/help/toolbox/stats/bootstrp.html>

> Isn't this just:
>
> X(:,ceil(rand(1,N)*N))
>
> where X is the sample matrix?

That's the basis of it, yes.  But:

1) It's kind of tedious to write the same loop over and over, regardless 
of how simple that loop is,
2) There is a good deal of flexibility in the arguments you can pass to 
BOOTSTRP, so a single matrix isn't the only case it handles for you, and
3) (in recent MATLAB releases) There is support for parallelizing the 
computations using PARFOR (if your installation supports that)

Just as an aside, since 2008b you might find it easier to use RANDI to 
generate random integers.
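
To make the comparison concrete, here is a minimal sketch of the hand-written loop, using RANDI for the indices. It is only an illustration and not a description of what BOOTSTRP does internally. The toy data X, the number of replicates nboot, and the choice of the mean vector as the statistic are all assumptions made for the example.

```matlab
% Assumed toy data: M = 3 variables (rows), N = 500 observations (columns),
% matching the M-by-N layout used in this thread.
M = 3; N = 500;
X = randn(M, N);

nboot = 200;                     % number of bootstrap replicates (assumed)
bootMeans = zeros(M, nboot);     % one mean vector per replicate

for b = 1:nboot
    idx = randi(N, 1, N);        % N column indices drawn with replacement
    Xb  = X(:, idx);             % whole columns (observations) stay together
    bootMeans(:, b) = mean(Xb, 2);
end

% The spread of the replicated means estimates the standard error of each mean
seMean = std(bootMeans, 0, 2);
```

Indexing whole columns is what keeps the M components of each observation together, so the dependence between the variables is preserved in every resample.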
Reply Peter.Perkins (345) 8/17/2010 5:58:47 PM

Just one thing to point out: you said that M is the dimension of the data. I thought that you meant different groups or different experiments where the data was collected; after all, that's why your data is not of dimension N*M x 1, for instance. If the columns of the matrix represent different groups, then for one reason or another you cannot pool the series. As far as I know, 'bootstrp' does not distinguish among different groups. If this last statement is incorrect, can someone send me a link to read about it?
Thanks
Reply Rogelio 8/17/2010 7:31:04 PM

I mean that I have N observations of the random vector x, and the vector x has M elements; these are the seed data. So each variable here is a vector (of M elements). Their probability density function (pdf) might be a multivariate distribution, e.g. a Gaussian mixture model (GMM). Since the bootstrap here is non-parametric, the N observations will be used instead of a concrete pdf.

I have tried to use BOOTSTRP to perform the bootstrapping, but it is not easy, perhaps even infeasible (tell me if I am wrong), since the BOOTSTRP documentation in MATLAB is not clear about this case (I think).

If the generated data is only X(:,ceil(rand(1,N)*N)), I don't see anything new that the bootstrap can bring. As I see it, this is only a reordering of the initial data; we cannot expect anything different from the new data. Am I wrong?
Reply CT 8/18/2010 6:17:24 AM

If you are saying, or have a feeling, that your data might come from a multivariate distribution, then as far as I know 'bootstrp' will pool your data together, assuming they come from the same pdf, which might be an erroneous assumption.
> I have tried to use BOOTSTRP to perform the bootstrapping, but it is not easy, perhaps even infeasible (tell me if I am wrong), since the BOOTSTRP documentation in MATLAB is not clear about this case (I think)
Why? Can you tell us what the error is, or post the code?
> As I see it, this is only a reordering of the initial data; we cannot expect anything different from the new data. Am I wrong?
What bootstrapping does, roughly speaking, is resample with replacement. We create pseudo-random samples from your original data. The empirical distribution of the data converges to the underlying distribution asymptotically, i.e. as the number of observations grows.
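
A small sketch of why resampling with replacement is not just a reordering of the data. The sample size N is an arbitrary choice for illustration. A permutation uses every observation exactly once, whereas a bootstrap draw repeats some observations and omits others; on average a fraction 1 - (1 - 1/N)^N of the original observations, roughly 63% for large N, appears in each resample.

```matlab
N = 500;                          % number of observations (assumed)

permIdx = randperm(N);            % reordering: each index exactly once
bootIdx = randi(N, 1, N);         % bootstrap: indices drawn with replacement

nUniquePerm = numel(unique(permIdx));   % always N
nUniqueBoot = numel(unique(bootIdx));   % typically well below N

expectedFraction = 1 - (1 - 1/N)^N;     % approaches 1 - exp(-1) for large N
fprintf('unique in permutation: %d of %d\n', nUniquePerm, N);
fprintf('unique in bootstrap:   %d of %d (expected fraction %.3f)\n', ...
        nUniqueBoot, N, expectedFraction);
```

Because each resample weights the observations differently, a statistic computed on it varies from resample to resample, and that variation is what the bootstrap uses to assess the statistic's uncertainty.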
Reply Rogelio 8/18/2010 6:55:23 AM

By the way, what is the statistic that you are bootstrapping? It would be nice if you could post the code.
Reply Rogelio 8/18/2010 7:08:05 AM

On 8/18/2010 2:55 AM, Rogelio wrote:
> If you are saying or have a feeling that your data might come from a
> multivariate distribution, then as far as I know 'bootstrp' will pool
> your data together, assuming they come from the same pdf which might be
> an erroneous assumption.

Rogelio, your definition of "multivariate" seems to mean "grouped" or 
"stratified" or "from a mixture distribution".  The usual way to define 
"multivariate" is simply that there are multiple variables.  You are 
correct that BOOTSTRP does not resample with stratification, but it's 
not clear that that is what the OP was asking about.
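
For readers who do have grouped data, resampling within each group can be done by hand. The following is only a sketch under assumptions: the toy data X, the label vector g, and the group sizes are all hypothetical, and this is not a BOOTSTRP feature.

```matlab
% Hypothetical grouped data: 12 observations (rows) of 3 variables,
% with group labels g (5 observations in group 1, 7 in group 2).
N = 12;
X = randn(N, 3);
g = [ones(5, 1); 2 * ones(7, 1)];

idx = zeros(N, 1);                 % resampled row indices
groups = unique(g);
for k = 1:numel(groups)
    members = find(g == groups(k));                   % rows in this group
    pick = randi(numel(members), numel(members), 1);  % draw with replacement
    idx(g == groups(k)) = members(pick);              % stay inside the group
end

Xb = X(idx, :);    % stratified resample: group sizes are unchanged
```

Each group contributes the same number of rows as in the original data, and no row is ever drawn from a different group.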
Reply Peter 8/18/2010 12:22:25 PM

For instance, I have a matrix X of initial data with M = 3 and N = 500. There are thus N = 500 observations of a tri-variate random vector x following a multivariate normal distribution. These data can be generated by the code:
mu = [1 -1 -2]; Sigma = [2 -1 1; -1 2 -1; 1 -1 2];
X = mvnrnd(mu, Sigma, 500);
I don't know if I can use 'bootstrp' to generate data of the same nature, i.e. data that follow (asymptotically) the multivariate normal distribution that I used to generate X:
[bootstat, bootsamp] = bootstrp(10, [], X);
(I don't care about the statistics of the data at the moment; I only want the resampled data.)

However, 'bootstrp' returns the matrix bootsamp of dimension 500x10, so has 'bootstrp' only handled a one-dimensional variable? And I don't know whether 'bootstrp' can return statistics for a multivariate distribution (here, the mean vector and covariance matrix).
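
As far as the BOOTSTRP documentation describes it, the second output holds indices into the rows of the data rather than the resampled values themselves, which would explain the 500x10 size. The sketch below follows that reading and assumes the Statistics Toolbox is available. Note that mvnrnd returns X as a 500-by-3 array, one observation per row. The names statfun and covFirst are introduced here for illustration only.

```matlab
% Requires the Statistics Toolbox (mvnrnd, bootstrp).
mu = [1 -1 -2];
Sigma = [2 -1 1; -1 2 -1; 1 -1 2];
X = mvnrnd(mu, Sigma, 500);              % 500-by-3: one observation per row

nboot = 10;

% With an empty bootfun, only the index matrix is of interest.
[~, bootsam] = bootstrp(nboot, [], X);   % bootsam: 500-by-10 row indices

% The b-th resampled data set keeps each row (observation) intact.
b = 1;
Xb = X(bootsam(:, b), :);                % 500-by-3

% To bootstrap multivariate statistics, return them as one row vector:
% here the 3 means followed by the 9 entries of the covariance matrix.
statfun = @(x) [mean(x, 1), reshape(cov(x), 1, [])];
bootstat = bootstrp(nboot, statfun, X);  % 10-by-12, one row per replicate

% Recover the covariance matrix of the first replicate.
covFirst = reshape(bootstat(1, 4:12), 3, 3);
```

Each column of bootsam defines one resampled data set, so the multivariate structure is not lost; it simply has to be recovered by indexing the rows of X.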
Reply CT 8/18/2010 3:55:28 PM

Here's some very basic code that might illustrate what's going on.

%%  Generate your original data set
mu = [1 -1 -2]; Sigma = [2 -1 1; -1 2 -1; 1 -1 2];
X = mvnrnd(mu, Sigma, 500);

%%  Sampling with replacement to create a new data set
% Generate an index: row numbers drawn with replacement
n = size(X, 1);
boot_index = randsample(n, n, true);

% Use the index to create a new dataset
Boot_dataset = X(boot_index, :);

A bootstrap is simply repeating this same operation nboot times and then calculating something interesting using this set of new data sets.

Jumping back to the whole "multivariate" discussion:

Each time you draw from X, you extract an entire row. All of the elements of this row are related in that they are a single output from your original multivariate normal distribution.

All of this assumes that you need to perform a nonparametric bootstrap.

If you have prior knowledge that your population is described by a multivariate normal distribution with

mu = [1 -1 -2]

and

Sigma = [2 -1 1; -1 2 -1; 1 -1 2];

then it's often entirely appropriate to use a parametric bootstrap and generate your new dataset using mvnrnd.
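
A minimal sketch of the parametric alternative, assuming the Statistics Toolbox is available. Using the estimates muHat and SigmaHat in place of known parameters, the number of replicates nboot, and the mean vector as the statistic are assumptions of this example rather than part of the suggestion above.

```matlab
% Requires the Statistics Toolbox (mvnrnd).
mu = [1 -1 -2];
Sigma = [2 -1 1; -1 2 -1; 1 -1 2];
X = mvnrnd(mu, Sigma, 500);        % original sample, 500-by-3

% Fit the assumed model (multivariate normal) to the sample
muHat    = mean(X, 1);
SigmaHat = cov(X);

% Parametric bootstrap: simulate new data sets from the fitted model
nboot = 100;                       % number of replicates (assumed)
n = size(X, 1);
bootMu = zeros(nboot, numel(muHat));
for b = 1:nboot
    Xb = mvnrnd(muHat, SigmaHat, n);   % new 500-by-3 data set
    bootMu(b, :) = mean(Xb, 1);        % statistic of interest
end
```

Unlike the nonparametric version, the simulated rows are new values rather than copies of observed rows, so the result is only as good as the normality assumption.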


Reply Richard 8/18/2010 4:56:17 PM

Just a correction: the covariance matrix that I used is only an example to illustrate the generation of multivariate data. A matrix like that might not make sense.
Thank you for the discussions.
Reply CT 8/19/2010 4:13:58 PM
comp.soft-sys.matlab

13 Replies
728 Views

Similar Articles

[PageSpeed] 6


  • Permalink
  • submit to reddit
  • Email
  • Follow


Reply:

Similar Artilces:

ANNC: Data Loom, Multivariate Data Visualization for Windows
Hi, I've just released Data Loom for Windows. You can download it from the files section of the Yahoo Data Loom group: http://groups.yahoo.com/group/dataloom/files/ It is the file called "Data Loom Setup.zip". Data Loom displays the data in a parallel coordinates chart, where each data variable is represented by a vertical axis and each data point is represented by a trace drawn across all the axes. Such charts can help you to see relationships between variables and to identify and characterize outliers in your data. It reads tab-delimited text files. Thank ...

Creating Clones of data rows and back to Multivariate data structure
Hi, Lets suppose I have a table "Big_data", I also have a table called "NotSoBig_dataset1". 1. The table Big_data (A) would have variables Market (Mkt) - 52 different markets, Product (Prod) -- 30 different products, Period (Per) -- 104 different weeks 2003/2004 data. It also has lots of variables of interest to do with Price, Volume, discount and lots of other measures . 2. The table "NotSoBig_dataset1" (B) is quite tricky. It has a variable Mkt1 (has same market names as variable Mkt in A but not all markets are necessarily present), Prod1 (has same product n...

Re: Creating Clones of data rows and back to Multivariate data
Why do you want to do this? The structure you propose is generally less flexible and harder to work with than the one you have. Assuming some good reason ... Test data: data given; infile cards dsd missover; input Market $ Product $ Period : yymmdd8. Var1 Var2; format period date9.; cards; RET,B69,20030901,45,67 RET,B69,20030901,25,17 RET,B69,20030901,90,69 BHI,HG,20040708,876, BHI,HG,20040708,9,987 WER,UI,20050921,12,70 WER,UI,20050921,,908 WER,UI,20050921,79,96 ; The big question is whether or not you are able to specify an upper bound on the num...

Re: Creating Clones of data rows and back to Multivariate data #2
Hari, Study the log for the program below and you should understand how FIRST. and LAST. variables are set. data w ; input x y z ; cards ; 1 1 1 2 1 1 3 1 1 3 2 1 3 2 2 3 3 2 ; data _null_ ; set w ; by x y z ; put _all_ ; run ; Ian Whitlock ================ Date: Thu, 12 Jan 2006 11:53:03 -0800 Reply-To: Hari <excel_hari@YAHOO.COM> Sender: "SAS(r) Discussion" From: Hari <excel_hari@YAHOO.COM> Organization: http://groups.google.com Subject: Re: Creating Clones of data rows and back...

Re: Creating Clones of data rows and back to Multivariate data #4 1550792
On Thu, 12 Jan 2006 11:19:48 -0800, Hari <excel_hari@YAHOO.COM> wrote: >"Howard Schreier <hs AT dc-sug DOT org>" wrote: >> Why do you want to do this? The structure you propose is generally less >> flexible and harder to work with than the one you have. >> >> Assuming some good reason ... >> >> Test data: >> >> data given; >> infile cards dsd missover; >> input Market $ Product $ Period : yymmdd8. Var1 Var2; >> format period date9.; >> cards; >> RET,B69,20030901,45,67 >&...

Re: Creating Clones of data rows and back to Multivariate data #4 658717
On Sat, 14 Jan 2006 04:00:40 -0800, Hari <excel_hari@YAHOO.COM> wrote: >Hi, > >I learned post-haste that the 2 variables var1 (numeric) and var2 >(string) need to be dealt in a slightly different manner. For sample >case, Var1 contains Circulation figures then Var2 is a categorical >variable (like Size of Ad can be Full page, Half Page and lets say 3 >more levels) > >If my data file has 5 levels of Var2 then each unique row (made from >Market/product/Week combination) should contain only the variables >Market/product/Week and 5 new variables called Circu...

bootstrap and missing data?
Hello, Might anyone please kindly answer my question? Thank you very much. I am using Amos. I need to do bootstrapping. However, I have missing data. I found from the literature that, " AMOS requires that the input database be complete for diagnosing sample data non- normality and for using any of its bootstrap features. In other words, if you have missing data, you must solve the missing data problem before you can use AMOS's non-normality diagnostic and bootstrap features." How to solve the missing data problem before I use Amos to diagonize non-normality and d...

Bootstrpping multivariate data
Hi all, I am searching for a Matlab function that can do the non-parametric bootstrapping of multivariate data. For instance, I have a matrix of sample data (MxN) where M is the dimension of the random vector (multivariate data), and N is the number of observation. I want to generate (resample) bootstrap data from this initial multivariate data. Does anyone knows this function? Thank you very much for your help. Best regards, CT DO ...

spread of data, multivariate
Hi all, how is the spread of data computed, if i know cov matrix in multivariate ...

Multivariate data analysis
Hi everyone, I have some problem regrading data analysis in Matlab. Are there any Matlab functions that allow me to find out the relationship between two input variables? Say p is a set of experimental data, which depends on both input x and y, how can I work out p=f(x,y) by using Matlab? Thanks very much Ericson <ericson@sonet.com> wrote in message news:<eeccb23.-1@webx.raydaftYaTP>... > Hi everyone, I have some problem regrading data analysis in Matlab. > Are there any Matlab functions that allow me to find out the > relationship between two input variables? Say p ...

Bootstrap with clustered data
I want to do some bootstrapping with data which is clustered (multiple observations per person). I plan to modify the %boot macro. Has anyone already done this? Any tips? -----Original Message----- >From: BruceBrad <BruceBrad@INAME.COM> >Sent: Jan 15, 2008 2:05 AM >To: SAS-L@LISTSERV.UGA.EDU >Subject: Bootstrap with clustered data > >I want to do some bootstrapping with data which is clustered (multiple >observations per person). I plan to modify the %boot macro. Has anyone >already done this? Any tips? Instead of this, see David Cassell's paper "Don...

lev-mar multivariate data
I am trying to fit a Beam Profile, which is a 2 dimentional Gaussian and was wondering if anyone might have any suggestions on how to go about it without having to recreate too much of what is already available in LabVIEW. Thanks Eugene You need to rewrite the nonlinear Lev-Mar fit so the fit funtion operates on the entire array, instead of one point at the time and everything will be much easier. Then simply reshape your 2D gaussian data and model function to a 1D array of size (x*y) and fit as usual. <b>To get you started we've done some of the work for you:</b> :-) Qu...

Re: bootstrapping (genetic data)
On Wed, 30 Nov 2005 14:31:05 -0500, Zach Peery <zpeery@NATURE.BERKELEY.EDU> wrote: >Hi All, > >I have what I think is a pretty straight forward question about >bootstrapping genetic data. I would like to randomly select genes from >individuals (with replacement) from an original dataset and make new >individuals with those genes placed into a new dataset. Easy (and done in >the sas statements below), but the catch is that my data set has more than >one gene. The law of independent segregation says that different genes >must be independent for a given newly ...

Re: bootstrap and missing data?
zencaroline@GMAIL.COM wrote: > >Hello, > > Might anyone please kindly answer my question? Thank you very >much. > > I am using Amos. I need to do bootstrapping. However, I have >missing data. I found from the literature that, " AMOS requires that >the input database be complete for diagnosing sample data non- >normality and for using any of its bootstrap features. In other words, >if you have missing data, you must solve the missing data problem >before you can use AMOS's non-normality diagnostic and bootstrap >features." > >...

Draw ellipse for multivariate data
i want to draw ellipse for multivariate data. here is my wish; i have some random numbers x which is mu=zeros(1,p) % p is dimension var=eye(p) % p is dimension x=mvnrnd(mu,var,n) and i estimate the location and shape parameter of this data with MCD method. method is not important for you because i don't have any problem. And then i want draw an ellipsoid with center estimated location parameter and its' border is chi-squre with p degrees of freedom and alfa=0.05. in theory inequality is; 1-) (x-mu)'*inv(var)*(x-mu)<= chi2inv(p,alfa) x element of R1 2-) chi2in...

Nonlinear regression of multivariable data
i want to perform nonlinear regression on my data. i have 5 variables and user defined nonlinear equation to fit in data. can anyone provide .m file script for nonlinear regression through matlab. variables x1,x2,x3,x4 equation y = a*x1^b*x2^c*x3^d*x4^e On 4/18/2014 1:38 PM, Arya Harish wrote: > i want to perform nonlinear regression on my data. i have 5 > variables and user defined nonlinear equation to fit in data. can > anyone provide.m file script for nonlinear regression through matlab. > variables x1,x2,x3,x4 > equation y = a*x1^b*x2^c*x3^d*x4^e help optimfun ...

How to find the best fit for a data with multivariables
Hi, I have a matrix of size 800 x 5 containing the data of the five independent variables a,b,c,d,e and a vector of size 800 x 1 containing the data for the dependent variable z. I want to fit a model to the data giving the functional relation between the dependent variable z and the five independent variables a,b,c,d,e. My questions are: 1. Which functions/tools in MATLAB should I use to get the best fit? 2. Which MATLAB tools/functions should I use to get an expression for the fitted model and also validate the fitted model? Best Regards, Rabi "Rabi " <rabikhattak@gmail.c...

plotting multivariables from an excel data sheet
Hi all, I have some experimental data with three variables I would like to plot, but I am having trouble figuring it all out. v1, v2, and V are my variables. v1 represents voltage ranging from 0-5 while v2 represents voltage from 5-0, and V is the output voltage produced from the combination of the two. For example, v1 = 0, v2 = 5, V = -7.53. The next entry would be v1 = 0, v2 = 4, V = -6.05, then v1 = 0, v2 = 3, V = -4.54, and so on. Any help will be much appreciated; it's been a while since I have used MATLAB, so I have forgotten a lot, it would seem. Here is my code thus far, I know...

using mysqldump data for mysqld --bootstrap
When moving data from machine 1 to machine 2, I tried to script the moving of MySQL data, basically in pseudocode:

foreach db
  mysqldump the database
  move file to new machine
  cat dumpfile | mysqld --bootstrap --datadir=$DDIR
end

This fails at first, since mysqld --bootstrap can't deal with line breaks in statements, so the CREATE TABLE .... fails, as it is distributed over multiple lines. A little scripting to remove the newlines, and now the script works as planned, but is this intentional behaviour? Shouldn't mysqld accept the mysqldump files? (Next step coul...

Creating Bootstrap replicates for clustered data
I'm trying to modify the SAS %boot macro to do simple bootstrap replication with clustered data. E.g., I have multiple observations per person, and want to resample people rather than observations. I wrote the following test code. The "where" statement doesn't work; it seems I can only use variables in the file being subsetted. Any other suggestions? I don't think proc surveyselect can do this.

* Create test data;
%let Nclusters=10;
data datain (index=(cluster) sortedby=cluster);
  do cluster = 1 to &nclusters;
    do case = 1 to ceil(ranuni(0)*5); /* 1 to 5 cases per clu...

how to generate multivariate random data from a given
Hi, I want to generate multivariate random data from a given distribution which is not multivariate normal or Student's t. Specifically, the idea is from the paper by Clayton et al. (1985), Journal of the Royal Statistical Society, Series A. Does anybody have any suggestions? Thanks, Jeff ...

Draw ellipse for multivariate data #2
I want to draw an ellipse for multivariate data. Here is what I have: some random numbers x, generated as

mu=zeros(1,p) % p is dimension
var=eye(p) % p is dimension
x=mvnrnd(mu,var,n)

and I estimate the location and shape parameters of this data with the MCD method. The method itself is not important here because I don't have any problem with it. I then want to draw an ellipsoid centered at the estimated location parameter whose border is the chi-square quantile with p degrees of freedom and alfa=0.05. In theory the inequality is: 1-) (x-mu)'*inv(var)*(x-mu)<= chi2inv(p,alfa) x element of R1 2-) chi2in...

multivariate & multidimensional data generate
Good morning, everyone. I want to generate two datasets. The first one is assumed to be multivariate normally distributed with a zero mean vector and a diagonal variance-covariance matrix. The second one is multidimensional data of size 3*3. However, I do not have the Statistics Toolbox. Please help me! ...

Log-Normal MultiVariate Data Generation
Hi: I have to generate log-normal multivariate data for my work. I am using the code that's available for this. Here is the link for the code I am using to generate log-normal data: http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6426&objectType=file Now when I use my input values (Mu, Sigma and CorrMat) I get an error saying that 'sigma should be positive semi-definite'. Is there a way to get around this error? How can I make my sigma positive definite? Is there a general way in which this can be done so that it works for any input data? Thanks neo...