Re: Compressing data sets (was Re: [SAS-L]) #2 #4
It's not a big deal, but there is a difference between reading a file
directly and having another program as an intermediary, and someone
someday is going to get into trouble because they didn't take that into
I'm not sure what you mean by "using SAS/Connect to read non-SAS
Manager, Technical Development
Metrics Department, First Health
West Sacramento, California USA
>>> "Bruce Johnson" <bjohnson@SOLUCIENT.COM> 01/06/2004 9:36 AM >>>
But if another program is doing it, within SAS, wh...Re: Generating Random Data #4
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On
> Behalf Of Zai Saki
> Sent: Wednesday, April 23, 2008 3:43 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: Generating Random Data
> Thanks, all methods were helpful.
> However I wasn't able to get an exact 5% with these. Is there a way to
> control for that.
Yes, there are probably as many ways as people willing to post a solution, but your original request did say "So if we generated 200 respondents we
want **approximately** ten 1...Re: Random data imputation #4
Thank you for the clear explanation=2C now I get it and I have better under=
standing about uniform distribution too.
Sophia> Date: Wed=2C 19 Nov 2008 12:58:47 -0800> From: NordlDJ@DSHS.WA.GOV>=
Subject: Re: Random data imputation> To: SAS-L@LISTSERV.UGA.EDU> > > -----=
Original Message-----> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA=
..EDU] On> > Behalf Of Suhong Tong> > Sent: Wednesday=2C November 19=2C 2008=
12:39 PM> > To: SAS-L@LISTSERV.UGA.EDU> > Subject: Re: Random data imputat=
ion> >>...Best possible compression method for random data?
What is the best known compression method for random data?
I mean any random data, example in hex: it would be between 00-FF
Size let's say 1 mega >
Posted Via Usenet.com Premium Usenet Newsgroup Services
** SPEED ** RETENTION ** COMPLETION ** ANONYMITY **
) What is the best known compression method for random data?
) I mean any random data, example in hex: it would be between 00-FF
) Size let's s...Re: Secure method of FTPing data #4 682027
Indeed, I've actually used PGP in the past to encrypt data using
external calls (via the X command) to PGP's command line utility with
the following macro:
/*Encrypt the CSV File Created in Previous Step*/
x "pgp -tea +force c:\HF\SASExtracts\Extract&todaydate..txt
%let email@example.com; /*The public key for PGP encryption*/
Then, SAS automatically FTP's the encrypted file.
It worked well and met NIST standards for encryption and transmission of
data at the time.
Mark J. Lamias
SAIC...Re: How to pick records randomly from a data sets #4
I don't know much about STAT, but why don't you try something like:
If you want less output use 40 instead of 30 after RANUNI(0).
Don't know, if that does what you need...
On Fri, 23 Mar 2007 06:12:56 -0400, Arild S <sko@KLP.NO> wrote:
>On Fri, 23 Mar 2007 07:10:33 +0100, Rune Runnest?ne@FASTLANE.NO> wrote:
>>The point of choosing only 2 data points per year i simplicity. The ...Re: Secure method of FTPing data #4 1560272
Thanks for the tip!
On Tue, 16 Jan 2007 05:22:07 -0800, MikeInWV <mikeslover@GMAIL.COM> wrote:
>If possible, look into using PGP software to encrypt the file before
>sending it to an FTP server. You can obtain it from
>One note---the recipient would need to have PGP and provide you with
>their public key. I have recently gone through this process and it
>> On Mon, 15 Jan 2007 20:50:50 -0800, daniel_coppin@BRADYCORP.COM wrote:
>> >I am using the following ...Re: How to randomly select a certain data from a large dataset #4
X-OriginalArrivalTime: 06 Nov 2009 03:45:55.0563 (UTC) FILETIME=[A50AC3B0:01CA5E93]
X-Scanned-By: Digested by UGA Mail Gateway on 188.8.131.52
Content-Type: text/plain; charset="iso-8859-1"
The output of the following code is approximately 2% sample.
if ranuni(2)<=3D0.02=3B /*approximately 2%=2C e.g: if ranuni(2)<=3D0.1 fo=
Hope ...Re: How to randomly select records from a large data set? #4
Remove the semicolon after ...small on the first line in:
proc surveyselect data=melody.melody_2 out=melody.melody2_small;
proc surveyselect data=melody.melody_2 out=melody.melody2_small
On Jan 22, 2008 10:14 AM, Melodyp <firstname.lastname@example.org> wrote:
> On Jan 21, 3:47 pm, Melodyp <pearsonmel...@gmail.com> wrote:
> > Hello! I have a question on how to select about 8500 records from a
> > nearly one million records dataset.
> > The 8500 records must be randomly sel...Re: Random contents using data (newbie question) #4
>Hello for the first time!
>How do I create a table using "data" instruction and populate it with
>random records? I need to create some sales report containing random
>records with amount of items sold, salesperson ID, item ID, date of
>Thank you in advance for any hints.
In addition to the great advice you ahve already seen, let me
add another point.
If you have some starting set of variables in a SAS data set,
you can sample from that to get working test data sets...Re: Method to distinguish different periods in time series data #4
I think that some good graphing can say a whole lot, and does not get
into the potentially problematic area of doing statistical inference
without thinking hard about it - always a dangerous area. Graphs can
be helpful in getting your ideas together about next steps, and if
done ethically and precisely they can be great descriptive tools to
support your other analysis.
Now you mentioned that you have several variables. Do you have a
single target variable and the others potentially affect this one, or
are all of the variables of equal or unknown importance?
On 10/15/07, kansaskannan@g...Re: COMPRESS function working differently in Data Step and Macro #4
The suggestion you provided using %BQUOTE got rid of the WARNING messages,
but did have the desired result in naming the HTML files. The COMPRESS
function only removed spaces, not punctuation - and the 2nd and 3rd
parameters on the COMPRESS function became part of the HTML file name.
Ex. nh_St.Mary's,'','p'.html nh_AnneArundel,'','p'.html
Any additional suggestions are appreciated. Thanks.
Center for Health Program Development and Management
University of Maryland, Baltimore County
-----Origina...Re: Randomly Select Observations on Infile Step (Large Data Sets) #4
Your proposal does not insure that every case has equal likelihood of being included in the random sample. In particular the cases near the end-of-file will not have the same probability of being chosed as the cases new the beginning.
I believe this is a good case for hash table use. Process the entire raw data file, using the hash table to hold all the variables for the 50 cases with the highest value of the random number. Obviously the hash table will start out with the first 50 cases from the raw file, but as new cases with higher random numbers are identified, the entry with th...Re: Los Angeles Times: Low-Tech Methods Used in Data Theft #4
> Consumers have been able to challenge adverse entries in their reports
> for years. Reporting companies are required to investigate and remove
> said item if it can't be substantiated. Furthermore, the consumer is
> required to be told when a credit report was used to as a basis of an
> adverse decision and is entitled to request a copy of that report,
> even if they have their free annual report allowance used already.
From what I read in the newspapers, that protection doesn't
work very well in practice. In other words, the consumer ends...Re: Random number generation
I, too, hope Ed enjoys his well earned vacation and my intent was
definitely not to pick on Ed, but discover some additional tidbits like
those you shared in your post.
And, of course, I used a program to choose my seed.
Don't miss the point though. Many on this list, including me, don't know
many of the things which you and others may find intuitively obvious.
Anything else you can add to my wish list?
On Wed, 4 Feb 2009 21:50:32 -0500, Ian Whitlock <iw1sas@GMAIL.COM> wrote:
>There is something funny here. There are exactly 12 4-digit s...Determine compression method from compressed data
Maybe one of you recognizes some characteristic bits in the stream
described below. My goal is to find a way to compress a different stream so
that the same given decoder will be able to decompress it, but I haven't
yet been able to determine the compression method nor its parameters.
There is no header (no gzip,arj,...). Data is stored in contiguous chunks;
each prefixed with one big-endian 16-bit word specifying the expected
length (in bytes) of the chunk when uncompressed, and a second word
specifying the number of bytes following in the compressed chunk.
The uncompressed length...Re: random numbers not random? #4
-> If you want to roll your own, you can take two different RNGs, call both of
-> them and multiply the results, use MOD as necessary to reduce the magnitude
-> to what is desired, for each random number. :-)
Hmmm... If both numbers are integers, then their product will have a
zero in the rightmost position if *either* (or both) of the initial
numbers have a zero there. So the probability of the product having a
zero there is nearly 20%, instead of 10% which would be the case in
truly random numbers.
Likewise, the probability would be 75% that the product wou...Re: Compressing data sets (was Re: [SAS-L])
On Mon, 5 Jan 2004 18:14:02 -0700, Jack Hamilton
>OK, "you or a program on your behalf will have to decompress them before
>Manager, Technical Development
>Metrics Department, First Health
>West Sacramento, California USA
>>>> "Richard Graham" <email@example.com> 01/05/2004 5:04 PM
>Actually you can compress(WINZIP, PKZIP) SAS data sets and use them in
>compressed format. There is software...Re: Data step question for longitudinal data #4
>>> "Gerstle, John" <yzg9@CDC.GOV> >>> wrote
What's the analysis question that needs to be addressed by the data?
From your description (granted it's generalized), it's hard to determine
exactly what form the data is needed for analysis.
I usually start at the question to determine what form the data needs to
be in for the final analysis, then look at the data I have and then
connect the dots. :) Sometimes this is straight forward, and sometimes
INTERESTING CODE SNIPPED
Eventually, I am going to be ...Re: Interview questions re data management #4
I'd begin with questions about storage of media, file management,
database versions, and data security. Ninety six CD's containing one
file each would not overwhelm anyone, but management of corrected
versions of data, passwords, statistical reports, and metadata adds up
to a daunting inventory problem. Ask the candidate to sketch a plan for
managing data receipts and deliveries. I would expect to see evidence
that the candidate understands receipt control, data security concerns,
and production processes. He or she should know to archive original
media immediately after copying...Re: combining data between different data sets #4
On Tue, 19 Jul 2005 07:35:18 -0700, rss <rslotpole@COMCAST.NET> wrote [in part]:
>In English: Ibm, dec and hwp. We own 100 shares of each. They pay a
>dividend. You multiple shares * dividend and then sum them to get the
>total dollars received. We want to add this value to the cash value
>and really don't want to associate it with every record in the dataset.
> Ticker is by date and our cashvalues are also by date
>Open two different data sets;
>If ticker (from first data set) = '*$$$' then
> Endshare (from first data set) =...Re: Random Numbers again : WAS RE: genetic inheritence in a #4
Subject: Randomness when switching seeds.
I ran your code. the step took 61 seconds. (Thankful that I did not
need 100,000 numbers.) I also ran
data q ;
seed = 0 ;
do obs = 1 to 1000 ;
call ranuni (seed,r) ;
if r > .01 then seed + 1 ;
r2 = ranuni ( 0 ) ;
This took .01 seconds (1/600 the time) and produced an approved random
sequence R2 as well as one, which like yours kept switching random
Finally, I made a format grouping by tenths and did freqs. Your
r ...Re: Converting surveymonkey data into sas data #4
Well that would depend on whether the character value is '3' '4' '5' or
If it is discrete then yes:
'0' = 1
'1' , '2' = 2
'3' , '4' , '5' = 3
Would be how I would put in a format.
Otherwise I would have to go with what I wrote.
From: "Dennis G. Fisher" <dfisher@CSULB.EDU>
Reply-To: "Dennis G. Fisher" <dfisher@CSULB.EDU>
Subject: Re: Converting surveymonkey data into sas data
Date: Mon, 3 Apr 2006 09:46:02 -0700
Why wo...Re: make a data set from an empty data? #4
OKay Kevin ,
So I posted the wrong code its been one heck of a day. I cant remember when
I wrote so much code. My fingers are worn down to the nub and my mind what
little I have is threatening a strike.
create table tmplt
If _N_ = 0 then set tmplt;
data = xx ;
data = xx ;
From: Kevin Roland Viel <kviel@EMORY.EDU>
Reply-To: Kevin Roland Viel <kviel@EMORY.EDU>
Subject: Re: make a...Compressing "random" data? They are not that random, actually we have a HINT...
Let's describe a problem that I want to solve. I need to compress
blocks of seemengly random data, which are created in a way described
Suppose that block size is always 128Kb for all blocks.
A compressible block of data A is xored with a block of data R that is
random perfectly random and saved into block B, which is then also
statistically perfectly random. Blocks R and B can be downloaded so
that anyone who has knowledge that B = R xor A can obtain A = R xor
B. I need to be able to construct an algorithm that will compress
block B so tha...