LISTSERV at the University of Georgia
Date:   Fri, 10 Dec 1999 09:23:32 -0700
Reply-To:   Mark S Dehaan/MSD/LMITCO/INEEL/US <MSD@INEL.GOV>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
Comments:   To: WHITLOI1 <WHITLOI1@WESTAT.COM>
From:   Mark S Dehaan/MSD/LMITCO/INEEL/US <MSD@INEL.GOV>
Subject:   Re: a random sample. I published 2 macro program ...
Comments:   To: SAS-L@LISTSERV.VT.EDU
Content-type:   text/plain; charset=us-ascii

Ian,

You expressed my sentiments exactly. I do appreciate the site owner's effort, but I found it very discomforting that the disclaimer about software accuracy was in a smaller font and a subtle color. (Flashing bright red would be appropriate here.) A lot of people may get burned, either by inaccurate or incorrect code on the site or by naive or lazy "plug and chug" use of the code. SI's apprehension may be justified. Thanks, Ian.

Regards, Mark DeHaan

WHITLOI1 <WHITLOI1@WESTAT.COM>@LISTSERV.VT.EDU> on 12/10/99 07:21:16 AM

Please respond to WHITLOI1 <WHITLOI1@WESTAT.COM>

Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.VT.EDU>

To: SAS-L@LISTSERV.VT.EDU
cc:

Subject: Re: a random sample. I published 2 macro program ...

Subject: Re: a random sample. I published 2 macro program ...
Summary: Problems with the code.
Respondent: Ian Whitlock <whitloi1@westat.com>

Renaud Harduin <r.harduin@ABS-TECHNOLOGIES.COM> offered two programs on a popular subject - drawing random samples. He wrote

> Go to the www.SAShelp.com web site, I published 2 macro program :
>
> %ECH_SPLE : simple random sample (optimized in I/O, MEM and CPU)
>             with distinct observation ==> Efficency
> %ECH_ALEA : Make a stratified random sample but requires more I/O
>             and CPU

I looked at the first program and found the following problems:

1) For any two "random" samples from a given data set generated by this program, the larger sample will contain the smaller sample. For example the code,

data w ; do s = 1 to 100 ; output ; end ; run ;

%ech_sple ( data = w , out = s10 , size = 10 )
%ech_sple ( data = w , out = s23 , size = 23 )

proc compare data = s10 compare = s23 ( obs = 10 ) ; run ;

produced a report with no differences found.

2) The variables I, J, and DSID are on the output sample.

3) The variable X cannot be on the input data set.

4) The last record can never be in the sample.

5) The probability of choosing the 0th obs (there isn't any) is 1/sample_size.

6) The number of logical obs is referenced, but the program can produce an incorrect result for every logically missing observation.

7) Duplicate choices must be eliminated in a subsequent step.

8) On efficiency - a nonworking linear search was used.
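
For comparison, here is a minimal sketch of one common way to draw an exact-size simple random sample without replacement in a single pass. It is not taken from the macro under discussion and is not offered as a replacement for it. It reuses the data set names W and S10 and the sample size of 10 from the example above; the working variables _K and _N are hypothetical names chosen for illustration and are assumed not to exist on the input data set. It also assumes that W contains no deleted observations, so that NOBS= matches the number of records actually read.

```sas
/* Sketch only: one-pass selection sampling of 10 records from W.   */
/* _K and _N are hypothetical working names, dropped from the       */
/* output; they must not already exist on W.                        */
data s10 ( drop = _k _n ) ;
   retain _k 10 ;                   /* records still needed         */
   retain _n ;                      /* records still to be read     */
   if _n_ = 1 then _n = nobs ;      /* NOBS= is set at compile time */
   set w nobs = nobs ;
   /* Select the current record with probability _K / _N.           */
   if ranuni ( 0 ) < _k / _n then do ;
      output ;
      _k = _k - 1 ;
   end ;
   _n = _n - 1 ;
   if _k = 0 then stop ;            /* sample complete, stop early  */
run ;
```

Under those assumptions each record is written at most once, so no later step is needed to remove duplicates, and the first and last records are treated like any other. A seed of 0 asks RANUNI to seed from the clock, so separate runs are not expected to repeat the same choices.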

I didn't look at the second macro.

The site itself is impressive, although I did get a glimmer of why the SAS Institute objects to sites using the SAS name. It is unfortunate that the quality of the programs is not monitored. This does not mean the other 93 tips/programs have the same quality; I didn't look at them.

I can go along with the SAS-L rationale that discussion must be free and open, hence code posted need not work. In this context the reader has a clear warning. But I find it frightening to see a professional-looking web site without any monitoring of the quality of posted programs.

Ian Whitlock

