LISTSERV at the University of Georgia
Date:         Wed, 14 Mar 2007 15:23:14 +1100
Reply-To:     "Johnson, David" <David.Johnson@CBA.COM.AU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Johnson, David" <David.Johnson@CBA.COM.AU>
Subject:      Re: Randomly splitting a dataset
Content-Type: text/plain; charset="us-ascii"

Every day, another nugget of knowledge; a former colleague called them "factoids".

I thought the SurveySelect procedure was part of one of the products other than SAS/Stat. Had I known this two days ago, I would not have coded about five steps to select one sample record from each of six class values (A to F), where a random half of the records had values between 0 and 1 and the other half had values between 1 and 2.

Shall I spend an hour now RTFMing and muddle my way through getting this from the proc, or would you like to take pity on me, Dave, and suggest some syntax? I've written some code to generate a sample data set.

Data CLIENTS;
  Do CLIENTID = 1 To 10000 By 1;
    SECCLASS = Substr( "ABCDEF", Ceil( RanUni( 1234) * 6), 1);
    LOSSRATE = RanUni( 7890) * 2;
    Output;
  End;
Run;
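For anyone following along, the "one record per class value" part of this might be sketched with a stratified draw, as below. This is only a sketch under assumptions: the data set names CLIENTS_SORTED and ONE_PER_CLASS are made up for illustration, the seed is arbitrary, and it does not attempt the split of LOSSRATE into the 0-to-1 and 1-to-2 halves.

```sas
/* STRATA processing expects the input sorted by the stratum variable. */
/* CLIENTS_SORTED is a hypothetical name, not from the thread.         */
proc sort data=CLIENTS out=CLIENTS_SORTED;
  by SECCLASS;
run;

/* Simple random sample of one record within each SECCLASS value.      */
/* ONE_PER_CLASS and the seed are illustrative choices.                */
proc surveyselect data=CLIENTS_SORTED out=ONE_PER_CLASS
                  method=srs n=1 seed=1234;
  strata SECCLASS;
run;
```

With six class values in CLIENTS, ONE_PER_CLASS should hold six rows, one per SECCLASS.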

Kind regards

David

/* - - - - - - - - - - - - - - - - - - - - - It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts. -Sir Arthur Conan Doyle - - - - - - - - - - - - - - - - - - - - - */

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of David L Cassell
Sent: Wednesday, 14 March 2007 2:46 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Randomly splitting a dataset

kaylom01@ODJFS.STATE.OH.US wrote:
>
>Hello- I was wondering if anyone can help with SAS coding to randomly
>split a dataset into two parts. I have a dataset and I want to randomly
>divide it into two parts so that I can build a logistic regression
>model with one half of the data and then test the model on the second
>half. Any information/suggestions would be greatly appreciated!!! Thank
>you for your help and time.

Here's but one way:

proc surveyselect data=YourData out=OutStuff seed=49487
                  outall samprate=50;
run;

Now you have a new variable SELECTED in your output, and 50% of your data will have SELECTED=1 while the rest have SELECTED=0. Split on that, using a WHERE clause in your data set options.
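The split on SELECTED could look something like the sketch below. The output names TRAIN and TEST are assumptions for illustration; OutStuff is the OUTALL data set from the PROC SURVEYSELECT step above.

```sas
/* Build half: records flagged SELECTED=1 by PROC SURVEYSELECT. */
/* TRAIN and TEST are hypothetical data set names.              */
data TRAIN;
  set OutStuff(where=(Selected = 1));
run;

/* Test half: everything that was not selected. */
data TEST;
  set OutStuff(where=(Selected = 0));
run;
```

The same WHERE= data set option can also be put directly on the DATA= option of a modeling procedure, which avoids creating the two extra data sets.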

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
