LISTSERV at the University of Georgia
Date:         Fri, 21 Sep 2007 08:54:46 -0500
Reply-To:     "Swank, Paul R" <Paul.R.Swank@UTH.TMC.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Swank, Paul R" <Paul.R.Swank@UTH.TMC.EDU>
Subject:      Re: Cluster analysis help needed
Comments: To: cherub <cherub2life@YAHOO.COM>
In-Reply-To:  <385670.72446.qm@web33314.mail.mud.yahoo.com>
Content-Type: text/plain; charset="us-ascii"

Once you have clusters for the 20,000-observation sample, compute the cluster centroids and supply them as initial seeds to PROC FASTCLUS run on the total data set, specifying the number of clusters found originally. You will probably get some "drift" in the cluster centroids in the larger data set, but if the original 20,000 are fairly representative of the whole sample, the final centroids should be pretty close to the ones you started with.
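To make the idea concrete, here is a minimal sketch in plain Python (not SAS) of what seeding a k-means style procedure with the sample centroids amounts to. The function names `nearest` and `seeded_kmeans` are hypothetical and chosen only for illustration, and the toy data are made up. In SAS itself, PROC FASTCLUS handles this through a seed data set (the SEED= and MAXCLUSTERS= options); the sketch only illustrates the logic and is not a description of the FASTCLUS internals.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))


def seeded_kmeans(data, seeds, max_iter=10):
    """Run k-means on data, starting from the given seed centroids.

    With max_iter=0 the seeds are left untouched and every observation is
    simply assigned to its nearest seed (pure scoring, no "drift").
    """
    centroids = [list(c) for c in seeds]
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in data]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(data, labels) if lab == k]
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    # Final assignment against the final centroids.
    labels = [nearest(p, centroids) for p in data]
    return centroids, labels


# Toy example: seeds stand in for the centroids found on the random sample,
# full_data stands in for the complete data set.
sample_centroids = [[0.0, 0.0], [10.0, 10.0]]
full_data = [(0, 1), (1, 0), (1, 1), (9, 10), (10, 9), (11, 11)]

final_centroids, clusters = seeded_kmeans(full_data, sample_centroids)
scored_centroids, scored = seeded_kmeans(full_data, sample_centroids, max_iter=0)
```

Comparing `final_centroids` with `sample_centroids` shows how far the centroids drifted once the full data were used. Setting `max_iter=0` skips the updating and only scores each observation against the sample centroids.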

Paul R. Swank, Ph.D. Professor Director of Research Children's Learning Institute University of Texas Health Science Center-Houston

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of cherub
Sent: Friday, September 21, 2007 8:40 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Cluster analysis help needed

Any help will be highly appreciated!

I am running a cluster analysis on a large data set (more than 400,000 obs). I randomly selected 20,000 obs for the cluster analysis and want to use the results from this random sample as guidance for the rest of the obs.

However, how do I use the results from the random sample to score the rest and assign a cluster to each of the remaining obs?

Thanks very much.
