LISTSERV at the University of Georgia
Date:         Fri, 24 Jul 1998 16:48:45 -0500
Reply-To:     "Nichols, David" <nichols@SPSS.COM>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From:         "Nichols, David" <nichols@SPSS.COM>
Subject:      Re: Weighting in hierachical cluster?
Comments: To: Jonas Gunnarsson <fondjg@HHS.SE>

There's no reasonable way to use a weight (that I can see) when computing dissimilarities or similarities between two cases, which is how CLUSTER starts. If your weights are proper integer frequency weights, you can physically replicate the cases. The duplicates will then all merge together early in the solution, and their number will be used in the weighted averages that result from the updating formulas once the joining of truly different cases/clusters begins. If the weights are noninteger sampling weights, replication isn't an option, and SPSS isn't designed to handle complex samples (I'm not sure how that would work out in this context anyway).
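To make the replication idea concrete, here is a minimal sketch in Python rather than SPSS syntax. The names `replicate_cases` and `squared_euclidean` and the sample data are hypothetical and chosen only for illustration. It shows only the expansion step and why duplicates join first (their distance is zero); the expanded data would then go to whatever hierarchical clustering routine you use.

```python
def replicate_cases(cases, weights):
    """Expand each case into `weight` identical copies.

    Assumes the weights are non-negative integer frequency weights;
    anything else is rejected, since a case cannot be replicated
    a fractional number of times.
    """
    if len(cases) != len(weights):
        raise ValueError("cases and weights must have the same length")
    expanded = []
    for case, w in zip(cases, weights):
        if isinstance(w, bool) or w < 0 or int(w) != w:
            raise ValueError(f"not an integer frequency weight: {w!r}")
        expanded.extend([tuple(case)] * int(w))
    return expanded


def squared_euclidean(a, b):
    """Squared Euclidean distance between two cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical data: three cases on two variables, with frequency weights.
cases = [(1.0, 2.0), (4.0, 6.0), (5.0, 5.0)]
weights = [3, 1, 2]

expanded = replicate_cases(cases, weights)

# Copies of the same case are at distance zero from each other, so any
# agglomerative method will join them before it joins distinct cases.
duplicate_distance = squared_euclidean(expanded[0], expanded[1])
```

Note that replication multiplies the number of rows, and hierarchical clustering works from a full case-by-case distance matrix, so large weights can make the problem expensive.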

David Nichols
Principal Support Statistician and Manager of Statistical Support
SPSS Inc.

----------
From: Jonas Gunnarsson [SMTP:fondjg@HHS.SE]
Sent: Monday, July 06, 1998 8:37 AM
To: SPSSX-L@UGA.CC.UGA.EDU
Subject: Weighting in hierachical cluster?

Dear all,

I am seeking a solution to an annoying problem. When I use the hierarchical cluster analysis module in SPSS to estimate a range of appropriate cluster solutions and their respective cluster centroids, there is, as far as I can tell, no way of weighting the data. Weighting is available in k-means; however, this does not help me correct for biases in my data in the crucial first hierarchical analysis. It also deprives me of about 400 extra cases that I could otherwise have used.

Is there a way around this problem, such as constructing new variables in the raw data file using the sample weights and THEN applying the clustering procedure to calculate distances? Any suggestions would be much appreciated!

--------------------------------------
Jonas Gunnarsson
Foundation for Distribution Research at the Stockholm School of Economics
fondjg@hhs.se
Internet: www.hhs.se/fdr/staff/jonasg.htm

