Date: Thu, 11 Oct 2007 13:05:56 -0400
Reply-To: "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Subject: Re: PCA using SAS and R
In-Reply-To: A<1192111749.817029.276730@22g2000hsm.googlegroups.com>
Content-Type: text/plain; charset="us-ascii"
Song [sounpra@YAHOO.COM] said:
> On Oct 10, 10:27 pm, mkei...@WHARTON.UPENN.EDU ("Keintz, H. Mark")
> wrote:
> > Song [soun...@YAHOO.COM] said:
> >
> >
> >
> > > On Oct 10, 4:23 pm, peterflomconsult...@mindspring.com (Peter Flom)
> > > wrote:
> > > > Song <soun...@YAHOO.COM> wrote
> >
> > > > >Can anyone please explain to me how running a simple PCA in SAS and R
> > > > >can produce conflicting results? You will notice that signs of the
> > > > >component score are different only in the 4th component. What is
> > > > >troubling to me is that this is not consistent. On a different
> > > > >dataset, the signs become different starting with the 2nd component.
> >
> > > > The signs of the principal components are arbitrary. It's like
> > > > saying "John is taller than Mary" or "Mary is shorter than John".
> >
> > > > If the magnitudes were noticeably different, that would be troubling,
> > > > but changes in the signs are no big deal
> >
> > > > Peter
> >
> > > Hi Peter --
> >
> > > I understand what you are saying...however this does matter if I'm
> > > trying to construct biplots using say the first two PCs. This
> > > wouldn't be a problem if I knew a priori that the sign will differ
> > > only after the second component...but it changes for different
> > > dataset. So the biplots that I get from SAS and R may or may not be
> > > the same.
> >
> > > Song
> >
> > I doubt if you're going to be able to force SAS (or R, I suppose) to
> > pick a particular sign pattern for score coefficients.
> >
> > So why not set your own rule to standardize? Such as, if the first
> > coefficient for a given factor is negative, reverse the sign of all
> > coefficients for that factor. Then, no matter what you start with,
> > you'll end up with each factor beginning with a positive score
> > coefficient. The R and SAS coefficients should then be entirely
> > equivalent.
> >
> > Regards,
> > Mark
>
> Hi Mark --
>
> If you simply reverse the signs of the coefficients manually and
> obtain the Pearson correlation for each principal component score
> versus the individual variables, how can one maintain consistency in
> the interpretation of the same dataset analyzed by two different
> individuals? Although the correlation coefficients are the same in
> magnitude...saying two variables have a correlation of 0.8 is different
> than saying they have a correlation of -0.8.
>
OK, we're going way back into the reptilian part of my brain, trying to
recall a long time ago when I did PCA. So someone with more recent
experience please step in and correct me if I'm wrong.
Song:
Your stated objective was to get the same PCA results under R and SAS,
where the only initial difference you noticed was in the sign of every
scoring coefficient for selected factors.
For those factors, you're also getting different signs on the
correlation of the factor scores with the original items, right?
So yes, doing what I suggest would change the sign of those particular
correlations - it would change them so they would now become the same
under R vs SAS, meeting your original goal.
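
To make the rule concrete, here is a minimal sketch in plain Python
(not SAS or R code). The function name standardize_signs and the
coefficient values are made up for illustration; they are not output
from either package.

```python
def standardize_signs(components):
    """Flip each component so that its first coefficient is non-negative.

    `components` is a list of components, each a list of scoring
    coefficients. A new list is returned; the input is left unchanged.
    """
    standardized = []
    for coefs in components:
        # Rule from this thread: if the first coefficient is negative,
        # reverse the sign of every coefficient in that component.
        sign = -1.0 if coefs[0] < 0 else 1.0
        standardized.append([sign * c for c in coefs])
    return standardized


# Made-up coefficients for two components, as two packages might report
# them. The second component differs only by an overall sign.
from_sas = [[0.58, 0.61, 0.54], [0.70, -0.10, -0.71]]
from_r = [[0.58, 0.61, 0.54], [-0.70, 0.10, 0.71]]

# After applying the same rule to both, the coefficient sets agree.
print(standardize_signs(from_sas) == standardize_signs(from_r))
```

One caveat with this particular rule: if the first coefficient of a
component is zero or very close to it, its sign is not a reliable
anchor, and a different convention (say, keying on the coefficient
with the largest absolute value) may be safer.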
BTW, we are talking about standardized coefficients, right?
More specifically, on your example: if we reverse the signs of a
factor's coefficients, we are also reversing its definition. So after
sign reversal, a correlation of .8 between item 1 and the factor
representing, for example, "expansion" becomes a correlation of -.8
with the factor representing "contraction". In this limited sense the
.8 and -.8 correlations mean the same thing. Note this does NOT imply
any change in the correlations among the original items.
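
A small numeric check of that point, again as a plain Python sketch:
the item values and factor scores below are invented, and pearson is a
hand-written helper rather than anything from SAS or R. Negating the
factor scores negates the item-factor correlation and leaves the
item-item correlation untouched, since the items themselves are never
changed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented data: two original items and one set of factor scores.
item1 = [1.0, 2.0, 3.0, 4.0, 5.0]
item2 = [2.0, 1.0, 4.0, 3.0, 5.0]
scores = [-1.2, -0.3, 0.1, 0.4, 1.0]

# Reversing the factor's sign negates every score.
flipped = [-s for s in scores]

r_before = pearson(item1, scores)
r_after = pearson(item1, flipped)
r_items = pearson(item1, item2)

print(r_before, r_after)  # same magnitude, opposite sign
print(r_items)            # does not depend on the factor's sign
```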
Regards,
Mark
> Best regards,
>
> Song