Date: Mon, 31 Oct 2011 18:25:30 -0700
Reply-To: oslo <oslo@yahoo.com>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: oslo <hokut1@YAHOO.COM>
Subject: Re: commen frequency
In-Reply-To: <201110312324.p9VIZHZn027626@waikiki.cc.uga.edu>
Content-Type: text/plain; charset=iso-8859-1
Dear Tom;
Your helps are sincerely appreciated,
Thanks.
Oslo
________________________________
From: Tom Abernathy <tom.abernathy@GMAIL.COM>
To: SAS-L@LISTSERV.UGA.EDU
Sent: Monday, October 31, 2011 6:24 PM
Subject: Re: commen frequency
On Mon, 31 Oct 2011 14:27:56 -0700, oslo <hokut1@YAHOO.COM> wrote:
>Tom;
Thanks a lot. I think I could not explain the problem correctly. The
variables (rsxxxxxxxx) are unique in each column. That is there is only one
rs29024430 in column A. Likewise there is only one rs29009912 in the same
column (A). This is the case for other columns too. So overall I need the
get the similarity between columns. That is how many rsxxxxxxxx are common
in both column A and B. In other words how many of rsxxxxxxxx which exist in
column A are also in column B and so on.
Regards,
Oslo
So merge them by gene and set a 0/1 variable for each of the four sources.
Then create the combinations using AND. Something like this.
data current;
length a b c d $10.;
input a--d;
cards;
rs29024430 rs43708440 rs29024430 rs29024430
rs29009912 rs29009907 rs43708440 rs43708440
rs29009979 rs29009912 rs29009907 rs29010802
rs29010147 rs29012086 rs29009912 rs17871338
rs29010295 rs29012174 rs29009979 rs29009907
rs29010510 rs29013844 rs29010295 rs29009912
rs29011155 rs29016146 rs29010461 rs29009979
rs29012070 rs29017980 rs29011155 rs29010295
rs29012174 rs29018185 rs29012070 rs29010461
;;;;
proc sql noprint ;
create table a as select 1 as a,a as gene from current order by gene;
create table b as select 1 as b,b as gene from current order by gene;
create table c as select 1 as c,c as gene from current order by gene;
create table d as select 1 as d,d as gene from current order by gene;
quit;
data want;
merge a b c d;
by gene;
a = sum(0,a);
b = sum(0,b);
c = sum(0,c);
d = sum(0,d);
ab = a and b;
ac = a and c;
abc = a and b and c;
run;
proc means;
run;
|