LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (September 2008, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Wed, 10 Sep 2008 13:08:03 -0500
Reply-To:     Mary <mlhoward@avalon.net>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mary <mlhoward@AVALON.NET>
Subject:      Re: Question on Transpose with two columns needed
Comments: To: Steve Denham <stevedrd@yahoo.com>
Content-Type: text/plain; charset="iso-8859-1"

No, as I stated I have 3000 SNP's (a numeric and a character variable that match), for 6000 variables total on each record, so this approach requires too much coding, it seems. Thanks for trying.

-Mary ----- Original Message ----- From: Steve Denham To: Mary ; SAS-L@LISTSERV.UGA.EDU Sent: Wednesday, September 10, 2008 12:22 PM Subject: Re: Question on Transpose with two columns needed

Here's what I came up with. It's kind of brute force, but it did give what you asked.

data one; input ID $ rs001 rs001_allele $ rs002 rs002_allele $; datalines; 001 0 AA 1 CT 002 1 AB 0 CC 003 2 BB 2 TT 004 1 AB 1 CT ;

data twoa;set one; rs=rs001; rs_allele=rs001_allele; if index(rs_allele,'A')>0 or index(rs_allele,'B')>0 then snp='rs001'; if index(rs_allele,'C')>0 or index(rs_allele,'T')>0 then snp='rs002'; output; rs=rs002; rs_allele=rs002_allele; if index(rs_allele,'A')>0 or index(rs_allele,'B')>0 then snp='rs001'; if index(rs_allele,'C')>0 or index(rs_allele,'T')>0 then snp='rs002'; output; keep ID snp rs rs_allele; run;

You would need code for each of the haplotypes to put into the correct snp, but if you only have the two snps per record, this should move through it pretty quickly. Steve Denham Associate Director, Biostatistics MPI Research, Inc. Remove spamblock from header, and replace with stevedrd to reply to me.

----- Original Message ---- From: Mary <mlhoward@AVALON.NET> To: SAS-L@LISTSERV.UGA.EDU Sent: Wednesday, September 10, 2008 12:47:41 PM Subject: Question on Transpose with two columns needed

Hi,

I have data like this:

ID rs001 rs001_allele rs002 rs002_allele 001 0 AA 1 CT 002 1 AB 0 CC 003 2 BB 2 TT 004 1 AB 1 CT

Where I have 6000 total variables, 3000 for the coded values (0, 1, 2), and 3000 for the letter equivalents to those coded values (AA, AB, BB). The variable names for the coded values and their letter equivalants are identical, EXEMPT that the letter equivalent ends with _allele.

I need to store this in Oracle, which has a limit of 1000 variables, so I would have to break up the file in any case. Thus, it seems that transposing the data from wide to long would be a reasonable approach prior to storage in Oracle; particularly now that Howard S. has shown this morning that transpose can be very effective in solving some problems :-)

However, the numeric and letter variables go together, and so I would like these on the same row on the transposed long file. I would like to have the long file look like this:

ID SNP Numeric_Value Char_value 001 rs001 0 AA 001 rs002 1 CT 002 rs001 1 AB 002 rs002 0 CC 003 rs001 2 BB 003 rs002 2 TT 004 rs001 1 AB 005 rs002 1 CT

Could someone propose a possible solution to this tranpose problem?

-Mary


Back to: Top of message | Previous page | Main SAS-L page