LISTSERV at the University of Georgia
Date:   Sun, 6 Mar 2011 10:41:36 -0500
Reply-To:   Li Sun <sunli.judy@GMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Li Sun <sunli.judy@GMAIL.COM>
Subject:   Re: proc contents
Content-Type:   text/plain; charset=ISO-8859-1

On Sun, 6 Mar 2011 01:28:08 -0500, Li Sun <sunli.judy@GMAIL.COM> wrote:

>On Wed, 2 Mar 2011 11:35:58 -0600, Joe Matise <snoopy369@GMAIL.COM> wrote:
>
>>Art, the general answer to your question might be something like this:
>>/*create size-controlled macro variables*/
>>proc sql noprint;
>> select (case when sequence eq 1 then name end), (case when sequence eq 2 then name end)
>> into :mvar1,:mvar2 separated by " "
>> from contents
>> where sequence le 2;
>>quit;
>>%put &mvar2;
>>In order to fully automate that, one option would be to put the maximum
>>sequence out in the data step earlier (or determine it programmatically in
>>an earlier stage); then write a macro or two to iteratively %do the first
>>step (the case when) and the second step (the :mvarN).
>>
>>-Joe
>>On Wed, Mar 2, 2011 at 10:08 AM, Arthur Tabachneck <art297@rogers.com>wrote:
>>
>>> Susan,
>>>
>>> The goal is to create 5 (or however many are needed) macro variables, not a
>>> macro variable for each variable. In the present example, that would only
>>> produce five macro variables, but only containing the numbers 1 thru 5, not
>>> the desired variable names.
>>>
>>> Art
>>> -------
>>> On Wed, 2 Mar 2011 11:03:04 -0500, Suzanne McCoy
>>> <Suzanne.McCoy@CATALINAMARKETING.COM> wrote:
>>>
>>> >Select distinct sequence
>>> > Into :seq1-:seq999999
>>> >
>>> >-----Original Message-----
>>> >From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Arthur Tabachneck
>>> >Sent: Wednesday, March 02, 2011 10:53 AM
>>> >To: SAS-L@LISTSERV.UGA.EDU
>>> >Subject: Re: proc contents
>>> >
>>> >I totally agree that Joe's suggestion using a %include is definitely the
>>> >way Paul should go. However, I was playing with creating multiple macro
>>> >variables and realized that I didn't know how to do it efficiently using
>>> >proc sql.
>>> >
>>> >For example, the following works, but definitely isn't generalizable:
>>> >
>>> >/*create test file*/
>>> >data spssfile;
>>> > retain question1-question13000 1;
>>> >run;
>>> >
>>> >/*run proc contents*/
>>> >proc contents data=spssfile noprint out=contents; run;
>>> >
>>> >/*create sequence number based on variable name length*/
>>> >data contents (drop=length_after);
>>> > set contents (keep=name);
>>> > if _n_ eq 1 or length_after gt 32000 then do;
>>> > sequence+1;
>>> > length_after=0;
>>> > end;
>>> > length_after+length(name);
>>> >run;
>>> >
>>> >/*create size-controlled macro variables*/
>>> >proc sql noprint;
>>> > select name into :mvar1 separated by " "
>>> > from contents
>>> > where sequence eq 1
>>> >;
>>> > select name into :mvar2 separated by " "
>>> > from contents
>>> > where sequence eq 2
>>> >;
>>> > select name into :mvar3 separated by " "
>>> > from contents
>>> > where sequence eq 3
>>> >;
>>> > select name into :mvar4 separated by " "
>>> > from contents
>>> > where sequence eq 4
>>> >;
>>> > select name into :mvar5 separated by " "
>>> > from contents
>>> > where sequence eq 5
>>> >;
>>> >quit;
>>> >
>>> >Is there a way to write the above sql code that wouldn't require hard
>>> >coding the necessary splits?
>>> >
>>> >Art
>>> >-------
>>> >On Wed, 2 Mar 2011 08:30:48 -0600, Joe Matise <snoopy369@GMAIL.COM> wrote:
>>> >
>>> >>Paul, I still recommend not actually inserting a list of 12749
>>> >>variables into your code; it makes the code entirely unreadable. Do as
>>> >>Mikee suggests
>>> >>and put it to a file, and then %include it in your proc whatever:
>>> >>
>>> >>filename outnames "c:\temp\class_varnames.txt";
>>> >>
>>> >>data _null_;
>>> >>set sashelp.vcolumn(where=(memname = "CLASS"));
>>> >>
>>> >>file outnames;
>>> >>
>>> >>put name;
>>> >>
>>> >>run;
>>> >>
>>> >>proc freq data=sashelp.class;
>>> >>table
>>> >>%include outnames;
>>> >>;
>>> >>run;
>>> >>
>>> >>Note the double ; as the first ends %include.
>>> >>
>>> >>-Joe
>>> >>
>>> >>On Tue, Mar 1, 2011 at 5:33 PM, Swank, Paul R <Paul.R.Swank@uth.tmc.edu>wrote:
>>> >>
>>> >>> I have a multivariate data set with 12749 variables. Several hundred
>>> >>> of these variables represent observations done by 2 raters on two
>>> >>> occasions. So there are four observations on each variable. I need to
>>> >>> do a reliability with rater and occasion as facets of the measurement
>>> >>> model. I use varcomp of mixed to get the variance components to use in
>>> >>> computing intraclass correlations. The data has to be in univariate
>>> >>> format. The variable names and there order in the data step are such
>>> >>> that there is no way to specify a range of variables so each variable
>>> >>> has to be typed out. I am lazy, number 1, and a poor typist number 2.
>>> >>> I usually do a proc contents short to list the variable names to my
>>> >>> output file. Then I copy them into the program and I can just copy and
>>> >>> paste them into the steps I need to do. It saves a lot of typing and
>>> >>> also cuts down on errors of mistyping the variable name. However, it
>>> >>> seems that "proc contents short" has a limit on the number of
>>> >>> variables it will print. Proc contents itself does not seem to but it
>>> >>> lists all the variable names in a column along with labels and other
>>> >>> stuff. Not very convenient for cutting and pasting hundreds of
>>> >>> variable names. So I was hoping there was some way to get the list of
>>> >>> the variable names that goes across the page rather than just down.
>>> >>> But it appears that using the data definition method with proc sql has
>>> >>> the same limitation as proc contents short. SO I am left with typing
>>> >>> out all these variable names into a list so I can format the data the
>>> >>> way I want. I have never had to do this with such a large data set and
>>> >>> surprised to find that proc contents short does not work in this case.
>>> >>> That's it in a nutshell.
>>> >>>
>>> >>> Paul
>>> >>>
>>> >>> Dr. Paul R. Swank,
>>> >>> Professor and Director of Research
>>> >>> Children's Learning Institute
>>> >>> University of Texas Health Science Center-Houston
>>> >>>
>>> >>> *From:* Joe Matise [mailto:snoopy369@gmail.com]
>>> >>> *Sent:* Tuesday, March 01, 2011 5:04 PM
>>> >>> *To:* Swank, Paul R
>>> >>> *Cc:* SAS-L@listserv.uga.edu
>>> >>> *Subject:* Re: proc contents
>>> >>>
>>> >>> Paul, what are you actually trying to do? I do hope it's not "make a
>>> >>> keep list with 12000 variables written out in my code", if so you
>>> >>> should use a macro variable with SELECT INTO (well, or a dataset
>>> >>> written out into a text file, depending on whether the list is over
>>> >>> the length limitation, I think 64k charcters if I recall correctly).
>>> >>>
>>> >>> If it's not that, then what are you doing with it? Odds are you can
>>> >>> use some sort of programmatic code to do whatever it is either using
>>> >>> PROC CONTENTS output to a dataset or DICTIONARY.TABLES (which is
>>> >>> essentially the table equivalent of PROC CONTENTS).
>>> >>>
>>> >>> Also, SPSS should be able to give you the same list, in html or excel
>>> >>> format. I have had to do that before in order to deal with name
>>> >>> shortening ...
>>> >>>
>>> >>> -Joe
>>> >>>
>>> >>> On Tue, Mar 1, 2011 at 4:34 PM, Swank, Paul R
>>> >>> <Paul.R.Swank@uth.tmc.edu> wrote:
>>> >>>
>>> >>> Someone who shall remain nameless has sent me an SPSS data set of over
>>> >>> 12000 variables. After finally getting it converted to a .por file
>>> >>> and bringing it into SAS I want to get a short list of the variable
>>> >>> names so I can cut and paste them in my program. None of the filenames
>>> >>> make any sense and are not ordered to make it easy to specify ranges
>>> >>> of variables. I usually do this with "proc contents short;" However,
>>> >>> while "proc contents;" will list the entire set of variable names with
>>> >>> labels etc to the output window, "proc contents short;" will not. It
>>> >>> truncates the list of variables.
>>> >>> Does anyone have a clue how I can get around this problem. I'm trying
>>> >>> not to have to type hundreds of variable names in my program nor copy
>>> >>> and paste them one at a time.
>>> >>>
>>> >>> Dr. Paul R. Swank,
>>> >>> Professor and Director of Research
>>> >>> Children's Learning Institute
>>> >>> University of Texas Health Science Center-Houston
>>> >>>
>
>Thank you for the ideas provided, just for me to practise, here is the
>code. Kindly comment.
>
>proc transpose data = original out=column(keep=varname);
>run;
>
>data broadvar1 broadvar2 broadvar3 broadvar4 broadvar5;
> set column;
> length=length(varname);
> totlength+length;
> cumulength=totlength+_n_-1;
> select ;
> when (cumulength < 32767) output broadvar1;
> when (32767*2<cumulength < 32767*3) output broadvar2;
> when (32767*3<cumulength < 32767*4) output broadvar3;
> when (32767*4<cumulength < 32767*5) output broadvar4;
> otherwise output broadvar5;
> end;
>run;
>
>proc sql;
>select varname into : broadvar&n separated by ' ' from broadvar&n ;
>quit;
>
>%put _user_;
>
>%let n=1;
>%let n=2;
>%let n=3;
>%let n=4;
>%let n=5;

This was an interesting and challenging problem. I started with an array but got stuck there. Then I realized that simple sum statements would work, so I forgot about the array. According to the SAS documentation, SELECT/WHEN is more efficient when dealing with a large number of numeric variables. Art mentioned that 32767 is the length limit for a macro variable, so it would not be possible to put all 12000 variable names horizontally in one macro variable (unless each variable name is shorter than 2 characters). We were left with no choice but to split the variables. In real work I would change 32767*2 to 65534, as SAS prefers to "compare and select" rather than "calculate, compare and select" when processing large amounts of data. It will save memory and I/O. The idea was to turn the data from horizontal to vertical (proc transpose) and then sum the cumulative length of the names, including the blank needed between names for reading purposes. (Can we sum the lengths horizontally in an array and limit the total length of the variable names to below 32767? I was stuck there for a few hours.) We put as many variables as possible into each block of 32767 characters, in their original sequence. There are always thousands of ways of doing a job in SAS (that's why it is so charming). I came late to this board, but I am very happy to find so many friends with the same interest and, furthermore, to find that there is so much treasure to dig up.
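
For anyone who wants to avoid hard-coding the five splits, here is a minimal sketch of the same cumulative-length idea. It assigns a chunk number with sum-style logic instead of fixed WHEN ranges, then loops over the chunks in a macro. It is untested on the real file and rests on some assumptions: the data set is called ORIGINAL as in my code above, the 32767 limit is simply the figure discussed in this thread, and the names MAXLEN, CHUNKS, CHUNK, USED, NEED, NCHUNKS and MAKE_LISTS are made up for illustration.

```sas
/* Sketch only. ORIGINAL is the input data set from the post; every   */
/* other name below (MAXLEN, CHUNKS, CHUNK, USED, NEED, NCHUNKS,      */
/* MAKE_LISTS) is hypothetical.                                        */
%let maxlen = 32767;   /* assumed limit per macro variable, per this thread */

/* One row per variable name */
proc contents data=original noprint out=contents(keep=name varnum);
run;

/* Keep the names in their original data set order */
proc sort data=contents;
  by varnum;
run;

/* Give each name a chunk number so that the names in a chunk,        */
/* plus one blank between names, never exceed &maxlen characters      */
data chunks;
  set contents end=last;
  retain chunk 1 used 0;
  need = length(name) + (used > 0);   /* +1 for the separating blank */
  if used + need > &maxlen then do;   /* name does not fit: new chunk */
    chunk = chunk + 1;
    need  = length(name);
    used  = 0;
  end;
  used = used + need;
  if last then call symputx('nchunks', chunk);
run;

/* Create BROADVAR1 ... BROADVARn without hard-coding n */
%macro make_lists;
  %local i;
  proc sql noprint;
  %do i = 1 %to &nchunks;
    %global broadvar&i;
    select name into :broadvar&i separated by ' '
      from chunks
      where chunk = &i
      order by varnum;
  %end;
  quit;
%mend make_lists;

%make_lists
%put NOTE: created &nchunks macro variables.;
```

Because a new chunk starts as soon as the next name would not fit, every range of cumulative length is covered and the number of macro variables follows from the data rather than from the code.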

