Date: Tue, 28 Oct 2008 01:21:27 -0400
Reply-To: Nancy Brucken <brucken@PROVIDE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Nancy Brucken <brucken@PROVIDE.NET>
Subject: Re: Dataset from Dictionary?
Hi all,
If your company is a corporate sponsor of CDISC, you can download a copy
of the implementation guide domain specifications in Excel- it's available
under the Members Only section of the CDISC website.
Regards,
Nancy
On Mon, 27 Oct 2008 20:11:52 -0500, Lou <lpogoda@HOTMAIL.COM> wrote:
>"Ken Barz" <Ken.Barz@CPCMED.ORG> wrote in message
>news:92BD54B69F8AA446B8446927AAADE7B403A0645A@cpc-1.Colorado.local...
>> Well, yeah, I was just using airports and dictionary.columns as an
>> example of a similar dictionary. In the CDISC implementation guide,
>> they give the tables, variables, labels, , type, formats, ... in a data
>> dictionary format. So, I don't have an actual dataset that I can use
>> 'like' with.
>>
>> Since there's a whole slew of these things that need creating (and
>> across many different studies), I was hoping to pull the dictionary in
>> using Excel and have some way to use that metadata to create the actual
>> working dataset.
>
>The copy of the implementation guide I have came in PDF format - I've
tried
>various methods to get those "dictionaries" out of the PDF into something
>more usuable - a text file, spreadsheet, Word document, etc. I haven't
>found anything reliable.
>
>If you find something that can eport/import this stuff to a spreadsheet,
>please let me know. If you have a copy of the guide in some format more
>usuable than PDF, please let me know where you got it.
>
>That aside, once you have your data dictionary in a spreadsheet, using it
to
>create a dataset with those variables with those attributes isn't a big
>deal - I suppose someone could write a PROC, but I don't know of one
already
>out there. My place of business doesn't currently license access to PC
file
>formats, so we save the spreadsheet to a tab delimited text file, or
>alternatively as an XML file that can be read using the excel tagset.
>
>Anyway, we write the code to produce whatever variables are called for in
>the domain, but don't pay any attention to the attributes or the presence
of
>extraneous variables. This dataset will also contain variables needed for
>things like supplementary qualifier datasets and/or comment datatsets,
etc.
>Once all the necessary variables are generated with their values, we read
>the spreadsheet surrogate like so (if the surrogate is a tab delimited
text
>file):
>
>data map (keep = col1 col2 col4);
> retain keeper 0;
> infile map lrecl = 1000 pad dlm = '09'x dsd;
> length col1 - col10 $ 100;
> input col1 - col10;
> if lowcase(compress(col1)) = 'variablename' then do;
> keeper = 1;
> return;
> end;
> if keeper then do;
> col3 = lowcase(col3);
> col4 = compress(col4, '$');
> if col3 = 'num' then col4 = '8';
> if col3 = 'char' then col4 = '$ ' || col4;
> if col1 ^= ' ' then output;
> end;
>run;
>
>The "keeper" variable lets the program discard useless-to-the-program
header
>information that clients seem to insist be included in the map but get in
>the way of the program. If you license access to pc file formats, you
could
>use PROC IMPORT to read the spreadsheet directly, but you still would want
>to process the result to get rid of extraneous rows. To set the variable
>attributes, we write an include file (to a TEMP file) from the MAP dataset
>generated above and use it like this:
>
>data _null_;
> set map end = done;
> file include;
> if _n_ = 1 then put 'attrib ';
> put col1 'length = ' col4 'label = "' col2 +(-1) '"';
> if done then put ';';
>data your-dataset-name-here;
> %include include;
> set your-dataset-full-of-variables-here;
>run;
>
>To write the final dataset, we use SQL like so:
>
>proc sql noprint;
> select trim(col1) into :names separated by ', '
> from map;
> create table your-domain-name (label = 'your-domain-label') as
> select &names
> from your-dataset-name-here
> order by whatever-your-sort-variables-are-here;
>quit;
>
>If you need supplementary qualifier datasets, comment datasets, etc. they
>can all be generated from the your-dataset-full-of-variables dataset in
the
>same way.
>
>I admit this is kind of wordy - it would be possible to collapse this into
>fewer steps, the attrib statement could be a macro variable instead of an
>%include file, etc. - but I'm staffed with relatively young, inexperienced
>programmers, and they can follow the explict steps more easily. If you
must
>have a macro, wrapping the above in %macro and %mend statements with a few
>parameters would be the work of a few minutes.
|