Date: Fri, 21 Sep 2007 18:26:38 -0000
Reply-To: junkmail115@GMAIL.COM
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: junkmail115@GMAIL.COM
Organization: http://groups.google.com
Subject: Re: PROCSQL UPDATE with CASE based on results from sub query?
In-Reply-To: <1190324374.261648.237480@d55g2000hsg.googlegroups.com>
Content-Type: text/plain; charset="us-ascii"
On Sep 20, 4:39 pm, junkmail...@gmail.com wrote:
> I'm trying to update multiple variables in existing dataset with
> different results (CASE?) based on independent results obtained from a
> sub query.
>
> I have a meta table and a master table. The meta table describes the
> variables in the master table. This is a relatively watered down
> example so please overlook the impracticality.
>
> create table metatable as
> select libname, memname, name,
> "" as CASEUPDATE length=30 format=char30.
> from dictionary.columns
> where libname = "LIB"
>
> Next I want to update the CASEUPDATE field based on the results of a
> query against the master table which it describes.
>
> Something like this:
>
> update metatable
> set CASEUPDATE =
> ( case
> when
> ( select max(length(variable1))>10
> from mastertable ) then "LONG"
> when ( select max(length(variable1)) between 3 and 9
> from mastertable ) then "SHORT"
> END,
> case
> when
> ( select count(unique(label))
> from mastertable) = 0 then label="MISSING"
> when
> ( select count(unique(label))
> from mastertable) >0 then label ="NOT MISSIN"
> END
> WHERE variablename=var1;
>
> Is something like this even possible? Could someone point me in the
> right direction please?
>
> Many Thanks,
>
> Aaron
>
> (
> CASE
> WHEN
> ((balance - 10.00) < 0)
> THEN
> 0
> ELSE
> (balance - 10.00)
> END
> )
> WHERE
> id = 1
Thank you for your responses. I'll elaborate on my actual goal. As I'm
obviously novice SAS user there might be an approach that achieves my
final goal more conventionally so suggestions are always welcome.
I'm trying to create a hybird metadata table from the
dictionary.columns table in proc sql. I frequently process data
(UPDATETABLE) with inconsistent field names, data types and formats.
By applying on testable functions (maximum char length, regular
expression matches, character count, presence/absence of numbers) to
each column in the UPDATETABLE I want to assign 1 of 50 known fields.
So for every potential field in the UPDATETABLE I have a set of
mutually exclusive conditions that match 1 of about 50 known fields.
If I can identify all of the fields in the UPDATETABLE I can then
apply the appropriate transformations to cleanse the data
automatically.
The procsql update query I asked about was an attempt to run functions
on all fields in the UPDATETABLE. Then based on the function results
update the hybrid meta data table. I can then match the meta hybrid
table against my transformation lookup table to call the appropriate
functions to cleanse the data.
The actual code I've written runs a number of test functions on all
fields from the UPDATETABLE with a %do loop and updates the
dictionary.columns table with the appropriate "known field" name.
Am I on the right path or am I reinventing the wheel?
Thanks,
Aaron