Date: Wed, 3 Feb 2010 05:16:10 -0500
Reply-To: Jim Groeneveld <jim.1stat@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jim Groeneveld <jim.1stat@YAHOO.COM>
Subject: Re: Mapping data
Hi Wing Wah,
There may be shorter solutions, but I think the code below does what you
want. It includes automatic iterations until all names have been determined.
It also supports names that can not be determined: unresolvable names.
Can there be order numbers with multiple, ambiguous names? The program code
quite well processes those, but the results may not be as intended.
Remove the added data lines to check your original testdata.
Regards - Jim.
--
Jim Groeneveld, Netherlands
Statistician, SAS consultant
http://jim.groeneveld.eu.tf
~~~~~~~~~ START OF PROGRAM CODE ~~~~~~~~~
data orders;
input name $ order_no new_order_no;
datalines;
X 1 .
Y 2 .
. 1 3
Z 4 .
. 3 5
. 2 .
. 14 . /* added unresolvable name */
Z 3 . /* added multiple, ambiguous name */
. 5 .
Z 6 7
. 7 .
Y 8 9
X 10 11
Z 12 13
. 11 .
. 9 .
. 13 .
;
run;
DATA Orders;
SET Orders;
* Retain original record order;
SeqNum + 1;
RUN;
PROC SORT DATA=Orders;
BY Order_No;
RUN;
%MACRO Iterate;
%LOCAL OnceMore Previous Round;
%LET OnceMore = 1;
%LET Round = 0;
%DO %WHILE (&OnceMore);
%LET Previous = &Oncemore;
%LET OnceMore = 0;
%LET Round = %EVAL ( &Round + 1 );
DATA Orders;
MERGE Orders (DROP=Name)
Orders (KEEP=Name Order_No WHERE=(NOT MISSING(Name)));
BY Order_No;
RUN;
TITLE "After first merge, iteration &Round"; PROC PRINT DATA=Orders; RUN;
PROC SORT DATA=Orders (KEEP=Name New_Order_No
WHERE=(NOT MISSING(New_Order_No) AND NOT
MISSING(Name)))
Out=Mapping (RENAME=(New_Order_No=Order_No));
BY New_Order_No;
RUN;
DATA Orders (DROP=OnceMore);
MERGE Orders END=LastRec /*(DROP=Name)*/ Mapping;
BY Order_No;
IF (MISSING(Name)) THEN OnceMore + 1;
IF (LastRec) THEN CALL SYMPUT ('OnceMore', PUT (OnceMore, BEST.));
RUN;
TITLE "After second merge, iteration &Round"; PROC PRINT DATA=Orders; RUN;
%IF (&Round GE 2) %THEN %DO;
%IF (&OnceMore EQ &Previous AND &OnceMore GT 0) %THEN %DO;
%PUT PROBLEM: Unresolvable name(s) left, see results;
%LET OnceMore = 0; * Leave iteration loop;
%END;
%END;
%END;
%MEND Iterate;
%Iterate;
PROC SORT DATA=Orders;
BY SeqNum;
RUN;
TITLE "Final result"; PROC PRINT DATA=Orders; RUN;
~~~~~~~~~ END OF PROGRAM CODE ~~~~~~~~~
On Wed, 3 Feb 2010 03:29:01 -0500, Wing Wah Tham <wingwahtham@YAHOO.CO.UK>
wrote:
>Dear all,
>
>I am dealing with a multi-layered missing data problem. I have data on three
>variables, order_no, new_order_no and name. Each order number is associated
>with a name. Every time the order number is changed there is no name, so the
>linking variable is the order number. I would like to associate the new-
>order_no to a name. Unfortunately, there can be multiple changes of the
>order number and I do not know the maximum number of changes. Below is an
>example of what the data looks like and what the output should look like.
>Any help would be appreciated.
>
>Wing Wah
>
>
>What I have is the following:
>data orders;
>input name $ order_no new_order_no;
>datalines;
>X 1 .
>Y 2 .
>. 1 3
>Z 4 .
>. 3 5
>. 2 .
>. 5 .
>Z 6 7
>. 7 .
>Y 8 9
>X 10 11
>Z 12 13
>. 11 .
>. 9 .
>. 13 .
>;
>run;
>
>Output should be the following:
>
>X 1 .
>Y 2 .
>X 1 3
>Z 4 .
>X 3 5
>Y 2 .
>X 5 .
>Z 6 7
>Z 7 .
>Y 8 9
>X 10 11
>Z 12 13
>X 11 .
>Y 9 .
>Z 13 .