|
On May 11, 6:27 am, randista...@HOTMAIL.COM (Randy) wrote:
> I want to flag multiple ID's (where ID_A and ID_B are one unit) that appear
> in different "Code" on the same day:
>
> My data set is as follows
>
> Date Code ID_A ID_B
> May1 1 AB1 XZ
> May1 1 AB1 MN
> May1 2 AB1 XZ
> May1 2 CD1 MN
> May2 1 AB1 XZ
> May2 1 CD1 MN
> May2 2 CD1 MN
> May2 2 AB1 MN
>
> My output should be as follows:
>
> Date Code ID_A ID_B Flag1
> May1 1 AB1 XZ 1
> May1 1 AB1 MN 0
> May1 2 AB1 XZ 1
> May1 2 CD1 MN 0
> May2 1 AB1 XZ 0
> May2 1 CD1 MN 1
> May2 2 CD1 MN 1
> May2 2 AB1 MN 0
>
> The code that I wrote is as follows:
> data need; set have;
> by code ID_A ID_B date;
> flag1 = 0;
> if (first.ID_B and last.ID_B) then flag1 = 1;
> run;
>
> It does not work. What mistake am I making?
>
The mistake is not sorting properly before doing the flagging. The
sort order should be
Date id_a id_b;
You don't need to have code in the sort order because you are only
concerned with the multiplicity of codes within the group.
Additionally, (first.ID_B and last.ID_B) is an insufficient test
(first.ID_B and last.ID_B) will test true in two different situations
- single observation in group (flags are both 1)
- more than 2 observations in group, and at an 'internal' observation
(flags are both 0)
So, the the test you want to use is:
if first.by-varN = 1 and last.by-varN = 1 then flag1=0; else
flag1=1;
An alternative is:
if first.by-varN=last.by-varN and first.by-varN=1 then flag1=0; else
flag1=1;
An single assignment statement can be:
flag1 = NOT (first.by-varN = 1 and last.by-varN = 1);
Since the original data has no apparent sort order, you will need an
additional step before sorting to add a variable to the data to record
the original row order and use that to sort the flagged data back into
original order.
--
Richard A. DeVenezia
http://www.devenezia.com
|