Date: Thu, 16 Aug 2001 09:18:31 -0400
Reply-To: "Diskin, Dennis" <Dennis.Diskin@PHARMA.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Diskin, Dennis" <Dennis.Diskin@PHARMA.COM>
Subject: Re: How to program the algorithm
Content-Type: text/plain; charset="iso-8859-1"
Alex,
Here's a simple base SAS approach (tested under V6.12 on UNIX).
1. Create separate datasets with X and Y values.
2. Calculate DIFF for all combinations of X-Y and output to a new dataset.
3. Sort the new dataset (DIFF) by the variable DIFF.
4. The result has the lowest differences at the beginning and the highest
ones at the end.
You could also retain some ID variables for X and Y if there was a need to
be able to refer back to the source.
hth,
Dennis Diskin
data x(keep=var1 rename=(var1=x)) y(keep=var1 rename=(var1=y));
  input var1 trt;
  select (trt);
    when (1) output x;
    when (2) output y;
    otherwise;
  end;
cards;
6.9 1
7.2 1
7.2 1
7.3 1
7.8 1
6.7 2
6.8 2
6.9 2
7.1 2
7.4 2
run;
%let nx = 5; /* could also be obtained from dataset */
%let ny = 5; /* could also be obtained from dataset */
%let k = 3; /* from whatever algorithm you want */
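/* Sketch only, not part of the tested code: NX and NY could be read   */
/* from the datasets with NOBS=, and K derived from the formula in the */
/* original post, k = Walpha/2 - n(n+1)/2. The macro variable W is an  */
/* assumed name; 18 is the table value quoted for the sample data.     */
data _null_;
  if 0 then set x nobs=nx;   /* NOBS= is filled in at compile time */
  if 0 then set y nobs=ny;
  call symput('nx', trim(left(put(nx, 8.))));
  call symput('ny', trim(left(put(ny, 8.))));
  stop;
run;
%let w = 18;
%let k = %eval(&w - &nx*(&nx + 1)/2);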
data diff;
  set x;
  /* pair the current X value with every Y value */
  do point = 1 to &ny;
    set y point=point;
    diff = x - y;
    output;
  end;
run;
proc sort data=diff;
  by diff;
run;
proc print data=diff(obs=&k);
  title "&k smallest differences";
run;
proc print data=diff(firstobs=%eval(&nx*&ny - &k + 1));
  title "&k largest differences";
run;
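
If the ID variables mentioned above are wanted, the steps might look
something like the following. This is an untested sketch, not the code I
ran: XID and YID are made-up names, generated here as running counts
within each treatment, and the Y count is picked up with NOBS= instead of
a macro variable.

```sas
/* Untested sketch: XID and YID are hypothetical ID variables. */
data x(keep=xid x) y(keep=yid y);
  input var1 trt;
  if trt = 1 then do;
    xid + 1;          /* running count of TRT=1 records */
    x = var1;
    output x;
  end;
  else if trt = 2 then do;
    yid + 1;          /* running count of TRT=2 records */
    y = var1;
    output y;
  end;
cards;
6.9 1
7.2 1
7.2 1
7.3 1
7.8 1
6.7 2
6.8 2
6.9 2
7.1 2
7.4 2
;
run;

data diff;
  set x;
  do point = 1 to ny;             /* NY is set by NOBS= at compile time */
    set y point=point nobs=ny;
    diff = x - y;
    output;                       /* row carries XID and YID of its pair */
  end;
run;

proc sort data=diff;
  by diff;
run;

proc print data=diff(obs=3);      /* k = 3 in the example */
  var xid yid x y diff;
run;
```

With the IDs on each row, any difference in the sorted dataset can be
traced back to the X and Y records that produced it.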
> -----Original Message-----
> From: Alex Gray [SMTP:Alex.Gray@RTP.PPDI.COM]
> Sent: Thursday, August 16, 2001 7:27 AM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: How to program the algorithm
>
> Morning all,
>
> I need to engage my brain when I am posting to SAS-L especially when I
> am doing that at night. Although I explained the problem (see below) I
> didn't give you my data structure. I am putting this information below.
>
> The variables Xi and Yi are actually levels of one treatment variable
> (TRT), so the data look like:
>
> var1 TRT
> 6.9 1
> 7.2 1
> 7.2 1
> 7.3 1
> 7.8 1
> 6.7 2
> 6.8 2
> 6.9 2
> 7.1 2
> 7.4 2
> There are 4 levels to the TRT variable and I would want to be able to
> determine the kth smallest and kth largest differences between the levels
> I specify.
>
> I hope this makes my post much clearer.
>
> Thanks for anyone's help.
>
> Alex.
>
> Dear fellow SAS list members,
>
> I would be grateful for any help in programming the following.
>
> I need to calculate the k lowest and highest pairwise differences
> between two variables.
>
> From all of the possible pairs (Xi, Yj), I need to find the k largest
> differences Xi-Yj and the k smallest differences.
>
> where k = Walpha/2 - n(n+1)/2, n = number of obs in variable Xi, and
> Walpha/2 comes from statistical tables.
>
> To find the k largest and smallest differences it is convenient to order
> each variable first from smallest to largest. The kth largest
> difference is called the upper limit (U) and the kth smallest difference
> will be called the lower limit (L). That is, counting towards the middle
> of the ordered array of all mn possible differences, the kth differences
> from each end of the array are the points L and U.
>
> This is better explained by an example because I didn't get it from the
> above.
>
> Xi Yi
> 7.3 7.4
> 6.9 6.8
> 7.2 7.8
> 7.8 6.7
> 7.2 7.1
>
> I would use Proc Rank to order them and we get
> Xi Yi
> 6.9 6.7
> 7.2 6.8
> 7.2 6.9
> 7.3 7.1
> 7.8 7.4
>
> In this example n=5 (for Xi) and m=5 (for Yi), alpha=0.05, W0.025 (from
> tables) = 18 and thus k=3.
>
> The three smallest and largest differences are found as:
>
> Smallest
>
> 6.9-7.4 = -0.5
> 6.9-7.1 = -0.2
> 7.2-7.4 = -0.2 = L
>
> Largest
>
> 7.8-6.7 = 1.1
> 7.8-6.8 = 1.0
> 7.8-6.9 = 0.9 = U
>
> So L is the third smallest difference and U is the third largest
> difference.
>
> Problem is, aside from the Proc Rank I haven't the faintest idea how one
> would start to program this (I guess one would want a macro, which
> first sorts the data using Proc Rank, then the macro is set up such that
> the user would provide the value of Walpha/2 in order to obtain k).
>
> Any help anyone may be able to offer I would sincerely appreciate.
>
> If you think this is really easy data step programming please break it
> to me gently.
>
> Kind regards.
>
> Alex.
>
>
>
> --
> Alex M. Gray, Ph.D
> Department of Biostatistics,
> PPD Development,
> Research Triangle Park
> 3900 North Paramount Parkway,
> Morrisville, NC 27560.
> Tel No: (919) 462 4978.
> Fax No: (919) 654 8632.