Date: Tue, 29 Nov 2005 08:57:16 -0500
Reply-To: "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>
Subject: Re: Filter top 30 out of total, based on variable
mikekiwi wrote:
> Hi all,
>
> this must be a realitvely easy question for all of you, but for me as
> a not-experienced programmer, it's not that easy to solve.
>
> I've got a dataset with a music-hitlistsales of a shop; this dataset
> has - amongst others of course - the variables 'salesamount' and
> 'mediatype'. Salesamount of course contains the sales of that article
> in that shop, mediatype is filled with for instance CD, DVD, SACD,
> etc.
>
> What I want to do is only use the top 30 per mediatype and delete the
> remaining records; sorting is no problem, so the articles are reanked
> correctly, but I can't find a way to add the hitlistpositions after
> which I can say: "if hitpos < 31 then delete".
>
> Can you please help me out?
>
> Thanks,
> Michael
Here is a way that processes the data set in group wise chunks. Handy if
you have auxiliary variables maintaining some 'state' of the group {not
shown} that need to be reset at the start of each group.
I'm no sales specialist, but a study prior to an advertisement campaign
might also involve ranks based on number of units sold or average sale
amount.
------------------------
%let seed = 123105;
data sales;
do mediaTypeId = 1 to 10;
do storeId = 1 to 15;
do articleId = 1 to 50;
do _n_ = 1 to 25*ranuni(&seed);
transactid+1;
amount = floor(30 * sin(transactid/30*6) * ranuni (&seed));
if amount < 0 then amount = -2*amount;
output;
end;
end;
end;
end;
run;
proc sql;
create view salesSummary as
select mediaTypeId, articleId
, sum(amount) as totalSales
, count(amount) as nSales
, mean(amount) as avgSale format 8.2
, count(distinct storeId) as nStores
from sales
group by mediaTypeId,articleId
order by mediaTypeId,totalSales descending
;
quit;
data top30s;
do _n_ = 1 by 1 until (last.mediaTypeId);
set salesSummary;
by mediaTypeId;
if _n_ > 30 then continue;
salesRanking = _n_;
OUTPUT;
end;
run;
------------------------
Richard A. DeVenezia
http://www.devenezia.com/
|