LISTSERV at the University of Georgia
Date:         Mon, 18 Sep 2006 10:52:03 -0400
Reply-To:     wing wah <wing.tham03@PHD.WBS.AC.UK>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         wing wah <wing.tham03@PHD.WBS.AC.UK>
Subject:      hash

Dear folks,

I am trying to reconstruct the real-time order book in a financial market. If the quotes below are bid prices posted by traders, I would like to rank the posted quotes as bestbid, secondbid, thirdbid, fourthbid and so on. The quantity at each rank is the accumulated quantity of that rank and all ranks before it, i.e. accumulated quantity of bestbid = quantity of bestbid, accumulated quantity of secondbid = quantity of secondbid + quantity of bestbid, and so on. I hope the output of the code below gives a clearer picture of the logic I am trying to implement.
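To make the accumulation rule concrete, here is a minimal sketch on a single snapshot of the book. The three prices are taken from the data further down, but the quantities and the data set names (snapshot, ranked) are made up purely for illustration.

```sas
/* Hypothetical snapshot: three resting bids (quantities are illustrative) */
data snapshot;
  input price qty;
  cards;
0.6115 1
0.6120 1
0.6090 2
;
run;

/* Highest price first, so row 1 is the bestbid */
proc sort data=snapshot;
  by descending price;
run;

/* Sum statements retain their values across rows: */
/* rank counts the rows, cumqty is the running total of qty */
data ranked;
  set snapshot;
  rank + 1;
  cumqty + qty;
run;
```

With these numbers, bestbid (0.6120) carries an accumulated quantity of 1, secondbid (0.6115) carries 2, and thirdbid (0.6090) carries 4.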

Since I am trying to create a demand curve, any price lower than the bestbid price will have an accumulated quantity (think of the demand and supply curves). I therefore need a way to extract the prices and arrange them as bestbid, secondbid, and so on. In the output of the simple code below, the bestbid is the first price whose value is not '.' reading from right to left. The secondbid is the next change in accumulated quantity after the bestbid, again reading from right to left. At least this way I do not have to worry about how far the queue stretches.
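The right-to-left scan described above might be written roughly as follows. This is an untested sketch, not a finished solution: it assumes the roll3 data set and the &lo and &hi macro variables created by the code later in this message, and the names levels, rank, price, cumqty and prev are invented for the example.

```sas
data levels(keep=id rank price cumqty);
  set roll3;
  array v(&lo:&hi) v&lo-v&hi;
  rank = 0;
  prev = .;
  /* Walk from the highest price slot down to the lowest */
  do i = &hi to &lo by -1;
    /* A missing value is never > 0 in SAS, so empty slots are skipped; */
    /* a new rank starts whenever the accumulated quantity changes      */
    if v(i) > 0 and v(i) ne prev then do;
      rank = rank + 1;
      price = i / 100000;
      cumqty = v(i);
      prev = v(i);
      output;
    end;
  end;
run;
```

This writes one row per id and rank, which would avoid transposing every price slot, although it still scans the whole array for each observation.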

The data set is about 10 GB, with about 20 million observations. Firstly, I am wondering whether this can be done more efficiently using a hash object.
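For anyone unsure what a hash version would involve, the idea would be something like the untested sketch below. It assumes the data set given from the code further down; the names book, price, qty, rank and cumqty, and the cap of five levels, are arbitrary choices for illustration. It keeps the net quantity per price in a hash object ordered by descending price and walks it with a hash iterator after every quote.

```sas
data book(keep=id rank price cumqty);
  if _n_ = 1 then do;
    /* Keys are kept in descending order, so the first item is the bestbid */
    declare hash h(ordered:'d');
    rc = h.defineKey('price');
    rc = h.defineData('price', 'qty');
    rc = h.defineDone();
    declare hiter it('h');
  end;
  set given;
  id + 1;
  price = quote;
  /* Fetch the resting quantity at this price, or start from zero */
  if h.find() ne 0 then qty = 0;
  qty = sum(qty, quantity);
  /* Emptied levels stay in the table with qty 0 and are skipped below */
  rc = h.replace();
  rank = 0;
  cumqty = 0;
  rc = it.first();
  /* Report at most five levels per quote (arbitrary cap) */
  do while (rc = 0 and rank < 5);
    if qty > 0 then do;
      rank = rank + 1;
      cumqty = cumqty + qty;
      output;
    end;
    rc = it.next();
  end;
run;
```

Leaving emptied levels in the table rather than removing them is a deliberate simplification, since removing an item can conflict with an iterator that is still positioned on it. I do not know whether this would actually be faster on the full data.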

Secondly, if the code below is sensible, how can I extract the bestbid, secondbid, ... and their quantities from its output? I am using proc transpose for testing, but I am not sure it will cope when the data size increases. All suggestions and criticism are welcome.

Thank you in advance!

Wing

data given;
  input quote quantity @@;
  cards;
0.6126 3 0.6126 -3 0.611 1 0.611 -1 0.6115 1 0.6115 -1
0.6103 1 0.6103 -1 0.6075 5 0.6075 -5 0.6061 19 0.609 2
0.6118 2 0.6075 5 0.6061 -19 0.6084 19 0.6118 -2 0.6115 1
0.6121 2 0.6121 -2 0.612 1 0.6118 2 0.6121 1 0.6121 -1
0.612 -1 0.612 5 0.6054 2 0.612 1 0.612 -1 0.61205 1
0.6116 5 0.61105 1 0.61105 -1 0.61205 -1 0.6123 1 0.6123 -1
0.612 2 0.612 2 0.612 -2 0.6116 -5 0.6115 -1 0.612 1
;
run;

DATA maxmin;
  SET given END=lastobs;
  IF _N_ = 1 THEN DO;
    start = quote;
    finish = quote;
  END;
  RETAIN start finish;
  start = (MIN(start,quote));
  finish = (MAX(finish,quote));
  IF lastobs THEN DO;
    start=start*100000-1;
    finish=finish*100000+1;
    diffmaxminprice=finish-start+1;
    CALL SYMPUT('lo',LEFT(PUT(start,8.)));
    CALL SYMPUT('hi',LEFT(PUT(finish,8.)));
    CALL SYMPUT('diffmaxminprice',LEFT(PUT(diffmaxminprice,8.)));
  END;
RUN;

%put lo &lo.;
%put hi &hi;
%put diffmaxminprice &diffmaxminprice;

data roll3(drop = i newpoint);
  set given (obs=1000);
  array v(&LO:&HI) v&LO-v&HI;
  retain v&LO-v&HI id(0);
  if _n_=1 then id=0;
  id=id+1;
  bid=int(quote*100000);
  newpoint = missing(v(bid));

  /* add the quantity to this price slot and to every populated slot below it */
  do i = bid to &lo+1 by -1;
    if i=bid or not missing(v(i)) then v(i) = sum(v(i),quantity);
    if v(i)<=0 then v(i) = .;
  end;

  /* empty loop: leaves i at the first slot above bid with a non-missing, non-zero value */
  do i = bid+1 to &Hi-1 until(v(i) or i=&hi);
  end;
  if v(bid)=v(i) then v(bid)=.;

  /* a newly posted price also inherits the accumulated quantity above it */
  if newpoint then do;
    do i = bid+1 to &Hi-1 until(v(i) or i=&hi);
    end;
    v(bid) = sum(v(bid),v(i));
  end;
run;

proc transpose data=roll3 out=dvector;
  by id;
  var v:;
run;
quit;

data dvector;
  set dvector;
  if col1=. or col1=0 then delete;
run;

