LISTSERV at the University of Georgia
Date:   Mon, 27 Aug 2001 17:00:40 -0400
Reply-To:   Mike Rhoads <RHOADSM1@WESTAT.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Mike Rhoads <RHOADSM1@WESTAT.COM>
Subject:   Re: tcpip parsing
Comments:   To: "Brad.Goldman@AUTOTRADER.COM" <Brad.Goldman@AUTOTRADER.COM>
Content-Type:   text/plain; charset="iso-8859-1"

Brad,

I had a couple of thoughts on your problem.

One alternative to converting each TCP/IP address from base 256 would be to zero-fill each of the 4 parts to the maximum of 3 digits. You could make the final variable either numeric (e.g. 12,010,117,208) or character (012.010.117.208), as you prefer. Either form lets the addresses be compared directly, and it might be easier to develop and debug, since each value maps back to the original representation at a glance. Your base-256 idea should certainly work, however.
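A minimal sketch of the zero-fill idea, written in Python rather than SAS purely for illustration. The function names `zero_fill` and `zero_fill_numeric` are hypothetical, and the input is assumed to be a well-formed dotted-quad string.

```python
def zero_fill(address: str) -> str:
    """Pad each of the 4 parts of a dotted address to 3 digits."""
    parts = address.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 parts, got {len(parts)}: {address!r}")
    octets = [int(p) for p in parts]
    if any(o < 0 or o > 255 for o in octets):
        raise ValueError(f"part out of range 0-255: {address!r}")
    # 12.10.117.208 becomes 012.010.117.208
    return ".".join(f"{o:03d}" for o in octets)


def zero_fill_numeric(address: str) -> int:
    """Numeric variant: drop the dots from the zero-filled form."""
    return int(zero_fill(address).replace(".", ""))


if __name__ == "__main__":
    print(zero_fill("12.10.117.208"))
    print(zero_fill_numeric("12.10.117.208"))
```

The padding matters because unpadded strings compare character by character: "12.10.117.32" sorts after "12.10.117.208" as raw text, but the zero-filled forms sort in address order.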

Also, I don't see why you'd need to compare "every number to every other number", unless your input data set is so huge that you can't easily sort it. I would start by sorting by LoTcpIp, and within that by descending HiTcpIp. After that I think it's just a matter of a DATA step based on a "look-ahead" read to the next record, where your decision algorithm is something like:

Hold on to the current record if and only if LoTcpIp from the next record is greater than the HiTcpIp from the current record. (The last record has no successor, so it is always kept.)

I am assuming you have no "partial overlaps" (e.g. 1 - 100 followed by 50 - 200) -- seems a little unlikely given your data, and I don't know what you'd want to do with them anyway.

Note that this is a late-afternoon, completely untested algorithm ...
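The sort-then-look-ahead rule described above might be sketched as follows. This is Python standing in for the SAS sort and DATA step, not a tested production routine. The names `to_int` and `most_specific` are hypothetical, the ranges are assumed to be (host, low, high) tuples of integers, and, as stated above, partial overlaps are assumed not to occur.

```python
from typing import List, Tuple

Range = Tuple[str, int, int]  # (host, low address, high address)


def to_int(address: str) -> int:
    """Helper for the demo: dotted address to a comparable integer."""
    value = 0
    for part in address.split("."):
        value = value * 256 + int(part)
    return value


def most_specific(ranges: List[Range]) -> List[Range]:
    """Keep only ranges that do not contain the range sorted after them."""
    # Sort by low address, and within that by descending high address.
    ordered = sorted(ranges, key=lambda r: (r[1], -r[2]))
    kept = []
    for i, current in enumerate(ordered):
        is_last = i == len(ordered) - 1
        # Look ahead: keep the current record only if the next record
        # starts after the current one ends. The last record is kept.
        if is_last or ordered[i + 1][1] > current[2]:
            kept.append(current)
    return kept


if __name__ == "__main__":
    data = [
        ("AT&T", to_int("12.0.0.0"), to_int("12.255.255.255")),
        ("Meddve, Inc.", to_int("12.10.117.208"), to_int("12.10.117.223")),
        ("Husky Corp", to_int("12.10.117.32"), to_int("12.10.117.47")),
    ]
    for host, lo, hi in most_specific(data):
        print(host, lo, hi)
```

With the three sample rows from the original question, the AT&T record is dropped because the next record (Husky Corp) starts inside its range, while the two narrower ranges are kept.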

Mike Rhoads
Westat
RhoadsM1@Westat.com

-----Original Message-----
From: Brad Goldman [mailto:Brad.Goldman@AUTOTRADER.COM]
Sent: Monday, August 27, 2001 2:40 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: tcpip parsing

I have a dataset, which I have created using the whois function from unix. This dataset has variables for the hostname and tcpip range, for example:

host            range
------          ------
AT&T            12.0.0.0 - 12.255.255.255
Meddve, Inc.    12.10.117.208 - 12.10.117.223
Husky Corp      12.10.117.32 - 12.10.117.47
...

What I would like to do is to choose only the most specific range(s). In this case, the latter two lines are "included" in the AT&T range, so I want to discard the AT&T entry. If there were another host with range 12.10.177.0 - 12.10.177.255 I would discard that also. Any bright ideas how to proceed? All I can see is to convert the beginning and ending tcpips into two big numbers (as if each tcpip were a 4-digit, base-256 number). Then these numbers can be compared to each other. (I see no way to avoid comparing every number to every other one; any tricks there would help also.)
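For reference, the base-256 conversion described above can be sketched like this, in Python for illustration only. The names `ip_to_int` and `parse_range` are hypothetical, and the range text is assumed to use " - " as the separator, as in the sample rows.

```python
from typing import Tuple


def ip_to_int(address: str) -> int:
    """Treat a dotted address as a 4-digit, base-256 number."""
    parts = address.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 parts, got {len(parts)}: {address!r}")
    value = 0
    for part in parts:
        digit = int(part)
        if digit < 0 or digit > 255:
            raise ValueError(f"part out of range 0-255: {address!r}")
        # Shift the running total one base-256 place, then add the digit.
        value = value * 256 + digit
    return value


def parse_range(text: str) -> Tuple[int, int]:
    """Split 'low - high' into a pair of comparable integers."""
    low, high = text.split(" - ")
    return ip_to_int(low), ip_to_int(high)


if __name__ == "__main__":
    print(parse_range("12.0.0.0 - 12.255.255.255"))
    print(parse_range("12.10.117.208 - 12.10.117.223"))
```

Once both ends of each range are plain integers, containment is a pair of ordinary comparisons: one range includes another when its low end is no greater and its high end is no smaller.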

My eventual goal is to create a format where a given tcpip can be mapped to a host name.
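As a rough analogue of such a format, the sketch below maps a single address to a host name with a binary search over the range start points. It is Python, not a SAS format definition. The names `ip_to_int` and `build_lookup` are hypothetical, and the ranges are assumed to be non-overlapping already (that is, only the most specific ones remain).

```python
from bisect import bisect_right
from typing import Callable, List, Optional, Tuple

Range = Tuple[str, int, int]  # (host, low address, high address)


def ip_to_int(address: str) -> int:
    """Dotted address to a comparable integer (base-256 digits)."""
    value = 0
    for part in address.split("."):
        value = value * 256 + int(part)
    return value


def build_lookup(ranges: List[Range]) -> Callable[[int], Optional[str]]:
    """Return a function mapping an integer address to a host, or None."""
    ordered = sorted(ranges, key=lambda r: r[1])
    starts = [r[1] for r in ordered]

    def lookup(value: int) -> Optional[str]:
        # Find the last range whose low end is <= value.
        i = bisect_right(starts, value) - 1
        if i >= 0 and ordered[i][1] <= value <= ordered[i][2]:
            return ordered[i][0]
        return None  # address falls in no known range

    return lookup


if __name__ == "__main__":
    lookup = build_lookup([
        ("Meddve, Inc.", ip_to_int("12.10.117.208"), ip_to_int("12.10.117.223")),
        ("Husky Corp", ip_to_int("12.10.117.32"), ip_to_int("12.10.117.47")),
    ])
    print(lookup(ip_to_int("12.10.117.210")))
```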

Much thanks in advance,
Brad Goldman

