```Date: Mon, 5 Jan 2009 10:21:26 -0800 Reply-To: ajayohri@yahoo.com Sender: "SAS(r) Discussion" From: ajay ohri Subject: Re: geocoding Comments: To: msz03@albany.edu In-Reply-To: <46306.150.142.232.7.1231178529.squirrel@webmail.albany.edu> Content-Type: text/plain; charset=iso-8859-1 I think Mike's formula are the better approach. I wasnt aware of these formulas, and Haversian formula approximate the earth as a sphere which it isnt. great solution. best regards, ajay --- On Mon, 1/5/09, Mike Zdeb wrote: > From: Mike Zdeb > Subject: Re: geocoding > To: SAS-L@LISTSERV.UGA.EDU > Date: Monday, January 5, 2009, 11:32 PM > hi ... for distance calculations, I don't think that > this thread has included any mention of the new V9.2 > functions: ZIPCITYDISTANCE, GEODIST > > they both use the Vincenty formula, not the Haversine > formula > > http://en.wikipedia.org/wiki/Thaddeus_Vincenty > > if you have V9.2, you can try the following (how far is it > from my zip of residence 12203 to some Boston-area zips) > > the format in PROC PRINT shows that ZIPCITYDISTANCE returns > a distance rounded to tenths while GEODIST distances imply > more precision than seems > possible (and you can see the difference between Haversine > and Vincenty distance) > > > data howfar; > * centroid of zip 12203; > retain y2 42.691560 x2 -73.827840; > set sashelp.zipcode; > where put(zip,z5.) eq : '0213'; > dist1 = zipcitydistance(12203, zip); > dist2 = geodist(y2,x2,y,x,'M'); > * convert to radians for haversine distance; > x1 = -x * constant('pi') / 180; > y1 = y * constant('pi') / 180; > x3 = -x2 * constant('pi') / 180; > y3 = y2 * constant('pi') / 180; > dist3 = 3961 * arcos(sin(y1) * sin(y3) + cos(y1) * cos(y3) > * cos(x1 - x3)); > label > dist1 = 'ZIPCITYDISTANCE' > dist2 = 'GEODIST' > dist3 = 'HAVERSINE' > ; > run; > > proc print data=howfar label; > var city zip dist1 dist2 dist3; > format dist: 15.10; > run; > > Name of 5-digit > Obs city/org ZIP Code ZIPCITYDISTANCE > GEODIST HAVERSINE > 1 Jamaica Plain 02130 140.5000000000 > 140.5023399595 140.1996366709 > 2 Roslindale 02131 140.8000000000 > 140.7914716174 140.4891045935 > 3 West Roxbury 02132 140.0000000000 > 139.9556216459 139.6563191378 > 4 Boston 02133 143.0000000000 > 142.9769683379 142.6626516503 > 5 Allston 02134 139.6000000000 > 139.6451823553 139.3385079256 > 6 Brighton 02135 140.3000000000 > 140.3483702300 140.0406529741 > 7 Hyde Park 02136 141.2000000000 > 141.2105493185 140.9106405081 > 8 Readville 02137 141.2000000000 > 141.2306946001 140.9327798400 > 9 Cambridge 02138 139.4000000000 > 139.3758380356 139.0681569701 > 10 Cambridge 02139 140.9000000000 > 140.9119802051 140.6015911943 > > > > > as for the GEOCODING part, if you do not need an exact > location and are willing to accept a zip centroid, you can > match zips to the SASHELP.ZIPCODE > data set > > it is INDEXED and updated quarterly ... > http://support.sas.com/rnd/datavisualization/mapsonline/html/misc.html > > data centroids; > input zip @@; > set sashelp.zipcode (keep=zip city x y) key=zip; > datalines; > 12203 90210 02138 > ; > run; > > proc print data=centroids label; > run; > > Latitude Longitude > (degrees) of (degrees) of > The the center the center > 5-digit (centroid) (centroid) Name of > Obs ZIP Code of ZIP Code. of ZIP Code. city/org > 1 12203 42.691560 -73.827840 Albany > 2 90210 34.096629 -118.412426 Beverly > Hills > 3 02138 42.377465 -71.131249 > Cambridge > > > > if you have zip+4 data, you can download a much larger data > set (16 million+ observations)... > > http://support.sas.com/rnd/datavisualization/mapsonline/html/geocode.html > > (that link also has another link to "Cheap Geocoding: > SAS/GISŪ and Free TIGERŪ Data" by Ed Odom and Darrell > Massengill) > > it is also INDEXED > > data locations; > input zip plus4; > set sashelp.zip4 (keep=zip plus4 x y) key=zipplus4; > datalines; > 12203 3005 > 12203 2007 > 12203 2009 > ; > run; > > proc print data=locations label; > run; > > > ZIP+4 ZIP+4 > weighted weighted > center center > ZIP latitude longitude > Obs Code ZIP+4 (DD) (DD) > 1 12203 3005 -73.815012 42.675415 > 2 12203 2007 -73.803461 42.666658 > 3 12203 2009 -73.804830 42.668454 > > > > PROC GEOCODE was introduced in V9.2 ... by default, it uses > the SASHELP.ZIPCODE data set and will return the zip > centroid > > it can also be used with other reference data sets (e.g. > ZIP4) > > > a LONG time ago, an issue of SAS OBSERVATIONS (who > remembers that) included SAS code for geocoding addresses > using the TIGER line files > > TIGER files are free and it would not be that difficult to > write code for address matching (then again, you might > spend your time write SAS code to > produce the files needed for batch processing with Google > Earth) > > > > -- > Mike Zdeb > U@Albany School of Public Health > One University Place > Rensselaer, New York 12144-3456 > P/518-402-6479 F/630-604-1475 > > > for automation on websites try macro addons in browser > from iopus.com they are free but the firefox addon is > better. google maps is a website > > http://maps.google.com/ > > > > for automation in windows environment to the google > earth application > > try batch files > > > > see here for batch file tips > > > > > http://www.decisionstats.com/2008/02/using-batch-files-to-automate/ > > > > for distance use the haversian formula-- it is better > to use it in an MS EXCEL formula..like this..where A8,B8 are > one set of long lat and E4,E5 are > > another set of long lat..answer is in kms > > > > > 6371*ACOS(COS(RADIANS(90-(\$B8)))*COS(RADIANS(90-(E\$5)))+SIN(RADIANS(90-(\$B8)))*SIN(RADIANS(90-(E\$5)))*COS(RADIANS((\$A8-E\$4)))) > > > > ```

Back to: Top of message | Previous page | Main SAS-L page