| Date: | Mon, 5 Jan 2009 10:21:26 -0800 |
| Reply-To: | ajayohri@yahoo.com |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | ajay ohri <ajayohri@YAHOO.COM> |
| Subject: | Re: geocoding |
|
| In-Reply-To: | <46306.150.142.232.7.1231178529.squirrel@webmail.albany.edu> |
| Content-Type: | text/plain; charset=iso-8859-1 |
I think Mike's formula are the better approach. I wasnt aware of these formulas, and Haversian formula approximate the earth as a sphere which it isnt.
great solution.
best regards,
ajay
--- On Mon, 1/5/09, Mike Zdeb <msz03@ALBANY.EDU> wrote:
> From: Mike Zdeb <msz03@ALBANY.EDU>
> Subject: Re: geocoding
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Monday, January 5, 2009, 11:32 PM
> hi ... for distance calculations, I don't think that
> this thread has included any mention of the new V9.2
> functions: ZIPCITYDISTANCE, GEODIST
>
> they both use the Vincenty formula, not the Haversine
> formula
>
> http://en.wikipedia.org/wiki/Thaddeus_Vincenty
>
> if you have V9.2, you can try the following (how far is it
> from my zip of residence 12203 to some Boston-area zips)
>
> the format in PROC PRINT shows that ZIPCITYDISTANCE returns
> a distance rounded to tenths while GEODIST distances imply
> more precision than seems
> possible (and you can see the difference between Haversine
> and Vincenty distance)
>
>
> data howfar;
> * centroid of zip 12203;
> retain y2 42.691560 x2 -73.827840;
> set sashelp.zipcode;
> where put(zip,z5.) eq : '0213';
> dist1 = zipcitydistance(12203, zip);
> dist2 = geodist(y2,x2,y,x,'M');
> * convert to radians for haversine distance;
> x1 = -x * constant('pi') / 180;
> y1 = y * constant('pi') / 180;
> x3 = -x2 * constant('pi') / 180;
> y3 = y2 * constant('pi') / 180;
> dist3 = 3961 * arcos(sin(y1) * sin(y3) + cos(y1) * cos(y3)
> * cos(x1 - x3));
> label
> dist1 = 'ZIPCITYDISTANCE'
> dist2 = 'GEODIST'
> dist3 = 'HAVERSINE'
> ;
> run;
>
> proc print data=howfar label;
> var city zip dist1 dist2 dist3;
> format dist: 15.10;
> run;
>
> Name of 5-digit
> Obs city/org ZIP Code ZIPCITYDISTANCE
> GEODIST HAVERSINE
> 1 Jamaica Plain 02130 140.5000000000
> 140.5023399595 140.1996366709
> 2 Roslindale 02131 140.8000000000
> 140.7914716174 140.4891045935
> 3 West Roxbury 02132 140.0000000000
> 139.9556216459 139.6563191378
> 4 Boston 02133 143.0000000000
> 142.9769683379 142.6626516503
> 5 Allston 02134 139.6000000000
> 139.6451823553 139.3385079256
> 6 Brighton 02135 140.3000000000
> 140.3483702300 140.0406529741
> 7 Hyde Park 02136 141.2000000000
> 141.2105493185 140.9106405081
> 8 Readville 02137 141.2000000000
> 141.2306946001 140.9327798400
> 9 Cambridge 02138 139.4000000000
> 139.3758380356 139.0681569701
> 10 Cambridge 02139 140.9000000000
> 140.9119802051 140.6015911943
>
>
>
>
> as for the GEOCODING part, if you do not need an exact
> location and are willing to accept a zip centroid, you can
> match zips to the SASHELP.ZIPCODE
> data set
>
> it is INDEXED and updated quarterly ...
> http://support.sas.com/rnd/datavisualization/mapsonline/html/misc.html
>
> data centroids;
> input zip @@;
> set sashelp.zipcode (keep=zip city x y) key=zip;
> datalines;
> 12203 90210 02138
> ;
> run;
>
> proc print data=centroids label;
> run;
>
> Latitude Longitude
> (degrees) of (degrees) of
> The the center the center
> 5-digit (centroid) (centroid) Name of
> Obs ZIP Code of ZIP Code. of ZIP Code. city/org
> 1 12203 42.691560 -73.827840 Albany
> 2 90210 34.096629 -118.412426 Beverly
> Hills
> 3 02138 42.377465 -71.131249
> Cambridge
>
>
>
> if you have zip+4 data, you can download a much larger data
> set (16 million+ observations)...
>
> http://support.sas.com/rnd/datavisualization/mapsonline/html/geocode.html
>
> (that link also has another link to "Cheap Geocoding:
> SAS/GISŪ and Free TIGERŪ Data" by Ed Odom and Darrell
> Massengill)
>
> it is also INDEXED
>
> data locations;
> input zip plus4;
> set sashelp.zip4 (keep=zip plus4 x y) key=zipplus4;
> datalines;
> 12203 3005
> 12203 2007
> 12203 2009
> ;
> run;
>
> proc print data=locations label;
> run;
>
>
> ZIP+4 ZIP+4
> weighted weighted
> center center
> ZIP latitude longitude
> Obs Code ZIP+4 (DD) (DD)
> 1 12203 3005 -73.815012 42.675415
> 2 12203 2007 -73.803461 42.666658
> 3 12203 2009 -73.804830 42.668454
>
>
>
> PROC GEOCODE was introduced in V9.2 ... by default, it uses
> the SASHELP.ZIPCODE data set and will return the zip
> centroid
>
> it can also be used with other reference data sets (e.g.
> ZIP4)
>
>
> a LONG time ago, an issue of SAS OBSERVATIONS (who
> remembers that) included SAS code for geocoding addresses
> using the TIGER line files
>
> TIGER files are free and it would not be that difficult to
> write code for address matching (then again, you might
> spend your time write SAS code to
> produce the files needed for batch processing with Google
> Earth)
>
>
>
> --
> Mike Zdeb
> U@Albany School of Public Health
> One University Place
> Rensselaer, New York 12144-3456
> P/518-402-6479 F/630-604-1475
>
> > for automation on websites try macro addons in browser
> from iopus.com they are free but the firefox addon is
> better. google maps is a website
> > http://maps.google.com/
> >
> > for automation in windows environment to the google
> earth application
> > try batch files
> >
> > see here for batch file tips
> >
> >
> http://www.decisionstats.com/2008/02/using-batch-files-to-automate/
> >
> > for distance use the haversian formula-- it is better
> to use it in an MS EXCEL formula..like this..where A8,B8 are
> one set of long lat and E4,E5 are
> > another set of long lat..answer is in kms
> >
> >
> 6371*ACOS(COS(RADIANS(90-($B8)))*COS(RADIANS(90-(E$5)))+SIN(RADIANS(90-($B8)))*SIN(RADIANS(90-(E$5)))*COS(RADIANS(($A8-E$4))))
> >
> >
|