
Geocoding in ArcGIS

March 26, 2013
Geocoding is a process by which information that is not in spatial map format, e.g., a list of addresses, can be placed as points on a map, similar to putting pins on a paper map. The process assigns geographic coordinates to the original data, hence the name geocoding. It is also called address matching when the information is address-based and is being assigned to a street map.

In a typical geocoding process, the data list might include an address like 508 W. 5th St., and a street centerline GIS data layer would have a street segment corresponding to the 500 block of West 5th Street. The result of geocoding would be a point placed somewhere along the even-address side of that street segment.

The geocoding process is described in ArcGIS Desktop Help (on the ArcGIS Help menu) under Contents - Geocoding Addresses, and you should refer to that for details. This tip sheet provides an overview of the preparation process, which is very important and a bit tricky.

Note: if your data list already has x and y coordinates of some sort, you don't have to geocode; those coordinates are the geocodes. You can add them to a map by using the Tools - Add XY Data menu function in ArcMap.

Preparing for the Geocoding Process

Two sets of data are needed for the geocoding process: the data that you want to place on a map, e.g., a list of addresses, and the GIS data layer that you will use as the reference layer, e.g., a city's street centerlines layer. Both data sets need to be prepared prior to geocoding.

Preparing the address data

Preparing the address data set means formatting the information correctly so that GIS software such as ArcGIS can break it into its components (a step called parsing). The data set should be in a database-compatible format such as a comma-delimited (.csv) or tab-delimited (.tab) text file, or a dBase (.dbf) file. The address should be contained in a single column and should include the street number and street name, along with the street's prefix direction, prefix type, street type, and suffix direction, if any.
Intersection descriptions (for example, " Blvd. & Vine St.") can also be included in this field. In a typical address field from a table, several of the addresses may be unmatchable because of their format, e.g., 401-409 S. 5th. Ideally you should have as complete an address as possible, but without apartment information. So 619 E. Cedar St. would be good, assuming Cedar Street has an east and a west portion, but not 609 Cedar, nor 609 E. Cedar St., Apt. 119. The geocoding process will likely be able to handle missing street types like St. or Blvd., but the more complete your address information is, the more smoothly it will go. Apartment information should go in a separate field if you need to keep it for your own purposes; it will not be part of the geocoding process.

The most important thing is that your data formatting is consistent throughout the column. If you include intersections for some addresses instead of street numbers, always use the same connector (e.g., &) and use the complete street names (e.g., Burnet Rd. & W. 51st St.). The more carefully you format your address list with geocoding in mind, the better the geocoding process will work, so take some time doing this, and plan ahead if you know you will be geocoding.

Other tips: for your column names, follow dBase-compatible formats, meaning no spaces or special characters in the field name and a maximum of 10 characters. If you are using a spreadsheet to create this data set, make the first row the field names, and start your actual address records on the second row. Do not add other formatting, rows, or columns, e.g., no titles or spacer rows. Enter only the field names and the actual data records. You can then save this to a comma-delimited text file (.csv) for use in ArcGIS. (Note: you can also save it as a dBase file (.dbf), but I find comma-delimited files less problematic than .dbf when saving from Excel.)
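The formatting tips above can be checked before the file ever reaches ArcGIS. The following is a minimal Python sketch, not part of ArcGIS; the function names, the sample records, and the rule that a field name must start with a letter are assumptions made for illustration. It rejects field names that are longer than 10 characters or contain spaces or special characters, then writes only the header row and the data rows.

```python
import csv
import io
import re

# Assumed dBase-style rule: a letter first, then letters, digits, or
# underscores, with at most 10 characters in total.
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,9}")


def bad_field_names(header):
    """Return the field names that break the assumed dBase-style rule."""
    return [name for name in header if not FIELD_NAME.fullmatch(name)]


def write_address_csv(rows, header, out):
    """Write the header row followed by the data rows, and nothing else."""
    problems = bad_field_names(header)
    if problems:
        raise ValueError(f"Rename these fields first: {problems}")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# Hypothetical sample records: one address column, apartment kept separately.
header = ["ID", "ADDRESS", "APT"]
rows = [
    [1, "508 W. 5th St.", ""],
    [2, "Burnet Rd. & W. 51st St.", ""],
    [3, "609 E. Cedar St.", "119"],
]

buffer = io.StringIO()
write_address_csv(rows, header, buffer)
csv_text = buffer.getvalue()
```

In practice you would pass an open file instead of the in-memory buffer, e.g., open("addresses.csv", "w", newline="").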
Make sure you save only the header and data rows and columns, not extraneous rows or columns. See Saving Excel files in dBase format for more information if you are exporting from Excel to dBase.

About reference data

You must already have a GIS data layer available to act as your reference layer. If you are geocoding a list of street addresses, you will want a street centerlines GIS layer. For the city of Austin, this would be the str-address shapefile, the city's street centerline layer. A good source for street centerline GIS data layers for any area in the US is the US Census Bureau's TIGER roads data set. These can be downloaded from the Census Bureau itself (http://www.census.gov/) or from the Geography Network (http://www.geographynetwork.com/freeresources.html); the Geography Network has a more user-friendly interface. Note that TIGER roads files are downloaded county by county, so you may have to merge files to create a reference data layer for more than one county. To merge in ArcMap, add all the data layers to be merged, choose Tools-Geoprocessing, and follow the instructions for merging.

Well-formatted street centerline GIS data layers have separate fields for street name, prefix direction, street type, and suffix direction as appropriate (some streets don't have suffix or prefix directions). They will also have four address range fields: From address left and To address left (e.g., 1100 and 1122), and From address right and To address right (e.g., 1101 and 1123). These address ranges are what allow an address to be pin-mapped onto the street network. If you need to match by both address and zip code, your reference layer should also have fields for the zip code on the left side of the street and the zip code on the right side of the street. The TIGER data is formatted in this way, and many cities, including Austin, follow this format for their street centerlines layer.

You can also match your data to a zip code only, if you desire.
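As a rough illustration of how the four address range fields described above allow an address to be pin-mapped, the sketch below picks the side of the street by odd/even parity and interpolates the position linearly along a straight segment. This is a simplified model written for this tip sheet, not ArcGIS's actual matching code; the function names are hypothetical, and real locators also handle curved segments and side offsets.

```python
def locate_on_segment(house_number, from_left, to_left, from_right, to_right):
    """Return (side, fraction) for a house number, or None if out of range.

    fraction is the relative position along the segment, from 0.0 at the
    "from" end to 1.0 at the "to" end.
    """
    for side, low, high in (("left", from_left, to_left),
                            ("right", from_right, to_right)):
        start, end = min(low, high), max(low, high)
        same_parity = house_number % 2 == low % 2
        if same_parity and start <= house_number <= end:
            if high == low:
                return side, 0.5  # single-address range: use the midpoint
            return side, (house_number - low) / (high - low)
    return None


def point_along(start_xy, end_xy, fraction):
    """Linearly interpolate a point on a straight segment."""
    (x0, y0), (x1, y1) = start_xy, end_xy
    return x0 + fraction * (x1 - x0), y0 + fraction * (y1 - y0)


# Ranges from the text: left 1100-1122 (even), right 1101-1123 (odd).
match = locate_on_segment(1111, 1100, 1122, 1101, 1123)
if match is not None:
    side, fraction = match
    # Hypothetical segment end points in map units.
    x, y = point_along((0.0, 0.0), (220.0, 0.0), fraction)
```

An address whose number falls outside both ranges returns None, which corresponds to the situation where a street segment's address ranges are missing or incorrect and the record cannot be matched.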
For statewide or nationwide data sets, the zip code may be the only information you have, or mapping to a generalized zip code boundary may be good enough for your needs. In this case you will need some kind of zip code point layer (centroids, i.e., the center points of zip codes) or polygon GIS layer to act as the reference. The US Postal Service, which creates and maintains zip codes for mail purposes, does not maintain this data for various reasons, but the US Census Bureau has an approximation of zip code areas that it calls Zip Code Tabulation Areas (ZCTAs), which you can download from the Census Bureau by state. Understand, however, that these are only approximations for census purposes; they do not reflect actual zip code areas and are not kept up to date. See the GeoCommunity web site's section about zip code GIS data for more warnings and information (http://spatialnews.geocomm.com/newsletter/2000/jan/zipcodes.html). Note that many private data vendors also sell zip code GIS information and offer services for zip code mapping.

Preparing reference data

The GIS reference layer, e.g., a street centerlines or zip code polygon layer, needs to be prepared by creating what ArcGIS refers to as an "address locator". This process essentially indexes a reference layer, much like indexing a book. You have to create the address locator index in ArcCatalog if it does not already exist.

To create a geocoding service for your reference data layer:

1. Open ArcCatalog.
2. On the left side, scroll to near the bottom of the list to where you see Address Locators and click on that.
3. On the right side you will see any existing geocoding services, plus "Create New Address Locator". Double-click on Create New Address Locator.
4. You now need to choose a geocoding style.
If you are matching addresses without zip code information to a street centerline shapefile like the City of Austin str-address layer or a TIGER-type street centerline shapefile, choose US Streets (File).

If you are matching addresses and zip code information to a street shapefile, choose US Streets with Zone (File). (Note: you would need to use addresses and zip codes if you were matching addresses in more than one city, since something like 1201 N. Main St. might be found in lots of cities, but there is only one 1201 N. Main St., 78715 in the world.)

If you are matching only zip code information to a zip code point or polygon layer, choose either Zip5digit (File) or Zip+4 (File), whichever is appropriate to your data set.

For more information, press the Help button in the geocoding style dialog box.

After you choose a style, the address locator dialog box appears. Give your new locator index an appropriate name so that you can find it again in the future. Under the Primary Table area on the left, browse to locate the GIS data layer that will serve as your reference layer. In this example, I am using a merged TIGER roads layer called TIGER_centex_DD (a set of TIGER road files for Central Texas counties). Once I specify that, the fields portion is filled out automatically, but you should check these entries to make sure that the software is finding the correct field for each element.

If your address list has intersection information (e.g., Burnet Rd. & W. 51st St.) instead of addresses for some or all records, you need to make sure that the intersection connector symbol is specified. Again, there are defaults that may already match your address data, but add your connector if you need to.

Finally, I like to check the option to add x and y coordinates to the output table that will result from the geocoding process. This is not necessary, however.

There are many other options on the right side of the dialog box. For your first time, just accept the other defaults.
To learn more about the options, use ArcGIS Desktop Help Contents - Geocoding Addresses. Press OK when you are finished. The new address locator index will be stored in the Address Locators area of ArcGIS and will be available later for the actual geocoding process. (Note: in Sutton Lab, if you do this at one computer it will not be available at another computer; you will have to recreate the service on each computer.)

Geocoding a list of addresses

Once you have prepared your address data and created a geocoding service using your reference GIS data layer, you are ready to do the actual geocoding. There are good instructions for geocoding in ArcGIS Desktop Help (on the ArcGIS Help menu) under Contents - Geocoding Addresses - Geocoding a Table of Addresses. You should refer to these instructions for this process.

Typically, only some percentage (ideally a large percentage) of your addresses will actually find a match. Some will remain unmatched. For these, there is a re-matching process described well in ArcGIS Desktop Help under Contents - Geocoding Addresses - Re-matching a Geocoded Feature Class. But before you do the re-matching process, you should spend time carefully examining the addresses that didn't match (indicated by a "U" in the status field of the geocoded shapefile).

There can be many reasons for a matching failure. The address in your list may be typed incorrectly or be in the wrong format. The street centerline file may have problems: it may be out of date, or it may list a name for a street that differs from the name used in your address list (e.g., MLK Blvd. instead of 19th St. or Martin Luther King, Jr. Blvd.). The address ranges may also be incorrect or missing for a street segment. Also note that TIGER files typically do not contain street address ranges for rural areas or small towns, so addresses in these areas cannot be matched against the TIGER files.

You should also check the addresses that did match.
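One way to start the review of unmatched records is to tally the status codes and pull out the rows marked "U". The sketch below is only an illustration: it assumes the geocoded table has been exported to a comma-delimited file, that the status column is named Status, that the address column is named ADDRESS, and that "M" marks a matched record. Only the "U" code comes from the description above, so check the field names and codes in your own output.

```python
import csv
import io
from collections import Counter


def summarize_matches(table_file, status_field="Status"):
    """Tally the status codes and collect the rows marked unmatched ("U")."""
    counts = Counter()
    unmatched = []
    for row in csv.DictReader(table_file):
        status = row[status_field].strip().upper()
        counts[status] += 1
        if status == "U":
            unmatched.append(row)
    return counts, unmatched


# Hypothetical export of a geocoded table; "M" is assumed to mean matched.
sample = io.StringIO(
    "Status,ADDRESS\n"
    "M,508 W. 5th St.\n"
    "U,401-409 S. 5th\n"
    "U,609 Cedar\n"
)
counts, unmatched = summarize_matches(sample)
match_rate = counts["M"] / sum(counts.values())
```

Listing the unmatched addresses side by side often makes formatting problems, such as address ranges or missing direction prefixes, easy to spot before re-matching.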
Records that did match may still have matched incorrectly for various reasons. You always need to do a data check for any process that you perform in GIS!

Geocoding by Zip Code Only

The same process described above is used for geocoding by zip code only. One difference is that you create and use a geocoding service specific to this process, e.g., one using the Zip5digit geocoding style. The other is that the point placed on the map will be located at the center point (centroid) of a zip code polygon or, if your zip code GIS data layer consists of points, on top of the matching zip code point.

Note that when there are multiple records for a single zip code, all the points will be placed one on top of the other at the zip code centroid. It will look like just a single point, but if you click on it with the information tool, you will see all the records come up. If you select it with the selection tool, all the records will be selected. If you summarize or join the geocoded data, all the records for that point will be processed.
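Because stacked points look like a single point on the map, it can help to count the records per zip code before geocoding. The following is a small Python sketch of that check; the field name ZIP, the sample records, and the choice to compare only the first five characters (so that ZIP+4 values group with their five-digit zip code) are assumptions for illustration.

```python
from collections import defaultdict


def group_by_zip(records, zip_field="ZIP"):
    """Map each five-digit zip code to the list of records that share it."""
    groups = defaultdict(list)
    for record in records:
        groups[record[zip_field].strip()[:5]].append(record)
    return dict(groups)


# Hypothetical records; two of them share a zip code.
records = [
    {"NAME": "Clinic A", "ZIP": "78701"},
    {"NAME": "Clinic B", "ZIP": "78705"},
    {"NAME": "Clinic C", "ZIP": "78701-1234"},
]

# Zip codes whose geocoded points would sit on top of one another.
stacked = {
    zip_code: len(group)
    for zip_code, group in group_by_zip(records).items()
    if len(group) > 1
}
```

Here stacked reports that two records would land on the same centroid for 78701, which is a reminder to use the information tool or a summary table rather than counting visible points.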
