Geocoding is the process of appending
latitude/longitude coordinates to database records. It is a gateway step
toward getting your point location data on the map.
There are three basic parts of a geocoding system:
- Your data file
- A geo-reference file, typically streets, or ZIP centroids
- Software used to perform a join between records of the two files
The quality of your geocoding results is highly dependent on the quality
of each of these three parts, discussed in detail below.
Your Data
This is an especially tricky area for a surprising number of GIS
users. It can be difficult at times to get your hands on just the right cut of data,
with the exact fields you need. At a minimum, you will need to bring over some
fields that contain geo-referenceable information, such as street address, city, state
and/or zip code. Once you have this data in hand, the biggest
hurdle is to "scrub" it up as much as possible. The geocoding software
basically looks for text matches on fields such as city, zip code, and street name,
so it is important that your data has a consistent level of quality.
Consider the problem of New York City. If we depend on an
accurate city name to achieve a match, imagine how challenging a file is that contains
"nyc", "New York", "N.Y.C.", "NY City" and so
forth.
If at all possible, devote some time to reviewing your data and substituting
one standard form for each variant you find. Other areas that can benefit from scrubbing include
zip code fields and street name suffixes such as boulevard.
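The scrubbing described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming small hand-built substitution tables; the names `CITY_ALIASES`, `SUFFIX_ALIASES`, `scrub_city`, `scrub_zip` and `scrub_street` are hypothetical, and a real table would be built from a review of your own data.

```python
import re

# Hypothetical lookup tables; extend them from a review of your own data.
CITY_ALIASES = {
    "nyc": "NEW YORK",
    "ny city": "NEW YORK",
    "new york city": "NEW YORK",
    "new york": "NEW YORK",
}
SUFFIX_ALIASES = {
    "boulevard": "BLVD",
    "blvd": "BLVD",
    "street": "ST",
    "st": "ST",
    "avenue": "AVE",
    "ave": "AVE",
}


def _key(text):
    """Lowercase, drop periods, and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.replace(".", "").strip().lower())


def scrub_city(city):
    """Map known variants to one standard spelling; otherwise just uppercase."""
    key = _key(city)
    return CITY_ALIASES.get(key, key.upper())


def scrub_zip(zip_code):
    """Keep the five-digit part and restore leading zeros lost to numeric storage."""
    base = str(zip_code).strip().split("-")[0]  # drop a ZIP+4 extension
    digits = re.sub(r"\D", "", base)
    return digits[:5].zfill(5) if digits else ""


def scrub_street(street):
    """Standardize only the final word, where the suffix normally sits."""
    words = _key(street).split(" ")
    if words[-1] in SUFFIX_ALIASES:
        words[-1] = SUFFIX_ALIASES[words[-1]]
    return " ".join(words).upper()
```

With these tables, "nyc", "New York", "N.Y.C." and "NY City" all scrub to the same city value, so a text match against the reference file has a chance of succeeding.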
A more subtle issue is understanding the actual nature of the location
data in your record relative to the application of that record in your analysis. For
instance, billing addresses might work fine for getting invoices out, but they do not help
us schedule or analyze service demand for equipment in the field. Be sure that you
are geocoding the address that actually matters for your analysis.
Geo-Reference Data
Geo-reference data, such as street files or zip centroids, is basically a huge
lookup table that contains feature name information as well as coordinate
information. The records in this file are examined to find matches to your data
records.
ZIP Centroids
Zip centroids are fairly straightforward files. Basically three columns of
information are required - zip name, lat and lon. The lat/lon coordinates represent
the geographic centerpoint of the zip code area.
The keys to quality in a zip centroid file are data vintage (more
current is better), and positional accuracy.
Street Data
Street files, such as TIGER, are useful georeference files when
you need a high degree of positional accuracy. Generally speaking, an address-match
to a street file will be much more accurate than a simple zip centroid match.
Street files are very large, pretty complex, and require good input
information to be truly effective.
The keys to quality in a street file are data vintage, positional
accuracy, completeness of network, and degree of data population (records having address
information in them). The census TIGER files are of pretty good
quality and are quite cheap.
(We offer enhanced TIGER from GDT, which can be quite costly but is of
excellent quality.)
Geocoding Software
Most GIS mapping software has some innate capability to perform
geocoding. Both ESRI and MapInfo (and others) offer customized programs that are
meant to make geocoding easier and faster. There are also stand-alone programs and
programming libraries that enable geocoding outside the confines of a mapping session.
Centroid matching is a no-brainer, and can be performed in any number of
non-GIS environments, given your data and a centroid file.
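As a rough sketch of how simple centroid matching is outside a GIS, the join can be done with nothing more than Python's standard library. The column names `zip`, `lat` and `lon`, the function names, and the sample coordinates below are all assumptions for illustration, not values from any particular centroid product.

```python
import csv
import io


def load_centroids(handle):
    """Read a centroid file with columns zip, lat, lon into a dict keyed by ZIP."""
    return {
        row["zip"]: (float(row["lat"]), float(row["lon"]))
        for row in csv.DictReader(handle)
    }


def geocode_by_zip(records, centroids):
    """Append lat/lon to each record; unmatched records get None for both."""
    results = []
    for record in records:
        lat, lon = centroids.get(record.get("zip"), (None, None))
        results.append({**record, "lat": lat, "lon": lon})
    return results


# Illustrative values only, not real centroid coordinates from any product.
sample_centroids = io.StringIO(
    "zip,lat,lon\n10001,40.75,-73.99\n60601,41.88,-87.62\n"
)
centroids = load_centroids(sample_centroids)
records = [{"id": 1, "zip": "10001"}, {"id": 2, "zip": "99999"}]
geocoded = geocode_by_zip(records, centroids)
```

The second record has no entry in the sample centroid file, so it comes back with empty coordinates rather than a guess. In practice the in-memory sample would be replaced by an open file handle on your own centroid file.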
The software for address matching basically is called on to resolve
textual matches among the many attributes being compared between your data and the street
data. Speed, the ability to use fuzzy rules, and user interface are the keys to
quality here.
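To give a feel for what a "fuzzy rule" can look like, here is a small sketch that uses Python's standard `difflib` module to tolerate a misspelled street name. It is not how any particular geocoding product works; the function `match_street`, the 0.8 cutoff and the reference list are assumptions chosen for illustration.

```python
import difflib


def match_street(name, reference_names, cutoff=0.8):
    """Return the closest reference street name, or None if nothing clears the cutoff."""
    candidates = difflib.get_close_matches(
        name.upper(), reference_names, n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None


# Hypothetical street names standing in for a street reference file.
reference = ["BROADWAY", "AMSTERDAM AVE", "LEXINGTON AVE"]

typo_match = match_street("Amsterdm Ave", reference)  # one letter missing
no_match = match_street("Elm St", reference)          # nothing similar on file
```

A lower cutoff matches more records but raises the risk of false matches, which is one reason clean input data matters more than the matching tool itself.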
That said, our internal testing shows that the quality differences
among the major geocoding software tools, other than speed, are negligible, especially
in comparison to the impact of poor quality data.
Geocoding Services
There is no shortage of people with really good geocoders who would
love to take your money in return for geocoding, ourselves included. It makes sense
to outsource your geocoding if you have only an occasional need and/or small numbers of
records. But for those with big files, or regularly changing files, outsourcing
becomes costly very quickly. That being said, when quality is
very important...
Of course we sell all this stuff, and would be
happy to help you with your geocoding. Just call 888-840-6100
to discuss your needs.