Address Issues

These are known issues with addresses in the incident data. Many of the police records contain inconsistent addresses that do not match city and county records. Google and other geocoding services were used to try to locate the many thousands of missing or mismatched addresses.

Addresses to locate first (Top 250)

These are the addresses that were not found in the county GIS and that Google could not locate with sufficient accuracy, for example because the address in the incident database does not match the county format or because the address is invalid. This list, together with the addresses Google located outside of Chapel Hill, shows what needs to be fixed.

The list is sorted by the number of incidents at each address so that the most frequent ones (heavy hitters) can be fixed first.
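The ranking described above can be sketched in a few lines of Python. This is only an illustration, not the query behind the report: the function name `rank_unmatched_addresses`, the whitespace and case normalization, and the sample addresses are all assumptions made for the example.

```python
from collections import Counter


def rank_unmatched_addresses(incident_addresses, top_n=250):
    """Return (address, incident_count) pairs, most frequent first.

    Addresses are upper-cased and whitespace-collapsed before counting so
    that trivially different spellings are tallied together (an assumption;
    the real data may need more careful cleaning).
    """
    counts = Counter(" ".join(addr.upper().split()) for addr in incident_addresses)
    return counts.most_common(top_n)


# Hypothetical unmatched addresses, one entry per incident.
sample = [
    "100 EXAMPLE ST",
    "100 example st ",
    "5 SAMPLE RD",
    "100  Example St",
    "5 sample rd",
    "9 PLACEHOLDER LN",
]

for address, incidents in rank_unmatched_addresses(sample, top_n=2):
    print(incidents, address)
```

Fixing the first few entries of such a list corrects the largest number of incidents for the least effort.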

Number of incidents by geocoding accuracy

These are the accuracies reported by the source of the information. All addresses from the town, which actually come from the county, are considered accurate. Addresses geocoded by Google come with an accuracy score; scores below 6 are not accurate enough to consider and are excluded from the statistics.
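A minimal sketch of this rule, assuming each record carries a `source` field ("county" or "google"), an optional numeric `accuracy`, and an `incidents` count. Those field names and the sample values are hypothetical; only the threshold of 6 comes from the text.

```python
from collections import Counter

MIN_GOOGLE_ACCURACY = 6  # from the text: Google scores below 6 are excluded


def usable_for_statistics(source, accuracy=None):
    """County-sourced addresses are always accepted; Google ones need score >= 6."""
    if source == "county":
        return True
    return accuracy is not None and accuracy >= MIN_GOOGLE_ACCURACY


def incidents_by_accuracy(records):
    """Sum incident counts per accuracy label; county rows are grouped as 'county'."""
    totals = Counter()
    for rec in records:
        label = "county" if rec["source"] == "county" else rec["accuracy"]
        totals[label] += rec["incidents"]
    return dict(totals)


# Hypothetical records, for illustration only.
records = [
    {"source": "county", "accuracy": None, "incidents": 40},
    {"source": "google", "accuracy": 8, "incidents": 12},
    {"source": "google", "accuracy": 4, "incidents": 7},
    {"source": "google", "accuracy": 8, "incidents": 3},
]

usable = [r for r in records if usable_for_statistics(r["source"], r["accuracy"])]
print(incidents_by_accuracy(records))
print(len(usable), "of", len(records), "records are usable")
```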

Invalid Addresses

Google was told that these addresses were in Chapel Hill, but the latitude and longitude it returned were outside Chapel Hill. Some of them may genuinely be outside of Chapel Hill if the police had to travel out of town.

It would be better if every incident had its own latitude and longitude so that we did not have to rely on address matching.

The county GIS records are searched first for an address match; Google is consulted only if no match is found.
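The lookup order can be sketched as follows. Everything here is illustrative: `county_index` stands in for the county GIS records as a plain dictionary, and `google_geocode` is a caller-supplied function (stubbed out below), not a real Google API call. The coordinates are made up.

```python
def normalize(address):
    """Upper-case and collapse whitespace (an assumed, minimal cleanup step)."""
    return " ".join(address.upper().split())


def locate(address, county_index, google_geocode):
    """Try the county records first; fall back to Google only on a miss.

    county_index:   dict mapping normalized address -> (lat, lon)
    google_geocode: callable returning (lat, lon, accuracy) or None
    """
    hit = county_index.get(normalize(address))
    if hit is not None:
        return {"lat": hit[0], "lon": hit[1], "source": "county", "accuracy": None}

    result = google_geocode(address)
    if result is None:
        return None  # neither source could locate the address
    lat, lon, accuracy = result
    return {"lat": lat, "lon": lon, "source": "google", "accuracy": accuracy}


# Hypothetical data and a stub geocoder, for illustration only.
county_index = {"1 EXAMPLE ST": (35.9, -79.0)}


def fake_google(address):
    return (35.8, -78.6, 8) if normalize(address) == "2 SAMPLE RD" else None


print(locate("1 example st", county_index, fake_google))
print(locate("2 Sample Rd", county_index, fake_google))
print(locate("nowhere", county_index, fake_google))
```

Passing the geocoder in as a function keeps the fallback logic testable without network access.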

This query shows the addresses Google found that are definitely outside of Chapel Hill. The most frequently cited addresses should be corrected first.
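One simple way to flag such results is a bounding-box test, sketched below. The box coordinates are a rough placeholder chosen for the example and are not the real town limits; an actual check should use the town boundary from the county GIS. The row layout and sample rows are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    south: float
    north: float
    west: float
    east: float

    def contains(self, lat, lon):
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# ASSUMPTION: a loose, illustrative box around Chapel Hill, not an official boundary.
ASSUMED_TOWN_BOX = BoundingBox(south=35.85, north=36.00, west=-79.15, east=-78.95)


def outside_town(geocoded, box=ASSUMED_TOWN_BOX):
    """Return rows located outside the box, most incidents first.

    geocoded: iterable of (address, lat, lon, incidents) tuples.
    """
    outside = [row for row in geocoded if not box.contains(row[1], row[2])]
    return sorted(outside, key=lambda row: row[3], reverse=True)


# Hypothetical geocoding results, for illustration only.
rows = [
    ("1 EXAMPLE ST", 35.91, -79.05, 14),
    ("2 SAMPLE RD", 35.78, -78.64, 3),
    ("3 PLACEHOLDER LN", 36.10, -79.80, 9),
]

for address, lat, lon, incidents in outside_town(rows):
    print(incidents, address, lat, lon)
```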

Summary of Google Accuracies on addresses not found in county database

Over 8,000 unique addresses could not be easily matched with the county GIS database, so Google was used to geocode them.

The higher the score, the better. Google assigns an accuracy score to each match, and these scores were recorded so that matches scoring 5 or lower could be ignored during analysis.

All location data with a score of 5 or lower was deleted so that it would not produce inaccurate results.

This report shows that many incidents have no accurate location associated with them and are therefore left out of the analysis.

0 Unknown location.
1 Country level accuracy.

