Data Issues

Because the data is in text form, it has problems such as location names that cannot be found in the county GIS database or by Google. The reports below list those problems.

Most Frequent Incident Descriptions with Categorizations (Top 250)

This report shows which of the most frequent incidents have no categorization beyond person crime and property crime.

Currently, each incident is freeform text entered by CHPD or dispatch. Many seem to follow a convention.

This report shows how each incident has been categorized based on a best-effort text search. Ideally, the officer would fill in these categories, along with other metadata, when writing the incident report.
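A best-effort text search of this kind might look like the sketch below. The keyword lists are hypothetical examples chosen for illustration; the search terms actually used for these reports are not given here.

```python
# Hypothetical keyword rules: the real search terms are not documented
# on this page, so these lists are assumptions for illustration only.
CATEGORY_KEYWORDS = {
    "person crime": ["ASSAULT", "ROBBERY"],
    "property crime": ["LARCENY", "BREAKING", "VANDALISM"],
}


def categorize(description):
    """Return the sorted categories whose keywords appear in the description."""
    text = description.upper()
    return sorted(
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )
```

A description that matches no keyword returns an empty list, which is how an incident would end up in an "uncategorized" report.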

Uncategorized Incident Descriptions (Top 250)

This report shows which of the most frequent incidents have no categorization beyond person crime and property crime.

Currently, each incident is freeform text entered by CHPD or dispatch. Many seem to follow a convention.

This report shows how each incident has been categorized based on a best-effort text search. Ideally, the officer would fill in these categories, along with other metadata, when writing the incident report.

Incidents at CHPD Police Station (Most occurred elsewhere?)

The "SOLICITING PERMIT" incidents, and perhaps some of the resisting-arrest or assault incidents, may really have happened at the police station. Almost all of the other incidents and arrests listed here appear to have occurred elsewhere. Possible explanations are that someone turned themselves in at the police station, that the location of the incident was not known, or that the victim did not want to say.

Addresses to locate first (Top 250)

These are the addresses that were found neither in the county GIS nor, with sufficient accuracy, by Google, for example because the address in the incident database does not match the county format or is invalid. This list, together with the addresses that Google located outside of Chapel Hill, shows what needs to be fixed.

The list is sorted by the number of incidents at that address so the most frequent ones (heavy hitters) can be fixed first.
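The ordering can be sketched as a simple count over the unlocated addresses. The function name and the whitespace and case normalization are assumptions made for this example, not a description of how the report is actually generated.

```python
from collections import Counter


def addresses_by_incident_count(unlocated_addresses, limit=250):
    """Count incidents per address and return the most frequent first.

    Stripping whitespace and upper-casing is an assumed normalization so
    that trivially different spellings are counted together.
    """
    counts = Counter(address.strip().upper() for address in unlocated_addresses)
    return counts.most_common(limit)
```

Each input item stands for one incident, so an address that appears many times in the input rises to the top of the list.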

Number of incidents by geocoding accuracy

These are the accuracies reported by the source of each location. All addresses from the town, which really come from the county, are considered accurate. Addresses geolocated by Google come with an accuracy score, and addresses scoring below 6 are excluded from the statistics because they are not accurate enough.
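The rule can be written as a small filter. This is a minimal sketch: the source labels "county" and "google" and the function name are assumptions made for the example, and only the threshold of 6 comes from the description above.

```python
# Threshold from the description above: Google scores below 6 are not used.
MIN_GOOGLE_ACCURACY = 6


def usable_for_statistics(source, accuracy=None):
    """Decide whether a geocoded address counts toward the statistics.

    `source` is an assumed label: "county" for county GIS matches,
    anything else is treated as a Google result with an accuracy score.
    """
    if source == "county":
        return True  # county addresses are considered accurate
    return accuracy is not None and accuracy >= MIN_GOOGLE_ACCURACY
```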

Invalid Addresses

Google was told that these addresses were in Chapel Hill, but the latitude and longitude it returned were outside Chapel Hill. Some of them may genuinely be outside Chapel Hill, if the police had to travel out of town.

It would be better if every incident had its own latitude and longitude so we don't have to rely on address matching.

The county GIS records are searched first for an address match, then Google is consulted.
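That lookup order can be sketched as follows. The two lookup callables are hypothetical stand-ins for the county GIS search and the Google geocoder; neither interface is described on this page.

```python
def geocode(address, county_lookup, google_lookup):
    """Try the county GIS first and fall back to Google only on a miss.

    Both lookups are assumed to return a (latitude, longitude) pair,
    or None when the address is not found.
    """
    location = county_lookup(address)
    if location is not None:
        return "county", location
    location = google_lookup(address)
    if location is not None:
        return "google", location
    return None, None  # neither source could locate the address
```

Returning the source alongside the coordinates keeps track of which results still need an accuracy check.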

This query shows the addresses that Google located definitely outside of Chapel Hill. The most frequently cited addresses should be corrected first.
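One simple way to flag such results is a bounding-box test, sketched below. How the query actually decides that a point is outside Chapel Hill is not stated here, so the box approach is an assumption, and the box in the usage example is a rough illustrative rectangle, not the real town limits.

```python
def is_outside(latitude, longitude, box):
    """Return True when the point lies outside the bounding box.

    `box` is (south, north, west, east) in decimal degrees. A rectangle
    only approximates a town boundary, so this is a coarse check.
    """
    south, north, west, east = box
    inside = south <= latitude <= north and west <= longitude <= east
    return not inside


# Rough, illustrative rectangle around Chapel Hill (assumed values).
EXAMPLE_BOX = (35.85, 35.98, -79.12, -78.98)
```

For example, `is_outside(35.913, -79.056, EXAMPLE_BOX)` is false for a point near the center of town, while a point in Raleigh such as `(35.78, -78.64)` is flagged as outside.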
