Module 1.3: Data Quality-Assessment
This week's module assessed the completeness of two road networks separately and then compared them to each other. The two networks were TIGER and the Jackson County street centerlines. TIGER stands for Topologically Integrated Geographic Encoding and Referencing, and the data is produced by the Census Bureau. The TIGER data used here is from 2000 and has some major errors, although in 2010 the government tried to fix these issues. The Jackson County street centerlines are more accurate than the TIGER data, but they are not more complete. Knowing where your data comes from is important because it helps you judge how complete a road network is likely to be.
First, I needed to make sure that only the roads within the grids were used. I did this with Select By Location, choosing each road layer as the input features, Intersect as the relationship, and the Grid shapefile as the selecting features. Then I hit Apply to select only the lines within the grid polygons. Next, I ran the Intersect geoprocessing tool with each road layer as the input feature. I ran it separately for each road network because I did not want all the information to be combined. I named the output feature classes “Grid_Street” and “Grid_TIGER” so I could tell them apart from the other shapefiles. For Attributes to Join I chose All Attributes because it would make it easier later to get the information for each grid. For Output Type I chose Line so that the output would be the line features that intersected the grid. Finally, I added a new column called “New_Length” to hold the new length of each polyline. Once it was created, I used Calculate Geometry to get the new length in kilometers. I did this to confirm that the lengths had actually changed after the roads were split at the grid boundaries.
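To compare the two networks grid by grid, the “New_Length” values have to be added up for each grid cell. The sketch below shows that summing step in plain Python, outside of ArcGIS. The function name, the (grid ID, length) pair layout, and the sample numbers are all assumptions made up for illustration; they are not taken from the lab data.

```python
from collections import defaultdict


def total_length_by_grid(segments):
    """Sum road segment lengths (km) for each grid cell.

    `segments` is an iterable of (grid_id, length_km) pairs, one pair per
    intersected road segment. This layout is hypothetical; in the lab the
    values would come from the attribute table of the Intersect output.
    """
    totals = defaultdict(float)
    for grid_id, length_km in segments:
        totals[grid_id] += length_km
    return dict(totals)


# Made-up sample rows, not values from the actual datasets
sample_segments = [(1, 0.5), (1, 1.25), (2, 2.0), (2, 0.75), (3, 1.0)]
totals = total_length_by_grid(sample_segments)
print(totals)
```

Running the same function once on the “Grid_Street” rows and once on the “Grid_TIGER” rows would give one total length per grid for each network.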
After getting all the data, the results showed that the TIGER 2000 data was more complete than the Jackson County street centerlines data. They also showed that just because data is open to the public and produced by the federal government does not mean it will always be accurate. This indicates that it is necessary to have two datasets to compare against each other. The image below is a choropleth map that shows the percentage difference in total road length between the two datasets within each grid. It shows that there were major negative and positive differences between the two datasets. Negative values indicate that the TIGER data had more total length than the Jackson County street centerlines. Positive values indicate that the Jackson County street centerlines had more total length than the TIGER data. Lastly, the “little change” class covers approximately -3 to 3 percent.
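The percentage difference for one grid can be sketched in a few lines of Python. The formula here is an assumption: it divides the difference by the street centerline total, which is chosen because it matches the sign convention described above (negative when TIGER is longer). The function names, the 3 percent threshold as a hard cutoff, and the sample lengths are also hypothetical.

```python
def percent_difference(street_km, tiger_km):
    """Percent difference in total road length for one grid cell.

    Assumed formula: (street - TIGER) / street * 100, so the result is
    negative when TIGER has more total length. Returns None when the
    street centerline total is zero, since the ratio is undefined.
    """
    if street_km == 0:
        return None
    return (street_km - tiger_km) / street_km * 100.0


def classify(pct, threshold=3.0):
    """Label a percent difference using an assumed +/- 3 percent band."""
    if pct < -threshold:
        return "TIGER longer"
    if pct > threshold:
        return "street centerlines longer"
    return "little change"


# Made-up totals in km, not values from the actual datasets
pct = percent_difference(10.0, 12.0)
print(pct, classify(pct))
```

A grid where both networks have nearly the same total length lands in the “little change” class, while a grid with a large gap is labeled by whichever network is longer.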