Wednesday, December 15, 2010

Updated TIGER map

For a full history of this post you may want to read these three posts first: 1, 2 and 3.

When I first presented my nationwide TIGER map at WhereCamp5280, Samat mentioned that he was surprised that his county wasn't more heavily edited since he has done a lot of work there. A few days ago we were chatting on IRC (#osm) and he asked a question that made me realize that the map did indeed have a serious flaw in it. Earlier today a discussion came up on the talk-us mailing list that made me notice that the same error exists in the TIGER edited map that MapQuest (re)set up. So I decided to look at correcting it.

The error is that I was only looking at the latest version of the ways. This means that any edits between the original import and the mass edit that expanded street names were not taken into account. Both the import and the abbreviation expansion happened well before I started mapping so I didn't even consider this fact. As Antony Pegg says later in the mailing list thread, it is too expensive to go back and look at all previous versions, especially since you would have to use a FULL planet file: the regular ones only contain the current version of objects, not a full history.

But we don't really care about the contents of the edits or who performed them... we just care that they happened. The initial import obviously created version 1 of the ways and if nothing else changed, then version 2 was created by balrog-kun in the name expansion edit. If any edits happened between the import and the name edit, then the ways will be on version 3 or higher. Thus any TIGER way with a version higher than 2 must have been edited by someone other than these users, even if one of them was the last to touch a way.
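As a rough sketch of that rule (not the actual query I ran to build the map), here is how the check could look in Python. The `Way` record, the `is_human_edited` function and the `tiger-import-account` user name are all made up for illustration; only balrog-kun comes from the description above.

```python
from typing import NamedTuple

# Assumption: placeholder name standing in for the account(s) that ran
# the original TIGER import. Swap in the real account name(s).
IMPORT_USERS = frozenset({"tiger-import-account"})
# The user who ran the mass street-name expansion edit.
NAME_EXPANSION_USER = "balrog-kun"
AUTOMATED_USERS = IMPORT_USERS | {NAME_EXPANSION_USER}


class Way(NamedTuple):
    """Hypothetical record holding only the current state of a TIGER way."""
    way_id: int
    version: int
    last_user: str


def is_human_edited(way: Way) -> bool:
    """Guess whether someone other than the automated users touched a way."""
    # Version 1 is the import and version 2 is at most the name expansion,
    # so anything above 2 means some other edit happened along the way,
    # even if an automated user was the last one to touch it.
    if way.version > 2:
        return True
    # At version 1 or 2, fall back to the old check on the latest editor.
    return way.last_user not in AUTOMATED_USERS


ways = [
    Way(1, 1, "tiger-import-account"),  # untouched since the import
    Way(2, 2, "balrog-kun"),            # import plus name expansion only
    Way(3, 3, "balrog-kun"),            # edited before the name expansion
    Way(4, 2, "some-local-mapper"),     # edited by a mapper
]
edited_ids = [w.way_id for w in ways if is_human_edited(w)]
```

The nice part is that this only needs the version number and last user from the current planet file, so no full history is required.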

So here is the updated map which takes this into consideration:

Sunday, November 21, 2010

Nationwide TIGER Map

Update: You should of course still read this post because it is awesome but there is a new post with an updated version of the map here.

I presented most of this material this weekend at WhereCamp5280 and promised to put this online soon. I have a few more details and maybe some raw data that I will put in another post after I get back home after Thanksgiving but for now I'll throw up what I have.

Well after some sweat, blood and tears (mostly on the part of my hard drives) I was eventually successful in extracting the data I needed to make a map of the entire US! If you just want the map feel free to skip down but I am going to document my trials and tribulations a little.

First, some things that did not work:

Saturday, October 30, 2010

Two more TIGER maps

After my last post there were some requests for me to do additional states. So I imported Colorado and New Mexico since they are kind of one big block. However doing 50 states a couple at a time will be rather cumbersome so I think after this I will try to coax osmosis into doing a more fine-tuned import so that I can import only the things I need but do it for the entire US at once.

I also learned some more quirks about ArcMap. The program is amazing in some ways but user interface is not one of them. I was also unable to figure out how to use two different fields to join my data to the counties shapefile. So that was slightly frustrating but I was able to work around it.

Anyway, until I get around to poking osmosis again, I give you Colorado:

Sunday, October 24, 2010

TIGER ways in Kansas

Update: I have a second post with two more state maps up here. More to come!
Update 2: Here are two more posts with nationwide maps.

If you are at all familiar with OpenStreetMap, especially in the US, then you will know that most of the roads here were imported from the US Census Bureau TIGER data. The import was done in 2007 using the TIGER data that was current at the time. It is fairly complete but the accuracy leaves a lot to be desired in some areas. For example, look at this screenshot comparing the TIGER data to reality (USGS aerial imagery in this case). Some roads are off by over 200 meters. And there are even worse spots; this was just one I was able to find quickly.

Example of bad TIGER data

While the data was pretty complete when it was made, it is now several years old and missing many new housing developments and does not reflect instances where roads were moved. The good news is that, this being a census year, there will be new TIGER data available from the Census Bureau starting at the end of the year. It will be more complete and probably more accurate considering that GPS technology has made some advances over the last 10 years.

Well that's great but the question is how do we use the new data to improve OSM? In the software world this would be an ideal situation to use a diff utility to determine what had changed between two versions and apply patches to update the old version. But I am not aware of any geo-spatial diff utilities.

So how about just deleting the old data and re-importing the new stuff? This would obviously be a disaster in areas with active mappers. All the work they have put into correcting existing TIGER ways would be destroyed and replaced with data that is probably better than the old TIGER import but also probably worse than what local mappers have done.


Saturday, October 23, 2010

The Shiny New Blog

I have avoided creating a blog for years now. I never felt like I had enough important ideas to force upon the world to justify it. That and I assumed I would abandon it at some point and it would become yet another piece of litter along the information superhighway. I'm not convinced that these things have really changed but there have been a few times recently where I did want to post an idea to a more permanent medium than IRC, especially concerning my latest hobby: OpenStreetMap. So here I am, creating a blog on a Friday night.

Who knows where it will go but I do hope to post occasional observations concerning OpenStreetMap as well as some open source topics in general (yes, I use Linux) and maybe an occasional note about my job. I work as a developer for Kansas State University. Java with a heavy dose of SQL (Oracle) if you are curious.

Now, here is a random picture of me at the end of No-Shave November in 2008. If anyone reading this doesn't know me, it is a horrible first impression because I usually shave. But there you have it.