Wednesday, November 21, 2012

Preparing for Operation Cowboy

If you haven't heard, there's a major mapping event about to happen. Dubbed Operation Cowboy, this will be the second event of its kind, the first being Night of the Living Maps, which happened back in February. As described on the wiki, these events are kind of like a global LAN party for maps. For one weekend, people who choose to participate will focus their mapping efforts on a common goal. This time the organizers decided to focus on the USA.

We of course have plenty of mapping to do around here. Cleaning up and updating the TIGER road import is kind of a never-ending task. In some of the sparsely populated areas of the country there aren't really any local mappers to speak of so there is ample work for "armchair mappers" to come in and do some map maintenance.

Considering the audience and the event, I decided to try and get a few things tweaked before Operation Cowboy goes down.

TIGER 2012 road name tiles


First up was improving the TIGER 2012 road name tiles being served up by the OSM-US tile server. They were originally set up by Ian Dees and are a great resource for tracing roads that were built since the original TIGER data was imported. The only issue I had with them is that they show the street name in the Census Bureau's abbreviated format. In OSM we prefer to use unabbreviated names because it reduces ambiguity. Does "St" mean "Street" or "Saint"? While that one might not be too hard to figure out and expanding "Ave" to "Avenue" might be trivial, there are a lot of abbreviations in the data and some of them are not easy to guess if you aren't familiar with the area.

Any guesses as to what the following mean? Brg. Cll. Natl Lkshr. Ofc. Pso. RM. BIA Rte? (see answer key at end of post) Most people in the US can probably guess some of them but there are definitely some regional ones in there that many people may not be familiar with. Then add in abbreviated directional prefixes. And now suppose you are a European trying to figure all of this out during Operation Cowboy. It's going to look like a secret code.

I won't go into the technical details in this post (I might save that for another one after I clean up the process documentation), but it did involve 27 GB of data in 3,200 .dbf files, about 10 different command line utilities and several large SQL queries. The end result was a mapping of 10.8 million line IDs (the linearid field in TIGER speak) to fully expanded road names, which I applied to the rendering database. The result can be seen on this tile preview page.
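To give a rough idea of what "expanding" a name means, here is a minimal Python sketch. It is not the process I actually ran (that was mostly SQL against the TIGER tables); it is just a toy token lookup built from the answer key at the end of this post, plus a guess at handling the "St" ambiguity by position. The function name, the splitting of "Natl Lkshr" and "BIA Rte" into separate tokens, and the position rule are all assumptions made for illustration.

```python
# Toy lookup based on the answer key in this post. The real TIGER
# abbreviation list is far longer; this table is illustrative only.
ABBREVIATIONS = {
    "Ave": "Avenue",
    "Brg": "Bridge",
    "Cll": "Calle",
    "Natl": "National",
    "Lkshr": "Lakeshore",
    "Ofc": "Office",
    "Pso": "Paseo",
    "RM": "Ranch to Market Road",
    "BIA": "Bureau of Indian Affairs",
    "Rte": "Route",
}


def expand_name(name):
    """Expand abbreviated tokens in a road name (hypothetical helper)."""
    tokens = name.split()
    expanded = []
    for i, token in enumerate(tokens):
        if token == "St":
            # Assumed heuristic: a leading "St" is "Saint", a trailing
            # "St" is "Street", anything else is left alone.
            if i == 0:
                expanded.append("Saint")
            elif i == len(tokens) - 1:
                expanded.append("Street")
            else:
                expanded.append(token)
        else:
            expanded.append(ABBREVIATIONS.get(token, token))
    return " ".join(expanded)


if __name__ == "__main__":
    for sample in ["St Charles Ave", "Main St", "Cll Ocho", "RM 12"]:
        print(sample, "->", expand_name(sample))
```

A position rule like this will get some names wrong, which is exactly why the real thing needed the structured fields in the TIGER data rather than guessing from the label.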

Imagery preset updates


Related to the road name tiles, I noticed that Potlatch2 had a TIGER road name tiles preset in the "Background" menu but it was still pointing at tiles with 2011 data in them. Thanks to the magic of github, I was able to get that fixed easily.

Then I noticed that the JOSM imagery sources page didn't have this tile set listed at all. That one was a little trickier to fix. It involved coming up with a simplified polygon of the US border to define the available extent of the tile layer, as well as putting a base64-encoded image into an XML tag on the JOSM wiki. The image+base64+XML+wiki thing seems a little nasty, but the JOSM functionality that this enables is actually pretty neat. If you download OSM data inside of the US, it will pop up a notice that informs you that there is additional background imagery available and lets you add it to the "Imagery" menu with a single click. The base64-encoded image is displayed next to the menu option. In this case it is the TIGER logo:
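For anyone curious what "a base64-encoded image in an XML tag" looks like in practice, here is a small Python sketch. The element name `icon` and the data-URI form are assumptions for illustration, not a statement of the exact format the JOSM wiki expects, and `icon_element` is a made-up helper name.

```python
import base64
import xml.etree.ElementTree as ET


def icon_element(image_bytes, mime="image/png", tag="icon"):
    """Wrap raw image bytes in an XML element as a base64 data URI.

    The tag name and data-URI layout are hypothetical; check the real
    schema before copying this anywhere.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    element = ET.Element(tag)
    element.text = f"data:{mime};base64,{encoded}"
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    # Three placeholder bytes standing in for a real logo file.
    print(icon_element(b"abc"))
```

In real use you would read the bytes from the logo file instead of passing a literal.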

RAWR!

TIGER tag removal


The TIGER road import put some tags on imported objects that weren't really necessary. As people edit roads, some of the tags become even less useful as the information in the TIGER tags diverges from the "normal" OSM tags. Some of the tags were useful at the time for the upload process itself but are now just annoying bloat that confuses new users.

Back in July I started a mailing list thread about having editors automatically drop some of these tags like they already drop the created_by tag. Out of that discussion came a JOSM trac ticket and patch that does exactly this. Right before uploading, JOSM checks to see if any of the modified objects it is about to upload have "discardable" TIGER tags on them and removes them.
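The idea is simple enough to sketch in a few lines of Python. This is not the JOSM patch (that one is Java), just an illustration of the check-and-drop step. The object layout (dicts with `modified` and `tags`) and the function name are made up for the example, and the key list is my assumption of a plausible set of discardable keys rather than the definitive one.

```python
# Assumed set of discardable keys, for illustration only.
DISCARDABLE_KEYS = {
    "created_by",
    "tiger:upload_uuid",
    "tiger:tlid",
    "tiger:source",
    "tiger:separated",
}


def strip_discardable(objects):
    """Remove discardable tags from modified objects, in place.

    Untouched objects are left alone so the upload only carries tag
    removals for things the user already changed. Returns the number
    of tags removed.
    """
    removed = 0
    for obj in objects:
        if not obj.get("modified"):
            continue
        doomed = [key for key in obj["tags"] if key in DISCARDABLE_KEYS]
        for key in doomed:
            del obj["tags"][key]
            removed += 1
    return removed


if __name__ == "__main__":
    ways = [
        {"modified": True,
         "tags": {"highway": "residential", "tiger:tlid": "123"}},
        {"modified": False,
         "tags": {"highway": "service", "tiger:source": "tiger_import"}},
    ]
    print(strip_discardable(ways), ways)
```

The important design point is that nothing gets touched unless the user already modified it, so the cleanup rides along with real edits instead of generating changes of its own.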

In September, I opened a ticket against Potlatch2 with the same feature request. Since nothing had happened with that ticket and I already had the Potlatch2 repository cloned from the tile URL change pull request, I decided to take matters into my own hands. I had never so much as glanced at ActionScript, but fortunately P2 was a breeze to get running in a development environment. The developer documentation page on the wiki got me up and running in a matter of minutes, although I did have to give ant more PermGen space to get through compilation. I should probably update the wiki with that tidbit...

After a little over an hour I had another pull request fashioned which Richard was kind enough to merge the next day. I believe this went live on osm.org within the last few hours. So now every way that is touched during Operation Cowboy will help to clean this clutter out of the database without the user having to do anything.

YEEEEHAW!


So that's how I prepared for Operation Cowboy. Unfortunately, the actual event will take place over the Thanksgiving weekend, which means I will be in Nebraska with very limited internet access, so I won't be able to participate much in the actual mapping. I will probably manage to hang out in the IRC channel a bit to try and field some questions if they come up. Please join me! (#opc2012 on irc.oftc.net)


Answer key for TIGER abbreviations:
Brg = Bridge
Cll = Calle (Spanish)
Natl Lkshr = National Lakeshore
Ofc = Office
Pso = Paseo (Spanish)
RM = Ranch to Market Road
BIA Rte = Bureau of Indian Affairs Route

Tuesday, October 23, 2012

Licensed to Map (What happened to Los Angeles?!)

I recently got back from the State of the Map - USA conference. It was great and you should have been there! But for those who weren't... I presented one session on Saturday afternoon with the same title as this post. Instead of just putting my slides out there I thought I would write a blog post to tell more of the story. This is a fairly long post but don't worry, it has a lot of pictures! (click to see full resolution)

So, the license change. It happened. We lost some data. But what happened before that to try and save as much as we could? And what exactly did we lose?

First, a brief timeline:
  • Many moons ago, the OSM Foundation voted to change our license from Creative Commons to the Open Database License. This actually happened before I knew that OSM existed.
  • In order to make the change, permission had to be secured from everyone who had contributed map data to the database.
  • Contributions from anyone who did not agree to the new terms had to be removed from the database.
  • At the end of 2011 the board set April 1st, 2012 as a target date for the license switch. This turned out to be wildly optimistic but at least it was a concrete goal to shoot for instead of just an ongoing process with no end in sight.
  • The process of removing non-relicensable data was done by a bot that went through the entire database in the second half of June.
  • After a little more cleanup, the first ODbL planet file was finally delivered in September.
Even though the April 1st target date was not really reasonable to hit (especially in hindsight, of course), it still gave the community something to work towards. And work we did.

Contacting inactive mappers

The first priority was to contact undecided users to make them aware of the change and that they needed to log in and indicate a decision. The foundation sent out several rounds of emails to undecided users, trying to get them to respond. In addition to those emails, users actively contacted other people in their area to try and make them aware of the decision. Lastly, the foundation supplied the email addresses of some of the accounts with the most map data to a small group of volunteers for more targeted contact.

The result of this contact effort can be seen in this graph that I have been presenting since the beginning of the process:

The green line corresponds to the scale on the left while the red line follows the scale on the right. Of those who responded, it was a 99.5% landslide in favor of re-licensing their data. You can see that the people who were opposed to the change were very quick to log in and enter their decision. The big bumps in the green line clearly show when the mass emails were being sent out by the foundation.

Thursday, June 7, 2012

Making a location badge using OsmAnd and the MapQuest Open static map API

I'm about to embark on my second Biking Across Kansas tour. When I did it last year, I used Google Latitude to create a little "My Location" badge which I put here on my blog to let people track me across the state. The Google one looks like this:



However, at some point during the last year I noticed that OsmAnd added a feature that allows it to periodically ping a web service of your choosing with the current location of the phone. This allows me to build my own location publishing service.

I did a little searching for how other people have used this feature in OsmAnd and found a few people talking about feeding the data into a Google map display. The example in the OsmAnd wiki posts the location to a Google doc which then somehow magically updates a Google map display. No offense to Google but I'd rather use OSM based products. Fortunately the good people over at MapQuest Open provide a service that suits this situation: The static map API.
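The two halves of the badge can be sketched roughly like this in Python: pull the coordinates out of the request the phone sends, then turn them into a static map image URL. Everything specific here is an assumption for illustration: the `lat` and `lon` query parameter names, the placeholder endpoint, and the `center`, `zoom` and `size` parameters are stand-ins, so check the OsmAnd settings and the static map API documentation for the real names.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder endpoint, not a real static map service URL.
STATIC_MAP_ENDPOINT = "https://example.invalid/staticmap"


def parse_ping(url):
    """Extract (lat, lon) from a location ping URL.

    Assumes the phone reports its position in 'lat' and 'lon' query
    parameters; those names are hypothetical.
    """
    query = parse_qs(urlparse(url).query)
    return float(query["lat"][0]), float(query["lon"][0])


def badge_url(lat, lon, zoom=10, size=(300, 200)):
    """Build a static map image URL centered on the last known location."""
    params = {
        "center": f"{lat},{lon}",
        "zoom": zoom,
        "size": f"{size[0]},{size[1]}",
    }
    return f"{STATIC_MAP_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    lat, lon = parse_ping("http://example.invalid/ping?lat=38.5&lon=-98.25")
    print(badge_url(lat, lon))
```

The web service only has to remember the most recent coordinates; the badge itself is then just an image tag pointing at the generated URL.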

Sunday, March 11, 2012

Remapping using TIGER 2011

The license change mapping process I talked about in my previous post works well where there is still some clean data to go off of and you're just having to redo some work that a decliner did in splitting ways or adding some nodes to refine the geometry. But what about when an entire area is on the chopping block? I present the OSM Inspector view in Glendale, CA:



The area I worked on was just south of here and was actually even worse although I didn't get a "before" screenshot. Less yellow, more red. Here is what it looked like on Simon Poole's "badmap" rendering. Pretty much the entire road network is going to get nuked:


Most of this data is from a single, very prolific contributor who is not likely to agree to the new terms. It is also highly unlikely that anyone in the area is going to be able to redo all of his work before April 1st. So I reached for a bigger paintbrush: TIGER 2011.

Sunday, January 8, 2012

License Change Mapping

So far I have avoided saying anything about the OpenStreetMap license change on this blog. It is a very sensitive subject for a lot of people for a million different reasons. This is not a post about how right/wrong the license change is. I personally have agreed to the new terms but then again I don't really care what license my data gets redistributed under. While I prefer (in most situations) a share-alike clause, if we were switching to public domain instead of ODbL, I would probably still agree. Whatever.

But the license change is happening and we must prepare the map for it, ahead of the April 1 cutover date set by the license working group. Today I decided to take a look at what it would take to get I-70 in Kansas cleaned up, license-wise, and thought I'd share my findings. Note that this particular process may or may not apply to other areas. The interstate system is kind of special in that it was imported from a public domain source and has very long stretches that are identical except for tags that don't really matter. This means that there isn't really very much unique or unrecoverable information present. If there is a small section of road (say, a bridge) that is dirty, it can just be removed and replaced using the clean data that surrounds it, plus imagery to verify that it is indeed a bridge. However, whenever possible it is preferred to maintain the history of objects in OSM, so I don't want to just nuke and replace large sections of I-70. Plus, deleting and recreating ways gets more complicated because of the route relations that they may be a part of.

A majority of I-70 in Kansas is already clean but there is still a noticeable fraction of dirty objects. The goal is to end up with all objects being "clean" to re-license under ODbL. First task: find dirty objects. For this I use the "License Change" view in the OSM inspector tool created by Geofabrik: http://tools.geofabrik.de/osmi/?view=wtfe