In Pursuit of Big Data, Mexico City Mapathon Gamifies Crowdsourcing

Participants in a Mexico City mapathon helped to pin down the routes of a sprawling bus system.

Mexico City has over 1,500 ya-kinda-just-gotta-know-about-them bus routes making up one of the largest bus systems in the world. The buses, or “peseros,” are the primary mode of public transit in the city, accounting for 60 percent of Mexico City’s commutes. That’s 14 million rides a day. But nobody knows exactly where all of the buses go. There is no official route map. Pinning down the system and all its routes presents a daunting task, but this month, a group of collaborators achieved it by creating a citywide participatory mobile app game. They relied on crowdsourcing, cell phones and algorithms to make it happen.

The peseros are privately run; some are licensed, some aren’t. Since the 1980s, more and more lines have been popping up in organic response to demand. The city’s own website describes it like this: “[Peseros] currently serve the entire city but figuring them out can be complicated and is not always the most comfortable way to go. If you’re not used to getting around in big cities, it may be best to avoid using them.”

So 34 people from 14 organizations are trying to clean up this mess. The “collaborative table,” as they call themselves, are members of government bodies, nonprofits, think tanks, engineering firms and consultancies, who worked together for a year on their own time to solve the problem of Mexico City’s lack of data on its bus system. To accomplish what would otherwise be an expensive and time-consuming task for any one agency, they made it into a citywide participatory game. Mapatón CDMX (Mapathon, Mexico City) took place during the first two weeks of February and drew 3,594 participants.

Here’s how the game worked: Players downloaded an app and competed for money and other prizes, earning points by riding peseros while sharing their GPS data and entering route details. That information fed a nascent database built to map the system. It sounds simple, but the Mapatón relied on several clever algorithms to overcome the tricky problems that come with involving so many people in a complex task.

The biggest challenge was that three-quarters of the population of Mexico City doesn’t have a smartphone.

“We had to avoid the concentration of mappings where all the phones are,” says Humberto Fuentes of the city’s Laboratorio Para La Ciudad (Laboratory for the City), one of the organizing bodies. So they created an algorithm that adjusted which routes were worth the most points to get people out and across the city.

“We couldn’t just send citizens to map any place at any time. We had to establish priorities,” explains programmer Christian Guerrero.

To begin, routes in the neighborhoods where the team predicted the fewest smartphones were worth the most points. But the algorithm recalculated point values in real time, both to keep participants motivated and to uphold strategic priorities about what still needed to be mapped.

“If you have previous mappings in one place,” says Fuentes, “then that place starts being worth less points.” Likewise, the places that were not getting mapped increased in points. They also wanted to make sure they got the major transit nodes mapped since that data is likely the most useful later on to citizens attempting to navigate the city. So they developed an algorithm that translated sizes of transit nodes into enhanced point values.
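The article does not publish the scoring formula, but the three ideas described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name `route_points`, the `max_points` scale, the linear bonus for node size and the divide-by-mappings decay are all hypothetical, not the team's actual algorithm.

```python
def route_points(smartphone_share, previous_mappings, node_size=0, max_points=100.0):
    """Hypothetical point value for one route (illustrative only).

    smartphone_share  -- assumed fraction of residents with a smartphone (0 to 1)
    previous_mappings -- how many times the route has already been mapped
    node_size         -- assumed size of the transit node served, e.g. connecting lines
    """
    if not 0.0 <= smartphone_share <= 1.0:
        raise ValueError("smartphone_share must be between 0 and 1")
    if previous_mappings < 0 or node_size < 0:
        raise ValueError("counts must be non-negative")

    # Neighborhoods with fewer smartphones start out worth more.
    base = max_points * (1.0 - smartphone_share)
    # Bigger transit nodes earn a bonus (assumed linear here).
    node_bonus = 1.0 + 0.1 * node_size
    # Every previous mapping makes the route worth less.
    return base * node_bonus / (1 + previous_mappings)


fresh = route_points(smartphone_share=0.25, previous_mappings=0)
mapped_twice = route_points(smartphone_share=0.25, previous_mappings=2)
print(fresh, mapped_twice)
```

In this sketch an unmapped route never gains points outright; it simply holds its value while mapped routes lose theirs, so it rises in relative terms. A real system could add an explicit bonus for routes that stay unmapped.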


The varying quality of data that they received from participants presented another challenge. “A problem in crowdsourcing,” says Guerrero, “is that you never know what’s on the other end. You don’t know what type of or what quality phone the person has.” Low-quality phones produce low-quality GPS data. So they implemented an open-source algorithm that cleans GPS data in real time. It snaps each GPS location point to a nearby street that follows roughly the same path.
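The article does not name the open source tool, so the following is only a simplified stand-in for the idea of attaching a noisy GPS point to a street. It assumes streets are straight segments in flat x/y coordinates (say, meters), and the names `snap_to_segment` and `snap_to_streets` are hypothetical. Real map-matching tools also weigh the direction and continuity of the whole trip.

```python
def snap_to_segment(point, start, end):
    """Return the closest position to `point` on the segment start-end."""
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return start  # degenerate segment: both ends coincide
    # Project the point onto the line, then clamp to stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_streets(point, segments):
    """Snap `point` to whichever street segment lies closest to it."""
    def dist_sq(q):
        return (q[0] - point[0]) ** 2 + (q[1] - point[1]) ** 2

    candidates = [snap_to_segment(point, a, b) for a, b in segments]
    return min(candidates, key=dist_sq)


streets = [((0, 0), (10, 0)), ((0, 10), (10, 10))]  # two parallel streets
print(snap_to_streets((5, 3), streets))
```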

Because of the nature of crowdsourcing, other algorithms were needed too: one corrected place names, and another detected strange behavior, such as someone starting a mapping session at home while playing with the app.
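How the team flagged strange behavior is not described, but one plausible check is whether a recorded trip ever leaves its starting point. The sketch below assumes a trace is a list of (latitude, longitude) pairs; the function names and the 100-meter threshold are hypothetical choices for illustration.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def looks_stationary(trace, min_spread_m=100.0):
    """Flag a trace whose points all stay within min_spread_m of its first point."""
    if len(trace) < 2:
        return True  # too little data to count as a ride
    lat0, lon0 = trace[0]
    return all(
        haversine_m(lat0, lon0, lat, lon) < min_spread_m
        for lat, lon in trace[1:]
    )


at_home = [(19.43260, -99.13320), (19.43261, -99.13321), (19.43259, -99.13320)]
on_a_bus = [(19.43260, -99.13320), (19.44260, -99.13320)]
print(looks_stationary(at_home), looks_stationary(on_a_bus))
```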

A slightly funnier problem: On peseros, “routes” aren’t exactly always fixed.

“Our system doesn’t always have the formality it should have. Sometimes the driver might say things like ‘Is anyone getting off at Chilpancingo?’ and if people say no, he might just change the route to get to the next stop faster. Or he might change his route just because of traffic,” explains Emanuel Hernández, a collaborator from Planeación y Desarrollo. This problem could not be solved with an algorithm. They solved it with simple repetition: If five people mapped the same route but one went haywire, they could chuck the outlier, assuming it was a detour.
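The team describes this as repetition rather than an algorithm, and nothing here is meant as their method. Still, the five-riders example can be made concrete with a small sketch. It assumes each ride is reduced to a set of street segment IDs and compares rides by Jaccard similarity (shared segments divided by all segments); the function names and the 0.5 threshold are hypothetical.

```python
def jaccard(a, b):
    """Share of street segments two rides have in common (1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_detours(rides, min_similarity=0.5):
    """Keep rides that, on average, resemble the other rides of the same route."""
    kept = []
    for i, ride in enumerate(rides):
        others = [r for j, r in enumerate(rides) if j != i]
        if not others:
            kept.append(ride)  # a single ride has nothing to be compared with
            continue
        average = sum(jaccard(ride, other) for other in others) / len(others)
        if average >= min_similarity:
            kept.append(ride)
    return kept


usual = {"a", "b", "c", "d"}
detour = {"a", "x", "y", "d"}  # same endpoints, different middle
rides = [usual, usual, usual, usual, detour]
print(len(drop_detours(rides)))
```

With four matching rides and one detour, the detour shares only a third of its segments with the others and is dropped, while the four regular rides are kept.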

In all, 2,632 rides were mapped. All of the data will be publicly available by the end of the month, in both raw and processed forms. At a hackathon planned for the coming weeks, the collaborators expect to use the data to generate visualizations, maps and mobile apps. “We are trying to make the data as versatile as possible,” says González.

And the platform itself is also versatile. It’s a common phenomenon in the developing world that public transit needs are met organically by the informal sector. Unknowable bus systems ensue. Mapatón might be a cost-effective model for mapping similar systems globally.