GTFS but for …

[Image: GTFS class diagram]

Jacob Baskin writes:

The Story of GTFS

GTFS is one of the biggest success stories in mobility data. In 2005, Chris Harrelson, a Google engineer, worked together with IT managers at TriMet, the transit agency for the Portland, Oregon metro area, to take an export of their schedule data and incorporate it into Google Maps to provide transit directions. The next step was adding transit in four more cities. Naturally, when Chris asked them to give him their transit data, he asked them to all provide it in the same format. In 2006, that format was enshrined as the Google Transit Feed Specification.

This original GTFS format was static: it represented where buses and trains were supposed to be according to the schedule. Since then, a lot of progress has been made on real-time transit vehicle location data, and standards have emerged, including a real-time counterpart, GTFS Realtime, now at version 2.0.

Given the success of GTFS, we want to know why so many other things are not standardized and openly available. This post summarizes the state to date of “GTFS but for.”



  • Curbs
    • “SharedStreets creates a structured language for the street, unlocking new ways of collecting, analyzing and sharing information. A shared language lets us exchange information about what’s really happening on our streets, breaking down barriers between the public and private sectors, and combining layers of data in new ways to make streets work better for people.”
    • While it lacks curb usage data, DDOT (Washington DC DOT) has open public street cross-sectional data.
  • Parking (on and off street)
    • This is related to curb data in the on-street sense, but would track utilization as well as capacity and legality. It would also include off-street data.
  • Traffic signal states (past, present, and scheduled/future)
    • “There is an ongoing challenge to get 20 signals in all 50 states by 2020 to broadcast the signal phase and timing. A lot of progress has been made & agencies are deploying well into the 100s of signals. Resources and info can be found at  ” – Patrick Son
    • Traffic Technology Services has a paid API for accessing this standard traffic signal data, which Audi uses for in-vehicle traffic light information. They claim 4,700 signals in the system currently. Some DOTs have feeds accessible with registration.
    • VDOT’s SmarterRoads open data includes signal phase and timing based on SAE J2735 for all state-controlled signals (which is most of Virginia). It also includes real-time HOT lane tolls for I-66 and much more.
  • Services
    • Taxis/ridesourcing
      • Some cities, e.g. New York, require taxi and ridesourcing companies to make data available. In other places, this is proprietary. Some companies are sharing selected data (Uber is sharing some data via Movement, as well as SharedStreets.)
    • Shared vehicles (bikes, ebikes, scooters)
      • Some of the shared bike and scooter companies make their data available. Others don’t. For instance, New York’s CitiBike data is available. There is a GBFS (General Bikeshare Feed Specification) standard. The trove of available data is collated at
      • DC also requires bike and scooter shares to provide public real-time information via an API, although the format varies. 
    • Shared vehicles (cars)
    • Mobility-as-a-Service – The City of Los Angeles has established a Mobility Data Specification. Transport for NSW has a proposed Specification for MaaS.
  • Traffic data (vehicle counts, turning movements, speeds, vehicle locations, etc.)
    • Various states have information like this, but it is not standardized across states, as far as I can tell. See, e.g., PeMS (California) or IRIS (Minnesota).
  • Real-time tolls, road prices.
    • There is no standardized feed type, though various agencies make this public.
  • EV charging stations and occupancy (queue length)
  • Logistics (open delivery services, physical internet)

There is, of course, some movement. The V2X community (vehicle-to-vehicle, vehicle-to-infrastructure, etc.) is setting standards, but they are neither widely deployed nor widely used, and their outputs are not freely available on the internet. The challenge to get 1,000 traffic signals “broadcasting” their state (locally and online) by 2020, out of the million or so out there in the US, shows the sluggishness of deployment.

The first issue is standardization. When the data is standard, applications can be built that suck it in, process it, and provide useful outputs. No one has to reinvent the data filter for every distinct agency.
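
To make the point concrete, here is a minimal sketch in Python of what standardization buys an application developer. It relies only on the field names that GTFS defines for stops.txt (stop_id, stop_name, stop_lat, stop_lon); the two agencies, their stops, and the helper name load_stops are made up for illustration.

```python
import csv
import io


def load_stops(stops_txt: str) -> dict[str, tuple[str, float, float]]:
    """Parse the text of a GTFS stops.txt file into {stop_id: (name, lat, lon)}.

    Hypothetical helper for illustration; it reads columns by name, so
    column order and extra optional columns do not matter.
    """
    reader = csv.DictReader(io.StringIO(stops_txt))
    return {
        row["stop_id"]: (
            row["stop_name"],
            float(row["stop_lat"]),
            float(row["stop_lon"]),
        )
        for row in reader
    }


# Two invented agencies. Their files differ in column order and in which
# optional columns they include, but both follow the same specification.
AGENCY_A = """stop_id,stop_name,stop_lat,stop_lon
A1,Main St & 1st Ave,45.5200,-122.6800
A2,Main St & 5th Ave,45.5210,-122.6750
"""

AGENCY_B = """stop_lon,stop_lat,stop_name,stop_id,zone_id
-77.0300,38.9000,Union Square,B7,1
"""

# The same parser handles both feeds with no agency-specific code.
stops_a = load_stops(AGENCY_A)
stops_b = load_stops(AGENCY_B)
```

Without a shared format, each of these agencies would need its own parser, and so would every agency after them.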

The second issue is openness. The data needs to be easily accessed. The traffic signal data may exist, but, as far as I can tell, there is no single open place where one can go and grab it all.

Some providers might value incompatibility or secrecy for their data, especially parking vendors who are in competition. From a societal perspective, all of this information should be freely available: both gratis (free as in at no cost) and libre (free to use in any interesting way). Making these data available in a standard format should be a quid pro quo for a license to operate a parking facility, a taxi or shared vehicle, or a toll road.

What else should there be a “GTFS” for? How do we get from here to there? What other initiatives out there show promise?