Some work we have done at TransportLab at the University of Sydney.
Some work we have done at TransportLab at the University of Sydney.
Some work we have done at TransportLab at the University of Sydney.
Some work we have done at the University of Sydney’s TransportLab on Network Growth in Sydney:
In binary networks, the focus is on whether or not a connection between two nodes exists. However, not all links (or nodes) are created equal, particularly when it comes to transportation networks. When we know about the presence of a link as well as the strength of that link, it is called a valued network. For instance when traveling from A to B in a street network, there is usually discontinuity in street type. In other words, one might move from a local street to a collector road to an arterial road and then back to a collector before reaching their destination. While engineers know this sort of differentiation as functional classification, it is also referred to as hierarchy.
Hierarchy, which is embedded in many natural and societal systems such as biologic cells and the Internet, is a common transportation complexity that requires a more complicated topological analysis (Tomko, Winter, & Claramunt, 2008). Typical topological measures such as Degree or Betweenness can be useful in helping understand network hierarchy, particularly in tree-like networks; however, such measures would fail to properly distinguish between streets in a gridded street network. In the above version of Metropolis’ street network, the major streets are represented by thicker lines and easily discerned, even in a gridded network (Fleischer, 1941). Using the basic set of topological metrics, we would have no idea that 8th Street is functionally different from 7th Street or F Street from D Street. These metrics fail to consider attributes – such as urban design, number of lanes, active transportation infrastructure, adjacent land uses, and speed – beyond network structure and would not necessarily be able to distinguish such streets.
Working with hierarchical networks often involves dividing networks in multiple layers or tiers. Measurements of heterogeneity have also become common proxies for characterizing hierarchy. To identify heterogeneity among street segments, researchers have used entropy measures as well as discontinuity measures (Xie, 2005). Discontinuity, for example, does not necessarily denote a disconnected network; rather, the reference is to the discontinuity in moving from one street type to another. If we sum the number of times a traveler goes from one type of street to another while traveling along a shortest path route, we find the trip discontinuity. Dividing that number by the length of the trip gives us the relative discontinuity (Parthasarathi, 2011). Other simplistic hierarchy measures calculate the relative percentage of a particular type of road. For instance, we might divide the number or length of arterials by the total number or length of roads to find the relative percent arterials (Parthasarathi, 2011).
Interestingly, it is not uncommon for large-scale transportation models to delete streets on the lower end of the hierarchical spectrum (i.e. local streets) for the sake of computational efficiency. Yet, removing such streets creates a bias against more connected networks because less connected networks typically need to be supported by major streets with more capacity than would be needed in more connected networks (Bern & Marshall, 2012). Some topological researchers – where the focus should be on understanding the full network – unfortunately reach the same conclusion: “urban streets demonstrate a hierarchical structure in the sense that a majority is trivial, while a minority is vital” (Jiang, 2009). If we only care about vehicle traffic flow, such statements may be true. However, my previous street network research confirms that understanding the full network holds the key to pushing toward improved safety, increased active transportation, and better environmental and health outcomes (Bern & Marshall, 2013; Marshall & Garrick, 2009, 2010a, 2010b, 2012).
Urban planners and engineers have long been interested in measuring street connectivity and typically do so with relatively simple measures such as the link to node ratio (called the Beta Index in the Transport Geography field), which divides the total number of links (i.e. road segments between intersections) by the total number of nodes (i.e. intersections including dead ends). In the above image, the connected network has link to node ratio of 1.6 while the dendritic network’s link to node ratio is 1.0 (a link to node ration of 1.4 is typically considered a well-connected street network).
The connected node ratio divides the number of connected nodes (i.e. nodes that are not dead ends) by total number of nodes (Handy, Paterson, & Butler, 2003). The networks above have a connected node ratio of 1.0 and 0.6, respectively. The underlying intent is distinguish between well-connected or gridded street networks and dendritic, treelike networks – as highlighted in the figure above – in researching relevant issues such as travel behavior, road safety, VMT, and public health outcomes.
Topology takes a slightly different approach to understanding this issue. The Meshedness Coefficient, for instance, measures connectivity by looking at the number of cycles in the network with respect to the maximum number of cycles (a cycle is a closed path that begins and ends at the same node with no fewer than three links). A Meshedness Coefficient of 0 represents full tree structure (i.e. no cycles), and 1 represents complete connectivity (i.e. every node is directly connected to every other node, which is not feasible in a large surface transportation network) (Buhl et al., 2006). In non-planar networks, this measure is also known in Transport Geography as the Alpha Index. The Alpha for the connected network above is 0.4 and for the dendritic network, it is just 0.03. For large networks, Beta and Alpha are highly correlated.
Xie and Levinson (2007) developed another useful metric called Treeness. Instead of counting the number of cycles, Treeness is instead calculated by dividing the length of street segments not within a cycle by the total length of street segments. The Treeness measure also provides a value between 0 and 1, but in this case, the higher number represents a more treelike or dendritic network (Xie & Levinson, 2007).
Networks with good overall connectivity are called integrated networks. Networks with low connectivity are called fractured networks (although fractured networks can still be comprised of connected components). Again, these measures relate to issues of resilience. When a single node failure can significantly erode network functionality, the system is fragile. The image below shows a fallen tree in Lake Oswego, OR that cut off more than 50 families from the outside world (or more specifically, the cars of more than 50 households were trapped) (Florip, 2010). If only that network had a little less Treeness.
When we have nodes or links with high Betweenness values, it is often because our network is split into various sub-groups that can be called clusters. Clusters tend to have their own unique set of properties, so it is useful to be able to identify clusters quantitatively.
While there are a growing number of clustering algorithms, the basic idea behind them is to capture the degree to which nodes cluster. The Clustering coefficient, for instance, represents how likely is it that two connected nodes are part of a larger group of highly connected nodes. It can be calculated by dividing number of actual connections between the neighbors of a node (i.e. the nodes directly connected to the node in question) by the number of possible connections between these same neighboring nodes. For instance in the image above, the red node is the node of interest, and it has a Degree of 4. Those 4 neighboring nodes make 4 actual connections (i.e. the black lines in the figure on the right) but have 6 possible connections (i.e. the black lines plus the red dashed lines). Thus, the Clustering coefficient for the red node is 4 divided by 6 or 0.67.
The value represented by the Clustering coefficient ranges from 0 (i.e. no clustering) to 1 (i.e. complete clustering). If we are interested in the amount of clustering for an entire network, we average the Clustering coefficients for all of the nodes. Clustering tends to be higher in real-world networks than in random networks. So when a network becomes more centralized (i.e. a small percentage of nodes have high connectivity), the overall topology becomes more differentiated and clusters begin to emerge.
Other related terms include component and clique. When a given sub-group of nodes is also highly connected, that is called a component. When the nodes in a component have few connections to other nodes outside of the component, that is a clique. Understanding clusters, components, and cliques in networks can be useful because they can hold more influence over behavior than overall network structure (Neal, 2013). Imagine, for instance, a New Urbanist neighborhood with great street connectivity set into a city with poor overall street connectivity. Analyzing network structure for the overall city might lead us to one conclusion; yet, we could find very different outcomes in the New Urbanist neighborhood. While factors such as land use, street design, and demographics influence transportation-related outcomes as well, the concept of clustering holds value for those interested in truly understanding transportation networks.
Degree is often good for measuring local circumstance, but adequately characterizing centrality is a bit more complicated. When trying to figure out centrality in terms of how connected and influential a node or link is, it is useful to get a sense of relative network flow through a particular node or link.
Betweenness measures attempt to capture this relative flow by quantifying the number of times a node or link is on a shortest path between two other nodes. The first step would be to calculate the shortest path between every origin and every destination. Next, we count the number of times that a particular node or link shows up on a shortest path. The resulting number represents the relative role of a node or link as a connector between clusters of nodes or links. In the above street network, the intersection highlighted in red must be included in over half of the shortest paths. We call this count Betweenness, which is essentially an attempt to quantify how necessary a node or link is to get from one side of the network to the other. The Panama Canal, for instance, is a key maritime link connecting the Atlantic and Pacific Oceans. Without it, ships would have to route around Cape Horn at the southernmost tip of Chile or through the Straits of Magellan. For a ship traveling from New York to San Francisco, the Panama Canal – due to its high Betweenness value – cuts more than 7,500 miles from the journey. In terms of other transportation issues, Betweenness usually relates to metrics such as accessibility and traffic congestion.
In addition to revealing relative importance, Betweenness also indicates how irreplaceable a node or link may be to a network. In other words, what happens if we remove a certain node or link from the network? Very high betweenness values can indicate a critical connection between various groups of nodes or links. In some cases, this represents a vulnerability where we would want to add redundancies to the network.
In transportation networks, if we assume all travelers take the shortest path and treat each traveler as having a unique origin and destination, Betweenness is the same as the flow (number of travelers) on the link. We call this Flow-weighted Betweenness.
Centrality measures help gauge the overall importance of a node. In other words, how connected and how influential is a node within the overall network?
One of the simplest measures of centrality is Degree, which measures the number of connections between a node and all other nodes. For instance if we are considering a street network with intersections as nodes, a nodal Degree of 4 would indicate a typical 4-way intersection.
The image above depicts a rendition of the Metropolis street network with a Degree value shown at each intersection and a 4-way intersection highlighted in red (Fleischer, 1941). When we focus on what is happening at one particular node, it is called the ego network (in that we are looking at the network from the perspective of a single node while ignoring all nodes not directly connected, which can be deemed a bit narcissistic). The entire network can be called the complete, whole, or global network. So if we want an overall Degree measure, we can calculate Average Degree, which is the average number of connections for all the nodes within the overall network. When the Average Degree exceeds 1, every node has at least one connection, on average. When the Average Degree approaches log(n), where n equals the number of nodes in the network, every node starts to become accessible from every other node (Neal, 2013). For the Metropolis network, there are 78 nodes with an Average Degree of 3.4.
Analyzing Degree measures for a complete network also entails generating a Degree Distribution, which literally equates to the plotting the frequency of each Degree for all the nodes as shown in the image below for the Metropolis street network. The idea is to try to capture the relative differences in connectivity between the nodes in order to gain a sense of network structure. For instance, every node in a homogenous network would have the exact same number of connections and not much of a distribution. A more centralized network might have one node with a high Degree value and all other nodes with low Degree values.
Networks play a role in nearly all facets of our daily lives, particularly when it comes to transportation. Even within the transportation realm lays a relatively broad range of different network types such as air networks, freight networks, bus networks, and train networks (not to mention the accompanying power and communications networks). We also have the ubiquitous street network, which not only defines how you get around a city, but it provides the form upon which our cities are built and experienced. Cities around the world that are praised for having good street networks come in many different configurations ranging from the medieval patterns of cities like Prague and Florence, to the organic networks of Boston and London, and the planned grids of Washington D.C. and Savannah, Georgia. But how do researchers begin to understand and quantify the differences in such networks?
The primary scientific field involved with the study of shapes and networks is called topology. Based in mathematics, topology is a subfield of geometry that allows one to transform a network via stretching or bending. Under a topological view, a network that has been stretched like a clock in a Salvador Dali painting would be congruent with the original, unstretched network. This would not be the case in conventional Euclidean geometry where differences in size or angle cannot be ignored. The transportation sector typically models networks as a series of nodes and links (Levinson & Krizek, 2008). The node (or vertex) is the fundamental building block of the model; links (or edges) are not independent entities but rather are represented as connections between two nodes. Connectivity – and the overall structure of the network that emerges from that connectivity – is what topology is all about. In other words, topology cares less about the properties of the objects themselves and more about how they come together.
For instance if we look at the topology of a light rail network, the stations would typically be considered the nodes and the rail lines would be the links. In this case, the stations are the actors in the network, and the rail lines represent the relationships between the actors (Neal, 2013). Those relationships – and more specifically, those connections – embody what is important. Taking a similar approach with a street network, we might identify the intersections as the nodes and the street segments as the links as shown in the network based on an early version of Metropolis above (Fleischer, 1941). For most street networks, however, the street segments are just as important as the intersections, if not more so. The ‘space syntax’ approach takes the opposite (or ‘dual’) approach with street networks: the nodes represent the streets, and the lines between the nodes (i.e. the links or edges) are present when two streets are connected at an intersection, as shown below using the same Metropolis network (Jiang, 2007).
Initial theories related to topology trace back to 1736 with Leonhard Euler and his paper on the Seven Bridges of Königsberg. Graph theory based topological measures first debuted in the late 1940s (Bavelas, 1948) and were initially developed in papers analyzing social networks (Freeman, 1979) and the political landscape (Krackhardt, 1990). Since then, topological analyses have been widely adopted in attempting to uncover patterns in biology (Jeong, Tombor, Albert, Oltvai, & Barabasi, 2000), ecology (Montoya & Sole, 2002), linguistics (Cancho & Sole, 2001), and transportation (Carvalho & Penn, 2004; Jiang & Claramunt, 2004; Salingaros & West, 1999). Topology represents an effort to uncover structure and pattern in these often complex networks (Buhl et al., 2006). The topological approach to measuring street networks, for instance, is primarily based upon the idea that some streets are more important because they are more accessible, or in the topological vernacular, more central (Porta, Crucitti, & Latora, 2006). Related to connectivity, centrality is another important topological consideration. A typical Union Station, for example, is a highly central (and important) node because it acts as a hub for connecting several different rail lines. Some common topological measures of centrality include Degree and Betweenness, which we will discuss in more detail in subsequent sub-sections.
There are also some peculiarities worth remembering when it comes to topology.
When thinking about the Size of a network, our first inclination might be measures that provide length or area. In topological terms, however, Size refers to the number of nodes in network. Other relevant size-related measures include: Geodesic Distance (the fewest number of links between two nodes); Diameter (the highest geodesic distance in a network); and Characteristic Path Length (the average geodesic distance of a network).
Density is another tricky term in the topological vernacular. When talking about the density of a city, we usually seek out measures such as population density, intersection density, or land use intensity. In most cases, these metrics are calculated in terms of area (e.g. per km2). In topology, however, Density refers to the density of connections. In other words, the density of a network can be calculated by dividing the number of links by the number of possible links. Topologically, the fully-gridded street networks of Portland, OR and Salt Lake City, UT are essentially the same in terms of Density; with respect transportation and urbanism, however, there remain drastic functionality differences between the 200’ (~60m) Portland blocks and the 660’ (~200 m) Salt Lake City blocks.
As illustrated with the Portland/Salt Lake City example, one limitation of topology is that it ignores scale. However, this can also be an advantage. For instance, Denver might be much closer to Springfield, IL than Washington, DC as the crow flies, but a combination of several inexpensive options for direct flights to DC and relatively few direct flight options for Springfield mean that DC is essentially closer in terms of network connectivity. Topology captures such distinctions by focusing on connectedness rather than length.
While topological analyses such as the above are scale-free, we also need to be careful about use of this term because scale-free networks are not equivalent to scale-free analyses. In topological thinking, scale-free networks are highly centralized. More specifically, if we plot the number of connections for each node, the resulting distribution for what is known in topology as a scale-free network would resemble a Power law distribution with some nodes having many connections but most having very few. A hub-and-spoke light-rail system, for instance, tends to exhibit scale-free network qualities with relatively few stations connecting many lines. The nodes in a random network, on the other hand, tend to have approximately the same number of connections. For instance when we define the intersections of a street network as the nodes and the segments as the links, the results tends towards a random network. If we flip the definition again, so that the streets are the nodes and the intersections the links, we trend back towards a scale-free network (Jiang, 2007; Jiang & Claramunt, 2004).
One reason to look at connectivity in these terms has to do with the critical issues of resilience and vulnerability. In general, robustness is associated with connectivity. When we have good connectivity, removing one node or link does not make much of a difference in terms of overall network performance. In contrast, scale-free networks are more susceptible to strategic attacks, failures, or catastrophes. However, as shown in a recent paper about urban street network topology during a Zombie apocalypse, good connectivity could actually be a double-edged sword (Ball, Rao, Haussman, & Robinson, 2013).
Most recent route choice models, following either the random utility maximization or rule-based paradigm, require explicit enumeration of feasible routes. The quality of model estimation and prediction is sensitive to the appropriateness of the consideration set. However, few empirical studies of revealed route characteristics have been reported in the literature. This study evaluates the widely applied shortest path assumption by evaluating routes followed by residents of the Minneapolis—St. Paul metropolitan area. Accurate Global Positioning System (GPS) and Geographic Information System (GIS) data were employed to reveal routes people used over an eight to thirteen week period. Most people did not choose the shortest path. Using three weeks of that data, we find that current route choice set generation algorithms do not reveal the majority of paths that individuals took. Findings from this study may guide future efforts in building better route choice models.