Wednesday

Introduction to network theory

Nexus: Small Worlds and the Groundbreaking Science of Networks, Mark Buchanan, 2002

Network theory has become a buzzword, and people are hoping it will bring breakthroughs in fields as diverse as fashion trends, cell biology, river geography, and economics. Although people have been interested in various forms of networks for a long time, the tools to model them mathematically have only become available very recently. In Nexus, Buchanan traces the development of network mathematics from random graphs to complex networks.

Book summary
According to Buchanan, it all began with the search for a mathematical explanation of the small-world phenomenon and 'six degrees of separation.' Early theories were based on the idea that links in a network are placed randomly. (The US highway system is sometimes cited as looking much like a random network.) Obviously, this can't do justice to real-world social networks. I am much more likely to connect with my neighbour or someone with similar interests than with a random person anywhere in the world. However, models which are based purely on clustering (i.e. people only connect with others if they share an affinity - to a place, person, hobby etc.) can't account for small worlds. Under such models, Americans would be linked by hundreds of degrees of separation, not just six. The solution, it turns out, lies in two simple facts: 1) some people are more connected than others and 2) not all connections are the same. Some of the consequences have been explored in Malcolm Gladwell's 'The Tipping Point.' Most people are connected to their neighbours, while a few are connected to more 'distant' people. (See this graph.) As a result, networks are clustered, but the clusters are interlinked, providing the 'shortcuts' necessary to achieve six degrees of separation. These shortcuts between clusters are often weak links and are often provided by connectors, people who have many more connections to others than the average person.
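The effect of a few shortcuts can be illustrated with a small simulation. What follows is only a sketch, in the spirit of the Watts-Strogatz small-world model but with shortcuts added rather than rewired. The network size (400), the number of neighbours (2 on each side) and the number of shortcuts (20) are arbitrary assumptions, and the function names are made up for this illustration.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Clustered 'neighbourhood' network: each node links to its k nearest
    neighbours on either side of a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def add_shortcuts(adj, count, rng):
    """Add `count` random long-range links between nodes not yet linked."""
    nodes = list(adj)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            added += 1

def average_separation(adj):
    """Mean shortest-path length over all connected pairs (breadth-first search)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(1)          # fixed seed so the run is repeatable
world = ring_lattice(400, 2)    # assumed size: 400 people, 4 neighbours each
before = average_separation(world)
add_shortcuts(world, 20, rng)   # assumed: only 20 long-range links
after = average_separation(world)
print(f"neighbours only: {before:.1f} steps, with 20 shortcuts: {after:.1f} steps")
```

With neighbours only, the average separation grows in proportion to the size of the ring. Every added shortcut can only shorten paths, and a handful is typically enough to reduce the average substantially while leaving the local clustering almost untouched.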

Buchanan shows how different restrictions on the growth of a network can lead to different network structures. If a network can grow without restraint, a 'rich get richer' effect can be observed. Any node in the network that has many links will have a larger chance of getting more links than a less-connected node. This often happens because choice is involved, not just random chance, and linking to a highly linked node carries benefits. (For example, an airline will be more attractive if it flies out of a large hub than out of a small airport, because passengers will have more options to connect to continuing flights.)
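A minimal sketch of this 'rich get richer' growth rule, usually called preferential attachment, might look as follows. The parameters (2000 nodes, 2 links per newcomer) and the function name are assumptions chosen for illustration, not figures from the book.

```python
import random

def preferential_attachment(n, m, rng):
    """Grow a network to n nodes. Each newcomer links to m distinct existing
    nodes, chosen with probability proportional to their current link count."""
    # Start from a small, fully connected core of m + 1 nodes.
    degree = {i: m for i in range(m + 1)}
    # Every node appears in this list once per link it has, so a uniform
    # draw from the list is a draw proportional to degree.
    endpoints = [i for i in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        for target in chosen:
            degree[target] += 1
            endpoints += [new, target]
        degree[new] = m
    return degree

rng = random.Random(7)  # fixed seed so the run is repeatable
degree = preferential_attachment(2000, 2, rng)
ranked = sorted(degree.values(), reverse=True)
top_fifth = ranked[: len(ranked) // 5]
share = sum(top_fifth) / sum(ranked)
print(f"best-connected node: {ranked[0]} links, typical node: {ranked[len(ranked) // 2]} links")
print(f"the top 20% of nodes hold {share:.0%} of all link ends")
```

The trick of drawing from a list of link endpoints is a design choice that avoids recomputing probabilities at every step. In runs of this sketch most nodes keep only the few links they arrived with, while a small number of early nodes turn into hubs.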

The beauty of network theory is that you can define links and nodes almost any way you like and observe similar effects. So if you define webpages as nodes and hyperlinks pointing to them as links, you will find (as Barabási and his colleagues at the University of Notre Dame did) that very few nodes have most of the links. This is much like the way a small part (say 20%) of a country's population owns most (say 80%) of its wealth. Or the way a few of the routers in the internet carry most of the traffic. Or the way a few enzymes are responsible for most of the chemical reactions in a cell. It's easier to understand all this if you can look at pictures.

These kinds of networks are very stable and can survive random failures very well. However, they are highly vulnerable to directed attacks. Remove a few of the highly connected nodes ("connectors" or "hubs") and the whole network breaks apart: the shortcuts are removed and only small, separate clusters remain. This makes the internet (originally designed to withstand attacks) look very vulnerable. On the upside, it also provides a new way to think about fighting viruses and bacteria.
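The difference between random failures and directed attacks can be sketched by growing a hub-dominated network and then removing nodes in two different ways. This is only an illustration: the size (2000 nodes), the 2 links per newcomer and the 10% removal rate are assumptions, and how quickly a network falls apart depends on those choices.

```python
import random

def grow_hub_network(n, m, rng):
    """'Rich get richer' growth: each newcomer links to m existing nodes,
    chosen with probability proportional to their current link count."""
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    endpoints = [i for i in adj for _ in adj[i]]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        adj[new] = set(chosen)
        for target in chosen:
            adj[target].add(new)
            endpoints += [new, target]
    return adj

def largest_component(adj, removed):
    """Size of the largest connected piece once `removed` nodes are gone."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

rng = random.Random(3)  # fixed seed so the run is repeatable
net = grow_hub_network(2000, 2, rng)
k = len(net) // 10      # assumed: 10% of nodes are lost in each scenario

random_loss = set(rng.sample(sorted(net), k))
hub_loss = set(sorted(net, key=lambda v: len(net[v]), reverse=True)[:k])

after_random = largest_component(net, random_loss)
after_attack = largest_component(net, hub_loss)
print(f"largest piece after random failures: {after_random} of {len(net) - k} nodes")
print(f"largest piece after removing hubs:   {after_attack} of {len(net) - k} nodes")
```

Both scenarios remove the same number of nodes. The only difference is which nodes are chosen, which is what makes the comparison meaningful.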

If, however, there is a restriction on how many links any single node can have, the network will look very different. Once a few nodes have reached their maximum number of links, the network will look more like a random network again. Consider airline routes. Most airlines want to fly out of the largest hubs - these hubs will grow, gaining more and more connections until the airports reach their capacity limits. Delays and long transfer times reduce the attractiveness of the main hubs (e.g. Chicago O'Hare). Other airports then have a better shot at competing for the airlines' business. The chances of any single airport gaining more links become closer to equal again - a condition for random networks.
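To see what a capacity limit does, the same growth rule can be run with and without a cap on the number of links per node. This is a sketch under assumed numbers (2000 nodes, 2 links per newcomer, a cap of 8 links); the `cap` parameter is a hypothetical stand-in for something like airport capacity.

```python
import random

def grow(n, m, rng, cap=None):
    """Preferential attachment with an optional cap on links per node.
    Assumes cap > 2 * m, so that unsaturated nodes are always available."""
    degree = {i: m for i in range(m + 1)}
    endpoints = [i for i in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            target = rng.choice(endpoints)
            # A node that has reached its cap turns newcomers away.
            if cap is None or degree[target] < cap:
                chosen.add(target)
        for target in chosen:
            degree[target] += 1
            endpoints += [new, target]
        degree[new] = m
    return degree

unlimited = grow(2000, 2, random.Random(5))
limited = grow(2000, 2, random.Random(5), cap=8)

print("largest hub without a cap:", max(unlimited.values()))
print("largest hub with a cap:   ", max(limited.values()))
print("nodes at the cap:         ", sum(1 for d in limited.values() if d == 8))
```

Both runs create the same number of links. With the cap in place, the links that the biggest hubs would have attracted are spread over many mid-sized nodes instead, so no node stands far above the rest.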

So how does this apply to innovation networks?
Consider a network where the nodes are regions and the links are firms that choose to locate in a region. For the IT industry, this network's largest hub would be Silicon Valley. If the network is scale-free (which roughly means that it follows a 'rich-get-richer' pattern), then Silicon Valley will grow indefinitely, attracting more and more firms, while other locations don't have a chance to catch up.

Various studies of regional economics have shown that industrial clusters provide benefits to the companies located there - from external economies of scale to knowledge spill-overs and relational assets. As a successful cluster grows, these benefits increase, attracting more and more firms. This sounds a lot like a scale-free network. (I don't have any data on this, but for the sake of argument, I'll assume that my hunch is right.)

However, any industrial cluster, including Silicon Valley, has growth restrictions: land is limited; there is a limit to the number of people in any social network; there is a limit to the distance between firms that want to cooperate closely (a two-hour drive is reportedly the limit in Silicon Valley). There are also questions of efficiency - an agglomeration may still have capacity, but if it doesn't use its resources efficiently, it will also face restrictions on its growth. This is how I would incorporate the differences between Route 128 and Silicon Valley into the model. On Route 128, capacity was still available, but inefficient use of social networks and knowledge spill-overs led to a slow-down in the region's growth, allowing Silicon Valley to take the lead - and benefit from the 'rich-get-richer' effect.

This would predict that as Silicon Valley reaches the limits of its capacity, the chances for other clusters to grow rise again. One could argue that increased outsourcing of manufacturing and programming abroad is a manifestation of Silicon Valley's slowing growth and of the increasing chances for other regions to become new hubs. One could also argue that the 'rich-get-richer' effect and the growth restrictions are much more relevant for innovation than for production. Depending on how these arguments play out, geographical dispersion and the growth of new clusters would be restricted to production or could include cutting-edge innovation.
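The thought experiment can be turned into a toy simulation, with the caveat that every number in it is invented. In this sketch, each new firm picks a region with probability proportional to the number of firms already there, and a region at capacity accepts no more firms. The 20 regions, the head start of 20 firms for the leading region, and the capacity of 150 firms are all hypothetical.

```python
import random

def locate_firms(start, newcomers, rng, capacity=None):
    """Each newcomer picks a region with probability proportional to the
    number of firms already there; regions at capacity are skipped."""
    firms = list(start)
    for _ in range(newcomers):
        weights = [f if capacity is None or f < capacity else 0 for f in firms]
        region = rng.choices(range(len(firms)), weights=weights)[0]
        firms[region] += 1
    return firms

# Hypothetical starting point: one region with a head start, 19 with one firm each.
start = [20] + [1] * 19

unlimited = locate_firms(start, 1000, random.Random(11))
limited = locate_firms(start, 1000, random.Random(11), capacity=150)

print("leading region, no capacity limit:", unlimited[0], "firms")
print("leading region, capacity of 150:  ", limited[0], "firms")
print("largest other region with limit:  ", max(limited[1:]), "firms")
```

This is nowhere near the serious modelling the argument would need. It only shows that the verbal model is consistent: once the leading region fills up, newcomers have to go elsewhere, and other regions start to benefit from the same 'rich-get-richer' effect.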

Any further discussion would require serious mathematical modelling, for which I have neither the data nor the skills at the moment. But after playing the model through in my mind, I am even more convinced that network models can help us understand geographical trends in economics, especially in services and certain high-technology industries, where physical resources and transportation play less of a role.
