I started reading about network statistics this morning and I wanted to experiment with the available Python tools. It looks like the state of the art is graph-tool. I had a little trouble installing the software on my Mac, but there is an evolving guide to this process here.
Introductory Terms
I found a great introductory resource, the Introduction to Social Network Methods by Robert A. Hanneman and Mark Riddle. In this post I have reproduced the KNOKI
data set they used throughout their book. The advantage of this data set is that it is reasonably small, complex, and widely used an example in other resources. Here are some definitions I compiled from Chapter 7, Connection and Distance:
- Out-degree statistics
- Calculated from the rows, tells us about sources of information.
- In-degree statistics
- Calculated from the columns, tells us about consumers of information.
- Density
- The number of connections relative to the number of possible connections.
- Reachability
- Binary, tells us if there is a path between two actors.
- Connectivity
- Integer, counts the number of nodes that have to be removed in order for one node to be disconnected from another.
- Distance
- Integer, count of the number of walks of a given number of hops, or minimum distance in a weighted graph.
- Geodesic distance
- The number of hops, or distance, of the shortest path between two nodes.
- Diameter
- The largest geodesic distance.
- Geodesic count
- There is one geodesic distance, but there may be many paths for that geodesic distance.
- Maximum flow
- The number of paths between a source neighborhood and a target.
A Code Example
The graph-tool documentation is simply extraordinary. I borrowed heavily from it. I’m not sure that there isn’t an easier way to define a graph, but my method below isn’t painfully tedious. First we import all of graph-tool, then we define a two dimensional array representing the Knoke Information data set.
from graph_tool.all import * knoki = [ [ 0, 1, 0, 0, 1, 0, 1, 0, 1, 0 ] , [ 1, 0, 1, 1, 1, 0, 1, 1, 1, 0 ] , [ 0, 1, 0, 1, 1, 1, 1, 0, 0, 1 ] , [ 1, 1, 0, 0, 1, 0, 1, 0, 0, 0 ] , [ 1, 1, 1, 1, 0, 0, 1, 1, 1, 1 ] , [ 0, 0, 1, 0, 0, 0, 1, 0, 1, 0 ] , [ 0, 1, 0, 1, 1, 0, 0, 0, 0, 0 ] , [ 1, 1, 0, 1, 1, 0, 1, 0, 1, 0 ] , [ 0, 1, 0, 0, 1, 0, 1, 0, 0, 0 ] , [ 1, 1, 1, 0, 1, 0, 1, 0, 0, 0 ] ]
Next we build the graph object,
N = len( knoki ) g = Graph() g.set_directed( True ) v = [ g.add_vertex() for i in range(N) ] for i in range(N): for j in range(N): if knoki[i][j] == 1: g.add_edge( v[i], v[j] )
Then we can visualize different things like the closeness of the nodes,
g = GraphView( g, vfilt=label_largest_component(g) ) c = closeness( g ) graph_draw( g, vertex_fill_color=c, vertex_size=prop_to_size(c, mi=5, ma=15), vcmap=matplotlib.cm.gist_heat, vorder=c, output="KNOKI_closeness.png", vertex_text=g.vertex_index, vertex_font_size=18 )
Maybe you would like to know that graph-tool has been included in homebrew, and the installation instructions you linked to are now outdated. It is now much simpler to install. See here
Also, if you have an adjacency matrix to begin with, you can populate your graph with the function Graph.add_edge_list().
Thanks Tiago! I’ll update this when I get a chance!