Higher-order analysis of temporal network data

Higher-order analysis of temporal network data

An educational tutorial using the free python module pyTempNets

Ingo Scholtes
Chair of Systems Design
ETH Zürich

May 2015 (Updated: August 2015)

In this tutorial, I will introduce some basic concepts of the analysis of temporal network data using higher-order networks, which have been introduced in the following recent publications:

  • I Scholtes, N Wider, A Garas: Higher-Order Aggregate Networks in the Analysis of Temporal Networks: Path Structures and centralities, [1508.06467], Aug. 2015
  • I Scholtes, N Wider, R Pfitzner, A Garas, CJ Tessone, F Schweitzer: Causality-driven slow-down and speed-up of diffusion in non-Markovian temporal neworks, Nature Communications, 5, Sept. 2014
  • R Pfitzner, I Scholtes, A Garas, CJ Tessone, F Schweitzer: Betweenness preference: Quantifying correlations in the topological dynamics of temporal networks, Phys Rev Lett, 110(19), 198701, May 2013

A key point of these works is that we highlight the importance of non-Markovian characteristics in time-stamped relational data. In a nutshell, we show that the ordering of links in time-stamped networks matters. We further introduce higher-order networks, a generalization of the common network perspective that allows to study the influence of order correlations in temporal networks. This new perspective on time-stamped network data provides interesting new opportunities not only to better understand dynamical processes on temporal networks, but also for the definition of novel measures for centrality or new community detection algorithms that take into account bot the topological and temporal dimension of complex systems.

In the following, I illustrate some basic concepts of non-Markovian characteristics in temporal networks. In particular, I will showcase the use of pyTempNet, a free python module which simplifies the analysis and visualization of time-stamped relational data, as well as the simulation of dynamical processes on top of temporal networks. All of the methods and concepts introduced in the publications listed above are fully implemented in pyTempNet thus making it particularly easy to apply the higher-order network framework to your data.

Installation of pyTempNets is easy. You will need a working installation of python3 properly set up. You will also have to have the modules numpy, scipy, matplotlib and igraph installed. This can be done easily by installing the Anaconda python 3.4 distribution, which you can get from here. Once you have installed Anaconda, you will just need to additionally set up igraph by invoking the following command:

> pip install python-igraph

You are then ready to install pyTempNets directly from gitHub by typing:

> pip install git+git://github.com/IngoScholtes/pyTempNets.git

After this, you should be ready to run this notebook.

Let us start with a new ipython notebook and let us import some basic modules required in the following tutorial. We first import the basic pyTempNet module as follows:

In [1]:
import pyTempNet as tn

Since time-aggregated networks are internally represented as igraph objects (which we can plot and analyze using standard igraph functions) we will also need to include igraph.

In [2]:
import igraph

Finally, we will want to plot some figures and do some basic calculations using numpy and matplotlib.

In [3]:
import numpy as np
import matplotlib.pyplot as plt

You won't have to care too much about the following imports and definitions, as they will simply allow us to directly embed igraph plots, Latex output and videos into the output of an ipython notebook.

In [4]:
from IPython.display import *
from IPython.display import HTML
import base64

from wand.image import Image

from subprocess import call

def showVideo(filename):
    VIDEO_TAG = """<video controls>
 <source src="data:video/x-m4v;base64,{0}" type="video/mp4">
 Your browser does not support the video tag.
</video>"""
    bytes = open(filename, "rb").read()
    return HTML(VIDEO_TAG.format(base64.b64encode(bytes).decode('utf-8')))

def compileAndShowTex(file):
    call("cd " + file.split('/')[0] + " && pdflatex " + file.split('/')[1], shell=True)
    with Image(filename=file.replace('.tex', '.pdf'), resolution=200) as img:
        img.resize(400, 400)
        img.save(filename=file.replace('.tex', '.png'))
        display(Image(filename=file.replace('.tex', '.png')))

Two simple examples

Let us introduce some basic concepts using two simple examples for temporal networks. We first create an empty temporal network object, which we can then use to add time-stamped links one by one. In each call, we pass the source and target node, as well as an integer time stamp indicating when the link was present.

In [5]:
t1 = tn.TemporalNetwork()
t1.addEdge('a', 'c', 1)
t1.addEdge('c', 'e', 2)

t1.addEdge('b', 'c', 3)
t1.addEdge('c', 'd', 4)

t1.addEdge('b', 'c', 5)
t1.addEdge('c', 'e', 6)

t1.addEdge('a', 'c', 7)
t1.addEdge('c', 'd', 8)

A nice way to visualize such networks are so-called time-unfolded notations, in which we unfold time into a spatial (vertical) dimension. pyTempNet allows you to directly generate tikz code for such a representation. We just call the following function:

In [6]:
tn.Visualizer.exportTikzUnfolded(t1, 'tutorial/t1.tex')

This will produce a LaTeX file which we can simply compile to obtain a PDF figure (this comes in handy for illustrative figures in manuscripts). We can either manually compile the file and convert the resulting PDF figure to a PNG, or we can do this automatically using our little helper function (which requires imagemagick and ghostscript as well as imagemagick's python interface Wand to be installed on your system).

With this helper function, we can easily display the resulting figure in ipython as follows:

In [7]:
compileAndShowTex('tutorial/t1.tex')

In this representation, each of the five nodes is represented by multiple temporal copies. Two subsequent temporal copies are connected whenever a link existed at the respective time stamp. This representation makes it easy to study so-called time-respecting paths.

Apart from time-unfolded representations, pyTempNet can also easily be used to produce videos of temporal networks. For this we can simply export a sequence of PNG movie frames using the following built-in function:

In [8]:
tn.Visualizer.exportMovie(t1,'tutorial\\video.mp4', fps=1)
2015-08-19 12:58:23 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:23 [Severity.INFO]	finished.
2015-08-19 12:58:23 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 12:58:23 [Severity.INFO]	finished.
2015-08-19 12:58:23 [Severity.WARNING]	No visual style specified, setting to defaults
2015-08-19 12:58:23 [Severity.INFO]	Encoding video ...
2015-08-19 12:58:24 [Severity.INFO]	finished.

This will internally generate a sequence of numbered frames within a subdirectory called frames. It will then automatically convert the resulting frames to an mp4 video using - for instance - the tool imagemagick. For this, make sure to install imagemagick for your platform.

We can now show the resulting video of the temporal network using the helper function defined above. Just click the play button.

In [9]:
showVideo('tutorial/video.mp4')
Out[9]:

Let us now study the extraction of so-called time-respecting paths in temporal networks. This crucial concept is the temporal equivalent to paths in static network. Different from static networks, links on a time-respecting path have to respect causality, i.e. in order for a path (u,v,t_1) -> (v,w,t_2) of two time-stamped links to connect nodes u and w, the first link has to be present before the second link, i.e. t_1<t_2. pyTempNet can automatically extract time-respecting paths of length two for you. Sometimes, you may also want to include a limit in the waiting time for time-respecting paths. I.e. rather than simply requiring t_1<t_2, you may want to impose that t_2-t_1 <= delta for some maximum waiting time delta.

The pyTempnet function extractTwoPaths can be used to extract time-respecting paths of length two. If no delta is specified, delta = 1 is assumed, i.e. only consecutive links will be considered to contribute to time-respecting paths.

Note that we can specify an arbitrary maximum time difference delta by using the function setMaxTimeDiff.

In [10]:
t1.setMaxTimeDiff(delta=1)
t1.extractTwoPaths()
2015-08-19 12:58:24 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:24 [Severity.INFO]	finished.

Depending on your data set, this function may require a few seconds (if you have several hundred thousand or even Millions of time-stamped links). Once it has been completed, we can start analyzing causality structures and non-Markovian characteristics in the temporal network. If you omit the explicit call above, the call will automatically be made whenever time-respecting paths are first needed.

Based on the statistics of time-respecting paths, we can calculate betweenness preference, a measure that captures to what extent nodes mediate time-respecting paths between particular pairs of their neighbours in the static network. Being an information-theoretic measure, it can be interpreted in terms of the number of bits of information that are lost, when we aggregate all time-stamped links around a node (thus obtaining a time-aggregated network which misses the information on the ordering of links). For details on the definition and interpretation of the measure, I have to refer to the publications mentioned above.

With pyTempNet we can calculate the betweenness preference of a node using the following function. In this particular case, the betweenness preference of node c should be zero, as there is no preference for node c to mediate time-respecting paths between any particular pair of nodes (see also temporal network above).

In [11]:
print('I^b(S;D) = ', tn.Measures.BetweennessPreference(t1,'c'))
I^b(S;D) =  0.0

We can now calculate different time-aggregated representations of our temporal network. The simplest and most commonly used one is the plain time-aggregated network, in which all time-stamps are discarded, while link weights indicate the number or frequency of interactions between nodes.

We can obtain an igraph object representing the (first-order) aggregate network as follows. This igraph object can be plotted using basic igraph functions.

In [12]:
# We can compute and plot the first-order aggregate network
g1 = t1.igraphFirstOrder()

visual_style = {}
visual_style["bbox"] = (600, 400)
visual_style["margin"] = 60
visual_style["vertex_size"] = 80
visual_style["vertex_label_size"] = 24
visual_style["vertex_color"] = "lightblue"
visual_style["edge_curved"] = 0.2
visual_style["edge_width"] = 1
visual_style["edge_arrow_size"] = 2

visual_style["layout"] = g1.layout_auto()
visual_style["vertex_label"] = g1.vs["name"]
visual_style["edge_label"] = g1.es["weight"]

igraph.plot(g1, 'tutorial/t1_G1.png', **visual_style)
display(Image(filename='tutorial/t1_G1.png'))

In the Nature Communications and EPJ B articles listed above, we introduced so-called higher-order networks which not only capture the frequency but also the ordering of time-stamped links. The idea is that each link in the time-stamped network is represented by a node in the second-order network, while each time-respecting path of length two is represented by a link. Again, details on this abstraction can be found in the papers above.

Here, we can compute (and plot) the second-order network for the temporal network above as follows:

In [13]:
g2 = t1.igraphSecondOrder()

visual_style["layout"] = g2.layout_auto()
visual_style["vertex_label"] = g2.vs["name"]
visual_style["edge_label"] = g2.es["weight"]
igraph.plot(g2, 'tutorial/t1_G2.png', **visual_style)
display(Image(filename='tutorial/t1_G2.png'))
2015-08-19 12:58:24 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 12:58:24 [Severity.INFO]	finished.

In the example above, we see that a total of four time-respecting paths of length two (represented by the four links) exist in the temporal network, each of them occurring exactly one time (thus the link weights of one).

Let us now see what happens if we simply flip the order of two time-stamped links in the temporal network. We do this by defining the following new temporal network (which is identical to the previous one except for one reordering).

In [14]:
t2 = tn.TemporalNetwork()
t2.addEdge('a', 'c', 1)
t2.addEdge('c', 'e', 2)

t2.addEdge('b', 'c', 3)
t2.addEdge('c', 'd', 4)

t2.addEdge('b', 'c', 5)
t2.addEdge('c', 'd', 6)

t2.addEdge('a', 'c', 7)
t2.addEdge('c', 'e', 8)

Again, we can produce a tikz figure showing the time-unfolded representation of this temporal network (which I again manually compiled and converted to a PNG file).

In [15]:
tn.Visualizer.exportTikzUnfolded(t2, 'tutorial/t2.tex')
compileAndShowTex('tutorial/t2.tex')

We now again extract all time-respecting paths of length two and then visualize the first-order network. Not surprisingly, this is exactly the same as before, since we only changed the ordering of time-stamped links.

In [16]:
t2.setMaxTimeDiff(delta=1)
t2.extractTwoPaths()

g1 = t2.igraphFirstOrder()

visual_style["layout"] = g1.layout_auto()
visual_style["vertex_label"] = g1.vs["name"]
visual_style["edge_label"] = g1.es["weight"]
igraph.plot(g1, 't2_G1.png', **visual_style)
display(Image(filename='t2_G1.png'))
2015-08-19 12:58:26 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:26 [Severity.INFO]	finished.
2015-08-19 12:58:26 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 12:58:26 [Severity.INFO]	finished.

As we will see now, there is an important change in terms of causality structures. Now, node c has a preference to mediate time-respecting paths between the pairs of nodes a and e as well as b and d respectively. Furthermore, knowing the "source" of a time-respecting path through c completely determines the target (source a determining target e and source b determining target d). Our uncertainity of two equally likely choices is reduced to one, which corresponds to a betweenness preference of one bit.

In [17]:
print('I^b(S;D) = ', tn.Measures.BetweennessPreference(t2,'c'))
I^b(S;D) =  1.0

We can also output the corresponding (unnormalized) betweenness preference matrix (again see paper for details). The entries provide us with the number of different time-respecting paths through node c. The first row corresponds to node a, the second row corresponds to node b, the first column corresponds to node e, the second column corresponds to node d. Here, the entries reveal that there are two time-respecting paths a -> c -> e (first row) and two time-respecting paths b -> c -> d (second row). Time-respecting paths a -> c -> d and b -> c -> e (off-diagonal zero entries) are absent.

In [18]:
print('I^b(S;D) = ', tn.Utilities.BWPrefMatrix(t2,'c'))
I^b(S;D) =  [[ 2.  0.]
 [ 0.  2.]]

The changes in the statistics of time-respecting paths that are due to the reordering of the time-stamped edges is captured by the fact that the second-order network is different from the one before (even though the first-order network is the same!).

In [19]:
g2 = t2.igraphSecondOrder()

visual_style["layout"] = g2.layout_auto()
visual_style["vertex_label"] = g2.vs["name"]
visual_style["edge_label"] = g2.es["weight"]
igraph.plot(g2, 'tutorial/t2_G2.png', **visual_style)
display(Image(filename='tutorial/t2_G2.png'))
2015-08-19 12:58:26 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 12:58:26 [Severity.INFO]	finished.

In the second-order network above, we see that now each of the two time-respecting paths (a,c) -> (c,e) and (b,c) -> (c,d)occurs twice, while the theoretically possible time-respecting paths (a,c) -> (c,d) and (b,c) -> (c,e) (which we would actually expect to occur with the same probability than the others) are absent!

You see that this change in the causal structures of the temporal network are exclusively driven by the ordering of time-stamped edges and it is these phenomena that we will now study in more realistic scenarios.

Reading from files: A larger synthetic example

We now consider a larger (synthetic) example which we read from a TEDGE file, a simple file format which contains a list of time-stamped links. TEDGE files have the format

`source target time
a b 4
b c 7
u v 3
u v 3
...`

The separation character (here: space) can be arbitrary and can be specified upon reading. Furthermore, the ordering of columns can be arbitrary and thus a header line indicating which column refers to source, target and timestamp of an edge is required. As you can see in the example above, time-stamped edges do not necessarily have to be ordered according to time. In addition to simple integer time stamps, actual (arbitrary) string timestamps like 2015-05-23 09:23:22 can be used. In this case a format string indicating the meaning of the timestamp has to be included.

Here, we use a simple example of integer time stamps, reading a file of time-stamped edges that were synthetically generated to include order correlations.

In [20]:
t = tn.readFile('data/tutorial/example.tedges', sep=' ')
2015-08-19 12:58:26 [Severity.INFO]	Reading time-stamped links ...
2015-08-19 12:58:26 [Severity.INFO]	finished.
2015-08-19 12:58:26 [Severity.INFO]	Building index data structures ...
2015-08-19 12:58:26 [Severity.INFO]	finished.
2015-08-19 12:58:26 [Severity.INFO]	Sorting time stamps ...
2015-08-19 12:58:26 [Severity.INFO]	finished.

We can actually print a summary of its basic properties by using the Summary() function.

In [21]:
print(t.Summary())
Nodes:			30
Time-stamped links:	13200
Links/Nodes:		440.0
Observation period:	[0, 13199]
Observation length:	13199
Time stamps:		13200
Avg. inter-event dt:	1.0
Min/Max inter-event dt:	1/1
Max Time Diff (delta):	1
Two-paths:		not calculated
First-order network:	not constructed
Second-order network:	not constructed

Let us now extract time-respecting paths of length two and visualize the first-order network.

In [22]:
t.setMaxTimeDiff(delta=1)
t.extractTwoPaths()

g1 = t.igraphFirstOrder()

visual_style = {}
visual_style["bbox"] = (600, 600)
visual_style["margin"] = 60
visual_style["edge_width"] = [x/100 for x in g1.es()["weight"]]
visual_style["vertex_size"] = 20
visual_style["vertex_label_size"] = 12
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_label"] = g1.vs["name"]
visual_style["edge_arrow_size"] = 0.3
visual_style["layout"] = g1.layout_auto()

igraph.plot(g1, 'tutorial/example_g1.png', **visual_style)
display(Image(filename='tutorial/example_g1.png'))
2015-08-19 12:58:26 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:26 [Severity.INFO]	finished.
2015-08-19 12:58:26 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 12:58:26 [Severity.INFO]	finished.

Importantly, the weighted time-aggregated network does not show any specific structure. However, due to the specific ordering of time-stamped edges in the temporal network, the second-order aggregate network can look completely different, meaning that only very specific time-respecting paths actually occur. Here, the second-order network shows three pronounced temporal communities that are not visible in the first-order network.

In [23]:
g2 = t.igraphSecondOrder()

visual_style = {}
visual_style["edge_arrow_size"] = 0.01
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_size"] = 5

igraph.plot(g2, 'tutorial/example_g2.png', **visual_style)
display(Image(filename='tutorial/example_g2.png'))
2015-08-19 12:58:26 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 12:58:26 [Severity.INFO]	finished.

Analyzing real-world data: Diffusion dynamics in temporal networks

Let us now come to a more interesting case, analyzing real-world relational data sets from a number of different contexts. The first data set that we will study covers time-stamped antenna-antenna interactions in an ant colony and was originally published in:

  • B Blonder, A Dornhaus: Time-Ordered Networks Reveal Limitations to Information Flow in Ant Colonies, PlosOne, 2011

Here, we read data on one particular colony (details in our papers above) as a temporal network.

In [24]:
t = tn.readFile('data/tutorial/ants-1-1_agg_6s_scc.tedges', sep=' ')
print(t.Summary())
2015-08-19 12:58:30 [Severity.INFO]	Reading time-stamped links ...
2015-08-19 12:58:30 [Severity.INFO]	finished.
2015-08-19 12:58:30 [Severity.INFO]	Building index data structures ...
2015-08-19 12:58:30 [Severity.INFO]	finished.
2015-08-19 12:58:30 [Severity.INFO]	Sorting time stamps ...
2015-08-19 12:58:30 [Severity.INFO]	finished.
Nodes:			61
Time-stamped links:	982
Links/Nodes:		16.098360655737704
Observation period:	[0, 239]
Observation length:	239
Time stamps:		201
Avg. inter-event dt:	1.195
Min/Max inter-event dt:	1/11
Max Time Diff (delta):	1
Two-paths:		not calculated
First-order network:	not constructed
Second-order network:	not constructed

Let us first consider the (first-order) time-aggregated network.

In [25]:
g1 = t.igraphFirstOrder()

visual_style = {}
visual_style["bbox"] = (600, 600)
visual_style["margin"] = 60
visual_style["vertex_size"] = 30
visual_style["vertex_label_size"] = 8
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_label"] = g1.vs["name"]
visual_style["edge_arrow_size"] = 0.3
visual_style["layout"] = g1.layout_auto()

igraph.plot(g1, 'tutorial/ants_g1.png', **visual_style)
display(Image(filename='tutorial/ants_g1.png'))
2015-08-19 12:58:30 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:30 [Severity.INFO]	finished.
2015-08-19 12:58:30 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 12:58:30 [Severity.INFO]	finished.

Above, we have briefly argued about the time-respecting paths that we would expect, considering that specific order correlations (i.e. betweennness preferences of nodes) are absent in the temporal network. Based on the first-order aggregate network we can actually compute the expected second-order network, under the assumption of a null model in which all interactions are independent from each other. For this, we can use the following pyTempNet function:

In [26]:
g2n = t.igraphSecondOrderNull()

visual_style = {}
visual_style["edge_arrow_size"] = 0.01
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_size"] = 10
visual_style["layout"] = g2n.layout_auto()

igraph.plot(g2n, 'tutorial/ants_g2n.png', **visual_style)
display(Image(filename='tutorial/ants_g2n.png'))
2015-08-19 12:58:31 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 12:58:31 [Severity.INFO]	finished.

We see that this expected network is rather densely connected, i.e. we expect many time-respecting paths of length two to exist between nodes. We can now compare this to the actual second-order network.

Here, we see that - due to the specific ordering of time-stamped links, the actual second-order network can look completely different, meaning that only very specific time-respecting paths actually occur. Here, the second-order network is much less connected than the expected one shown above.

In [27]:
g2 = t.igraphSecondOrder()

igraph.plot(g2, 'tutorial/ants_g2.png', **visual_style)
display(Image(filename='tutorial/ants_g2.png'))

This (visually) different second-order network shown above is a sign for the presence of non-Markovian characteristics which change causality in the temporal network. We can quantify the presence of these characteristics by calculating the entropy growth rate ratio between a first- and a second-order Markov model of the time-stamped edge sequence (see details in publications above). This value captures to what extend the actual temporal network differs from a Markovian case, in which the next edge in a time-stamped edge sequence is independent from the previous one.

Here, we observe a value that is strictly smaller than one, which signifies that the data set indeed included non-Markovian characteristics.

In [28]:
print("Entropy growth rate ratio is", tn.Measures.EntropyGrowthRateRatio(t))
Entropy growth rate ratio is 0.363798532795

We can next ask the question to what extent these characteristics influence dynamical processes on the temporal network. Here, we consider a diffusion process, modeled by the convergence behavior of a random walk process on the temporal network. With the next two functions, we can compute the time (i.e. number of random walk steps) required by a random walker to converge. Precisely, we measure the average number of steps after which the total variation distance between the visitation probabilities and the stationary distribution is smaller than a given threshold of epsilon.

In [29]:
speed_g2 = tn.Processes.RWDiffusion(t.igraphSecondOrder().components(mode="strong").giant(), epsilon=1e-6)
speed_g2n = tn.Processes.RWDiffusion(t.igraphSecondOrderNull().components(mode="strong").giant(), epsilon=1e-6)

print("Empirical slow-down factor for diffusion is", speed_g2/speed_g2n)
Empirical slow-down factor for diffusion is 2.214887640449438

For small epsilon (i.e. large times t) we empirically observe that non-Markovian characteristics slow down the diffusion process on average by a factor of approx. 2.05 (values will naturally differ for individual runs).

We can actually predict this analytically by means of a spectral analysis of the second-order network. The following function analytically calculates the expected factor by which a diffusion process in a temporal network is slowed down due to non-Markovian characteristics (and the resulting changes in the causality structures of the temporal network).

In [30]:
print("Analytical slow-down factor for diffusion is", tn.Measures.SlowDownFactor(t))
Analytical slow-down factor for diffusion is 2.05252345941

Here we expect diffusion in the temporal network to be slower by a factor of about 2.05 (compared to a Markovian temporal network in which no order correlations change causality). This is well in line with our empirical observation above.

Apart from predicting changes in diffusion speed, we can do some extended spectral analysis of the second-order network based on a normalized Laplacian matrix. This analysis confirms that the actual temporal sequence has a smaller algebraic connectivity than a Markovian temporal network. This indicates that the causal topology of time-respecting paths in the temporal network is less connected than expected at random, thus explaining the observed slow-down of diffusion.

This can intuitively be related to the weaker *causal connectivity& in the real temporal network compared to what we expect at random (compare the second-order networks shown above).

In [31]:
print("Algebraic Connectivity (G2) =", tn.Measures.AlgebraicConn(t))
print("Algebraic Connectivity (G2 null) =", tn.Measures.AlgebraicConn(t, model="NULL"))
Algebraic Connectivity (G2) = 0.0426976073824
Algebraic Connectivity (G2 null) = 0.0856700955263

Let us now move to a second data set, which covers E-Mail exchanges between employees in a manufacturing company. Again, we can read these data asa temporal network.

In [32]:
t = tn.readFile('data/tutorial/manufacturing_30d_agg_3600_scc.tedges', sep=' ')
print(t.Summary())
2015-08-19 12:58:33 [Severity.INFO]	Reading time-stamped links ...
2015-08-19 12:58:33 [Severity.INFO]	finished.
2015-08-19 12:58:33 [Severity.INFO]	Building index data structures ...
2015-08-19 12:58:33 [Severity.INFO]	finished.
2015-08-19 12:58:33 [Severity.INFO]	Sorting time stamps ...
2015-08-19 12:58:33 [Severity.INFO]	finished.
Nodes:			96
Time-stamped links:	3843
Links/Nodes:		40.03125
Observation period:	[36, 717]
Observation length:	681
Time stamps:		221
Avg. inter-event dt:	3.09545454545
Min/Max inter-event dt:	1/64
Max Time Diff (delta):	1
Two-paths:		not calculated
First-order network:	not constructed
Second-order network:	not constructed

The entropy growth rate ratio smaller than one confirms that the temporal network exhibits non-Markovian characteristics that are likely to change causality.

In [33]:
print("Entropy growth rate ratio is", tn.Measures.EntropyGrowthRateRatio(t))
2015-08-19 12:58:33 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:33 [Severity.INFO]	finished.
2015-08-19 12:58:33 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 12:58:33 [Severity.INFO]	finished.
2015-08-19 12:58:33 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 12:58:33 [Severity.INFO]	finished.
Entropy growth rate ratio is 0.549171568756

Based on spectral properties, we analytically predict these characteristics to slow down diffusion by a factor of about 3.01 (compared to a Markovian temporal network).

In [34]:
print("Analytical slow-down factor for diffusion is", tn.Measures.SlowDownFactor(t))
Analytical slow-down factor for diffusion is 3.01996747563

Let us empirically confirm that this analytical prediction is correct ...

In [35]:
speed_g2 = tn.Processes.RWDiffusion(t.igraphSecondOrder().components(mode="strong").giant(), epsilon=1e-12)
speed_g2n = tn.Processes.RWDiffusion(t.igraphSecondOrderNull().components(mode="strong").giant(), epsilon=1e-12)

print("Empirical slow-down factor for diffusion is", speed_g2/speed_g2n)
Empirical slow-down factor for diffusion is 2.86497461928934

Our third data set covers contacts between medical personnel in a hospital, recorded by means of proximity sensing tags. Again, we read these data asa temporal network.

In [36]:
t = tn.readFile('data/tutorial/Hospital_noADM_agg_300_scc_8_56h.tedges', sep=' ')

print(t.Summary())
2015-08-19 12:58:34 [Severity.INFO]	Reading time-stamped links ...
2015-08-19 12:58:34 [Severity.INFO]	finished.
2015-08-19 12:58:34 [Severity.INFO]	Building index data structures ...
2015-08-19 12:58:34 [Severity.INFO]	finished.
2015-08-19 12:58:34 [Severity.INFO]	Sorting time stamps ...
2015-08-19 12:58:34 [Severity.INFO]	finished.
Nodes:			53
Time-stamped links:	8571
Links/Nodes:		161.71698113207546
Observation period:	[23, 574]
Observation length:	551
Time stamps:		350
Avg. inter-event dt:	1.5787965616
Min/Max inter-event dt:	1/99
Max Time Diff (delta):	1
Two-paths:		not calculated
First-order network:	not constructed
Second-order network:	not constructed

Again, the entropy growth rate ratio smaller than one confirms that the temporal network exhibits non-Markovian characteristics that are likely to change causality.

In [37]:
print("Entropy growth rate ratio is", tn.Measures.EntropyGrowthRateRatio(t))
2015-08-19 12:58:34 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:34 [Severity.INFO]	finished.
2015-08-19 12:58:34 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 12:58:34 [Severity.INFO]	finished.
2015-08-19 12:58:34 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 12:58:34 [Severity.INFO]	finished.
Entropy growth rate ratio is 0.700588097083

Based on spectral properties, we analytically predict these characteristics to slow down diffusion by a factor of about 5.75.

In [38]:
print("Analytical slow-down factor for diffusion is", tn.Measures.SlowDownFactor(t))
Analytical slow-down factor for diffusion is 5.7450563705

Again, we empirically confirm that this prediction is reasonable ...

In [39]:
speed_g2 = tn.Processes.RWDiffusion(t.igraphSecondOrder().components(mode="strong").giant(), epsilon=1e-10)
speed_g2n = tn.Processes.RWDiffusion(t.igraphSecondOrderNull().components(mode="strong").giant(), epsilon=1e-10)

print("Empirical slow-down factor for diffusion is", speed_g2/speed_g2n)
Empirical slow-down factor for diffusion is 5.407657657657658

We next use the RealityMining data set, covering proximity interactions between students at MIT. We read it as a temporal network.

In [40]:
t = tn.readFile('data/tutorial/RealityMining_agg_300s_scc.tedges', sep=' ')
print(t.Summary())
2015-08-19 12:58:35 [Severity.INFO]	Reading time-stamped links ...
2015-08-19 12:58:35 [Severity.INFO]	finished.
2015-08-19 12:58:35 [Severity.INFO]	Building index data structures ...
2015-08-19 12:58:36 [Severity.INFO]	finished.
2015-08-19 12:58:36 [Severity.INFO]	Sorting time stamps ...
2015-08-19 12:58:36 [Severity.INFO]	finished.
Nodes:			58
Time-stamped links:	24984
Links/Nodes:		430.7586206896552
Observation period:	[0, 6215]
Observation length:	6215
Time stamps:		5961
Avg. inter-event dt:	1.0427852349
Min/Max inter-event dt:	1/185
Max Time Diff (delta):	1
Two-paths:		not calculated
First-order network:	not constructed
Second-order network:	not constructed

And again, the temporal sequence deviates from a Markovian temporal network and non-Markovian characteristics are expected to slow down diffusion by a factor of about 7.77, which we confirm empirically.

In [41]:
print("Entropy growth rate ratio is", tn.Measures.EntropyGrowthRateRatio(t))
2015-08-19 12:58:36 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:36 [Severity.INFO]	finished.
2015-08-19 12:58:36 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 12:58:36 [Severity.INFO]	finished.
2015-08-19 12:58:36 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 12:58:36 [Severity.INFO]	finished.
Entropy growth rate ratio is 0.392959580242
In [42]:
print("Analytical slow-down factor for diffusion is", tn.Measures.SlowDownFactor(t))
Analytical slow-down factor for diffusion is 7.77413007763
In [43]:
speed_g2 = tn.Processes.RWDiffusion(t.igraphSecondOrder().components(mode="strong").giant(), epsilon=1e-6)
speed_g2n = tn.Processes.RWDiffusion(t.igraphSecondOrderNull().components(mode="strong").giant(), epsilon=1e-6)

print("Empirical slow-down factor for diffusion is", speed_g2/speed_g2n)
Empirical slow-down factor for diffusion is 7.2079942556247

Finally, we also find examples for temporal networks in which non-Markovian characteristics result in a speed-up. For this, we consider a data set of time-stamped passenger flows in the London Tube network.

In [44]:
t = tn.readFile('data/tutorial/tube_flows_scc.tedges', sep=' ')
print(t.Summary())
2015-08-19 12:58:39 [Severity.INFO]	Reading time-stamped links ...
2015-08-19 12:58:40 [Severity.INFO]	finished.
2015-08-19 12:58:40 [Severity.INFO]	Building index data structures ...
2015-08-19 12:58:43 [Severity.INFO]	finished.
2015-08-19 12:58:43 [Severity.INFO]	Sorting time stamps ...
2015-08-19 12:58:43 [Severity.INFO]	finished.
Nodes:			132
Time-stamped links:	225372
Links/Nodes:		1707.3636363636363
Observation period:	[8, 358154]
Observation length:	358146
Time stamps:		225372
Avg. inter-event dt:	1.58913968523
Min/Max inter-event dt:	1/34
Max Time Diff (delta):	1
Two-paths:		not calculated
First-order network:	not constructed
Second-order network:	not constructed

Interestingly, here non-Markovian characteristics result in a speed-up of diffusion expressed by a slow-down factor smaller than one.

In [45]:
print("Entropy growth rate ratio is", tn.Measures.EntropyGrowthRateRatio(t))
print("Analytical slow-down factor for diffusion is", tn.Measures.SlowDownFactor(t))
2015-08-19 12:58:43 [Severity.INFO]	Extracting two-paths for delta = 1...
2015-08-19 12:58:46 [Severity.INFO]	finished.
2015-08-19 12:58:46 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 12:58:46 [Severity.INFO]	finished.
2015-08-19 12:58:46 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 12:58:47 [Severity.INFO]	finished.
Entropy growth rate ratio is 0.282201408877
Analytical slow-down factor for diffusion is 0.230196514165

This sounds counter-intuitive, the mere reordering of time-stamped links results in a speed up. Let us validate this by means of a simulation.

In [46]:
speed_g2 = tn.Processes.RWDiffusion(t.igraphSecondOrder().components(mode="strong").giant(), epsilon=1e-6)
speed_g2n = tn.Processes.RWDiffusion(t.igraphSecondOrderNull().components(mode="strong").giant(), epsilon=1e-6)
print("Empirical slow-down factor for diffusion is", speed_g2/speed_g2n)
Empirical slow-down factor for diffusion is 0.23787772312016864

That's interesting, but what exactly is happening there? We can get a better idea about this, by actually looking at a visualization of a diffusion process. For this purpose, pyTempNet comes with a function that allows to conveniently generate a video of the diffusion in a temporal network. The idea is that the process will be visualized in the first-order time-aggregated network, while the diffusion dynamics can either follow a second-order or a first-order Markov model. By this, we are able to visualize the effect of order correlations on the dynamical process.

In [47]:
g1 = t.igraphFirstOrder()

visual_style = {}
visual_style["edge_arrow_size"] = 0.4
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_size"] = 10
visual_style["edge_width"] = [np.log(x)/4 for x in g1.es()["weight"]]
visual_style["layout"] = g1.layout_auto()

tn.exportDiffusionVideo(t, 'tutorial/LT_diffusion_t2.mp4', visual_style, steps = 200, initial_index=0, model='SECOND')
2015-08-19 12:58:48 [Severity.INFO]	Calculating diffusion dynamics in non-Markovian temporal network ...
2015-08-19 12:58:48 [Severity.INFO]	Step 0 TVD =0.999208285297
2015-08-19 12:58:53 [Severity.INFO]	Step 50 TVD =0.0784047221785
2015-08-19 12:58:58 [Severity.INFO]	Step 100 TVD =0.00563876981434
2015-08-19 12:59:03 [Severity.INFO]	Step 150 TVD =0.000428278867121
2015-08-19 12:59:08 [Severity.INFO]	finished.
2015-08-19 12:59:08 [Severity.INFO]	Encoding video ...
2015-08-19 12:59:11 [Severity.INFO]	finished.

This function will output the evolution of total variation distance on the console every 50 steps, so that you can check whether the chosen number of steps was sufficient for the diffusion process to converge to a reasonable degree. The function also generates video frames, are internally encoded into a video using imagemagick's convert tool. If you would like, you could actually call the function exportDiffusionMovieFrames and convert the frames into a video yourself with this command:

> convert -delay 10 frames/LT_diffusion_t2_frame* output/LT_t2.mp4

However, here we save us this effort and directly embed the video resulting from the method call above. This video shows how a diffusion process evolves in the temporal network based on the actual two-path statistics in the data.

In [48]:
showVideo('tutorial/LT_diffusion_t2.mp4')
Out[48]:

In the video, we observe a fast diffusion through the network. Let us now compare this to a diffusion process based on the first-order time-aggregated network, i.e. based on the two-path statistics that we would expect. We could do this in the same way as above, i.e. generating a video and embedding it below. However there is a nice shortcut that directly produces a comparison video, in which a diffusion process is running side-by-side in a Markovian and non-Markovian temporal network. This allows you to quickly visually explore how non-Markovian characteristics influence a dynamical process in your data. We can simply do this by the following method call:

In [49]:
tn.exportDiffusionComparisonVideo(t, 'tutorial/diffusion_comparison.mp4', visual_style, steps = 200, initial_index=0)
2015-08-19 13:00:57 [Severity.INFO]	Calculating diffusion dynamics in non-Markovian temporal network
2015-08-19 13:00:57 [Severity.INFO]	Step 0 TVD =0.999208285297
2015-08-19 13:01:02 [Severity.INFO]	Step 50 TVD =0.0784047221785
2015-08-19 13:01:07 [Severity.INFO]	Step 100 TVD =0.00563876981434
2015-08-19 13:01:12 [Severity.INFO]	Step 150 TVD =0.000428278867121
2015-08-19 13:01:17 [Severity.INFO]	finished.
2015-08-19 13:01:17 [Severity.INFO]	Calculating diffusion dynamics in Markovian temporal network ...
2015-08-19 13:01:17 [Severity.INFO]	Step 0 TVD =0.999208285297
2015-08-19 13:01:23 [Severity.INFO]	Step 50 TVD =0.270497339842
2015-08-19 13:01:28 [Severity.INFO]	Step 100 TVD =0.132012191175
2015-08-19 13:01:33 [Severity.INFO]	Step 150 TVD =0.0720745505823
2015-08-19 13:01:38 [Severity.INFO]	finished.
2015-08-19 13:01:38 [Severity.INFO]	Stitching video frames ...
2015-08-19 13:02:21 [Severity.INFO]	finished.
2015-08-19 13:02:21 [Severity.INFO]	Encoding video ...
2015-08-19 13:02:26 [Severity.INFO]	finished.

Neat! Now we can directly watch the diffusion processes side-by-side. Note that in the current implementation, the non-Markovian temporal network is shown on the left, while the Markovian one is shown on the right.

In [51]:
showVideo('tutorial/diffusion_comparison.mp4')
Out[51]:

That's interesting! We observe that the diffusion process based on a first-order model, in which two-path statistics correspond to what we expect based on a weighted time-aggregated network, is indeed slower than in the second-order model. We also observe why this is the case: The numerous loops in the network result in a slow diffusion, while the actual two-path statistics based on the data set on passenger flows result in a much faster diffusion, which quickly reaches all nodes in the network.

In fact, this is not surprising as people do not travel the London Tube in loops. And it is this temporal characteristics which we miss if we merely study time-aggregated networks, and which we can include in our analysis by means of a second-order representation.

Spectral analysis: demonstration using TRIGRAM files

Let us now come to the last part of our tutorial, introducing some basic support for a spectral analysis of temporal networks. Here, rather than using a TEDGE file containing time-stamped edges from which we extract time-respecting paths of length two, we use so-called TRIGRAM files. These files directly contain the statistics of time-respecting paths of length two. They are useful whenever direct data on paths or flows in dynamic networks is available.

The format of a TRIGRAM file is the following:

`node1 node2 node3 weight
a b c 3.0
u v w 4.0
...`

In the example above, we have a time-respecting path (a,b) -> (b,c) occuring three times, while we have a time-respecting path (u,v) -> (v,w) occurring four times.

Based on this format, in the following we use synthetic TRIGRAM data generated from a model which
shows how non-Markovian characteristics in temporal networks can both slow down and speed up dynamical processes.

Similar like we have seen above, the first file corresponds to a case where non-Markovian properties speed up a diffusion process.

In [52]:
t_su = tn.readFile('data/tutorial/sigma0_75.trigram', fformat='TRIGRAM', sep = ' ')
print("Entropy growth rate ratio is", tn.Measures.EntropyGrowthRateRatio(t_su))
2015-08-19 13:02:54 [Severity.INFO]	Reading time-stamped links ...
2015-08-19 13:02:54 [Severity.INFO]	finished.
2015-08-19 13:02:54 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 13:02:54 [Severity.INFO]	finished.
2015-08-19 13:02:54 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 13:02:54 [Severity.INFO]	finished.
Entropy growth rate ratio is 0.996687767987

Again, the entropy growth rate ratio is smaller than one, verifying that the temporal network has non-Markovian characteristics. We can test the effect of these non-Markovian characteristics by computing the analytical slow-down factor. Here it is smaller than one, verifying a speed-up of diffusion dynamics.

In [53]:
print("Analytical slow-down factor for diffusion is", tn.Measures.SlowDownFactor(t_su))
Analytical slow-down factor for diffusion is 0.688363359047

Let us have a closer look at this temporal network. The first-order aggregate network consists of two communities.

In [54]:
g1 = t_su.igraphFirstOrder()

visual_style = {}
visual_style["edge_width"] = [np.sqrt(x) for x in g1.es()["weight"]]
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_size"] = 10
visual_style["edge_arrow_size"] = 0.5
visual_style["layout"] = g1.layout_auto()

igraph.plot(g1, 'tutorial/model_0_75_g1.png', **visual_style)
display(Image(filename='tutorial/model_0_75_g1.png'))

These two communities are visible in the expected second-order network as well

In [55]:
g2n = t_su.igraphSecondOrderNull()

visual_style = {}
max_weight = max(g2n.es()["weight"])
visual_style["edge_width"] = [400*x for x in g2n.es()["weight"]]
visual_style["edge_arrow_size"] = 0.5
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_size"] = 12
visual_style["layout"] = g2n.layout_auto()

igraph.plot(g2n, 'tutorial/model_0_75_g2n.png', **visual_style)
display(Image(filename='tutorial/model_0_75_g2n.png'))

In the actual second-order time-aggregated network, we observe that communities are (temporally) more connected than expected at random (visible in the few strong links connecting communities), and thus explaining the speed-up.

In [56]:
g2 = t_su.igraphSecondOrder()

max_weight = max(g2.es()["weight"])
visual_style["edge_width"] = [6*x for x in g2.es()["weight"]]
visual_style["edge_arrow_size"] = 0.5
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_size"] = 12

igraph.plot(g2, 'tutorial/model_0_75_g2.png', **visual_style)
display(Image(filename='tutorial/model_0_75_g2.png'))

We can gain some more (analytical) insights about this by means of a spectral analysis of the higher-order networks. We can first calculate algebraic connectivity in the second-order network. A comparison with the algebraic connectivity of the (expected) Markovian temporal network shows that the temporal network is indeed better connected than expected.

In [57]:
print("Algebraic Connectivity (G2) =", tn.Measures.AlgebraicConn(t_su))
print("Algebraic Connectivity (G2 null) =", tn.Measures.AlgebraicConn(t_su, model="NULL"))
Algebraic Connectivity (G2) = 0.0147085987469
Algebraic Connectivity (G2 null) = 0.0101482158018

We can get a better perspective on this by studying the so-called Fiedler vector, i.e. the eigenvector corresponding to the second-smallest eigenvalue of the Laplacian matrix, corresponding to the higher-order aggregate network. This vector can be computed with pyTempNet right away. We can then plot the entries of this vector correponding to individual nodes in the second-order network (i.e. links in the first-order network).

In [59]:
fiedler = tn.Measures.FiedlerVectorDense(t_su)

plt.clf()
plt.tick_params(axis='both', which='major', labelsize=20)
plt.tick_params(axis='both', which='minor', labelsize=20)       
plt.xlabel('$i$', fontsize=30)
plt.ylabel(r'$(v_2)_i$', fontsize=30)
plt.subplots_adjust(bottom=0.25)
plt.subplots_adjust(left=0.25)
plt.scatter(range(len(fiedler)), np.real(fiedler), color="g")
plt.savefig("tutorial/model_0_75_fiedler.png")
plt.close()

display(Image(filename='tutorial/model_0_75_fiedler.png'))

We see two clear ranges of values below and above zero which are a sign of two pronounced communities. At the same time, the points connecting these two ranges show that a number of nodes are in between these communities. Interestingly, if we compare the Fiedler vector to that of the expected second-order network, we see that community structures in the real temporal network are not as strong as expected at random, i.e. in this case the non-Markovian properties of the temporal network actually mitigate community structures!

In [60]:
fiedler = tn.Measures.FiedlerVectorDense(t_su, model="NULL")

plt.clf()
plt.tick_params(axis='both', which='major', labelsize=20)
plt.tick_params(axis='both', which='minor', labelsize=20)       
plt.xlabel('$i$', fontsize=30)
plt.ylabel(r'$(v_2)_i$', fontsize=30)
plt.subplots_adjust(bottom=0.25)
plt.subplots_adjust(left=0.25)
plt.scatter(range(len(fiedler)), np.real(fiedler), color="b")
plt.savefig("tutorial/model_fiedler_null.png")
plt.close()

display(Image(filename='tutorial/model_fiedler_null.png'))

We now consider a second example, a temporal network which - although it is consistent with the same time-aggregated network shown above, is generated in such a way that non-Markovian characteristics actually slow down diffusion. We can again read the resulting TRIGRAM file, compute the entropy growth rate ration and the analytical slow-down factor.

In [62]:
t_sd = tn.readFile('data/tutorial/sigma-0_75.trigram', fformat='TRIGRAM', sep = ' ')

print("Entropy growth rate ratio is", tn.Measures.EntropyGrowthRateRatio(t_sd))
2015-08-19 13:03:50 [Severity.INFO]	Reading time-stamped links ...
2015-08-19 13:03:51 [Severity.INFO]	finished.
2015-08-19 13:03:51 [Severity.INFO]	Constructing second-order aggregate network ...
2015-08-19 13:03:51 [Severity.INFO]	finished.
2015-08-19 13:03:51 [Severity.INFO]	Constructing first-order aggregate network ...
2015-08-19 13:03:51 [Severity.INFO]	finished.
Entropy growth rate ratio is 0.991543696969
In [63]:
print("Analytical slow-down factor for diffusion is", tn.Measures.SlowDownFactor(t_sd))
Analytical slow-down factor for diffusion is 5.03547933645

Here, our analysis shows that diffusion is expected to be slower by a factor of five, which we can again verify using simulations.

In [64]:
speed_g2 = tn.Processes.RWDiffusion(t_sd.igraphSecondOrder().components(mode="strong").giant(), epsilon=1e-6)
speed_g2n = tn.Processes.RWDiffusion(t_sd.igraphSecondOrderNull().components(mode="strong").giant(), epsilon=1e-6)

print("Empirical slow-down factor for diffusion is", speed_g2/speed_g2n)
Empirical slow-down factor for diffusion is 5.046707146193368

A spectral analysis shows that the algebraic connectivity in the second-order network is much smaller than before, which is due to the fact that in this particular example non-Markovian characteristics enforce the community structures. Compared to the algebraic connectivity of the expected Markovian temporal network, here non-Markovian characteristics result in a less connected causal topology, thus explaining the slow-down.

In [65]:
print("Algebraic Connectivity (G2) =", tn.Measures.AlgebraicConn(t_sd))
Algebraic Connectivity (G2) = 0.00202358811696

This can also be seen in the distribution of entries of the Fiedler vector. The two value ranges are much more separated, which means that community structures are stronger than before. Furthermore, they are stronger than in the null model.

In [67]:
fiedler = tn.Measures.FiedlerVectorDense(t_sd)

plt.clf()
plt.tick_params(axis='both', which='major', labelsize=20)
plt.tick_params(axis='both', which='minor', labelsize=20)       
plt.xlabel('$i$', fontsize=30)
plt.ylabel(r'$(v_2)_i$', fontsize=30)
plt.subplots_adjust(bottom=0.25)
plt.subplots_adjust(left=0.25)
plt.scatter(range(len(fiedler)), np.real(fiedler), color="r")
plt.savefig("tutorial/model_-0_75_fiedler.png")
plt.close()

display(Image(filename='tutorial/model_-0_75_fiedler.png'))

Notably, both the first-order aggregate network, as well as the expected second-order network (both shown below) are exactly the same as before (since the only difference between the temporal networks is the ordering of time-stamped edges).

In [68]:
g1 = t_sd.igraphFirstOrder()

visual_style = {}
visual_style["edge_width"] = [np.sqrt(x) for x in g1.es()["weight"]]
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_size"] = 15
visual_style["edge_arrow_size"] = 0.5
visual_style["layout"] = g1.layout_auto()

igraph.plot(g1, 'tutorial/model_-0_75_g1.png', **visual_style)
display(Image(filename='tutorial/model_-0_75_g1.png'))
In [70]:
g2n = t_sd.igraphSecondOrderNull()

visual_style["edge_width"] = [500*x for x in g2n.es()["weight"]]
visual_style["edge_arrow_size"] = 0.5
visual_style["vertex_size"] = 10
visual_style["layout"] = g2n.layout_auto()

igraph.plot(g2n, 'tutorial/model_-0_75_g2n.png', **visual_style)
display(Image(filename='tutorial/model_-0_75_g2n.png'))

However, the actual second-order aggregate network is different. Here we observe that compared to the second-order network above, non-Markovian characteristics enforce community structures (visible in the smal weights of the links connecting the two communities in the second-order network).

In [71]:
g2 = t_sd.igraphSecondOrder()

visual_style["edge_width"] = [5*x for x in g2.es()["weight"]]
visual_style["vertex_size"] = 10
visual_style["edge_arrow_size"] = 0.5

igraph.plot(g2, 'tutorial/model_-0_75_g2.png', **visual_style)
display(Image(filename='tutorial/model_-0_75_g2.png'))

Let us finally visually explore how the different ordering of interactions in the two models for temporal networks presented above influence a diffusion process. Based on the spectral analysis presented above, we already have an idea: In the one case, where time-respecting paths across communities are - compared to what is expected at random - overrepresented diffusion speed is fast because the community structure of the aggregate networks is effectively mitigated. In the second case, where time-respecting paths across communities are underrepresented, the ordering of interactions effectively enforces community structures. As such, in the first case we expect a diffusion process to quickly move across communities, while in the second case we expect it to be trapped in communities for a longer time.

Let us check this intuition by generating two videos of the diffusion process in both temporal networks (in both cases comparing a diffusion process that considers the actual two-path statistics with the one expected based on Markovian dynamics).

In [72]:
visual_style = {}
visual_style["edge_width"] = [np.sqrt(x) for x in g1.es()["weight"]]
visual_style["vertex_color"] = "lightblue"
visual_style["vertex_size"] = 15
visual_style["edge_arrow_size"] = 0.5
visual_style["layout"] = g1.layout_auto()

tn.exportDiffusionComparisonVideo(t_sd, 'tutorial/model_slowdown.mp4', visual_style, steps = 200, initial_index=50)
tn.exportDiffusionComparisonVideo(t_su, 'tutorial/model_speedup.mp4', visual_style, steps = 200, initial_index=50)
2015-08-19 13:04:36 [Severity.INFO]	Calculating diffusion dynamics in non-Markovian temporal network
2015-08-19 13:04:36 [Severity.INFO]	Step 0 TVD =0.9975
2015-08-19 13:04:42 [Severity.INFO]	Step 50 TVD =0.455415222248
2015-08-19 13:04:48 [Severity.INFO]	Step 100 TVD =0.411542598196
2015-08-19 13:04:54 [Severity.INFO]	Step 150 TVD =0.371902085892
2015-08-19 13:05:00 [Severity.INFO]	finished.
2015-08-19 13:05:00 [Severity.INFO]	Calculating diffusion dynamics in Markovian temporal network ...
2015-08-19 13:05:01 [Severity.INFO]	Step 0 TVD =0.9975
2015-08-19 13:05:07 [Severity.INFO]	Step 50 TVD =0.310791190964
2015-08-19 13:05:13 [Severity.INFO]	Step 100 TVD =0.186627070236
2015-08-19 13:05:19 [Severity.INFO]	Step 150 TVD =0.11206839418
2015-08-19 13:05:26 [Severity.INFO]	finished.
2015-08-19 13:05:26 [Severity.INFO]	Stitching video frames ...
2015-08-19 13:06:09 [Severity.INFO]	finished.
2015-08-19 13:06:09 [Severity.INFO]	Encoding video ...
2015-08-19 13:06:15 [Severity.INFO]	finished.
2015-08-19 13:06:15 [Severity.INFO]	Calculating diffusion dynamics in non-Markovian temporal network
2015-08-19 13:06:15 [Severity.INFO]	Step 0 TVD =0.9975
2015-08-19 13:06:21 [Severity.INFO]	Step 50 TVD =0.248399714029
2015-08-19 13:06:28 [Severity.INFO]	Step 100 TVD =0.118408446044
2015-08-19 13:06:34 [Severity.INFO]	Step 150 TVD =0.0564439390413
2015-08-19 13:06:40 [Severity.INFO]	finished.
2015-08-19 13:06:40 [Severity.INFO]	Calculating diffusion dynamics in Markovian temporal network ...
2015-08-19 13:06:40 [Severity.INFO]	Step 0 TVD =0.9975
2015-08-19 13:06:46 [Severity.INFO]	Step 50 TVD =0.310791190964
2015-08-19 13:06:52 [Severity.INFO]	Step 100 TVD =0.186627070236
2015-08-19 13:06:58 [Severity.INFO]	Step 150 TVD =0.11206839418
2015-08-19 13:07:04 [Severity.INFO]	finished.
2015-08-19 13:07:04 [Severity.INFO]	Stitching video frames ...
2015-08-19 13:07:50 [Severity.INFO]	finished.
2015-08-19 13:07:50 [Severity.INFO]	Encoding video ...
2015-08-19 13:07:55 [Severity.INFO]	finished.

This will take about two minutes. After the process finisheds, we can directly embed both resulting videos as before. Again, the non-Markovian dynamics is left, while the Markovian (null) model is right.

In [73]:
showVideo('tutorial/model_slowdown.mp4')
Out[73]:
In [74]:
showVideo('tutorial/model_speedup.mp4')
Out[74]:

The videos of the diffusion process confirm the results of our spectral analysis: the mere ordering of interactions in temporal networks can either mitigate or enforce the community structures of a complex network, thus speeding up or slowing down diffusion dynamics!

I hope you enjoyed this tutorial and I hope you are now excited to apply the higher-order network framework introduced in the articles mentioned above to your own time-stamped relational data. We believe that higher-order networks provide interesting opportunities for novel data mining methods, new measures for centrality in temporal networks, and the development of novel community detection algorithms that incorporate both topological and temporal characteristics of time-stamped network data. Moreover, as demonstrated above, the higher-order network abstraction has been implemented in the latest version of the Open Soure python module pyTempNets

Feel free to get in touch with me if you have further questions or if you want to use the python model pyTempNet for your own analyses.

Ingo Scholtes
ischoltes@ethz.ch

Zurich, May 2015