Dr. Giacomo Vaccario
My main interests are in the quantification of knowledge and of its exchange in academia and in R&D activities.
In particular, I concentrate on two classes of problems. The first concerns how knowledge artifacts are linked to each other, ranked, and filtered in repositories such as patent and scientific publication databases. The second concerns how knowledge is exchanged and transferred. Indeed, knowledge is not only produced and encoded in knowledge artifacts, such as patents or scientific publications, but is also exchanged by humans and diffuses through society. R&D alliances among firms and co-authorship of papers among scientists are examples of activities favoring knowledge exchange, while the physical migration of inventors or scientists is an example of knowledge transfer.
To answer the above questions, I develop statistical methods and agent-based models.
Publications
Should the government reward cooperation? Insights from an agent-based model of wealth redistribution
[2021]
Schweitzer, Frank; Verginer, Luca; Vaccario, Giacomo
Advances in Complex Systems (submitted)
Abstract
In our multi-agent model, agents generate wealth from repeated interactions for which a prisoner's dilemma payoff matrix is assumed. Their gains are taxed by a government at a rate α. The resulting budget is spent to cover administrative costs and to pay a bonus to cooperative agents, which can be identified correctly only with a probability p. Agents decide at each time step to choose either cooperation or defection based on different information. In the local scenario, they compare their potential gains from both strategies. In the global scenario, they compare the gains of the cooperative and defective subpopulations. We derive analytical expressions for the critical bonus needed to make cooperation as attractive as defection. We show that for the local scenario the government can establish only a medium level of cooperation, because the critical bonus increases with the level of cooperation. In the global scenario, instead, full cooperation can be achieved once the cold-start problem is solved, because the critical bonus decreases with the level of cooperation. This allows the tax rate to be lowered while maintaining high cooperation.
Reproducing scientists' mobility: A data-driven model
[2020]
Vaccario, Giacomo; Verginer, Luca; Schweitzer, Frank
arXiv:1811.07229
Abstract
High-skill labour is an important factor underpinning the competitive advantage of modern economies. Therefore, attracting and retaining scientists has become a major concern for migration policy. In this work, we study the migration of scientists on a global scale by combining two large data sets covering the publications of 3.5 million scientists over 60 years. We analyse the geographical distances they moved for a new affiliation and their age when moving, thereby reconstructing their geographical "career paths". These paths are used to derive the world network of scientists' mobility between cities and to analyse its topological properties. We further develop and calibrate an agent-based model such that it reproduces the empirical findings both at the level of scientists and of the global network. Our model takes into account that the academic hiring process is largely demand-driven and demonstrates that the probability of scientists to relocate decreases both with age and with distance. Our results allow interpreting the model assumptions as micro-based decision rules that can explain the observed mobility patterns of scientists.
The mobility network of scientists: Analyzing temporal correlations in scientific careers
[2020]
Vaccario, Giacomo; Verginer, Luca; Schweitzer, Frank
Applied Network Science, pages: 36, volume: 5, number: 1
Abstract
The mobility of scientists between different universities and countries is important to foster knowledge exchange. At the same time, the potential mobility is restricted by geographic and institutional constraints, which leads to temporal correlations in the career trajectories of scientists. To quantify this effect, we extract 3.5 million career trajectories of scientists from two large-scale bibliographic data sets and analyze them by applying a novel method of higher-order networks. We study the effect of temporal correlations at three different levels of aggregation: universities, cities and countries. We find strong evidence for such correlations for the top 100 universities, i.e. scientists are likely to move between specific institutions. These correlations also exist at the level of countries, but cannot be found for cities. Our results allow us to draw conclusions about the institutional path dependence of scientific careers and the efficiency of mobility programs.
What is the Entropy of a Social Organization?
[2019]
Zingg, Christian; Casiraghi, Giona; Vaccario, Giacomo; Schweitzer, Frank
Entropy, volume: 21, number: 9
Abstract
We quantify a social organization's potentiality, that is, its ability to attain different configurations. The organization is represented as a network in which nodes correspond to individuals and (multi-)edges to their multiple interactions. Attainable configurations are treated as realizations from a network ensemble. To encode interaction preferences between individuals, we choose the generalized hypergeometric ensemble of random graphs, which is described by a closed-form probability distribution. From this distribution we calculate Shannon entropy as a measure of potentiality. This allows us to compare different organizations as well as different stages in the development of a given organization. The feasibility of the approach is demonstrated using data from 3 empirical and 2 synthetic systems.
Quantifying knowledge exchange in R&D networks: A data-driven model
[2018]
Vaccario, Giacomo; Tomasello, Mario Vincenzo; Tessone, Claudio Juan; Schweitzer, Frank
Journal of Evolutionary Economics, pages: 461-493, volume: 28, number: 3
Abstract
We propose a model that reflects two important processes in R&D activities of firms: the formation of R&D alliances and the exchange of knowledge as a result of these collaborations. In a data-driven approach, we analyze two large-scale data sets, extracting unique information about 7500 R&D alliances and 5200 patent portfolios of firms. This data is used to calibrate the model parameters for network formation and knowledge exchange. We obtain probabilities for incumbent and newcomer firms to link to other incumbents or newcomers which are able to reproduce the topology of the empirical R&D network. The position of firms in a knowledge space is obtained from their patents using two different classification schemes, IPC in 8 dimensions and ISI-OST-INPI in 35 dimensions. Our dynamics of knowledge exchange assumes that collaborating firms approach each other in knowledge space at a rate $μ$ for an alliance duration $τ$. Both parameters are obtained in two different ways: by comparing knowledge distances from simulations and empirics, and by analyzing the collaboration efficiency $\hat{C}_n$. This is a new measure that also takes into account the effort of firms to maintain concurrent alliances, and it is evaluated via extensive computer simulations. We find that R&D alliances have a duration of around two years and that the subsequent knowledge exchange occurs at a very low rate. Hence, a firm's position in the knowledge space is a determinant rather than a consequence of its R&D alliances. From our data-driven approach we also find model configurations that can be both realistic and optimized with respect to the collaboration efficiency $\hat{C}_n$. Effective policies, as suggested by our model, would incentivize shorter R&D alliances and higher knowledge exchange rates.
Data-driven modeling of collaboration networks: A cross-domain analysis
[2017]
Tomasello, Mario Vincenzo; Vaccario, Giacomo; Schweitzer, Frank
EPJ Data Sci., pages: 22, volume: 6
Abstract
We analyze large-scale data sets about collaborations from two different domains: economics, specifically 22,000 R&D alliances between 14,500 firms, and science, specifically 300,000 co-authorship relations between 95,000 scientists. Considering the different domains of the data sets, we address two questions: (a) to what extent do the collaboration networks reconstructed from the data share common structural features, and (b) can their structure be reproduced by the same agent-based model? In our data-driven modeling approach we use aggregated network data to calibrate the probabilities at which agents establish collaborations with either newcomers or established agents. The model is then validated by its ability to reproduce network features not used for calibration, including distributions of degrees, path lengths, local clustering coefficients and sizes of disconnected components. Emphasis is put on comparing domains, but also sub-domains (economic sectors, scientific specializations). Interpreting the link probabilities as strategies for link formation, we find that in R&D collaborations newcomers prefer links with established agents, while in co-authorship relations newcomers prefer links with other newcomers. Our results shed new light on the long-standing question about the role of endogenous and exogenous factors (i.e., different information available to the initiator of a collaboration) in network formation.
Quantifying and suppressing ranking bias in a large citation network
[2017]
Vaccario, Giacomo; Medo, Matus; Wider, Nicolas; Mariani, Manuel S.
Journal of Informetrics, pages: 766-782, volume: 11, number: 3
Abstract
It is widely recognized that citation counts for papers from different fields cannot be directly compared because different scientific fields adopt different citation practices. Citation counts are also strongly biased by paper age since older papers had more time to attract citations. Various procedures aim at suppressing these biases and give rise to new normalized indicators, such as the relative citation count. We use a large citation dataset from Microsoft Academic Graph and a new statistical framework based on the Mahalanobis distance to show that the rankings by well known indicators, including the relative citation count and Google's PageRank score, are significantly biased by paper field and age. Our statistical framework to assess ranking bias allows us to exactly quantify the contributions of each individual field to the overall bias of a given ranking. We propose a general normalization procedure motivated by the z-score which produces much less biased rankings when applied to citation count and PageRank score.
Talks
The structure, exchange, and transfer of knowledge in socio-technical systems
[Nov. 19, 2019]
Room AND 4.57, PhD Seminar Research in Network Science, University of Zurich
Data-driven modeling of collaboration networks: A cross-domain analysis
[Aug. 30, 2019]
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Vancouver, Canada
What is the Entropy of a Social Organization?
[Aug. 27, 2019]
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Vancouver, Canada
Reproducing scientists' mobility: A data-driven model
[June 24, 2019]
Workshop on Economic Science with Heterogeneous Interacting Agents (WEHIA), London, England
Quantifying knowledge exchange in R&D networks: A data-driven model
[June 24, 2019]
Workshop on Economic Science with Heterogeneous Interacting Agents (WEHIA), London, England
Reproducing scientists’ mobility: A data-driven model
[May 17, 2019]
Migration, Globalization and the Knowledge Economy Workshop, Utrecht, Netherlands
Reconstructing knowledge flows to develop citation-based measures for journals
[May 15, 2019]
Centre for Science and Technology Studies, Leiden, Netherlands
Quantifying and suppressing ranking bias
[March 11, 2018]
DPG-Frühjahrstagung, Berlin, Germany
How do firms collaborate? A data-driven model
[March 11, 2018]
DPG-Frühjahrstagung, Berlin, Germany
Quantifying knowledge exchange in R&D networks
[Sept. 21, 2017]
KU Leuven - Summer School on Data & Algorithms for STI studies
Quantifying knowledge exchange in R&D networks: A data-driven model
[Sept. 7, 2017]
Kreyon 2017 - Roma
Collaborations Across Economic and Scientific Domains
[Sept. 19, 2016]
Conference on Complex Systems - Amsterdam
Quantifying and suppressing ranking bias
Every day, scholars and online users explore available knowledge using recommender systems based on ranking algorithms. This challenges us to design more sophisticated filtering and ranking procedures that avoid biases which can systematically hide relevant content.
In this work, we tackle this issue by quantifying and suppressing biases of indicators of scientific impact. We use a large citation dataset from Microsoft Academic Graph and a new statistical framework based on the Mahalanobis distance to show that the rankings by well known indicators, including relative citation count and Google's PageRank score, are significantly biased by paper field and age. We propose a general normalization procedure motivated by the z-score which produces much less biased rankings when applied to citation count and PageRank score.

We provide a simple and quick tutorial on how we quantify ranking bias. The source code and tutorial can be found in this GitHub project. Enjoy!
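
As a rough illustration of a normalization motivated by the z-score, the sketch below rescales each paper's score by the mean and standard deviation of the papers it is compared with. It is not the code from the project mentioned above: the function name `zscore_normalize`, the dictionary keys, and the choice to group papers by exact field and publication year are all assumptions made for this example, and the toy data is invented.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_normalize(papers):
    """Return one z-score per paper, computed within its (field, year) group.

    Assumption for illustration: papers are comparable if they share the
    same field and the same publication year.
    """
    groups = defaultdict(list)
    for p in papers:
        groups[(p["field"], p["year"])].append(p["score"])

    # Mean and population standard deviation of the score in each group.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    normalized = []
    for p in papers:
        mu, sigma = stats[(p["field"], p["year"])]
        # A group with no spread carries no ranking information: map it to 0.
        normalized.append((p["score"] - mu) / sigma if sigma > 0 else 0.0)
    return normalized


# Hypothetical toy data: biology papers collect ten times more citations.
papers = [
    {"field": "physics", "year": 2000, "score": 10},
    {"field": "physics", "year": 2000, "score": 30},
    {"field": "biology", "year": 2000, "score": 100},
    {"field": "biology", "year": 2000, "score": 300},
]
print(zscore_normalize(papers))  # [-1.0, 1.0, -1.0, 1.0]
```

In this toy example the raw scores would place both biology papers above both physics papers, whereas the rescaled scores put the top paper of each field on equal footing.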
Research Plan
For my PhD, I focus on the organization, exchange and transfer of knowledge in socio-technical systems.
My full research plan can be found here, and in the following I summarize the 7 research questions (RQs) that I will answer in my PhD thesis. These 7 questions can be divided into three sections.
The organization of knowledge
RQ1: Assessing multiple normalizations. We need to age- and field-normalize citation-based indicators in order to compare documents of different ages and from different fields. How can we assess whether these indicators have been simultaneously age- and field-normalized?
RQ2: Developing a new normalization procedure. The procedure of Radicchi et al. (2008) failed to correctly age- and field-normalize citation counts, and to the best of our knowledge no better procedure exists. How can we develop one?
RQ3: Developing new knowledge order. How can we use time-correlations present in citation data to develop citation-based indicators?
The exchange of knowledge
RQ4: Knowledge exchange among firms. Previous results from our chair indicate that knowledge is a determinant rather than a consequence of R&D collaborations. How does this result change when using different methods to quantify knowledge?
RQ5: Knowledge exchange among scientists. How can we extend the model and the analysis of Tomasello et al. (2015) to the scientific domain?
The transfer of knowledge
RQ6: Temporal correlations in the transfer of knowledge. How should we model scientists' academic mobility in order to retain temporal correlations and a network perspective?
RQ7: New agent-based model for knowledge transfer. Let us assume that temporal correlations in the migration trajectories of scientists break the transitivity assumption. Then, how can we use an agent-based model to reproduce this type of trajectory?