Workshop Minutes - ETH - Chair of Systems Design - Welcome

Download Official Program and Abstracts (PDF, 1.1 Mb)

Minutes from the workshop

Session: Bibliometrics

Rüdiger Mutz: How to use bibliometric data to rank universities according to their research performance

- Bibliometric data serve as an objective and reliable basis for measuring performance, but usually come with many problems
- Rankings: statistical models are needed to differentiate between random and real fluctuations
  - Percentile-based measures
  - Covariate-adjusted version of Leiden ranking
- Statistical perspective on data has serious consequences for rankings
- Choice between full and fractional counting not trivial, e.g.

Links: excellencemapping.net

Discussion

- The quality of data needs to be improved: more emphasis on data cleaning and disambiguation
- Different publication types (proceedings, journals, books) need to be uniformly covered in bibliometric data to have a better basis for comparison of different fields and measurement of interdisciplinary impact
- More attention is needed towards highlighting the differences in the meaning of citations across fields
- Citations are not the only measure of quality, although treating them as such is general practice when ranking universities
  - A counter-example is the Netherlands, where evaluation of research projects incorporates the social impact of research
- Performance of research groups and individuals could incorporate resources available, i.e. measures of productivity and not total output.

Robin Haunschild: Publications, Citations, Mentions

- Assessment of peers and citations correlate positively
- Citation cultures differ very much across disciplines
- Reasons for citation
  - Intellectual debt, acknowledgment of knowledge transfer
  - Influence on the manuscript
  - Reciprocal citations
  - Well-known person
  - Potential reviewer
- Field-specific discrepancy between references in the paper and linked references in bibliometric databases
  - Humanities have many references outside bibliometric databases that are not counted
- Normalization of citation counts based on citing side or cited side?
- For humanities, the citing side may be the best strategy

Discussion

- Larger coverage of scientific literature can possibly solve the issues coming from differences in citation cultures
- Coverage in databases is particularly difficult for social sciences, due to differences in forms of scientific communication
- Geographic differences in citation cultures
- Different publication languages not covered homogeneously
- Effects of publication format/practical restriction of citations by journal format
- Values of citations differ based on number of references in the citing paper

Dirk Tunger: Bibliometrics

- Attention is what we want to measure
- Best case: citation indicates knowledge transfer
- Measuring interdisciplinary impact: J-factor (normalization at the journal level)
- Possible potential in linking altmetrics and bibliometric data

Discussion

- Institutions may actually not be interested in the attention that someone gets, but in the output that is produced given the funding provided
- The mean should not be used with bibliometric data, as it is not an appropriate measure of "average" (and often is not defined) for the broad distributions that are commonly encountered
- "Altmetric" is open to providing data
- In many contexts related to measuring scientific performance we need to correct for attention, and not take it as a proxy of performance.

Stakeholder Session: Ranking

Sonja Berghoff: CHE and rankings

- CHE ranking for German universities
- U-Multirank: project led by a consortium of CHE and Leiden CWTS
- CHE approach to rankings
  - No enumerated ranking list, but ranking groups (e.g. "top", "average", "low")
  - Field-specific, not universal rankings
  - Multivariate ranking, no weighting and aggregation of factors
  - Publications, research grants, reputation survey as factors

Links: che-ranking.de, u-multirank.eu

Discussion

- Awareness about the risks of reputation surveys: visibility bias, geographic scope; however, high response rate (professors surveyed)
- Collect all professors at universities

Martin Juno: University Rankings – weightings and bias

- QS: Rankings by subject, region, aspects
  - Same focus, different methodology
- 3500 universities ranked, the top 850 of them are shown to the public
- Four general factors considered
  - Research, teaching, employability, internationality
- Weighting factors
  - 40 % academic reputation (survey based)
  - 10 % employer reputation (survey based)
  - 20 % citations per faculty
  - 20 % faculty/student ratio
  - 5 % international faculty
  - 5 % international students
- Bias in the ranking
  - Heavily survey based
  - Research-centric
  - Reliance on self-reported data (however, limited)
  - Cultural (e.g. different levels of directness)

Links: topuniversities.com/university-rankings

Discussion

- Academic reputation surveys (QS academic reputation data set) provided
- In order to interpret results, institutions need to get insights into the methodology
  - Some ranking providers are more open than others
- Reinforcing feedback of rankings to reputation surveys is probably pronounced
- Variability in data has to be taken into account
  - Australia case study: ranking not possible, because the variability of opinions within the institutions is higher than across them

Nees Jan van Eck: CWTS Leiden Ranking: An advanced bibliometric approach to university ranking

- CWTS: bibliometric contract research
- VOSviewer: visualising scientific landscapes
- Leiden Ranking
  - Exclusively based on bibliometric indicators for scientific citations and collaborations, no surveys
  - Based on Web of Science data
  - Multiple dimensions, no aggregation
  - Focus on scientific performance
- Bibliometric approach
  - Fractional counting of co-authorship data
  - Percentile-based indicators to deal with highly skewed distributions
  - Field normalization
  - Percentile-based indicators are more stable than average-based indicators
  - Exclusion of non-core publications (about 1/6 of the total number of publications)
    - Non-English
    - Retracted publications
    - National level scientific journals, trade journals, popular magazines
    - Publications in fields with low citation density
    - Proceedings

Links: vosviewer.com, leidenranking.com

Discussion

- Full vs. fractional counting: full counting is biased towards fields with lots of collaborations
  - Known: collaborative publications tend to be cited more often than non-collaborative publications
- Use of appropriate bibliometric methodology is important
  - Percentile-based indicators
  - No blind use of databases, removing non-core journals
- The weights for fractional counting are determined based on the number of different institutions involved
- Fractional counting at the individual level not recommended
  - Some effects that cancel out at the aggregate level may be a problem at the individual level
- Extensive manual cleaning of affiliation data performed
- Thomson Reuters, the owner of Web of Science, works with universities to introduce canonical affiliation names
  - So far, 4800 university names have been disambiguated

Session: Social sciences

Peter van den Besselaar: Bibliometrics beyond rankings

- Bibliometric indicators have become popular as incentives
- Actual question is: What questions can be answered based on citation data?
- Finding in social sciences (2008-2010): decisions do not correlate with bibliometrics
  - Many false positives, low predictive validity
- European Research Council (ERC) study (2014-2015): ERC Starting Grant review example
  - Important research outcomes
  - Impressive h-index
  - Many citations
  - But: publication venues not very prestigious, so rejected
- No clear difference in citation distribution between successful and non-successful candidates
- Real questions are at the systems level
  - How to organize the research system to maximise performance
  - What measures/incentives should be used?
- Interesting questions that can be addressed
  - Relation between organizational form and performance
  - Impact of funding ecology on performance
  - Talent recognition and selection
  - Identification of new fields
- Example 1: What incentives to introduce?
  - Simonton on scientific creativity: creative people try a lot; the more they try, the higher the chance to achieve a breakthrough
  - Existing evidence: most productive authors have the largest share in most cited papers
- Example 2: What is the role of competition?
  - Share of competition-based funding negatively correlates with performance (sample of 14 countries)
  - Level of autonomy of the university negatively correlates with performance (sample of 9 countries)
  - Level of academic freedom reported positively correlates with performance (sample of 8 countries)
- Talent selection
  - Independence, impact, productivity
  - Low overlap with the supervisor's collaboration network and research agenda

Flaminio Squazzoni: Competition, serious "gamification" and scientist misbehaviour

- Introduction to PEERE COST Action
- Simmel: money and prices were an important trigger for the emergence of rationality in western societies
- Science: reputation and citations play the role of money and prices
- Book by Robert Merton: The Sociology of Science
  - Reputation is only productive if competitive spirits are constrained by strong social norms
- Attention as a scarce resource: signalling mechanisms like high-impact journals can help to avoid coordination problems
- Number of retractions increases over time
- "Rankings are natural social artefacts"
- Independent of its quality, a ranking always has consequences
- Science becomes a "serious game"
- "In times of scarce attention the "rankitude" could bring people to easy, broad-tent view conclusions about value of people independently of context and situations"
- Indicators are used both in intended and unintended ways

Discussion

- Misbehaviour of reviewers can take very creative forms: not only nonobjective rejections, but also prolongation of the review process and other ways to keep control over the paper
- While metrics become more popular, they influence the dynamics of science, but this influence can be beneficial in certain cases
  - In Italy, associations played a major role previously
- The system of science needs to be steered towards beneficial external motivations for scientists
- Mechanisms reinforcing the social norms that restrict the "competitive spirits" are needed

Judit Bar-Ilan: Altmetrics - Alternative metrics

- Definition of altmetrics: supplementary measures
- Highest-ever Altmetric rank: "Experimental evidence of massive-scale emotional contagion through social networks", PNAS vol. 111 no. 24
  - Controversial paper on Facebook experiments
- Wikipedia included in Altmetric score
- Sources of altmetrics
  - Mendeley
  - CiteULike
  - Blogs
  - F1000Prime
  - PubPeer
  - ResearchGate, academia.edu
- Readership counts (Mendeley) vs. citations
  - Medium strength positive correlation with citation counts

Discussion

- Suggestion made to Altmetric to not report an aggregate score
- The real implication and usage of altmetrics for measurements in science are not clear yet
- Measures incorporating altmetrics have to be very robust, as these are prone to manipulation, e.g. by means of link farms
- Altmetrics don't measure social impact, but rather social media attention, which cannot be used as a measure of scientific advancement

Session: Computer Science

Filippo Radicchi: Dynamical graph-based impact metrics

- Scientific motivation: data on large social system
- Practical motivation: research evaluation
- Example: Italian National Scientific Qualification
  - Number of papers (divided by academic age)
  - Number of citations (divided by academic age)
  - Contemporary h-index
  - For individual fields: median values are computed
- Network structure of citation data is often neglected in research evaluation
- Examples of network based measures: CiteRank, Eigenfactor
- Weighted citation network of author-author citations
  - Weighted by out-degree of a paper in the citation network
  - Physical Review database
- SARA: Science Author Rank Algorithm (PageRank + diffusion equation)
- Career trajectories of real scientists (Nobel laureates)
- Validation
  - Relative scores of metrics vs. prizes
  - Table of best ranked physicists 1976/2004 vs. prizes won

Links: physauthorsrank.org

Discussion

- Interesting to retrospectively compare predictions of SARA and the prizes won later by the top in the ranking
- The approach of using historical data for predictions assumes that the dynamics don't change, which is not necessarily the case
- Yearly contest idea: predict scientific prize winners based on scholarly citation data

Martin Rosvall: Machine learning for robust rankings

- Zero- vs. first- vs. second-order Markov models for journal-journal citations
- Second-order Markov models particularly useful for interdisciplinary fields
- Good ranking has to be robust: removing journals should not significantly change the ranking of the remaining journals
- Second-order Markov models lead to more pronounced rankings and are more robust
- Robustness comes at the cost of additional data
- Second-order models have higher predictive power
- Mapequation.org: visualization tool for information space navigation

Links: mapequation.org

Discussion

- Zero-, first-, and second-order Markov models can complement each other, and thus can have higher predictive power when used together

Ingo Scholtes: The social dimension of citation networks

- "Science is done by people", Heisenberg
- Social system of scientists
  - Complex social mechanisms leading to information filtering, thus shaping the citation network
  - Social cognition mechanism
  - Attention mechanisms
  - Trust
  - Reputation
- Conference community case study
  - Small community around specific topic, thus homogeneous citation network expected
  - However, high correlation between collaborations and citations
- Hypothesis: an author's importance in the collaboration network is indicative of the citation success of the papers in the network
- Dynamic collaboration network (Microsoft Academic Search data on Computer Science)
  - Two-year rolling window aggregates of the network
- Predicting percentile-based citation success based on collaboration centrality
  - Random forest classifier based on the centrality
  - 6 times better precision than random
  - Low recall, which is good, as otherwise your success would be described just by your position in the collaboration network
- Citation networks mainly have semantic meaning, but also a social component
- Social effects can contribute to a "self-fulfilling prophecy" effect

Discussion

- A productive author probably has more collaborations (thus, more chances to be more central) and a higher chance of getting a paper into the top 10% at random
- Self-citations are considered in the study; excluding them can be considered
- The citation success is predicted for a paper, and the more central author contributes to the prediction
  - A corollary is that a young researcher's citation success depends on the PI's centrality, if the latter is in the author list
- The model can be of interest to journal editors and to data providers, to check this social bias based on their data.
- The levels of social bias among different fields can be compared
  - For computer science (used in the study), social bias is expected to be higher than in e.g. physics, due to conferences (personal meetings) and conference proceedings being the main types of communication.

Stakeholder Session: Data

Evangelia Lipitakis: Quantifying scientific impact using the citation network

- Evolution of citation analysis and ISI Thomson Reuters
- Increasing use of citation analysis by corporate sector
- 58 million records
- 900 million citations
- 3000 journal submissions per year with 10% acceptance rate
- Team of full-time editors deciding about new journal submissions
- 27000+ journals indexed by the WoS platform
- Extended coverage to be added soon
- What can and should be measured
  - productivity and impact: patents, h-index, fractional counting, etc.
  - normalization: category normalization, journal normalized citation impact, etc.
  - top performance and excellence: hot papers, top 1% or 10%, baselines, etc.
  - collaborations: international, etc.
- InCites: Indicators Handbook

Links: webofknowledge.com

Discussion

- Request for a standard test data set with disambiguated and normalized institution names for research purposes
- From a data scientist's perspective it is not easy to decouple the effects of the sudden addition of large numbers of journals from changes in the system's dynamics; providing data on the journal inclusion time can therefore be helpful
- About 3,000 OA journals indexed
- A work in progress showing the citation differences between open access and subscription-based journals
- PLoS One: 83000 articles between 2004-2013
- Normalization based on journal-level subject categories is a disadvantage when considering journals with broad topic coverage
- The historical data for newly included journals are also indexed

Martijn Roelandse

- Open access has a positive effect on impact factor
- Rise of container journals: PLoS One
  - Acceptance rate 85 %
  - 2013 Impact Factor: 3.54
  - Impact factor for such a journal not meaningful due to the differences in disciplines
  - Average citations per discipline very different
- Article-level metrics more important than impact factor of the journal
- Research Evaluation Framework
  - Impact is defined as effect on society, academia, quality of life
- 1:AM 2014 the first altmetrics conference
- 2:AM 2015 expected in September 2015

Link: springer.com/citations

Discussion

- Estimation that 80% will be open access
- Possible emergence of two main journal categories: big container journals versus flagship journals
- Although the impact factor for container journals is not meaningful, it has an influence on behaviour; people from low average IF fields jump onto the PLoS wagon to benefit from the IF coming from life sciences
- Although download data possibly has potential for quantifying impact, information on total downloads is not trivial to obtain, as the publications are usually hosted on multiple services with different data sharing policies.
- Social media such as Twitter can be considered a new fast communication channel for science, but at the same time it can be and is used for "advertising" purposes

Session: Statistical Analysis of Science Networks

Matus Medo: Temporal patterns in social and information systems

- Generalization of the preferential attachment model for citations
  - Growing networks with fitness and aging, Phys. Rev. Lett. 107, 238701
  - Additional intrinsic quality + aging factor describing a decrease of relevance over time
  - In the model, paper popularity grows exponentially with quality
  - Thus, quality depends logarithmically on popularity
- Modification of PageRank to correct for temporal biases
- Classification of users as leaders or followers in social information systems
- Implicit assumptions of centrality measures do not seem to be justified in scientometric data
- Altmetrics are shallow
- We can avoid some of the problems by building only on the community structure of science

Discussion

- Leader detection method can be applied to detect authors cultivating interdisciplinarity, facilitating knowledge flow between disciplines
- For better evidence for the fitness model, attempts to consider the fitness parameter as external and estimate it outside of the model's scope

Olesya Mryglod: The downloads as a measure of attractiveness of scientific publication

- Goodhart's law: "When a measure becomes a target, it ceases to be a good measure."
- Multiple dimensions of scientific impact
  - Popularity, prestige, attractiveness
- Correlation between downloads and citations
- Download patterns for open-access and subscription-based journals qualitatively similar
- Classification of papers by burstiness and overall attractiveness

Discussion

- Deeper investigation of relation between downloads and citations

Alexander Petersen: Quantifying growth trends in science careers with applications in bibliometric analyses

- How fast is science changing?
- Quantifiable patterns of scientific success?
- Unintended consequences of bibliometric measures?
- Budget doubling of NIH in late 1990s resulting in a PhD bubble
- Shifts in co-author numbers over last decades
- Ways of rewarding scientists have to adapt to these changing trends
- Microscopic career growth dynamics and potential pitfalls in the forecasting of careers
  - Flaws in Nature 489, 201–202 addressed
  - Aggregating across different career ages
  - Artificially large R² value in the correlation, simply because the h-index is non-decreasing!
  - Splitting careers into stages: early, postdoc, tenure track
  - Predictive power for h-index for each of these cohorts is low
- Interacting networks of reputation flows
  - Author-specific factors matter for citation dynamics (reputation effects)
  - Reputation effect is strong for papers with small citation numbers
  - Impossible to get a highly-cited paper by reputation alone
  - 66 % increase in citation rate for each unit increase in reputation
- Analyzing dynamic ego collaboration networks
  - Rapid accumulation of co-authors after publication of influential papers
  - Life partners in terms of collaborations (super-ties) have a positive effect on productivity

Plenary discussion

Predicting success in scientific careers: What are the clues?
- Important to understand the focus, depending on the question:
  - is it to confidently rule out the worst cases at the cost of also ruling out some better cases (many false negatives), or
  - is it to confidently identify the very best at the cost of also getting average candidates (many false positives)?
  - more generally, finding the right balance between the two is important and not trivial
- There is a social component in scientific career success, which needs to be subtracted when predicting success as scientific impact
- The distribution of talent is very heterogeneous, and the distributions of citations/funding even more so; the differences have to be accounted for when comparing them or using one to predict the outcome of the others
- Definition of the successful scientist: ideally an influential researcher, in the sense of being a generalist, inspiring other scientists and fields
  - Quantifying such qualities is a challenge but needed

Sensitivity to manipulation

- It is easier to target a scalar indicator and optimize the behaviour for maximizing that indicator (manipulation)
- However, if an indicator is not an aggregate, but comprises multiple carefully selected, independent dimensions, it is much harder to maximize it by manipulation
  - In the ideal case the only way to maximize it will be the intended behaviour, i.e. the indicator will serve its purpose
  - The higher the number of dimensions that the indicator comprises, the harder it is to manipulate
- Creative ways of manipulation
  - Think about a scientist at the saturation point, generating a project with wide visibility to boost his impact
  - Manipulation by adding additional feedback layers of attention: altmetrics is one example of such an additional layer
  - Distortion of research findings and messages in public media
- Robust measures are especially crucial, considering that only a small fraction of scientists adopting manipulative behaviour is needed to undermine the social norms of "proper" behaviour

Interdisciplinarity

- The question of what to measure for interdisciplinarity evaluation needs to be answered before the question of "how".
- Going beyond measures based on mere counting, looking at the network is particularly crucial
- Interdisciplinary impact is particularly prone to manipulation
  - One can try to sell trivial things from their discipline to other disciplines as a breakthrough
- Research on predicting the emergence of new scientific fields is currently being conducted

Measures, indicators for decision-support

- The choice between simple, human-readable, but inaccurate measures and more complex, more accurate measures
- Risk of using indicators to shift the decision responsibility from humans to machines exists
  - Using the indicators for decision support, versus letting machines take decisions
- Importance of conflicting goals/"frustrated" problems
  - There is no ideal option/candidate
  - This allows differences and different preferences to be viable
- The importance of focus also applies here:
  - Confidence in ruling out the worst at the cost of ruling out the not that bad ones (false negatives)
  - Confidence in identifying the best at the cost of also taking not that good ones (false positives)
- Transparency of a measure can also lie in the methodology (e.g., clearly defined machine learning algorithm with multiple dimensions), rather than in having a simple numeric measure.
  - Often it is important not to show aggregate numbers, because their interpretation needs knowledge of statistics
- Understanding and knowing what data is used behind measures is as important as knowing and understanding the method behind these measures
  - This is a relevant issue, as the same measures differ not only across different data providers, but sometimes also for the same data provider in different contexts (e.g., the measure they provide is different from the independently calculated one, based on their data)
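
Illustration: full vs. fractional counting

The bibliometrics and ranking sessions noted that the choice between full and fractional counting is not trivial, and the plenary stressed knowing the method behind a measure. The Python sketch below is only a toy illustration of that point. The paper list and institution names are made up, and the equal split of credit over the distinct institutions on a paper is an assumed reading of "weights determined based on the number of different institutions involved", not necessarily the procedure used by any ranking provider.

```python
from collections import defaultdict

# Hypothetical toy data: each paper lists the institutions of its authors.
papers = [
    {"id": "p1", "institutions": ["Uni A"]},
    {"id": "p2", "institutions": ["Uni A", "Uni B"]},
    {"id": "p3", "institutions": ["Uni A", "Uni B", "Uni C", "Uni C"]},
    {"id": "p4", "institutions": ["Uni B"]},
]


def count_publications(papers, fractional):
    """Return publication credit per institution.

    Full counting gives every participating institution a credit of 1.
    Fractional counting (assumption: equal split over the distinct
    institutions on the paper) gives each of k institutions a credit of 1/k.
    """
    credit = defaultdict(float)
    for paper in papers:
        distinct = set(paper["institutions"])
        weight = 1.0 / len(distinct) if fractional else 1.0
        for institution in distinct:
            credit[institution] += weight
    return dict(credit)


full = count_publications(papers, fractional=False)
frac = count_publications(papers, fractional=True)

for institution in sorted(full):
    print(f"{institution}: full={full[institution]:.2f} "
          f"fractional={frac[institution]:.2f}")
```

In this toy data the full counts add up to 7 credits for only 4 papers, while the fractional credits add up to the number of papers, which is one way to see why full counting favours units with many collaborative publications.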