How is scientific knowledge created? Are there patterns and special characteristics behind the flow of scientific knowledge? In this article I will guide you through my research work and try to give some answers to these questions!
Collaboration is a crucial ingredient of academic research. Nowadays, most research articles are written by multiple authors. But co-authorship is not the only way in which researchers collaborate. There is also a form that we label informal collaboration: giving each other feedback and comments on unpublished manuscripts. It is prevalent in academic research, no matter which discipline. Some economists have concluded that without these informal ways of collaborating there would be much less exchange of ideas. While many scientists would agree with this based on their own experience, we have so far lacked the data to study informal collaboration on a large scale. In this article I describe how we aim to change that, which interesting facts we found in the dataset we constructed, and which questions we want to tackle in the future.
A big part of my PhD research was devoted to collecting data, which resulted in a large and unique dataset on informal collaboration in Financial Economics. I derived the data from the acknowledgement sections of more than 5,500 papers published over 15 years (1997-2011). These papers appeared in six finance journals, all with a similar focus. All the data is publicly available online to researchers.
In acknowledgement sections, authors thank others for feedback and helpful comments. Though common in all sciences, acknowledgement is most pronounced in Economics and its subfields. On average, five researchers are acknowledged per paper. If you add the authors (two on average), the editor and one or two anonymous referees, you reach an average group size of nine to ten researchers. That is the number of researchers usually involved in the conceptual production of knowledge (still excluding research assistants, secretaries, colleagues in seminar presentations, etc.). As a profession, however, we only reward authors, while commenters get nothing but an acknowledgement. There are more than 14,000 researchers in the dataset, most of them acknowledged commenters. Interestingly, only a minority of authors in the dataset also appear as acknowledged commenters.
The number of authors and acknowledged commenters in the dataset
The dataset we built enabled us to create the informal collaboration network in Financial Economics. We connect two researchers when one acknowledges the other. In what follows we illustrate two practical uses of our dataset.
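To make the construction concrete, here is a minimal sketch of how such a network could be assembled from acknowledgement data. The record layout, the names in `papers` and the function `build_informal_network` are hypothetical and chosen for illustration only. The sketch also assumes that ties are undirected and that only author-commenter ties are added; co-author ties could be added in the same way.

```python
from collections import defaultdict

# Hypothetical toy records: each paper lists its authors and the
# commenters thanked in its acknowledgement section.
papers = [
    {"authors": ["A", "B"], "commenters": ["C", "D"]},
    {"authors": ["C"], "commenters": ["A", "E"]},
]

def build_informal_network(papers):
    """Return an undirected adjacency map that links each author to
    every commenter acknowledged on the same paper."""
    network = defaultdict(set)
    for paper in papers:
        for author in paper["authors"]:
            for commenter in paper["commenters"]:
                if author != commenter:  # ignore self-acknowledgements
                    network[author].add(commenter)
                    network[commenter].add(author)
    return dict(network)

network = build_informal_network(papers)
for researcher in sorted(network):
    print(researcher, sorted(network[researcher]))
```

In this toy example researcher C appears both as an author and as an acknowledged commenter, which is how a single person can tie otherwise separate papers together.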
The first is to forecast the academic productivity of researchers, a task already studied in the scientific literature. Common approaches use measures derived from co-author networks only. These measures include the number of co-authors and the productivity of co-authors, but also network centralities. There are multiple reasons why including those variables makes sense for predicting a researcher's future productivity. Probably the most plausible is improved access to knowledge flows: a well-connected researcher receives a lot of information and is likely to have access to new information, e.g. on new datasets, unpublished results, etc. Being connected to such well-connected researchers in turn allows other researchers to take advantage of their knowledge and information. But do co-author networks give a complete picture of the flow of knowledge between researchers? To answer this we compared measures derived from the co-author network (using the six journals mentioned before) with measures derived from the network of informal collaboration. Interestingly, information embedded in the network of informal collaboration outperforms information embedded in co-author networks.
The second use of such a dataset is to predict citation counts of academic papers. While it is common knowledge that the number of co-authors and the number of commenters on a paper correlate with citation counts, the novelty of our approach is to include their network centralities in two different networks. One is the co-author network, the other the network of informal collaboration. We found that the latter outperforms the co-author network.
We also found that authors who connect research communities generally considered to be distinct publish less in top journals. Once published, however, their papers receive more citations than the journal average. We identify these researchers based on their betweenness centrality. Betweenness centrality measures how often someone lies on a shortest path between two others, as a fraction of all shortest paths between those two. Think of someone who has co-authors in two research communities that few other researchers connect. Researchers with high betweenness centrality play a special role in the flow of information within the profession. Removing such a researcher results in lower rates of information diffusion. It is very difficult to detect these researchers based on other characteristics; a complete view of the network is needed. We put the list of the most central researchers online, together with an interactive network visualization.
On an aggregate level, acknowledgements reveal to some extent how the profession works. For example, we were surprised that the largest share of researchers in the network were not authors, but commenters. About two thirds of the researchers are only acknowledged and are not also authors in our sample. For a profession that values high-quality publication above everything, it is surprising that so many researchers choose to help others without official reward.
Another interesting fact shedding light on the functioning of the community is the network structure. The network of informal collaboration is far better connected than comparable co-author networks. There are more links between the researchers, and there are more researchers. The increased connectivity is consequential for estimates of the speed of information diffusion: information on new results, new datasets or new trends spreads faster.
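As an illustration of the measure, the following sketch computes unnormalised betweenness centrality on a small, made-up network using Brandes' algorithm. It assumes an unweighted, undirected graph. The node names and the function `betweenness_centrality` are hypothetical, and this is not necessarily the implementation used for the results reported here.

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph given as
    {node: set_of_neighbours}. Returns, for each node, the sum over all
    pairs of other nodes of the fraction of shortest paths between the
    pair that pass through it (unnormalised)."""
    scores = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        visit_order = []
        predecessors = {v: [] for v in graph}
        n_paths = {v: 0 for v in graph}
        n_paths[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in graph[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, starting with the farthest nodes.
        dependency = {v: 0.0 for v in graph}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: score / 2 for v, score in scores.items()}

# Two tightly knit communities (a1-a3 and b1-b3) bridged only by "x".
graph = {
    "a1": {"a2", "a3", "x"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2"},
    "x": {"a1", "b1"},
    "b1": {"b2", "b3", "x"}, "b2": {"b1", "b3"}, "b3": {"b1", "b2"},
}
scores = betweenness_centrality(graph)
print(max(scores, key=scores.get))  # the bridging researcher "x"
```

In this toy network every shortest path between the two communities runs through "x", which therefore gets the highest score although it has only two ties.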
Co-author network for 2009-2011 using 6 finance journals
The same network including acknowledged commenters and ties between authors and commenters
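The kind of comparison described above can be sketched on toy data. The two small networks below are invented for illustration, and the function `summarize` is a hypothetical helper. It reports the number of researchers, the number of links, and the size of the largest connected group, which is only one of several possible ways to quantify connectivity.

```python
def summarize(graph):
    """Return (number of researchers, number of links, size of the
    largest connected component) for {node: set_of_neighbours}."""
    n_links = sum(len(neighbours) for neighbours in graph.values()) // 2
    seen, largest = set(), 0
    for start in graph:
        if start in seen:
            continue
        # Collect everyone reachable from `start`.
        component, frontier = {start}, [start]
        while frontier:
            node = frontier.pop()
            for neighbour in graph[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    frontier.append(neighbour)
        seen |= component
        largest = max(largest, len(component))
    return len(graph), n_links, largest

# Toy co-author network: two separate pairs of co-authors.
coauthors = {"A": {"B"}, "B": {"A"}, "C": {"D"}, "D": {"C"}}
# Same network plus commenter E, who is acknowledged by A and C.
informal = {
    "A": {"B", "E"}, "B": {"A"}, "C": {"D", "E"}, "D": {"C"},
    "E": {"A", "C"},
}

print(summarize(coauthors))
print(summarize(informal))
```

Adding a single commenter here both enlarges the network and joins two groups that the co-author ties alone leave disconnected.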
Another finding concerns the question of when in their career researchers are acknowledged. For every acknowledged commenter we compute the time that has passed since their first publication, a measure of academic experience. The bulk of comments is given by researchers 0 to 20 years into their career, with the modal number of years being 7. For often-acknowledged researchers there seems to be a steady rise followed by a steady decline in the number of papers acknowledging them. This could be a sign of first increasing, then decreasing influence on the profession in which they excel.
The relation of being acknowledged and academic age of commenters
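The experience measure can be illustrated with a few lines of code. The records below are made up, and the tuple layout and the function `academic_ages` are assumptions for this sketch. It takes academic age to be the year of the acknowledging paper minus the year of the commenter's first publication.

```python
from collections import Counter

# Hypothetical toy records: (commenter, year of the acknowledging paper,
# year of the commenter's first publication).
acknowledgements = [
    ("C", 2005, 1998),
    ("C", 2006, 1998),
    ("D", 2004, 1997),
    ("E", 2010, 2008),
    ("F", 2011, 2004),
]

def academic_ages(records):
    """Years since first publication at the time of each acknowledgement."""
    return [paper_year - first_year for _, paper_year, first_year in records]

ages = academic_ages(acknowledgements)
age_counts = Counter(ages)
# The most frequent academic age and how many comments fall on it.
modal_age, n_comments = age_counts.most_common(1)[0]
print(modal_age, n_comments)
```

Counting acknowledgements rather than distinct commenters means that a researcher who is thanked often contributes to the distribution once per acknowledgement.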
One of the most interesting findings, however, concerns female researchers. Women are acknowledged less often than comparable male counterparts. Specifically, the average male author is acknowledged by 4 authors and on 2 papers during a period of three years. A female author with the same level of academic experience and productivity is acknowledged by 3.5 authors and on 1.7 papers on average during the same three years. Female researchers in general have lower values of various centrality measures, including betweenness centrality. This may be consequential for them because they are cut off from information flows within the profession. It may also be consequential for the profession, as it does not include female researchers' views to the same degree. The statistical penalty for women in being central disappears around 2005/2006, but they continue to be acknowledged less often.
The dataset is but the beginning. There are many open questions we want to tackle: Which mechanism underlies the correlation between eigenvector centrality and a researcher's future productivity? What incentivizes researchers to comment on each other's work? What explains the age effects in the provision of commentary? Why are female researchers less often acknowledged than comparable males? How did the structure change during the academic reform following the global financial crisis?
Michael Rose is a postdoctoral researcher at the Max Planck Institute for Innovation and Competition.
 Co-Pierre Georg and Michael Rose, What 5,000 Acknowledgements Tell Us About Informal Collaboration in Financial Economics.