Academic Migration and Academic Networks: Evidence from Scholarly Big Data and the Iron Curtain
Map showing the Eastern Bloc (in red) and Western Europe (in blue)
So do we really care?
The decision to migrate is one of the most important decisions an individual can make. It is shaped by many factors, such as inequality levels at home and at the intended destination, returns to education, migration costs, employment prospects, life cycle considerations, and many more. Networks play a vital role in this decision, as they influence all of the above factors in one way or another. Networks help individuals learn about opportunities and conditions in potential destinations; moreover, at home, the structure of migrants’ social networks shapes, by construction, their ability and desire to learn, and thus their migration prospects. At the macro level, studying the migration of scientists is important because of its implications for brain drain: human capital drives differences in economic prosperity across space, is the engine of innovation, and is a major source of knowledge externalities. Big data makes it easier to study the role of networks in migration, and doing so in the context of scientists is the major contribution of this research.
Figure showing the literature this research contributes to and draws from
How does this contribute to the literature?
The literature is ambiguous about the causal relationship between migration and networks. Historically, it has been difficult to differentiate between distinct sources of social capital (that is, different types and structures of networks) in a single empirical setting. In the migration case specifically, traditional data sources make it hard to link social network structure to migration decisions. Additionally, the existing empirical evidence on the effects of networks makes the implicit assumption, imposed by data constraints, that all potential migrants benefit equally from the networks at destination (Bertoli and Ruyssen, 2018). This evidence measures networks in various ways: the share of households with a migrant at the village level (McKenzie and Rapoport, 2010), the size of the diaspora in each destination country (Bertoli and Moraga, 2015; Beine et al., 2011, 2015), or networks at the country level (Bertoli, 2010). All of these rely on the same implicit assumption.
This research, instead, relaxes this assumption by mapping and identifying the networks of individuals and their specific characteristics, as Blumenstock et al. (2020) recently did. Lastly, and most importantly, this work attempts to causally identify the effect of networks on migration decisions by looking at a specific context in which networks could not be manipulated prior to migration, using a rich data source that allows for a wide range of controls.
Identification of the causal effect
Focusing on academics from Eastern Europe (henceforth EE) in 1980-1988 and their academic networks over the same period, I investigate the effect of academic network characteristics, by location, on the probability of migrating after the fall of the Berlin Wall in 1989 and up to 2003, when many EE countries held referendums or signed treaties to join the EU. The timing offers a unique context in which the fall of the Eastern Bloc was not anticipated; together with the uniquely rich data, this makes identification possible. Approximately 30k academics from EE were identified, about 3% of whom were migrants.
During this period, the Iron Curtain was in place: a political boundary dividing Europe into two separate areas from the end of the Second World War in 1945 until the end of the Cold War in 1991. It severely limited migration between the East and the West from 1950 until its fall in 1991 (Van Mol and de Valk, 2016). The series of events that preceded the border openings and the collapse of the Soviet Union led to the largest migration wave in and from Eastern Europe since the refugee and forced migration of WWII (Bade, 2008). After the opening of the Iron Curtain in November 1989, marked by the fall of the Berlin Wall, immigration from Eastern Europe surged in all categories, including the migration of academics and scientists (Marshall, 2000). Thus, the collapse of the Iron Curtain induced new migration flows and facilitated the migration of academics and researchers from Eastern Europe, the focus of this analysis. Note that the Eastern European authors identified in this period are tracked up until 2003, the year in which many Eastern European countries signed treaties or held referendums to join the EU, ahead of the enlargement of the European Union in 2004. For example, after 2003, academic migration from Eastern Europe to the United Kingdom increased through movements from the countries that gained membership of the European Union in 2004 (Burrell, 2010). In this research, I find evidence that academics were not able to anticipate the fall of the Berlin Wall or their possibility to migrate. As a result, there appears to be no manipulation of network size and quality across all types of migrants (within EE and out of EE), implying that migration did not induce networks, which makes it possible to identify the causal effect of networks on migration.
Using this context and these data, I test the hypothesis that the effects of the size and quality of pre-1989 academic networks, classified by location (home, destination and foreign), on the probability of migrating go through two distinct channels: the cost channel and the signalling channel, respectively. The cost channel captures how a network characteristic reduces or increases the cost of migration, thus acting as a facilitator or a de-facilitator of migration. The signalling channel, on the other hand, captures how a network characteristic serves as a signal of the academic's own quality and potential contribution to the new host institution, thus also acting as a facilitator or a de-facilitator of migration.
Schema of Microsoft Academic Knowledge Graph showing how everything is derived from the papers themselves
Sneak peek of the data!
The schema shows how everything is derived from the paper ID: information about the academic, his field, his co-authors (i.e. his network) and, consequently, information about that network (mainly its size and quality). Size is defined as the sum of co-authors, by location, from 1980-1988. Quality, on the other hand, is the average citation count and average rank of the co-authors by location, from 1980-1988 (for the home network, only average rank is used, because of definitional issues). The main dependent variable is whether or not an academic migrated after the fall and up to 2003, which is then classified into within-EE migration, out-of-EE migration, and no migration. The figure shows the motivation for choosing MAKG over other data sources. Even though the data are not perfect, especially when it comes to measuring the quality of academics, MAKG is considered a better option than traditional scholarly data and other scholarly big data sources. Some descriptive results are worth noting. Out of the approximately 30k academics from EE, 855 are migrants: 509 migrated out of EE and 346 migrated within EE. Given the focus on Eastern European academics behind the Iron Curtain, there is no evidence that academics strategically manipulated their networks in anticipation of migration. Additionally, there was only within-EE migration prior to the fall of the Berlin Wall, which further supports the view of the Iron Curtain as a barrier to migration from the East. For the majority of academics, the most frequent language of publication was English. Migrants tend to be older and to have larger networks overall, with smaller home networks and larger foreign networks. The most popular destination is the United States, especially for mathematicians and scientists.
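To make the size and quality definitions above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline used in this research: the `CoauthorLink` record and its field names are hypothetical, "sum of co-authors" is read as a count of distinct co-authors, and each co-author's citation count and rank are taken as given.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class CoauthorLink:
    """One co-authorship record (hypothetical layout, not the MAKG schema)."""
    coauthor_id: str
    location: str   # "home", "destination" or "foreign"
    year: int       # publication year of the joint paper
    citations: int  # citation count of the co-author
    rank: int       # rank of the co-author


def network_measures(links, start=1980, end=1988):
    """Return size and quality of the network by location for one academic."""
    by_location = defaultdict(dict)
    for link in links:
        # Keep only co-authorships inside the pre-1989 window.
        if start <= link.year <= end:
            # Assumption: a co-author counts once, however many joint papers.
            by_location[link.location][link.coauthor_id] = link

    measures = {}
    for location, coauthors in by_location.items():
        people = list(coauthors.values())
        measures[location] = {
            "size": len(people),
            "avg_citations": mean(p.citations for p in people),
            "avg_rank": mean(p.rank for p in people),
        }
    return measures
```

Counting each co-author once is a design choice made for this sketch; weighting co-authors by the number of joint papers would be an equally plausible reading of the definition.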
Motivating the use of Microsoft Academic Knowledge Graph as a data source for this research
Findings and Implications
In this analysis, I find that a one-unit increase in home network size (1980-1988) reduces the probability of migrating (1989-2003) by 0.05-0.1pp. Distinguishing between the types of migration, I find that an increase in home network size increases the chances of an academic not migrating relative to migrating to another EE country. In fact, for a one-unit increase in home network size, holding all other variables constant, an academic is 1.034-1.071 times more likely to not migrate than to migrate to another EE country; that is, the relative risk (odds) is 3.4%-7.1% higher. Comparing those who migrated out of EE with those who migrated within EE, the evidence mainly implies lower chances of migrating out of EE relative to migrating within EE when home network size increases, in line with the theoretical predictions. On the other hand, an increase in destination network size for migrants increases the probability of migrating by approximately 7.7pp, highlighting the lower costs of migration due to already established connections at the destination and probably an easier integration into the new host institution, again in line with the theoretical prediction. An increase in foreign network size increases the probability of migration by 0.1pp, although the effect is not statistically significant in all specifications. Distinguishing between the types of migration yields evidence confirming the assumption that foreign connections are more likely to be geographically close. This is because a one-unit increase in foreign network size increases the chances of an academic migrating within EE relative to not migrating, as the relative risk (odds) of not migrating is 1.5% lower. Similarly, a one-unit increase in foreign network size increases the chances of migrating within EE relative to migrating out of EE, with the relative risk (odds) changing by 0.3%. For destination network size, as expected, the chances of not migrating versus migrating within EE are nearly zero.
All of this is consistent with the cost channel operating mostly through network size: larger networks at home increase the cost of migration, as leaving connections behind is costly, whilst connections at the destination reduce the cost of migration, as academic ties are already in place and ease integration into the host institution. For foreign network size, the statistical insignificance and the context do not help in disentangling which channel the effect operates through.
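The "times more likely" and "% higher" figures quoted for network size are two ways of reading the same ratio. The sketch below shows the conversion, assuming the reported numbers are relative risk ratios of the kind produced by a multinomial-logit-style model; the source does not name the estimator, so that is an assumption, and the function names are illustrative.

```python
import math


def coefficient_to_rrr(beta: float) -> float:
    """Relative risk ratio implied by a (hypothetical) logit coefficient."""
    return math.exp(beta)


def rrr_to_percent_change(rrr: float) -> float:
    """Percent change in relative risk for a one-unit increase in the regressor."""
    return (rrr - 1.0) * 100.0


def describe(rrr: float) -> str:
    """Format a ratio as 'x% higher' or 'x% lower'."""
    change = rrr_to_percent_change(rrr)
    direction = "higher" if change >= 0 else "lower"
    return f"{abs(change):.1f}% {direction}"


if __name__ == "__main__":
    # Ratios quoted in the text for home network size
    # (not migrating versus migrating within EE).
    for rrr in (1.034, 1.071):
        print(rrr, "->", describe(rrr))
```

A ratio above 1 means the outcome becomes relatively more likely than the baseline category, and a ratio below 1 means it becomes relatively less likely.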
An increase in home network quality, on the other hand, increases the probability of migrating, which suggests that the signalling channel marginally outweighs the cost channel. The effect is positive and statistically significant, yet not economically significant (i.e. very small in magnitude). This implies that higher home network quality acts more as a signal of the academic's quality, and thus as a facilitator of migration, than as a de-facilitator through the cost of leaving a "better" home network behind. A decrease in home network quality increases the chances of not migrating versus migrating within EE, yet the effect is not very economically significant. Similarly, a decrease in home network quality decreases the chances of migrating outside of EE versus migrating within EE. Looking at destination and foreign network quality, I find evidence that better-quality destination and foreign networks significantly increase the probability of migrating. However, the effects are not economically significant, at 0.0003pp and 0.0001pp. Taken together with the results on destination and foreign network size, these results could imply that having a larger network at the destination or in a foreign country is more important than having a high-quality connection in another country. This might be because academics in EE were so segregated from the rest of the academic community worldwide that any additional connection would be of great help and would increase migration prospects, irrespective of the quality of that connection. Furthermore, an increase in foreign network quality increases the chances of an academic migrating within EE versus migrating out of EE, as the relative risk (odds) is 0.1% lower.
This highlights that higher foreign network quality, with foreign connections assumed to be mostly in other EE countries, operates through the cost channel: leaving the region entirely means losing these foreign connections, so it acts more as a pull factor.
All of this is consistent with the signalling channel operating mostly through network quality, in which better networks in all locations act as a signal of the academic's quality, openness and options. However, the fact that many of the results are not economically significant, and sometimes not statistically significant, suggests that size matters more than quality. This could be specific to this exact context, where academics were highly isolated from the rest of the academic world.
As expected, prior migration does facilitate migration, especially if it occurred through an Eastern European country that became part of the EU in the 2004 enlargement. Looking at heterogeneous effects by broad discipline offers some useful insights that are intuitive, novel, and in line with findings from previous, yet different, papers. There are no heterogeneous home network size effects by broad discipline or field of study. In contrast, there are heterogeneous effects of destination network size that are significant, statistically and economically, for Mathematicians, Computer Scientists and Engineers. This aligns with Borjas and Doran (2012), who argue that any Soviet Mathematician who tried to communicate with scholars outside of the Soviet Union, particularly in the US, risked attracting the attention of the KGB or even arrest. Thus, due to the extremely limited contact, an additional contact at the destination would increase migration prospects for them more than for any other field, especially since they were of high quality and reputation. Additionally, an increase in foreign network size increases the probability of migrating significantly more for academics from the Arts and Humanities than for academics from other fields. This could be explained by the fact that, in fields with larger network barriers and less quality signalling, a foreign network, which is a measure of openness, quality and options, plays a more important role in facilitating migration. This aligns with the results from Becker et al. (2021). Looking at network quality, an increase in home network quality has a significantly different and positive effect on the migration probability of Mathematicians, Computer Scientists and Engineers. The evidence confirms that the signalling channel outweighs the cost channel.
This happens more for these specific disciplines, as they are the only ones with a significant interaction term when the home network belongs to the top 25%, whereas other disciplines would need their networks to be in the top 10% for the effect to be significant. This hints at the reputation and quality of Mathematicians, in line with Borjas and Doran (2012). There are no heterogeneous effects by destination network quality, and heterogeneous effects by foreign network quality appear only for Arts and Humanities academics, in line with the above interpretation for foreign network size.
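The top 25% and top 10% comparisons above rest on an indicator for high-quality networks interacted with discipline. A minimal sketch of how such terms could be built follows. It is an assumption-laden illustration rather than the specification used here: quality is taken to be a single score where higher is better (for example average co-author citations), the cutoff is the sample quantile from Python's `statistics.quantiles` with its default method, and all names are hypothetical.

```python
from statistics import quantiles


def interaction_terms(qualities, disciplines, target_disciplines, top_share=0.25):
    """Build a top-share indicator, a discipline indicator and their product.

    qualities          -- one network quality score per academic (higher = better)
    disciplines        -- one discipline label per academic
    target_disciplines -- set of disciplines whose effect may differ
    top_share          -- 0.25 for the top 25%, 0.10 for the top 10%
    """
    # Split the sample into n equal-probability groups and keep the last cut point,
    # i.e. the threshold above which a network is in the top share.
    n = round(1 / top_share)
    cutoff = quantiles(qualities, n=n)[-1]

    rows = []
    for quality, discipline in zip(qualities, disciplines):
        top_network = int(quality > cutoff)
        in_group = int(discipline in target_disciplines)
        rows.append({
            "top_network": top_network,
            "in_group": in_group,
            # The interaction is 1 only for target-discipline academics
            # whose network is in the top share.
            "interaction": top_network * in_group,
        })
    return rows
```

In a regression, the coefficient on `interaction` would capture how much the effect of a top-share network differs for the target disciplines relative to all others.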
Concluding Remarks
In conclusion, this research contributes to several strands of the literature. It first contributes to the current wave of research on human migration through the lens of big data (see Sirbu et al., 2020). It expands on this literature by focusing on a unique historical context that brings us a step closer to identifying the causal impact of networks on migration; in doing so, it also expands on the literature on migration after the fall of the Berlin Wall, the Iron Curtain and the dissolution of the Soviet Union. By focusing on academics, it contributes to the limited literature on academic migration and on what shapes academics' migration decisions (Teichler, 2015), as well as to the extensive literature on brain drain. This research also provides a new empirical perspective on the determinants of academic migration, paying particular attention to academic networks. As such, it contributes to the literature on the empirical relationship between networks and migration, which, as mentioned before, is hard to establish empirically. The alignment between network theory and social capital theory also allows this research to contribute to the empirical literature on social network theory.
Author: Donia Kamel*
This work is part of a micro-project done in collaboration between Paris School of Economics (Professor Hillel Rapoport and PhD student Donia Kamel), the University of Pisa (Dr. Laura Pollacci) and ISTI-CNR (Dr. Giulio Rossetti). This work is classified under the Migration Studies Exploratory under WP10, SoBigData++. It could also be classified under the Demography, Economy and Finance Exploratory.
For more information about the cited literature and this work, feel free to contact me at: doniasameh3@gmail.com or on Twitter: @Donia_Sameh