From Webscraping to Social Network Analyses: Structure and Evolution of Scientific Co-publishing Networks

In this workshop you will first collect and then analyze co-publishing networks of academic departments. In Part 1, you learn to scrape metadata from departmental websites (via R packages such as rvest and RSelenium), assign name-based gender and ethnicity signals, retrieve scholars’ publications, and construct longitudinal co-publishing networks. In Part 2, you learn to analyze the structure and evolution of these networks with RSiena. You will answer questions such as: Are co-publishing networks segregated by scientific success? And is success-based segregation in co-publishing networks the result of departmental characteristics, structural network effects, influence processes, or selection processes? For each step, we provide clear (proof-of-principle) coding examples and output data, so that you will not get stuck along the way. Depending on your skills and progress, you may analyze departments of your own choosing. You will keep track of your work in a lab journal on GitHub.


  • Intermediate familiarity with working in R (base and tidyverse)
  • A beginner’s understanding of SNA via stochastic actor-oriented models
  • Entry-level knowledge of Git and GitHub

