In this post, I look at a couple of ways to analyse bibliographic data, starting with the simplest, word clouds, and then moving on to an interesting tool, VOSviewer. All the data is taken from the University of Northampton's Research Repository, Nectar, for members of the academic Computing team.
Word Clouds
The image above is based on data for all the listed publications for the computing team since 2011. It includes the authors, title, conference, etc., but no abstract. It takes quite a bit of editing, and really all that is shown is the names of the most-published authors and a few key terms. It provides a nice snapshot but is difficult to interpret.
Taking this a bit further, here are word clouds of the titles of research outputs for each year.
[Word clouds of titles by year: Titles 2016, Titles 2015, Titles 2014, Titles 2013, Titles 2012, Titles 2011]
The interesting trend is the changing nature of the research: in 2011, computer education comes out as a strong feature. In both 2015 and 2016 the areas have changed, and the terms around architecture and networking come out more strongly.
Co-author Analysis
Here is the same data, but processed using software that looks only at the authors and shows how many times a particular pair have published together. Some interconnections between authors can be seen.
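As a rough illustration of what "how many times a particular pair have published together" means, here is a minimal Python sketch. It is not how the tool itself works internally; the author names and the `publications` list are made up for the example, and real repository exports would need name cleaning first.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each publication is represented by its list of authors.
publications = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author C", "Author D"],
]

def coauthor_pair_counts(pubs):
    """Count how many publications each pair of authors shares."""
    counts = Counter()
    for authors in pubs:
        # Sort and de-duplicate so (A, B) and (B, A) are the same pair.
        unique_authors = sorted(set(authors))
        for pair in combinations(unique_authors, 2):
            counts[pair] += 1
    return counts

pair_counts = coauthor_pair_counts(publications)
for (first, second), shared in pair_counts.most_common():
    print(f"{first} + {second}: {shared}")
```

Each pair becomes a link in the co-author map, and the count can be used as the strength of that link.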
Text Analysis
This is the same tool as above, but this time looking at the text within the titles and abstracts. Taking all the words in the title and abstract of the papers, using binary counting (so a term is counted only once per publication), and allowing through only the 60% of terms with the highest relevance, you can get a graph like the one below. Personally, I find the graph beautiful; it gives a sense of a lot going on, but it is difficult to interpret.
Now the same exercise is repeated, but with only words that appear at least three times.
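To make the idea of binary counting concrete, here is a small Python sketch comparing it with full counting. It is a simplification under stated assumptions: the three example titles are invented, terms are treated as single lower-case words, and the relevance filtering step is not reproduced at all.

```python
import re
from collections import Counter

# Hypothetical titles standing in for the title and abstract text.
docs = [
    "Wireless sensor networks for wireless monitoring",
    "Teaching programming with robots",
    "Sensor networks in teaching",
]

def tokenize(text):
    """Split text into lower-case words (a crude stand-in for term extraction)."""
    return re.findall(r"[a-z]+", text.lower())

def full_counts(texts):
    """Full counting: every occurrence of a term is counted."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts

def binary_counts(texts):
    """Binary counting: a term is counted at most once per publication."""
    counts = Counter()
    for text in texts:
        counts.update(set(tokenize(text)))
    return counts

full = full_counts(docs)
binary = binary_counts(docs)
print("wireless, full counting:", full["wireless"])
print("wireless, binary counting:", binary["wireless"])
```

With binary counting, a word repeated many times in a single abstract does not dominate the map; what matters is how many publications mention it.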
The groups are clearer. This grouping has some interesting subject areas coming out, for example wireless networking and machine-to-machine communication, alongside pedagogic, cultural research and sensors for animals.
Let us apply this last approach to some individual cases.
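The "at least three times" filter used above amounts to dropping rare terms before the map is drawn. A minimal Python sketch of that step follows; the function name `keep_frequent_terms` and the example counts are hypothetical and only illustrate the thresholding idea.

```python
from collections import Counter

def keep_frequent_terms(term_counts, min_occurrences=3):
    """Keep only the terms that occur at least min_occurrences times."""
    return {
        term: count
        for term, count in term_counts.items()
        if count >= min_occurrences
    }

# Hypothetical per-term counts (for example, from binary counting).
example_counts = Counter({"wireless": 5, "sensor": 3, "pedagogy": 2, "robot": 1})

frequent = keep_frequent_terms(example_counts)
print(sorted(frequent))
```

Raising the threshold gives fewer, clearer groups at the cost of hiding topics that appear in only one or two publications.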
Case Study 1: Mid-Research Career Academic
Case Study 2: Mid-Research Career Academic 2
In both Case Study 1 and Case Study 2 there are several groupings. In Case Study 1 the subjects in the groupings are more diverse than in Case Study 2, which shows greater specialisation.
Case Study 3: Early Career Researcher
There is greater separation in the groups (though three groups are related in terms of subject) than in the first two case studies. This may in part be due to the smaller number of papers compared to the first two case studies (between four and eight times fewer).
Case Study 4: PhD by Publication Candidate
There are stronger inter-relationships between the groups than in some of the other case studies. This, I would argue, is a positive feature for someone pursuing a PhD by publication, suggesting a coherent 'story' to their publications.
All views are the author's own and do not reflect the views of any organisations the author is associated with. Twitter: @scottturneruon