
Bibliographic Analysis Tools for Computing

In this post, I look at a couple of ways to analyse bibliographic data, starting with the simplest, word clouds, and then moving on to an interesting tool, VOSviewer. All the data is taken from the University of Northampton's Research Repository (Nectar) for members of the academic Computing team.




Word Clouds

The word cloud image is based on data for all the listed publications for the Computing team since 2011. It includes the authors, title, conference, etc., but no abstracts. It takes quite a bit of editing, and really all that is shown is the names of the most published authors and a few key terms. It provides a nice snapshot but is difficult to interpret.
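A word cloud is essentially a picture of term frequencies, so the counting behind it can be sketched in a few lines of Python. This is only a minimal illustration: the titles, the tiny stop-word list and the function name `term_frequencies` are all made up for the example, and it is not the tool used to produce the images in this post.

```python
import re
from collections import Counter

# Hypothetical titles, invented for illustration (not taken from Nectar).
titles = [
    "Teaching programming with robots",
    "Robots in computing education",
    "Wireless sensor networks for monitoring",
]

# A deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "with"}


def term_frequencies(texts):
    """Count how often each non-stop-word appears across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


frequencies = term_frequencies(titles)
# The larger the count, the larger the word would be drawn in the cloud.
print(frequencies.most_common(3))
```

Much of the "editing" mentioned above is, in effect, extending the stop-word list and tidying author names before counting.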

Taking this a bit further, the word clouds below are built from the titles of research outputs for each year.
Titles 2016
Titles 2015
Titles 2014
Titles 2013
Titles 2012
Titles 2011

The interesting trend is the changing nature of the research. In 2011, computer education comes out as a strong feature. In both 2015 and 2016 the areas have changed, and terms around architecture and networking come out more strongly.



Co-author Analysis
Here is the same data, but processed using software that looks only at the authors and shows how many times a particular pair have published together. Some interconnections between authors can be seen.
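The idea of counting how often each pair of authors has published together can be sketched as follows. The author names and the function name `coauthor_pairs` are placeholders invented for the example, and this shows only the general counting idea, not how the software itself is implemented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author lists, one per publication (placeholder names).
publications = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author C", "Author D"],
]


def coauthor_pairs(author_lists):
    """Count how many publications each pair of authors shares."""
    pairs = Counter()
    for authors in author_lists:
        # Sort so that (A, B) and (B, A) are treated as the same pair.
        unique = sorted(set(authors))
        pairs.update(combinations(unique, 2))
    return pairs


pair_counts = coauthor_pairs(publications)
for (first, second), shared in pair_counts.most_common():
    print(first, "-", second, ":", shared)
```

Each pair becomes an edge in the co-author graph, and the count can be used as the weight (or thickness) of that edge.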


Text Analysis
This is the same tool as above, but this time looking at the text within the titles and abstracts. Using all the words in the title and abstract of the papers, but with binary counting (so a term is only counted once per publication) and allowing through only the 60% of terms with the highest relevance, you can get a graph like the one below. Personally, I find the graph beautiful; it gives a sense of a lot going on, but it is difficult to interpret.


Now repeating the same exercise but with only words that appear at least three times.

The groups are clearer. This grouping has some interesting subject areas coming out, for example wireless networking and machine-to-machine communication, alongside pedagogic, cultural research and sensors for animals.
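Two of the settings described above, binary counting and a minimum number of occurrences, can be illustrated with a short Python sketch. The documents and function names are invented for the example, single words stand in for terms, and the relevance filter (keeping the top 60%) is not reproduced here, so treat this as a rough picture of the idea rather than what the tool actually does internally.

```python
import re
from collections import Counter

# Hypothetical title-plus-abstract texts, invented for illustration.
documents = [
    "wireless sensor networks wireless networks",
    "wireless networks in education",
    "sensor data for education",
    "wireless robots",
]


def binary_counts(texts):
    """Count each term at most once per document (binary counting)."""
    counts = Counter()
    for text in texts:
        # Using a set means repeats within one document are ignored.
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return counts


def frequent_terms(texts, min_occurrences=3):
    """Keep only terms that appear in at least min_occurrences documents."""
    counts = binary_counts(texts)
    return {term: n for term, n in counts.items() if n >= min_occurrences}


kept = frequent_terms(documents)
print(kept)
```

In this toy data "wireless" appears four times in total but in only three documents, so binary counting gives it a count of three, and it is the only term that passes the threshold.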


Let us apply this last approach to some individual cases.
Case Study 1: Mid-Research Career Academic



Case Study 2: Mid-Research Career Academic 2


In both Case Study 1 and Case Study 2, there are several groupings. In Case Study 1 the subjects in the groupings are more diverse than in Case Study 2, which shows greater specialisation.


Case Study 3: Early Career Researcher
There is greater separation between the groups (though three groups are related in terms of subject) than in the first two case studies. This may be due in part to the smaller number of papers compared to the first two case studies (between four and eight times fewer).

Case Study 4: PhD by Publication Candidate

There are stronger inter-relationships between the groups than in some of the other case studies. This, I would argue, is a positive feature for someone pursuing a PhD by publication, suggesting a coherent 'story' to their publications.






All views are the author's and do not reflect the views of any organisations the author is associated with. Twitter: @scottturneruon
