SoBigData Event

Using language to save the world: information extraction tools for social media and medical data analysis

The GATE open source NLP toolkit has been in continuous development for 20 years at the University of Sheffield. In this talk, I will give an overview of some of our language analysis work, with examples from real-life case studies. In particular, I will focus on three main areas of interest: analysing patient records to improve medical research, investigating the correlation between the public’s environmental behaviour and social media, and tools for assessing critical information during disasters such as earthquakes. Perhaps surprisingly, these all use many of the same underlying techniques. The analysis component includes entity and topic recognition, semantic annotation and entity linking, informativeness assessment, and sentiment analysis. I will also cover the text analysis lifecycle, ranging from Twitter collection to text analysis, indexing, querying and visualisation of results, as well as crowdsourcing and evaluation tools.

Invited talk by Diana Maynard