
New paper: Classification and event identification using word embedding

A new paper “Classification and event identification using word embedding” is now available online.

This paper presents our contribution to the CLEF 2019 ProtestNews Track, which aims to classify and identify protest events in English-language news from India and China. We used traditional classification models, namely support vector machines and XGBoost classifiers, combined with various word embedding approaches. Multiple models were tested for experimental purposes, in addition to the two models evaluated within the official campaign. Results show promising performance, especially in terms of precision on both document and sentence classification tasks.
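To give a flavour of how word embeddings can feed a classifier, here is a minimal sketch. It is not the paper's pipeline: the embedding values and training sentences are made up, the averaging of word vectors is an assumed feature representation, and a simple nearest-centroid rule stands in for the support vector machine and XGBoost classifiers so that the example needs only the Python standard library.

```python
import math

# Toy 3-dimensional word vectors. These values are invented for illustration;
# a real system would load pretrained embeddings instead.
EMBEDDINGS = {
    "protest": [1.0, 0.1, 0.0],
    "strike":  [0.9, 0.2, 0.0],
    "workers": [0.7, 0.3, 0.1],
    "march":   [0.8, 0.0, 0.2],
    "market":  [0.0, 1.0, 0.1],
    "shares":  [0.1, 0.9, 0.0],
    "cricket": [0.0, 0.1, 1.0],
    "match":   [0.1, 0.0, 0.9],
}
DIM = 3


def embed(text):
    """Represent a text as the average of the vectors of its known words."""
    vectors = [EMBEDDINGS[t] for t in text.lower().split() if t in EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM  # no known words: fall back to the zero vector
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def train_centroids(examples):
    """Average the text vectors of each label (stand-in for SVM/XGBoost training)."""
    grouped = {}
    for text, label in examples:
        grouped.setdefault(label, []).append(embed(text))
    return {
        label: [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]
        for label, vecs in grouped.items()
    }


def classify(text, centroids):
    """Return the label whose centroid is most similar to the text vector."""
    vector = embed(text)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Hypothetical labelled sentences: 1 = protest-related, 0 = other news.
TRAINING = [
    ("workers strike", 1),
    ("protest march", 1),
    ("market shares", 0),
    ("cricket match", 0),
]
CENTROIDS = train_centroids(TRAINING)
```

Calling `classify("workers march", CENTROIDS)` assigns the protest label because the averaged vector lies close to the protest centroid. Swapping the nearest-centroid rule for a trained classifier would leave the `embed` step unchanged.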

Twitter experiment at Royal Meteorological Society Conference

Michelle Spruce recently attended the Royal Meteorological Society (RMetS) Student & Early Career Researcher conference at the University of Birmingham on 4 and 5 July 2019.

Michelle opened the conference by presenting her research on the social sensing of extreme weather events. She also encouraged attendees to use Twitter during the conference as part of a social sensing experiment to understand the impact of tweeting during an academic conference.

Over the two days of the conference, attendees tweeted news and updates using the conference hashtag #RMetSStudents. By lunchtime on the second day, with just 162 tweets, Michelle was able to demonstrate the wider impact of these tweets.

By the end of the conference, 203 tweets using this hashtag had been posted by 44 users in 6 countries and 13 cities. Although this is a seemingly small amount of data, these tweets had a potential reach of over 32,000 Twitter users and generated over 500,000 impressions (individual views of the tweets). This simple experiment demonstrated the power of Twitter as a source of information, even for small-scale events such as this.

Conference paper accepted: Classification and Event Identification Using Word Embedding

Our new paper has just been accepted for presentation at CLEF 2019 in September.

Classification and Event Identification Using Word Embedding

This paper presents our contribution to the CLEF 2019 ProtestNews Track, which aims to classify and identify protest events in English-language news from India and China. We used traditional classification models, namely, support vector machines and XGBoost classifiers, combined with various word embedding approaches. Multiple models were tested for experimental purposes, in addition to the two models evaluated within the official campaign. Results show promising performance, especially in terms of precision on both document and sentence classification tasks.

Come and talk to us if you would like to know more.

New paper: Communities of online news exposure during the UK General Election 2015

New paper available in Online Social Networks and Media

Communities of online news exposure during the UK General Election 2015

Media exposure has become increasingly complex and hard to measure with the rise in online news consumption. Furthermore, since many people now routinely access news via social media, questions arise as to whether social news-sharing is affected by the polarization and partisan echo chambers that are often observed in social media communication. This study considers news-sharing on Twitter during the UK General Election in 2015, using the act of sharing as an indicator that the sharer has been exposed to that online news content. Analysis of the network structure of users and the news articles they share identifies multiple distinct user communities, which are characterized by analysis of the articles shared within them. Communities are characterized by news article sources (web domains), geographical origin and content; time of article publication was also considered but showed no significant relationships. There is evidence for ideologically biased audiences that predominantly share content from either left-leaning or right-leaning news sources, but these audiences also see content from opposing viewpoints. Other audiences are characterized by geography and/or specialize in particular news topics. Overall, these findings suggest that many people consume a diverse range of news content over the election period and that the level of political bias in content exposure varies widely across the Twitter user population.

New paper: Scaling Laws in Geo-located Twitter Data

New paper accepted for publication in PLOS One

Scaling Laws in Geo-located Twitter Data

We observe and report on a systematic relationship between population density and Twitter use. Number of tweets, number of users and population per unit area are related by power laws, with exponents greater than one, that are consistent with each other and across a range of spatial scales. This implies that population density can accurately predict Twitter activity. Furthermore, this trend can be used to identify ‘anomalous’ areas that deviate from the trend. Analysis of geo-tagged and place-tagged tweets shows that geo-tagged tweets are different with respect to user type and content. Our findings have implications for the spatial analysis of Twitter data and for understanding demographic biases in the Twitter user base.
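As a rough illustration of the kind of relationship described above, the sketch below fits a power law of the form tweets = c × density^alpha by least squares in log-log space and flags areas that deviate from the fitted trend. The data are synthetic, and the function names, the exponent of 1.2 and the factor-of-two anomaly threshold are assumptions chosen for the example rather than values or methods taken from the paper.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of y = c * x**alpha in log-log space.

    Returns (alpha, c). All inputs must be strictly positive.
    """
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    alpha = sxy / sxx                        # slope of the log-log line
    c = math.exp(mean_y - alpha * mean_x)    # intercept, back on the original scale
    return alpha, c


def find_anomalies(x, y, alpha, c, factor=2.0):
    """Indices of areas whose observed value is more than `factor` times
    above or below the value predicted by the fitted power law."""
    flagged = []
    for i, (xi, yi) in enumerate(zip(x, y)):
        ratio = yi / (c * xi ** alpha)
        if ratio > factor or ratio < 1.0 / factor:
            flagged.append(i)
    return flagged


# Synthetic example: tweet density follows an exact power law with exponent 1.2.
density = [10.0, 100.0, 1000.0, 10000.0]      # people per unit area (made up)
tweets = [0.5 * d ** 1.2 for d in density]    # tweets per unit area (made up)
alpha, c = fit_power_law(density, tweets)

# One area tweets five times more than the trend predicts, so it is flagged.
observed = list(tweets)
observed[2] *= 5.0
anomalous = find_anomalies(density, observed, alpha, c)
```

An exponent greater than one, as in this toy data, means that tweet density grows faster than population density, so denser areas produce disproportionately more tweets.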