Computing Veracity – the Fourth Challenge of Big Data

Social stream entity processing work wins best paper award

It’s easy to get an overview of what a set of documents is about by looking at a tag cloud, and easy to explore those documents by clicking on words in the cloud. Part of this ease of use comes from the most common concepts being shown in a more prominent font. However, when people use different terms for the same concept, such as MUFC and Man U as well as Manchester United, each term is counted separately, so each appears less often and none of them receives prominence in the tag cloud.

By processing tweets more thoroughly and connecting the terms in them to linked data in Freebase, we can remedy this problem: terms that refer to the same entity are grouped, so the entity still receives prominence. This was our first contribution.
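To make the grouping idea concrete, here is a minimal sketch in Python. It is not the system described in the paper: the `ALIASES` table and the `group_terms` helper are hypothetical, and the term counts are made up for illustration. In practice the mapping from surface forms to entities would come from linking tweet terms to Freebase, not from a hard-coded dictionary.

```python
from collections import Counter

# Hypothetical alias table (an assumption for illustration only).
# A real pipeline would obtain this mapping by entity linking.
ALIASES = {
    "mufc": "Manchester United",
    "man u": "Manchester United",
    "manchester united": "Manchester United",
}


def group_terms(term_counts, aliases=ALIASES):
    """Merge counts of terms that refer to the same entity."""
    grouped = Counter()
    for term, count in term_counts.items():
        # Unknown terms are kept under their own name.
        canonical = aliases.get(term.lower(), term)
        grouped[canonical] += count
    return grouped


# Made-up counts: no single Manchester United variant beats "Chelsea".
raw = Counter({"MUFC": 4, "Man U": 3, "Manchester United": 5, "Chelsea": 6})
grouped = group_terms(raw)
```

With these illustrative counts, "Chelsea" is the most frequent term before grouping, while after grouping the three variants are summed under "Manchester United", which then gets the largest font in the cloud.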

But one problem remains: how can we tell if this actually helps? To find out, we evaluated the new tag clouds using both a crowdsourced human test and a theoretical evaluation method. The crowd agreed that grouping the terms helped, and we found that the theoretical evaluation agreed with the crowd. So, as well as having found a better way of building tag clouds, we can now also check their quality without needing a human in the loop.

Our resulting paper, “Enhanced Information Access to Social Streams Through Word Clouds with Entity Grouping”, won the Best Student Paper award at WEBIST 2015.

Martin Leginus, Leon Derczynski and Peter Dolog
