Workshop on Noisy, Unstructured Text

04/05/2017 Pheme News

We’re holding a workshop on dealing with noisy text, like the social media content that Pheme relies upon. The workshop will be held in Copenhagen, Denmark on September 7, 2017, in conjunction with the top-tier EMNLP conference. Here’s the official call for papers.

Call for Papers

We seek submissions of regular papers on original and unpublished work (same page limit as the EMNLP main conference). One-page abstracts on work in progress or work published elsewhere are also welcome and will *not* be included in the conference proceedings. All accepted submissions will be presented as posters. Additionally, selected submissions will be presented orally. Shared task participants are also encouraged (but not required) to submit system description papers and present posters; the top systems will be invited (but not required) to present orally.

Important Dates

Submissions Deadline: Friday, June 2
Notification: Friday, June 30
Camera-Ready: Friday, July 14
Workshop: September 7, EMNLP – Copenhagen, Denmark

Workshop website: http://noisy-text.github.io/2017/

Submission URL: https://www.softconf.com/emnlp2017/w-nut/user/

Topics of interest include but are not limited to:

NLP Preprocessing of Noisy Text
Part of speech tagging
Named entity tagging, including a wide range of categories, e.g. product names
Chunking of user-generated text
Parsing
Text Normalization and Error Correction
Normalizing noisy text for downstream tasks and for human readability
Error detection and correction
Paraphrase identification and semantic similarity of short text or noisy text
User prediction, e.g. geolocation, gender, age, etc.
Bilingual translation of noisy text
Information extraction from noisy text
Multilingual NLP in noisy text
Colloquial language, e.g. idiom detection
Domain adaptation to user-generated text
Geolocation prediction
Global and regional trend detection and event extraction
Extracting user demographics, profiles and major life events
Detecting rumors, contradictory information, sarcasm and humor on social media
Sentiment analysis
Temporal aspects of user-generated content (resolving time expressions, concept drift, diachronic analyses, etc.)

All submissions should conform to the EMNLP 2017 style guidelines ( http://emnlp2017.net/call-for-papers.html ). Long and short paper submissions must be anonymized. Abstract submissions should include author information (and, if applicable, a footnote on the front page stating where the work was published). Please submit your papers at the softconf link ( https://www.softconf.com/emnlp2017/w-nut/user/ ).

Shared task: Novel and Emerging Entity Recognition

This shared task focuses on identifying unusual, previously-unseen entities in the context of emerging discussions. Named entities form the basis of many modern approaches to other tasks (like event clustering and summarisation), but recall on them is a real problem in noisy text – even among annotators. This low recall tends to be due to novel entities and surface forms. Take for example the tweet “so.. kktny in 30 mins?” – even human experts find the entity kktny hard to detect and resolve. This task will evaluate the ability to detect and classify novel, emerging, singleton named entities in noisy text.
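
To make the idea of recall on novel entities concrete, here is a minimal Python sketch. It is an illustration only, not the official evaluation of the shared task: the function name, the entity type labels and the toy data are all hypothetical, and "novel" is assumed here to mean simply that the surface form never appears in the training data.

```python
def novel_entity_recall(train_entities, gold, predicted):
    """Recall restricted to gold entities whose surface form is unseen in training.

    Each entity is a (surface_form, entity_type) pair. This is a hypothetical
    helper for illustration, not the shared task's scoring script.
    """
    # Surface forms already seen in training, compared case-insensitively.
    seen = {surface.lower() for surface, _ in train_entities}
    # Gold entities whose surface form is new.
    novel_gold = {(s, t) for s, t in gold if s.lower() not in seen}
    if not novel_gold:
        return 0.0
    found = novel_gold & set(predicted)
    return len(found) / len(novel_gold)


# Toy data: the type labels are invented for this example.
train_entities = [("London", "location"), ("Obama", "person")]
gold = [("kktny", "creative-work"), ("London", "location")]
predicted = [("London", "location")]

recall = novel_entity_recall(train_entities, gold, predicted)
print(recall)
```

In this toy case the system finds the familiar entity but misses kktny, so recall over all gold entities would be one half while recall on the novel entity alone is zero, which is the gap the shared task is meant to expose.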

Shared task organisers: Leon Derczynski (University of Sheffield), Marieke van Erp (VU University Amsterdam), Nut Limsopatham (University of Cambridge), Eric Nichols (Honda Research Institute, Japan)

Workshop Organizers

Leon Derczynski (The University of Sheffield)
Wei Xu (The Ohio State University)
Alan Ritter (The Ohio State University)
Tim Baldwin (The University of Melbourne)

Invited Speakers

Miles Osborne (Bloomberg)
Bill Dolan (Microsoft Research)
Dirk Hovy (University of Copenhagen)

Program Committee

David Bamman (University of California, Berkeley)
Kalina Bontcheva (University of Sheffield)
Claire Cardie (Cornell University)
Colin Cherry (National Research Council Canada)
Grzegorz Chrupała (Tilburg University)
Marina Danilevsky (IBM Research)
Seza Doğruöz (Tilburg University)
Heba Elfardy (Columbia University)
Noura Farra (Columbia University)
Eric Fosler-Lussier (The Ohio State University)
Kevin Gimpel (Toyota Technological Institute at Chicago)
Weiwei Guo (Yahoo! Research)
Ben Hachey (Hugo AI)
Masato Hagiwara (Duolingo)
Ed Hovy (Carnegie Mellon University)
Jing Jiang (Singapore Management University)
Nobuhiro Kaji (Yahoo! Research)
Emre Kiciman (Microsoft Research)
Chen Li (University of Texas at Dallas)
Wang Ling (Google DeepMind)
Fei Liu (University of Central Florida)
Huan Liu (Arizona State University)
Rada Mihalcea (University of Michigan)
Smaranda Muresan (Columbia University)
Preslav Nakov (Qatar Computing Research Institute)
Naoaki Okazaki (Tohoku University)
Miles Osborne (Bloomberg)
Ellie Pavlick (University of Pennsylvania)
Daniel Preoţiuc-Pietro (University of Pennsylvania)
Will Radford (Hugo AI)
Afshin Rahimi (The University of Melbourne)
Shourya Roy (Xerox Research)
Alla Rozovskaya (City University of New York)
Derek Ruths (McGill University)
Andrew Schwartz (Stony Brook University)
Djamé Seddah (University Paris-Sorbonne)
Richard Sproat (Google Research)
Anders Søgaard (University of Copenhagen)
Benjamin Strauss (The Ohio State University)
Jeniya Tabassum (The Ohio State University)
Joel Tetreault (Yahoo! Research)
Svitlana Volkova (Pacific Northwest National Laboratory)
Byron C. Wallace (University of Texas at Austin)
Xiaojun Wan (Peking University)
Jun-Ming Xu (University of Wisconsin-Madison)
Diyi Yang (Carnegie Mellon University)
Yi Yang (Georgia Tech)
Guido Zarrella (MITRE)
Ming Zhou (Microsoft Research)