Police Killings Extraction


In this project, we propose a new, socially impactful task for natural language processing: extracting, from a news corpus, the names of persons who have been killed by police. We provide a newly collected police fatality corpus and an EM-based, distantly supervised model for the problem, which combines web news text with historical data from the excellent Fatal Encounters crowdsourced project. Systems can be evaluated on this corpus to aid further development of automated fatality extraction methods.
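
To make the idea of distant supervision concrete, here is a minimal sketch of how name mentions from news text might be given noisy labels by matching them against a list of historically known victims. This is an illustration only, not the released code or the model from the paper: the function names, the input format, and the exact-match normalization are all assumptions, and the names in the usage example are placeholders.

```python
from typing import Iterable, List, Tuple


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def distant_labels(
    mentions: Iterable[Tuple[str, str]],
    known_victims: Iterable[str],
) -> List[Tuple[str, str, int]]:
    """Attach a noisy 0/1 label to each (name, sentence) mention.

    A mention is labeled 1 if its normalized name appears in the list of
    known victims (a hypothetical stand-in for a historical database),
    and 0 otherwise. These labels are noisy: a sentence that mentions a
    known victim does not necessarily describe the fatality.
    """
    known = {normalize(n) for n in known_victims}
    return [
        (name, sentence, int(normalize(name) in known))
        for name, sentence in mentions
    ]


# Usage with placeholder data (not drawn from the corpus).
example_mentions = [
    ("John  Doe", "Police shot John Doe on Tuesday."),
    ("Jane Roe", "Officer Jane Roe spoke to reporters."),
]
example_labels = distant_labels(example_mentions, ["john doe"])
```

Because labels obtained this way are only approximately correct, a model trained on them has to tolerate label noise, which is the motivation for treating the true mention-level labels as latent.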

More details are in the paper:

Identifying civilians killed by police with distantly supervised entity-event extraction. Katherine A. Keith, Abram Handler, Michael Pinkham, Cara Magliozzi, Joshua McDuffie, and Brendan O'Connor. Proceedings of EMNLP 2017. [pdf]

Code and Datasets

We've released two versions of the data. If you use them in research, please cite the paper. Thanks! Supporting code, including the evaluation script, is available here.