In this project, we propose a new, socially impactful task for natural language processing: extracting, from a news corpus, the names of persons who have been killed by police. We provide a newly collected police fatality corpus and have developed an EM-based, distantly supervised model for the problem that combines web news text with historical data from the excellent Fatal Encounters crowdsourced project. Systems can be evaluated on this corpus to aid further development of automated fatality extraction methods.
More details are in the paper:
Also available: slides at KDD DS+J.