Sources, in the form of selected Facebook pages, can be used as indicators of hate-rich content. Polarized distributed representations built from such content prove superior to generic embeddings for hate speech detection. However, the same content seems to carry too weak a signal to serve as a proxy for silver labels in a distant supervision setting. Even so, this signal is stronger than that of gold labels drawn from a different distribution, which calls for rethinking the annotation process in the context of highly subjective judgments.
|Title of host publication||Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018)|
|Editors||Tommaso Caselli, Nicole Novielli, Viviana Patti, Paolo Rosso|
|Number of pages||6|
|Publication status||Published - 2018|
|Event||EVALITA 2018 - CLiC-it 2018, Turin, Italy|
Duration: 12-Dec-2018 → 13-Dec-2018
|Period||12/12/2018 → 13/12/2018|