user-inactivated · 2565 days ago · post: Perspective API: Using machine learning to score the toxicity of online comments
Paper. They're just looking at small windows of the text, building (very sparse) vectors along the lines of "1.0 if this sequence of n words/characters appeared in the text, 0.0 if not" and doing some voodoo with it. This is the sort of classifier marketing firms use to guess whether Twitter feels positively or negatively about something. You're not going to be able to do fine-grained classification of short texts that way and, unsurprisingly, "toxicity" looks a lot like vehemence.
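For anyone who wants to see what that "1.0 if it appeared, 0.0 if not" representation looks like in practice, here is a minimal sketch in plain Python. It is only an illustration of binary n-gram presence features, not the paper's actual pipeline: the helper names (`word_ngrams`, `build_vocab`, `presence_vector`), the whitespace tokenization, and the toy corpus are all assumptions made up for this example.

```python
from typing import Dict, Iterable, List, Tuple

Gram = Tuple[str, ...]


def word_ngrams(text: str, n: int) -> List[Gram]:
    """Return every run of n consecutive lowercase whitespace tokens."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocab(texts: Iterable[str], n: int) -> Dict[Gram, int]:
    """Assign each distinct n-gram a column index, in order of first appearance."""
    vocab: Dict[Gram, int] = {}
    for text in texts:
        for gram in word_ngrams(text, n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def presence_vector(text: str, vocab: Dict[Gram, int], n: int) -> List[float]:
    """1.0 where a known n-gram occurs in the text, 0.0 everywhere else."""
    vec = [0.0] * len(vocab)
    for gram in word_ngrams(text, n):
        idx = vocab.get(gram)
        if idx is not None:  # n-grams never seen in the corpus are dropped
            vec[idx] = 1.0
    return vec


# Toy corpus, invented for illustration only.
corpus = ["you are wrong", "you are right", "this is absolutely wrong"]
vocab = build_vocab(corpus, n=2)
vec = presence_vector("you are absolutely wrong", vocab, n=2)
print(vec)
```

Note that the unseen bigram "are absolutely" simply vanishes, and word order beyond the window is lost. With a real vocabulary the vector has a huge number of columns and a short comment lights up only a handful of them, which is why fine distinctions are hard to make from this alone.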