kleinbl00  ·  2546 days ago  ·  post: Perspective API: Using machine learning to score the toxicity of online comments

Run it down this thread:

1) I'm a toxic mutherfucker

2) Because I cuss a lot

3) and whatever you say in an indoor voice, it isn't toxic.

I'm not a machine learning guy, but I don't know that I'd describe the problem as "strictness" so much as "lack of context."

18%, by the way, 13% without point 1 and 11% without point 2.

user-inactivated  ·  2546 days ago

Paper. They're just looking at small windows of the text, building (very sparse) vectors along the lines of "1.0 if this sequence of n words/characters appeared in the text, 0.0 if not" and doing some voodoo with it. This is the sort of classifier marketing firms use to guess whether Twitter feels positively or negatively about something. You're not going to be able to do fine-grained classification of short texts that way and, unsurprisingly, "toxicity" looks a lot like vehemence.
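
For anyone who wants to see what that kind of feature vector looks like, here is a rough sketch. It is not the paper's code: the function names, the whitespace tokenizer, and the lowercasing are all assumptions made for illustration, and the classifier itself (the "voodoo") is left out entirely. It only shows the "1.0 if this n-gram appeared, 0.0 if not" part.

```python
# Illustrative sketch only: binary n-gram presence features.
# All names here are hypothetical; this is not the paper's implementation.

def word_ngrams(text, n):
    """Return the set of word n-grams in text (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def char_ngrams(text, n):
    """Return the set of character n-grams in text (lowercased)."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def build_vocabulary(texts, n):
    """Map every word n-gram seen in texts to a fixed vector index."""
    grams = sorted(set().union(*(word_ngrams(t, n) for t in texts)))
    return {gram: idx for idx, gram in enumerate(grams)}

def binary_vector(text, vocab, n):
    """1.0 where the n-gram occurs in text, 0.0 everywhere else."""
    vec = [0.0] * len(vocab)
    for gram in word_ngrams(text, n):
        idx = vocab.get(gram)
        if idx is not None:  # n-grams outside the vocabulary are ignored
            vec[idx] = 1.0
    return vec

texts = ["you are wrong", "you are very wrong"]
vocab = build_vocabulary(texts, 2)
print(binary_vector("you are wrong", vocab, 2))
```

With a real vocabulary of every n-gram seen in training, almost every entry is 0.0 for any one comment, which is where the "very sparse" comes from. Each feature only sees a window of n words or characters, so nothing in the vector records who is being addressed or in what tone.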

kleinbl00  ·  2546 days ago

I am nothing if not vehement.
