Blacklist REGEX Support
I believe that Twitch should add REGEX support for AutoMod "Blocked words and phrases". This would be a very helpful feature for chat moderation, since many users can easily get past a blocked term by intentionally misspelling a word. It would also allow streamers to enforce English-only characters in chat. I'm aware that Twitch allows the "*" wildcard, but in many instances this is simply not enough.
Thanks for sharing this type of informative article. I have learned some right stuff here. I really like your articles.
We faced a similar problem about 20 years ago, in the form of spam about the trademarked V pill, back when SpamAssassin was just a baby. We created and sold an eMail filtering package that included a program we called WordAssassin to identify bad words. Users could add good and bad words to a couple of lists, and every time an eMail came in, Procmail would run WordAssassin, which would give the desired output. Actions included removing the word, returning "bad word" found, adjusting the subject line, setting a return value, and so on.
Like most of the world, we started with a REGEX, but that didn't work very well at catching the messages, and every time an eMail came in a Perl program would run the tests. Needless to say, this overloaded the 450 MHz servers, and it would still be quite a task given the workload here. Furthermore, it didn't catch all the manipulations of the words very well. We created a C++ program to handle all the fun stuff spammers were doing to get their message through spam filters and to the user. My old sales pitch seems to fit here, so here we go.
It was always difficult to explain how spammers were using different character sets, spaces, etc., but they were already seeing it anyway. I would explain that we had built an extremely fast C binary that performed a variety of tests to detect and provide short ***** protection, longer ***** protection, mangled ***** protection, HTML ***** protection, and so on. Yep, it got a lot of laughs, and the product is most likely still running somewhere today.
The core engine.cpp would solve this problem and has been available for licensing or for some good use for quite some time now at wordassassin.com
Currently, the list of blocked terms and phrases found at https://dashboard.twitch.tv/u/[CHANNEL]/settings/moderation/blocked-terms only matches directly against the input. This means that in order to ban variations of a word, one would have to input many millions of phrases through the API even for a single word (as shown here: https://twitter.com/thomsimonson/status/1429472208506822659).
Expand this page to allow soft-matching and regex-based matching entries (the latter being advanced, and potentially restricted to the API only).
Soft-matching would match every entry in the list not by its exact characters, but by any Latin variations, etc., so banning "information" (for example) would also match "înfórmàtion".
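To make the soft-matching idea concrete, here is a minimal Python sketch. It assumes "Latin variations" means accented forms of the same base letter, and it folds them with Unicode NFKD decomposition. The names fold and soft_match are made up for illustration and are not part of any Twitch API.

```python
import unicodedata


def fold(text):
    """Strip combining accents and case so that e.g. 'î' compares equal to 'i'."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def soft_match(message, blocked_terms):
    """Return True if any blocked term appears in the message after folding."""
    folded = fold(message)
    return any(fold(term) in folded for term in blocked_terms)


if __name__ == "__main__":
    print(soft_match("some înfórmàtion here", ["information"]))
```

This only covers accents and case. Lookalike letters from other scripts (Cyrillic "о" for Latin "o", for instance) would need a separate mapping table.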
Regex-based matching would match via regular expressions (no further explanation needed).
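For anyone less familiar with regular expressions, here is a rough Python sketch of what a regex-based entry could catch. The pattern and the names BLOCKED_PATTERNS and regex_match are hypothetical examples, not an existing Twitch feature. The pattern allows a few common character swaps and any punctuation or spaces between the letters of "information".

```python
import re

# Hypothetical regex entry: each letter may be followed by non-word
# characters (\W*), and a few letters accept common substitutes.
BLOCKED_PATTERNS = [
    re.compile(
        r"[i1!]\W*n\W*f\W*[o0]\W*r\W*m\W*[a@4]\W*t\W*[i1!]\W*[o0]\W*n",
        re.IGNORECASE,
    ),
]


def regex_match(message):
    """Return True if any blocked pattern is found anywhere in the message."""
    return any(pattern.search(message) for pattern in BLOCKED_PATTERNS)


if __name__ == "__main__":
    for sample in ["1nf0rm@tion", "i.n.f.o.r.m.a.t.i.o.n", "just informing you"]:
        print(sample, regex_match(sample))
```

One such entry replaces a very large number of literal phrases, which is the main appeal. The trade-off is that loose patterns can produce false positives, so restricting them to advanced users seems reasonable.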
Love AutoMod. But as a lead mod for Papa_pixels, a multicultural channel, I can say that allowing only English would really not work well for us. AutoMod should be multicultural.
I have a bot that can handle this. I've been using it to block our favorite "Wanna be famous" spam. It can also target individual messages and delete them, rather than clearing all of a particular user's messages from chat, in case of a false positive, and it keeps a log.
Their bot interface is IRC and it's plaintext and easy to use.
You can sort of do this already, for example if you want to ban excessive use of HeyGuys (like 3 times right after each other).
If you disallow HeyGuys HeyGuys HeyGuys
every message containing HeyGuys three or more times right after each other (even at the end of the message) is blocked.
The same goes for anything you could blacklist.
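The behaviour described above is what you would expect if blocked phrases are matched as plain substrings, which is an assumption here and not something Twitch has confirmed. A small Python sketch under that assumption, with a hypothetical helper name:

```python
def contains_blocked_phrase(message, blocked_phrases):
    """Return True if any blocked phrase occurs as a substring of the message."""
    return any(phrase in message for phrase in blocked_phrases)


if __name__ == "__main__":
    blocked = ["HeyGuys HeyGuys HeyGuys"]
    # Four in a row still contains the three-in-a-row phrase.
    print(contains_blocked_phrase("hi HeyGuys HeyGuys HeyGuys HeyGuys", blocked))
    # Two in a row does not.
    print(contains_blocked_phrase("hi HeyGuys HeyGuys", blocked))
```

Four or more repeats are caught because the three-repeat phrase is contained inside them, while one or two repeats pass through.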