RE: @blessthisdoobie tweeted a slur
Hatkirby on August 14th, 2018 at 9:23:16pm

I've made some changes to my Twitter bots recently regarding how they choose the words that they post. This was in response to a situation wherein a follower notified me that @blessthisdoobie had posted a tweet containing the n word. In this post, I'd like to discuss 1) how this happened, and 2) what I've done to prevent it from happening again. However, I first and foremost want to apologize to the followers of the bot. I take it very seriously when my bots post offensive content, and I want to do what I can to ensure that my bots remain safe and fun for all.
Now, I'd like to explain how this happened in the first place. A large number of my bots use a library I created called verbly for natural language processing. This includes @blessthisdoobie, which uses verbly to find nouns and verbs that rhyme. verbly is a complex library, and understanding the cause of this issue requires some knowledge of how it works. I will provide a brief description of the relevant parts of the library, and if it sounds interesting, I wrote a motivating example for the data model that you can read later.
verbly organizes its data into a number of "objects". Two of the types of objects are "word" and "form". A "form" is the literal text used to write a word, while a "word" has meaning attached to it. The distinction between these is crucial because of a phenomenon called homography, in which two words with different meanings are spelled in exactly the same way. An example of this is the form "tweet", which can represent both a word meaning the sound a bird makes and a word meaning a post on Twitter.
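To make this concrete, here is a tiny sketch of the idea in Python. This is purely illustrative and is not verbly's actual C++ API; the class and field names here are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Form:
    text: str   # the literal spelling

@dataclass(frozen=True)
class Word:
    gloss: str  # the meaning
    form: Form  # the form used to write this word

tweet = Form("tweet")
bird_sound = Word("the sound a bird makes", tweet)
twitter_post = Word("a post on Twitter", tweet)

# Two different words, one identical spelling:
assert bird_sound != twitter_post
assert bird_sound.form == twitter_post.form
```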
Another important feature of verbly is relationships between objects. A lot of the relationship data in verbly comes from WordNet, and one of the relationships included is called "usage". "usage" forms a link between two "word" objects. The exact semantics of this relationship, compared to some of the others, are not entirely clear, but it provides some useful data for my bots. Specifically, there are a number of "usage" relationships from the word "ethnic slur" to words that are, well, ethnic slurs. Because this relationship exists, it is possible for me to write a verbly query that returns words that are ethnic slurs.
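Here is a rough illustration of that kind of query. Again, this is a toy model rather than verbly's real interface, and the linked member words are harmless placeholders.

```python
# Stand-in for WordNet's usage links: category word -> linked member words.
usage = {
    "ethnic slur": {"placeholder_slur_1", "placeholder_slur_2"},
}

def is_ethnic_slur(word: str) -> bool:
    # A word counts as an ethnic slur if "ethnic slur" links to it.
    return word in usage.get("ethnic slur", set())

nouns = ["cat", "placeholder_slur_1", "doobie"]
print([n for n in nouns if not is_ethnic_slur(n)])  # ['cat', 'doobie']
```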
With this in mind, let's discuss how @blessthisdoobie generates tweets. It first queries verbly for a verb that rhymes with at least one improper noun that is not an ethnic slur. It then queries for an improper noun that is not an ethnic slur and that rhymes with the chosen verb. The ethnic slur condition is added to each query by restricting the set of words to those that do not have a "usage" relationship with "ethnic slur".
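In rough Python paraphrase (the names are mine, not verbly's; the real bot goes through verbly's query interface), the generation step looks something like this:

```python
import random

def pick_rhyming_pair(verbs, nouns, rhymes, is_slur):
    # 1. A verb that rhymes with at least one non-slur improper noun.
    verb = random.choice([
        v for v in verbs
        if any(rhymes(v, n) and not is_slur(n) for n in nouns)
    ])
    # 2. A non-slur improper noun that rhymes with the chosen verb.
    noun = random.choice([
        n for n in nouns if rhymes(verb, n) and not is_slur(n)
    ])
    return verb, noun
```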
In most cases, this works fine. However, by examining the semantics of these queries, we can find a potential problem. The queries restrict the returned noun to a word that is not an ethnic slur. Because this condition is placed on the "word" object, the query will never return a word that means an ethnic slur. It has no problem, however, returning a word that looks like one.
Therein lies the issue. It turns out that the verbly data contains some extremely obscure words that are homographic with ethnic slurs but are not tagged as slurs themselves. The exact scenario that caused @blessthisdoobie to tweet a slur was that, for its verb, it picked an obscure word that is spelled in exactly the same way as the n word. This was an extremely rare situation, but the semantics of the query allowed it to happen.
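A toy example makes the failure mode easy to see. The filter below checks the word (the meaning), not the form (the spelling), just like the old queries did:

```python
# Toy lexicon: (meaning, spelling, tagged as an ethnic slur?)
lexicon = [
    ("an extremely obscure verb", "xxxx", False),  # homograph of a slur
    ("the slur itself",           "xxxx", True),
    ("a harmless verb",           "sing", False),
]

# The condition lives on the WORD, so the obscure homograph passes...
candidates = [(gloss, sp) for gloss, sp, slur in lexicon if not slur]

# ...but the tweet only ever shows the spelling.
print([sp for _, sp in candidates])  # ['xxxx', 'sing']
```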
The fix for this issue was rather simple. Instead of restricting the noun query to words that are not ethnic slurs, it is now restricted to words that are not homographic to slurs. The verb query is now also subject to this restriction -- even though there are no verbs that are ethnic slurs, there are verbs that are homographic to slurs.
The filter for what is considered a slur has also been updated. Rather than looking for words that have a usage relationship with "ethnic slur", the filter now looks for words that have a usage relationship with "derogation". In the WordNet data, this category contains every word previously covered by "ethnic slur", as well as a number of slurs that are not ethnic in nature. A few of the included words are quite innocent, but a few false positives are preferable here to a few false negatives.
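Putting both changes together, the new filter works on spellings rather than meanings. Again in illustrative Python, not verbly's real API:

```python
# Toy lexicon: (meaning, spelling, usage categories the word belongs to)
lexicon = [
    ("the slur itself",           "xxxx", {"ethnic slur", "derogation"}),
    ("an extremely obscure verb", "xxxx", set()),  # homograph of the slur
    ("a harmless verb",           "sing", set()),
]

# Step 1: collect every SPELLING used by any word under "derogation".
flagged_spellings = {sp for _, sp, cats in lexicon if "derogation" in cats}

# Step 2: reject any word whose spelling is flagged, homographs included.
safe = [(gloss, sp) for gloss, sp, cats in lexicon
        if sp not in flagged_spellings]

print([sp for _, sp in safe])  # ['sing']
```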
Lastly, all of my other bots that use verbly to pick the words that they post have been updated to this new standard. The affected bots are:
- blessed (@blessthisdoobie)
- grunge (@pastelhearted)
- support (@rtyoursupport)
- fefisms (@fefisms)
- insult (@TeamMeanies)
- wordplay (@eventhenotion)
- furries (@thefurriesare)
- nancy (@nancy_ebooks)
- fruity (@FruitNames)
- owo (@furrytherapist)
- infinite (@icouldswear)
- composite (@everyfullmetal)
- chemist (@drbotmd)
- advice (@howtodoesthing)
- capital (@extracapitalism)
- difference (@differencebot)
As always, I thank my followers and my bots' followers for supporting me. I appreciate the patience everyone has shown while I fixed this issue, and I hope I can continue to make a fun and safe space on Twitter.