Toxic comment detection with TensorFlow.js

With the widespread use of online communities and social media platforms, more people can easily express their opinions online. Unfortunately, a subset of individuals post comments that are disrespectful and emotionally hurtful to others. Thankfully, we can use machine learning to help detect toxic comments and prevent them from being published. In this article, we will use a pre-trained TensorFlow.js model to build a toxic comment classifier.


Toxic comment detection demo

This is a demo of the TensorFlow.js toxicity model, which classifies text according to whether it exhibits offensive attributes (e.g. profanity, sexual explicitness).

Enter text below and click ‘Classify’ to add it to the table.

GitHub repository

You can download the complete code for the above demo from the link below:


Implementation

This technique can help identify which comments might violate community guidelines and keep our online environment clean. Let’s dive into the code and walk through, step by step, how to build this toxic comment classifier with TensorFlow.js.

# Step 1 : Include TensorFlow.js

Simply include the scripts for tfjs and the toxicity model in the <head> section of the HTML file. I also include the jQuery library.

<html>
  <head>
    <script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/toxicity"></script></script>
  </head>

Or you can install it via npm for use in a TypeScript / ES6 project:

npm install @tensorflow/tfjs @tensorflow-models/toxicity
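
If you go the npm route, the imports would look something like this (a minimal sketch; your bundler is assumed to handle ES module imports):

// ES module imports when installing via npm
import * as tf from '@tensorflow/tfjs';
import * as toxicity from '@tensorflow-models/toxicity';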

# Step 2 : Load the model

In order to perform classification, we first need to load the pre-trained toxicity model by calling toxicity.load(threshold).

// threshold, model, labels, sampleComments and initToxicLabels() are declared elsewhere in the demo code
$(document).ready(function() {
    let samples = [];
    for (let comment of sampleComments) {
      samples.push(initToxicLabels(comment));
    }
    
    displayTable(samples, false, false, 0);

    // load model
    toxicity.load(threshold).then(mdl => {
        model = mdl;
        labels = [];
        classifyToxicity(samples).then(results =>{
          displayTable(results, false, true, 0);
        })
    })
});

The threshold parameter is the minimum confidence level above which a prediction is considered valid (it defaults to 0.85). Setting threshold to a higher value means the match property is more likely to be null, because neither probability exceeds the threshold.
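
To see the threshold in action on its own, here is a minimal, self-contained sketch (independent of the demo’s helper functions) that loads the model with a stricter threshold and classifies a single sentence:

// load the model with a stricter threshold than the 0.85 default
const threshold = 0.9;

toxicity.load(threshold).then(model => {
    // classify() accepts an array of sentences
    return model.classify(['you suck']);
}).then(predictions => {
    // with a high threshold, 'match' may be null when no probability exceeds it
    console.log(JSON.stringify(predictions, null, 2));
});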

# Step 3 : Classify toxicity

In the snippet below we use the model.classify() API to predict the toxicity of the comments. model.classify() returns a Promise, so we await it to retrieve the actual predictions.

async function classifyToxicity(comments){
    // classify the toxicity of each comment's text
    const predictions = await model.classify(comments.map(d => d.text));
    // predictions is an array of objects, one per toxicity label
    predictions.forEach(item => getLabelResults(item, comments));
    return comments;
}

The predictions are an array of objects, one for each toxicity label. At the moment the model supports 7 labels: identity_attack | insult | obscene | severe_toxicity | sexual_explicit | threat | toxicity. Each label’s results property is in turn an array, one entry per input comment, containing the raw probabilities along with the final prediction as the match property.

Here’s what the predictions array might look like for our sample text:
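
(The values below are illustrative only; the real output contains one object per label, each holding the raw probabilities and the match flag for every input comment.)

[
  {
    "label": "identity_attack",
    "results": [
      { "probabilities": [0.97, 0.03], "match": false }
    ]
  },
  {
    "label": "insult",
    "results": [
      { "probabilities": [0.08, 0.92], "match": true }
    ]
  }
  // ... one object per toxicity label, seven in total
]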

# Step 4 : Display the result

Once you have the toxicity classification results from above, you can display them on the screen as a table.

function displayTable(data, append, hasResult, replaceRowID){
  let wrapper = $('#table-wrapper');
  if(!append){
    wrapper.empty();
  }else if(replaceRowID > 0){
    $("#"+replaceRowID).remove();
  }
  
  if(window.matchMedia("(max-width: 767px)").matches){
      // The viewport is less than 768 pixels wide, mobile device
      return generateTable(data, 'card', ["text"], wrapper, hasResult, replaceRowID);
  } else{
      // The viewport is at least 768 pixels wide, tablet or desktop
      let columns = Object.keys(data[0]);
      if(!append){
        generateTableHead(columns, wrapper);
      }
      return generateTable(data, 'table', ["text"].concat(labels), wrapper, hasResult, replaceRowID);
  }
}

That’s pretty much it for the code! The rest is just there to enhance the user experience. You can also add a new comment and click the ‘Classify’ button to see the result; the button handler simply calls the methods above to perform the toxicity classification.
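
A rough sketch of such a click handler is shown below; note that the element ids (#comment-input, #classify-btn) are made up for illustration and initToxicLabels() is assumed to accept the raw comment text, so this is not the demo’s exact code:

// hypothetical 'Classify' button handler
$('#classify-btn').on('click', () => {
    let newComment = initToxicLabels($('#comment-input').val());
    classifyToxicity([newComment]).then(results => {
        // append the newly classified comment to the existing table
        displayTable(results, true, true, 0);
    });
});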


Conclusion

Thanks to TensorFlow.js, the advantage of this toxic comment detection model is that it runs directly in the browser. Doing this type of evaluation client-side eliminates potential privacy concerns related to sending not-yet-published comments over the internet. It also removes the need for an additional web service endpoint, and allows full customization of the machine learning model to fit the moderation policies of any particular website. Let’s say NO to online bullying and keep our online environment clean and friendly.

Thank you for reading. If you like this article, please share it on Facebook or Twitter. Let me know in the comments if you have any questions. Follow me on Medium, GitHub and LinkedIn. Support me on Ko-fi.
