Today we're going to build a "Bad Word" filter for your site. I recently needed this functionality and decided to create my own using XML. First we need a list of words we want to prevent users from using. I put mine in an XML document, then loaded the words I came up with into a strongly typed list, otherwise known as List(T), using the XmlDocument class.
When loading the words from the XML document into my List(T), I use an XPath query to navigate directly to the nodes I actually want: the ones that contain the banned words. The first method creates a new XmlDocument and a new List(T), and builds our XPath query. Once that is done, we load the document using the Load method of the XmlDocument.
Once the XML is loaded, we loop through each XmlNode that meets our XPath search criteria. For each XmlNode we find, we add the InnerText of its first child node (ChildNodes[0]) to the list. It all looks like this:
```csharp
using System.Collections.Generic;
using System.Xml;

/// <summary>
/// method for loading banned words from an
/// XML document into a generic list (List(T))
/// </summary>
/// <param name="file">the file that holds the banned words</param>
/// <returns>a list containing every banned word found in the file</returns>
public static List<string> BadWordList(ref string file)
{
    //create a new List(T) for holding the words
    List<string> words = new List<string>();
    //create a new XmlDocument, this will read our XML file
    XmlDocument xmlDoc = new XmlDocument();
    //here is where XPath comes into play, when we use
    //SelectNodes we will pass this XPath query into it
    //so we can navigate straight to the nodes we want
    string query = "/WordList/word";
    //now load the XML document
    xmlDoc.Load(file);
    //loop through all the XmlNodes that meet our XPath criteria
    foreach (XmlNode node in xmlDoc.SelectNodes(query))
    {
        //add the InnerText of the first child node of each match
        words.Add(node.ChildNodes[0].InnerText);
    }
    //return the populated List(T)
    return words;
}
```
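The word list file itself isn't shown here, but the XPath query /WordList/word implies a root WordList element with one word element per banned word. The following is a minimal sketch of that layout, loaded from an inline string so it can be tried without a file. The sample words and the WordListSample and LoadWords names are made up for illustration. It also reads node.InnerText directly rather than ChildNodes[0].InnerText, on the assumption that you would rather skip an empty word element than hit a null reference on it.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class WordListSample
{
    // Hypothetical sample data. Only the WordList/word layout is taken
    // from the XPath query used in the tutorial.
    public const string SampleXml = @"<WordList>
  <word>darn</word>
  <word>heck</word>
  <word></word>
</WordList>";

    // Loads every non-empty <word> element from the given XML text.
    public static List<string> LoadWords(string xml)
    {
        var words = new List<string>();
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        foreach (XmlNode node in doc.SelectNodes("/WordList/word"))
        {
            // An empty element has no child nodes, so ChildNodes[0] would be null;
            // InnerText on the element itself is simply an empty string.
            string text = node.InnerText.Trim();
            if (text.Length > 0)
            {
                words.Add(text);
            }
        }
        return words;
    }
}
```

Calling WordListSample.LoadWords(WordListSample.SampleXml) should give back the two non-empty sample words and leave out the empty element.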
Okay, part one is complete: we have a generic list of words we want excluded from a user's post, username, etc. (wherever you choose to implement this). Now we need to take a word, normally provided by the user (say, a username entered at registration), and compare it with the words in our list. So we create another List(T) to hold the banned words, except this time we loop through each of them and compare it with the word from the user. We always compare in lowercase by calling ToLower() on each side of the comparison.
This method looks like this:
```csharp
/// <summary>
/// method for determining if the provided word is in
/// the banned word list
/// </summary>
/// <param name="word">the word we are looking for</param>
/// <param name="file">file name containing the words we're to search against</param>
/// <returns>true if the word is in the banned word list, otherwise false</returns>
public bool IsBadWord(ref string word, ref string file)
{
    //create a new List(T) to hold the words. Then populate the list
    //by calling the BadWordList method in the IOManager class
    List<string> badWords = IOManager.BadWordList(ref file);
    //now we need a loop as long as the words count
    for (int i = 0; i < badWords.Count; i++)
    {
        //on each iteration we compare the two (1 provided, 1 from the List(T))
        //to see if they match. If they do then return true and exit the loop
        if (word.ToLower() == badWords[i].ToLower()) return true;
    }
    //we made it through the entire list and no match was found
    //so we return false as this word isn't a banned word
    return false;
}
```
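One thing to keep in mind is that the method above reloads the XML file and walks the whole list on every call. If that becomes a concern, a possible alternative is to load the words once into a HashSet that uses StringComparer.OrdinalIgnoreCase, which makes the lookup case-insensitive without calling ToLower() on either side. This is only a sketch, and the BadWordFilter class name is hypothetical, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public class BadWordFilter
{
    // Case-insensitive set of banned words, built once.
    private readonly HashSet<string> _badWords;

    public BadWordFilter(IEnumerable<string> badWords)
    {
        _badWords = new HashSet<string>(badWords, StringComparer.OrdinalIgnoreCase);
    }

    // Returns true when the whole word matches a banned word, ignoring case.
    public bool IsBadWord(string word)
    {
        return !string.IsNullOrEmpty(word) && _badWords.Contains(word);
    }
}
```

You could build it from the existing loader, for example new BadWordFilter(IOManager.BadWordList(ref file)), and keep the instance around for later checks. Like the loop above, it only matches whole words, not banned words embedded in longer text.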
Now, where I probably differ from most who will read and use this functionality is that I needed it for a Paint Shop Pro group I'm making a website for. Registered users can request a Tag and have a name put on it. Needless to say, we wanted banned words caught before the data made it into the database, so that's where I did the final check. I do the check first thing, before any database work is even started (no need to start that if the name is going to turn out to be a banned word, right?).
```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public bool CompleteTagRequest(string uid, string tagID, string name, ref string file)
{
    string query = "uspRequestTag";
    int rows = 0;
    //first we need to make sure the user did not try to use a word in our list
    if (!(IsBadWord(ref name, ref file)))
    {
        //they are clean so move on
        try
        {
            //create our SqlConnection object
            using (SqlConnection conn = new SqlConnection(GetConnectionString("MyConnectionName")))
            {
                //create our SqlCommand
                SqlCommand cmd = new SqlCommand();
                //set the CommandText and CommandType properties
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.CommandText = query;
                //now add our parameters
                cmd.Parameters.AddWithValue("@user_uid", uid);
                cmd.Parameters.AddWithValue("@tag_uid", tagID.Substring(3));
                cmd.Parameters.AddWithValue("@name_on_tag", name);
                cmd.Connection = conn;
                //now open the connection
                cmd.Connection.Open();
                rows = Convert.ToInt32(cmd.ExecuteScalar().ToString());
                //succeed only if the stored procedure returned a positive value
                return rows > 0;
            }
        }
        catch (SqlException ex)
        {
            _message = ex.Message;
            return false;
        }
        catch (Exception ex)
        {
            _message = ex.Message;
            return false;
        }
    }
    else
    {
        //since we're here they obviously attempted to use one of the banned words
        //so we need to let them know and possibly send goons with baseball bats to find them,
        //well that last part isn't really true
        _message = "We're sorry, that name is not allowed. Please try a different name";
        return false;
    }
}
```
I might as well give the GetConnectionString method, since it is used in this tutorial. This method reads the connectionStrings section of the web.config. It accepts a string parameter, but if you pass null or "" the method falls back to the connection string named DefaultConnection.
```csharp
using System;
using System.Configuration;
using Microsoft.Security.Application; //AntiXss (Microsoft Anti-XSS library)

/// <summary>
/// method to retrieve the database connection string from the web.config
/// </summary>
/// <param name="name">name of the connection string we want (this allows us to have multiple if needed)</param>
/// <returns>the connection string, or an empty string if it could not be read</returns>
public string GetConnectionString(string name)
{
    try
    {
        //variable to hold our connection string for returning it
        string connString = string.Empty;
        //check to see if the user provided a connection string name
        //this is for if your application has more than one connection string
        if (!(string.IsNullOrEmpty(name))) //a connection string name was provided
        {
            //get the connection string by the name provided
            connString = ConfigurationManager.ConnectionStrings[AntiXss.HtmlEncode(name)].ConnectionString;
        }
        else //no connection string name was provided
        {
            //get the default connection string
            connString = ConfigurationManager.ConnectionStrings["DefaultConnection"].ConnectionString;
        }
        //return the connection string to the calling method
        return connString;
    }
    catch (Exception)
    {
        //any failure (for example a name that isn't in the web.config)
        //ends up here and the caller gets an empty string back
        return string.Empty;
    }
}
```
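ConfigurationManager.ConnectionStrings[name] returns null when no entry with that name exists, so the method above relies on its catch block to turn the resulting NullReferenceException into an empty string. As a rough sketch, the same fallback can be written with an explicit null check instead. The ConnectionStringLookup and GetOrDefault names are hypothetical; the DefaultConnection fallback name is the one used in the code above.

```csharp
using System;
using System.Configuration;

public static class ConnectionStringLookup
{
    // Returns the named connection string, falling back to fallbackName when
    // no name is given, and to an empty string when the entry is missing.
    public static string GetOrDefault(string name, string fallbackName = "DefaultConnection")
    {
        string key = string.IsNullOrEmpty(name) ? fallbackName : name;

        // The indexer returns null (it does not throw) for an unknown name.
        ConnectionStringSettings settings = ConfigurationManager.ConnectionStrings[key];
        return settings == null ? string.Empty : settings.ConnectionString;
    }
}
```

This assumes the project references System.Configuration, as a classic ASP.NET site normally does.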
Okay, you have the code for parsing the XML file, loading a List(T), and looping through it doing your comparisons. What you don't have is my XML file, so I'll zip that file up and add it to this tutorial. I hope you found this tutorial as informative and fun as I did when creating the filter list. Thank you for reading, and Happy Coding!
WordList.zip ( 1016bytes )
