
Search Engine [Basic] Create a simple search engine for your website.

#1 Master Jake

Posted 20 October 2009 - 09:48 AM

Hello. Today you will learn how to create a very simple search engine that you can add to your own website. Keep in mind that this tutorial is meant to teach only the basics.

If you are implementing this search engine into a professional or commercial site, be sure to update it with paging so that 20 billion results aren't displayed on one page. Let's get started.

1: The Database

For this search engine to work, we need to store information in a database that we can search later. You can use this technique to search pretty much any field in a table; however, for this example we will use a simple "blog" type section.

Our database will be very simple, and looks like this:
TABLE: blog_entries

id (primary key; auto increment) - int(11)
blogbody - text



Obviously, a real blog_entries table would contain MUCH more data than this; however, like I said, I want to make this as simple and easy to understand as possible.

Now, we will assume that there are already quite a few entries in this table that we can search.
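
For readers who would rather create this table from PHP, here is a minimal sketch. It assumes an open PDO connection to MySQL (the pdo_mysql driver) instead of the mysql_* functions used in the rest of this tutorial, and create_blog_entries_table() is a hypothetical helper name used only for illustration.

```php
<?php

// MySQL definition of the example table described above.
// Only the two columns from the tutorial are included.
const BLOG_ENTRIES_SQL = "
    CREATE TABLE IF NOT EXISTS blog_entries (
        id INT(11) NOT NULL AUTO_INCREMENT,
        blogbody TEXT,
        PRIMARY KEY (id)
    )";

// Runs the statement on an already-open PDO connection to MySQL.
// (Hypothetical helper; not part of the original tutorial code.)
function create_blog_entries_table(PDO $db)
{
    $db->exec(BLOG_ENTRIES_SQL);
}
```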

First, we need to create an HTML form using GET as the method so that our users can search the database.

File: test.html
<form action="search.php" method="get">
<input type="text" name="q" size="20" /> <input type="submit" value=" Search " />
</form>



I didn't name the input for the submit button because it's not really needed for this example.

Now, there are a few things you may notice. First of all, this form sends the data to the page "search.php" using the "get" method. This means that if the user types, say, "Hello World" in the input textbox and presses "Search", the URL will look like this:

search.php?q=Hello+World



The nifty thing about submitting the form this way is that the browser automatically URL-encodes the form data, and PHP URL-decodes it again when we read it through $_GET.
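
To see the encoding and decoding in isolation, here is a small sketch using PHP's built-in http_build_query() and parse_str(). Normally the browser and PHP do both steps for you; the variable names are only for illustration.

```php
<?php

// What the browser does when the form is submitted: build the query string.
$queryString = http_build_query(array("q" => "Hello World"));
echo $queryString . "\n"; // q=Hello+World

// What PHP does before filling $_GET: decode the query string again.
parse_str($queryString, $decoded);
echo $decoded["q"] . "\n"; // Hello World
```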

Speaking of PHP, let's get on with the "search.php" page which this form directs to.

First we need to connect to MySQL so that we can use mysql_query and the other mysql_* functions.

<?php

$c = mysql_connect("localhost", "root", "");
mysql_select_db("my_database", $c);
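
Note that the mysql_* functions were removed in PHP 7.0, so the connection code above only runs on older PHP versions. On a current install, the same step could be sketched with PDO as below. This assumes the pdo_mysql driver is enabled; build_dsn() and connect_db() are hypothetical helper names, and the utf8mb4 charset is an assumption about your data.

```php
<?php

// Builds the PDO connection string for a MySQL database.
function build_dsn($host, $database)
{
    return "mysql:host=" . $host . ";dbname=" . $database . ";charset=utf8mb4";
}

// Opens the connection and makes PDO throw exceptions on errors.
function connect_db($host, $database, $user, $password)
{
    $db = new PDO(build_dsn($host, $database), $user, $password);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    return $db;
}

// Usage, with the same placeholder credentials as above:
// $db = connect_db("localhost", "my_database", "root", "");
```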



Of course, the username, password, host, and database name will change depending on your MySQL information. Now that we are connected, let's grab the value of "q" in the URL.

$q = mysql_real_escape_string($_GET["q"]);
$newq = strtoupper($q);



We use mysql_real_escape_string to escape harmful characters which could be used to SQL-inject the database. We also created a second variable called "$newq" which is simply an all-uppercase version of the search query. This is so that we can run a case-insensitive search.

Now that we have the information which the user submitted (and keep in mind that PHP automatically URL-decoded the data in $_GET), we can run the main search query.

$searchQuery = mysql_query("SELECT * FROM `blog_entries` WHERE UPPER(`blogbody`) LIKE '%$newq%' ORDER BY `id` DESC LIMIT 10");



Now, a lot is going on in this query. I hope that if you are reading this tutorial you already know a little about MySQL. First of all, we are selecting every field from the table "blog_entries" (which I showed above). We are selecting only the rows in which the value of "blogbody" contains the string "$newq". We use MySQL's UPPER() around the blogbody so that, once again, the search is case insensitive: an all-uppercase version of the "blogbody" field is compared with an all-uppercase version of the search query.

You may notice the % signs around $newq. They are wildcards, each meaning "any sequence of characters (including none) in this spot." This means that if the user searches:
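
As an alternative to escaping the value by hand, the same query can be sketched with a prepared statement, which keeps the search string out of the SQL text entirely. This is only a sketch: it assumes an open PDO connection in $db (PDO is not what the tutorial code uses), and search_entries() is a hypothetical helper name.

```php
<?php

// Returns up to 10 matching rows, newest first, as associative arrays.
// (Hypothetical helper; assumes $db is an open PDO connection.)
function search_entries(PDO $db, $term)
{
    $stmt = $db->prepare(
        "SELECT id, blogbody FROM blog_entries
         WHERE UPPER(blogbody) LIKE :pattern
         ORDER BY id DESC LIMIT 10"
    );
    // The wildcards go into the bound value, not into the SQL string.
    $stmt->execute(array(":pattern" => "%" . strtoupper($term) . "%"));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Usage, assuming $db is an open PDO connection:
// foreach (search_entries($db, $_GET["q"]) as $row) { echo $row["id"]; }
```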

'o'

and the word "dog" is in "blogbody", it will return that result.

%o% = anything-o-anything

dog would match in that case. This is used for better search results. In case the user searches:

tut

instead of "tutorial", it will still match.
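
One side effect of the wildcards is that a % or _ typed by the user also acts as a wildcard, because mysql_real_escape_string() does not escape those two characters. Here is a minimal sketch of a fix, assuming MySQL's default backslash escape character for LIKE; escape_like() is a hypothetical helper name.

```php
<?php

// Puts a backslash in front of the LIKE wildcards (% and _) and of the
// backslash itself, so that MySQL matches them as ordinary characters.
function escape_like($term)
{
    return addcslashes($term, "\\%_");
}

echo escape_like("100%_sure") . "\n"; // 100\%\_sure
```

In the tutorial code this would be applied to the raw value before the SQL escaping, for example mysql_real_escape_string(escape_like($_GET["q"])).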

You may also notice that we are ordering these by the id descending. Since the id auto increments with each new entry, the highest id is going to be the newest entry. Descending means highest to lowest. Basically we are ordering them from newest to oldest. You can also do this with a date timestamp, but we won't get into that right now.

Also, we are limiting this search to a maximum of 10 results. This is so the page doesn't get cluttered with too many search results. If you want unlimited results, I suggest creating a paging system and adding it to your search system so that it displays say 10 search results per page.
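
The paging idea boils down to a small calculation: turn a page number into the offset for the LIMIT clause. The sketch below shows only that step; the page_offset() helper is hypothetical, and the tutorial code itself does not implement paging.

```php
<?php

// Converts a 1-based page number into a row offset for "LIMIT offset, count".
// Anything that is not a positive number falls back to page 1.
function page_offset($page, $perPage = 10)
{
    $page = max(1, (int) $page);
    return ($page - 1) * $perPage;
}

// Example: build the LIMIT clause for page 3 with 10 results per page.
$limitClause = "LIMIT " . page_offset(3) . ", 10";
echo $limitClause . "\n"; // LIMIT 20, 10
```

Because page_offset() always returns an integer, its result can be placed directly into the query in place of the fixed "LIMIT 10".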

Now that we have run the search query against the database, we need to print the retrieved rows (provided there are any). We can do this by using a while() loop.

while ($row = mysql_fetch_assoc($searchQuery))
{
	echo "Blog Entry #" . $row["id"] . "<br /><br />" . stripslashes($row["blogbody"]) . "<hr />";
}



A few things are going on here. On each pass of the while loop, mysql_fetch_assoc() retrieves the next row from the search result and stores it in "$row" as an associative array, so we can print the data for that row. When there are no rows left, it returns false and the loop ends.

These values can be accessed with the variable name, brackets, and the field name in quotes:

$row["blogbody"]; // returns the blogbody field



The rest is pretty self-explanatory. We are echoing out the id of each blog entry that matched the search query, and then echoing out the actual blogbody of that entry. You may notice "stripslashes()" around the blogbody. I assumed that the stored text contains backslashes before single and double quotes (and a few other characters), e.g. \" instead of ", so we strip those slashes when echoing the text out.

Keep in mind that "mysql_real_escape_string()" on its own does not cause this: MySQL removes those backslashes again when it parses the query, so the stored text is unchanged. Stored backslashes only appear when the data was escaped twice before the insert, for example with magic_quotes_gpc enabled in addition to "mysql_real_escape_string()". If your data was escaped only once, leave "stripslashes()" out, because it would remove backslashes that really belong to the text.
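
If the blog bodies are meant to be plain text rather than HTML, it is also worth escaping them for output so that characters like < and & display literally. Here is a minimal sketch, assuming plain-text entries stored as UTF-8; render_entry() is a hypothetical helper that builds the same markup as the loop above.

```php
<?php

// Builds the HTML for one result row, escaping the body for display.
// (Hypothetical helper; assumes plain-text, UTF-8 blog bodies.)
function render_entry(array $row)
{
    $body = htmlspecialchars($row["blogbody"], ENT_QUOTES, "UTF-8");
    return "Blog Entry #" . (int) $row["id"] . "<br /><br />" . $body . "<hr />";
}

echo render_entry(array("id" => 7, "blogbody" => "1 < 2 & \"quoted\""));
```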

Well, that's pretty much it. I hope you found this tutorial helpful, and I hope that it wasn't too confusing. Here is the final code put together:

<?php

// Connect to MySQL
$c = mysql_connect("localhost", "root", "");
mysql_select_db("my_database", $c);

// Retrieve the Search String
$q = mysql_real_escape_string($_GET["q"]);
$newq = strtoupper($q);

// Run the Search Query
$searchQuery = mysql_query("SELECT * FROM `blog_entries` WHERE UPPER(`blogbody`) LIKE '%$newq%' ORDER BY `id` DESC LIMIT 10");

// Display Results Based on the Retrieved Data
while ($row = mysql_fetch_assoc($searchQuery))
{
	echo "Blog Entry #" . $row["id"] . "<br /><br />" . stripslashes($row["blogbody"]) . "<hr />";
}

?>



If you are wondering why I didn't just use "strtoupper" on the "$q" value itself, I wanted it separate so that the original search string was saved. This way, if you wanted to print out what they searched, you could:

<?php

// htmlspecialchars() keeps any HTML in the search string from being rendered
echo "You searched for <b>" . htmlspecialchars(stripslashes($q)) . "</b>";

?>


