Off and on I have spent a lot of time with creating php/MySQL based search programs. The preferred method is to use fulltext indexing for fast, efficient, and accurate returns on queries. I am against this for a few reasons. fulltext indexing results are based on the indexed tables, which need to be indexed at regular intervals in order to get up to date returns. Stopwords are a factor. The min word length (ft_min_word_len) is often set at 4 letters and in order to alter this from 4 to 3 you need root access to your MySQL database, which with my GoDaddy account requires the significantly more expensive Deluxe account. Terms are also included that I would rather were not. Searching for "keyboards" could return "keyboard". Searches that are found in over 50% of the results are ignored as well. As I understand it Google uses search patterns similar to this.
All that being said I use MySQL's LIKE function.
Code:
SELECT * FROM table WHERE col1 LIKE '%term%'
.
The searches will be be in realtime and will find results even if the text to be searched was entered into a database 2 seconds ago. Remember to sanitize the searches though so as to avoid SQL injection. The searches will be slower, but for most people with hardly be noticeable. This also is not the same as fulltext indexing as listed above.
Skipping forward several paragraphs here is an example of one of my queries:
Code:
SELECT ID, summary, image FROM memoblog WHERE
(lcase( concat(cast(ID as char), summary, IFNULL(image,'')))
LIKE '%yoo%' )
Breaking this apart I am returning the ID, summary, and image from table "memoblog" if "yoo" is found anywhere in ID, summary, or image columns.
lcase is used to make the search case insensitive.
concat means concatenation, which means the 3 columns are searched as if they were a single unified column. This is useful when searching multiple columns with multiple terms where one term may be found in one col and another is found in another column, but all of the terms are not found in a single column together. I hope that makes sense.
cast is used here because if you are using MySQL 5.0 and later and you search your database for a column that is text and another column that is an integer the search becomes case sensitive again. I forget why or how though. cast (col as char) will return integers as alphanumeric characters.
IFNULL if any field is null then no results will be returned for that particular row. IFNULL will allow you to recognize NULL values as something else. In this case the NULL value is returned as empty as opposed to NULL.
The actual code I use in my search program is certainly more dynamic and can be rather complex if needed.
There are other parameters that need to be considered for the admin like how do you make certain results unsearchable? Sometimes you want to keep certain entries from being found via your search program. How do you make those hidden results appear depending on the user? How do you keep the program simple enough so as to be easily editable? How do you search specific columns in multiple tables? Are you using MySQL 5.x or 4.x (big difference here)?
PHP based questions include: how will you return the results? Will you return the enitre document or just a portion of 200 words or so that contains one or more of the search terms? Will you limit the searches to 10 results per page? Pagination? Can the user search for exact phrases?
Other questions are implied, so I will avoid mentioning them here.
On a side note there are free search programs out there, but I had some very specific requirements, so I made my own. It was not easy and a lot of the information was not all that easy for me to find, like ft_min_word_len manipulation, but it was rewarding. It has been a while since I have worked on my search engine and so I may be a bit out of practice with this.
Bookmarks