search script question
james438
05-11-2010, 07:17 AM
My search script can search for terms and search for phrases. If I add a "-" before the term it will search the database for results that do not contain the negated word.
I am just not sure how users would consider negating phrases. I would do it like this: css php mysql -website -"sample scripts"
This will search for all results that contain all the words: "css", "php", "mysql", but not the word "website" or the phrase "sample scripts".
Does this seem like the way users would commonly consider using a search script to negate phrases without reading the instructions?
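For reference, the syntax described above could be tokenized roughly as follows. This is only a minimal sketch in Python, not the actual script from this thread; the names `Term`, `TOKEN` and `parse_query` are made up for illustration.

```python
import re
from typing import List, NamedTuple

class Term(NamedTuple):
    text: str        # the word or phrase, lowercased
    negated: bool    # True when the token was prefixed with "-"
    is_phrase: bool  # True when the token was wrapped in double quotes

# An optional leading "-", then either a double-quoted phrase or a bare word.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')

def parse_query(query: str) -> List[Term]:
    """Split a search string into plain and negated words and phrases."""
    terms = []
    for minus, phrase, word in TOKEN.findall(query):
        text = phrase or word
        if text:  # skip empty phrases such as ""
            terms.append(Term(text.lower(), minus == "-", bool(phrase)))
    return terms

for term in parse_query('css php mysql -website -"sample scripts"'):
    print(term)
```

With the example query, the first three terms come back as plain words, "website" as a negated word, and "sample scripts" as a negated phrase.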
djr33
05-11-2010, 02:46 PM
Users who don't know the instructions will immediately type in several related words without thinking much and hope to get results. For example, "css php mysql". They won't think to use negation.
That comes later, once they get a set of results they don't like and want to refine it, in many cases following the directions.
If they do just guess and start adding - signs before words, what other result would they expect? I don't see another logical possibility.
(It's very possible they won't know the syntax or won't know how to use it, but if they do think that a - before a word means negation, then what other result would it generate?)
This sounds very similar to my search here:
http://www.dynamicdrive.com/forums/showthread.php?t=54348
Is it based on/related to that?
And if not, then it sounds like we have the same syntax. That's a good start: at least we're thinking alike, so some of the users may as well.
For my search I have some general questions like you do about users knowing what to do, but the way that I look at it is that only advanced users will do anything beyond just typing in a set of keywords. The only knowledge required for that is whether they realize that multiple terms are combined with "and" or "or", but if they start getting too few results even the casual user will figure out to use fewer or less specific keywords.
I'd agree - most users don't even have a concept of negation; those that do will likely have experience using the ( - ) operator, and/or will have taken the time to read the directions.
james438
05-11-2010, 09:30 PM
As I was reading through your thread I was inspired to continue developing my own search program that I use on my site. I have since added the Ajax Pagination, which really did turn out to be quite an undertaking due to the nature of the search program I use. In fact, after I finished it I relabeled it search 5.0 from version 4.5. After it was rewritten I started 5.1, which involved streamlining the code. I put the search form into its own page, because the form is actually rather dynamic and intricate as well. At about this point I considered introducing negation, because as I was writing the code for 5.1 it occurred to me how to easily add negation. This may have been inspired a little by your work as well.
I like telling a little about my search program, because I have put a lot of work into it and am a bit proud of it :).
Another way to negate a phrase would be to add NOT "sample scripts" or the like. I am not sure if Google does negation of phrases, but the little that I tried did not seem to work. After thinking about it last night and today, I will use - to negate phrases, so that it looks like -"sample scripts". Currently only individual words can be negated.
The following is a digression:
I was impressed with how quickly you came up with a rather advanced search program. The layering is certainly more advanced than mine. I also did not think to do parenthetical layering of results. I may look into doing that with my script as well. For now I don't mind multiplying the queries out. It is all dynamic and I find that it is somewhat easier to read that way (for me) when I need to read the generated queries to see if they are being generated correctly or to find a bug.
I suggest creating different posts and searching for them to see if they are being retrieved correctly. One early problem I had with my script was that certain results were being returned more than once, which made the number of results slightly larger than what was actually in the database. Other times certain results were not being retrieved, because they were being searched case sensitively despite the use of lcase(). This was due to searching an integer datatype concatenated with a lcase() text datatype. This only happened in the MySQL 5.0 database. Also, if you concatenate a NULL value with a text value or whatnot, the entire value is considered NULL as well.
I recently updated my query to use UNION. Here is a simplified version of it:
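The NULL behaviour mentioned above is easy to reproduce. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for MySQL, so `||` plays the role of concat() and lower() the role of lcase(); the table contents are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE anime (id INTEGER PRIMARY KEY, summary TEXT, image TEXT)")
conn.executemany(
    "INSERT INTO anime (id, summary, image) VALUES (?, ?, ?)",
    [(1, "Space Opera", "opera.png"), (2, "Space Western", None)],
)

def search(sql):
    """Run a one-placeholder search query and return the matching ids."""
    return [row[0] for row in conn.execute(sql, ("%space%",))]

# Row 2 has a NULL image, so the whole concatenation is NULL and never matches.
naive = search(
    "SELECT id FROM anime "
    "WHERE lower(CAST(id AS TEXT) || summary || image) LIKE ? ORDER BY id"
)

# Replacing NULL with an empty string keeps the row searchable.
guarded = search(
    "SELECT id FROM anime "
    "WHERE lower(CAST(id AS TEXT) || summary || IFNULL(image, '')) LIKE ? ORDER BY id"
)

print(naive, guarded)
```

The first query silently loses the row with the NULL image, while the IFNULL version returns both rows.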
(SELECT ID, title, '' AS missing FROM anime WHERE (lcase(concat(CAST(ID AS CHAR), summary, IFNULL(image, ''))) LIKE '%e%') ORDER BY ID)
UNION
(SELECT ID, ID, title FROM misc WHERE (lcase(concat(CAST(ID AS CHAR), summary)) LIKE '%e%') ORDER BY ID)
UNION
(SELECT ID, title, date2 FROM manga WHERE (lcase(concat(CAST(ID AS CHAR), summary, IFNULL(image, ''))) LIKE '%e%') ORDER BY ID)
LIMIT 0, 10
The benefit to this is that I can categorize my results and each category can be sorted in a different way. Also the LIMIT is now easy to implement as opposed to using complicated php to keep the number of results manageable.
The biggest benefit to using UNION is that I can now easily integrate it with DD's Ajax Pagination script to decrease server load and speed up queries. I definitely recommend the use of UNION to combine queries and allow for greater control of how results are sorted and displayed. I do not think it would be all that useful if you are only searching one table though.
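One way to make the per-category ordering and the paging explicit is to add a literal category column to each SELECT and apply a single ORDER BY and LIMIT to the whole UNION. As far as I know, an ORDER BY inside an individual SELECT of a UNION is not guaranteed to control the order of the combined rows, so ordering the combined result is the safer route. The sketch below is in Python with SQLite standing in for MySQL; the tables, rows, `SEARCH_SQL` and `search_page` are all hypothetical, and wildcard characters in the search term are not escaped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE anime (id INTEGER PRIMARY KEY, title TEXT, summary TEXT);
    CREATE TABLE manga (id INTEGER PRIMARY KEY, title TEXT, summary TEXT);
    INSERT INTO anime VALUES (1, 'Alpha', 'robots'), (2, 'Beta', 'pirates and robots');
    INSERT INTO manga VALUES (1, 'Gamma', 'robots again'), (2, 'Delta', 'cooking');
""")

# The literal category column lets one ORDER BY group and sort the combined rows.
SEARCH_SQL = """
    SELECT 'anime' AS category, id, title FROM anime WHERE lower(summary) LIKE :pattern
    UNION ALL
    SELECT 'manga' AS category, id, title FROM manga WHERE lower(summary) LIKE :pattern
    ORDER BY category, id
    LIMIT :page_size OFFSET :offset
"""

def search_page(term, page, page_size=2):
    """Return one page of results across both tables (pages start at 0)."""
    params = {
        "pattern": f"%{term.lower()}%",
        "page_size": page_size,
        "offset": page * page_size,
    }
    return conn.execute(SEARCH_SQL, params).fetchall()

first_page = search_page("robots", 0)
second_page = search_page("robots", 1)
print(first_page)
print(second_page)
```

UNION ALL is used here because the category column already keeps rows from different tables distinct; plain UNION would also remove duplicate rows at some extra cost.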
I used to think that fulltext indexing was the next greatest thing, but the more I learn about it the more I do not like it.
I think that I am going to put the search program on the back shelf for a while after I add the ability to negate phrases, because the search program will operate pretty much how I want. I am not sure of any other way to improve it at present.
james438
05-11-2010, 09:32 PM
I'd agree - most users don't even have a concept of negation; those that do will likely have experience using the ( - ) operator, and/or will have taken the time to read the directions.
Kinda surprised to hear that. I don't disbelieve you, but I use negation a lot to refine my google searches.
djr33
05-11-2010, 09:45 PM
But how did you find out about the negation? I'm guessing you didn't just accidentally type that in there.
I was impressed with how quickly you came up with a rather advanced search program. The layering is certainly more advanced than mine. I also did not think to do parenthetical layering of results. I may look into doing that with my script as well.
Thanks, but the whole reason that it works in such a complex way is in the basic idea behind it: it extracts the terms into an array then builds the structure. After that, they're put back together. For this reason it's pretty easy to add in things like layering, quoted material, etc.
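To illustrate the "extract the terms into an array, then build the structure" idea, here is a minimal Python sketch, not the actual code from either script. The function `build_where` and the column name are assumptions, the terms are assumed to be already parsed into (text, negated) pairs, and layering with parentheses is left out.

```python
from typing import List, Tuple

def build_where(column: str, terms: List[Tuple[str, bool]]) -> Tuple[str, List[str]]:
    """Turn (text, negated) pairs into a parameterized WHERE fragment.

    `column` is inserted into the SQL as-is, so it must come from trusted
    code and never from user input.
    """
    clauses, params = [], []
    for text, negated in terms:
        operator = "NOT LIKE" if negated else "LIKE"
        clauses.append(f"lower({column}) {operator} ?")
        params.append(f"%{text.lower()}%")
    # With no terms at all, match everything rather than emit an empty WHERE.
    return (" AND ".join(clauses) or "1 = 1"), params

where, params = build_where(
    "summary",
    [("css", False), ("website", True), ("sample scripts", True)],
)
print(where)
print(params)
```

Because a phrase is just one entry in the array, negating it needs no special case: it becomes a single NOT LIKE on the whole phrase.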
Anyway, I think your approach is just fine to negation, though I'm probably the wrong person to ask because you're basing it on my syntax, haha.
And note that google does use this same format:
http://www.google.com/advanced_search?q=hi+-hello
(Type in "hi -hello" into the search bar and you'll get the same thing.)
Kinda surprised to hear that. I don't disbelieve you, but I use negation a lot to refine my google searches.
By "most users," I'm probably not referring to people (including you, djr, etc.) who are into programming. :D
If you've ever had to explain to a friend or family member how to click on the "New Message" button in Yahoo mail, you might know what I'm talking about. I'd love to see statistics (if they exist) on how accurate most search engine queries are - meaning how accurate the person's query was compared to what they actually wanted to find.
james438
05-12-2010, 04:19 AM
And note that google does use this same format:
http://www.google.com/advanced_search?q=hi+-hello
(Type in "hi -hello" into the search bar and you'll get the same thing.)
You can negate individual terms, but negating phrases is slightly different. I have not looked too hard into whether negating phrases in Google works.
I implemented the ability to negate phrases using the "-" sign.
I think I see now what you mean, Traq.
djr33
05-12-2010, 07:05 PM
I certainly see no reason why phrases and individual words should differ.
For negation plus quotes, it gets a little complex: a - before a " could mean either "negate the whole phrase" or "negate a term that starts with a literal quote character". I think the choice is clear enough, though: it should negate the phrase.
Also, within quotes, what does a - do? Does it negate the word or act as a literal hyphen symbol? Here, again, it seems obvious that it should be literal and not a negation symbol.
james438
05-12-2010, 08:50 PM
Yep, you got it. Quotes act as a delimiter for phrases. I could update it to search for quotes, but that seems unnecessary and would probably make searches more complicated for the user as opposed to less.
A hyphen within quotes acts as a hyphen. Negating terms differs from negating phrases in that a phrase retains the order of its terms. I may want to ignore all results that contain the phrase "visit this spam site" because I know that several of the results will contain that phrase, which indicates spam, but I don't want to ignore all results that contain all of those words or ignore all results that contain any of those words, because they are rather common.
No matter. I'm kinda just discussing the concept of a search script at this point since the script is the way I want it now. The only real change I have made to my search since my last post was to add some annotations, which also act as "delimiters".
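The difference between the three interpretations can be shown with a small Python sketch. The helper names and the two sample texts are invented for illustration, and matching is reduced to simple lowercase substring and word checks.

```python
def excluded_by_phrase(text: str, phrase: str) -> bool:
    """Exclude only when the words appear together, in order."""
    return phrase.lower() in text.lower()

def excluded_by_all_words(text: str, phrase: str) -> bool:
    """Exclude when every word of the phrase appears somewhere in the text."""
    words = set(text.lower().split())
    return all(word in words for word in phrase.lower().split())

def excluded_by_any_word(text: str, phrase: str) -> bool:
    """Exclude when at least one word of the phrase appears in the text."""
    words = set(text.lower().split())
    return any(word in words for word in phrase.lower().split())

PHRASE = "visit this spam site"
spam = "Please visit this spam site today"
legit = "This site explains how to filter spam before you visit"

for check in (excluded_by_phrase, excluded_by_all_words, excluded_by_any_word):
    print(check.__name__, check(spam, PHRASE), check(legit, PHRASE))
```

Only the phrase check keeps the legitimate text: it contains all four words, so both word-based checks would throw it out along with the spam.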