Results 1 to 10 of 10

Thread: mySQL LIKE

  1. #1
    Join Date
    Aug 2005
    Posts
    174
    Thanks
    3
    Thanked 0 Times in 0 Posts

    Default mySQL LIKE

    Hello,

    I have a field in my table with a string of letters, each letter represents a "feature" of a product.. e.g. if a product has touch screen and has bluetooth the string would be tb.

    This gets a little complicated when a product has lots of features like this
    Code:
    scbmrtl
    Using mySQL LIKE, i would want to search for products with multiple features.

    How can i construct search string in order for this to work?

    PHP Code:
    Features LIKE '$searchstring' 
    i have tried %t%b% but this doesnt always work... like in the string above..

  2. #2
    Join Date
    Jan 2007
    Location
    Davenport, Iowa
    Posts
    2,385
    Thanks
    100
    Thanked 113 Times in 111 Posts

    Default

    Code:
    select column from table where column regexp '[scbmrtl].*[scbmrtl]'
    Is the best I can think of. The above will match 'ss' and 'scb' and 'sss' and 'eppsppt' but not 's' or 'zvc'. It is designed to look for at least two letters (from your list) in your column even if the same one is listed twice.

    I was wracking my brains to try and get LIKE to work, but couldn't get anything to work. The more I thought about it your problem was begging for a regexp to step in. It is best to avoid using regexp except when you need it, because regexp tends to be server heavy. This may not be the case with MySQL queries like it is with PHP, but I doubt it. To read up more on regexp and MySQL go here.

    EDIT: Actually I think I could write out the code where LIKE would be used instead of using regexp, but the query would be around 15 to 20 times longer and would probably be just as server heavy as the above code if not more so.
    Last edited by james438; 12-31-2008 at 08:05 AM.
    To choose the lesser of two evils is still to choose evil. My personal site

  3. #3
    Join Date
    Jun 2005
    Location
    英国
    Posts
    11,876
    Thanks
    1
    Thanked 180 Times in 172 Posts
    Blog Entries
    2

    Default

    This is a fine example of how not to design your tables.

    There are two options, both of which are superior to this:
    1. Use a separate column boolean for each feature, or
    2. Use a bitset.
    Using separate columns may make your table quite wide, but is also the simplest (and probably most 'correct') way of doing it. The biggest disadvantage is that when adding a new feature you must add a whole new column. If you're not going to be adding new features to the table very often, that probably doesn't matter.

    Using a bitset, on the other hand, involves only a single column, and you can search the table for (a) specific feature(s) with the built-in binary operators. Adding a feature involves only setting the appropriate value. However, this does make the data a little harder to decipher at a glance (documentation for the meaning of each position in the bitset must be stored elsewhere, probably in the code), and won't be quite as efficient as the separate columns (although still much more efficient than your original strings).

    Note that % does match any number of characters, including none; your example above should work.
    Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!

  4. #4
    Join Date
    Aug 2005
    Posts
    174
    Thanks
    3
    Thanked 0 Times in 0 Posts

    Default

    hello, i did the obvious and just used this for each of the features:
    Code:
    .. AND Features LIKE '%b%' AND Features LIKE '%t%' .. etc
    thanks for your help

  5. #5
    Join Date
    Jan 2007
    Location
    Davenport, Iowa
    Posts
    2,385
    Thanks
    100
    Thanked 113 Times in 111 Posts

    Default

    Twey, when I design tables, my goal is to try to keep it as simple and efficient as possible. Having a seperate column for each boolean feature just sounds needlessly complicated. The same is true with bitset, however I must add that I am not very familiar with bitset nor of any good reason to use bitset. I am sure there is a good reason, but I am simply not aware of one. I tried reading up on bitset over in dev.mysql and got a little more understanding of what and how bitset is used, but not why.

    Note that % does match any number of characters, including none; your example above should work.
    The first part is true. the second part of your statement is not. What if 'b' and 't' were reversed in position? Then it would not work.

    nikomou, your example will only work in certain circumstances. The way you have it written all of the features must be present in order for the statement to be true as opposed to two or more non matching features being present in a given field.

    I am happy to see that people do read this part of the forum. MySQL seems like a rather neglected subject.
    To choose the lesser of two evils is still to choose evil. My personal site

  6. #6
    Join Date
    Jun 2005
    Location
    英国
    Posts
    11,876
    Thanks
    1
    Thanked 180 Times in 172 Posts
    Blog Entries
    2

    Default

    What if 'b' and 't' were reversed in position? Then it would not work.
    Yes, but why not simply sort the string first, or insert new features in alphabetical order? That way you don't have to worry about them being in orders other than the ones you specified.

    That's all academic anyway, though, because:
    Twey, when I design tables, my goal is to try to keep it as simple and efficient as possible.
    ... you've managed to find the least simple and least efficient manner of tackling this particular problem
    Having a seperate column for each boolean feature just sounds needlessly complicated.
    When thinking of SQL, don't consider what the tables look like; they only have to be created once. Consider, instead, the queries. The separate columns are actually the simplest approach. I think that any layman could say with reasonable confidence that SELECT * FROM table WHERE has_bluetooth AND has_touchscreen is considerably less complicated than SELECT * FROM table WHERE features LIKE '%b%' AND features LIKE '%t%', but what they probably couldn't point out is the relative expense of performing those string operations: both the individual columns option and the bitset option are effectively O(1) on the match, but matching an arbitrary substring is O(m*n) in the worst case, with m being the length of the substring and n the length of the string, and you do it many many times in each query — at least once for each row in the table, making the whole thing O(m*n*o) (o being the number of rows), even if you're only doing a single match in the query. Including a number of matches p brings it up to O(m*n*o*p). Now, m (the length of the substring) will technically only ever be 1 in your case, so we can remove it from the equation, but n can be reasonably be expected to be around 10, p from 1 to 10, and o to be in the hundreds in even a small table — depending on the success of your site it might end up in the hundreds of thousands. On top of this there's overhead from parsing the LIKE pattern, which is not a simple substring but a more complicated domain-specific language, which gives us particularly large constants.
    Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!

  7. #7
    Join Date
    Apr 2007
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    Use explode()

    Then you match every single letter alone (Using LIKE)

    Then you display all the results together

    This is the best way i can think of

  8. #8
    Join Date
    Jun 2005
    Location
    英国
    Posts
    11,876
    Thanks
    1
    Thanked 180 Times in 172 Posts
    Blog Entries
    2

    Default

    In the code, define (PHP):
    Code:
    define('BLUETOOTH', 1);
    define('TOUCHSCREEN', 2);
    define('OTHERFEATURE', 4);
    define('NEWFEATURE', 8);
    Then:
    Code:
    mysql_query(sprintf('SELECT * FROM products WHERE features_bitset & %d',
                        BLUETOOTH + TOUCHSCREEN));
    Inserting them is done with +:
    Code:
    mysql_query(sprintf('INSERT INTO products (features_bitset) values (%d)',
                        BLUETOOTH + OTHERFEATURE + TOUCHSCREEN));
    And that's a bitset. Very simple, you just have to keep track of which value means what.
    Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!

  9. #9
    Join Date
    Jan 2007
    Location
    Davenport, Iowa
    Posts
    2,385
    Thanks
    100
    Thanked 113 Times in 111 Posts

    Default

    Sorry for the delay in response to your last statement. I just didn't think that there was much for me to ask or contribute further to this thread. My response to nikomou may have been the least efficient, but I was trying to work with the scenario that I was given without altering his table. Honestly the idea of using bit set or creating a potentially long table just didn't occur to me.
    When thinking of SQL, don't consider what the tables look like; they only have to be created once.
    I am more familiar with my own situation as opposed to others, but I find that I am often altering my tables as I find better and more efficient ways to organize, store, and retrieve my data. I also want my table to make as much sense to me as possible if I find that I need to go back and change or fix something.

    This might scare you a bit, but I have one table that I designed which stores a rather large member timesheet within each row, because I did not want to create a new table for each day's worth of data generated. To make matters more complicated almost every column is a person's name, about 100, and since names are added, subtracted, and/or corrected frequently the table columns must be altered almost daily as well. Because of this I have modified my MySQL table creation to make my tables as understandable (mostly to me) as possible. Especially since I may need to go back and correct someone's name or improve something. I say that it may scare you, because you may know of a simpler method, but it was the best I could think of at the time . Yes, adding a column for each boolean feature does still sound needlessly complicated to me.

    SELECT * FROM table WHERE has_bluetooth AND has_touchscreen
    Sorry, could you explain this statement? I am not really seeing how this works. MySQL is an area where I feel I have only gotten my feet wet, so I have a great deal left to learn yet as opposed to PHP which I have worked much more with and am more comfortable with.

    And that's a bitset. Very simple, you just have to keep track of which value means what.
    I am sure this is implied, but I would store the values in the database . I am currently reading up on bitset, by the way, as this is new to me. I only first heard about it in this thread. It sounds promising and I may end up redesigning some of my database search programs to incorporate it.
    To choose the lesser of two evils is still to choose evil. My personal site

  10. #10
    Join Date
    Jun 2005
    Location
    英国
    Posts
    11,876
    Thanks
    1
    Thanked 180 Times in 172 Posts
    Blog Entries
    2

    Default

    I am sure this is implied, but I would store the values in the database
    The overhead of the subquery would pretty much negate the speed benefits.
    Yes, adding a column for each boolean feature does still sound needlessly complicated to me.
    Complicated? How could anything be simpler? It's certainly simpler than that convoluted mess of LIKE queries.
    This might scare you a bit, but I have one table that I designed which stores a rather large member timesheet within each row, because I did not want to create a new table for each day's worth of data generated. To make matters more complicated almost every column is a person's name, about 100, and since names are added, subtracted, and/or corrected frequently the table columns must be altered almost daily as well.
    It's about time someone introduced you to the wonders of relationships. If your table changes frequently then there's something fundamentally wrong with your design. Specific, dynamic data like names should not be realised as columns. Nor should new tables be created with that frequency. If you want to post the relevant parts of your database layout, I'll normalise it for you.
    MySQL is an area where I feel I have only gotten my feet wet, so I have a great deal left to learn yet as opposed to PHP which I have worked much more with and am more comfortable with.
    Clearly. I strongly suggest you read through a good tutorial before doing anything more with them. Bitsets are one thing, but you really should at least be familiar with relationships, and I disapprove of any teaching method which omits them from the curriculum.
    Sorry, could you explain this statement?
    It's almost the simplest query imaginable — certainly the simplest using SELECT, WHERE, and AND. It says 'select every column from every row from the table "table" where the has_bluetooth column and the has_touchscreen column are both true.'
    Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •