Making search through text file?



Nis
04-02-2010, 12:10 AM
Hey guys, I've got a form on an HTML page that sends a string to a PHP script, and I want to search through a text file to find any partial matches to the string that was entered.

So let's say the textfile was delimited by | or an endline between each term.

Anyone know how to get me started? I only started php recently, but have been doing c++ for a year or so now. So please try to explain the syntax if possible as well, thanks.

traq
04-02-2010, 01:01 AM
if you have each term on a new line, you can use file() to read the file into an array, and then use preg_match() to compare the submitted string to each term:

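<?php

// Read one term per line, dropping trailing newlines and blank lines.
$terms = file('path/to/file.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$input = $_POST['user_submission'];

// preg_match() needs a delimited pattern; preg_quote() escapes any regex characters in the input.
$pattern = '/' . preg_quote($input, '/') . '/i';

foreach($terms as $t){
    // Look for the submitted string inside each term (case-insensitive partial match).
    if(preg_match($pattern, $t)){ echo 'Found match: '.htmlspecialchars($t).'<br>'; }
}

?>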

djr33
04-02-2010, 01:59 AM
Note: file_get_contents() is like file(), but it returns the whole file as a single string instead of splitting it into an array of lines. That may or may not be helpful.
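
A minimal sketch of that difference, assuming a throwaway temp file with made-up sample terms (the file name and contents are only for illustration):

```php
<?php
// Hypothetical sample data written to a temp file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'terms');
file_put_contents($path, "alpha\nbeta\ngamma\n");

$asString = file_get_contents($path);            // one string, newlines included
$asArray  = file($path, FILE_IGNORE_NEW_LINES);  // array with one element per line

// stripos() returns the offset of a case-insensitive match, or false if there is none.
$found = stripos($asString, 'BET') !== false;

unlink($path);
```

The single string is handy for a quick "is it anywhere in the file?" check; the array is easier when you need to know which line matched.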

Nis
04-02-2010, 04:25 AM
$terms = file('path/to/file.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

What does FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES do? It looks to me like it is some sort of function or something. I'm really not used to PHP because it is such a high level language.

Also, what if some of the terms also contain a delimiter? Basically it is sort of like a movie database search, where each line is in the format of

Actor(s) | Director | Movie Name

Where | is a delimiter. So if I search for Daniel Craig, for example, and he isn't on the current line, it'll short-circuit and go to the next line to see if he is in there.

By the way, I really appreciate the help. Thanks.

djr33
04-02-2010, 06:34 AM
In the code posted above the delimiter is not handled. It will simply give you the whole row in which that is found.
You can use explode('|',$variable) to split a text string into parts around a delimiter, but I'm not sure exactly what you want to do with that.
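
As a rough sketch of what explode() does with one line in that format (the sample line is made up for illustration, and trimming the padding spaces is my own assumption about the file layout):

```php
<?php
// Hypothetical sample line in the "Actor(s) | Director | Movie Name" layout.
$line = 'Daniel Craig | Martin Campbell | Casino Royale';

// explode() splits on the delimiter; array_map('trim', ...) removes the padding spaces.
$fields = array_map('trim', explode('|', $line));

// Unpack the three parts into named variables.
list($actors, $director, $movie) = $fields;
```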

As for the capitalized names at the end of the function call, they are predefined PHP constants passed as a parameter.
To explain, a constant is like a variable except that it cannot change its value (ever).
define('CONSTANT','value');
Those above are predefined by PHP to be certain values. The idea then is that they are parsed by the function (based technically on their assigned values), then this makes the function do different things such as in this case "ignore new lines". In this type of use in a function, they are called "flags" in the sense that they are markers for certain parameters to be processed.
Now that's just the general explanation. I'm not sure about these specific items because I haven't had the need to use them.
But for questions like this it's always a good idea to refer to the php.net reference page for the function in question. They'll be defined there:
http://php.net/manual/en/function.file.php
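
A small sketch of how flags like these usually work, assuming only that the constants are integers that can be combined bitwise (the variable names are made up for illustration):

```php
<?php
// The | operator is a bitwise OR: it merges the two integer constants into one value.
$flags = FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES;

// A bitwise AND checks whether a particular flag is part of the combined value.
$ignoresNewLines = ($flags & FILE_IGNORE_NEW_LINES) !== 0;
$usesIncludePath = ($flags & FILE_USE_INCLUDE_PATH) !== 0;
```

So the | in the file() call is not a function or a delimiter; it is the same bitwise OR operator you know from C++.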

Nis
04-02-2010, 04:56 PM
Alright, I think I can somehow do this. Thanks a lot guys!

traq
04-02-2010, 07:47 PM
The FILE_IGNORE_NEW_LINES flag removes the \n (newline) character from each line of the file, so you'd end up with 'this is a line from the file' instead of 'this is a line from the file\n'.

The FILE_SKIP_EMPTY_LINES flag tells the function to skip lines with no content (otherwise, you'd end up with empty strings in your array wherever there was an empty line in the file).
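
Here is a quick sketch showing both flags in action, assuming a temp file with a blank line in the middle (the file and its contents are made up for illustration):

```php
<?php
// Hypothetical three-line file: two terms with an empty line between them.
$path = tempnam(sys_get_temp_dir(), 'flags');
file_put_contents($path, "first\n\nsecond\n");

$raw     = file($path);  // no flags: newlines kept, blank line kept
$cleaned = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

unlink($path);
```

Without the flags you get "first\n", "\n", "second\n"; with both flags you get just 'first' and 'second'.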



In your post above, it looks like you may have multiple "actor" names on each line? If so, you'd need to use a second (different) delimiter (e.g., "actor one,actor two|director|movie name") and use explode() twice. With this complexity, however, it would be much quicker (and simpler) to use a database.
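
A rough sketch of the "explode() twice" idea, assuming commas between actors and pipes between fields as in the example line (the search string and variable names are made up):

```php
<?php
// Hypothetical line: actors separated by commas, fields separated by pipes.
$line = 'actor one,actor two|director|movie name';

list($actorField, $director, $movie) = explode('|', $line);  // first split: fields
$actors = explode(',', $actorField);                          // second split: actors

// Keep only the actor names that contain the search string (case-insensitive).
$search  = 'TWO';
$matches = array_filter($actors, function ($a) use ($search) {
    return stripos($a, $search) !== false;
});
```

Note that array_filter() keeps the original array keys, so wrap the result in array_values() if you need it re-indexed from 0.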

Good luck!

djr33
04-02-2010, 08:38 PM
I agree about using a database. Once the search operations become complex (such as searching for one thing that is embedded inside another), it's much faster and easier to use a database.
In theory you can use a text file for anything, but the programming for databases will be much more efficient (and faster), and it will be more standard so it will be easier to figure out.

Databases are a bit harder at the basic level but from there it's a lot easier to get to an advanced level.

Nis
04-05-2010, 06:55 PM
I understand that the normal approach is a database, but since this is an assignment, the prof tends to come up with theoretical situations that aren't practical in real life.

james438
04-05-2010, 07:58 PM
Homework assignments are not allowed to be discussed here and requests for such will quickly get locked. The reason is that getting your work done for you on this forum is sorta like cheating.

I do find the topic somewhat interesting to read. I do not think the parameters of what is being requested have been clearly defined though. Storing the data in a database and then searching the database is the best way to go whenever possible. It is faster, more efficient, and you get more accurate results too with less margin for error.

I think that it is important to be able to search through a text document, though, because sometimes you need to search through one or multiple php files for a particular line of code for the purpose of fixing a hard to find error or errors.

traq
04-05-2010, 08:12 PM
I generally prefer XML for text files more complex than a usage log.

djr33
04-06-2010, 04:52 AM
You can ask QUESTIONS about homework, but you should not be asking for us to do it for you. The difference is that you will post your code and ask for specific help, then we will see if there's some way to help you with what is giving you trouble-- not tell you where to start, what to do next and continue until it's done, and certainly not write it for you. There's no point, if you're in a class. If you don't do the work, you won't learn, and there's no reason we'd want extra homework.

I think the entire point of this assignment is to learn how to parse text files and that's actually useful to understand. If you want to go the advanced route, you can look into regex/regular expressions, but that can get complex.
If not, just start understanding the basics of reading files (it's not that hard-- just take some examples from php.net and work it out from there), then learn to work with strings.
If you aren't entirely confused by arrays, one easy way out is always to use the function called "explode()".

Of course cross-referencing will become the big complication. In a database you can actually just do a search. Here you'd have to load all of the data into variables to search through that. Arrays will help. Or you could just try to split the text file at the instance of the search term and find the closest related value (split at city, then find state, or whatever your goal might be).
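
To make "load all of the data into variables" a bit more concrete, here is a minimal sketch under some assumptions: three pipe-separated fields per line, and two helper functions (parseRecords and searchRecords) that are hypothetical names, not anything built into PHP. The sample lines are made up as well.

```php
<?php
// Hypothetical helper: parse "actors | director | movie" lines into associative arrays.
function parseRecords(array $lines) {
    $records = [];
    foreach ($lines as $line) {
        $parts = array_map('trim', explode('|', $line));
        if (count($parts) !== 3) { continue; }  // skip malformed lines
        $records[] = ['actors' => $parts[0], 'director' => $parts[1], 'movie' => $parts[2]];
    }
    return $records;
}

// Hypothetical helper: return every record in which any field contains $needle.
function searchRecords(array $records, $needle) {
    $hits = [];
    foreach ($records as $r) {
        foreach ($r as $value) {
            // stripos() is a case-insensitive substring search.
            if (stripos($value, $needle) !== false) { $hits[] = $r; break; }
        }
    }
    return $hits;
}

// Made-up sample data standing in for the result of file().
$lines = ['Actor A, Actor B | Director X | Movie One', 'Actor C | Director Y | Movie Two'];
$hits  = searchRecords(parseRecords($lines), 'director y');
```

Once a match is found, the other fields of the same record are right there in the array, which is the cross-referencing part.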

For good reason, no one uses text files to search and parse all the time. They're fine for a once in a while thing, such as storage, but then for real use they are put into a database for searching. And you can output later to text files for storage, etc.
In other words, they're only parsed into and out of the database. After that they're used within the database. And it's much easier doing it once in a while just loading/unloading than actually using them.
Of course you can just deal with it by loading them into a database or set of complex variables just at the moment to search, but that's VERY VERY inefficient and I can guarantee that anyone giving out that kind of assignment is just testing your knowledge of the language/logic, and not hoping you'll ever use this for anything.


Anyway, good luck, and please keep in mind that we should be helping when you get stuck, not doing anything for you, especially when it comes to homework assignments.