Resolved Regex extract sequence of a certain format from string
djr33
10-04-2010, 05:44 PM
I think this is possible, but it's beyond my ability with regex.
What I want is to extract a certain pattern from a string.
So if the pattern is "any number" then I want to extract it from any string:
'this is a string123 test'
That returns '123'.
The specific pattern I am matching is a 10-digit hexadecimal code.
This means: exactly 10 characters, each of which is 0-9a-f.
I would like the function to take a string as input and return an array containing all possible matches.
Let me know if that isn't clear.
bluewalrus
10-04-2010, 06:16 PM
If the numbers are always all grouped together this should work.
$string = "this is a string123 test";
$numbers = preg_replace('/.*?(\d+).*/', '$1', $string);
echo $numbers;
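Actually, on rethinking this, you may not even need a regex.
You can just set it as an integer, but PHP only keeps the leading digits, so this works only when the string starts with the number ("123 test" becomes 123, while the example string above becomes 0).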
settype($string, "integer");
[\dABCDEF]{10}
untested.
I think [0-9A-F]{10} would work equally well, and [0-9A-Fa-f]{10} if you might have lowercase hex numbers.
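Since the first pattern above was posted untested, here is a quick check of how the suggested character classes behave. It is only a sketch: the helper name hex_pattern_matches and the sample codes are made up for illustration.

```php
<?php
// Hypothetical helper: does $pattern find a match anywhere in $subject?
function hex_pattern_matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

$upper = 'key: ADE40275AD';
$lower = 'key: ade40275ad';

// [\dABCDEF] and [0-9A-F] accept uppercase letters only.
var_dump(hex_pattern_matches('/[\dABCDEF]{10}/', $upper)); // bool(true)
var_dump(hex_pattern_matches('/[\dABCDEF]{10}/', $lower)); // bool(false)
var_dump(hex_pattern_matches('/[0-9A-F]{10}/', $lower));   // bool(false)

// [0-9A-Fa-f] accepts both cases; so does [0-9a-f] with the /i modifier.
var_dump(hex_pattern_matches('/[0-9A-Fa-f]{10}/', $lower)); // bool(true)
var_dump(hex_pattern_matches('/[0-9a-f]{10}/i', $upper));   // bool(true)
```

Note that {10} only requires ten such characters in a row somewhere; none of these patterns reject a longer run of hex characters.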
james438
10-04-2010, 10:30 PM
Could I have a better example of some of the text that you need processed? For example the following would work:
<?php
$css = "this is a test:ade40275ad tadaa.bde40275ad ade40275ad ade40275ad";
preg_match_all('/[a-fA-F0-9]{10}/s', $css, $match);
print_r($match);
?>
resulting in:
Array ( [0] => Array ( [0] => ade40275ad [1] => bde40275ad [2] => ade40275ad [3] => ade40275ad ) )
I suspect that the parameters that you are working with are somewhat more complicated than that.
djr33
10-05-2010, 02:35 PM
Bluewalrus, thanks for the post, but I need this for hexadecimal, not just regular integers. I like that creative method though. It actually might be possible if I set the format of the number to hexadecimal and then try that... not sure if PHP is capable of handling it that way though.
Adrian and James, thanks! One of those will work, I'm sure.
Yes, it might be lowercase or uppercase.
No, it shouldn't be much more complex than that.
I'm creating a verification system so that users must reply to an email and the subject line contains a verification sequence (a 10 digit portion of an md5-generated hash string). The email will be sent to them. All they need to do is reply. In most cases the subject line will still be "{seq}" or "RE: {seq}" but I don't want it to fail if the user does something unexpected. Additionally, if there is something wrong with the email sent to them or they want to reply from a different account, they might type it manually in a different email.
The purpose of this is to create an email based uploading system so that users on mobile devices can attach an image or audio clip to an email sent to my server [an automatically managed email address/account] that then will check incoming messages and "upload" attachments to the server. Of course I don't want just any files getting through since users must be registered, so I will give them a 'key' for each 'upload' and they just need to include that somewhere in the subject line of the email.
Very complex to figure out, but worth it because it's pretty easy for the user and in many cases mobile browsers don't have the ability to do a normal file upload. For example, the iPhone. But it can attach a file to an email and send it, so that's all that needs to be done here.
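The subject-line check described above could be sketched roughly like this. It is not the actual code from the project: the function name subject_has_code, the seed string, and the way the key is cut from the md5 hash are all assumptions for illustration.

```php
<?php
// Hypothetical helper: true if $subject contains the expected 10-character key.
function subject_has_code(string $subject, string $expected): bool
{
    // Collect every (non-overlapping) run of 10 hex characters, any case.
    preg_match_all('/[0-9a-f]{10}/i', $subject, $found);
    foreach ($found[0] as $candidate) {
        // Compare in lowercase, since a user may retype the key in capitals.
        if (hash_equals(strtolower($expected), strtolower($candidate))) {
            return true;
        }
    }
    return false;
}

// Illustrative key: the first 10 characters of an md5 hash.
$code = substr(md5('example-seed'), 0, 10);
var_dump(subject_has_code("RE: $code", $code));       // bool(true)
var_dump(subject_has_code('RE: hello there', $code)); // bool(false)
```

Scanning for any matching run, rather than expecting the subject to be exactly "{seq}" or "RE: {seq}", is what lets an unexpected subject line still pass.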
cool... I'd actually be interested in how you're checking the email and uploading the attachments from it.
djr33
10-06-2010, 04:53 AM
I'm using the IMAP functions to connect to an email account.
The system isn't polished yet, but it is functional. If you want to see some specific code, contact me directly.
It's not really something I can easily post here since it exists in several pieces (and I wouldn't want to until it's worked out a bit more).
The hardest part was simply figuring out how to deal with the mail server. The rest is just step by step in PHP-- not "easy" exactly, but a lot easier than figuring out the mail server and connecting to it.
Another problem is that it either needs to run as a cron job or be triggered by the user, and either way it's not fast: about a 3 second delay to check and parse the messages (connection time, not transfer time for files-- that's fast usually, not noticeable).
hmm.
I've had a passing curiosity about emailing stuff to my website (like Facebook mobile allows). I'll look into IMAP, and drop you a note if I want to know more.
thanks
djr33
10-07-2010, 03:21 AM
It's all very possible, but as I said, the biggest problem becomes the cron job part of it. It's easy to send an email because you have a triggering event (user doing something) but checking for emails ends up with the same problem that creating a chat script does: basically you need to constantly refresh. Cron jobs may be the answer, but they're not efficient for the server.
Let me know how it goes. I've been curious about this for a while too but never had a specific need to try it.
well, I may not provide an answer. I'd probably be happy with using my login as a trigger to check for emails. :D
(actually, that's a hacky solution, but it would probably work pretty well - every time you serve a page to a logged-in user, check that user's "inbox." I think you could use ob_start() + ob_flush() to effectively do it in the background -serve all the page content first, then deal with the email connection... serve a little bit of ajax to wait for the response, or save it to the session and display on the next page load.)
I may be completely wrong, however. ;)
djr33
10-07-2010, 03:27 PM
That method would work fine. Just check for email when you look for email.
However, if you want something else to automatically happen such as an "auto-reply" things get much more complicated. In my case, I am attempting to 'upload' files via email, so this should happen when the email is sent (and 'received') not the next time someone logs in.
yeah.
do you have any control over the mail server? I don't know anything about it, but it seems you might be able to have the mail server "ping" your upload scripts when there's a new message...? just a thought.
djr33
10-07-2010, 06:01 PM
Two answers to that: at the moment, no, since I don't have it configured that way.
In the future, it might be possible, but that would require a different language than PHP, so it wouldn't really be a "PHP" thing. Of course for the purpose that's fine. As a test for what PHP can do, it doesn't really accomplish much.
djr33
10-08-2010, 09:46 PM
I looked into the ideas above a bit and I came up with an interesting link:
http://harrybailey.com/2009/02/send-or-pipe-an-email-to-a-php-script/
That requires cpanel, though.
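For what it's worth, the script on the receiving end of a pipe like that mostly has to read the raw message and find the Subject header. A minimal sketch, assuming the mail server hands the whole message to the script on standard input; the function name extract_subject is made up, and MIME-encoded subjects are not decoded here.

```php
<?php
// Hypothetical helper: pull the Subject header out of a raw email message.
function extract_subject(string $rawEmail): ?string
{
    // The header block ends at the first blank line.
    $parts = preg_split('/\r?\n\r?\n/', $rawEmail, 2);
    // Unfold continuation lines (a line break followed by a space or tab).
    $headers = preg_replace('/\r?\n[ \t]+/', ' ', $parts[0]);
    if (preg_match('/^Subject:[ \t]*(.*)$/mi', $headers, $m)) {
        return trim($m[1]);
    }
    return null;
}

// When a message is piped to the script, it would arrive on STDIN:
// $subject = extract_subject(stream_get_contents(STDIN));
```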
This looks like a way around it:
http://forum.powweb.com/archive/index.php/t-63142.html
I think you still need to install a mail server on the system though.
Everything I saw on the powweb link used cron jobs.
Does your server not use cPanel? I wonder if other control panels wouldn't have a similar option... ::wanders off to check::
djr33
11-07-2010, 05:58 PM
Returning to this thread after a while, I'm now actually setting this up for my project.
I'm using a cron job and checking if the subject contains a string that looks right. If so, try to 'upload'-- see if there's an attachment, validate the code, etc.
BTW, Adrian, I missed your question earlier: no, the current host doesn't use cpanel, unfortunately.
I'm happy with this cron job though since it does other things as well.
I asked 1&1 about "piping" emails to a script, and the initial customer service guy didn't *quite* know what I was talking about... We'll see if I can get any further. :)
djr33
11-08-2010, 06:01 PM
I imagine it's not something that your average host will support (at least not actively-- there might be an option somewhere). Do let me know if you figure it out though.
With an external mail server and a cron job it's easy enough, if you can deal with the slight delay, slight inefficiency, and that the script is slow to run because it must connect to the external server-- the same sort of delays you get with FTP-- each action requires logging in again etc. (At least I imagine this is what is happening. I'm not sure how it functions at the deepest level.)
djr33
11-10-2010, 04:09 AM
After some thorough testing, I found a flaw in the regex. Something is wrong with preg_match_all().
If there are overlapping matches, such as '1234' where we want "3 numbers in a row" which should return 123 and 234, preg_match_all() only returns one.
I'm posting because I'm kinda curious about why that's not working.
I solved it using this, so there's no need to fix it, just wondering.
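$matches = array();
while (strlen($str) >= 10) {
    // Anchor with ^ so each starting position is tested exactly once
    if (preg_match('/^[a-fA-F0-9]{10}/', $str, $match)) {
        $matches[] = $match[0]; // store the matched text, not the whole array
    }
    $str = substr($str, 1); // drop the first character and try again
}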
After finding a match, preg_match_all (http://us2.php.net/manual/en/function.preg-match-all.php) continues searching from the end of that match, so it never returns "overlapping" matches.
It seems there ought to be such a function, but I can't seem to find one right now...
You could do it by putting preg_match() in a loop and incrementing the offset each time, for the length of the string. Not very efficient, I guess. You'd have to keep track of the match positions, as well, or you'd count some of the matches more than once.
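The offset idea above might look something like this. It is a sketch rather than a tested production routine; the function name find_overlapping is made up, and it leans on the PREG_OFFSET_CAPTURE flag to keep track of the match positions.

```php
<?php
// Hypothetical helper: every match of $pattern in $str, overlaps included.
function find_overlapping(string $pattern, string $str): array
{
    $matches = [];
    $offset = 0;
    // PREG_OFFSET_CAPTURE reports where each match starts in $str.
    while (preg_match($pattern, $str, $m, PREG_OFFSET_CAPTURE, $offset) === 1) {
        [$text, $position] = $m[0];
        $matches[] = $text;
        // Resume one character after the START of this match, not after its end,
        // so the same match is never found twice.
        $offset = $position + 1;
    }
    return $matches;
}

print_r(find_overlapping('/\d{3}/', '1234')); // 123 and 234
```

Jumping to the recorded position plus one, instead of blindly adding one to the offset each time, is what avoids counting a match more than once.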
djr33
11-10-2010, 06:10 AM
"You could do it by putting preg_match() in a loop and incrementing the offset each time, for the length of the string. Not very efficient, I guess. You'd have to keep track of the match positions, as well, or you'd count some of the matches more than once."
See above :)
It's not that inefficient. String operations are cheap and the regex is the expensive part either way, so the difference between a function that loops internally and looping it yourself is minimal.
It works. Now I understand "end" in the correct way... I had missed that completely before reading through the documentation trying to figure it out. Thanks.
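For anyone landing here later: preg_match_all() can be made to return overlapping matches by wrapping the pattern in a lookahead with a capture group inside it. This is a different technique from the loop in this thread, shown only as a sketch; the function name overlapping_hex_codes is made up.

```php
<?php
// Hypothetical helper: all 10-character hex windows in $str, overlaps included.
function overlapping_hex_codes(string $str): array
{
    // The lookahead (?=...) consumes no characters, so the engine moves on one
    // position at a time; the inner group captures what the lookahead saw.
    preg_match_all('/(?=([a-fA-F0-9]{10}))/', $str, $m);
    return $m[1];
}

print_r(overlapping_hex_codes('xade40275ad1')); // ade40275ad and de40275ad1
```

The full matches in $m[0] are all empty strings here, which is why the results are read from the capture group in $m[1].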
james438
11-10-2010, 11:31 AM
PCRE and Perl have built-in recursion support; see http://php.net/manual/en/regexp.reference.recursive.php. Information on recursion and examples of it have been rather difficult to track down, but I was able to get this one from the kind people at regexadvice.com
<?php
$text = 'text text text text text';
$regexp = '{((\[([^\]]+)\])((?:(?:(?!\[/?\3\]).)*+|(?1))*)(\[/\3\]))}si';
while(preg_match($regexp,$text,$match)){
$text = preg_replace($regexp,'<$3>$4</$3>',$text);
}
echo $text;
?>
The above is something that will convert certain bbcode to html code. In this case
text text text text text
is converted to
<one>text <two>text text text</two> text</one>
Notice that the mismatched bbcode was ignored.
It was also around this time that regex was making my head hurt so I stopped reading up on it. As I recall, this PCRE example still needs a touch of work, but for me should be a perfect example for experimenting with recursion later on when the desire/need arises to investigate this aspect of PCRE. I do not recall if/what/where this pattern failed, but yeah, it is beyond my understanding of PCRE.
On a side note, the script above looks like it should make sense, I have an understanding of all the terms, but with all the nested loops and different types going on in the PCRE given above it just makes my head hurt. It is like the feeling you get when you look at a card table covered with the 5000 pieces to a jigsaw puzzle! Sometimes it's best just to shove the pieces back into the box and put it away for a year or three.
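A much smaller recursion example may be easier to experiment with than the bbcode pattern. The sketch below matches balanced parentheses; it is a standard textbook use of (?R), not part of the regexadvice.com pattern, and the function name first_balanced_group is made up.

```php
<?php
// Hypothetical helper: the first balanced (...) group in $str, or null.
function first_balanced_group(string $str): ?string
{
    // \(            an opening parenthesis
    // (?: ... )*    then any number of:
    //   [^()]++       a run of non-parenthesis characters, or
    //   (?R)          the whole pattern again (a nested group)
    // \)            and finally the matching closing parenthesis
    if (preg_match('/\((?:[^()]++|(?R))*\)/', $str, $m)) {
        return $m[0];
    }
    return null;
}

echo first_balanced_group('a (b (c) d) e'); // (b (c) d)
```

The bbcode pattern above works on the same principle, with (?1) recursing into its first capture group instead of the whole pattern.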
See above :)
$#!+ ...
I was tired last night.