Thread: line terminators. Yes, PCRE again.

  1. #1
    Join Date
    Jan 2007
    Location
    Davenport, Iowa
    Posts
    2,385
    Thanks
    100
    Thanked 113 Times in 111 Posts

    line terminators. Yes, PCRE again.

    I seem to post a lot of PCRE questions lately. Well, this time I mostly want to know a few of the very basics.

    1. Do you call PHP's PCRE functions PCRE, regular expressions, or regexp? I know there are differences between PCRE and Perl, so I have been calling it PCRE. What term should I be using?

    2. Is \r\n known as a line terminator? What are they used for? I only seem to see them when editing a script in Notepad. I figured they would be used for more than that. In fact, I used PCRE to write a program that displays code the way forum software does, including the typed-in line terminators, or at least similar to how forums do it. By the way, among the many built-in functions available in PHP, I have not found one that will display a \r\n located in a string.

    3. Is it possible to use PCRE to match everything that does not match a set like anything that is not aeiou or sometimes y or w? I have seen it for things like... actually
    PHP Code:
    $string = preg_replace('/[\d\s\Waeiouyw]/', '', $string);
    will remove anything that is not a letter (apart from the underscore, which \W treats as a word character) and will also remove the lowercase letters aeiouyw, where:

    \s = whitespace like a space or tab or line terminator.
    \d = any decimal digit.
    \W = any non word character.

    Sorry, that last one was supposed to be my main question. I know that isn't the best way to answer #3, but it is a start.
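    One way to approach question 3 is a negated character class: a ^ placed immediately after the opening [ inverts the set. The following is only a minimal sketch, assuming ASCII input; the function names keep_only_vowels and keep_only_consonants are made up for illustration and are not part of PHP.

```php
<?php
// Hypothetical helpers, for illustration only.

// Remove every character that is NOT one of a, e, i, o, u, y, w.
function keep_only_vowels(string $s): string
{
    // [^...] is a negated class; the i flag makes the match case-insensitive.
    return preg_replace('/[^aeiouyw]/i', '', $s);
}

// Keep only ASCII letters that are not in the vowel set.
function keep_only_consonants(string $s): string
{
    // Either alternative matches one character to delete:
    // anything that is not a letter, or one of the listed vowels.
    return preg_replace('/[^a-z]|[aeiouyw]/i', '', $s);
}

echo keep_only_vowels('Hello, World 42!'), "\n";     // eoWo
echo keep_only_consonants('Hello, World 42!'), "\n"; // Hllrld
```

    Unlike the [\d\s\W...] pattern above, [^a-z] also removes the underscore, and the i flag covers uppercase vowels.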

    4. is there a way to replace a space with a space? For example
    PHP Code:
    $string = preg_replace('/[\040]/', '/\040/', $string);
    preg_replace('/((\040){2,2})/', "  ", $string); works, but
    preg_replace('/((\040){2,2})/', " ", $string); does not.

    fear not, I'll probably up and buy a book on PCRE patterns soon the way I am reading up on them...
    Last edited by james438; 08-04-2007 at 07:32 AM.

  2. #2
    Join Date
    Jun 2005
    Location
    英国
    Posts
    11,876
    Thanks
    1
    Thanked 180 Times in 172 Posts
    Blog Entries
    2

    1. Do you call PHP's PCRE functions PCRE, regular expressions, or regexp? I know there are differences between PCRE and Perl, so I have been calling it PCRE. What term should I be using?
    Because PHP has non-Perl-compatible regular expressions as well (ereg() and friends), most PHP developers refer to them as PCRE to avoid confusion.
    2. Is \r\n known as a line terminator?
    It's a Windows line terminator, consisting of a carriage return and a line feed, thus sometimes abbreviated CRLF. UNIX derivatives use a single line feed character.
    What are they used for?
    Er... terminating lines?
    I seem to only get them to display when in notepad mode for editing a script.
    HTML "ignores" line-breaks.
    I figured it would be used for more than that. In fact I used PCRE to create a program that would display code just like forum programs do to display code including the typed in line terminators as well. Or at least similar to how forums do it.
    Forums tend to just str_replace("\n", "<br>\n", str_replace("\r", "", $post)). This isn't always semantically correct, which is why I prefer Markdown or similar.
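    The nested str_replace() calls above can also be written with PHP's built-in nl2br(). This is a rough sketch, not what any particular forum actually runs; post_to_html is a hypothetical name, and escaping with htmlspecialchars() first is an added assumption about how the text will be output.

```php
<?php
// Hypothetical helper: render a plain-text post as HTML with visible line breaks.
function post_to_html(string $post): string
{
    // Collapse Windows (\r\n) and lone \r line endings to a single \n.
    $normalized = preg_replace('/\r\n?/', "\n", $post);
    // Escape HTML metacharacters before adding any markup of our own.
    $escaped = htmlspecialchars($normalized, ENT_QUOTES, 'UTF-8');
    // nl2br() inserts a <br /> tag before each newline and keeps the newline.
    return nl2br($escaped);
}

echo post_to_html("line one\r\nline <two>\nline three");
```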
    By the way, I have not found any built in function among the very many available with PHP that will display \r\n located in a string.
    "Display" how? In HTML, the code above will work (although it's not usually best). If you mean actually display them as \r and \n, try str_replace("\r", '\r', str_replace("\n", '\n', $string)).
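    The second suggestion can be sketched with strtr(), which performs both replacements in one pass. The function name show_line_terminators is hypothetical. The quoting is what matters here: "\r" in double quotes is the carriage return byte, while '\r' in single quotes is a literal backslash followed by the letter r.

```php
<?php
// Hypothetical helper: make CR and LF visible as the two-character sequences \r and \n.
function show_line_terminators(string $s): string
{
    // Keys are the real control characters; values are literal backslash + letter.
    return strtr($s, ["\r" => '\r', "\n" => '\n']);
}

echo show_line_terminators("first\r\nsecond\n"), "\n";
// prints: first\r\nsecond\n
```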
    Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!

  3. #3

    heh, thanks for the quick answers! I wasn't sure they would be interesting enough to answer :P

    for those that are wondering

    CRLF = carriage return line feed

    for those still wondering:

    carriage returns are those things that I used to play with on my grandfather's manual typewriter in order to go down to the next line and all the way to the left.

    I was wondering about the built-in functions because for a long time it was difficult for me to discover that \r\n even existed, until I came across pattern matching. I was using things like htmlentities and htmlspecialchars, among others, to try to display whatever was being used to represent my CRLFs, because while they were not being recognized, I noticed they were not being deleted either. No matter, I know what is being used now. I learned it mostly by accident.

    anyway, I did some more testing and I think I found out the answer to my last question.
    PHP Code:
    $r="\40";
    $string = preg_replace('/[bd]/', $r, $string);
    will display spaces, or octal codes, just fine, but consecutive spaces still seem to be condensed to just one. In fact, in the code I tried to post above, where there were two spaces next to each other, the forum condensed the two down to one. (There is a typo in the code above that I will try to fix right now.)

    Thanks again!
    Last edited by james438; 08-04-2007 at 07:54 AM. Reason: grammar

  4. #4

    carriage returns are those things that I used to play with on my grandfather's manual typewriter in order to go down to the next line and all the way to the left.
    Heh. Not quite -- the carriage return sends the carriage to the left, then the line feed sends it down one line. You're right, this is a leftover from the days of typewriters. On computers, of course, the two operations are rarely seen independently of one another, so it makes sense to save that space (one byte per line) and eliminate the \r.
    spaces still seem to be condensed to just one.
    Again, this is an HTML whitespace parsing rule.
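    Because browsers collapse runs of spaces when rendering normal HTML text, one common workaround is to turn some of the spaces into non-breaking spaces before output (wrapping the text in a pre element is another). The sketch below is only one way to do it; preserve_spaces is a hypothetical name, and the escaping step assumes the text is untrusted.

```php
<?php
// Hypothetical helper: keep runs of spaces visible when the text is rendered as HTML.
function preserve_spaces(string $s): string
{
    $escaped = htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
    // Replace each space that is immediately followed by another space with &nbsp;.
    // A run of n spaces becomes n-1 non-breaking spaces plus one ordinary space,
    // so the line can still wrap at the end of the run.
    return preg_replace('/\040(?=\040)/', '&nbsp;', $escaped);
}

echo preserve_spaces('a  b   c'), "\n";
// prints: a&nbsp; b&nbsp;&nbsp; c
```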