LinuxGuruz
  • Last 5 Forum Topics
    Replies
    Views
    Last post


The Web Only This Site
  • BOOKMARK

  • ADD TO FAVORITES

  • REFERENCES


  • MARC

    Mailing list ARChives
    - Search by -
     Subjects
     Authors
     Bodies





    FOLDOC

    Computing Dictionary




  • Text Link Ads






  • LINUX man pages
  • Linux Man Page Viewer


    The following form allows you to view linux man pages.

    Command:

    pcrepartial

    
    
    

    PARTIAL MATCHING IN PCRE

    
           In  normal  use  of  PCRE,  if  the  subject  string  that is passed to
           pcre_exec() or pcre_dfa_exec() matches as far as it goes,  but  is  too
           short  to  match  the  entire  pattern, PCRE_ERROR_NOMATCH is returned.
           There are circumstances where it might be helpful to  distinguish  this
           case from other cases in which there is no match.
    
           Consider, for example, an application where a human is required to type
           in data for a field with specific formatting requirements.  An  example
           might be a date in the form ddmmmyy, defined by this pattern:
    
             ^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$
    
           If the application sees the user's keystrokes one by one, and can check
           that what has been typed so far is potentially valid,  it  is  able  to
           raise  an  error as soon as a mistake is made, possibly beeping and not
           reflecting the character that has been typed. This  immediate  feedback
           is  likely  to  be a better user interface than a check that is delayed
           until the entire string has been entered.
    
           PCRE supports the concept of partial matching by means of the PCRE_PAR-
           TIAL   option,   which   can   be   set  when  calling  pcre_exec()  or
           pcre_dfa_exec(). When this flag is set for pcre_exec(), the return code
           PCRE_ERROR_NOMATCH  is converted into PCRE_ERROR_PARTIAL if at any time
           during the matching process the last part of the subject string matched
           part  of  the  pattern. Unfortunately, for non-anchored matching, it is
           not possible to obtain the position of the start of the partial  match.
           No captured data is set when PCRE_ERROR_PARTIAL is returned.
    
           When   PCRE_PARTIAL   is  set  for  pcre_dfa_exec(),  the  return  code
           PCRE_ERROR_NOMATCH is converted into PCRE_ERROR_PARTIAL if the  end  of
           the  subject is reached, there have been no complete matches, but there
           is still at least one matching possibility. The portion of  the  string
           that provided the partial match is set as the first matching string.
    
           Using PCRE_PARTIAL disables one of PCRE's optimizations. PCRE remembers
           the last literal byte in a pattern, and abandons  matching  immediately
           if  such a byte is not present in the subject string. This optimization
           cannot be used for a subject string that might match only partially.
    
    
    

    RESTRICTED PATTERNS FOR PCRE_PARTIAL

    
           Because of the way certain internal optimizations  are  implemented  in
           the  pcre_exec()  function, the PCRE_PARTIAL option cannot be used with
           all patterns. These restrictions do not apply when  pcre_dfa_exec()  is
           used.  For pcre_exec(), repeated single characters such as
    
             a{2,4}
    
           and repeated single metasequences such as
    
           If PCRE_PARTIAL is set for a pattern  that  does  not  conform  to  the
           restrictions,  pcre_exec() returns the error code PCRE_ERROR_BADPARTIAL
           (-13).  You can use the PCRE_INFO_OKPARTIAL call to pcre_fullinfo()  to
           find out if a compiled pattern can be used for partial matching.
    
    
    

    EXAMPLE OF PARTIAL MATCHING USING PCRETEST

    
           If  the  escape  sequence  \P  is  present in a pcretest data line, the
           PCRE_PARTIAL flag is used for the match. Here is a run of pcretest that
           uses the date example quoted above:
    
               re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
             data> 25jun04\P
              0: 25jun04
              1: jun
             data> 25dec3\P
             Partial match
             data> 3ju\P
             Partial match
             data> 3juj\P
             No match
             data> j\P
             No match
    
           The  first  data  string  is  matched completely, so pcretest shows the
           matched substrings. The remaining four strings do not  match  the  com-
           plete  pattern,  but  the first two are partial matches. The same test,
           using pcre_dfa_exec() matching (by means of the  \D  escape  sequence),
           produces the following output:
    
               re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
             data> 25jun04\P\D
              0: 25jun04
             data> 23dec3\P\D
             Partial match: 23dec3
             data> 3ju\P\D
             Partial match: 3ju
             data> 3juj\P\D
             No match
             data> j\P\D
             No match
    
           Notice  that in this case the portion of the string that was matched is
           made available.
    
    
    

    MULTI-SEGMENT MATCHING WITH pcre_dfa_exec()

    
           When a partial match has been found using pcre_dfa_exec(), it is possi-
           ble  to  continue  the  match  by providing additional subject data and
           calling pcre_dfa_exec() again with the same  compiled  regular  expres-
           sion, this time setting the PCRE_DFA_RESTART option. You must also pass
           the same working space as before, because this is where details of  the
           matched  string. It is up to the calling program to do that if it needs
           to.
    
           You can set PCRE_PARTIAL  with  PCRE_DFA_RESTART  to  continue  partial
           matching over multiple segments. This facility can be used to pass very
           long subject strings to pcre_dfa_exec(). However, some care  is  needed
           for certain types of pattern.
    
           1.  If  the  pattern contains tests for the beginning or end of a line,
           you need to pass the PCRE_NOTBOL or PCRE_NOTEOL options,  as  appropri-
           ate,  when  the subject string for any call does not contain the begin-
           ning or end of a line.
    
           2. If the pattern contains backward assertions (including  \b  or  \B),
           you  need  to  arrange for some overlap in the subject strings to allow
           for this. For example, you could pass the subject in  chunks  that  are
           500  bytes long, but in a buffer of 700 bytes, with the starting offset
           set to 200 and the previous 200 bytes at the start of the buffer.
    
           3. Matching a subject string that is split into multiple segments  does
           not  always produce exactly the same result as matching over one single
           long string.  The difference arises when there  are  multiple  matching
           possibilities,  because a partial match result is given only when there
           are no completed matches in a call to pcre_dfa_exec(). This means  that
           as  soon  as  the  shortest match has been found, continuation to a new
           subject segment is no longer possible.  Consider this pcretest example:
    
               re> /dog(sbody)?/
             data> do\P\D
             Partial match: do
             data> gsb\R\P\D
              0: g
             data> dogsbody\D
              0: dogsbody
              1: dog
    
           The  pattern matches the words "dog" or "dogsbody". When the subject is
           presented in several parts ("do" and "gsb" being  the  first  two)  the
           match  stops  when "dog" has been found, and it is not possible to con-
           tinue. On the other hand,  if  "dogsbody"  is  presented  as  a  single
           string, both matches are found.
    
           Because  of  this  phenomenon,  it does not usually make sense to end a
           pattern that is going to be matched in this way with a variable repeat.
    
           4. Patterns that contain alternatives at the top level which do not all
           start with the same pattern item may not work as expected. For example,
           consider this pattern:
    
             1234|3789
    
           If  the  first  part of the subject is "ABC123", a partial match of the
           Philip Hazel
           University Computing Service
           Cambridge CB2 3QH, England.
    
    
    

    REVISION

    
           Last updated: 04 June 2007
           Copyright (c) 1997-2007 University of Cambridge.
    
                                                                    PCREPARTIAL(3)
    
  • MORE RESOURCE


  • Linux

    The Distributions





    Linux

    The Software





    Linux

    The News



  • MARKETING






  • Toll Free

webmaster@linuxguruz.com
Copyright © 1999 - 2016 by LinuxGuruz