JavaScriptCore/pcre/pcre_compile.cpp
author Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
Fri, 17 Sep 2010 09:02:29 +0300
changeset 0 4f2f89ce4247
permissions -rw-r--r--
Revision: 201037
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
0
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     1
/* This is JavaScriptCore's variant of the PCRE library. While this library
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     2
started out as a copy of PCRE, many of the features of PCRE have been
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     3
removed. This library now supports only the regular expression features
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     4
required by the JavaScript language specification, and has only the functions
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     5
needed by JavaScriptCore and the rest of WebKit.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     6
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     7
                 Originally written by Philip Hazel
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     8
           Copyright (c) 1997-2006 University of Cambridge
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
     9
    Copyright (C) 2002, 2004, 2006, 2007, 2008, 2009 Apple Inc. All rights reserved.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    10
    Copyright (C) 2007 Eric Seidel <eric@webkit.org>
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    11
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    12
-----------------------------------------------------------------------------
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    13
Redistribution and use in source and binary forms, with or without
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    14
modification, are permitted provided that the following conditions are met:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    15
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    16
    * Redistributions of source code must retain the above copyright notice,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    17
      this list of conditions and the following disclaimer.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    18
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    19
    * Redistributions in binary form must reproduce the above copyright
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    20
      notice, this list of conditions and the following disclaimer in the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    21
      documentation and/or other materials provided with the distribution.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    22
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    23
    * Neither the name of the University of Cambridge nor the names of its
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    24
      contributors may be used to endorse or promote products derived from
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    25
      this software without specific prior written permission.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    26
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    27
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    28
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    29
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    30
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    31
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    32
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    33
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    34
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    35
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    36
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    37
POSSIBILITY OF SUCH DAMAGE.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    38
-----------------------------------------------------------------------------
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    39
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    40
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    41
/* This module contains the external function jsRegExpExecute(), along with
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    42
supporting internal functions that are not used by other modules. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    43
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    44
#include "config.h"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    45
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    46
#include "pcre_internal.h"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    47
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    48
#include <string.h>
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    49
#include <wtf/ASCIICType.h>
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    50
#include <wtf/FastMalloc.h>
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    51
#include <wtf/FixedArray.h>
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    52
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    53
using namespace WTF;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    54
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    55
/* Negative values for the firstchar and reqchar variables */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    56
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    57
#define REQ_UNSET (-2)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    58
#define REQ_NONE  (-1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    59
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    60
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    61
*      Code parameters and static tables         *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    62
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    63
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    64
/* Maximum number of items on the nested bracket stacks at compile time. This
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    65
applies to the nesting of all kinds of parentheses. It does not limit
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    66
un-nested, non-capturing parentheses. This number can be made bigger if
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    67
necessary - it is used to dimension one int and one unsigned char vector at
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    68
compile time. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    69
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    70
#define BRASTACK_SIZE 200
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    71
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    72
/* Table for handling escaped characters in the range '0'-'z'. Positive returns
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    73
are simple data values; negative values are for special things like \d and so
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    74
on. Zero means further processing is needed (for things like \x), or the escape
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    75
is invalid. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    76
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    77
static const short escapes[] = {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    78
     0,      0,      0,      0,      0,      0,      0,      0,   /* 0 - 7 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    79
     0,      0,    ':',    ';',    '<',    '=',    '>',    '?',   /* 8 - ? */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    80
   '@',      0, -ESC_B,      0, -ESC_D,      0,      0,      0,   /* @ - G */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    81
     0,      0,      0,      0,      0,      0,      0,      0,   /* H - O */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    82
     0,      0,      0, -ESC_S,      0,      0,      0, -ESC_W,   /* P - W */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    83
     0,      0,      0,    '[',   '\\',    ']',    '^',    '_',   /* X - _ */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    84
   '`',      7, -ESC_b,      0, -ESC_d,      0,   '\f',      0,   /* ` - g */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    85
     0,      0,      0,      0,      0,      0,   '\n',      0,   /* h - o */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    86
     0,      0,    '\r', -ESC_s,   '\t',      0,  '\v', -ESC_w,   /* p - w */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    87
     0,      0,      0                                            /* x - z */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    88
};
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    89
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    90
/* Error code numbers. They are given names so that they can more easily be
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    91
tracked. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    92
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    93
enum ErrorCode {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    94
    ERR0, ERR1, ERR2, ERR3, ERR4, ERR5, ERR6, ERR7, ERR8, ERR9,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    95
    ERR10, ERR11, ERR12, ERR13, ERR14, ERR15, ERR16, ERR17
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    96
};
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    97
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    98
/* The texts of compile-time error messages. These are "char *" because they
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
    99
are passed to the outside world. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   100
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   101
static const char* errorText(ErrorCode code)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   102
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   103
    static const char errorTexts[] =
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   104
      /* 1 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   105
      "\\ at end of pattern\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   106
      "\\c at end of pattern\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   107
      "character value in \\x{...} sequence is too large\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   108
      "numbers out of order in {} quantifier\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   109
      /* 5 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   110
      "number too big in {} quantifier\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   111
      "missing terminating ] for character class\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   112
      "internal error: code overflow\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   113
      "range out of order in character class\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   114
      "nothing to repeat\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   115
      /* 10 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   116
      "unmatched parentheses\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   117
      "internal error: unexpected repeat\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   118
      "unrecognized character after (?\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   119
      "failed to get memory\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   120
      "missing )\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   121
      /* 15 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   122
      "reference to non-existent subpattern\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   123
      "regular expression too large\0"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   124
      "parentheses nested too deeply"
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   125
    ;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   126
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   127
    int i = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   128
    const char* text = errorTexts;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   129
    while (i > 1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   130
        i -= !*text++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   131
    return text;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   132
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   133
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   134
/* Structure for passing "static" information around between the functions
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   135
doing the compiling. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   136
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   137
struct CompileData {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   138
    CompileData() {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   139
        topBackref = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   140
        backrefMap = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   141
        reqVaryOpt = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   142
        needOuterBracket = false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   143
        numCapturingBrackets = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   144
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   145
    int topBackref;            /* Maximum back reference */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   146
    unsigned backrefMap;       /* Bitmap of low back refs */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   147
    int reqVaryOpt;            /* "After variable item" flag for reqByte */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   148
    bool needOuterBracket;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   149
    int numCapturingBrackets;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   150
};
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   151
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   152
/* Definitions to allow mutual recursion */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   153
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   154
static bool compileBracket(int, int*, unsigned char**, const UChar**, const UChar*, ErrorCode*, int, int*, int*, CompileData&);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   155
static bool bracketIsAnchored(const unsigned char* code);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   156
static bool bracketNeedsLineStart(const unsigned char* code, unsigned captureMap, unsigned backrefMap);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   157
static int bracketFindFirstAssertedCharacter(const unsigned char* code, bool inassert);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   158
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   159
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   160
*            Handle escapes                      *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   161
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   162
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   163
/* This function is called when a \ has been encountered. It either returns a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   164
positive value for a simple escape such as \n, or a negative value which
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   165
encodes one of the more complicated things such as \d. When UTF-8 is enabled,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   166
a positive value greater than 255 may be returned. On entry, ptr is pointing at
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   167
the \. On exit, it is on the final character of the escape sequence.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   168
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   169
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   170
  ptrPtr         points to the pattern position pointer
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   171
  errorCodePtr   points to the errorcode variable
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   172
  bracount       number of previous extracting brackets
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   173
  options        the options bits
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   174
  isClass        true if inside a character class
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   175
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   176
Returns:         zero or positive => a data character
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   177
                 negative => a special escape sequence
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   178
                 on error, errorPtr is set
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   179
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   180
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   181
static int checkEscape(const UChar** ptrPtr, const UChar* patternEnd, ErrorCode* errorCodePtr, int bracount, bool isClass)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   182
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   183
    const UChar* ptr = *ptrPtr + 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   184
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   185
    /* If backslash is at the end of the pattern, it's an error. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   186
    if (ptr == patternEnd) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   187
        *errorCodePtr = ERR1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   188
        *ptrPtr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   189
        return 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   190
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   191
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   192
    int c = *ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   193
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   194
    /* Non-alphamerics are literals. For digits or letters, do an initial lookup in
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   195
     a table. A non-zero result is something that can be returned immediately.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   196
     Otherwise further processing may be required. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   197
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   198
    if (c < '0' || c > 'z') { /* Not alphameric */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   199
    } else if (int escapeValue = escapes[c - '0']) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   200
        c = escapeValue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   201
        if (isClass) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   202
            if (-c == ESC_b)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   203
                c = '\b'; /* \b is backslash in a class */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   204
            else if (-c == ESC_B)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   205
                c = 'B'; /* and \B is a capital B in a class (in browsers event though ECMAScript 15.10.2.19 says it raises an error) */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   206
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   207
    /* Escapes that need further processing, or are illegal. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   208
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   209
    } else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   210
        switch (c) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   211
            case '1':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   212
            case '2':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   213
            case '3':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   214
            case '4':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   215
            case '5':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   216
            case '6':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   217
            case '7':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   218
            case '8':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   219
            case '9':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   220
                /* Escape sequences starting with a non-zero digit are backreferences,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   221
                 unless there are insufficient brackets, in which case they are octal
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   222
                 escape sequences. Those sequences end on the first non-octal character
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   223
                 or when we overflow 0-255, whichever comes first. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   224
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   225
                if (!isClass) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   226
                    const UChar* oldptr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   227
                    c -= '0';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   228
                    while ((ptr + 1 < patternEnd) && isASCIIDigit(ptr[1]) && c <= bracount)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   229
                        c = c * 10 + *(++ptr) - '0';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   230
                    if (c <= bracount) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   231
                        c = -(ESC_REF + c);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   232
                        break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   233
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   234
                    ptr = oldptr;      /* Put the pointer back and fall through */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   235
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   236
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   237
                /* Handle an octal number following \. If the first digit is 8 or 9,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   238
                 this is not octal. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   239
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   240
                if ((c = *ptr) >= '8') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   241
                    c = '\\';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   242
                    ptr -= 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   243
                    break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   244
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   245
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   246
            /* \0 always starts an octal number, but we may drop through to here with a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   247
             larger first octal digit. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   248
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   249
            case '0': {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   250
                c -= '0';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   251
                int i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   252
                for (i = 1; i <= 2; ++i) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   253
                    if (ptr + i >= patternEnd || ptr[i] < '0' || ptr[i] > '7')
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   254
                        break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   255
                    int cc = c * 8 + ptr[i] - '0';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   256
                    if (cc > 255)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   257
                        break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   258
                    c = cc;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   259
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   260
                ptr += i - 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   261
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   262
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   263
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   264
            case 'x': {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   265
                c = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   266
                int i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   267
                for (i = 1; i <= 2; ++i) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   268
                    if (ptr + i >= patternEnd || !isASCIIHexDigit(ptr[i])) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   269
                        c = 'x';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   270
                        i = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   271
                        break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   272
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   273
                    int cc = ptr[i];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   274
                    if (cc >= 'a')
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   275
                        cc -= 32;             /* Convert to upper case */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   276
                    c = c * 16 + cc - ((cc < 'A') ? '0' : ('A' - 10));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   277
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   278
                ptr += i - 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   279
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   280
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   281
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   282
            case 'u': {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   283
                c = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   284
                int i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   285
                for (i = 1; i <= 4; ++i) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   286
                    if (ptr + i >= patternEnd || !isASCIIHexDigit(ptr[i])) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   287
                        c = 'u';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   288
                        i = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   289
                        break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   290
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   291
                    int cc = ptr[i];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   292
                    if (cc >= 'a')
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   293
                        cc -= 32;             /* Convert to upper case */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   294
                    c = c * 16 + cc - ((cc < 'A') ? '0' : ('A' - 10));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   295
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   296
                ptr += i - 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   297
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   298
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   299
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   300
            case 'c':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   301
                if (++ptr == patternEnd) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   302
                    *errorCodePtr = ERR2;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   303
                    return 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   304
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   305
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   306
                c = *ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   307
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   308
                /* To match Firefox, inside a character class, we also accept
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   309
                   numbers and '_' as control characters */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   310
                if ((!isClass && !isASCIIAlpha(c)) || (!isASCIIAlphanumeric(c) && c != '_')) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   311
                    c = '\\';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   312
                    ptr -= 2;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   313
                    break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   314
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   315
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   316
                /* A letter is upper-cased; then the 0x40 bit is flipped. This coding
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   317
                 is ASCII-specific, but then the whole concept of \cx is ASCII-specific. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   318
                c = toASCIIUpper(c) ^ 0x40;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   319
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   320
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   321
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   322
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   323
    *ptrPtr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   324
    return c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   325
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   326
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   327
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   328
*            Check for counted repeat            *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   329
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   330
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   331
/* This function is called when a '{' is encountered in a place where it might
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   332
start a quantifier. It looks ahead to see if it really is a quantifier or not.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   333
It is only a quantifier if it is one of the forms {ddd} {ddd,} or {ddd,ddd}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   334
where the ddds are digits.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   335
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   336
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   337
  p         pointer to the first char after '{'
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   338
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   339
Returns:    true or false
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   340
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   341
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   342
static bool isCountedRepeat(const UChar* p, const UChar* patternEnd)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   343
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   344
    if (p >= patternEnd || !isASCIIDigit(*p))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   345
        return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   346
    p++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   347
    while (p < patternEnd && isASCIIDigit(*p))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   348
        p++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   349
    if (p < patternEnd && *p == '}')
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   350
        return true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   351
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   352
    if (p >= patternEnd || *p++ != ',')
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   353
        return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   354
    if (p < patternEnd && *p == '}')
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   355
        return true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   356
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   357
    if (p >= patternEnd || !isASCIIDigit(*p))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   358
        return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   359
    p++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   360
    while (p < patternEnd && isASCIIDigit(*p))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   361
        p++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   362
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   363
    return (p < patternEnd && *p == '}');
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   364
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   365
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   366
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   367
*         Read repeat counts                     *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   368
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   369
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   370
/* Read an item of the form {n,m} and return the values. This is called only
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   371
after isCountedRepeat() has confirmed that a repeat-count quantifier exists,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   372
so the syntax is guaranteed to be correct, but we need to check the values.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   373
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   374
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   375
  p              pointer to first char after '{'
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   376
  minp           pointer to int for min
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   377
  maxp           pointer to int for max
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   378
                 returned as -1 if no max
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   379
  errorCodePtr   points to error code variable
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   380
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   381
Returns:         pointer to '}' on success;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   382
                 current ptr on error, with errorCodePtr set non-zero
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   383
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   384
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   385
static const UChar* readRepeatCounts(const UChar* p, int* minp, int* maxp, ErrorCode* errorCodePtr)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   386
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   387
    int min = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   388
    int max = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   389
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   390
    /* Read the minimum value and do a paranoid check: a negative value indicates
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   391
     an integer overflow. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   392
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   393
    while (isASCIIDigit(*p))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   394
        min = min * 10 + *p++ - '0';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   395
    if (min < 0 || min > 65535) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   396
        *errorCodePtr = ERR5;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   397
        return p;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   398
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   399
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   400
    /* Read the maximum value if there is one, and again do a paranoid on its size.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   401
     Also, max must not be less than min. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   402
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   403
    if (*p == '}')
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   404
        max = min;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   405
    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   406
        if (*(++p) != '}') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   407
            max = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   408
            while (isASCIIDigit(*p))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   409
                max = max * 10 + *p++ - '0';
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   410
            if (max < 0 || max > 65535) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   411
                *errorCodePtr = ERR5;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   412
                return p;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   413
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   414
            if (max < min) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   415
                *errorCodePtr = ERR4;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   416
                return p;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   417
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   418
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   419
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   420
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   421
    /* Fill in the required variables, and pass back the pointer to the terminating
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   422
     '}'. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   423
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   424
    *minp = min;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   425
    *maxp = max;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   426
    return p;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   427
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   428
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   429
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   430
*      Find first significant op code            *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   431
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   432
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   433
/* This is called by several functions that scan a compiled expression looking
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   434
for a fixed first character, or an anchoring op code etc. It skips over things
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   435
that do not influence this.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   436
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   437
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   438
  code         pointer to the start of the group
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   439
Returns:       pointer to the first significant opcode
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   440
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   441
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   442
static const unsigned char* firstSignificantOpcode(const unsigned char* code)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   443
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   444
    while (*code == OP_BRANUMBER)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   445
        code += 3;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   446
    return code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   447
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   448
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   449
static const unsigned char* firstSignificantOpcodeSkippingAssertions(const unsigned char* code)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   450
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   451
    while (true) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   452
        switch (*code) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   453
            case OP_ASSERT_NOT:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   454
                advanceToEndOfBracket(code);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   455
                code += 1 + LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   456
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   457
            case OP_WORD_BOUNDARY:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   458
            case OP_NOT_WORD_BOUNDARY:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   459
                ++code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   460
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   461
            case OP_BRANUMBER:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   462
                code += 3;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   463
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   464
            default:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   465
                return code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   466
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   467
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   468
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   469
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   470
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   471
*           Get othercase range                  *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   472
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   473
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   474
/* This function is passed the start and end of a class range, in UTF-8 mode
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   475
with UCP support. It searches up the characters, looking for internal ranges of
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   476
characters in the "other" case. Each call returns the next one, updating the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   477
start address.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   478
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   479
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   480
  cptr        points to starting character value; updated
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   481
  d           end value
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   482
  ocptr       where to put start of othercase range
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   483
  odptr       where to put end of othercase range
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   484
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   485
Yield:        true when range returned; false when no more
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   486
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   487
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   488
static bool getOthercaseRange(int* cptr, int d, int* ocptr, int* odptr)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   489
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   490
    int c, othercase = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   491
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   492
    for (c = *cptr; c <= d; c++) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   493
        if ((othercase = jsc_pcre_ucp_othercase(c)) >= 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   494
            break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   495
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   496
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   497
    if (c > d)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   498
        return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   499
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   500
    *ocptr = othercase;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   501
    int next = othercase + 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   502
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   503
    for (++c; c <= d; c++) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   504
        if (jsc_pcre_ucp_othercase(c) != next)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   505
            break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   506
        next++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   507
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   508
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   509
    *odptr = next - 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   510
    *cptr = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   511
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   512
    return true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   513
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   514
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   515
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   516
 *       Convert character value to UTF-8         *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   517
 *************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   518
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   519
/* This function takes an integer value in the range 0 - 0x7fffffff
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   520
 and encodes it as a UTF-8 character in 0 to 6 bytes.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   521
 
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   522
 Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   523
 cvalue     the character value
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   524
 buffer     pointer to buffer for result - at least 6 bytes long
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   525
 
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   526
 Returns:     number of characters placed in the buffer
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   527
 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   528
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   529
static int encodeUTF8(int cvalue, unsigned char *buffer)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   530
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   531
    int i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   532
    for (i = 0; i < jsc_pcre_utf8_table1_size; i++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   533
        if (cvalue <= jsc_pcre_utf8_table1[i])
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   534
            break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   535
    buffer += i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   536
    for (int j = i; j > 0; j--) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   537
        *buffer-- = 0x80 | (cvalue & 0x3f);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   538
        cvalue >>= 6;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   539
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   540
    *buffer = jsc_pcre_utf8_table2[i] | cvalue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   541
    return i + 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   542
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   543
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   544
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   545
*           Compile one branch                   *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   546
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   547
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   548
/* Scan the pattern, compiling it into the code vector.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   549
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   550
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   551
  options        the option bits
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   552
  brackets       points to number of extracting brackets used
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   553
  codePtr        points to the pointer to the current code point
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   554
  ptrPtr         points to the current pattern pointer
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   555
  errorCodePtr   points to error code variable
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   556
  firstbyteptr   set to initial literal character, or < 0 (REQ_UNSET, REQ_NONE)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   557
  reqbyteptr     set to the last literal character required, else < 0
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   558
  cd             contains pointers to tables etc.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   559
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   560
Returns:         true on success
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   561
                 false, with *errorCodePtr set non-zero on error
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   562
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   563
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   564
static inline bool safelyCheckNextChar(const UChar* ptr, const UChar* patternEnd, UChar expected)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   565
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   566
    return ((ptr + 1 < patternEnd) && ptr[1] == expected);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   567
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   568
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   569
static bool
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   570
compileBranch(int options, int* brackets, unsigned char** codePtr,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   571
               const UChar** ptrPtr, const UChar* patternEnd, ErrorCode* errorCodePtr, int *firstbyteptr,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   572
               int* reqbyteptr, CompileData& cd)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   573
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   574
    int repeatType, opType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   575
    int repeatMin = 0, repeat_max = 0;      /* To please picky compilers */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   576
    int bravalue = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   577
    int reqvary, tempreqvary;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   578
    int c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   579
    unsigned char* code = *codePtr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   580
    unsigned char* tempcode;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   581
    bool didGroupSetFirstByte = false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   582
    const UChar* ptr = *ptrPtr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   583
    const UChar* tempptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   584
    unsigned char* previous = NULL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   585
    unsigned char classbits[32];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   586
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   587
    bool class_utf8;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   588
    unsigned char* class_utf8data;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   589
    unsigned char utf8_char[6];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   590
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   591
    /* Initialize no first byte, no required byte. REQ_UNSET means "no char
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   592
     matching encountered yet". It gets changed to REQ_NONE if we hit something that
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   593
     matches a non-fixed char first char; reqByte just remains unset if we never
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   594
     find one.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   595
     
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   596
     When we hit a repeat whose minimum is zero, we may have to adjust these values
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   597
     to take the zero repeat into account. This is implemented by setting them to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   598
     zeroFirstByte and zeroReqByte when such a repeat is encountered. The individual
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   599
     item types that can be repeated set these backoff variables appropriately. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   600
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   601
    int firstByte = REQ_UNSET;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   602
    int reqByte = REQ_UNSET;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   603
    int zeroReqByte = REQ_UNSET;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   604
    int zeroFirstByte = REQ_UNSET;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   605
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   606
    /* The variable reqCaseOpt contains either the REQ_IGNORE_CASE value or zero,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   607
     according to the current setting of the ignores-case flag. REQ_IGNORE_CASE is a bit
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   608
     value > 255. It is added into the firstByte or reqByte variables to record the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   609
     case status of the value. This is used only for ASCII characters. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   610
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   611
    int reqCaseOpt = (options & IgnoreCaseOption) ? REQ_IGNORE_CASE : 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   612
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   613
    /* Switch on next character until the end of the branch */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   614
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   615
    for (;; ptr++) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   616
        bool negateClass;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   617
        bool shouldFlipNegation; /* If a negative special such as \S is used, we should negate the whole class to properly support Unicode. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   618
        int classCharCount;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   619
        int classLastChar;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   620
        int skipBytes;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   621
        int subReqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   622
        int subFirstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   623
        int mcLength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   624
        unsigned char mcbuffer[8];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   625
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   626
        /* Next byte in the pattern */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   627
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   628
        c = ptr < patternEnd ? *ptr : 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   629
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   630
        /* Fill in length of a previous callout, except when the next thing is
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   631
         a quantifier. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   632
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   633
        bool isQuantifier = c == '*' || c == '+' || c == '?' || (c == '{' && isCountedRepeat(ptr + 1, patternEnd));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   634
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   635
        switch (c) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   636
            /* The branch terminates at end of string, |, or ). */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   637
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   638
            case 0:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   639
                if (ptr < patternEnd)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   640
                    goto NORMAL_CHAR;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   641
                // End of string; fall through
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   642
            case '|':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   643
            case ')':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   644
                *firstbyteptr = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   645
                *reqbyteptr = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   646
                *codePtr = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   647
                *ptrPtr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   648
                return true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   649
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   650
            /* Handle single-character metacharacters. In multiline mode, ^ disables
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   651
             the setting of any following char as a first character. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   652
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   653
            case '^':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   654
                if (options & MatchAcrossMultipleLinesOption) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   655
                    if (firstByte == REQ_UNSET)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   656
                        firstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   657
                    *code++ = OP_BOL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   658
                } else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   659
                    *code++ = OP_CIRC;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   660
                previous = NULL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   661
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   662
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   663
            case '$':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   664
                previous = NULL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   665
                if (options & MatchAcrossMultipleLinesOption)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   666
                  *code++ = OP_EOL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   667
                else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   668
                  *code++ = OP_DOLL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   669
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   670
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   671
            /* There can never be a first char if '.' is first, whatever happens about
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   672
             repeats. The value of reqByte doesn't change either. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   673
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   674
            case '.':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   675
                if (firstByte == REQ_UNSET)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   676
                    firstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   677
                zeroFirstByte = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   678
                zeroReqByte = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   679
                previous = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   680
                *code++ = OP_NOT_NEWLINE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   681
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   682
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   683
            /* Character classes. If the included characters are all < 256, we build a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   684
             32-byte bitmap of the permitted characters, except in the special case
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   685
             where there is only one such character. For negated classes, we build the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   686
             map as usual, then invert it at the end. However, we use a different opcode
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   687
             so that data characters > 255 can be handled correctly.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   688
             
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   689
             If the class contains characters outside the 0-255 range, a different
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   690
             opcode is compiled. It may optionally have a bit map for characters < 256,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   691
             but those above are are explicitly listed afterwards. A flag byte tells
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   692
             whether the bitmap is present, and whether this is a negated class or not.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   693
             */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   694
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   695
            case '[': {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   696
                previous = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   697
                shouldFlipNegation = false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   698
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   699
                /* PCRE supports POSIX class stuff inside a class. Perl gives an error if
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   700
                 they are encountered at the top level, so we'll do that too. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   701
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   702
                /* If the first character is '^', set the negation flag and skip it. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   703
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   704
                if (ptr + 1 >= patternEnd) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   705
                    *errorCodePtr = ERR6;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   706
                    return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   707
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   708
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   709
                if (ptr[1] == '^') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   710
                    negateClass = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   711
                    ++ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   712
                } else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   713
                    negateClass = false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   714
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   715
                /* Keep a count of chars with values < 256 so that we can optimize the case
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   716
                 of just a single character (as long as it's < 256). For higher valued UTF-8
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   717
                 characters, we don't yet do any optimization. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   718
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   719
                classCharCount = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   720
                classLastChar = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   721
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   722
                class_utf8 = false;                       /* No chars >= 256 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   723
                class_utf8data = code + LINK_SIZE + 34;   /* For UTF-8 items */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   724
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   725
                /* Initialize the 32-char bit map to all zeros. We have to build the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   726
                 map in a temporary bit of store, in case the class contains only 1
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   727
                 character (< 256), because in that case the compiled code doesn't use the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   728
                 bit map. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   729
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   730
                memset(classbits, 0, 32 * sizeof(unsigned char));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   731
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   732
                /* Process characters until ] is reached. The first pass
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   733
                 through the regex checked the overall syntax, so we don't need to be very
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   734
                 strict here. At the start of the loop, c contains the first byte of the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   735
                 character. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   736
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   737
                while ((++ptr < patternEnd) && (c = *ptr) != ']') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   738
                    /* Backslash may introduce a single character, or it may introduce one
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   739
                     of the specials, which just set a flag. Escaped items are checked for
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   740
                     validity in the pre-compiling pass. The sequence \b is a special case.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   741
                     Inside a class (and only there) it is treated as backspace. Elsewhere
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   742
                     it marks a word boundary. Other escapes have preset maps ready to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   743
                     or into the one we are building. We assume they have more than one
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   744
                     character in them, so set classCharCount bigger than one. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   745
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   746
                    if (c == '\\') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   747
                        c = checkEscape(&ptr, patternEnd, errorCodePtr, cd.numCapturingBrackets, true);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   748
                        if (c < 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   749
                            classCharCount += 2;     /* Greater than 1 is what matters */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   750
                            switch (-c) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   751
                                case ESC_d:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   752
                                    for (c = 0; c < 32; c++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   753
                                        classbits[c] |= classBitmapForChar(c + cbit_digit);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   754
                                    continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   755
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   756
                                case ESC_D:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   757
                                    shouldFlipNegation = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   758
                                    for (c = 0; c < 32; c++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   759
                                        classbits[c] |= ~classBitmapForChar(c + cbit_digit);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   760
                                    continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   761
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   762
                                case ESC_w:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   763
                                    for (c = 0; c < 32; c++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   764
                                        classbits[c] |= classBitmapForChar(c + cbit_word);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   765
                                    continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   766
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   767
                                case ESC_W:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   768
                                    shouldFlipNegation = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   769
                                    for (c = 0; c < 32; c++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   770
                                        classbits[c] |= ~classBitmapForChar(c + cbit_word);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   771
                                    continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   772
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   773
                                case ESC_s:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   774
                                    for (c = 0; c < 32; c++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   775
                                         classbits[c] |= classBitmapForChar(c + cbit_space);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   776
                                    continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   777
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   778
                                case ESC_S:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   779
                                    shouldFlipNegation = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   780
                                    for (c = 0; c < 32; c++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   781
                                         classbits[c] |= ~classBitmapForChar(c + cbit_space);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   782
                                    continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   783
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   784
                                    /* Unrecognized escapes are faulted if PCRE is running in its
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   785
                                     strict mode. By default, for compatibility with Perl, they are
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   786
                                     treated as literals. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   787
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   788
                                default:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   789
                                    c = *ptr;              /* The final character */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   790
                                    classCharCount -= 2;  /* Undo the default count from above */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   791
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   792
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   793
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   794
                        /* Fall through if we have a single character (c >= 0). This may be
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   795
                         > 256 in UTF-8 mode. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   796
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   797
                    }   /* End of backslash handling */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   798
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   799
                    /* A single character may be followed by '-' to form a range. However,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   800
                     Perl does not permit ']' to be the end of the range. A '-' character
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   801
                     here is treated as a literal. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   802
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   803
                    if ((ptr + 2 < patternEnd) && ptr[1] == '-' && ptr[2] != ']') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   804
                        ptr += 2;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   805
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   806
                        int d = *ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   807
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   808
                        /* The second part of a range can be a single-character escape, but
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   809
                         not any of the other escapes. Perl 5.6 treats a hyphen as a literal
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   810
                         in such circumstances. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   811
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   812
                        if (d == '\\') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   813
                            const UChar* oldptr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   814
                            d = checkEscape(&ptr, patternEnd, errorCodePtr, cd.numCapturingBrackets, true);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   815
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   816
                            /* \X is literal X; any other special means the '-' was literal */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   817
                            if (d < 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   818
                                ptr = oldptr - 2;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   819
                                goto LONE_SINGLE_CHARACTER;  /* A few lines below */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   820
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   821
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   822
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   823
                        /* The check that the two values are in the correct order happens in
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   824
                         the pre-pass. Optimize one-character ranges */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   825
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   826
                        if (d == c)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   827
                            goto LONE_SINGLE_CHARACTER;  /* A few lines below */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   828
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   829
                        /* In UTF-8 mode, if the upper limit is > 255, or > 127 for caseless
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   830
                         matching, we have to use an XCLASS with extra data items. Caseless
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   831
                         matching for characters > 127 is available only if UCP support is
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   832
                         available. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   833
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   834
                        if ((d > 255 || ((options & IgnoreCaseOption) && d > 127))) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   835
                            class_utf8 = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   836
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   837
                            /* With UCP support, we can find the other case equivalents of
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   838
                             the relevant characters. There may be several ranges. Optimize how
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   839
                             they fit with the basic range. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   840
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   841
                            if (options & IgnoreCaseOption) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   842
                                int occ, ocd;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   843
                                int cc = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   844
                                int origd = d;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   845
                                while (getOthercaseRange(&cc, origd, &occ, &ocd)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   846
                                    if (occ >= c && ocd <= d)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   847
                                        continue;  /* Skip embedded ranges */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   848
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   849
                                    if (occ < c  && ocd >= c - 1)        /* Extend the basic range */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   850
                                    {                                  /* if there is overlap,   */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   851
                                        c = occ;                           /* noting that if occ < c */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   852
                                        continue;                          /* we can't have ocd > d  */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   853
                                    }                                  /* because a subrange is  */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   854
                                    if (ocd > d && occ <= d + 1)         /* always shorter than    */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   855
                                    {                                  /* the basic range.       */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   856
                                        d = ocd;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   857
                                        continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   858
                                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   859
                                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   860
                                    if (occ == ocd)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   861
                                        *class_utf8data++ = XCL_SINGLE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   862
                                    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   863
                                        *class_utf8data++ = XCL_RANGE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   864
                                        class_utf8data += encodeUTF8(occ, class_utf8data);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   865
                                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   866
                                    class_utf8data += encodeUTF8(ocd, class_utf8data);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   867
                                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   868
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   869
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   870
                            /* Now record the original range, possibly modified for UCP caseless
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   871
                             overlapping ranges. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   872
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   873
                            *class_utf8data++ = XCL_RANGE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   874
                            class_utf8data += encodeUTF8(c, class_utf8data);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   875
                            class_utf8data += encodeUTF8(d, class_utf8data);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   876
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   877
                            /* With UCP support, we are done. Without UCP support, there is no
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   878
                             caseless matching for UTF-8 characters > 127; we can use the bit map
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   879
                             for the smaller ones. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   880
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   881
                            continue;    /* With next character in the class */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   882
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   883
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   884
                        /* We use the bit map for all cases when not in UTF-8 mode; else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   885
                         ranges that lie entirely within 0-127 when there is UCP support; else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   886
                         for partial ranges without UCP support. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   887
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   888
                        for (; c <= d; c++) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   889
                            classbits[c/8] |= (1 << (c&7));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   890
                            if (options & IgnoreCaseOption) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   891
                                int uc = flipCase(c);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   892
                                classbits[uc/8] |= (1 << (uc&7));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   893
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   894
                            classCharCount++;                /* in case a one-char range */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   895
                            classLastChar = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   896
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   897
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   898
                        continue;   /* Go get the next char in the class */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   899
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   900
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   901
                    /* Handle a lone single character - we can get here for a normal
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   902
                     non-escape char, or after \ that introduces a single character or for an
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   903
                     apparent range that isn't. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   904
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   905
                LONE_SINGLE_CHARACTER:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   906
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   907
                    /* Handle a character that cannot go in the bit map */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   908
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   909
                    if ((c > 255 || ((options & IgnoreCaseOption) && c > 127))) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   910
                        class_utf8 = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   911
                        *class_utf8data++ = XCL_SINGLE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   912
                        class_utf8data += encodeUTF8(c, class_utf8data);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   913
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   914
                        if (options & IgnoreCaseOption) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   915
                            int othercase;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   916
                            if ((othercase = jsc_pcre_ucp_othercase(c)) >= 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   917
                                *class_utf8data++ = XCL_SINGLE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   918
                                class_utf8data += encodeUTF8(othercase, class_utf8data);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   919
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   920
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   921
                    } else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   922
                        /* Handle a single-byte character */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   923
                        classbits[c/8] |= (1 << (c&7));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   924
                        if (options & IgnoreCaseOption) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   925
                            c = flipCase(c);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   926
                            classbits[c/8] |= (1 << (c&7));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   927
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   928
                        classCharCount++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   929
                        classLastChar = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   930
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   931
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   932
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   933
                /* If classCharCount is 1, we saw precisely one character whose value is
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   934
                 less than 256. In non-UTF-8 mode we can always optimize. In UTF-8 mode, we
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   935
                 can optimize the negative case only if there were no characters >= 128
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   936
                 because OP_NOT and the related opcodes like OP_NOTSTAR operate on
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   937
                 single-bytes only. This is an historical hangover. Maybe one day we can
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   938
                 tidy these opcodes to handle multi-byte characters.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   939
                 
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   940
                 The optimization throws away the bit map. We turn the item into a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   941
                 1-character OP_CHAR[NC] if it's positive, or OP_NOT if it's negative. Note
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   942
                 that OP_NOT does not support multibyte characters. In the positive case, it
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   943
                 can cause firstByte to be set. Otherwise, there can be no first char if
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   944
                 this item is first, whatever repeat count may follow. In the case of
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   945
                 reqByte, save the previous value for reinstating. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   946
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   947
                if (classCharCount == 1 && (!class_utf8 && (!negateClass || classLastChar < 128))) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   948
                    zeroReqByte = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   949
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   950
                    /* The OP_NOT opcode works on one-byte characters only. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   951
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   952
                    if (negateClass) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   953
                        if (firstByte == REQ_UNSET)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   954
                            firstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   955
                        zeroFirstByte = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   956
                        *code++ = OP_NOT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   957
                        *code++ = classLastChar;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   958
                        break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   959
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   960
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   961
                    /* For a single, positive character, get the value into c, and
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   962
                     then we can handle this with the normal one-character code. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   963
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   964
                    c = classLastChar;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   965
                    goto NORMAL_CHAR;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   966
                }       /* End of 1-char optimization */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   967
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   968
                /* The general case - not the one-char optimization. If this is the first
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   969
                 thing in the branch, there can be no first char setting, whatever the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   970
                 repeat count. Any reqByte setting must remain unchanged after any kind of
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   971
                 repeat. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   972
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   973
                if (firstByte == REQ_UNSET) firstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   974
                zeroFirstByte = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   975
                zeroReqByte = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   976
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   977
                /* If there are characters with values > 255, we have to compile an
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   978
                 extended class, with its own opcode. If there are no characters < 256,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   979
                 we can omit the bitmap. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   980
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   981
                if (class_utf8 && !shouldFlipNegation) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   982
                    *class_utf8data++ = XCL_END;    /* Marks the end of extra data */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   983
                    *code++ = OP_XCLASS;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   984
                    code += LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   985
                    *code = negateClass? XCL_NOT : 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   986
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   987
                    /* If the map is required, install it, and move on to the end of
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   988
                     the extra data */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   989
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   990
                    if (classCharCount > 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   991
                        *code++ |= XCL_MAP;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   992
                        memcpy(code, classbits, 32);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   993
                        code = class_utf8data;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   994
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   995
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   996
                    /* If the map is not required, slide down the extra data. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   997
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   998
                    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
   999
                        int len = class_utf8data - (code + 33);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1000
                        memmove(code + 1, code + 33, len);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1001
                        code += len + 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1002
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1003
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1004
                    /* Now fill in the complete length of the item */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1005
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1006
                    putLinkValue(previous + 1, code - previous);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1007
                    break;   /* End of class handling */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1008
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1009
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1010
                /* If there are no characters > 255, negate the 32-byte map if necessary,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1011
                 and copy it into the code vector. If this is the first thing in the branch,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1012
                 there can be no first char setting, whatever the repeat count. Any reqByte
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1013
                 setting must remain unchanged after any kind of repeat. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1014
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1015
                *code++ = (negateClass == shouldFlipNegation) ? OP_CLASS : OP_NCLASS;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1016
                if (negateClass)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1017
                    for (c = 0; c < 32; c++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1018
                        code[c] = ~classbits[c];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1019
                else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1020
                    memcpy(code, classbits, 32);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1021
                code += 32;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1022
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1023
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1024
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1025
            /* Various kinds of repeat; '{' is not necessarily a quantifier, but this
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1026
             has been tested above. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1027
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1028
            case '{':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1029
                if (!isQuantifier)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1030
                    goto NORMAL_CHAR;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1031
                ptr = readRepeatCounts(ptr + 1, &repeatMin, &repeat_max, errorCodePtr);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1032
                if (*errorCodePtr)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1033
                    goto FAILED;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1034
                goto REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1035
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1036
            case '*':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1037
                repeatMin = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1038
                repeat_max = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1039
                goto REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1040
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1041
            case '+':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1042
                repeatMin = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1043
                repeat_max = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1044
                goto REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1045
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1046
            case '?':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1047
                repeatMin = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1048
                repeat_max = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1049
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1050
            REPEAT:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1051
                if (!previous) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1052
                    *errorCodePtr = ERR9;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1053
                    goto FAILED;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1054
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1055
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1056
                if (repeatMin == 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1057
                    firstByte = zeroFirstByte;    /* Adjust for zero repeat */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1058
                    reqByte = zeroReqByte;        /* Ditto */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1059
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1060
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1061
                /* Remember whether this is a variable length repeat */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1062
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1063
                reqvary = (repeatMin == repeat_max) ? 0 : REQ_VARY;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1064
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1065
                opType = 0;                    /* Default single-char op codes */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1066
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1067
                /* Save start of previous item, in case we have to move it up to make space
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1068
                 for an inserted OP_ONCE for the additional '+' extension. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1069
                /* FIXME: Probably don't need this because we don't use OP_ONCE. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1070
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1071
                tempcode = previous;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1072
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1073
                /* If the next character is '+', we have a possessive quantifier. This
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1074
                 implies greediness, whatever the setting of the PCRE_UNGREEDY option.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1075
                 If the next character is '?' this is a minimizing repeat, by default,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1076
                 but if PCRE_UNGREEDY is set, it works the other way round. We change the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1077
                 repeat type to the non-default. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1078
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1079
                if (safelyCheckNextChar(ptr, patternEnd, '?')) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1080
                    repeatType = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1081
                    ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1082
                } else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1083
                    repeatType = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1084
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1085
                /* If previous was a character match, abolish the item and generate a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1086
                 repeat item instead. If a char item has a minumum of more than one, ensure
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1087
                 that it is set in reqByte - it might not be if a sequence such as x{3} is
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1088
                 the first thing in a branch because the x will have gone into firstByte
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1089
                 instead.  */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1090
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1091
                if (*previous == OP_CHAR || *previous == OP_CHAR_IGNORING_CASE) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1092
                    /* Deal with UTF-8 characters that take up more than one byte. It's
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1093
                     easier to write this out separately than try to macrify it. Use c to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1094
                     hold the length of the character in bytes, plus 0x80 to flag that it's a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1095
                     length rather than a small character. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1096
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1097
                    if (code[-1] & 0x80) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1098
                        unsigned char *lastchar = code - 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1099
                        while((*lastchar & 0xc0) == 0x80)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1100
                            lastchar--;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1101
                        c = code - lastchar;            /* Length of UTF-8 character */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1102
                        memcpy(utf8_char, lastchar, c); /* Save the char */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1103
                        c |= 0x80;                      /* Flag c as a length */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1104
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1105
                    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1106
                        c = code[-1];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1107
                        if (repeatMin > 1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1108
                            reqByte = c | reqCaseOpt | cd.reqVaryOpt;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1109
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1110
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1111
                    goto OUTPUT_SINGLE_REPEAT;   /* Code shared with single character types */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1112
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1113
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1114
                else if (*previous == OP_ASCII_CHAR || *previous == OP_ASCII_LETTER_IGNORING_CASE) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1115
                    c = previous[1];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1116
                    if (repeatMin > 1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1117
                        reqByte = c | reqCaseOpt | cd.reqVaryOpt;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1118
                    goto OUTPUT_SINGLE_REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1119
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1120
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1121
                /* If previous was a single negated character ([^a] or similar), we use
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1122
                 one of the special opcodes, replacing it. The code is shared with single-
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1123
                 character repeats by setting opt_type to add a suitable offset into
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1124
                 repeatType. OP_NOT is currently used only for single-byte chars. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1125
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1126
                else if (*previous == OP_NOT) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1127
                    opType = OP_NOTSTAR - OP_STAR;  /* Use "not" opcodes */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1128
                    c = previous[1];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1129
                    goto OUTPUT_SINGLE_REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1130
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1131
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1132
                /* If previous was a character type match (\d or similar), abolish it and
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1133
                 create a suitable repeat item. The code is shared with single-character
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1134
                 repeats by setting opType to add a suitable offset into repeatType. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1135
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1136
                else if (*previous <= OP_NOT_NEWLINE) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1137
                    opType = OP_TYPESTAR - OP_STAR;  /* Use type opcodes */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1138
                    c = *previous;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1139
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1140
                OUTPUT_SINGLE_REPEAT:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1141
                    int prop_type = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1142
                    int prop_value = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1143
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1144
                    unsigned char* oldcode = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1145
                    code = previous;                  /* Usually overwrite previous item */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1146
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1147
                    /* If the maximum is zero then the minimum must also be zero; Perl allows
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1148
                     this case, so we do too - by simply omitting the item altogether. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1149
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1150
                    if (repeat_max == 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1151
                        goto END_REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1152
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1153
                    /* Combine the opType with the repeatType */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1154
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1155
                    repeatType += opType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1156
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1157
                    /* A minimum of zero is handled either as the special case * or ?, or as
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1158
                     an UPTO, with the maximum given. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1159
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1160
                    if (repeatMin == 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1161
                        if (repeat_max == -1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1162
                            *code++ = OP_STAR + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1163
                        else if (repeat_max == 1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1164
                            *code++ = OP_QUERY + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1165
                        else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1166
                            *code++ = OP_UPTO + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1167
                            put2ByteValueAndAdvance(code, repeat_max);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1168
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1169
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1170
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1171
                    /* A repeat minimum of 1 is optimized into some special cases. If the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1172
                     maximum is unlimited, we use OP_PLUS. Otherwise, the original item it
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1173
                     left in place and, if the maximum is greater than 1, we use OP_UPTO with
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1174
                     one less than the maximum. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1175
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1176
                    else if (repeatMin == 1) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1177
                        if (repeat_max == -1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1178
                            *code++ = OP_PLUS + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1179
                        else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1180
                            code = oldcode;                 /* leave previous item in place */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1181
                            if (repeat_max == 1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1182
                                goto END_REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1183
                            *code++ = OP_UPTO + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1184
                            put2ByteValueAndAdvance(code, repeat_max - 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1185
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1186
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1187
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1188
                    /* The case {n,n} is just an EXACT, while the general case {n,m} is
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1189
                     handled as an EXACT followed by an UPTO. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1190
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1191
                    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1192
                        *code++ = OP_EXACT + opType;  /* NB EXACT doesn't have repeatType */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1193
                        put2ByteValueAndAdvance(code, repeatMin);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1194
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1195
                        /* If the maximum is unlimited, insert an OP_STAR. Before doing so,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1196
                         we have to insert the character for the previous code. For a repeated
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1197
                         Unicode property match, there are two extra bytes that define the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1198
                         required property. In UTF-8 mode, long characters have their length in
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1199
                         c, with the 0x80 bit as a flag. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1200
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1201
                        if (repeat_max < 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1202
                            if (c >= 128) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1203
                                memcpy(code, utf8_char, c & 7);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1204
                                code += c & 7;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1205
                            } else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1206
                                *code++ = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1207
                                if (prop_type >= 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1208
                                    *code++ = prop_type;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1209
                                    *code++ = prop_value;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1210
                                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1211
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1212
                            *code++ = OP_STAR + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1213
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1214
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1215
                        /* Else insert an UPTO if the max is greater than the min, again
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1216
                         preceded by the character, for the previously inserted code. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1217
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1218
                        else if (repeat_max != repeatMin) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1219
                            if (c >= 128) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1220
                                memcpy(code, utf8_char, c & 7);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1221
                                code += c & 7;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1222
                            } else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1223
                                *code++ = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1224
                            if (prop_type >= 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1225
                                *code++ = prop_type;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1226
                                *code++ = prop_value;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1227
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1228
                            repeat_max -= repeatMin;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1229
                            *code++ = OP_UPTO + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1230
                            put2ByteValueAndAdvance(code, repeat_max);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1231
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1232
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1233
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1234
                    /* The character or character type itself comes last in all cases. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1235
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1236
                    if (c >= 128) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1237
                        memcpy(code, utf8_char, c & 7);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1238
                        code += c & 7;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1239
                    } else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1240
                        *code++ = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1241
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1242
                    /* For a repeated Unicode property match, there are two extra bytes that
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1243
                     define the required property. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1244
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1245
                    if (prop_type >= 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1246
                        *code++ = prop_type;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1247
                        *code++ = prop_value;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1248
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1249
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1250
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1251
                /* If previous was a character class or a back reference, we put the repeat
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1252
                 stuff after it, but just skip the item if the repeat was {0,0}. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1253
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1254
                else if (*previous == OP_CLASS ||
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1255
                         *previous == OP_NCLASS ||
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1256
                         *previous == OP_XCLASS ||
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1257
                         *previous == OP_REF)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1258
                {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1259
                    if (repeat_max == 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1260
                        code = previous;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1261
                        goto END_REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1262
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1263
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1264
                    if (repeatMin == 0 && repeat_max == -1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1265
                        *code++ = OP_CRSTAR + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1266
                    else if (repeatMin == 1 && repeat_max == -1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1267
                        *code++ = OP_CRPLUS + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1268
                    else if (repeatMin == 0 && repeat_max == 1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1269
                        *code++ = OP_CRQUERY + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1270
                    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1271
                        *code++ = OP_CRRANGE + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1272
                        put2ByteValueAndAdvance(code, repeatMin);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1273
                        if (repeat_max == -1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1274
                            repeat_max = 0;  /* 2-byte encoding for max */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1275
                        put2ByteValueAndAdvance(code, repeat_max);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1276
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1277
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1278
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1279
                /* If previous was a bracket group, we may have to replicate it in certain
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1280
                 cases. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1281
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1282
                else if (*previous >= OP_BRA) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1283
                    int ketoffset = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1284
                    int len = code - previous;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1285
                    unsigned char* bralink = NULL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1286
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1287
                    /* If the maximum repeat count is unlimited, find the end of the bracket
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1288
                     by scanning through from the start, and compute the offset back to it
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1289
                     from the current code pointer. There may be an OP_OPT setting following
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1290
                     the final KET, so we can't find the end just by going back from the code
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1291
                     pointer. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1292
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1293
                    if (repeat_max == -1) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1294
                        const unsigned char* ket = previous;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1295
                        advanceToEndOfBracket(ket);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1296
                        ketoffset = code - ket;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1297
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1298
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1299
                    /* The case of a zero minimum is special because of the need to stick
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1300
                     OP_BRAZERO in front of it, and because the group appears once in the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1301
                     data, whereas in other cases it appears the minimum number of times. For
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1302
                     this reason, it is simplest to treat this case separately, as otherwise
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1303
                     the code gets far too messy. There are several special subcases when the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1304
                     minimum is zero. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1305
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1306
                    if (repeatMin == 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1307
                        /* If the maximum is also zero, we just omit the group from the output
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1308
                         altogether. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1309
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1310
                        if (repeat_max == 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1311
                            code = previous;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1312
                            goto END_REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1313
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1314
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1315
                        /* If the maximum is 1 or unlimited, we just have to stick in the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1316
                         BRAZERO and do no more at this point. However, we do need to adjust
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1317
                         any OP_RECURSE calls inside the group that refer to the group itself or
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1318
                         any internal group, because the offset is from the start of the whole
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1319
                         regex. Temporarily terminate the pattern while doing this. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1320
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1321
                        if (repeat_max <= 1) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1322
                            *code = OP_END;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1323
                            memmove(previous+1, previous, len);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1324
                            code++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1325
                            *previous++ = OP_BRAZERO + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1326
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1327
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1328
                        /* If the maximum is greater than 1 and limited, we have to replicate
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1329
                         in a nested fashion, sticking OP_BRAZERO before each set of brackets.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1330
                         The first one has to be handled carefully because it's the original
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1331
                         copy, which has to be moved up. The remainder can be handled by code
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1332
                         that is common with the non-zero minimum case below. We have to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1333
                         adjust the value of repeat_max, since one less copy is required. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1334
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1335
                        else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1336
                            *code = OP_END;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1337
                            memmove(previous + 2 + LINK_SIZE, previous, len);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1338
                            code += 2 + LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1339
                            *previous++ = OP_BRAZERO + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1340
                            *previous++ = OP_BRA;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1341
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1342
                            /* We chain together the bracket offset fields that have to be
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1343
                             filled in later when the ends of the brackets are reached. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1344
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1345
                            int offset = (!bralink) ? 0 : previous - bralink;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1346
                            bralink = previous;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1347
                            putLinkValueAllowZeroAndAdvance(previous, offset);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1348
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1349
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1350
                        repeat_max--;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1351
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1352
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1353
                    /* If the minimum is greater than zero, replicate the group as many
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1354
                     times as necessary, and adjust the maximum to the number of subsequent
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1355
                     copies that we need. If we set a first char from the group, and didn't
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1356
                     set a required char, copy the latter from the former. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1357
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1358
                    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1359
                        if (repeatMin > 1) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1360
                            if (didGroupSetFirstByte && reqByte < 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1361
                                reqByte = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1362
                            for (int i = 1; i < repeatMin; i++) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1363
                                memcpy(code, previous, len);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1364
                                code += len;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1365
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1366
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1367
                        if (repeat_max > 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1368
                            repeat_max -= repeatMin;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1369
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1370
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1371
                    /* This code is common to both the zero and non-zero minimum cases. If
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1372
                     the maximum is limited, it replicates the group in a nested fashion,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1373
                     remembering the bracket starts on a stack. In the case of a zero minimum,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1374
                     the first one was set up above. In all cases the repeat_max now specifies
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1375
                     the number of additional copies needed. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1376
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1377
                    if (repeat_max >= 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1378
                        for (int i = repeat_max - 1; i >= 0; i--) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1379
                            *code++ = OP_BRAZERO + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1380
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1381
                            /* All but the final copy start a new nesting, maintaining the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1382
                             chain of brackets outstanding. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1383
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1384
                            if (i != 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1385
                                *code++ = OP_BRA;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1386
                                int offset = (!bralink) ? 0 : code - bralink;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1387
                                bralink = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1388
                                putLinkValueAllowZeroAndAdvance(code, offset);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1389
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1390
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1391
                            memcpy(code, previous, len);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1392
                            code += len;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1393
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1394
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1395
                        /* Now chain through the pending brackets, and fill in their length
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1396
                         fields (which are holding the chain links pro tem). */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1397
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1398
                        while (bralink) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1399
                            int offset = code - bralink + 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1400
                            unsigned char* bra = code - offset;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1401
                            int oldlinkoffset = getLinkValueAllowZero(bra + 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1402
                            bralink = (!oldlinkoffset) ? 0 : bralink - oldlinkoffset;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1403
                            *code++ = OP_KET;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1404
                            putLinkValueAndAdvance(code, offset);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1405
                            putLinkValue(bra + 1, offset);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1406
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1407
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1408
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1409
                    /* If the maximum is unlimited, set a repeater in the final copy. We
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1410
                     can't just offset backwards from the current code point, because we
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1411
                     don't know if there's been an options resetting after the ket. The
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1412
                     correct offset was computed above. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1413
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1414
                    else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1415
                        code[-ketoffset] = OP_KETRMAX + repeatType;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1416
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1417
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1418
                // A quantifier after an assertion is mostly meaningless, but it
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1419
                // can nullify the assertion if it has a 0 minimum.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1420
                else if (*previous == OP_ASSERT || *previous == OP_ASSERT_NOT) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1421
                    if (repeatMin == 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1422
                        code = previous;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1423
                        goto END_REPEAT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1424
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1425
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1426
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1427
                /* Else there's some kind of shambles */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1428
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1429
                else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1430
                    *errorCodePtr = ERR11;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1431
                    goto FAILED;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1432
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1433
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1434
                /* In all case we no longer have a previous item. We also set the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1435
                 "follows varying string" flag for subsequently encountered reqbytes if
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1436
                 it isn't already set and we have just passed a varying length item. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1437
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1438
            END_REPEAT:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1439
                previous = NULL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1440
                cd.reqVaryOpt |= reqvary;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1441
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1442
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1443
            /* Start of nested bracket sub-expression, or comment or lookahead or
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1444
             lookbehind or option setting or condition. First deal with special things
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1445
             that can come after a bracket; all are introduced by ?, and the appearance
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1446
             of any of them means that this is not a referencing group. They were
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1447
             checked for validity in the first pass over the string, so we don't have to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1448
             check for syntax errors here.  */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1449
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1450
            case '(':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1451
                skipBytes = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1452
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1453
                if (*(++ptr) == '?') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1454
                    switch (*(++ptr)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1455
                        case ':':                 /* Non-extracting bracket */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1456
                            bravalue = OP_BRA;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1457
                            ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1458
                            break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1459
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1460
                        case '=':                 /* Positive lookahead */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1461
                            bravalue = OP_ASSERT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1462
                            ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1463
                            break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1464
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1465
                        case '!':                 /* Negative lookahead */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1466
                            bravalue = OP_ASSERT_NOT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1467
                            ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1468
                            break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1469
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1470
                        /* Character after (? not specially recognized */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1471
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1472
                        default:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1473
                            *errorCodePtr = ERR12;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1474
                            goto FAILED;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1475
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1476
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1477
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1478
                /* Else we have a referencing group; adjust the opcode. If the bracket
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1479
                 number is greater than EXTRACT_BASIC_MAX, we set the opcode one higher, and
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1480
                 arrange for the true number to follow later, in an OP_BRANUMBER item. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1481
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1482
                else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1483
                    if (++(*brackets) > EXTRACT_BASIC_MAX) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1484
                        bravalue = OP_BRA + EXTRACT_BASIC_MAX + 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1485
                        code[1 + LINK_SIZE] = OP_BRANUMBER;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1486
                        put2ByteValue(code + 2 + LINK_SIZE, *brackets);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1487
                        skipBytes = 3;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1488
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1489
                    else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1490
                        bravalue = OP_BRA + *brackets;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1491
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1492
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1493
                /* Process nested bracketed re. We copy code into a non-variable
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1494
                 in order to be able to pass its address because some compilers
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1495
                 complain otherwise. Pass in a new setting for the ims options
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1496
                 if they have changed. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1497
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1498
                previous = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1499
                *code = bravalue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1500
                tempcode = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1501
                tempreqvary = cd.reqVaryOpt;     /* Save value before bracket */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1502
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1503
                if (!compileBracket(
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1504
                                   options,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1505
                                   brackets,                     /* Extracting bracket count */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1506
                                   &tempcode,                    /* Where to put code (updated) */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1507
                                   &ptr,                         /* Input pointer (updated) */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1508
                                   patternEnd,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1509
                                   errorCodePtr,                 /* Where to put an error message */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1510
                                   skipBytes,                    /* Skip over OP_BRANUMBER */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1511
                                   &subFirstByte,                /* For possible first char */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1512
                                   &subReqByte,                  /* For possible last char */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1513
                                   cd))                          /* Tables block */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1514
                    goto FAILED;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1515
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1516
                /* At the end of compiling, code is still pointing to the start of the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1517
                 group, while tempcode has been updated to point past the end of the group
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1518
                 and any option resetting that may follow it. The pattern pointer (ptr)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1519
                 is on the bracket. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1520
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1521
                /* Handle updating of the required and first characters. Update for normal
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1522
                 brackets of all kinds, and conditions with two branches (see code above).
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1523
                 If the bracket is followed by a quantifier with zero repeat, we have to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1524
                 back off. Hence the definition of zeroReqByte and zeroFirstByte outside the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1525
                 main loop so that they can be accessed for the back off. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1526
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1527
                zeroReqByte = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1528
                zeroFirstByte = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1529
                didGroupSetFirstByte = false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1530
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1531
                if (bravalue >= OP_BRA) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1532
                    /* If we have not yet set a firstByte in this branch, take it from the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1533
                     subpattern, remembering that it was set here so that a repeat of more
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1534
                     than one can replicate it as reqByte if necessary. If the subpattern has
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1535
                     no firstByte, set "none" for the whole branch. In both cases, a zero
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1536
                     repeat forces firstByte to "none". */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1537
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1538
                    if (firstByte == REQ_UNSET) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1539
                        if (subFirstByte >= 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1540
                            firstByte = subFirstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1541
                            didGroupSetFirstByte = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1542
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1543
                        else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1544
                            firstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1545
                        zeroFirstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1546
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1547
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1548
                    /* If firstByte was previously set, convert the subpattern's firstByte
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1549
                     into reqByte if there wasn't one, using the vary flag that was in
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1550
                     existence beforehand. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1551
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1552
                    else if (subFirstByte >= 0 && subReqByte < 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1553
                        subReqByte = subFirstByte | tempreqvary;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1554
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1555
                    /* If the subpattern set a required byte (or set a first byte that isn't
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1556
                     really the first byte - see above), set it. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1557
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1558
                    if (subReqByte >= 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1559
                        reqByte = subReqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1560
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1561
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1562
                /* For a forward assertion, we take the reqByte, if set. This can be
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1563
                 helpful if the pattern that follows the assertion doesn't set a different
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1564
                 char. For example, it's useful for /(?=abcde).+/. We can't set firstByte
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1565
                 for an assertion, however because it leads to incorrect effect for patterns
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1566
                 such as /(?=a)a.+/ when the "real" "a" would then become a reqByte instead
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1567
                 of a firstByte. This is overcome by a scan at the end if there's no
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1568
                 firstByte, looking for an asserted first char. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1569
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1570
                else if (bravalue == OP_ASSERT && subReqByte >= 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1571
                    reqByte = subReqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1572
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1573
                /* Now update the main code pointer to the end of the group. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1574
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1575
                code = tempcode;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1576
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1577
                /* Error if hit end of pattern */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1578
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1579
                if (ptr >= patternEnd || *ptr != ')') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1580
                    *errorCodePtr = ERR14;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1581
                    goto FAILED;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1582
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1583
                break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1584
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1585
            /* Check \ for being a real metacharacter; if not, fall through and handle
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1586
             it as a data character at the start of a string. Escape items are checked
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1587
             for validity in the pre-compiling pass. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1588
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1589
            case '\\':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1590
                tempptr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1591
                c = checkEscape(&ptr, patternEnd, errorCodePtr, cd.numCapturingBrackets, false);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1592
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1593
                /* Handle metacharacters introduced by \. For ones like \d, the ESC_ values
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1594
                 are arranged to be the negation of the corresponding OP_values. For the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1595
                 back references, the values are ESC_REF plus the reference number. Only
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1596
                 back references and those types that consume a character may be repeated.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1597
                 We can test for values between ESC_b and ESC_w for the latter; this may
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1598
                 have to change if any new ones are ever created. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1599
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1600
                if (c < 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1601
                    /* For metasequences that actually match a character, we disable the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1602
                     setting of a first character if it hasn't already been set. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1603
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1604
                    if (firstByte == REQ_UNSET && -c > ESC_b && -c <= ESC_w)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1605
                        firstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1606
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1607
                    /* Set values to reset to if this is followed by a zero repeat. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1608
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1609
                    zeroFirstByte = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1610
                    zeroReqByte = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1611
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1612
                    /* Back references are handled specially */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1613
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1614
                    if (-c >= ESC_REF) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1615
                        int number = -c - ESC_REF;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1616
                        previous = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1617
                        *code++ = OP_REF;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1618
                        put2ByteValueAndAdvance(code, number);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1619
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1620
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1621
                    /* For the rest, we can obtain the OP value by negating the escape
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1622
                     value */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1623
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1624
                    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1625
                        previous = (-c > ESC_b && -c <= ESC_w) ? code : NULL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1626
                        *code++ = -c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1627
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1628
                    continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1629
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1630
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1631
                /* Fall through. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1632
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1633
                /* Handle a literal character. It is guaranteed not to be whitespace or #
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1634
                 when the extended flag is set. If we are in UTF-8 mode, it may be a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1635
                 multi-byte literal character. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1636
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1637
                default:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1638
            NORMAL_CHAR:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1639
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1640
                previous = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1641
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1642
                if (c < 128) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1643
                    mcLength = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1644
                    mcbuffer[0] = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1645
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1646
                    if ((options & IgnoreCaseOption) && (c | 0x20) >= 'a' && (c | 0x20) <= 'z') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1647
                        *code++ = OP_ASCII_LETTER_IGNORING_CASE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1648
                        *code++ = c | 0x20;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1649
                    } else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1650
                        *code++ = OP_ASCII_CHAR;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1651
                        *code++ = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1652
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1653
                } else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1654
                    mcLength = encodeUTF8(c, mcbuffer);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1655
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1656
                    *code++ = (options & IgnoreCaseOption) ? OP_CHAR_IGNORING_CASE : OP_CHAR;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1657
                    for (c = 0; c < mcLength; c++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1658
                        *code++ = mcbuffer[c];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1659
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1660
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1661
                /* Set the first and required bytes appropriately. If no previous first
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1662
                 byte, set it from this character, but revert to none on a zero repeat.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1663
                 Otherwise, leave the firstByte value alone, and don't change it on a zero
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1664
                 repeat. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1665
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1666
                if (firstByte == REQ_UNSET) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1667
                    zeroFirstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1668
                    zeroReqByte = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1669
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1670
                    /* If the character is more than one byte long, we can set firstByte
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1671
                     only if it is not to be matched caselessly. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1672
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1673
                    if (mcLength == 1 || reqCaseOpt == 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1674
                        firstByte = mcbuffer[0] | reqCaseOpt;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1675
                        if (mcLength != 1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1676
                            reqByte = code[-1] | cd.reqVaryOpt;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1677
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1678
                    else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1679
                        firstByte = reqByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1680
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1681
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1682
                /* firstByte was previously set; we can set reqByte only the length is
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1683
                 1 or the matching is caseful. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1684
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1685
                else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1686
                    zeroFirstByte = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1687
                    zeroReqByte = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1688
                    if (mcLength == 1 || reqCaseOpt == 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1689
                        reqByte = code[-1] | reqCaseOpt | cd.reqVaryOpt;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1690
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1691
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1692
                break;            /* End of literal character handling */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1693
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1694
    }                   /* end of big loop */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1695
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1696
    /* Control never reaches here by falling through, only by a goto for all the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1697
     error states. Pass back the position in the pattern so that it can be displayed
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1698
     to the user for diagnosing the error. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1699
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1700
FAILED:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1701
    *ptrPtr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1702
    return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1703
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1704
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1705
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1706
*     Compile sequence of alternatives           *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1707
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1708
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1709
/* On entry, ptr is pointing past the bracket character, but on return
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1710
it points to the closing bracket, or vertical bar, or end of string.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1711
The code variable is pointing at the byte into which the BRA operator has been
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1712
stored. If the ims options are changed at the start (for a (?ims: group) or
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1713
during any branch, we need to insert an OP_OPT item at the start of every
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1714
following branch to ensure they get set correctly at run time, and also pass
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1715
the new options into every subsequent branch compile.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1716
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1717
Argument:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1718
  options        option bits, including any changes for this subpattern
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1719
  brackets       -> int containing the number of extracting brackets used
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1720
  codePtr        -> the address of the current code pointer
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1721
  ptrPtr         -> the address of the current pattern pointer
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1722
  errorCodePtr   -> pointer to error code variable
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1723
  skipBytes      skip this many bytes at start (for OP_BRANUMBER)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1724
  firstbyteptr   place to put the first required character, or a negative number
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1725
  reqbyteptr     place to put the last required character, or a negative number
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1726
  cd             points to the data block with tables pointers etc.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1727
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1728
Returns:      true on success
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1729
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1730
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1731
static bool
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1732
compileBracket(int options, int* brackets, unsigned char** codePtr,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1733
    const UChar** ptrPtr, const UChar* patternEnd, ErrorCode* errorCodePtr, int skipBytes,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1734
    int* firstbyteptr, int* reqbyteptr, CompileData& cd)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1735
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1736
    const UChar* ptr = *ptrPtr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1737
    unsigned char* code = *codePtr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1738
    unsigned char* lastBranch = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1739
    unsigned char* start_bracket = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1740
    int firstByte = REQ_UNSET;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1741
    int reqByte = REQ_UNSET;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1742
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1743
    /* Offset is set zero to mark that this bracket is still open */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1744
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1745
    putLinkValueAllowZero(code + 1, 0);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1746
    code += 1 + LINK_SIZE + skipBytes;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1747
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1748
    /* Loop for each alternative branch */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1749
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1750
    while (true) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1751
        /* Now compile the branch */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1752
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1753
        int branchFirstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1754
        int branchReqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1755
        if (!compileBranch(options, brackets, &code, &ptr, patternEnd, errorCodePtr,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1756
                            &branchFirstByte, &branchReqByte, cd)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1757
            *ptrPtr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1758
            return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1759
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1760
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1761
        /* If this is the first branch, the firstByte and reqByte values for the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1762
         branch become the values for the regex. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1763
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1764
        if (*lastBranch != OP_ALT) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1765
            firstByte = branchFirstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1766
            reqByte = branchReqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1767
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1768
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1769
        /* If this is not the first branch, the first char and reqByte have to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1770
         match the values from all the previous branches, except that if the previous
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1771
         value for reqByte didn't have REQ_VARY set, it can still match, and we set
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1772
         REQ_VARY for the regex. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1773
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1774
        else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1775
            /* If we previously had a firstByte, but it doesn't match the new branch,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1776
             we have to abandon the firstByte for the regex, but if there was previously
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1777
             no reqByte, it takes on the value of the old firstByte. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1778
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1779
            if (firstByte >= 0 && firstByte != branchFirstByte) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1780
                if (reqByte < 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1781
                    reqByte = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1782
                firstByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1783
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1784
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1785
            /* If we (now or from before) have no firstByte, a firstByte from the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1786
             branch becomes a reqByte if there isn't a branch reqByte. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1787
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1788
            if (firstByte < 0 && branchFirstByte >= 0 && branchReqByte < 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1789
                branchReqByte = branchFirstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1790
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1791
            /* Now ensure that the reqbytes match */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1792
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1793
            if ((reqByte & ~REQ_VARY) != (branchReqByte & ~REQ_VARY))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1794
                reqByte = REQ_NONE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1795
            else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1796
                reqByte |= branchReqByte;   /* To "or" REQ_VARY */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1797
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1798
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1799
        /* Reached end of expression, either ')' or end of pattern. Go back through
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1800
         the alternative branches and reverse the chain of offsets, with the field in
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1801
         the BRA item now becoming an offset to the first alternative. If there are
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1802
         no alternatives, it points to the end of the group. The length in the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1803
         terminating ket is always the length of the whole bracketed item. If any of
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1804
         the ims options were changed inside the group, compile a resetting op-code
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1805
         following, except at the very end of the pattern. Return leaving the pointer
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1806
         at the terminating char. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1807
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1808
        if (ptr >= patternEnd || *ptr != '|') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1809
            int length = code - lastBranch;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1810
            do {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1811
                int prevLength = getLinkValueAllowZero(lastBranch + 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1812
                putLinkValue(lastBranch + 1, length);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1813
                length = prevLength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1814
                lastBranch -= length;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1815
            } while (length > 0);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1816
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1817
            /* Fill in the ket */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1818
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1819
            *code = OP_KET;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1820
            putLinkValue(code + 1, code - start_bracket);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1821
            code += 1 + LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1822
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1823
            /* Set values to pass back */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1824
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1825
            *codePtr = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1826
            *ptrPtr = ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1827
            *firstbyteptr = firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1828
            *reqbyteptr = reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1829
            return true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1830
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1831
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1832
        /* Another branch follows; insert an "or" node. Its length field points back
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1833
         to the previous branch while the bracket remains open. At the end the chain
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1834
         is reversed. It's done like this so that the start of the bracket has a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1835
         zero offset until it is closed, making it possible to detect recursion. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1836
        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1837
        *code = OP_ALT;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1838
        putLinkValue(code + 1, code - lastBranch);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1839
        lastBranch = code;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1840
        code += 1 + LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1841
        ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1842
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1843
    ASSERT_NOT_REACHED();
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1844
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1845
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1846
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1847
*          Check for anchored expression         *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1848
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1849
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1850
/* Try to find out if this is an anchored regular expression. Consider each
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1851
alternative branch. If they all start OP_CIRC, or with a bracket
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1852
all of whose alternatives start OP_CIRC (recurse ad lib), then
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1853
it's anchored.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1854
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1855
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1856
  code          points to start of expression (the bracket)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1857
  captureMap    a bitmap of which brackets we are inside while testing; this
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1858
                 handles up to substring 31; all brackets after that share
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1859
                 the zero bit
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1860
  backrefMap    the back reference bitmap
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1861
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1862
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1863
static bool branchIsAnchored(const unsigned char* code)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1864
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1865
    const unsigned char* scode = firstSignificantOpcode(code);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1866
    int op = *scode;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1867
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1868
    /* Brackets */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1869
    if (op >= OP_BRA || op == OP_ASSERT)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1870
        return bracketIsAnchored(scode);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1871
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1872
    /* Check for explicit anchoring */    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1873
    return op == OP_CIRC;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1874
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1875
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1876
static bool bracketIsAnchored(const unsigned char* code)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1877
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1878
    do {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1879
        if (!branchIsAnchored(code + 1 + LINK_SIZE))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1880
            return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1881
        code += getLinkValue(code + 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1882
    } while (*code == OP_ALT);   /* Loop for each alternative */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1883
    return true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1884
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1885
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1886
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1887
*         Check for starting with ^ or .*        *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1888
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1889
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1890
/* This is called to find out if every branch starts with ^ or .* so that
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1891
"first char" processing can be done to speed things up in multiline
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1892
matching and for non-DOTALL patterns that start with .* (which must start at
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1893
the beginning or after \n)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1894
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1895
Except when the .* appears inside capturing parentheses, and there is a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1896
subsequent back reference to those parentheses. By keeping a bitmap of the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1897
first 31 back references, we can catch some of the more common cases more
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1898
precisely; all the greater back references share a single bit.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1899
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1900
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1901
  code          points to start of expression (the bracket)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1902
  captureMap    a bitmap of which brackets we are inside while testing; this
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1903
                 handles up to substring 31; all brackets after that share
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1904
                 the zero bit
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1905
  backrefMap    the back reference bitmap
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1906
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1907
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1908
static bool branchNeedsLineStart(const unsigned char* code, unsigned captureMap, unsigned backrefMap)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1909
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1910
    const unsigned char* scode = firstSignificantOpcode(code);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1911
    int op = *scode;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1912
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1913
    /* Capturing brackets */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1914
    if (op > OP_BRA) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1915
        int captureNum = op - OP_BRA;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1916
        if (captureNum > EXTRACT_BASIC_MAX)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1917
            captureNum = get2ByteValue(scode + 2 + LINK_SIZE);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1918
        int bracketMask = (captureNum < 32) ? (1 << captureNum) : 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1919
        return bracketNeedsLineStart(scode, captureMap | bracketMask, backrefMap);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1920
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1921
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1922
    /* Other brackets */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1923
    if (op == OP_BRA || op == OP_ASSERT)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1924
        return bracketNeedsLineStart(scode, captureMap, backrefMap);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1925
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1926
    /* .* means "start at start or after \n" if it isn't in brackets that
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1927
     may be referenced. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1928
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1929
    if (op == OP_TYPESTAR || op == OP_TYPEMINSTAR)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1930
        return scode[1] == OP_NOT_NEWLINE && !(captureMap & backrefMap);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1931
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1932
    /* Explicit ^ */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1933
    return op == OP_CIRC || op == OP_BOL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1934
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1935
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1936
static bool bracketNeedsLineStart(const unsigned char* code, unsigned captureMap, unsigned backrefMap)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1937
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1938
    do {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1939
        if (!branchNeedsLineStart(code + 1 + LINK_SIZE, captureMap, backrefMap))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1940
            return false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1941
        code += getLinkValue(code + 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1942
    } while (*code == OP_ALT);  /* Loop for each alternative */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1943
    return true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1944
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1945
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1946
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1947
*       Check for asserted fixed first char      *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1948
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1949
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1950
/* During compilation, the "first char" settings from forward assertions are
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1951
discarded, because they can cause conflicts with actual literals that follow.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1952
However, if we end up without a first char setting for an unanchored pattern,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1953
it is worth scanning the regex to see if there is an initial asserted first
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1954
char. If all branches start with the same asserted char, or with a bracket all
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1955
of whose alternatives start with the same asserted char (recurse ad lib), then
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1956
we return that char, otherwise -1.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1957
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1958
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1959
  code       points to start of expression (the bracket)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1960
  options    pointer to the options (used to check casing changes)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1961
  inassert   true if in an assertion
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1962
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1963
Returns:     -1 or the fixed first char
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1964
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1965
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1966
static int branchFindFirstAssertedCharacter(const unsigned char* code, bool inassert)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1967
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1968
    const unsigned char* scode = firstSignificantOpcodeSkippingAssertions(code);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1969
    int op = *scode;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1970
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1971
    if (op >= OP_BRA)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1972
        op = OP_BRA;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1973
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1974
    switch (op) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1975
        default:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1976
            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1977
            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1978
        case OP_BRA:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1979
        case OP_ASSERT:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1980
            return bracketFindFirstAssertedCharacter(scode, op == OP_ASSERT);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1981
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1982
        case OP_EXACT:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1983
            scode += 2;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1984
            /* Fall through */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1985
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1986
        case OP_CHAR:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1987
        case OP_CHAR_IGNORING_CASE:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1988
        case OP_ASCII_CHAR:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1989
        case OP_ASCII_LETTER_IGNORING_CASE:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1990
        case OP_PLUS:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1991
        case OP_MINPLUS:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1992
            if (!inassert)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1993
                return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1994
            return scode[1];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1995
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1996
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1997
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1998
static int bracketFindFirstAssertedCharacter(const unsigned char* code, bool inassert)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  1999
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2000
    int c = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2001
    do {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2002
        int d = branchFindFirstAssertedCharacter(code + 1 + LINK_SIZE, inassert);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2003
        if (d < 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2004
            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2005
        if (c < 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2006
            c = d;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2007
        else if (c != d)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2008
            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2009
        code += getLinkValue(code + 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2010
    } while (*code == OP_ALT);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2011
    return c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2012
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2013
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2014
static inline int multiplyWithOverflowCheck(int a, int b)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2015
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2016
    if (!a || !b)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2017
        return 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2018
    if (a > MAX_PATTERN_SIZE / b)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2019
        return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2020
    return a * b;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2021
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2022
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2023
static int calculateCompiledPatternLength(const UChar* pattern, int patternLength, JSRegExpIgnoreCaseOption ignoreCase,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2024
    CompileData& cd, ErrorCode& errorcode)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2025
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2026
    /* Make a pass over the pattern to compute the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2027
     amount of store required to hold the compiled code. This does not have to be
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2028
     perfect as long as errors are overestimates. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2029
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2030
    if (patternLength > MAX_PATTERN_SIZE) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2031
        errorcode = ERR16;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2032
        return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2033
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2034
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2035
    int length = 1 + LINK_SIZE;      /* For initial BRA plus length */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2036
    int branch_extra = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2037
    int lastitemlength = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2038
    unsigned brastackptr = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2039
    FixedArray<int, BRASTACK_SIZE> brastack;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2040
    FixedArray<unsigned char, BRASTACK_SIZE> bralenstack;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2041
    int bracount = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2042
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2043
    const UChar* ptr = (const UChar*)(pattern - 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2044
    const UChar* patternEnd = (const UChar*)(pattern + patternLength);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2045
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2046
    while (++ptr < patternEnd) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2047
        int minRepeats = 0, maxRepeats = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2048
        int c = *ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2049
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2050
        switch (c) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2051
            /* A backslashed item may be an escaped data character or it may be a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2052
             character type. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2053
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2054
            case '\\':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2055
                c = checkEscape(&ptr, patternEnd, &errorcode, cd.numCapturingBrackets, false);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2056
                if (errorcode != 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2057
                    return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2058
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2059
                lastitemlength = 1;     /* Default length of last item for repeats */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2060
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2061
                if (c >= 0) {            /* Data character */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2062
                    length += 2;          /* For a one-byte character */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2063
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2064
                    if (c > 127) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2065
                        int i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2066
                        for (i = 0; i < jsc_pcre_utf8_table1_size; i++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2067
                            if (c <= jsc_pcre_utf8_table1[i]) break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2068
                        length += i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2069
                        lastitemlength += i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2070
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2071
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2072
                    continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2073
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2074
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2075
                /* Other escapes need one byte */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2076
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2077
                length++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2078
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2079
                /* A back reference needs an additional 2 bytes, plus either one or 5
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2080
                 bytes for a repeat. We also need to keep the value of the highest
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2081
                 back reference. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2082
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2083
                if (c <= -ESC_REF) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2084
                    int refnum = -c - ESC_REF;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2085
                    cd.backrefMap |= (refnum < 32) ? (1 << refnum) : 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2086
                    if (refnum > cd.topBackref)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2087
                        cd.topBackref = refnum;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2088
                    length += 2;   /* For single back reference */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2089
                    if (safelyCheckNextChar(ptr, patternEnd, '{') && isCountedRepeat(ptr + 2, patternEnd)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2090
                        ptr = readRepeatCounts(ptr + 2, &minRepeats, &maxRepeats, &errorcode);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2091
                        if (errorcode)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2092
                            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2093
                        if ((minRepeats == 0 && (maxRepeats == 1 || maxRepeats == -1)) ||
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2094
                            (minRepeats == 1 && maxRepeats == -1))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2095
                            length++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2096
                        else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2097
                            length += 5;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2098
                        if (safelyCheckNextChar(ptr, patternEnd, '?'))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2099
                            ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2100
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2101
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2102
                continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2103
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2104
            case '^':     /* Single-byte metacharacters */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2105
            case '.':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2106
            case '$':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2107
                length++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2108
                lastitemlength = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2109
                continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2110
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2111
            case '*':            /* These repeats won't be after brackets; */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2112
            case '+':            /* those are handled separately */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2113
            case '?':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2114
                length++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2115
                goto POSSESSIVE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2116
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2117
            /* This covers the cases of braced repeats after a single char, metachar,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2118
             class, or back reference. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2119
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2120
            case '{':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2121
                if (!isCountedRepeat(ptr + 1, patternEnd))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2122
                    goto NORMAL_CHAR;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2123
                ptr = readRepeatCounts(ptr + 1, &minRepeats, &maxRepeats, &errorcode);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2124
                if (errorcode != 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2125
                    return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2126
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2127
                /* These special cases just insert one extra opcode */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2128
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2129
                if ((minRepeats == 0 && (maxRepeats == 1 || maxRepeats == -1)) ||
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2130
                    (minRepeats == 1 && maxRepeats == -1))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2131
                    length++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2132
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2133
                /* These cases might insert additional copies of a preceding character. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2134
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2135
                else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2136
                    if (minRepeats != 1) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2137
                        length -= lastitemlength;   /* Uncount the original char or metachar */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2138
                        if (minRepeats > 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2139
                            length += 3 + lastitemlength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2140
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2141
                    length += lastitemlength + ((maxRepeats > 0) ? 3 : 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2142
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2143
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2144
                if (safelyCheckNextChar(ptr, patternEnd, '?'))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2145
                    ptr++;      /* Needs no extra length */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2146
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2147
            POSSESSIVE:                     /* Test for possessive quantifier */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2148
                if (safelyCheckNextChar(ptr, patternEnd, '+')) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2149
                    ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2150
                    length += 2 + 2 * LINK_SIZE;   /* Allow for atomic brackets */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2151
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2152
                continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2153
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2154
            /* An alternation contains an offset to the next branch or ket. If any ims
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2155
             options changed in the previous branch(es), and/or if we are in a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2156
             lookbehind assertion, extra space will be needed at the start of the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2157
             branch. This is handled by branch_extra. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2158
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2159
            case '|':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2160
                if (brastackptr == 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2161
                    cd.needOuterBracket = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2162
                length += 1 + LINK_SIZE + branch_extra;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2163
                continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2164
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2165
            /* A character class uses 33 characters provided that all the character
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2166
             values are less than 256. Otherwise, it uses a bit map for low valued
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2167
             characters, and individual items for others. Don't worry about character
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2168
             types that aren't allowed in classes - they'll get picked up during the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2169
             compile. A character class that contains only one single-byte character
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2170
             uses 2 or 3 bytes, depending on whether it is negated or not. Notice this
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2171
             where we can. (In UTF-8 mode we can do this only for chars < 128.) */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2172
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2173
            case '[': {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2174
                int class_optcount;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2175
                if (*(++ptr) == '^') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2176
                    class_optcount = 10;  /* Greater than one */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2177
                    ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2178
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2179
                else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2180
                    class_optcount = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2181
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2182
                bool class_utf8 = false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2183
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2184
                for (; ptr < patternEnd && *ptr != ']'; ++ptr) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2185
                    /* Check for escapes */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2186
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2187
                    if (*ptr == '\\') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2188
                        c = checkEscape(&ptr, patternEnd, &errorcode, cd.numCapturingBrackets, true);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2189
                        if (errorcode != 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2190
                            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2191
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2192
                        /* Handle escapes that turn into characters */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2193
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2194
                        if (c >= 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2195
                            goto NON_SPECIAL_CHARACTER;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2196
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2197
                        /* Escapes that are meta-things. The normal ones just affect the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2198
                         bit map, but Unicode properties require an XCLASS extended item. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2199
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2200
                        else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2201
                            class_optcount = 10;         /* \d, \s etc; make sure > 1 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2202
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2203
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2204
                    /* Anything else increments the possible optimization count. We have to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2205
                     detect ranges here so that we can compute the number of extra ranges for
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2206
                     caseless wide characters when UCP support is available. If there are wide
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2207
                     characters, we are going to have to use an XCLASS, even for single
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2208
                     characters. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2209
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2210
                    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2211
                        c = *ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2212
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2213
                        /* Come here from handling \ above when it escapes to a char value */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2214
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2215
                    NON_SPECIAL_CHARACTER:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2216
                        class_optcount++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2217
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2218
                        int d = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2219
                        if (safelyCheckNextChar(ptr, patternEnd, '-')) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2220
                            const UChar* hyptr = ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2221
                            if (safelyCheckNextChar(ptr, patternEnd, '\\')) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2222
                                ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2223
                                d = checkEscape(&ptr, patternEnd, &errorcode, cd.numCapturingBrackets, true);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2224
                                if (errorcode != 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2225
                                    return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2226
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2227
                            else if ((ptr + 1 < patternEnd) && ptr[1] != ']')
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2228
                                d = *++ptr;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2229
                            if (d < 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2230
                                ptr = hyptr;      /* go back to hyphen as data */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2231
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2232
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2233
                        /* If d >= 0 we have a range. In UTF-8 mode, if the end is > 255, or >
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2234
                         127 for caseless matching, we will need to use an XCLASS. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2235
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2236
                        if (d >= 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2237
                            class_optcount = 10;     /* Ensure > 1 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2238
                            if (d < c) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2239
                                errorcode = ERR8;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2240
                                return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2241
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2242
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2243
                            if ((d > 255 || (ignoreCase && d > 127))) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2244
                                unsigned char buffer[6];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2245
                                if (!class_utf8)         /* Allow for XCLASS overhead */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2246
                                {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2247
                                    class_utf8 = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2248
                                    length += LINK_SIZE + 2;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2249
                                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2250
                                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2251
                                /* If we have UCP support, find out how many extra ranges are
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2252
                                 needed to map the other case of characters within this range. We
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2253
                                 have to mimic the range optimization here, because extending the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2254
                                 range upwards might push d over a boundary that makes it use
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2255
                                 another byte in the UTF-8 representation. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2256
                                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2257
                                if (ignoreCase) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2258
                                    int occ, ocd;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2259
                                    int cc = c;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2260
                                    int origd = d;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2261
                                    while (getOthercaseRange(&cc, origd, &occ, &ocd)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2262
                                        if (occ >= c && ocd <= d)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2263
                                            continue;   /* Skip embedded */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2264
                                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2265
                                        if (occ < c  && ocd >= c - 1)  /* Extend the basic range */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2266
                                        {                            /* if there is overlap,   */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2267
                                            c = occ;                     /* noting that if occ < c */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2268
                                            continue;                    /* we can't have ocd > d  */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2269
                                        }                            /* because a subrange is  */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2270
                                        if (ocd > d && occ <= d + 1)   /* always shorter than    */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2271
                                        {                            /* the basic range.       */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2272
                                            d = ocd;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2273
                                            continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2274
                                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2275
                                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2276
                                        /* An extra item is needed */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2277
                                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2278
                                        length += 1 + encodeUTF8(occ, buffer) +
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2279
                                        ((occ == ocd) ? 0 : encodeUTF8(ocd, buffer));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2280
                                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2281
                                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2282
                                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2283
                                /* The length of the (possibly extended) range */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2284
                                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2285
                                length += 1 + encodeUTF8(c, buffer) + encodeUTF8(d, buffer);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2286
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2287
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2288
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2289
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2290
                        /* We have a single character. There is nothing to be done unless we
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2291
                         are in UTF-8 mode. If the char is > 255, or 127 when caseless, we must
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2292
                         allow for an XCL_SINGLE item, doubled for caselessness if there is UCP
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2293
                         support. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2294
                        
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2295
                        else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2296
                            if ((c > 255 || (ignoreCase && c > 127))) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2297
                                unsigned char buffer[6];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2298
                                class_optcount = 10;     /* Ensure > 1 */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2299
                                if (!class_utf8)         /* Allow for XCLASS overhead */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2300
                                {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2301
                                    class_utf8 = true;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2302
                                    length += LINK_SIZE + 2;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2303
                                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2304
                                length += (ignoreCase ? 2 : 1) * (1 + encodeUTF8(c, buffer));
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2305
                            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2306
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2307
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2308
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2309
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2310
                if (ptr >= patternEnd) {   /* Missing terminating ']' */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2311
                    errorcode = ERR6;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2312
                    return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2313
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2314
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2315
                /* We can optimize when there was only one optimizable character.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2316
                 Note that this does not detect the case of a negated single character.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2317
                 In that case we do an incorrect length computation, but it's not a serious
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2318
                 problem because the computed length is too large rather than too small. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2319
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2320
                if (class_optcount == 1)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2321
                    goto NORMAL_CHAR;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2322
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2323
                /* Here, we handle repeats for the class opcodes. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2324
                {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2325
                    length += 33;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2326
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2327
                    /* A repeat needs either 1 or 5 bytes. If it is a possessive quantifier,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2328
                     we also need extra for wrapping the whole thing in a sub-pattern. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2329
                    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2330
                    if (safelyCheckNextChar(ptr, patternEnd, '{') && isCountedRepeat(ptr + 2, patternEnd)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2331
                        ptr = readRepeatCounts(ptr + 2, &minRepeats, &maxRepeats, &errorcode);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2332
                        if (errorcode != 0)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2333
                            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2334
                        if ((minRepeats == 0 && (maxRepeats == 1 || maxRepeats == -1)) ||
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2335
                            (minRepeats == 1 && maxRepeats == -1))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2336
                            length++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2337
                        else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2338
                            length += 5;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2339
                        if (safelyCheckNextChar(ptr, patternEnd, '+')) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2340
                            ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2341
                            length += 2 + 2 * LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2342
                        } else if (safelyCheckNextChar(ptr, patternEnd, '?'))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2343
                            ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2344
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2345
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2346
                continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2347
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2348
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2349
            /* Brackets may be genuine groups or special things */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2350
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2351
            case '(': {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2352
                int branch_newextra = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2353
                int bracket_length = 1 + LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2354
                bool capturing = false;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2355
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2356
                /* Handle special forms of bracket, which all start (? */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2357
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2358
                if (safelyCheckNextChar(ptr, patternEnd, '?')) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2359
                    switch (c = (ptr + 2 < patternEnd ? ptr[2] : 0)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2360
                        /* Non-referencing groups and lookaheads just move the pointer on, and
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2361
                         then behave like a non-special bracket, except that they don't increment
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2362
                         the count of extracting brackets. Ditto for the "once only" bracket,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2363
                         which is in Perl from version 5.005. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2364
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2365
                        case ':':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2366
                        case '=':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2367
                        case '!':
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2368
                            ptr += 2;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2369
                            break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2370
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2371
                        /* Else loop checking valid options until ) is met. Anything else is an
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2372
                         error. If we are without any brackets, i.e. at top level, the settings
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2373
                         act as if specified in the options, so massage the options immediately.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2374
                         This is for backward compatibility with Perl 5.004. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2375
                            
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2376
                        default:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2377
                            errorcode = ERR12;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2378
                            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2379
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2380
                } else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2381
                    capturing = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2382
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2383
                /* Capturing brackets must be counted so we can process escapes in a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2384
                 Perlish way. If the number exceeds EXTRACT_BASIC_MAX we are going to need
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2385
                 an additional 3 bytes of memory per capturing bracket. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2386
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2387
                if (capturing) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2388
                    bracount++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2389
                    if (bracount > EXTRACT_BASIC_MAX)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2390
                        bracket_length += 3;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2391
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2392
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2393
                /* Save length for computing whole length at end if there's a repeat that
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2394
                 requires duplication of the group. Also save the current value of
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2395
                 branch_extra, and start the new group with the new value. If non-zero, this
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2396
                 will either be 2 for a (?imsx: group, or 3 for a lookbehind assertion. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2397
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2398
                if (brastackptr >= sizeof(brastack)/sizeof(int)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2399
                    errorcode = ERR17;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2400
                    return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2401
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2402
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2403
                bralenstack[brastackptr] = branch_extra;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2404
                branch_extra = branch_newextra;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2405
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2406
                brastack[brastackptr++] = length;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2407
                length += bracket_length;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2408
                continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2409
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2410
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2411
            /* Handle ket. Look for subsequent maxRepeats/minRepeats; for certain sets of values we
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2412
             have to replicate this bracket up to that many times. If brastackptr is
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2413
             0 this is an unmatched bracket which will generate an error, but take care
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2414
             not to try to access brastack[-1] when computing the length and restoring
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2415
             the branch_extra value. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2416
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2417
            case ')': {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2418
                int duplength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2419
                length += 1 + LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2420
                if (brastackptr > 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2421
                    duplength = length - brastack[--brastackptr];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2422
                    branch_extra = bralenstack[brastackptr];
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2423
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2424
                else
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2425
                    duplength = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2426
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2427
                /* Leave ptr at the final char; for readRepeatCounts this happens
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2428
                 automatically; for the others we need an increment. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2429
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2430
                if ((ptr + 1 < patternEnd) && (c = ptr[1]) == '{' && isCountedRepeat(ptr + 2, patternEnd)) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2431
                    ptr = readRepeatCounts(ptr + 2, &minRepeats, &maxRepeats, &errorcode);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2432
                    if (errorcode)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2433
                        return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2434
                } else if (c == '*') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2435
                    minRepeats = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2436
                    maxRepeats = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2437
                    ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2438
                } else if (c == '+') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2439
                    minRepeats = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2440
                    maxRepeats = -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2441
                    ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2442
                } else if (c == '?') {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2443
                    minRepeats = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2444
                    maxRepeats = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2445
                    ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2446
                } else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2447
                    minRepeats = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2448
                    maxRepeats = 1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2449
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2450
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2451
                /* If the minimum is zero, we have to allow for an OP_BRAZERO before the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2452
                 group, and if the maximum is greater than zero, we have to replicate
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2453
                 maxval-1 times; each replication acquires an OP_BRAZERO plus a nesting
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2454
                 bracket set. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2455
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2456
                int repeatsLength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2457
                if (minRepeats == 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2458
                    length++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2459
                    if (maxRepeats > 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2460
                        repeatsLength = multiplyWithOverflowCheck(maxRepeats - 1, duplength + 3 + 2 * LINK_SIZE);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2461
                        if (repeatsLength < 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2462
                            errorcode = ERR16;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2463
                            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2464
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2465
                        length += repeatsLength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2466
                        if (length > MAX_PATTERN_SIZE) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2467
                            errorcode = ERR16;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2468
                            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2469
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2470
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2471
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2472
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2473
                /* When the minimum is greater than zero, we have to replicate up to
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2474
                 minval-1 times, with no additions required in the copies. Then, if there
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2475
                 is a limited maximum we have to replicate up to maxval-1 times allowing
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2476
                 for a BRAZERO item before each optional copy and nesting brackets for all
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2477
                 but one of the optional copies. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2478
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2479
                else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2480
                    repeatsLength = multiplyWithOverflowCheck(minRepeats - 1, duplength);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2481
                    if (repeatsLength < 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2482
                        errorcode = ERR16;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2483
                        return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2484
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2485
                    length += repeatsLength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2486
                    if (maxRepeats > minRepeats) { /* Need this test as maxRepeats=-1 means no limit */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2487
                        repeatsLength = multiplyWithOverflowCheck(maxRepeats - minRepeats, duplength + 3 + 2 * LINK_SIZE);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2488
                        if (repeatsLength < 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2489
                            errorcode = ERR16;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2490
                            return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2491
                        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2492
                        length += repeatsLength - (2 + 2 * LINK_SIZE);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2493
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2494
                    if (length > MAX_PATTERN_SIZE) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2495
                        errorcode = ERR16;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2496
                        return -1;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2497
                    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2498
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2499
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2500
                /* Allow space for once brackets for "possessive quantifier" */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2501
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2502
                if (safelyCheckNextChar(ptr, patternEnd, '+')) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2503
                    ptr++;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2504
                    length += 2 + 2 * LINK_SIZE;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2505
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2506
                continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2507
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2508
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2509
            /* Non-special character. It won't be space or # in extended mode, so it is
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2510
             always a genuine character. If we are in a \Q...\E sequence, check for the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2511
             end; if not, we have a literal. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2512
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2513
            default:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2514
            NORMAL_CHAR:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2515
                length += 2;          /* For a one-byte character */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2516
                lastitemlength = 1;   /* Default length of last item for repeats */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2517
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2518
                if (c > 127) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2519
                    int i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2520
                    for (i = 0; i < jsc_pcre_utf8_table1_size; i++)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2521
                        if (c <= jsc_pcre_utf8_table1[i])
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2522
                            break;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2523
                    length += i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2524
                    lastitemlength += i;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2525
                }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2526
                
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2527
                continue;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2528
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2529
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2530
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2531
    length += 2 + LINK_SIZE;    /* For final KET and END */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2532
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2533
    cd.numCapturingBrackets = bracount;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2534
    return length;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2535
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2536
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2537
/*************************************************
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2538
*        Compile a Regular Expression            *
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2539
*************************************************/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2540
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2541
/* This function takes a string and returns a pointer to a block of store
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2542
holding a compiled version of the expression. The original API for this
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2543
function had no error code return variable; it is retained for backwards
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2544
compatibility. The new function is given a new name.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2545
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2546
Arguments:
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2547
  pattern       the regular expression
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2548
  options       various option bits
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2549
  errorCodePtr  pointer to error code variable (pcre_compile2() only)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2550
                  can be NULL if you don't want a code value
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2551
  errorPtr      pointer to pointer to error text
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2552
  erroroffset   ptr offset in pattern where error was detected
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2553
  tables        pointer to character tables or NULL
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2554
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2555
Returns:        pointer to compiled data block, or NULL on error,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2556
                with errorPtr and erroroffset set
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2557
*/
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2558
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2559
static inline JSRegExp* returnError(ErrorCode errorcode, const char** errorPtr)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2560
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2561
    *errorPtr = errorText(errorcode);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2562
    return 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2563
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2564
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2565
JSRegExp* jsRegExpCompile(const UChar* pattern, int patternLength,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2566
                JSRegExpIgnoreCaseOption ignoreCase, JSRegExpMultilineOption multiline,
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2567
                unsigned* numSubpatterns, const char** errorPtr)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2568
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2569
    /* We can't pass back an error message if errorPtr is NULL; I guess the best we
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2570
     can do is just return NULL, but we can set a code value if there is a code pointer. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2571
    if (!errorPtr)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2572
        return 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2573
    *errorPtr = NULL;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2574
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2575
    CompileData cd;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2576
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2577
    ErrorCode errorcode = ERR0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2578
    /* Call this once just to count the brackets. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2579
    calculateCompiledPatternLength(pattern, patternLength, ignoreCase, cd, errorcode);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2580
    /* Call it again to compute the length. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2581
    int length = calculateCompiledPatternLength(pattern, patternLength, ignoreCase, cd, errorcode);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2582
    if (errorcode)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2583
        return returnError(errorcode, errorPtr);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2584
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2585
    if (length > MAX_PATTERN_SIZE)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2586
        return returnError(ERR16, errorPtr);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2587
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2588
    size_t size = length + sizeof(JSRegExp);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2589
#if REGEXP_HISTOGRAM
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2590
    size_t stringOffset = (size + sizeof(UChar) - 1) / sizeof(UChar) * sizeof(UChar);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2591
    size = stringOffset + patternLength * sizeof(UChar);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2592
#endif
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2593
    JSRegExp* re = reinterpret_cast<JSRegExp*>(new char[size]);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2594
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2595
    if (!re)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2596
        return returnError(ERR13, errorPtr);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2597
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2598
    re->options = (ignoreCase ? IgnoreCaseOption : 0) | (multiline ? MatchAcrossMultipleLinesOption : 0);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2599
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2600
    /* The starting points of the name/number translation table and of the code are
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2601
     passed around in the compile data block. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2602
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2603
    const unsigned char* codeStart = (const unsigned char*)(re + 1);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2604
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2605
    /* Set up a starting, non-extracting bracket, then compile the expression. On
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2606
     error, errorcode will be set non-zero, so we don't need to look at the result
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2607
     of the function here. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2608
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2609
    const UChar* ptr = (const UChar*)pattern;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2610
    const UChar* patternEnd = pattern + patternLength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2611
    unsigned char* code = const_cast<unsigned char*>(codeStart);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2612
    int firstByte, reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2613
    int bracketCount = 0;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2614
    if (!cd.needOuterBracket)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2615
        compileBranch(re->options, &bracketCount, &code, &ptr, patternEnd, &errorcode, &firstByte, &reqByte, cd);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2616
    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2617
        *code = OP_BRA;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2618
        compileBracket(re->options, &bracketCount, &code, &ptr, patternEnd, &errorcode, 0, &firstByte, &reqByte, cd);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2619
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2620
    re->topBracket = bracketCount;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2621
    re->topBackref = cd.topBackref;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2622
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2623
    /* If not reached end of pattern on success, there's an excess bracket. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2624
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2625
    if (errorcode == 0 && ptr < patternEnd)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2626
        errorcode = ERR10;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2627
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2628
    /* Fill in the terminating state and check for disastrous overflow, but
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2629
     if debugging, leave the test till after things are printed out. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2630
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2631
    *code++ = OP_END;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2632
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2633
    ASSERT(code - codeStart <= length);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2634
    if (code - codeStart > length)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2635
        errorcode = ERR7;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2636
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2637
    /* Give an error if there's back reference to a non-existent capturing
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2638
     subpattern. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2639
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2640
    if (re->topBackref > re->topBracket)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2641
        errorcode = ERR15;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2642
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2643
    /* Failed to compile, or error while post-processing */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2644
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2645
    if (errorcode != ERR0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2646
        delete [] reinterpret_cast<char*>(re);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2647
        return returnError(errorcode, errorPtr);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2648
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2649
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2650
    /* If the anchored option was not passed, set the flag if we can determine that
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2651
     the pattern is anchored by virtue of ^ characters or \A or anything else (such
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2652
     as starting with .* when DOTALL is set).
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2653
     
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2654
     Otherwise, if we know what the first character has to be, save it, because that
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2655
     speeds up unanchored matches no end. If not, see if we can set the
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2656
     UseMultiLineFirstByteOptimizationOption flag. This is helpful for multiline matches when all branches
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2657
     start with ^. and also when all branches start with .* for non-DOTALL matches.
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2658
     */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2659
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2660
    if (cd.needOuterBracket ? bracketIsAnchored(codeStart) : branchIsAnchored(codeStart))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2661
        re->options |= IsAnchoredOption;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2662
    else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2663
        if (firstByte < 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2664
            firstByte = (cd.needOuterBracket
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2665
                    ? bracketFindFirstAssertedCharacter(codeStart, false)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2666
                    : branchFindFirstAssertedCharacter(codeStart, false))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2667
                | ((re->options & IgnoreCaseOption) ? REQ_IGNORE_CASE : 0);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2668
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2669
        if (firstByte >= 0) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2670
            int ch = firstByte & 255;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2671
            if (ch < 127) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2672
                re->firstByte = ((firstByte & REQ_IGNORE_CASE) && flipCase(ch) == ch) ? ch : firstByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2673
                re->options |= UseFirstByteOptimizationOption;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2674
            }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2675
        } else {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2676
            if (cd.needOuterBracket ? bracketNeedsLineStart(codeStart, 0, cd.backrefMap) : branchNeedsLineStart(codeStart, 0, cd.backrefMap))
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2677
                re->options |= UseMultiLineFirstByteOptimizationOption;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2678
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2679
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2680
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2681
    /* For an anchored pattern, we use the "required byte" only if it follows a
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2682
     variable length item in the regex. Remove the caseless flag for non-caseable
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2683
     bytes. */
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2684
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2685
    if (reqByte >= 0 && (!(re->options & IsAnchoredOption) || (reqByte & REQ_VARY))) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2686
        int ch = reqByte & 255;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2687
        if (ch < 127) {
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2688
            re->reqByte = ((reqByte & REQ_IGNORE_CASE) && flipCase(ch) == ch) ? (reqByte & ~REQ_IGNORE_CASE) : reqByte;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2689
            re->options |= UseRequiredByteOptimizationOption;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2690
        }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2691
    }
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2692
    
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2693
#if REGEXP_HISTOGRAM
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2694
    re->stringOffset = stringOffset;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2695
    re->stringLength = patternLength;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2696
    memcpy(reinterpret_cast<char*>(re) + stringOffset, pattern, patternLength * 2);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2697
#endif
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2698
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2699
    if (numSubpatterns)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2700
        *numSubpatterns = re->topBracket;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2701
    return re;
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2702
}
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2703
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2704
void jsRegExpFree(JSRegExp* re)
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2705
{
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2706
    delete [] reinterpret_cast<char*>(re);
4f2f89ce4247 Revision: 201037
Dremov Kirill (Nokia-D-MSW/Tampere) <kirill.dremov@nokia.com>
parents:
diff changeset
  2707
}