products:pcre2:history [2019/03/07 18:01] (current)
 +====== YuPcre2: Version History ======
 +{{page>​header}}
 +=====YuPcre2 1.10.0 – 7 Mar 2019=====
 +
 +  * Fix: ''​TDIRegEx2_8.Replace''​ and ''​TDIRegEx2_16.Replace''​ did not return the start of the string if StartOffset > 0.
 +  * Adjust ''TDIRegEx2SearchStream_Enc'' to DIConverters 1.18.0: Converter functions now use the native unsigned integer type for the length of a string and support strings longer than 2 GB. This change only affects projects using DIConverters 1.18.0.
 +
 +=====YuPcre2 1.9.2 – 8 Jan 2019=====
 +
 +  * Matching the pattern ''​(*UTF)\C[^\v]+\x80''​ against an 8-bit string containing multi-code-unit characters caused bad behaviour and possibly a crash.
 +  * When returning an error from ''pcre2_pattern_convert'', ensure the error offset is set to zero for early errors.
 +  * Refactored ''​pcre2_dfa_match''​ so that the internal recursive calls no longer use the stack for local workspace and local ovectors. Instead, an initial block of stack is reserved, but if this is insufficient,​ heap memory is used. The heap limit parameter now applies to ''​pcre2_dfa_match''​.
 +  * In ''​pcre2_substitute'',​ with global matching, a pattern that matched an empty string, but never at the starting match offset, was not handled in a Perl-compatible way. The pattern ''​(%%<​%%?​=\G.)''​ is an example of such a pattern. Because ''​\G''​ is in a lookbehind assertion, there has to be a "​bumpalong"​ before there can be a match. The automatic "​advance by one character after an empty string match" rule is therefore inappropriate. A more complicated algorithm has now been implemented.
 +  * When checking to see if a lookbehind is of fixed length, lookaheads were correctly ignored, but quantifiers on lookaheads were not being ignored, leading to an incorrect "lookbehind assertion is not fixed length" error.
 +  * Updated to Unicode version 11.0.0. As well as the usual addition of new scripts and characters, this involved re-jigging the grapheme break property algorithm because Unicode has changed the way emojis are handled.
 +  * Fixed an obscure bug that struck when there were two atomic groups not separated by something with a backtracking point. There could be an incorrect backtrack into the first of the atomic groups. A complicated example is ''(?>a(*:1))(?>b)(*SKIP:1)x|.*'' matched against "abc", where the ''*SKIP'' shouldn't find a MARK (because it is in an atomic group), but it did.
 +  * ''​(*ACCEPT:​ARG)'',​ ''​(*FAIL:​ARG)'',​ and ''​(*COMMIT:​ARG)''​ are now supported.
 +  * A ''​(*MARK)''​ name was not being passed back for positive assertions that were terminated by ''​(*ACCEPT)''​.
 +  * Add support for ''​\N{U+dddd}'',​ but only in Unicode mode.
 +  * Add support for ''​(?​^)''​ for unsetting all ''​imnsx''​ options.
 +  * The ''​PCRE2_EXTENDED''​ (''/​x''​) option only ever discarded space characters whose code point was less than 256. Now, when Unicode support is compiled, ''​PCRE2_EXTENDED''​ also discards U+0085, U+200E, U+200F, U+2028, and U+2029, which are additional characters defined by Unicode as "​Pattern White Space"​. This makes PCRE2 compatible with Perl.
 +  * In certain circumstances,​ option settings within patterns were not being correctly processed. For example, the pattern ''​%%((%%?​i)A)(?​m)B''​ incorrectly matched "​ab"​. (The ''​(?​m)''​ setting lost the fact that ''​(?​i)''​ should be reset at the end of its group during the parse process, but without another setting such as ''​(?​m)''​ the compile phase got it right.)
 +  * When serializing a pattern, set the memctl, executable_jit,​ and tables fields (that is, all the fields that contain pointers) to zeros so that the result of serializing is always the same. These fields are re-set when the pattern is deserialized.
 +  * In a pattern such as ''​[^\x{100}-\x{ffff}]*[\x80-\xff]''​ which has a repeated negative class with no characters less than 0x100 followed by a positive class with only characters less than 0x100, the first class was incorrectly being auto-possessified,​ causing incorrect match failures.
 +  * If the only branch in a conditional subpattern was anchored, the whole subpattern was treated as anchored, when it should not have been, since the assumed empty second branch cannot be anchored. Demonstrated by test patterns such as ''​(?​(1)^())b''​ or ''​(?​(?​=^))b''​.
 +  * A repeated conditional subpattern that could match an empty string was always assumed to be unanchored. Now it is checked just like any other repeated conditional subpattern, and can be found to be anchored if the minimum quantifier is one or more.
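
The simple "advance by one character after an empty string match" rule that the ''pcre2_substitute'' entry above calls inappropriate for patterns like ''(%%<%%?=\G.)'' can be sketched as follows. This is only an illustration under stated assumptions: Python's ''re'' module stands in for the matching engine (it has no ''\G'' and is not how PCRE2 is implemented), and ''find_all_spans'' is a hypothetical helper name.

```python
import re
from typing import List, Tuple


def find_all_spans(pattern: str, subject: str) -> List[Tuple[int, int]]:
    """Collect global match spans, bumping along by one after an empty match."""
    regex = re.compile(pattern)
    spans = []
    pos = 0
    while pos <= len(subject):
        m = regex.search(subject, pos)
        if m is None:
            break
        spans.append(m.span())
        if m.end() > m.start():
            # Non-empty match: resume right after it.
            pos = m.end()
        else:
            # Empty match: retrying at the same offset would loop forever,
            # so step forward by one character (the naive rule).
            pos = m.end() + 1
    return spans


print(find_all_spans(r"a*", "baa"))  # [(0, 0), (1, 3), (3, 3)]
```

The rule works for ordinary patterns, but it assumes that an empty match can always occur at the current start offset, which is exactly what a lookbehind containing ''\G'' violates.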
 +
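The "Pattern White Space" set referred to in the ''PCRE2_EXTENDED'' entry above is small enough to spell out. The sketch below lists the eleven code points that Unicode assigns the Pattern_White_Space property and removes them from a pattern string. It is a deliberately naive illustration: the helper name is hypothetical, and it ignores escapes, character classes, and ''#'' comments, all of which a real extended-mode parser has to respect.

```python
# Unicode Pattern_White_Space: U+0009..U+000D, U+0020, U+0085,
# U+200E, U+200F, U+2028, U+2029.
PATTERN_WHITE_SPACE = frozenset(
    [chr(c) for c in range(0x09, 0x0E)]
    + ["\u0020", "\u0085", "\u200E", "\u200F", "\u2028", "\u2029"]
)


def strip_pattern_white_space(pattern: str) -> str:
    """Drop every Pattern White Space character (naive: no escape handling)."""
    return "".join(ch for ch in pattern if ch not in PATTERN_WHITE_SPACE)


print(strip_pattern_white_space("a \u200E b\u2028c"))  # abc
```
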
 +=====YuPcre2 1.9.1 – 1 Jan 2019=====
 +
 +  * Fix ''TDIRegEx2_16.MatchNext'' which might not have properly advanced the start offset if the previous match was an empty string.
 +  * In YuPcre2_RegEx2.pas,​ replace a few character constants with ordinal constants to work around duplicate case label errors with at least one Delphi 10.3 Rio installation.
 +
 +=====YuPcre2 1.9.0 – 24 Dec 2018=====
 +
 +  * Support Delphi 10.3 Rio Win32 and Win64.
 +
 +=====YuPcre2 1.8.0 – 2 Mar 2018=====
 +
 +  * Add new ''​pcre2_config''​ options: ''​PCRE2_CONFIG_NEVER_BACKSLASH_C''​ and ''​PCRE2_CONFIG_COMPILED_WIDTHS''​.
 +  * Defined public names for all the ''​pcre2_compile''​ error numbers.
 +  * When an assertion contained (*ACCEPT) it caused all open capturing groups to be closed (as for a non-assertion ACCEPT), which was wrong and could lead to misbehaviour for subsequent references to groups that started outside the assertion. ACCEPT in an assertion now closes only those groups that were started within that assertion.
 +  * Although ''​pcre2_jit_match''​ checks whether the pattern is compiled in a given mode, it was also expected that at least one mode is available. This is fixed and ''​pcre2_jit_match''​ returns with ''​PCRE2_ERROR_JIT_BADOPTION''​ when the pattern is not optimized by JIT at all.
 +  * If a backreference with a minimum repeat count of zero was first in a pattern, apart from assertions, an incorrect first matching character could be recorded. For example, for the pattern ''​(?​=(a))\1?​b'',​ "​b"​ was incorrectly set as the first character of a match.
 +  * Characters in a leading positive assertion are considered for recording a first character of a match when the rest of the pattern does not provide one. However, a character in a non-assertive group within a leading assertion such as in the pattern ''​(?​=(a))\1?​b''​ caused this process to fail. This was an infelicity rather than an outright bug, because it did not affect the result of a match, just its speed. (In fact, in this case, the starting '​a'​ was subsequently picked up in the study.)
 +  * Allocate a single callout block on the stack at the start of ''​pcre2_match''​ and set its never-changing fields once only. Do the same for ''​pcre2_dfa_match''​.
 +  * Save the extra compile options (set in the compile context) with the compiled pattern (they were not previously saved), add ''​PCRE2_INFO_EXTRAOPTIONS''​ to retrieve them.
 +  * Added ''​PCRE2_CALLOUT_STARTMATCH''​ and ''​PCRE2_CALLOUT_BACKTRACK''​ bits to a new field callout_flags in callout blocks. The bits are set by ''​pcre2_match'',​ but not by JIT or ''​pcre2_dfa_match''​. These bits are provided to help with tracking how a backtracking match is proceeding.
 +  * When ''​PCRE2_FIRSTLINE''​ without ''​PCRE2_NO_START_OPTIMIZE''​ was used in non-JIT matching (both ''​pcre2_match''​ and ''​pcre2_dfa_match''​) and the matched string started with the first code unit of a newline sequence, matching failed because it was not tried at the newline.
 +  * Code for giving up a non-partial match after failing to find a starting code unit anywhere in the subject was missing when searching for one of a number of code units (the bitmap case) in both ''​pcre2_match''​ and ''​pcre2_dfa_match''​. This was a missing optimization rather than a bug.
 +  * The JIT compiler has been updated.
 +  * Avoid pointer overflow for unset captures in ''​pcre2_substring_list_get''​. This could not actually cause a crash because it was always used in a memcpy() call with zero length.
 +  * Auto-possessification at the end of a capturing group was dependent on what follows the group (e.g. ''​(a+)b''​ would auto-possessify the ''​a+''​) but this caused incorrect behaviour when the group was called recursively from elsewhere in the pattern where something different might follow. Iterators at the ends of capturing groups are no longer considered for auto-possessification if the pattern contains any recursions.
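
Why "b" is the wrong first character for ''(?=(a))\1?b'' can be checked in any backtracking engine. The sketch below uses Python's ''re'' module as a stand-in (an assumption for illustration; it says nothing about PCRE2's internal first-character optimization), with ''match_text'' as a hypothetical helper.

```python
import re

PATTERN = re.compile(r"(?=(a))\1?b")


def match_text(subject):
    """Return the matched text, or None when there is no match."""
    m = PATTERN.search(subject)
    return m.group(0) if m else None


print(match_text("ab"))  # ab   (the match starts with "a", not "b")
print(match_text("b"))   # None (the lookahead requires an "a")
```

Because the leading lookahead forces an "a" at the match start, an engine that skips ahead to the first "b" before trying a match would start one character too late and miss the match in "ab".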
 +
 +=====YuPcre2 1.7.0 – 16 Aug 2017=====
 +
 +  * Implement ''​PCRE2_ENDANCHORED'',​ ''​coEndAnchored'',​ and ''​moEndAnchored''​.
 +  * Add an explicit limit on the amount of heap used by ''​pcre2_match'',​ set by ''​pcre2_set_heap_limit'',​ ''​TDIPerlRegEx2_8.HeapLimit'',​ ''​TDIDfaRegEx2_16.HeapLimit'',​ and the pattern start ''​(*LIMIT_HEAP=xxx)''​.
 +  * Extend auto-anchoring etc. to ignore groups with a zero quantifier and single-branch conditions with a false condition (e.g. DEFINE) at the start of a branch. For example, ''(?(DEFINE)...)^A'' and ''(...){0}^B'' are now flagged as anchored.
 +  * Implement ''​PCRE2_EXTENDED_MORE''​ and ''​coExtendedMore'',​ and related ''/​xx''​ and ''​(?​xx)''​ features.
 +  * Implement ''​(?​n:''​ for ''​PCRE2_NO_AUTO_CAPTURE''​ and ''​coNoAutoCapture'',​ because Perl now has this.
 +  * Implement extra compile options in the compile context:
 +    * ''​PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES''​ and ''​coAllowSurrogateEscapes'';​
 +    * ''​PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL''​ and ''​coBadEscapeIsLiteral'';​
 +    * ''​PCRE2_EXTRA_MATCH_LINE''​ and ''​coMatchLine'';​
 +    * ''​PCRE2_EXTRA_MATCH_WORD''​ and ''​coMatchWord''​.
 +  * Implement newline type ''​PCRE2_NEWLINE_NUL''​.
 +  * A lookbehind assertion that had a zero-length branch caused undefined behaviour when processed by ''​pcre2_dfa_match''​.
 +  * The match limit value now also applies to ''​pcre2_dfa_match''​ as there are patterns that can use up a lot of resources without necessarily recursing very deeply.
 +  * Implement ''​PCRE2_LITERAL''​ and ''​coLiteral''​.
 +  * Increased the limit for searching for a "must be present"​ code unit in subjects from 1000 to 2000 for 8-bit searches, since they are much faster.
 +  * Arrange for anchored patterns to record and use "first code unit" data, because this can give a fast "no match" without searching for a "​required code unit". Previously only non-anchored patterns did this.
 +  * Upgraded the Unicode tables from Unicode 8.0.0 to Unicode 10.0.0.
 +  * Update extended grapheme breaking rules to the latest set that are in Unicode Standard Annex #29.
 +  * Added experimental foreign pattern conversion facilities (''​pcre2_pattern_convert''​ and friends).
 +  * If a hyphen that follows a character class is the last character in the class, Perl does not give a warning. PCRE2 now also treats this as a literal.
 +  * PCRE2 was not throwing an error for ''​[\d-X]''​ (and similar escapes), as is documented.
 +
 +=====YuPcre2 1.6.0 – 3 Apr 2017=====
 +
 +**New features:**
 +
 +  * Support Delphi 10.2 Tokyo Win32 and Win64.
 +  * The main interpreter,​ ''​pcre2_match'',​ has been refactored into a new version that does not use recursive function calls (and therefore the stack) for remembering backtracking positions. The new implementation allows backtracking into recursive group calls in patterns, making it more compatible with Perl, and also fixes some other hard-to-do issues.
 +    * Now that ''​pcre2_match''​ no longer uses recursive function calls (see above), the "match limit recursion"​ value seems misnamed. It still exists, and limits the depth of tree that is searched. To avoid future confusion, it has been renamed as "depth limit" in all relevant places (''​TDIRegEx2Base.MatchLimitDepth'',​ ''​PCRE2_INFO_DEPTHLIMIT'',​ ''​PCRE2_CONFIG_DEPTHLIMIT'',​ ''​PCRE2_ERROR_DEPTHLIMIT'',​ ''​pcre2_set_depth_limit'',​ etc.) but the old names are still available for backwards compatibility.
 +    * ''PCRE2_CONFIG_STACKRECURSE'' is no longer used and is deprecated.
 +  * Added the ''​PCRE2_INFO_FRAMESIZE''​ item to ''​pcre2_pattern_info''​ and the ''​InfoFrameSize''​ property to ''​TDIRegEx2_8''​ as well as ''​TDIRegEx2_16.InfoFrameSize''​.
 +  * The depth (formerly recursion) limit now applies to DFA matching.
 +
 +**Bug fixes:**
 +
 +  * In the 32-bit library in non-UTF mode, an attempt to find a Unicode property for a character with a code point greater than 0x10ffff (the Unicode maximum) caused a crash.
 +  * If a lookbehind assertion that contained a back reference to a group appearing later in the pattern was compiled with the ''​PCRE2_ANCHORED''​ option, undefined actions (often a segmentation fault) could occur, depending on what other options were set. An example assertion is ''​(?​%%<​%%!\1(abc))''​ where the reference ''​\1''​ precedes the group ''​(abc)''​.
 +  * Fix memory leak in ''​pcre2_serialize_decode''​ when the input is invalid.
 +  * Fix potential nil dereference in ''​pcre2_callout_enumerate''​ if called with a nil pattern pointer.
 +  * The alternative matching function, ''pcre2_dfa_match'', misbehaved if it encountered a character class with a possessive repeat, for example ''[a-f]{3}+''.
 +
 +=====YuPcre2 1.5.0 – 17 Feb 2017=====
 +
 +**New features:**
 +
 +  * Implemented ''​pcre2_code_copy_with_tables''​.
 +  * ''​\g{+%%<​%%number>​}''​ (e.g. ''​\g{+2}''​) is now supported. It is a "​forward back reference"​ and can be useful in repetitions (compare ''​\g{-%%<​%%number>​}''​). Perl does not recognize this syntax.
 +
 +**Optimizations:​**
 +
 +  * When a pattern is too complicated,​ PCRE2 gives up trying to find a minimum matching length and just records zero. Typically this happens when there are too many nested or recursive back references. If the limit was reached in certain recursive cases it failed to be triggered and an internal error could be the result.
 +  * The ''​pcre2_dfa_match''​ function now takes note of the recursion limit for the internal recursive calls that are used for lookrounds and recursions within the pattern.
 +  * Detecting patterns that are too large inside the length-measuring loop saves processing ridiculously long patterns to their end.
 +  * When autopossessifying,​ skip empty branches without recursion, to reduce stack usage. Example pattern: ''​X?​(R||){3335}''​.
 +  * A pattern with very many explicit back references to a group that is a long way from the start of the pattern could take a long time to compile because searching for the referenced group in order to find the minimum length was being done repeatedly. Now up to 128 group minimum lengths are cached and the attempt to find a minimum length is abandoned if there is a back reference to a group whose number is greater than 128. (In that case, the pattern is so complicated that this optimization probably isn't worth it.)
 +
 +**Bug fixes:**
 +
 +  * In any wide-character mode (8-bit UTF or any 16-bit or 32-bit mode), without PCRE2_UCP set, a negative character type such as ''​\D''​ in a positive class should cause all characters greater than 255 to match, whatever else is in the class. There was a bug that caused this not to happen if a Unicode property item was added to such a class, for example ''​[\D\P{Nd}]''​ or ''​[\W\pL]''​.
 +  * There has been a major re-factoring of ''​pcre2_compile''​. Most syntax checking is now done in the pre-pass that identifies capturing groups. While doing this, some minor bugs and Perl incompatibilities were fixed, including:
 +    - ''​\Q\E''​ in the middle of a quantifier such as ''​A+\Q\E+''​ is now ignored instead of giving an invalid quantifier error.
 +    - ''​{0}''​ can now be used after a group in a lookbehind assertion; previously this caused an "​assertion is not fixed length"​ error.
 +    - Perl always treats ''​(?​(DEFINE)''​ as a "​define"​ group, even if a group with the name "​DEFINE"​ exists. PCRE2 now does likewise.
 +    - A recursion condition test such as ''​(?​(R2)...)''​ must now refer to an existing subpattern.
 +    - A conditional recursion test such as ''​(?​(R)...)''​ misbehaved if there was a group whose name began with "​R"​.
 +    - A hyphen appearing immediately after a POSIX character class (for example ''​%%[[%%:​ascii:​]-z]''​) now generates an error. Perl does accept this as a literal, but gives a warning, so it seems best to fail it in PCRE.
 +    - An empty ''\Q\E'' sequence may appear after a callout that precedes an assertion condition (it is, of course, ignored).
 +  * One effect of the refactoring is that some error numbers and messages have changed, and the pattern offset given for compiling errors is not always the right-most character that has been read. In particular, for a variable-length lookbehind assertion it now points to the start of the assertion. Another change is that when a callout appears before a group, the "length of next pattern item" that is passed now just gives the length of the opening parenthesis item, not the length of the whole group. A length of zero is now given only for a callout at the end of the pattern.
 +  * Back references are now permitted in lookbehind assertions when there are no duplicated group numbers (that is, ''(?|'' has not been used), and, if the reference is by name, there is only one group of that name. The referenced group must, of course, be of fixed length.
 +  * Automatic callouts are no longer generated before and after callouts in the pattern.
 +  * A number of bugs have been mended relating to match start-up optimizations when the first thing in a pattern is a positive lookahead. These all applied only when ''PCRE2_NO_START_OPTIMIZE'' was **not** set:
 +    - A pattern such as ''​(?​=.*X)X$''​ was incorrectly optimized as if it needed both an initial '​X'​ and a following '​X'​.
 +    - Some patterns starting with an assertion that started with ''​.*''​ were incorrectly optimized as having to match at the start of the subject or after a newline. There are cases where this is not true, for example, ''​(?​=.*[A-Z])(?​=.{8,​16})(?​!.*[\s])''​ matches after the start in lines that start with spaces. Starting ''​.*''​ in an assertion is no longer taken as an indication of matching at the start (or after a newline).
 +  * A pattern with ''​PCRE2_DOTALL''​ (''/​s''​) set but not ''​PCRE2_NO_DOTSTAR_ANCHOR'',​ and which started with ''​.*''​ inside a positive lookahead was incorrectly being compiled as implicitly anchored.
 +  * Fix out-of-bounds read for partial matching of ''​.''​ against an empty string when the newline type is CRLF.
 +  * The appearance of ''​\p'',​ ''​\P'',​ or ''​\X''​ in a substitution string when ''​PCRE2_SUBSTITUTE_EXTENDED''​ was set caused a segmentation fault (''​nil''​ dereference).
 +  * If the starting offset was specified as greater than the subject length in a call to ''​pcre2_substitute''​ an out-of-bounds memory reference could occur.
 +  * Incorrect data was compiled for a pattern with ''​PCRE2_UCP''​ set without ''​PCRE2_UTF''​ if a class required all wide characters to match (for example, ''​[\s[:​^ascii:​]]''​).
 +  * The limit in the auto-possessification code that was intended to catch overly-complicated patterns and not spend too much time auto-possessifying was being reset too often, resulting in very long compile times for some patterns. Now such patterns are no longer completely auto-possessified.
 +  * Ignore ''​PCRE2_CASELESS''​ when processing ''​\h'',​ ''​\H'',​ ''​\v'',​ and ''​\V''​ in classes as it just wastes time. In the UTF case it can also produce redundant entries in XCLASS lists caused by characters with multiple other cases and pairs of characters in the same "​not-x"​ sublists.
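
The claim above that ''(?=.*[A-Z])(?=.{8,16})(?!.*[\s])'' can match after the start of a line that begins with spaces is easy to verify. The sketch below uses Python's ''re'' module as a stand-in engine (an assumption for illustration only; ''first_match_offset'' is a hypothetical helper).

```python
import re

LOOKAHEADS = re.compile(r"(?=.*[A-Z])(?=.{8,16})(?!.*[\s])")


def first_match_offset(line):
    """Return the offset of the first (empty) match, or None."""
    m = LOOKAHEADS.search(line)
    return m.start() if m else None


# The negative lookahead fails at offsets 0 and 1 because a space still
# lies ahead, so the first match is at offset 2, after the leading spaces.
print(first_match_offset("  Password1"))  # 2
```

Treating the leading ''.*'' inside the assertion as an implicit anchor would have restricted attempts to offset 0 and reported no match here.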
 +
 +=====YuPcre2 1.4.0 – 31 Jul 2016=====
 +
 +**New features:**
 +
 +  * Implemented ''​pcre2_code_copy''​ to make a copy of a compiled pattern.
 +  * Implemented the ''​PCRE2_NO_JIT''​ option for ''​pcre2_match''​ and ''​moNoJit''​ option for ''​TDIRegEx2Base.MatchOptions''​.
 +  * Calls to ''​pcre2_get_error_message''​ with error numbers that are never returned by PCRE2 functions were returning empty strings. Now the error code ''​PCRE2_ERROR_BADDATA''​ is returned.
 +  * Allow ''​\C''​ in lookbehinds and DFA matching in UTF-32 mode.
 +
 +**Bug fixes:**
 +
 +  * Detect unmatched closing parentheses and give the error in the pre-scan instead of later. Previously the pre-scan carried on and could give a misleading incorrect error message. For example, ''​(?​J)(?'​a'​))(?'​a'​)''​ gave a message about invalid duplicate group names.
 +  * A pattern that included ''​(*ACCEPT)''​ in the middle of a sufficiently deeply nested set of parentheses of sufficient size caused an overflow of the compiling workspace (which was diagnosed, but of course is not desirable).
 +  * Detect missing closing parentheses during the pre-pass for group identification.
 +  * Fix a race condition in JIT.
 +  * Fix register overwrite in JIT when SSE2 acceleration is enabled.
 +
 +=====YuPcre2 1.3.0 – 7 May 2016=====
 +
 +  * Support Delphi 10.1 Berlin Win32 and Win64.
 +
 +=====YuPcre2 1.2.0 – 4 Mar 2016=====
 +
 +**New features:**
 +
 +  * New option to limit the length of a pattern: ''​TDIRegEx2Base.MaxPatternLength''​ and ''​pcre2_set_max_pattern_length''​.
 +  * New option to limit the offset of unanchored matches: ''​TDIRegEx2Base.OffsetLimit''​ and ''​pcre2_set_offset_limit''​.
 +  * New ''​pcre2_substitute''​ options ''​PCRE2_SUBSTITUTE_EXTENDED'',​ ''​PCRE2_SUBSTITUTE_UNSET_EMPTY'',​ ''​PCRE2_SUBSTITUTE_UNKNOWN_UNSET'',​ and ''​PCRE2_SUBSTITUTE_OVERFLOW_LENGTH''​.
 +
 +**Bug fixes:**
 +
 +  * In a character class such as ''​[\W\p{Any}]''​ where both a negative-type escape ("not a word character"​) and a property escape were present, the property escape was being ignored.
 +  * Fixed integer overflow for patterns whose minimum matching length is very, very large.
 +  * The special sequences ''​%%[%%[:​%%<​%%:​]]''​ and ''​%%[%%[:>:​]]''​ gave rise to incorrect compiling errors or other strange effects if compiled in UCP mode.
 +  * Adding group information caching improves the speed of compiling when checking whether a group has a fixed length and/or could match an empty string, especially when recursion or subroutine calls are involved.
 +  * If ''​[:​^ascii:​]''​ or ''​[:​^xdigit:​]''​ are present in a non-negated class, all characters with code points greater than 255 are in the class. When a Unicode property was also in the class (if ''​PCRE2_UCP''​ is set, escapes such as ''​\w''​ are turned into Unicode properties),​ wide characters were not correctly handled, and could fail to match. Negated classes such as ''​[^[:​^ascii:​]\d]''​ were also not working correctly in UCP mode.
 +  * If ''PCRE2_AUTO_CALLOUT'' was set on a pattern that had a ''(?#'' comment between an item and its quantifier (for example, ''A(?#comment)?B''), ''pcre2_compile'' misbehaved.
 +  * Similarly, if an isolated ''\E'' was present between an item and its quantifier when ''PCRE2_AUTO_CALLOUT'' was set, ''pcre2_compile'' misbehaved.
 +  * The error for an invalid UTF pattern string always gave the code unit offset as zero instead of where the invalidity was found.
 +  * An empty ''\Q\E'' sequence between an item and its quantifier caused ''pcre2_compile'' to misbehave when auto callouts were enabled.
 +  * If both ''​PCRE2_ALT_VERBNAMES''​ and ''​PCRE2_EXTENDED''​ were set, and a ''​(*MARK)''​ or other verb "​name"​ ended with whitespace immediately before the closing parenthesis,​ ''​pcre2_compile''​ misbehaved. Example: ''​(*:​abc )'',​ but only when both those options were set.
 +  * In a number of places ''pcre2_compile'' was not handling NUL characters (binary zeros) correctly.
 +  * If a pattern that was compiled with ''​PCRE2_EXTENDED''​ started with white space or a #-type comment that was followed by ''​(?​-x)'',​ which turns off ''​PCRE2_EXTENDED'',​ and there was no subsequent ''​(?​x)''​ to turn it on again, ''​pcre2_compile''​ assumed that ''​(?​-x)''​ applied to the whole pattern and consequently mis-compiled it. The fix for this bug means that a setting of any of the ''​(?​imsxU)''​ options at the start of a pattern is no longer transferred to the options that are returned by ''​PCRE2_INFO_ALLOPTIONS''​. In fact, this was an anachronism that should have changed when the effects of those options were all moved to compile time.
 +  * An escaped closing parenthesis in the "​name"​ part of a ''​(*verb)''​ when ''​PCRE2_ALT_VERBNAMES''​ was set caused ''​pcre2_compile''​ to malfunction.
 +
 +=====YuPcre2 1.1.0 – 15 Sep 2015=====
 +
 +  * Support Delphi 10 Seattle Win32 and Win64.
 +
 +  * Match limit check added to recursion.
 +  * Arrange for the UTF check in ''​pcre2_match''​ and ''​pcre2_dfa_match''​ to look only at the part of the subject that is relevant when the starting offset is non-zero.
 +  * Improve first character match in JIT with SSE2 on x86.
 +  * Fixed two assertion fails in JIT.
 +  * Fixed a corner case of range optimization in JIT.
 +  * Add the ${*MARK} facility to ''​pcre2_substitute''​.
 +  * Implemented ''​PCRE2_ALT_VERBNAMES''​ and ''​coAltVerbnames''​.
 +  * Fixed two issues in JIT.
 +
 +=====YuPcre2 1.0.1 – 8 Aug 2015=====
 +
 +  * Pathological patterns containing many nested occurrences of ''​[:''​ caused ''​pcre2_compile''​ to run for a very long time.
 +  * A missing closing parenthesis for a callout with a string argument was not being diagnosed, possibly leading to a buffer overflow.
 +  * A conditional group with only one branch has an implicit empty alternative branch and must therefore be treated as potentially matching an empty string.
 +  * If ''​(?​R''​ was followed by ''​-''​ or ''​+''​ incorrect behaviour happened instead of a diagnostic.
 +  * Conditional groups whose condition was an assertion preceded by an explicit callout with a string argument might be incorrectly processed, especially if the string contained ''​\Q''​.
 +  * Fix buffer overflow while checking a UTF-8 string if the final multi-byte UTF-8 character was truncated.
 +  * Finding the minimum matching length of complex patterns with back references and/or recursions can take a long time. There is now a cut-off that gives up trying to find a minimum length when things get too complex.
 +  * An optimization has been added that speeds up finding the minimum matching length for patterns containing repeated capturing groups or recursions.
 +  * If a pattern contained a back reference to a group whose number was duplicated as a result of appearing in a ''​(?​|...)''​ group, the computation of the minimum matching length gave a wrong result, which could cause incorrect "no match" errors. For such patterns, a minimum matching length cannot at present be computed.
 +  * Added a check for integer overflow in conditions ''​(?​(%%<​%%digits>​)''​ and ''​(?​(R%%<​%%digits>​)''​.
 +  * Fixed an issue when ''​\p{Any}''​ inside an xclass did not read the current character.
 +  * The JIT compiler did not restore the control verb head in case of ''​*THEN''​ control verbs.
 +  * The way recursive references such as ''​(?​3)''​ are compiled has been re-written because the old way was the cause of many issues. Now, conversion of the group number into a pattern offset does not happen until the pattern has been completely compiled. This does mean that detection of all infinitely looping recursions is postponed till match time. In the past, some easy ones were detected at compile time.
 +  * A test for a back reference to a non-existent group was missing for items such as ''​\987''​. This caused incorrect code to be compiled.
 +  * Error messages for syntax errors following ''​\g''​ and ''​\k''​ were giving inaccurate offsets in the pattern.
 +  * Improve the performance of starting single character repetitions in JIT.
 +  * ''​(*LIMIT_MATCH=)''​ now gives an error instead of setting the value to 0.
 +  * Error messages for syntax errors in *LIMIT_MATCH and *LIMIT_RECURSION now give the right offset instead of zero.
 +  * The JIT compiler should not check repeats after a {0,1} repeat byte code.
 +  * The JIT compiler should restore the control chain for empty possessive repeats.
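
The truncated-final-character case from the UTF-8 entry above comes down to comparing the length announced by the last lead byte with the number of bytes actually left in the buffer. The following is a minimal sketch of that bounds check, not PCRE2's code; both function names are hypothetical.

```python
def expected_length(lead: int) -> int:
    """Sequence length announced by a UTF-8 lead byte (0 if not a valid lead)."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0


def ends_with_truncated_char(data: bytes) -> bool:
    """True if the last character announces more bytes than the buffer holds."""
    i = len(data) - 1
    steps = 0
    # Walk back over at most three continuation bytes (10xxxxxx)
    # to reach the lead byte of the final character.
    while i >= 0 and steps < 3 and (data[i] & 0xC0) == 0x80:
        i -= 1
        steps += 1
    if i < 0:
        return False  # empty input, or nothing but continuation bytes
    return expected_length(data[i]) > len(data) - i


print(ends_with_truncated_char("€".encode("utf-8")[:2]))  # True
print(ends_with_truncated_char("a€".encode("utf-8")))     # False
```

A validator that reads the announced number of continuation bytes without first making this comparison reads past the end of the buffer.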
 +
 +=====YuPcre2 1.0.0 – 22 Jul 2015=====
 +
 +  * Initial release.
 +
  
products/pcre2/history.txt · Last modified: 2019/03/07 18:01 (external edit)