pcre: Integrate pending patches for next upstream version 8.39

- Fix auto-callout (http://vcs.pcre.org/viewvc?view=rev&revision=1611)
- Fix negated POSIX class within negated overall class UCP (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1612 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix bug for isolated \E between an item and its qualifier when auto callout is set. (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1613 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Give error for regexec with pmatch=NULL and REG_STARTEND set (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1614 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix \Q\E before qualifier bug when auto callouts are (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1616 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix /x bug when pattern starts with white space and (?-x) (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1617 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix copy named substring bug. (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1618 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix (by hacking) another length computation issue. (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1619 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix get_substring_list() bug when \K is used in an assertion. (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1620 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix pcretest bad behaviour for callout in lookbehind. (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1625 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix workspace overflow for (*ACCEPT) with deeply nested (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1631 2f5784b3-3f2a-0410-8824-cb99058d5e15), fixes CVE-2016-3191
- Fix Yet another duplicate name bugfix by overestimating the memory needed (i.e. another hack - PCRE2 has this "properly" fixed). (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1636 2f5784b3-3f2a-0410-8824-cb99058d5e15)
- Fix pcretest loop for global matching with an ovector size (git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1637 2f5784b3-3f2a-0410-8824-cb99058d5e15)

Signed-off-by: heil <heil@terminal-consulting.de>
9 years ago
  1. Submitted By: Ken Moffat <ken at linuxfromscratch dot org>
  2. Date: 2016-03-16
  3. Initial Package Version: 8.38
  4. Upstream Status: Applied
  5. Origin: Upstream, backported to 8.38 by Petr Písař at redhat
  6. Description: Various fixes, including for CVE-2016-1263 and many other
  7. bugs which have been fixed upstream. Many of these bugs were found by
  8. fuzzing, upstream is trying to persuade its users to move to pcre2 and
  9. giving low priority to further pcre1 maintenance releases.
  10. From 3c80e02cd464ea049e117b423fd48fab294c51a9 Mon Sep 17 00:00:00 2001
  11. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  12. Date: Thu, 26 Nov 2015 20:29:13 +0000
  13. Subject: [PATCH] Fix auto-callout (?# comment bug.
  14. MIME-Version: 1.0
  15. Content-Type: text/plain; charset=UTF-8
  16. Content-Transfer-Encoding: 8bit
  17. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1611 2f5784b3-3f2a-0410-8824-cb99058d5e15
  18. Petr Pisar: Ported to 8.38.
  19. diff --git a/pcre_compile.c b/pcre_compile.c
  20. index 4d3b313..3360a8b 100644
  21. --- a/pcre_compile.c
  22. +++ b/pcre_compile.c
  23. @@ -4699,6 +4699,23 @@ for (;; ptr++)
  24. }
  25. }
  26. + /* Skip over (?# comments. We need to do this here because we want to know if
  27. + the next thing is a quantifier, and these comments may come between an item
  28. + and its quantifier. */
  29. +
  30. + if (c == CHAR_LEFT_PARENTHESIS && ptr[1] == CHAR_QUESTION_MARK &&
  31. + ptr[2] == CHAR_NUMBER_SIGN)
  32. + {
  33. + ptr += 3;
  34. + while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
  35. + if (*ptr == CHAR_NULL)
  36. + {
  37. + *errorcodeptr = ERR18;
  38. + goto FAILED;
  39. + }
  40. + continue;
  41. + }
  42. +
  43. /* See if the next thing is a quantifier. */
  44. is_quantifier =
  45. @@ -6529,21 +6546,6 @@ for (;; ptr++)
  46. case CHAR_LEFT_PARENTHESIS:
  47. ptr++;
  48. - /* First deal with comments. Putting this code right at the start ensures
  49. - that comments have no bad side effects. */
  50. -
  51. - if (ptr[0] == CHAR_QUESTION_MARK && ptr[1] == CHAR_NUMBER_SIGN)
  52. - {
  53. - ptr += 2;
  54. - while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
  55. - if (*ptr == CHAR_NULL)
  56. - {
  57. - *errorcodeptr = ERR18;
  58. - goto FAILED;
  59. - }
  60. - continue;
  61. - }
  62. -
  63. /* Now deal with various "verbs" that can be introduced by '*'. */
  64. if (ptr[0] == CHAR_ASTERISK && (ptr[1] == ':'
  65. diff --git a/testdata/testinput2 b/testdata/testinput2
  66. index e2e520f..92e3359 100644
  67. --- a/testdata/testinput2
  68. +++ b/testdata/testinput2
  69. @@ -4217,4 +4217,12 @@ backtracking verbs. --/
  70. /a[[:punct:]b]/BZ
  71. +/L(?#(|++<!(2)?/BZ
  72. +
  73. +/L(?#(|++<!(2)?/BOZ
  74. +
  75. +/L(?#(|++<!(2)?/BCZ
  76. +
  77. +/L(?#(|++<!(2)?/BCOZ
  78. +
  79. /-- End of testinput2 --/
  80. diff --git a/testdata/testinput7 b/testdata/testinput7
  81. index e411a4b..00b9738 100644
  82. --- a/testdata/testinput7
  83. +++ b/testdata/testinput7
  84. @@ -853,4 +853,8 @@ of case for anything other than the ASCII letters. --/
  85. /a[b[:punct:]]/8WBZ
  86. +/L(?#(|++<!(2)?/B8COZ
  87. +
  88. +/L(?#(|++<!(2)?/B8WCZ
  89. +
  90. /-- End of testinput7 --/
  91. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  92. index 85c565d..2cf7a90 100644
  93. --- a/testdata/testoutput2
  94. +++ b/testdata/testoutput2
  95. @@ -14574,4 +14574,40 @@ No match
  96. End
  97. ------------------------------------------------------------------
  98. +/L(?#(|++<!(2)?/BZ
  99. +------------------------------------------------------------------
  100. + Bra
  101. + L?+
  102. + Ket
  103. + End
  104. +------------------------------------------------------------------
  105. +
  106. +/L(?#(|++<!(2)?/BOZ
  107. +------------------------------------------------------------------
  108. + Bra
  109. + L?
  110. + Ket
  111. + End
  112. +------------------------------------------------------------------
  113. +
  114. +/L(?#(|++<!(2)?/BCZ
  115. +------------------------------------------------------------------
  116. + Bra
  117. + Callout 255 0 14
  118. + L?+
  119. + Callout 255 14 0
  120. + Ket
  121. + End
  122. +------------------------------------------------------------------
  123. +
  124. +/L(?#(|++<!(2)?/BCOZ
  125. +------------------------------------------------------------------
  126. + Bra
  127. + Callout 255 0 14
  128. + L?
  129. + Callout 255 14 0
  130. + Ket
  131. + End
  132. +------------------------------------------------------------------
  133. +
  134. /-- End of testinput2 --/
  135. diff --git a/testdata/testoutput7 b/testdata/testoutput7
  136. index cc9ebdd..fdfff64 100644
  137. --- a/testdata/testoutput7
  138. +++ b/testdata/testoutput7
  139. @@ -2348,4 +2348,24 @@ No match
  140. End
  141. ------------------------------------------------------------------
  142. +/L(?#(|++<!(2)?/B8COZ
  143. +------------------------------------------------------------------
  144. + Bra
  145. + Callout 255 0 14
  146. + L?
  147. + Callout 255 14 0
  148. + Ket
  149. + End
  150. +------------------------------------------------------------------
  151. +
  152. +/L(?#(|++<!(2)?/B8WCZ
  153. +------------------------------------------------------------------
  154. + Bra
  155. + Callout 255 0 14
  156. + L?+
  157. + Callout 255 14 0
  158. + Ket
  159. + End
  160. +------------------------------------------------------------------
  161. +
  162. /-- End of testinput7 --/
  163. --
  164. 2.4.3
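Reviewer note on the patch above: it moves the `(?#` comment skip ahead of the quantifier check, so a comment sitting between an item and its quantifier no longer hides the quantifier. The following is a minimal standalone sketch of that scan, not code from PCRE. The helper name `skip_inline_comment` is hypothetical, and plain `char` stands in for `pcre_uchar`.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE. If p points at "(?#", return a
   pointer to the ')' that closes the comment, or NULL when the comment is
   unterminated (the patch sets ERR18 in that case). If p does not start a
   comment, p is returned unchanged. */
static const char *skip_inline_comment(const char *p)
{
  if (p[0] == '(' && p[1] == '?' && p[2] == '#')
    {
    p += 3;
    while (*p != '\0' && *p != ')') p++;
    if (*p == '\0') return NULL;   /* ran off the end of the pattern */
    }
  return p;
}
```

For the new test pattern `L(?#(|++<!(2)?`, the comment ends at the first `)`, so the trailing `?` is seen as a quantifier on `L`. This is consistent with the `L?+` and `L?` lines in the expected output.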
  165. From ef6b10fcde41a2687f38d4a9ff2886b037948a1b Mon Sep 17 00:00:00 2001
  166. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  167. Date: Fri, 27 Nov 2015 17:13:13 +0000
  168. Subject: [PATCH 1/5] Fix negated POSIX class within negated overall class UCP
  169. bug.
  170. MIME-Version: 1.0
  171. Content-Type: text/plain; charset=UTF-8
  172. Content-Transfer-Encoding: 8bit
  173. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1612 2f5784b3-3f2a-0410-8824-cb99058d5e15
  174. Petr Písař: Ported to 8.38.
  175. diff --git a/pcre_compile.c b/pcre_compile.c
  176. index 3360a8b..3670f1e 100644
  177. --- a/pcre_compile.c
  178. +++ b/pcre_compile.c
  179. @@ -5063,20 +5063,22 @@ for (;; ptr++)
  180. ptr = tempptr + 1;
  181. continue;
  182. - /* For the other POSIX classes (ascii, xdigit) we are going to fall
  183. - through to the non-UCP case and build a bit map for characters with
  184. - code points less than 256. If we are in a negated POSIX class
  185. - within a non-negated overall class, characters with code points
  186. - greater than 255 must all match. In the special case where we have
  187. - not yet generated any xclass data, and this is the final item in
  188. - the overall class, we need do nothing: later on, the opcode
  189. + /* For the other POSIX classes (ascii, cntrl, xdigit) we are going
  190. + to fall through to the non-UCP case and build a bit map for
  191. + characters with code points less than 256. If we are in a negated
  192. + POSIX class, characters with code points greater than 255 must
  193. + either all match or all not match. In the special case where we
  194. + have not yet generated any xclass data, and this is the final item
  195. + in the overall class, we need do nothing: later on, the opcode
  196. OP_NCLASS will be used to indicate that characters greater than 255
  197. are acceptable. If we have already seen an xclass item or one may
  198. follow (we have to assume that it might if this is not the end of
  199. - the class), explicitly match all wide codepoints. */
  200. + the class), explicitly list all wide codepoints, which will then
  201. + either not match or match, depending on whether the class is or is
  202. + not negated. */
  203. default:
  204. - if (!negate_class && local_negate &&
  205. + if (local_negate &&
  206. (xclass || tempptr[2] != CHAR_RIGHT_SQUARE_BRACKET))
  207. {
  208. *class_uchardata++ = XCL_RANGE;
  209. diff --git a/testdata/testinput6 b/testdata/testinput6
  210. index aeb62a0..a178d3d 100644
  211. --- a/testdata/testinput6
  212. +++ b/testdata/testinput6
  213. @@ -1553,4 +1553,13 @@
  214. \x{200}
  215. \x{37e}
  216. +/[^[:^ascii:]\d]/8W
  217. + a
  218. + ~
  219. + 0
  220. + \a
  221. + \x{7f}
  222. + \x{389}
  223. + \x{20ac}
  224. +
  225. /-- End of testinput6 --/
  226. diff --git a/testdata/testoutput6 b/testdata/testoutput6
  227. index beb85aa..b64dc0d 100644
  228. --- a/testdata/testoutput6
  229. +++ b/testdata/testoutput6
  230. @@ -2557,4 +2557,20 @@ No match
  231. \x{37e}
  232. 0: \x{37e}
  233. +/[^[:^ascii:]\d]/8W
  234. + a
  235. + 0: a
  236. + ~
  237. + 0: ~
  238. + 0
  239. +No match
  240. + \a
  241. + 0: \x{07}
  242. + \x{7f}
  243. + 0: \x{7f}
  244. + \x{389}
  245. +No match
  246. + \x{20ac}
  247. +No match
  248. +
  249. /-- End of testinput6 --/
  250. --
  251. 2.4.3
  252. From bfc1dfa660c24dc7a75108d934290e50d7db2719 Mon Sep 17 00:00:00 2001
  253. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  254. Date: Fri, 27 Nov 2015 17:41:04 +0000
  255. Subject: [PATCH 2/5] Fix bug for isolated \E between an item and its qualifier
  256. when auto callout is set.
  257. MIME-Version: 1.0
  258. Content-Type: text/plain; charset=UTF-8
  259. Content-Transfer-Encoding: 8bit
  260. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1613 2f5784b3-3f2a-0410-8824-cb99058d5e15
  261. Petr Písař: Ported to 8.38.
  262. diff --git a/pcre_compile.c b/pcre_compile.c
  263. index 3670f1e..5786cd3 100644
  264. --- a/pcre_compile.c
  265. +++ b/pcre_compile.c
  266. @@ -4645,9 +4645,10 @@ for (;; ptr++)
  267. goto FAILED;
  268. }
  269. - /* If in \Q...\E, check for the end; if not, we have a literal */
  270. + /* If in \Q...\E, check for the end; if not, we have a literal. Otherwise an
  271. + isolated \E is ignored. */
  272. - if (inescq && c != CHAR_NULL)
  273. + if (c != CHAR_NULL)
  274. {
  275. if (c == CHAR_BACKSLASH && ptr[1] == CHAR_E)
  276. {
  277. @@ -4655,7 +4656,7 @@ for (;; ptr++)
  278. ptr++;
  279. continue;
  280. }
  281. - else
  282. + else if (inescq)
  283. {
  284. if (previous_callout != NULL)
  285. {
  286. @@ -4670,7 +4671,6 @@ for (;; ptr++)
  287. }
  288. goto NORMAL_CHAR;
  289. }
  290. - /* Control does not reach here. */
  291. }
  292. /* In extended mode, skip white space and comments. We need a loop in order
  293. diff --git a/testdata/testinput2 b/testdata/testinput2
  294. index 92e3359..e8ca4fe 100644
  295. --- a/testdata/testinput2
  296. +++ b/testdata/testinput2
  297. @@ -4225,4 +4225,6 @@ backtracking verbs. --/
  298. /L(?#(|++<!(2)?/BCOZ
  299. +/(A*)\E+/CBZ
  300. +
  301. /-- End of testinput2 --/
  302. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  303. index 2cf7a90..09756b8 100644
  304. --- a/testdata/testoutput2
  305. +++ b/testdata/testoutput2
  306. @@ -14610,4 +14610,18 @@ No match
  307. End
  308. ------------------------------------------------------------------
  309. +/(A*)\E+/CBZ
  310. +------------------------------------------------------------------
  311. + Bra
  312. + Callout 255 0 7
  313. + SCBra 1
  314. + Callout 255 1 2
  315. + A*
  316. + Callout 255 3 0
  317. + KetRmax
  318. + Callout 255 7 0
  319. + Ket
  320. + End
  321. +------------------------------------------------------------------
  322. +
  323. /-- End of testinput2 --/
  324. --
  325. 2.4.3
  326. From 108377b836fc29a84f5286287629d96549b1c777 Mon Sep 17 00:00:00 2001
  327. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  328. Date: Sun, 29 Nov 2015 17:38:25 +0000
  329. Subject: [PATCH 3/5] Give error for regexec with pmatch=NULL and REG_STARTEND
  330. set.
  331. MIME-Version: 1.0
  332. Content-Type: text/plain; charset=UTF-8
  333. Content-Transfer-Encoding: 8bit
  334. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1614 2f5784b3-3f2a-0410-8824-cb99058d5e15
  335. Petr Písař: Ported to 8.38.
  336. diff --git a/pcreposix.c b/pcreposix.c
  337. index f024423..dcc13ef 100644
  338. --- a/pcreposix.c
  339. +++ b/pcreposix.c
  340. @@ -364,6 +364,7 @@ start location rather than being passed as a PCRE "starting offset". */
  341. if ((eflags & REG_STARTEND) != 0)
  342. {
  343. + if (pmatch == NULL) return REG_INVARG;
  344. so = pmatch[0].rm_so;
  345. eo = pmatch[0].rm_eo;
  346. }
  347. --
  348. 2.4.3
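Reviewer note on the patch above: with `REG_STARTEND` set, the search range is read from `pmatch[0]`, so a NULL `pmatch` must be rejected before it is dereferenced. The sketch below illustrates only that guard. The type, the constants and the helper `resolve_range` are stand-ins invented for this note. Their values are assumptions, not the definitions in `pcreposix.h`, and the non-`REG_STARTEND` branch (search the whole subject) is likewise an assumption.

```c
#include <stddef.h>

/* Stand-ins for the POSIX wrapper's type and constants (illustrative
   values only; these are NOT the values defined in pcreposix.h). */
typedef struct { int rm_so; int rm_eo; } sketch_regmatch_t;
enum { SKETCH_REG_STARTEND = 0x80, SKETCH_REG_INVARG = 16 };

/* Hypothetical helper: choose the subject range to search. Returns 0 on
   success, or SKETCH_REG_INVARG when the caller asks for REG_STARTEND
   semantics without supplying a pmatch array to read the range from. */
static int resolve_range(int eflags, const sketch_regmatch_t *pmatch,
                         int subject_len, int *so, int *eo)
{
  if ((eflags & SKETCH_REG_STARTEND) != 0)
    {
    if (pmatch == NULL) return SKETCH_REG_INVARG;  /* the added guard */
    *so = pmatch[0].rm_so;
    *eo = pmatch[0].rm_eo;
    }
  else
    {
    *so = 0;              /* assumed default: search the whole subject */
    *eo = subject_len;
    }
  return 0;
}
```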
  349. From e347b40d5bb12f7ef1e632aa649571a107be7d8a Mon Sep 17 00:00:00 2001
  350. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  351. Date: Sun, 29 Nov 2015 17:46:23 +0000
  352. Subject: [PATCH 4/5] Allow for up to 32-bit numbers in the ordin() function in
  353. pcregrep.
  354. MIME-Version: 1.0
  355. Content-Type: text/plain; charset=UTF-8
  356. Content-Transfer-Encoding: 8bit
  357. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1615 2f5784b3-3f2a-0410-8824-cb99058d5e15
  358. Petr Písař: Ported to 8.38.
  359. diff --git a/pcregrep.c b/pcregrep.c
  360. index 64986b0..cd53c64 100644
  361. --- a/pcregrep.c
  362. +++ b/pcregrep.c
  363. @@ -2437,7 +2437,7 @@ return options;
  364. static char *
  365. ordin(int n)
  366. {
  367. -static char buffer[8];
  368. +static char buffer[14];
  369. char *p = buffer;
  370. sprintf(p, "%d", n);
  371. while (*p != 0) p++;
  372. --
  373. 2.4.3
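Reviewer note on the patch above: the new size of 14 can be checked by arithmetic. The sketch assumes that `ordin()` appends a two-letter ordinal suffix after the digits (that part of the function is not visible in the hunk) and that `int` is 32 bits wide. The helper name `ordinal_buffer_size` is hypothetical.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: bytes needed to format n with "%d", append a
   two-letter suffix such as "th", and terminate the string with NUL. */
static size_t ordinal_buffer_size(int n)
{
  /* With a NULL buffer and size 0, snprintf only reports the length it
     would have written; a leading '-' is included in that count. */
  int digits = snprintf(NULL, 0, "%d", n);
  return (size_t)digits + 2 + 1;
}
```

Under those assumptions the worst case is `INT_MIN`, which prints as 11 characters, giving 11 + 2 + 1 = 14 bytes. The old 8-byte buffer had room for at most five characters of digits and sign.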
  374. From e78ad4264b16988b826bd2939a1781c1165a92d9 Mon Sep 17 00:00:00 2001
  375. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  376. Date: Mon, 30 Nov 2015 17:44:45 +0000
  377. Subject: [PATCH 5/5] Fix \Q\E before qualifier bug when auto callouts are
  378. enabled.
  379. MIME-Version: 1.0
  380. Content-Type: text/plain; charset=UTF-8
  381. Content-Transfer-Encoding: 8bit
  382. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1616 2f5784b3-3f2a-0410-8824-cb99058d5e15
  383. Petr Písař: Ported to 8.38.
  384. diff --git a/pcre_compile.c b/pcre_compile.c
  385. index 5786cd3..beed46b 100644
  386. --- a/pcre_compile.c
  387. +++ b/pcre_compile.c
  388. @@ -4671,17 +4671,27 @@ for (;; ptr++)
  389. }
  390. goto NORMAL_CHAR;
  391. }
  392. +
  393. + /* Check for the start of a \Q...\E sequence. We must do this here rather
  394. + than later in case it is immediately followed by \E, which turns it into a
  395. + "do nothing" sequence. */
  396. +
  397. + if (c == CHAR_BACKSLASH && ptr[1] == CHAR_Q)
  398. + {
  399. + inescq = TRUE;
  400. + ptr++;
  401. + continue;
  402. + }
  403. }
  404. - /* In extended mode, skip white space and comments. We need a loop in order
  405. - to check for more white space and more comments after a comment. */
  406. + /* In extended mode, skip white space and comments. */
  407. if ((options & PCRE_EXTENDED) != 0)
  408. {
  409. - for (;;)
  410. + const pcre_uchar *wscptr = ptr;
  411. + while (MAX_255(c) && (cd->ctypes[c] & ctype_space) != 0) c = *(++ptr);
  412. + if (c == CHAR_NUMBER_SIGN)
  413. {
  414. - while (MAX_255(c) && (cd->ctypes[c] & ctype_space) != 0) c = *(++ptr);
  415. - if (c != CHAR_NUMBER_SIGN) break;
  416. ptr++;
  417. while (*ptr != CHAR_NULL)
  418. {
  419. @@ -4695,7 +4705,15 @@ for (;; ptr++)
  420. if (utf) FORWARDCHAR(ptr);
  421. #endif
  422. }
  423. - c = *ptr; /* Either NULL or the char after a newline */
  424. + }
  425. +
  426. + /* If we skipped any characters, restart the loop. Otherwise, we didn't see
  427. + a comment. */
  428. +
  429. + if (ptr > wscptr)
  430. + {
  431. + ptr--;
  432. + continue;
  433. }
  434. }
  435. @@ -7900,16 +7918,6 @@ for (;; ptr++)
  436. c = ec;
  437. else
  438. {
  439. - if (escape == ESC_Q) /* Handle start of quoted string */
  440. - {
  441. - if (ptr[1] == CHAR_BACKSLASH && ptr[2] == CHAR_E)
  442. - ptr += 2; /* avoid empty string */
  443. - else inescq = TRUE;
  444. - continue;
  445. - }
  446. -
  447. - if (escape == ESC_E) continue; /* Perl ignores an orphan \E */
  448. -
  449. /* For metasequences that actually match a character, we disable the
  450. setting of a first character if it hasn't already been set. */
  451. diff --git a/testdata/testinput2 b/testdata/testinput2
  452. index e8ca4fe..3a1134f 100644
  453. --- a/testdata/testinput2
  454. +++ b/testdata/testinput2
  455. @@ -4227,4 +4227,6 @@ backtracking verbs. --/
  456. /(A*)\E+/CBZ
  457. +/()\Q\E*]/BCZ
  458. +
  459. /-- End of testinput2 --/
  460. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  461. index 09756b8..ac33cc4 100644
  462. --- a/testdata/testoutput2
  463. +++ b/testdata/testoutput2
  464. @@ -14624,4 +14624,19 @@ No match
  465. End
  466. ------------------------------------------------------------------
  467. +/()\Q\E*]/BCZ
  468. +------------------------------------------------------------------
  469. + Bra
  470. + Callout 255 0 7
  471. + Brazero
  472. + SCBra 1
  473. + Callout 255 1 0
  474. + KetRmax
  475. + Callout 255 7 1
  476. + ]
  477. + Callout 255 8 0
  478. + Ket
  479. + End
  480. +------------------------------------------------------------------
  481. +
  482. /-- End of testinput2 --/
  483. --
  484. 2.4.3
  485. From 46ed1a703b067e5b679eacf6500a54dae35f8130 Mon Sep 17 00:00:00 2001
  486. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  487. Date: Thu, 3 Dec 2015 17:05:40 +0000
  488. Subject: [PATCH] Fix /x bug when pattern starts with white space and (?-x)
  489. MIME-Version: 1.0
  490. Content-Type: text/plain; charset=UTF-8
  491. Content-Transfer-Encoding: 8bit
  492. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1617 2f5784b3-3f2a-0410-8824-cb99058d5e15
  493. Petr Písař: Ported to 8.38.
  494. diff --git a/pcre_compile.c b/pcre_compile.c
  495. index beed46b..57719b9 100644
  496. --- a/pcre_compile.c
  497. +++ b/pcre_compile.c
  498. @@ -7607,39 +7607,15 @@ for (;; ptr++)
  499. newoptions = (options | set) & (~unset);
  500. /* If the options ended with ')' this is not the start of a nested
  501. - group with option changes, so the options change at this level. If this
  502. - item is right at the start of the pattern, the options can be
  503. - abstracted and made external in the pre-compile phase, and ignored in
  504. - the compile phase. This can be helpful when matching -- for instance in
  505. - caseless checking of required bytes.
  506. -
  507. - If the code pointer is not (cd->start_code + 1 + LINK_SIZE), we are
  508. - definitely *not* at the start of the pattern because something has been
  509. - compiled. In the pre-compile phase, however, the code pointer can have
  510. - that value after the start, because it gets reset as code is discarded
  511. - during the pre-compile. However, this can happen only at top level - if
  512. - we are within parentheses, the starting BRA will still be present. At
  513. - any parenthesis level, the length value can be used to test if anything
  514. - has been compiled at that level. Thus, a test for both these conditions
  515. - is necessary to ensure we correctly detect the start of the pattern in
  516. - both phases.
  517. -
  518. + group with option changes, so the options change at this level.
  519. If we are not at the pattern start, reset the greedy defaults and the
  520. case value for firstchar and reqchar. */
  521. if (*ptr == CHAR_RIGHT_PARENTHESIS)
  522. {
  523. - if (code == cd->start_code + 1 + LINK_SIZE &&
  524. - (lengthptr == NULL || *lengthptr == 2 + 2*LINK_SIZE))
  525. - {
  526. - cd->external_options = newoptions;
  527. - }
  528. - else
  529. - {
  530. - greedy_default = ((newoptions & PCRE_UNGREEDY) != 0);
  531. - greedy_non_default = greedy_default ^ 1;
  532. - req_caseopt = ((newoptions & PCRE_CASELESS) != 0)? REQ_CASELESS:0;
  533. - }
  534. + greedy_default = ((newoptions & PCRE_UNGREEDY) != 0);
  535. + greedy_non_default = greedy_default ^ 1;
  536. + req_caseopt = ((newoptions & PCRE_CASELESS) != 0)? REQ_CASELESS:0;
  537. /* Change options at this level, and pass them back for use
  538. in subsequent branches. */
  539. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  540. index ac33cc4..6c42897 100644
  541. --- a/testdata/testoutput2
  542. +++ b/testdata/testoutput2
  543. @@ -419,7 +419,7 @@ Need char = '>'
  544. /(?U)<.*>/I
  545. Capturing subpattern count = 0
  546. -Options: ungreedy
  547. +No options
  548. First char = '<'
  549. Need char = '>'
  550. abc<def>ghi<klm>nop
  551. @@ -443,7 +443,7 @@ Need char = '='
  552. /(?U)={3,}?/I
  553. Capturing subpattern count = 0
  554. -Options: ungreedy
  555. +No options
  556. First char = '='
  557. Need char = '='
  558. abc========def
  559. @@ -477,7 +477,7 @@ Failed: lookbehind assertion is not fixed length at offset 12
  560. /(?i)abc/I
  561. Capturing subpattern count = 0
  562. -Options: caseless
  563. +No options
  564. First char = 'a' (caseless)
  565. Need char = 'c' (caseless)
  566. @@ -489,7 +489,7 @@ No need char
  567. /(?i)^1234/I
  568. Capturing subpattern count = 0
  569. -Options: anchored caseless
  570. +Options: anchored
  571. No first char
  572. No need char
  573. @@ -502,7 +502,7 @@ No need char
  574. /(?s).*/I
  575. Capturing subpattern count = 0
  576. May match empty string
  577. -Options: anchored dotall
  578. +Options: anchored
  579. No first char
  580. No need char
  581. @@ -516,7 +516,7 @@ Starting chars: a b c d
  582. /(?i)[abcd]/IS
  583. Capturing subpattern count = 0
  584. -Options: caseless
  585. +No options
  586. No first char
  587. No need char
  588. Subject length lower bound = 1
  589. @@ -524,7 +524,7 @@ Starting chars: A B C D a b c d
  590. /(?m)[xy]|(b|c)/IS
  591. Capturing subpattern count = 1
  592. -Options: multiline
  593. +No options
  594. No first char
  595. No need char
  596. Subject length lower bound = 1
  597. @@ -538,7 +538,7 @@ No need char
  598. /(?i)(^a|^b)/Im
  599. Capturing subpattern count = 1
  600. -Options: caseless multiline
  601. +Options: multiline
  602. First char at start or follows newline
  603. No need char
  604. @@ -1179,7 +1179,7 @@ No need char
  605. End
  606. ------------------------------------------------------------------
  607. Capturing subpattern count = 1
  608. -Options: anchored dotall
  609. +Options: anchored
  610. No first char
  611. No need char
  612. @@ -2735,7 +2735,7 @@ No match
  613. End
  614. ------------------------------------------------------------------
  615. Capturing subpattern count = 0
  616. -Options: caseless extended
  617. +Options: extended
  618. First char = 'a' (caseless)
  619. Need char = 'c' (caseless)
  620. @@ -2748,7 +2748,7 @@ Need char = 'c' (caseless)
  621. End
  622. ------------------------------------------------------------------
  623. Capturing subpattern count = 0
  624. -Options: caseless extended
  625. +Options: extended
  626. First char = 'a' (caseless)
  627. Need char = 'c' (caseless)
  628. @@ -3095,7 +3095,7 @@ Need char = 'b'
  629. End
  630. ------------------------------------------------------------------
  631. Capturing subpattern count = 0
  632. -Options: ungreedy
  633. +No options
  634. First char = 'x'
  635. Need char = 'b'
  636. xaaaab
  637. @@ -3497,7 +3497,7 @@ Need char = 'c'
  638. /(?i)[ab]/IS
  639. Capturing subpattern count = 0
  640. -Options: caseless
  641. +No options
  642. No first char
  643. No need char
  644. Subject length lower bound = 1
  645. @@ -6299,7 +6299,7 @@ Capturing subpattern count = 3
  646. Named capturing subpatterns:
  647. A 2
  648. A 3
  649. -Options: anchored dupnames
  650. +Options: anchored
  651. Duplicate name status changes
  652. No first char
  653. No need char
  654. --
  655. 2.4.3
  656. From db1fb68feddc9afe6f8822d099fa9ff25e3ea8e7 Mon Sep 17 00:00:00 2001
  657. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  658. Date: Sat, 5 Dec 2015 16:30:14 +0000
  659. Subject: [PATCH] Fix copy named substring bug.
  660. MIME-Version: 1.0
  661. Content-Type: text/plain; charset=UTF-8
  662. Content-Transfer-Encoding: 8bit
  663. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1618 2f5784b3-3f2a-0410-8824-cb99058d5e15
  664. Petr Písař: Ported to 8.38.
  665. diff --git a/pcre_get.c b/pcre_get.c
  666. index 8094b34..41eda9c 100644
  667. --- a/pcre_get.c
  668. +++ b/pcre_get.c
  669. @@ -250,6 +250,7 @@ Arguments:
  670. code the compiled regex
  671. stringname the name of the capturing substring
  672. ovector the vector of matched substrings
  673. + stringcount number of captured substrings
  674. Returns: the number of the first that is set,
  675. or the number of the last one if none are set,
  676. @@ -258,13 +259,16 @@ Returns: the number of the first that is set,
  677. #if defined COMPILE_PCRE8
  678. static int
  679. -get_first_set(const pcre *code, const char *stringname, int *ovector)
  680. +get_first_set(const pcre *code, const char *stringname, int *ovector,
  681. + int stringcount)
  682. #elif defined COMPILE_PCRE16
  683. static int
  684. -get_first_set(const pcre16 *code, PCRE_SPTR16 stringname, int *ovector)
  685. +get_first_set(const pcre16 *code, PCRE_SPTR16 stringname, int *ovector,
  686. + int stringcount)
  687. #elif defined COMPILE_PCRE32
  688. static int
  689. -get_first_set(const pcre32 *code, PCRE_SPTR32 stringname, int *ovector)
  690. +get_first_set(const pcre32 *code, PCRE_SPTR32 stringname, int *ovector,
  691. + int stringcount)
  692. #endif
  693. {
  694. const REAL_PCRE *re = (const REAL_PCRE *)code;
  695. @@ -295,7 +299,7 @@ if (entrysize <= 0) return entrysize;
  696. for (entry = (pcre_uchar *)first; entry <= (pcre_uchar *)last; entry += entrysize)
  697. {
  698. int n = GET2(entry, 0);
  699. - if (ovector[n*2] >= 0) return n;
  700. + if (n < stringcount && ovector[n*2] >= 0) return n;
  701. }
  702. return GET2(entry, 0);
  703. }
  704. @@ -402,7 +406,7 @@ pcre32_copy_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
  705. PCRE_UCHAR32 *buffer, int size)
  706. #endif
  707. {
  708. -int n = get_first_set(code, stringname, ovector);
  709. +int n = get_first_set(code, stringname, ovector, stringcount);
  710. if (n <= 0) return n;
  711. #if defined COMPILE_PCRE8
  712. return pcre_copy_substring(subject, ovector, stringcount, n, buffer, size);
  713. @@ -619,7 +623,7 @@ pcre32_get_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
  714. PCRE_SPTR32 *stringptr)
  715. #endif
  716. {
  717. -int n = get_first_set(code, stringname, ovector);
  718. +int n = get_first_set(code, stringname, ovector, stringcount);
  719. if (n <= 0) return n;
  720. #if defined COMPILE_PCRE8
  721. return pcre_get_substring(subject, ovector, stringcount, n, stringptr);
  722. diff --git a/testdata/testinput2 b/testdata/testinput2
  723. index 3a1134f..00ffe32 100644
  724. --- a/testdata/testinput2
  725. +++ b/testdata/testinput2
  726. @@ -4229,4 +4229,7 @@ backtracking verbs. --/
  727. /()\Q\E*]/BCZ
  728. +/(?<A>)(?J:(?<B>)(?<B>))(?<C>)/
  729. + \O\CC
  730. +
  731. /-- End of testinput2 --/
  732. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  733. index 6c42897..ffb4466 100644
  734. --- a/testdata/testoutput2
  735. +++ b/testdata/testoutput2
  736. @@ -14639,4 +14639,9 @@ No match
  737. End
  738. ------------------------------------------------------------------
  739. +/(?<A>)(?J:(?<B>)(?<B>))(?<C>)/
  740. + \O\CC
  741. +Matched, but too many substrings
  742. +copy substring C failed -7
  743. +
  744. /-- End of testinput2 --/
  745. --
  746. 2.4.3
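Reviewer note on the patch above: `get_first_set()` now receives `stringcount` so that it never reads an `ovector` pair beyond what the match actually filled in. The sketch below is a simplified model of that bounded lookup. A plain array of group numbers stands in for the name table walked with `GET2`, and the helper name `first_set_group` is hypothetical. The fallback follows the function's documented return value (the number of the last group when none is set).

```c
/* Hypothetical simplification of the lookup. group_numbers lists the
   capture-group numbers that share one name and must hold at least one
   entry; ovector holds stringcount (start, end) pairs. */
static int first_set_group(const int *group_numbers, int ngroups,
                           const int *ovector, int stringcount)
{
  int i;
  for (i = 0; i < ngroups; i++)
    {
    int n = group_numbers[i];
    /* Only look at ovector[n*2] when group n lies inside the filled part. */
    if (n < stringcount && ovector[n*2] >= 0) return n;
    }
  return group_numbers[ngroups - 1];   /* none set: report the last one */
}
```

With a small `ovector`, as in the new `\O\CC` test, the bound keeps the loop from indexing past the end of the vector; the caller then fails cleanly with a "no such substring" error instead.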
  747. From 40363ebc19baeab160abaaa55dc84322a89ac35a Mon Sep 17 00:00:00 2001
  748. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  749. Date: Sat, 5 Dec 2015 16:58:46 +0000
  750. Subject: [PATCH] Fix (by hacking) another length computation issue.
  751. MIME-Version: 1.0
  752. Content-Type: text/plain; charset=UTF-8
  753. Content-Transfer-Encoding: 8bit
  754. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1619 2f5784b3-3f2a-0410-8824-cb99058d5e15
  755. Petr Písař: Ported to 8.38.
  756. diff --git a/pcre_compile.c b/pcre_compile.c
  757. index 57719b9..087bf2a 100644
  758. --- a/pcre_compile.c
  759. +++ b/pcre_compile.c
  760. @@ -7280,7 +7280,7 @@ for (;; ptr++)
  761. issue is fixed "properly" in PCRE2. As PCRE1 is now in maintenance
  762. only mode, we finesse the bug by allowing more memory always. */
  763. - *lengthptr += 2 + 2*LINK_SIZE;
  764. + *lengthptr += 4 + 4*LINK_SIZE;
  765. /* It is even worse than that. The current reference may be to an
  766. existing named group with a different number (so apparently not
  767. diff --git a/testdata/testoutput11-16 b/testdata/testoutput11-16
  768. index 9a0a12d..280692e 100644
  769. --- a/testdata/testoutput11-16
  770. +++ b/testdata/testoutput11-16
  771. @@ -231,7 +231,7 @@ Memory allocation (code space): 73
  772. ------------------------------------------------------------------
  773. /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
  774. -Memory allocation (code space): 77
  775. +Memory allocation (code space): 93
  776. ------------------------------------------------------------------
  777. 0 24 Bra
  778. 2 5 CBra 1
  779. diff --git a/testdata/testoutput11-32 b/testdata/testoutput11-32
  780. index 57e5da0..cdbda74 100644
  781. --- a/testdata/testoutput11-32
  782. +++ b/testdata/testoutput11-32
  783. @@ -231,7 +231,7 @@ Memory allocation (code space): 155
  784. ------------------------------------------------------------------
  785. /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
  786. -Memory allocation (code space): 157
  787. +Memory allocation (code space): 189
  788. ------------------------------------------------------------------
  789. 0 24 Bra
  790. 2 5 CBra 1
  791. diff --git a/testdata/testoutput11-8 b/testdata/testoutput11-8
  792. index 748548a..cb37896 100644
  793. --- a/testdata/testoutput11-8
  794. +++ b/testdata/testoutput11-8
  795. @@ -231,7 +231,7 @@ Memory allocation (code space): 45
  796. ------------------------------------------------------------------
  797. /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
  798. -Memory allocation (code space): 50
  799. +Memory allocation (code space): 62
  800. ------------------------------------------------------------------
  801. 0 30 Bra
  802. 3 7 CBra 1
  803. --
  804. 2.4.3
  805. From 4f47274a2eb10131d88145ad7fd0eed4027a0c51 Mon Sep 17 00:00:00 2001
  806. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  807. Date: Tue, 8 Dec 2015 11:06:40 +0000
  808. Subject: [PATCH] Fix get_substring_list() bug when \K is used in an assertion.
  809. MIME-Version: 1.0
  810. Content-Type: text/plain; charset=UTF-8
  811. Content-Transfer-Encoding: 8bit
  812. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1620 2f5784b3-3f2a-0410-8824-cb99058d5e15
  813. Petr Písař: Ported to 8.38.
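Note on the fix: when \K is used inside a lookahead, the reported match start can lie beyond the match end, so end - start is negative. The sketch below illustrates the clamping idea for 8-bit code units only. The names clamped_len and list_size are hypothetical helpers written for this note, not PCRE functions, and the size accounting (one pointer and one terminating zero per string, plus a final NULL pointer) is an assumption about the list layout rather than a copy of the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PCRE): length of the substring described
   by one ovector pair. A reversed pair (end before start), as \K inside an
   assertion can produce, is treated as an empty string. */
static int clamped_len(const int *ovector, int pair)
{
  int start = ovector[2 * pair];
  int end = ovector[2 * pair + 1];
  return (end > start) ? (end - start) : 0;
}

/* Hypothetical helper: bytes needed for a NULL-terminated pointer list
   followed by all substrings, each with a terminating zero. Assumes 8-bit
   code units, so one unit is one byte. */
static size_t list_size(const int *ovector, int stringcount)
{
  size_t size = sizeof(char *);          /* final NULL pointer */
  int i;
  for (i = 0; i < stringcount; i++)
    size += sizeof(char *) + 1 + (size_t)clamped_len(ovector, i);
  return size;
}
```

With a reversed pair such as {1, 0}, clamped_len yields 0, so the size never shrinks and a later memcpy is never asked to copy a negative length.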
  814. diff --git a/pcre_get.c b/pcre_get.c
  815. index 41eda9c..cdd2abc 100644
  816. --- a/pcre_get.c
  817. +++ b/pcre_get.c
  818. @@ -461,7 +461,10 @@ pcre_uchar **stringlist;
  819. pcre_uchar *p;
  820. for (i = 0; i < double_count; i += 2)
  821. - size += sizeof(pcre_uchar *) + IN_UCHARS(ovector[i+1] - ovector[i] + 1);
  822. + {
  823. + size += sizeof(pcre_uchar *) + IN_UCHARS(1);
  824. + if (ovector[i+1] > ovector[i]) size += IN_UCHARS(ovector[i+1] - ovector[i]);
  825. + }
  826. stringlist = (pcre_uchar **)(PUBL(malloc))(size);
  827. if (stringlist == NULL) return PCRE_ERROR_NOMEMORY;
  828. @@ -477,7 +480,7 @@ p = (pcre_uchar *)(stringlist + stringcount + 1);
  829. for (i = 0; i < double_count; i += 2)
  830. {
  831. - int len = ovector[i+1] - ovector[i];
  832. + int len = (ovector[i+1] > ovector[i])? (ovector[i+1] - ovector[i]) : 0;
  833. memcpy(p, subject + ovector[i], IN_UCHARS(len));
  834. *stringlist++ = p;
  835. p += len;
  836. diff --git a/testdata/testinput2 b/testdata/testinput2
  837. index 00ffe32..967a241 100644
  838. --- a/testdata/testinput2
  839. +++ b/testdata/testinput2
  840. @@ -4232,4 +4232,7 @@ backtracking verbs. --/
  841. /(?<A>)(?J:(?<B>)(?<B>))(?<C>)/
  842. \O\CC
  843. +/(?=a\K)/
  844. + ring bpattingbobnd $ 1,oern cou \rb\L
  845. +
  846. /-- End of testinput2 --/
  847. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  848. index ffb4466..5fb28d5 100644
  849. --- a/testdata/testoutput2
  850. +++ b/testdata/testoutput2
  851. @@ -14644,4 +14644,10 @@ No match
  852. Matched, but too many substrings
  853. copy substring C failed -7
  854. +/(?=a\K)/
  855. + ring bpattingbobnd $ 1,oern cou \rb\L
  856. +Start of matched string is beyond its end - displaying from end to start.
  857. + 0: a
  858. + 0L
  859. +
  860. /-- End of testinput2 --/
  861. --
  862. 2.5.0
  863. From 3da5528b47b88c32224cf9d14d8a4e80cd7a0815 Mon Sep 17 00:00:00 2001
  864. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  865. Date: Sat, 6 Feb 2016 16:54:14 +0000
  866. Subject: [PATCH] Fix pcretest bad behaviour for callout in lookbehind.
  867. MIME-Version: 1.0
  868. Content-Type: text/plain; charset=UTF-8
  869. Content-Transfer-Encoding: 8bit
  870. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1625 2f5784b3-3f2a-0410-8824-cb99058d5e15
  871. Petr Písař: Ported to 8.38.
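Note on the fix: inside a lookbehind, the callout's current position can be earlier than the match start, which would make the "matched so far" length negative. The sketch below shows the clamping in isolation. The type demo_split and the function split_subject are hypothetical names used only for this note; pcretest itself prints the three pieces through its PCHARS macros instead of returning lengths.

```c
#include <assert.h>

/* Hypothetical result type: lengths of the three pieces of the subject that
   a callout display would print. */
typedef struct {
  int pre_len;      /* text before the match start */
  int matched_len;  /* text between match start and current position */
  int rest_len;     /* text from the current position to the end */
} demo_split;

/* Hypothetical helper: if the current position is earlier than the match
   start (possible inside a lookbehind), use the match start instead. */
static demo_split split_subject(int subject_length, int start_match,
                                int current_position)
{
  demo_split s;
  int pos = (current_position >= start_match) ? current_position : start_match;
  s.pre_len = start_match;
  s.matched_len = pos - start_match;
  s.rest_len = subject_length - pos;
  return s;
}
```

For a 4-character subject with the match start at 2 and the current position at 1, the pieces come out as 2, 0 and 2, so they still add up to the subject length.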
  872. diff --git a/pcretest.c b/pcretest.c
  873. index 488e419..63869fd 100644
  874. --- a/pcretest.c
  875. +++ b/pcretest.c
  876. @@ -2250,7 +2250,7 @@ data is not zero. */
  877. static int callout(pcre_callout_block *cb)
  878. {
  879. FILE *f = (first_callout | callout_extra)? outfile : NULL;
  880. -int i, pre_start, post_start, subject_length;
  881. +int i, current_position, pre_start, post_start, subject_length;
  882. if (callout_extra)
  883. {
  884. @@ -2280,14 +2280,19 @@ printed lengths of the substrings. */
  885. if (f != NULL) fprintf(f, "--->");
  886. +/* If a lookbehind is involved, the current position may be earlier than the
  887. +match start. If so, use the match start instead. */
  888. +
  889. +current_position = (cb->current_position >= cb->start_match)?
  890. + cb->current_position : cb->start_match;
  891. +
  892. PCHARS(pre_start, cb->subject, 0, cb->start_match, f);
  893. PCHARS(post_start, cb->subject, cb->start_match,
  894. - cb->current_position - cb->start_match, f);
  895. + current_position - cb->start_match, f);
  896. PCHARS(subject_length, cb->subject, 0, cb->subject_length, NULL);
  897. -PCHARSV(cb->subject, cb->current_position,
  898. - cb->subject_length - cb->current_position, f);
  899. +PCHARSV(cb->subject, current_position, cb->subject_length - current_position, f);
  900. if (f != NULL) fprintf(f, "\n");
  901. @@ -5740,3 +5745,4 @@ return yield;
  902. }
  903. /* End of pcretest.c */
  904. +
  905. diff --git a/testdata/testinput2 b/testdata/testinput2
  906. index 967a241..086e0f4 100644
  907. --- a/testdata/testinput2
  908. +++ b/testdata/testinput2
  909. @@ -4235,4 +4235,8 @@ backtracking verbs. --/
  910. /(?=a\K)/
  911. ring bpattingbobnd $ 1,oern cou \rb\L
  912. +/(?<=((?C)0))/
  913. + 9010
  914. + abcd
  915. +
  916. /-- End of testinput2 --/
  917. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  918. index 5fb28d5..d414a72 100644
  919. --- a/testdata/testoutput2
  920. +++ b/testdata/testoutput2
  921. @@ -14650,4 +14650,19 @@ Start of matched string is beyond its end - displaying from end to start.
  922. 0: a
  923. 0L
  924. +/(?<=((?C)0))/
  925. + 9010
  926. +--->9010
  927. + 0 ^ 0
  928. + 0 ^ 0
  929. + 0:
  930. + 1: 0
  931. + abcd
  932. +--->abcd
  933. + 0 ^ 0
  934. + 0 ^ 0
  935. + 0 ^ 0
  936. + 0 ^ 0
  937. +No match
  938. +
  939. /-- End of testinput2 --/
  940. --
  941. 2.5.0
  942. From 943a5105b9fe2842851003f692c7077a6cdbeefe Mon Sep 17 00:00:00 2001
  943. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  944. Date: Wed, 10 Feb 2016 19:13:17 +0000
  945. Subject: [PATCH] Fix workspace overflow for (*ACCEPT) with deeply nested
  946. parentheses.
  947. MIME-Version: 1.0
  948. Content-Type: text/plain; charset=UTF-8
  949. Content-Transfer-Encoding: 8bit
  950. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1631 2f5784b3-3f2a-0410-8824-cb99058d5e15
  951. Petr Písař: Ported to 8.38.
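Note on the fix: the compile step runs in two modes. When lengthptr is set it only measures how much code space is needed; otherwise it writes the code. The patch makes the (*ACCEPT) handling add the size of each closing item to the length during measurement instead of writing into the fixed workspace. The sketch below illustrates that split under the assumption of 8-bit code units with a two-byte group number. OP_CLOSE_DEMO and emit_close are hypothetical names for this note, not PCRE identifiers.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_CLOSE_DEMO = 1 };   /* hypothetical opcode value for illustration */

/* Hypothetical emitter. In measuring mode (lengthptr != NULL) it only adds
   the item size (1 opcode byte + 2 bytes of group number) and leaves the
   buffer untouched. In emitting mode it writes the three bytes and returns
   the advanced code pointer. */
static unsigned char *emit_close(unsigned char *code, size_t *lengthptr,
                                 unsigned number)
{
  if (lengthptr != NULL)
    {
    *lengthptr += 3;
    return code;
    }
  *code++ = OP_CLOSE_DEMO;
  *code++ = (unsigned char)(number >> 8);     /* high byte of group number */
  *code++ = (unsigned char)(number & 0xff);   /* low byte of group number */
  return code;
}
```

Because nothing is written while measuring, the number of open groups no longer affects how much workspace the first pass consumes at this point.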
  952. diff --git a/pcre_compile.c b/pcre_compile.c
  953. index b9a239e..5019854 100644
  954. --- a/pcre_compile.c
  955. +++ b/pcre_compile.c
  956. @@ -6,7 +6,7 @@
  957. and semantics are as close as possible to those of the Perl 5 language.
  958. Written by Philip Hazel
  959. - Copyright (c) 1997-2014 University of Cambridge
  960. + Copyright (c) 1997-2016 University of Cambridge
  961. -----------------------------------------------------------------------------
  962. Redistribution and use in source and binary forms, with or without
  963. @@ -560,6 +560,7 @@ static const char error_texts[] =
  964. /* 85 */
  965. "parentheses are too deeply nested (stack check)\0"
  966. "digits missing in \\x{} or \\o{}\0"
  967. + "regular expression is too complicated\0"
  968. ;
  969. /* Table to identify digits and hex digits. This is used when compiling
  970. @@ -4591,7 +4592,8 @@ for (;; ptr++)
  971. if (code > cd->start_workspace + cd->workspace_size -
  972. WORK_SIZE_SAFETY_MARGIN) /* Check for overrun */
  973. {
  974. - *errorcodeptr = ERR52;
  975. + *errorcodeptr = (code >= cd->start_workspace + cd->workspace_size)?
  976. + ERR52 : ERR87;
  977. goto FAILED;
  978. }
  979. @@ -6626,8 +6628,21 @@ for (;; ptr++)
  980. cd->had_accept = TRUE;
  981. for (oc = cd->open_caps; oc != NULL; oc = oc->next)
  982. {
  983. - *code++ = OP_CLOSE;
  984. - PUT2INC(code, 0, oc->number);
  985. + if (lengthptr != NULL)
  986. + {
  987. +#ifdef COMPILE_PCRE8
  988. + *lengthptr += 1 + IMM2_SIZE;
  989. +#elif defined COMPILE_PCRE16
  990. + *lengthptr += 2 + IMM2_SIZE;
  991. +#elif defined COMPILE_PCRE32
  992. + *lengthptr += 4 + IMM2_SIZE;
  993. +#endif
  994. + }
  995. + else
  996. + {
  997. + *code++ = OP_CLOSE;
  998. + PUT2INC(code, 0, oc->number);
  999. + }
  1000. }
  1001. setverb = *code++ =
  1002. (cd->assert_depth > 0)? OP_ASSERT_ACCEPT : OP_ACCEPT;
  1003. diff --git a/pcre_internal.h b/pcre_internal.h
  1004. index f7a5ee7..dbfe80e 100644
  1005. --- a/pcre_internal.h
  1006. +++ b/pcre_internal.h
  1007. @@ -7,7 +7,7 @@
  1008. and semantics are as close as possible to those of the Perl 5 language.
  1009. Written by Philip Hazel
  1010. - Copyright (c) 1997-2014 University of Cambridge
  1011. + Copyright (c) 1997-2016 University of Cambridge
  1012. -----------------------------------------------------------------------------
  1013. Redistribution and use in source and binary forms, with or without
  1014. @@ -2289,7 +2289,7 @@ enum { ERR0, ERR1, ERR2, ERR3, ERR4, ERR5, ERR6, ERR7, ERR8, ERR9,
  1015. ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
  1016. ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
  1017. ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79,
  1018. - ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERRCOUNT };
  1019. + ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERR87, ERRCOUNT };
  1020. /* JIT compiling modes. The function list is indexed by them. */
  1021. diff --git a/pcreposix.c b/pcreposix.c
  1022. index dcc13ef..55b6ddc 100644
  1023. --- a/pcreposix.c
  1024. +++ b/pcreposix.c
  1025. @@ -6,7 +6,7 @@
  1026. and semantics are as close as possible to those of the Perl 5 language.
  1027. Written by Philip Hazel
  1028. - Copyright (c) 1997-2014 University of Cambridge
  1029. + Copyright (c) 1997-2016 University of Cambridge
  1030. -----------------------------------------------------------------------------
  1031. Redistribution and use in source and binary forms, with or without
  1032. @@ -173,7 +173,8 @@ static const int eint[] = {
  1033. REG_BADPAT, /* group name must start with a non-digit */
  1034. /* 85 */
  1035. REG_BADPAT, /* parentheses too deeply nested (stack check) */
  1036. - REG_BADPAT /* missing digits in \x{} or \o{} */
  1037. + REG_BADPAT, /* missing digits in \x{} or \o{} */
  1038. + REG_BADPAT /* pattern too complicated */
  1039. };
  1040. /* Table of texts corresponding to POSIX error codes */
  1041. diff --git a/testdata/testinput11 b/testdata/testinput11
  1042. index ac9d228..6f0989a 100644
  1043. --- a/testdata/testinput11
  1044. +++ b/testdata/testinput11
  1045. @@ -138,4 +138,6 @@ is required for these tests. --/
  1046. /.((?2)(?R)\1)()/B
  1047. +/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
  1048. +
  1049. /-- End of testinput11 --/
  1050. diff --git a/testdata/testoutput11-16 b/testdata/testoutput11-16
  1051. index 280692e..3c485da 100644
  1052. --- a/testdata/testoutput11-16
  1053. +++ b/testdata/testoutput11-16
  1054. @@ -765,4 +765,7 @@ Memory allocation (code space): 14
  1055. 25 End
  1056. ------------------------------------------------------------------
  1057. +/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
  1058. +Failed: regular expression is too complicated at offset 490
  1059. +
  1060. /-- End of testinput11 --/
  1061. diff --git a/testdata/testoutput11-32 b/testdata/testoutput11-32
  1062. index cdbda74..e19518d 100644
  1063. --- a/testdata/testoutput11-32
  1064. +++ b/testdata/testoutput11-32
  1065. @@ -765,4 +765,7 @@ Memory allocation (code space): 28
  1066. 25 End
  1067. ------------------------------------------------------------------
  1068. +/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
  1069. +Failed: missing ) at offset 509
  1070. +
  1071. /-- End of testinput11 --/
  1072. diff --git a/testdata/testoutput11-8 b/testdata/testoutput11-8
  1073. index cb37896..5a4fbb2 100644
  1074. --- a/testdata/testoutput11-8
  1075. +++ b/testdata/testoutput11-8
  1076. @@ -765,4 +765,7 @@ Memory allocation (code space): 10
  1077. 38 End
  1078. ------------------------------------------------------------------
  1079. +/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
  1080. +Failed: missing ) at offset 509
  1081. +
  1082. /-- End of testinput11 --/
  1083. --
  1084. 2.5.0
  1085. From b7537308b7c758f33c347cb0bec62754c43c271f Mon Sep 17 00:00:00 2001
  1086. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  1087. Date: Sat, 27 Feb 2016 17:38:11 +0000
  1088. Subject: [PATCH] Yet another duplicate name bugfix by overestimating the
  1089. memory needed (i.e. another hack - PCRE2 has this "properly" fixed).
  1090. MIME-Version: 1.0
  1091. Content-Type: text/plain; charset=UTF-8
  1092. Content-Transfer-Encoding: 8bit
  1093. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1636 2f5784b3-3f2a-0410-8824-cb99058d5e15
  1094. Petr Písař: Ported to 8.38.
  1095. diff --git a/pcre_compile.c b/pcre_compile.c
  1096. index 5019854..4ffea0c 100644
  1097. --- a/pcre_compile.c
  1098. +++ b/pcre_compile.c
  1099. @@ -7311,7 +7311,12 @@ for (;; ptr++)
  1100. so far in order to get the number. If the name is not found, leave
  1101. the value of recno as 0 for a forward reference. */
  1102. - else
  1103. + /* This patch (removing "else") fixes a problem when a reference is
  1104. + to multiple identically named nested groups from within the nest.
  1105. + Once again, it is not the "proper" fix, and it results in an
  1106. + over-allocation of memory. */
  1107. +
  1108. + /* else */
  1109. {
  1110. ng = cd->named_groups;
  1111. for (i = 0; i < cd->names_found; i++, ng++)
  1112. diff --git a/testdata/testinput2 b/testdata/testinput2
  1113. index 086e0f4..c805f5f 100644
  1114. --- a/testdata/testinput2
  1115. +++ b/testdata/testinput2
  1116. @@ -4239,4 +4239,6 @@ backtracking verbs. --/
  1117. 9010
  1118. abcd
  1119. +/((?J)(?'R'(?'R'(?'R'(?'R'(?'R'(?|(\k'R'))))))))/
  1120. +
  1121. /-- End of testinput2 --/
  1122. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  1123. index d414a72..800a72f 100644
  1124. --- a/testdata/testoutput2
  1125. +++ b/testdata/testoutput2
  1126. @@ -14665,4 +14665,6 @@ Start of matched string is beyond its end - displaying from end to start.
  1127. 0 ^ 0
  1128. No match
  1129. +/((?J)(?'R'(?'R'(?'R'(?'R'(?'R'(?|(\k'R'))))))))/
  1130. +
  1131. /-- End of testinput2 --/
  1132. --
  1133. 2.5.0
  1134. From 0fc2edb79b3815c6511fd75c36a57893e4acaee6 Mon Sep 17 00:00:00 2001
  1135. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  1136. Date: Sat, 27 Feb 2016 17:55:24 +0000
  1137. Subject: [PATCH] Fix pcretest loop for global matching with an ovector size
  1138. less than 2.
  1139. MIME-Version: 1.0
  1140. Content-Type: text/plain; charset=UTF-8
  1141. Content-Transfer-Encoding: 8bit
  1142. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1637 2f5784b3-3f2a-0410-8824-cb99058d5e15
  1143. Petr Písař: Ported to 8.38.
  1144. diff --git a/pcretest.c b/pcretest.c
  1145. index 63869fd..78ef517 100644
  1146. --- a/pcretest.c
  1147. +++ b/pcretest.c
  1148. @@ -5617,6 +5617,12 @@ while (!done)
  1149. break;
  1150. }
  1151. + if (use_size_offsets < 2)
  1152. + {
  1153. + fprintf(outfile, "Cannot do global matching with an ovector size < 2\n");
  1154. + break;
  1155. + }
  1156. +
  1157. /* If we have matched an empty string, first check to see if we are at
  1158. the end of the subject. If so, the /g loop is over. Otherwise, mimic what
  1159. Perl's /g options does. This turns out to be rather cunning. First we set
  1160. --
  1161. 2.5.0
  1162. From b3db1b7de5cfaa026ec2bc4a393129461a0f5c57 Mon Sep 17 00:00:00 2001
  1163. From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
  1164. Date: Sat, 27 Feb 2016 18:44:41 +0000
  1165. Subject: [PATCH] Fix non-diagnosis of missing assertion after (?(?C).
  1166. MIME-Version: 1.0
  1167. Content-Type: text/plain; charset=UTF-8
  1168. Content-Transfer-Encoding: 8bit
  1169. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1638 2f5784b3-3f2a-0410-8824-cb99058d5e15
  1170. Petr Písař: Ported to 8.38.
  1171. diff --git a/pcre_compile.c b/pcre_compile.c
  1172. index 4ffea0c..254c629 100644
  1173. --- a/pcre_compile.c
  1174. +++ b/pcre_compile.c
  1175. @@ -485,7 +485,7 @@ static const char error_texts[] =
  1176. "lookbehind assertion is not fixed length\0"
  1177. "malformed number or name after (?(\0"
  1178. "conditional group contains more than two branches\0"
  1179. - "assertion expected after (?(\0"
  1180. + "assertion expected after (?( or (?(?C)\0"
  1181. "(?R or (?[+-]digits must be followed by )\0"
  1182. /* 30 */
  1183. "unknown POSIX class name\0"
  1184. @@ -6771,6 +6771,15 @@ for (;; ptr++)
  1185. for (i = 3;; i++) if (!IS_DIGIT(ptr[i])) break;
  1186. if (ptr[i] == CHAR_RIGHT_PARENTHESIS)
  1187. tempptr += i + 1;
  1188. +
  1189. + /* tempptr should now be pointing to the opening parenthesis of the
  1190. + assertion condition. */
  1191. +
  1192. + if (*tempptr != CHAR_LEFT_PARENTHESIS)
  1193. + {
  1194. + *errorcodeptr = ERR28;
  1195. + goto FAILED;
  1196. + }
  1197. }
  1198. /* For conditions that are assertions, check the syntax, and then exit
  1199. diff --git a/testdata/testinput2 b/testdata/testinput2
  1200. index c805f5f..75e402e 100644
  1201. --- a/testdata/testinput2
  1202. +++ b/testdata/testinput2
  1203. @@ -4241,4 +4241,6 @@ backtracking verbs. --/
  1204. /((?J)(?'R'(?'R'(?'R'(?'R'(?'R'(?|(\k'R'))))))))/
  1205. +/\N(?(?C)0?!.)*/
  1206. +
  1207. /-- End of testinput2 --/
  1208. diff --git a/testdata/testoutput2 b/testdata/testoutput2
  1209. index 800a72f..5e88d1a 100644
  1210. --- a/testdata/testoutput2
  1211. +++ b/testdata/testoutput2
  1212. @@ -555,13 +555,13 @@ Failed: malformed number or name after (?( at offset 4
  1213. Failed: malformed number or name after (?( at offset 4
  1214. /(?(?i))/
  1215. -Failed: assertion expected after (?( at offset 3
  1216. +Failed: assertion expected after (?( or (?(?C) at offset 3
  1217. /(?(abc))/
  1218. Failed: reference to non-existent subpattern at offset 7
  1219. /(?(?<ab))/
  1220. -Failed: assertion expected after (?( at offset 3
  1221. +Failed: assertion expected after (?( or (?(?C) at offset 3
  1222. /((?s)blah)\s+\1/I
  1223. Capturing subpattern count = 1
  1224. @@ -7870,7 +7870,7 @@ No match
  1225. Failed: malformed number or name after (?( at offset 6
  1226. /(?(''))/
  1227. -Failed: assertion expected after (?( at offset 4
  1228. +Failed: assertion expected after (?( or (?(?C) at offset 4
  1229. /(?('R')stuff)/
  1230. Failed: reference to non-existent subpattern at offset 7
  1231. @@ -14346,7 +14346,7 @@ No match
  1232. "((?2)+)((?1))"
  1233. "(?(?<E>.*!.*)?)"
  1234. -Failed: assertion expected after (?( at offset 3
  1235. +Failed: assertion expected after (?( or (?(?C) at offset 3
  1236. "X((?2)()*+){2}+"BZ
  1237. ------------------------------------------------------------------
  1238. @@ -14667,4 +14667,7 @@ No match
  1239. /((?J)(?'R'(?'R'(?'R'(?'R'(?'R'(?|(\k'R'))))))))/
  1240. +/\N(?(?C)0?!.)*/
  1241. +Failed: assertion expected after (?( or (?(?C) at offset 4
  1242. +
  1243. /-- End of testinput2 --/
  1244. --
  1245. 2.5.0