Regular expression tooling: sre_yield

jopen, 9 years ago

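sre_yield is a Python module that generates the strings a regular expression can match, covering every valid value as far as possible. It works from the parsed regular expression, so the result is far more accurate than what you would get by simply splitting the pattern string apart.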

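sre_yield generally cannot handle regular expressions that use backreferences or lookarounds. Beyond those, its behavior also differs from the re module in the following cases: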

  • The maximum value for repeats is system-dependent. In CPython's sre module there is a special value that is treated as infinite (either 2**16-1 or 2**32-1, depending on the build). In sre_yield this value is taken as a literal rather than as infinite, so on a 2**16-1 platform an unbounded repeat is capped at 65535 repetitions.

  • The re module docs say "Regular expression pattern strings may not contain null bytes", yet this appears to work fine.

  • Order does not depend on greediness.

  • The regex is treated as fullmatch.

  • sre_yield is confused by even the simplest of anchors.
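To make the "treated as fullmatch" point in the list above concrete, here is a minimal brute-force sketch that uses only the standard library. It is not how sre_yield works internally (sre_yield works from the parsed pattern). The helper name `brute_force_matches`, the explicit alphabet, and the length limit are all assumptions made for illustration.

```python
import itertools
import re


def brute_force_matches(pattern, alphabet, max_len):
    """Hypothetical helper: list every string over `alphabet`, up to
    `max_len` characters long, that the pattern matches in full."""
    compiled = re.compile(pattern)
    matches = []
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            # fullmatch: the whole candidate must match, not just a prefix.
            if compiled.fullmatch(candidate):
                matches.append(candidate)
    return matches


all_values = brute_force_matches("[abc]{1,4}", "abc", 4)
print(len(all_values))   # 3 + 9 + 27 + 81 = 120
print(all_values[:4])    # ['a', 'b', 'c', 'aa']

# A prefix match would accept a longer string; a full match does not.
print(re.match("[abc]{1,4}", "abcabc") is not None)      # True
print(re.fullmatch("[abc]{1,4}", "abcabc") is not None)  # False
```

The brute-force approach grows exponentially with the length limit, so it is only useful for checking small patterns by hand.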

Code example:

>>> import random
>>> import sre_yield
>>> v = sre_yield.AllStrings('[abc]{1,4}')
>>> len(v)
120

# Now random.choice(v) has a 3/120 chance of choosing a single letter.
>>> random.seed(1)
>>> sum([1 if len(random.choice(v)) == 1 else 0 for _ in range(120)])
3

# xeger has a ~25% chance of choosing a single letter, because the length
# and the match are chosen independently.
>>> from rstr import xeger
>>> sum([1 if len(xeger('[abc]{1,4}')) == 1 else 0 for _ in range(120)])
26
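The two figures in the example above (3/120 versus roughly 25%) can be checked with a little exact arithmetic. The sketch below assumes that the length-first strategy picks each length from 1 to 4 with equal probability. That is an assumption about the sampling strategy made for illustration, not a statement about how rstr is implemented.

```python
from fractions import Fraction

ALPHABET_SIZE = 3       # the character class [abc]
LENGTHS = range(1, 5)   # the repeat {1,4}

# Number of distinct strings of each length: 3, 9, 27, 81.
counts = {n: ALPHABET_SIZE ** n for n in LENGTHS}
total = sum(counts.values())

# Strategy 1: pick uniformly among all matching strings.
p_uniform = Fraction(counts[1], total)

# Strategy 2 (assumed): pick the length uniformly first, then the letters.
p_length_first = Fraction(1, len(LENGTHS))

print(total, p_uniform, p_length_first)        # 120 1/40 1/4
print(120 * p_uniform, 120 * p_length_first)   # 3 30
```

Under that assumption, about 30 single-letter results are expected in 120 draws, which is in line with the 26 observed in the example.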

Project homepage: http://www.open-open.com/lib/view/home/1429276344236