Subtracting characters in a back reference from a character class in java.util.regex.Pattern
Is it possible to subtract the characters in a back reference from a character class in a Java regex?
E.g., I want to use String#matches(regex) to match either:
any group of characters [a-z'] enclosed by "
matches: "abc'abc"
doesn't match: "1abc'abc"
doesn't match: 'abc"abc'
any group of characters [a-z"] enclosed by '
matches: 'abc"abc'
doesn't match: '1abc"abc'
doesn't match: "abc'abc"
The following regex won't compile because [^\1] isn't supported:
(['"])[a-z'"&&[^\1]]*\1
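A minimal sketch of how the failure shows up in practice, assuming a hypothetical helper named `compiles` (not part of any library) that wraps `Pattern.compile` and reports whether it threw `PatternSyntaxException`. The second pattern is only a control showing that the back reference is accepted outside the character class:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class BackrefInClassCheck {
    // Hypothetical helper: true if the regex compiles, false if
    // Pattern.compile rejects it with a PatternSyntaxException.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Back reference inside a character class (the pattern from the question).
        System.out.println(compiles("(['\"])[a-z'\"&&[^\\1]]*\\1"));
        // Control: the same back reference used only outside the class.
        System.out.println(compiles("(['\"])[a-z]*\\1"));
    }
}
```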
Obviously, the following will work:
'[a-z"]*'|"[a-z']*"
But this style isn't particularly legible when a-z is replaced by a more complex character class that must be kept the same on each side of the "or" condition.
I know that, in Java, I can use String concatenation like the following:
String charClass = "a-z";
String regex = "'[" + charClass + "\"]*'|\"[" + charClass + "']*\"";
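As a rough, self-contained version of the concatenation idea, checked against the sample strings from the question. The class name `QuotedConcatDemo` and the helper `isQuoted` are made up for this illustration:

```java
class QuotedConcatDemo {
    // The shared character class body lives in exactly one place.
    static final String CHAR_CLASS = "a-z";

    // Builds '[a-z"]*'|"[a-z']*" from the shared body.
    static final String REGEX =
            "'[" + CHAR_CLASS + "\"]*'|\"[" + CHAR_CLASS + "']*\"";

    // Hypothetical helper: whole-string match, as String#matches does.
    static boolean isQuoted(String s) {
        return s.matches(REGEX);
    }

    public static void main(String[] args) {
        String[] samples = {
            "\"abc'abc\"", "\"1abc'abc\"", "'abc\"abc'", "'1abc\"abc'"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isQuoted(s));
        }
    }
}
```

Swapping `CHAR_CLASS` for a more complex class body changes both branches at once, which is the legibility gain the concatenation is after.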
But sometimes I need to specify the regex in a config file (XML, JSON, etc.) where Java code is not available.
I assume that what I'm asking for is not possible, but I figured it wouldn't hurt to ask...
One approach is to use a negative look-ahead to make sure that every character in between the quotes is not the enclosing quote:
(['"])(?:(?!\1)[a-z'"])*+\1
         ^^^^^^
(I make the quantifier possessive, since there is no use for backtracking here.)
This approach is, however, rather inefficient, since the pattern checks for the quote character at every single character, on top of checking that the character is one of the allowed characters.
The alternative with 2 branches in the question, '[a-z"]*'|"[a-z']*", is better, since the engine checks for the quote character only once and then goes through the rest by checking that the current character is in the character class.
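To see that the two patterns accept the same inputs, here is a small sketch that runs both against the same strings. The class name `LookaheadVsAlternation` and the helpers are hypothetical, and the sketch only compares match results, not speed:

```java
import java.util.regex.Pattern;

class LookaheadVsAlternation {
    // Look-ahead version: (['"])(?:(?!\1)[a-z'"])*+\1
    static final Pattern LOOKAHEAD =
            Pattern.compile("(['\"])(?:(?!\\1)[a-z'\"])*+\\1");

    // Two-branch version from the question: '[a-z"]*'|"[a-z']*"
    static final Pattern ALTERNATION =
            Pattern.compile("'[a-z\"]*'|\"[a-z']*\"");

    // Whole-string match with the look-ahead pattern.
    static boolean viaLookahead(String s) {
        return LOOKAHEAD.matcher(s).matches();
    }

    // Whole-string match with the two-branch pattern.
    static boolean viaAlternation(String s) {
        return ALTERNATION.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "\"abc'abc\"", "'abc\"abc'", "\"1abc'abc\"", "'abc'abc'", "\"abc'"
        };
        for (String s : samples) {
            System.out.println(s + " -> lookahead=" + viaLookahead(s)
                    + ", alternation=" + viaAlternation(s));
        }
    }
}
```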