Subtracting characters in a back reference from a character class in java.util.regex.Pattern


Is it possible to subtract the characters in a Java regex back reference from a character class?

E.g., I want to use String#matches(regex) to match either:

  1. any group of characters in [a-z'] enclosed by "

    matches: "abc'abc"

    doesn't match: "1abc'abc"

    doesn't match: 'abc"abc'

  2. any group of characters in [a-z"] enclosed by '

    matches: 'abc"abc'

    doesn't match: '1abc"abc'

    doesn't match: "abc'abc"

The following regex won't compile because [^\1] isn't supported:

(['"])[a-z'"&&[^\1]]*\1 
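The failure can be reproduced with a small check. This is only a sketch, assuming a standard JDK; the class name `BackrefInClassDemo` and the helper `compiles` are invented for illustration and are not part of any library.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class BackrefInClassDemo {
    // Hypothetical helper: reports whether java.util.regex accepts the regex.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Regex text: (['"])[a-z'"&&[^\1]]*\1
        System.out.println(compiles("(['\"])[a-z'\"&&[^\\1]]*\\1"));
        // Regex text: '[a-z"]*'|"[a-z']*"
        System.out.println(compiles("'[a-z\"]*'|\"[a-z']*\""));
    }
}
```

The first pattern is the one with the back reference inside the character class, and the second is the plain two-branch alternative for comparison.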

Obviously, the following will work:

'[a-z"]*'|"[a-z']*" 

But this style isn't particularly legible when a-z is replaced by a more complex character class that must be kept the same on each side of the "or" condition.

I know that, in Java, I can use string concatenation like the following:

String charClass = "a-z";
String regex     = "'[" + charClass + "\"]*'|\"[" + charClass + "']*\"";
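To make the concatenation idea concrete, here is a minimal sketch that wraps it in a method so the shared character class is written only once. The class name `QuoteRegexBuilder` and the method `buildRegex` are hypothetical, and the sketch assumes the character class passed in does not itself contain either quote character.

```java
class QuoteRegexBuilder {
    // Hypothetical helper: charClass is the body of a character class
    // (e.g. "a-z") and is assumed not to contain ' or " itself.
    static String buildRegex(String charClass) {
        return "'[" + charClass + "\"]*'|\"[" + charClass + "']*\"";
    }

    public static void main(String[] args) {
        String regex = buildRegex("a-z");
        System.out.println(regex);
        // Sample strings taken from the question.
        System.out.println("\"abc'abc\"".matches(regex));
        System.out.println("\"1abc'abc\"".matches(regex));
        System.out.println("'abc\"abc'".matches(regex));
    }
}
```

With "a-z" as the argument, the method produces the same two-branch regex shown earlier in the question.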

But sometimes I need to specify the regex in a config file (XML, JSON, etc.), where Java code is not available.

I assume that what I'm asking is not possible, but I figured it wouldn't hurt to ask...

One approach is to use a negative look-ahead to make sure that every character in between the quotes is not the opening quote:

(['"])(?:(?!\1)[a-z'"])*+\1
         ^^^^^^

(I make the quantifier possessive, since there is no use for backtracking here.)
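As a quick check, the look-ahead pattern can be run against the sample strings from the question. This is only a minimal sketch: the class name `LookaheadQuoteDemo` and the helper `isQuoted` are made up for illustration.

```java
import java.util.regex.Pattern;

class LookaheadQuoteDemo {
    // Regex text: (['"])(?:(?!\1)[a-z'"])*+\1
    // In a Java string literal, backslashes and double quotes are escaped.
    static final Pattern QUOTED =
            Pattern.compile("(['\"])(?:(?!\\1)[a-z'\"])*+\\1");

    // Hypothetical helper: true if the whole input is one quoted group.
    static boolean isQuoted(String input) {
        return QUOTED.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Sample strings taken from the question.
        String[] samples = {
            "\"abc'abc\"", "\"1abc'abc\"", "'abc\"abc'", "'1abc\"abc'"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isQuoted(s));
        }
    }
}
```

Because the regex lives in a single string with no concatenation, the same text (without the Java escaping) could be placed in an XML or JSON config file.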

This approach is, however, rather inefficient, since the pattern will check for the quote character at every single character, on top of checking that the character is one of the allowed characters.

The alternative with 2 branches in the question, '[a-z"]*'|"[a-z']*", is better, since the engine only checks for the quote character once, then goes through the rest by checking that the current character is in the character class.

