Friday, 11 October 2019

Non-capturing group in Java Scanner is ignored

I am trying to get a Scanner to split a string on every @ symbol, except when it is escaped with a backslash (or at the start of a line).

My RegEx: (?:[^\\])@

(?:            // Start of non-capturing group
  [            // Start of character class
    ^\\        // Negated: any character other than '\'
  ]            // End of character class
)              // End of non-capturing group
@              // Match literal '@'

From my understanding, this should do what I intend.

However, when I use this pattern in a Scanner, the character matched by the non-capturing group is treated as part of the delimiter. I expected the group only to be matched against, so that the delimiter (the part that is removed at each split) is just '@'. For example, the string "Hello@World" should produce ["Hello", "World"].

Instead, running the code sample below:

import java.util.Scanner;

private static void test() {
    try (Scanner sc = new Scanner("test@here")) {
        sc.useDelimiter("(?:[^\\\\])@"); // Intended: every unescaped @ sign.
        while (sc.hasNext()) {
            String token = sc.next();
            System.out.println(token);
        }
    }
}

yields:

tes
here

instead of the expected:

test
here
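
A likely explanation, offered as a sketch and not as a definitive answer: a non-capturing group only skips group numbering, it still consumes the character it matches, so the delimiter found in "test@here" is the two-character sequence "t@". A negative lookbehind, (?<!\\)@, checks that the preceding character is not a backslash without consuming it. The class name LookbehindDelimiter and the helper split below are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LookbehindDelimiter {
    // Hypothetical helper: collects the tokens between every '@'
    // that is not immediately preceded by a backslash.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            // Regex (?<!\\)@ : the lookbehind is zero-width,
            // so only the '@' itself is consumed as the delimiter.
            sc.useDelimiter("(?<!\\\\)@");
            while (sc.hasNext()) {
                tokens.add(sc.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("test@here")); // [test, here]
        System.out.println(split("a\\@b@c"));   // [a\@b, c]
    }
}
```

The escaped '@' in the second input stays inside its token, backslash included. Removing the backslash afterwards would be a separate step.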
