Different regular expression result in Java SE and Android platform

Posted by: admin, May 14, 2020

Questions:

I have the following Java SE code, which runs on a PC:

public static void main(String[] args) {
    // stringCommaPattern will change
    // ","abc,def","
    // to
    // ","abcdef","        
    Pattern stringCommaPattern = Pattern.compile("(\",\")|,(?=[^\"[,]]*\",\")");
    String data = "\"SAN\",\"Banco Santander, \",\"NYSE\"";
    System.out.println(data);
    final String result = stringCommaPattern.matcher(data).replaceAll("$1");
    System.out.println(result);
}

I get the expected result:

"SAN","Banco Santander, ","NYSE"
"SAN","Banco Santander ","NYSE"

However, on Android the same code behaves differently.

Pattern stringCommaPattern = Pattern.compile("(\",\")|,(?=[^\"[,]]*\",\")");
String data = "\"SAN\",\"Banco Santander, \",\"NYSE\"";
Log.i("CHEOK", data);
final String result = stringCommaPattern.matcher(data).replaceAll("$1");
Log.i("CHEOK", result);

Here I get:

"SAN","Banco Santander, ","NYSE"
"SAN","Banco Santandernull ","NYSE"

Any suggestions for a workaround, so that this code behaves on Android the same way it does on Java SE?


Additional note:

Other patterns show the same difference. It seems that Android substitutes the string "null" for an unmatched group, whereas Java SE substitutes an empty string.
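To check that guess on a given runtime, a small probe along these lines may help. It is only a sketch: the class name UnmatchedGroupProbe and the method describe are made up for illustration and do not come from the original code. For each match it records the matched text and what group(1) reports.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question.
final class UnmatchedGroupProbe {
    // Builds one line per match: the matched text and the value of group(1).
    static String describe(Pattern pattern, String input) {
        StringBuilder report = new StringBuilder();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            // group(1) is null when the first alternative did not take part
            // in this match; StringBuilder renders that as the text "null".
            report.append('[').append(m.group()).append("] group(1)=")
                  .append(m.group(1)).append('\n');
        }
        return report.toString();
    }

    public static void main(String[] args) {
        Pattern digitPattern = Pattern.compile("(\",)|,(?=[\\d,]+,\")");
        System.out.print(describe(digitPattern, "\",100,000,000,\""));
    }
}
```

Matcher.group(int) is documented to return null for a group that did not take part in the match, so a null here is expected on its own. The difference between the platforms would then lie in how replaceAll expands $1 for such a group.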

Take the following code.

public static void main(String[] args) {
    // Used to remove the comma within an integer digit. The digit must be located
    // in between two string. Replaced with $1.
    //
    // digitPattern will change
    // ",100,000,"
    // to
    // ",100000,"        
    final Pattern digitPattern = Pattern.compile("(\",)|,(?=[\\d,]+,\")");
    String data = "\",100,000,000,\"";
    System.out.println(data);
    final String result = digitPattern.matcher(data).replaceAll("$1");
    System.out.println(result);
}

Java SE

",100,000,000,"
",100000000,"

Android

",100,000,000,"
",100null000null000,"

Answers:

This does not explain why the platforms differ, but as a workaround you can run the appendReplacement loop yourself instead of calling replaceAll:

StringBuffer result = new StringBuffer();
Matcher m = digitPattern.matcher(data);
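while (m.find()) {
  // group(1) is null when the match came from the second alternative;
  // substitute an empty string in that case instead of "$1".
  m.appendReplacement(result, (m.group(1) == null ? "" : "$1"));
}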
m.appendTail(result);

This should work on both Java SE and Android.
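If the same fix is needed for several patterns, the loop can be wrapped in a helper. The following is a minimal sketch, not a tested drop-in: NullSafeReplace and keepGroupOne are hypothetical names, and it assumes the replacement is always "group 1 or nothing", as in the question. It also uses Matcher.quoteReplacement, which the snippet above does not, so that a captured "$" or "\" is inserted literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical utility; the class and method names are illustrative only.
final class NullSafeReplace {
    // Replaces every match with the text of group 1, or with nothing when
    // group 1 did not take part in the match.
    static String keepGroupOne(Pattern pattern, String input) {
        // StringBuffer, because older runtimes only offer that overload.
        StringBuffer out = new StringBuffer();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            String g = m.group(1);
            // quoteReplacement keeps '$' and '\' in the captured text from
            // being read as replacement syntax.
            m.appendReplacement(out, g == null ? "" : Matcher.quoteReplacement(g));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern digitPattern = Pattern.compile("(\",)|,(?=[\\d,]+,\")");
        System.out.println(keepGroupOne(digitPattern, "\",100,000,000,\""));
    }
}
```

Because the null check happens before the replacement string is built, the result does not depend on how the runtime expands $1 for an unmatched group.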

Or sidestep the problem entirely by changing the regex:

Pattern commaNotBetweenQuotes = Pattern.compile("(?<!\"),(?!\")");
String result = commaNotBetweenQuotes.matcher(data).replaceAll("");

Here the regex matches only the commas you want to remove and not the ones you want to keep, so you can replace every match with "" and need no capturing groups.
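As a quick check, here is that lookaround pattern applied to both sample inputs from the question. This is a sketch only; the class name LookaroundDemo and the method stripInnerCommas are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class LookaroundDemo {
    // A comma that is neither preceded nor followed by a double quote.
    static final Pattern COMMA_NOT_BESIDE_QUOTE = Pattern.compile("(?<!\"),(?!\")");

    // Removes every comma that does not touch a double quote.
    static String stripInnerCommas(String data) {
        return COMMA_NOT_BESIDE_QUOTE.matcher(data).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripInnerCommas("\"SAN\",\"Banco Santander, \",\"NYSE\""));
        System.out.println(stripInnerCommas("\",100,000,000,\""));
    }
}
```

One limitation to keep in mind: a comma that sits directly next to a quote inside a field, as in "abc,", is left in place, because the lookarounds cannot tell it apart from a separator.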