
How do I split a string with any whitespace chars as delimiters?

Posted by: admin November 2, 2017

Questions:

What regex pattern would I need to pass to the java.lang.String.split() method to split a String into an array of substrings using all whitespace characters (‘ ‘, ‘\t’, ‘\n’, etc.) as delimiters?

Answers:

Something along the lines of

myString.split("\\s+");

This treats any run of consecutive whitespace characters as a single delimiter.

So if I have the string:

"Hello[space][tab]World"

This should yield the strings "Hello" and "World", with no empty string produced between the [space] and the [tab].

As VonC pointed out, the backslash must be escaped, because the Java compiler processes escape sequences in a string literal before the regex engine ever sees the pattern. What you want the regex engine to receive is the literal \s, which means you need to write "\\s" in your source code. It can get a bit confusing.

The \\s is equivalent to [ \\t\\n\\x0B\\f\\r]
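
As a minimal sketch of the behaviour described above (the class and method names are illustrative and not part of the original answer), the following splits the example string on runs of whitespace. It also shows an edge case worth knowing: leading whitespace produces an empty first element, so trimming the input first may be desirable.

```java
import java.util.Arrays;

// Illustrative helper; the class and method names are assumptions for this sketch.
class WhitespaceSplitDemo {

    // Splits on one or more whitespace characters.
    // The Java literal "\\s+" reaches the regex engine as \s+.
    static String[] splitOnWhitespace(String input) {
        return input.split("\\s+");
    }

    public static void main(String[] args) {
        // A space followed by a tab is treated as a single delimiter.
        System.out.println(Arrays.toString(splitOnWhitespace("Hello \tWorld")));
        // Leading whitespace yields an empty first element.
        System.out.println(Arrays.toString(splitOnWhitespace("  Hello World")));
        // Trimming the input first avoids that empty element.
        System.out.println(Arrays.toString(splitOnWhitespace("  Hello World".trim())));
    }
}
```

Trailing whitespace does not cause the same problem, because split() drops trailing empty strings by default.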

Answers:

Most regex dialects have a set of convenient shorthand character classes you can use for this kind of thing. These are good ones to remember:

\w – Matches any word character.

\W – Matches any nonword character.

\s – Matches any white-space character.

\S – Matches anything but white-space characters.

\d – Matches any digit.

\D – Matches anything except digits.

A search for “Regex Cheatsheets” should reward you with a whole lot of useful summaries.
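
The shorthand classes listed above can be tried out in Java with a short sketch like the one below. The class and method names are hypothetical and chosen only for illustration; note that each backslash is doubled inside the Java string literals.

```java
// Illustrative sketch; class and method names are assumptions, not an established API.
class CharClassDemo {

    // \D matches any non-digit, so removing all matches keeps only the digits.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // \s+ matches a run of whitespace; replace each run with a single space.
    static String collapseWhitespace(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // \w matches letters, digits and the underscore.
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("Tel: 555-0199"));
        System.out.println(collapseWhitespace("  a \t b\n"));
        System.out.println(isWord("snake_case1"));
        System.out.println(isWord("two words"));
    }
}
```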

Answers:

To get this working in JavaScript, I had to do the following:

myString.split(/\s+/g)

Answers:

"\\s+" should do the trick.

Answers:

Also, you may have a Unicode non-breaking space (\xA0)…

String[] elements = s.split("[\\s\\xA0]+"); // include the Unicode non-breaking space
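
To see why the extra \xA0 matters, here is a small sketch (class and method names are illustrative assumptions). By default Java's \s covers only the ASCII whitespace characters, so a non-breaking space (U+00A0) is not treated as a delimiter unless it is added to the character class.

```java
import java.util.Arrays;

// Illustrative sketch; class and method names are assumptions.
class NonBreakingSpaceDemo {

    // Default \s: does not match the non-breaking space (U+00A0).
    static String[] splitAsciiWhitespace(String s) {
        return s.split("\\s+");
    }

    // Character class that also includes the non-breaking space.
    static String[] splitIncludingNbsp(String s) {
        return s.split("[\\s\\xA0]+");
    }

    public static void main(String[] args) {
        // "one" and "two" are joined by a non-breaking space, "two" and "three" by a normal space.
        String text = "one\u00A0two three";
        System.out.println(splitAsciiWhitespace(text).length);
        System.out.println(Arrays.toString(splitIncludingNbsp(text)));
    }
}
```

With the sample string, the first call returns two elements (the non-breaking space stays inside the first one), while the second returns three.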

Answers:

Apache Commons Lang has a method to split a string with whitespace characters as delimiters:

StringUtils.split("abc def")

http://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#split(java.lang.String)

This might be easier to use than a regex pattern.

Answers:
String string = "Ram is going to school";
String[] arrayOfString = string.split("\\s+");

Answers:

Since the argument is a regular expression, and assuming you also want to discard non-alphanumeric characters such as commas and dots that may be surrounded by blanks (e.g. “one , two” should give [one][two]), it should be:

myString.split("[\\s\\W]+")
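
In Java the same pattern is passed as a string with doubled backslashes. The sketch below (class and method names are illustrative assumptions) applies it to the example from the answer. Since \W already matches whitespace, the shorter pattern \W+ gives the same result here.

```java
import java.util.Arrays;

// Illustrative sketch; class and method names are assumptions.
class PunctuationSplitDemo {

    // Splits on runs of whitespace and/or non-word characters.
    static String[] splitWords(String s) {
        return s.split("[\\s\\W]+");
    }

    // Equivalent shorter form: \W already includes whitespace.
    static String[] splitWordsShort(String s) {
        return s.split("\\W+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWords("one , two")));
        System.out.println(Arrays.toString(splitWordsShort("one , two")));
    }
}
```

Keep in mind that \w counts the underscore as a word character, so "snake_case" would stay in one piece.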

Answers:

You can split a string on line breaks with the following statement:

String textStr[] = yourString.split("\r?\n");

You can split a string on whitespace with the following statement:

String textStr[] = yourString.split("\\s+");
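
The two statements can be combined, for instance to break a block of text into lines and then each line into tokens. This is only a sketch under that assumption; the class and method names are made up for illustration.

```java
import java.util.Arrays;

// Illustrative sketch; class and method names are assumptions.
class LineSplitDemo {

    // Splits on Unix (LF) and Windows (CR LF) line endings.
    static String[] splitLines(String text) {
        return text.split("\\r?\\n");
    }

    // Splits a single line into whitespace-separated tokens.
    static String[] splitTokens(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String text = "alpha beta\r\ngamma\tdelta\nepsilon";
        for (String line : splitLines(text)) {
            System.out.println(Arrays.toString(splitTokens(line)));
        }
    }
}
```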

Answers:
String str = "Hello   World";
String res[] = str.split("\\s+");

Answers:

Study this code. Good luck.

import java.util.Scanner;

class Demo {
    public static void main(String[] args) {
        Scanner input = new Scanner(System.in);
        System.out.print("Input String : ");
        String s1 = input.nextLine();
        // Split on runs of whitespace, including the non-breaking space (U+00A0).
        String[] tokens = s1.split("[\\s\\xA0]+");
        // Print the number of tokens, then each token on its own line.
        System.out.println(tokens.length);
        for (String s : tokens) {
            System.out.println(s);
        }
    }
}