How to check whether given text is english or chinese in android?

Posted by: admin June 15, 2020

Questions:

I am designing an Android application that supports both English and Chinese. I want to know whether the user typed English or Chinese text. Is there any way to check this in Android?

Answer:

If you want to detect whether the input string contains any CJK (Chinese, Japanese, Korean) ideographs, the following may help you:

public static boolean isCJK(String str) {
    int length = str.length();
    for (int i = 0; i < length; i++) {
        char ch = str.charAt(i);
        Character.UnicodeBlock block = Character.UnicodeBlock.of(ch);
        if (Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS.equals(block) ||
                Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS.equals(block) ||
                Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A.equals(block)) {
            return true;
        }
    }
    return false;
}

Answer:

The accepted answer is either incomplete or outdated. Here are a few methods you can use to test if a character is a CJK Ideograph. My fuller answer is here.

It is better to use the code point rather than charAt (as in the accepted answer) because many Chinese characters are in a supplementary plane, outside the range a single char can hold. Using charAt will just give you one half of a surrogate pair rather than the actual Chinese character. So a better way to loop through a String is like this:

final int length = myString.length();
for (int offset = 0; offset < length; ) {
    final int codepoint = Character.codePointAt(myString, offset);

    // use codepoint here

    offset += Character.charCount(codepoint);
}
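To make the surrogate-pair problem concrete, here is a small sketch. The class and method names (SurrogateDemo, blockViaCharAt, blockViaCodePoint, countCodePoints) are made up for illustration. It takes U+20000, an ideograph from CJK Extension B, and compares what charAt sees with what the code point loop sees.

```java
final class SurrogateDemo {
    // U+20000 (CJK Extension B), written as its UTF-16 surrogate pair.
    static final String EXT_B_CHAR = "\uD840\uDC00";

    /** Block reported when looking only at the first UTF-16 unit (what charAt sees). */
    static Character.UnicodeBlock blockViaCharAt(String s) {
        return Character.UnicodeBlock.of(s.charAt(0));
    }

    /** Block reported when decoding the full code point starting at index 0. */
    static Character.UnicodeBlock blockViaCodePoint(String s) {
        return Character.UnicodeBlock.of(s.codePointAt(0));
    }

    /** Counts code points with the same offset-advancing loop as above. */
    static int countCodePoints(String s) {
        int count = 0;
        final int length = s.length();
        for (int offset = 0; offset < length; ) {
            final int codepoint = Character.codePointAt(s, offset);
            count++;
            offset += Character.charCount(codepoint);
        }
        return count;
    }
}
```

For EXT_B_CHAR, length() is 2 but countCodePoints returns 1. blockViaCharAt reports HIGH_SURROGATES, while blockViaCodePoint reports CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B, which is why a charAt-based check misses such characters.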

And testing the codepoints can be done in one of the following ways.

private boolean isCJK(int codepoint) {
    Character.UnicodeBlock block = Character.UnicodeBlock.of(codepoint);
    return (Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS.equals(block)||
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A.equals(block) ||
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B.equals(block) ||
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C.equals(block) || // api 19, remove these if supporting lower versions
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D.equals(block) || // api 19
            Character.UnicodeBlock.CJK_COMPATIBILITY.equals(block) ||
            Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS.equals(block) ||
            Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS.equals(block) ||
            Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT.equals(block) ||
            Character.UnicodeBlock.CJK_RADICALS_SUPPLEMENT.equals(block) ||
            Character.UnicodeBlock.CJK_STROKES.equals(block) ||                        // api 19
            Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION.equals(block) ||
            Character.UnicodeBlock.ENCLOSED_CJK_LETTERS_AND_MONTHS.equals(block) ||
            Character.UnicodeBlock.ENCLOSED_IDEOGRAPHIC_SUPPLEMENT.equals(block) ||    // api 19
            Character.UnicodeBlock.KANGXI_RADICALS.equals(block) ||
            Character.UnicodeBlock.IDEOGRAPHIC_DESCRIPTION_CHARACTERS.equals(block));
}

Or, for API 19 and above:

private boolean isCJK(int codepoint) {
    return Character.isIdeographic(codepoint);
}

Or, for API 24 and above:

private boolean isCJK(int codepoint) {
    return (Character.UnicodeScript.of(codepoint) == Character.UnicodeScript.HAN);
}
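Putting the loop and the test together, a rough sketch of an "English or Chinese" check for a whole string could look like the following. The names TextScriptClassifier, Kind, and classify are hypothetical, and it relies on Character.UnicodeScript, so on Android it assumes API 24 or higher. Note that it only looks at scripts: Han also covers Japanese kanji, and Latin covers many languages besides English, so treat it as a heuristic rather than real language detection.

```java
final class TextScriptClassifier {
    enum Kind { CHINESE, ENGLISH, MIXED, UNKNOWN }

    /** Classifies text by whether it contains Han and/or Latin code points. */
    static Kind classify(String text) {
        int han = 0;
        int latin = 0;
        final int length = text.length();
        for (int offset = 0; offset < length; ) {
            final int codepoint = text.codePointAt(offset);
            final Character.UnicodeScript script = Character.UnicodeScript.of(codepoint);
            if (script == Character.UnicodeScript.HAN) {
                han++;
            } else if (script == Character.UnicodeScript.LATIN) {
                latin++;
            }
            // Digits, spaces, and punctuation belong to other scripts and are ignored.
            offset += Character.charCount(codepoint);
        }
        if (han > 0 && latin > 0) return Kind.MIXED;
        if (han > 0) return Kind.CHINESE;
        if (latin > 0) return Kind.ENGLISH;
        return Kind.UNKNOWN;
    }
}
```

With this sketch, classify("hello") returns ENGLISH, a string containing only Chinese characters returns CHINESE, and a string with no Han or Latin letters at all (for example "123") returns UNKNOWN.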

Answer:

If you want to get the default language of the device, Locale.getDefault().getDisplayLanguage() should give you the user’s language.

Otherwise, this may help you.

EDIT:

I am not sure, but Google Translate does this: when the user types something, it automatically detects the language. So the Google Translate API should be able to do this for you.

EDIT 2:

Yes, it does with a simple HttpGet, here is the link.

Answer:

Locale.getDefault().getDisplayLanguage() will give you the default language of your device:

String language = Locale.getDefault().getDisplayLanguage();
System.out.println("My locale::" + language);

On a device set to English, it will print something like My locale::English

Now you have to check:

if (language.equalsIgnoreCase("English")) {
    // do your stuff for English
} else {
    // do your stuff for Chinese
}
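One caveat with comparing display names is that getDisplayLanguage() returns a localized, human-readable name, so the string depends on the device's language. A possibly more robust variant, sketched below, compares the language code from Locale.getLanguage() instead. The class and method names (DeviceLanguage, isChinese, isEnglish) are made up for illustration.

```java
import java.util.Locale;

final class DeviceLanguage {
    /** True if the locale's language code matches that of Locale.CHINESE ("zh"). */
    static boolean isChinese(Locale locale) {
        return Locale.CHINESE.getLanguage().equals(locale.getLanguage());
    }

    /** True if the locale's language code matches that of Locale.ENGLISH ("en"). */
    static boolean isEnglish(Locale locale) {
        return Locale.ENGLISH.getLanguage().equals(locale.getLanguage());
    }
}
```

You would call it as DeviceLanguage.isChinese(Locale.getDefault()). Keep in mind that this reports the device setting, not the language of the text the user actually typed.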

You can also use a Language Detection Library for Android.