Converting non-numeric String to Integer?

Posted by: admin December 28, 2021

Questions:

How can I convert a non-numeric String to an Integer?

For instance, I have:

String unique = "FUBAR";

What’s a good way to represent the String as an Integer with no collisions? For example, “FUBAR” should always map to the same number, and that number must not collide with the number for any other String. For instance, String a = "A"; could be represented as the Integer 1, and so on. Is there a method that does this, preferably for all Unicode strings, although ASCII values would be sufficient in my case?

Answers:

This is impossible. Think about it: an Integer has only 32 bits, so it can take at most 2^32 distinct values, while there are infinitely many possible strings. By the pigeonhole principle, at least two strings must map to the same Integer value no matter what conversion technique you use. In fact, infinitely many strings must end up sharing a value…

If you’re just looking for an efficient mapping, then I suggest that you just use the int returned by hashCode(). Note that it is a full 32-bit signed value, so it can be negative, and different strings can share the same hash code.
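To make the pigeonhole argument concrete, here is a small sketch showing two short strings with the same hashCode() and one string whose hash is negative. The class and method names (HashCodeDemo, collides) are made up for illustration; only String.hashCode() comes from the JDK.

```java
final class HashCodeDemo {
    /** True when two different strings share the same String.hashCode(). */
    static boolean collides(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 and 'B' * 31 + 'B' = 66 * 31 + 66
        System.out.println("Aa".hashCode());            // 2112
        System.out.println("BB".hashCode());            // 2112
        System.out.println(collides("Aa", "BB"));       // true
        // The hash wraps around in 32-bit arithmetic, so it can be negative.
        System.out.println("zzzzzz".hashCode() < 0);    // true
    }
}
```

So hashCode() is fine as a fast, repeatable mapping, but it cannot serve as a collision-free identifier.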

###

You can map Strings to unique IDs using a table. There is no way to do this generically.

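// Requires: import java.util.HashMap; import java.util.Map;
private final Map<String, Integer> map = new HashMap<>();

// Returns the ID already assigned to s, or assigns the next free ID on first use.
public int idFor(String s) {
    Integer id = map.get(s);
    if (id == null) {
        id = map.size();
        map.put(s, id);
    }
    return id;
}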

Note: having unique IDs doesn’t guarantee there are no collisions in a hash collection, because distinct hash codes can still land in the same bucket.

http://vanillajava.blogspot.co.uk/2013/10/unique-hashcodes-is-not-enough-to-avoid.html
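The table-based mapping above could be packaged as a small registry that also supports the reverse lookup. This is only a sketch: the class name StringIdRegistry and the method stringFor are invented here, and the synchronized keyword is an assumption that the registry may be shared between threads.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StringIdRegistry {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> strings = new ArrayList<>();

    /** Returns the ID for s, assigning the next sequential ID on first use. */
    synchronized int idFor(String s) {
        Integer id = ids.get(s);
        if (id == null) {
            id = strings.size();   // IDs are handed out as 0, 1, 2, ...
            ids.put(s, id);
            strings.add(s);        // index in the list equals the ID
        }
        return id;
    }

    /** Returns the string that was assigned the given ID. */
    synchronized String stringFor(int id) {
        return strings.get(id);
    }
}
```

The IDs are unique only within one registry instance and depend on insertion order, so they are not stable across program runs unless the table is persisted.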

###

If you know the character set used in your strings, then you can think of the string as a number in a base other than 10. For example, hexadecimal numbers use the letters A to F as digits.

Therefore, if you know that your strings only contain characters from an 8-bit character set, you can treat the string as a base-256 number. In pseudocode this would be:

number n = 0;
for each letter in string
    n = 256 * n + (letter's position in character set)

If your character set contains 65536 characters (for example, 16-bit Java char values), then just multiply ‘n’ by 65536 on each step instead. But beware, the 32 bits of an int overflow easily: they hold at most four 8-bit characters. You probably need to use a type that can hold a larger number, such as BigInteger.
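The base-256 idea above can be sketched in Java with BigInteger so that the result never overflows. The class name Base256Encoder is hypothetical, and the sketch assumes every character fits in 8 bits (code points 0 to 255), rejecting anything else.

```java
import java.math.BigInteger;

final class Base256Encoder {
    private static final BigInteger BASE = BigInteger.valueOf(256);

    /** Interprets s as a base-256 number, most significant character first. */
    static BigInteger encode(String s) {
        BigInteger n = BigInteger.ZERO;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0xFF) {
                throw new IllegalArgumentException("not an 8-bit character: " + c);
            }
            // n = 256 * n + (character value)
            n = n.multiply(BASE).add(BigInteger.valueOf(c));
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(encode("A"));   // 65
        System.out.println(encode("AB"));  // 65 * 256 + 66 = 16706
        // Five characters already need more than the 31 value bits of an int.
        System.out.println(encode("FUBAR").bitLength() > 31);  // true
    }
}
```

One caveat of plain positional notation: leading characters with value 0 do not change the number, so strings that differ only in leading NUL characters map to the same value.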

###

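// Requires: import java.math.BigDecimal; import java.nio.charset.StandardCharsets;
private BigDecimal createBigDecimalFromString(String data)
{
    BigDecimal value = BigDecimal.ZERO;
    BigDecimal base = new BigDecimal(256);

    // StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException.
    byte[] tmp = data.getBytes(StandardCharsets.UTF_8);
    for (int i = tmp.length - 1; i >= 0; i--)
    {
        // Java bytes are signed; mask with 0xFF so non-ASCII bytes are not negative.
        BigDecimal digit = new BigDecimal(tmp[i] & 0xFF);
        // Byte i is the base-256 digit with weight 256^i.
        value = value.add(base.pow(i).multiply(digit));
    }
    return value;
}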

###

Regardless of the accepted answer, it is possible to represent any String as an integer by computing that String’s Gödel number, which is a product of prime powers that is unique for every possible String. That said, it is quite impractical and slow to compute. For most Strings you would need a BigInteger rather than a normal Integer, and to decode a Gödel number back into its String you need a defined Charset.
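As a rough illustration of the Gödel numbering idea, the sketch below multiplies the i-th prime raised to the code of the i-th character. All names (GoedelNumber, encode, decode, nextPrime) are made up for this example, and it makes two assumptions: characters are Java char values (UTF-16 code units), and each exponent is the char value plus one so that a NUL character still contributes a factor.

```java
import java.math.BigInteger;

final class GoedelNumber {
    /** Product over positions i of (i-th prime) ^ (char value + 1). */
    static BigInteger encode(String s) {
        BigInteger result = BigInteger.ONE;
        int prime = 1;
        for (int i = 0; i < s.length(); i++) {
            prime = nextPrime(prime);
            result = result.multiply(BigInteger.valueOf(prime).pow(s.charAt(i) + 1));
        }
        return result;
    }

    /** Recovers the string by dividing out each prime in turn. */
    static String decode(BigInteger g) {
        if (g.signum() <= 0) throw new IllegalArgumentException("must be positive");
        StringBuilder sb = new StringBuilder();
        int prime = 1;
        while (!g.equals(BigInteger.ONE)) {
            prime = nextPrime(prime);
            BigInteger p = BigInteger.valueOf(prime);
            int exponent = 0;
            while (g.mod(p).signum() == 0) {
                g = g.divide(p);
                exponent++;
            }
            if (exponent == 0) throw new IllegalArgumentException("not produced by encode");
            sb.append((char) (exponent - 1));
        }
        return sb.toString();
    }

    /** Smallest prime greater than n, found by trial division. */
    static int nextPrime(int n) {
        int candidate = n + 1;
        while (!isPrime(candidate)) candidate++;
        return candidate;
    }

    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }
}
```

Even a single "A" becomes 2^66 under this scheme, which already exceeds a 64-bit long, so the numbers grow far faster than with the base-256 approach.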

###

Maybe a little bit late, but I’m going to give my 10 cents to simplify it (internally it is similar to the BigDecimal approach suggested by @Romain Hippeau):

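// Requires: import java.math.BigInteger; import java.nio.charset.StandardCharsets;
public static BigInteger getNumberId(final String value) {
    // Note: new BigInteger(byte[]) throws NumberFormatException for an empty string.
    return new BigInteger(value.getBytes(StandardCharsets.UTF_8));
}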
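Because the BigInteger is built directly from the UTF-8 bytes, the mapping can also be reversed with toByteArray(). The following is a minimal sketch with invented names (StringNumberCodec, encode, decode); it assumes non-empty input and rejects the empty string explicitly.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

final class StringNumberCodec {
    /** Interprets the UTF-8 bytes of value as a big-endian two's-complement number. */
    static BigInteger encode(String value) {
        if (value.isEmpty()) {
            throw new IllegalArgumentException("empty string has no bytes to encode");
        }
        return new BigInteger(value.getBytes(StandardCharsets.UTF_8));
    }

    /** Turns the number back into bytes and decodes them as UTF-8. */
    static String decode(BigInteger id) {
        return new String(id.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        BigInteger id = encode("FUBAR");
        System.out.println(id);
        System.out.println(decode(id));  // FUBAR
    }
}
```

Two caveats: the number is negative when the first UTF-8 byte has its high bit set (any string starting with a non-ASCII character), and strings that start with a NUL character do not round-trip because leading zero bytes are dropped.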