Converting Java String from/to UTF-8
Java represents a String internally as UTF-16. If you need to pass UTF-8 encoded data in a Java String, for example as a CORBA string parameter, you can convert it in the following way:
public class StringHelper {

    // convert from UTF-8 -> internal Java String format
    public static String convertFromUTF8(String s) {
        String out = null;
        try {
            // each char of s holds one UTF-8 byte; recover the bytes and decode them
            out = new String(s.getBytes("ISO-8859-1"), "UTF-8");
        } catch (java.io.UnsupportedEncodingException e) {
            return null;
        }
        return out;
    }

    // convert from internal Java String format -> UTF-8
    public static String convertToUTF8(String s) {
        String out = null;
        try {
            // encode to UTF-8 bytes, then store each byte as one char
            out = new String(s.getBytes("UTF-8"), "ISO-8859-1");
        } catch (java.io.UnsupportedEncodingException e) {
            return null;
        }
        return out;
    }

    public static void main(String[] args) {
        String xmlstring = "Здравей' хора";
        String utf8string = StringHelper.convertToUTF8(xmlstring);
        // print the UTF-8 byte values in hex
        for (int i = 0; i < utf8string.length(); ++i) {
            System.out.printf("%x ", (int) utf8string.charAt(i));
        }
    }
}
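ISO-8859-1 is used here only as a way to carry an array of 8-bit values in a String: it maps each byte value 0x00-0xFF to the char with the same code, so no information is lost in either direction. The String returned by convertToUTF8 is therefore not readable text; each of its chars holds one UTF-8 byte.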
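As a quick check of the one-byte-per-char idea, the following is a minimal sketch of the same round trip written with java.nio.charset.StandardCharsets (available since Java 7), which avoids the checked UnsupportedEncodingException. The class name Utf8Packing and the method names toUtf8Packed and fromUtf8Packed are hypothetical, chosen for this illustration only; they are not part of the StringHelper class above.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Packing {

    // Encode s as UTF-8 and store each resulting byte as one char (U+0000..U+00FF).
    public static String toUtf8Packed(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Inverse: treat each char as one byte and decode the byte sequence as UTF-8.
    // Assumes every char of packed is <= 0xFF, i.e. it came from toUtf8Packed.
    public static String fromUtf8Packed(String packed) {
        return new String(packed.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Здравей" written with Unicode escapes so the source file encoding does not matter
        String original = "\u0417\u0434\u0440\u0430\u0432\u0435\u0439";
        String packed = toUtf8Packed(original);

        // 7 Cyrillic letters, 2 UTF-8 bytes each, so the packed String has 14 chars
        System.out.println(packed.length());

        // print the UTF-8 byte values in hex
        for (int i = 0; i < packed.length(); ++i) {
            System.out.printf("%x ", (int) packed.charAt(i));
        }
        System.out.println();

        // the round trip restores the original text
        System.out.println(original.equals(fromUtf8Packed(packed)));
    }
}
```

If the receiving API accepts a byte array rather than a String, calling s.getBytes(StandardCharsets.UTF_8) directly is usually simpler than packing the bytes into a String.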