Converting Java String from/to UTF-8

Java represents a String internally as UTF-16. If you need to pass UTF-8 encoded data inside a Java String, for example as a CORBA string parameter, you can convert it in the following way:

public class StringHelper {

	// convert from UTF-8 -> internal Java String format
	public static String convertFromUTF8(String s) {
		String out = null;
		try {
			out = new String(s.getBytes("ISO-8859-1"), "UTF-8");
		} catch (java.io.UnsupportedEncodingException e) {
			return null;
		}
		return out;
	}

	// convert from internal Java String format -> UTF-8
	public static String convertToUTF8(String s) {
		String out = null;
		try {
			out = new String(s.getBytes("UTF-8"), "ISO-8859-1");
		} catch (java.io.UnsupportedEncodingException e) {
			return null;
		}
		return out;
	}

	public static void main(String[] args) {
		String xmlstring = "Здравей' хора";
		String utf8string = StringHelper.convertToUTF8(xmlstring);
		// print each char of the converted string as hex; every char holds one UTF-8 byte
		for (int i = 0; i < utf8string.length(); ++i) {
			System.out.printf("%x ", (int) utf8string.charAt(i));
		}
	}
}

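ISO-8859-1 is used here only as a carrier for the 8-bit byte array: it maps every byte value 0x00-0xFF to the char with the same code, so the resulting String holds exactly one char per UTF-8 byte and no data is lost in either direction.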
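
As a rough check of the carrier idea, here is a minimal sketch of the same round trip. The class name Utf8RoundTrip and the method names toUtf8Carrier and fromUtf8Carrier are made up for illustration. It assumes Java 7 or later and uses the java.nio.charset.StandardCharsets constants instead of charset names, so no UnsupportedEncodingException has to be caught.

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {

    // Pack the UTF-8 bytes of s into a String, one char (0x00-0xFF) per byte.
    public static String toUtf8Carrier(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverse step: turn each char back into one byte, then decode the bytes as UTF-8.
    public static String fromUtf8Carrier(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Здравей" written with Unicode escapes so the source file encoding does not matter
        String original = "\u0417\u0434\u0440\u0430\u0432\u0435\u0439";
        String carrier = toUtf8Carrier(original);

        // 7 Cyrillic letters, 2 UTF-8 bytes each
        System.out.println(carrier.length());                              // prints 14
        // First UTF-8 byte of U+0417 is 0xD0
        System.out.println(Integer.toHexString(carrier.charAt(0)));        // prints d0
        System.out.println(fromUtf8Carrier(carrier).equals(original));     // prints true
    }
}
```

If the receiving side accepts a byte array rather than a String, calling s.getBytes(StandardCharsets.UTF_8) directly is usually enough and the carrier String is not needed.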
