Class UTF8Convert

java.lang.Object
com.ibm.wala.core.util.strings.UTF8Convert

public abstract class UTF8Convert extends Object
Abstract class containing routines for converting to and from utf8 and pseudo-utf8. Encodings longer than 3 bytes are not supported.

The difference between utf8 and pseudo-utf8 is the treatment of the null character (U+0000). In utf8, it is encoded directly as the single byte 0x00, whereas in pseudo-utf8 it is encoded as the two-byte sequence 0xC0 0x80. See the JVM spec for more information.
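
The difference can be observed with the JDK alone. The sketch below does not use UTF8Convert; it assumes that java.io.DataOutputStream.writeUTF (which writes the JVM's modified UTF-8 after a 2-byte length prefix) is a fair stand-in for pseudo-utf8. The class and method names are hypothetical and chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of WALA.
class NullEncodingDemo {
    /** Standard UTF-8 bytes, produced by the JDK charset encoder. */
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Modified UTF-8 bytes from writeUTF, with its 2-byte length prefix removed. */
    static byte[] pseudoUtf8(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        byte[] withPrefix = buffer.toByteArray();
        return Arrays.copyOfRange(withPrefix, 2, withPrefix.length);
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "A\u0000Z"; // 'A', the null character, 'Z'
        System.out.println(hex(standardUtf8(s))); // prints "41 00 5A"
        System.out.println(hex(pseudoUtf8(s)));   // prints "41 C0 80 5A"
    }
}
```

The two encodings agree on every other character in the 1-to-3-byte range, so only strings containing U+0000 produce different byte sequences.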

  • Constructor Details

    • UTF8Convert

      public UTF8Convert()
  • Method Details

    • fromUTF8

      public static String fromUTF8(byte[] utf8) throws UTFDataFormatException
      Convert the given sequence of (pseudo-)utf8 formatted bytes into a String.

      The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.

      Parameters:
      utf8 - (pseudo-)utf8 byte array
      Returns:
      unicode string
      Throws:
      UTFDataFormatException - if the byte array is not valid (pseudo-)utf8
      IllegalArgumentException - if utf8 is null
    • toUTF8

      public static byte[] toUTF8(String s)
      Convert the given String into a sequence of (pseudo-)utf8 formatted bytes.

      The output format is controlled by the WRITE_PSEUDO_UTF8 flag.

      Parameters:
      s - String to convert
      Returns:
      array containing sequence of (pseudo-)utf8 formatted bytes
      Throws:
      IllegalArgumentException - if s is null
    • utfLength

      public static int utfLength(String s)
      Returns the length of the given String's UTF-encoded form.
      Throws:
      IllegalArgumentException - if s is null
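
      As an illustration of how such a length could be computed, the sketch below counts bytes per UTF-16 code unit using the 1-to-3-byte ranges of (pseudo-)utf8. It is not WALA's implementation: the class name, the method name, and the boolean pseudo parameter are hypothetical, and it is an assumption that the length is measured in bytes.

```java
// Hypothetical helper; a sketch, not the UTF8Convert implementation.
class EncodedLengthSketch {
    /**
     * Counts the bytes needed to encode s, one UTF-16 code unit at a time.
     * When pseudo is true, the null character takes two bytes instead of one.
     */
    static int encodedLength(String s, boolean pseudo) {
        if (s == null) {
            throw new IllegalArgumentException("s is null");
        }
        int length = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == 0) {
                length += pseudo ? 2 : 1; // U+0000: 0xC0 0x80 vs. 0x00
            } else if (c <= 0x007F) {
                length += 1;              // 0xxxxxxx
            } else if (c <= 0x07FF) {
                length += 2;              // 110xxxxx 10xxxxxx
            } else {
                length += 3;              // 1110xxxx 10xxxxxx 10xxxxxx
            }
        }
        return length;
    }

    public static void main(String[] args) {
        System.out.println(encodedLength("abc", true));     // 3
        System.out.println(encodedLength("\u0000", true));  // 2
        System.out.println(encodedLength("\u0000", false)); // 1
        System.out.println(encodedLength("\u20AC", true));  // 3 (euro sign)
    }
}
```

      Because each code unit is handled separately, a surrogate pair counts as two 3-byte sequences here, which is consistent with a format that has no 4-byte encodings.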
    • check

      public static boolean check(byte[] bytes)
      Check whether the given sequence of bytes is valid (pseudo-)utf8.
      Parameters:
      bytes - byte array to check
      Returns:
      true iff the given sequence is valid (pseudo-)utf8.
      Throws:
      IllegalArgumentException - if bytes is null
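
      To make the notion of validity concrete, here is a minimal structural validator for byte sequences of at most 3 bytes per character. It is a sketch under stated assumptions, not the logic used by check: the class and method names are hypothetical, it accepts both the one-byte and the two-byte encoding of null, and it does not reject other overlong forms or consult the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, or ALLOW_PSEUDO_UTF8 flags.

```java
// Hypothetical validator; a sketch, not the UTF8Convert implementation.
class StructureCheckSketch {
    /** Returns true if bytes consists only of well-formed 1-, 2-, or 3-byte sequences. */
    static boolean isWellFormed(byte[] bytes) {
        if (bytes == null) {
            throw new IllegalArgumentException("bytes is null");
        }
        int i = 0;
        while (i < bytes.length) {
            int lead = bytes[i] & 0xFF;
            int continuation;
            if ((lead & 0x80) == 0x00) {
                continuation = 0;         // 0xxxxxxx
            } else if ((lead & 0xE0) == 0xC0) {
                continuation = 1;         // 110xxxxx
            } else if ((lead & 0xF0) == 0xE0) {
                continuation = 2;         // 1110xxxx
            } else {
                return false;             // stray continuation byte or 4-byte lead
            }
            if (i + continuation >= bytes.length) {
                return false;             // sequence is truncated
            }
            for (int k = 1; k <= continuation; k++) {
                if ((bytes[i + k] & 0xC0) != 0x80) {
                    return false;         // continuation bytes must be 10xxxxxx
                }
            }
            i += continuation + 1;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] euro = {(byte) 0xE2, (byte) 0x82, (byte) 0xAC};
        byte[] truncated = {(byte) 0xE2, (byte) 0x82};
        System.out.println(isWellFormed(euro));      // true
        System.out.println(isWellFormed(truncated)); // false
    }
}
```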
    • fromUTF8

      public static String fromUTF8(ImmutableByteArray s)