Class Tokenizer

java.lang.Object
org.moeaframework.util.io.Tokenizer

public class Tokenizer extends Object
Tokenizer for encoding and decoding content on a line, escaping any special characters.
  • Constructor Details

    • Tokenizer

      public Tokenizer()
      Constructs a new tokenizer with default settings.
  • Method Details

    • reset

      public void reset()
      Resets this tokenizer back to its default state, removing any custom settings.
    • escapeChar

      public void escapeChar(CharSequence original, CharSequence replacement)
      Registers an escaped character by specifying the original character and its escaped representation. Note that whitespace, control characters, Unicode characters, and '\' are escaped by default.
      Parameters:
      original - the original character
      replacement - the replacement string, which must start with '\'
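      The registration rule can be pictured with a small stand-alone sketch. It is not the library's implementation: the class EscapeTableSketch, its storage, its defaults (only '\' and space), and the exception it throws for an invalid replacement are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone illustration; NOT the org.moeaframework Tokenizer implementation.
final class EscapeTableSketch {
    // Maps each original character to its escaped form (hypothetical storage).
    private final Map<Character, String> table = new LinkedHashMap<>();

    EscapeTableSketch() {
        // Assumed defaults for this sketch only: backslash and space.
        table.put('\\', "\\\\");
        table.put(' ', "\\ ");
    }

    // Mirrors the documented contract: the replacement must start with '\'.
    void register(CharSequence original, CharSequence replacement) {
        if (original.length() != 1) {
            throw new IllegalArgumentException("original must be a single character");
        }
        if (replacement.length() == 0 || replacement.charAt(0) != '\\') {
            throw new IllegalArgumentException("replacement must start with '\\'");
        }
        table.put(original.charAt(0), replacement.toString());
    }

    // Replaces every registered character with its escaped form.
    String escape(String str) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            String replacement = table.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Reverses escape(): only positions holding '\' can start an escape sequence.
    String unescape(String str) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < str.length()) {
            boolean matched = false;
            if (str.charAt(i) == '\\') {
                for (Map.Entry<Character, String> e : table.entrySet()) {
                    if (str.startsWith(e.getValue(), i)) {
                        out.append(e.getKey().charValue());
                        i += e.getValue().length();
                        matched = true;
                        break;
                    }
                }
            }
            if (!matched) {
                out.append(str.charAt(i++)); // not an escape sequence, copy as-is
            }
        }
        return out.toString();
    }
}
```

      Requiring every replacement to begin with '\' means an unescaping pass only has to inspect positions that hold a backslash, which is presumably why the contract is stated that way.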
    • setDelimiter

      public void setDelimiter(char delimiter)
      Sets the delimiter, adding it as an escape character if not already configured.
      Parameters:
      delimiter - the delimiter character
    • getDelimiter

      public String getDelimiter()
      Returns the delimiter used by this tokenizer.
      Returns:
      the delimiter, as a string
    • unescape

      public String unescape(String str)
      Unescapes the string without splitting it into tokens.
      Parameters:
      str - the string
      Returns:
      the unescaped string
    • decodeToArray

      public String[] decodeToArray(String line)
      Decodes or parses the string into individual tokens. See decode(String) for details.
      Parameters:
      line - the line to decode
      Returns:
      the tokens
    • decode

      public List<String> decode(String line)
      Decodes or parses the string into individual tokens, converting any escaped characters back to their original.

      Leading and trailing whitespace is trimmed from each token. Any such whitespace that is part of the token must be escaped. For example, " foo " becomes ["foo"], but the whitespace can be escaped with "\ \ foo\ \ ".

      If the delimiter is a whitespace character, multiple adjacent whitespace characters are treated as one delimiter. However, if the delimiter is a non-whitespace character, each delimiter denotes a new token, so adjacent delimiters produce empty tokens. For instance, "foo bar" becomes ["foo", "bar"], but "foo,,bar" becomes ["foo", "", "bar"].

      Parameters:
      line - the line to decode
      Returns:
      the tokens
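      The trimming and delimiter rules described above can be sketched in a stand-alone form. This is an illustration under simplifying assumptions, not the library's code: DecodeSketch and split are hypothetical names, the delimiter is passed explicitly, and a backslash is assumed to make the next character literal (the real escape table is richer). Edge cases such as an empty line are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone illustration; NOT the org.moeaframework Tokenizer implementation.
final class DecodeSketch {
    static List<String> split(String line, char delimiter) {
        boolean whitespaceDelimiter = Character.isWhitespace(delimiter);
        List<String> tokens = new ArrayList<>();
        StringBuilder token = new StringBuilder();
        int keep = 0; // token length up to the last escaped or non-whitespace char

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                // Escaped character: taken literally and protected from trimming.
                token.append(line.charAt(++i));
                keep = token.length();
            } else if (whitespaceDelimiter && Character.isWhitespace(c)) {
                // Runs of whitespace collapse into one delimiter.
                if (token.length() > 0) {
                    tokens.add(token.toString());
                    token.setLength(0);
                    keep = 0;
                }
            } else if (!whitespaceDelimiter && c == delimiter) {
                // Every non-whitespace delimiter closes a token, even an empty one.
                token.setLength(keep); // drop trailing unescaped whitespace
                tokens.add(token.toString());
                token.setLength(0);
                keep = 0;
            } else if (Character.isWhitespace(c)) {
                // Unescaped whitespace: skipped while leading, kept tentatively otherwise.
                if (token.length() > 0) {
                    token.append(c);
                }
            } else {
                token.append(c);
                keep = token.length();
            }
        }

        if (!whitespaceDelimiter) {
            token.setLength(keep);
            tokens.add(token.toString());
        } else if (token.length() > 0) {
            tokens.add(token.toString());
        }
        return tokens;
    }
}
```

      In this sketch, split("foo,,bar", ',') yields three tokens with an empty middle one, while split("foo   bar", ' ') yields two, matching the two cases described above.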
    • escape

      public String escape(String str)
      Escapes the characters in a string.
      Parameters:
      str - the string
      Returns:
      the escaped string
    • encode

      public String encode(String[] tokens)
      Encodes the tokens into a string. See encode(Iterable) for details.
      Parameters:
      tokens - the tokens to encode
      Returns:
      the encoded string
    • encode

      public String encode(Stream<String> tokens)
      Encodes the tokens into a string. See encode(Iterable) for details.
      Parameters:
      tokens - the tokens to encode
      Returns:
      the encoded string
    • encode

      public String encode(Iterable<String> tokens)
      Encodes the tokens into a string. Each token will be escaped following the rules of this tokenizer and joined into a string separated by the delimiter.
      Parameters:
      tokens - the tokens to encode
      Returns:
      the encoded string
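      The escape-then-join behavior can be sketched as follows. This is a stand-alone illustration, not the library's implementation: EncodeSketch is a hypothetical class, the delimiter is passed explicitly, and the escape rule is an assumption that only prefixes '\', space, and the delimiter with a backslash.

```java
import java.util.StringJoiner;

// Stand-alone illustration; NOT the org.moeaframework Tokenizer implementation.
final class EncodeSketch {
    // Assumed escape rule for this sketch: prefix '\', ' ' and the delimiter with '\'.
    static String escape(String token, char delimiter) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c == '\\' || c == ' ' || c == delimiter) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Escapes each token, then joins the results with the delimiter.
    static String encode(Iterable<String> tokens, char delimiter) {
        StringJoiner joiner = new StringJoiner(String.valueOf(delimiter));
        for (String token : tokens) {
            joiner.add(escape(token, delimiter));
        }
        return joiner.toString();
    }
}
```

      Because the delimiter inside a token is escaped before joining, an unescaped delimiter in the output always marks a token boundary, which is what lets a decoder recover the original tokens.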