AmigaOS 4.0 - About OS4 - Commands

Purpose: Convert a text file from one charset into another.
Format: CHARSETCONVERT <fromfile> <fromcharset> [ [TO] <tofile>] [ [TOCHARSET] <tocharset>] [EOL CR | LF | CRLF]
CHARSETCONVERT reads the text file specified by the <fromfile> argument, interprets it in the charset specified by the <fromcharset> argument, and writes it to the text file specified by the <tofile> argument in the charset specified by the <tocharset> argument. The <fromfile> and <tofile> arguments can each specify either a file or a device.

If the <tofile> or TO <tofile> argument is not specified, output defaults to the current window.

If the <tocharset> or TOCHARSET <tocharset> argument is not specified, the output charset defaults to the current system default charset. Note that the keyword form TOCHARSET <tocharset> must be used when the [TO] <tofile> argument is not specified, because a bare third argument would otherwise be taken as <tofile>.

The <fromcharset> and <tocharset> arguments refer to MIME charset names or aliases registered at IANA and stored in L:Charsets/character-sets, as well as custom charset names or aliases stored in L:Charsets/custom-character-sets, as listed for the SetFontCharSet command. Currently only 8-bit charsets that have a mapping table to Unicode in L:Charsets/ are supported (such a table can be created with the BuildMapTable command), plus the following additional charsets:

  • <fromcharset> may also be UTF-7, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE.
  • <tocharset> may also be UTF-8.
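
The idea of an 8-bit charset backed by a mapping table to Unicode can be illustrated with a minimal Python sketch. It is not the command's implementation: the table contents, `build_table` and `to_utf8` are hypothetical names and values chosen for illustration, and nothing here reflects the actual file format used in L:Charsets/.

```python
# Hypothetical 8-bit charset: bytes 0x00-0x7F map to ASCII, and selected
# high bytes map to Unicode code points. The values are illustrative only
# and are not taken from any table in L:Charsets/.
def build_table(high_bytes):
    table = {b: b for b in range(0x80)}  # ASCII range maps to itself
    table.update(high_bytes)             # add the charset-specific entries
    return table

def to_utf8(data, table):
    """Map every input byte to a code point, then encode the text as UTF-8.

    Raises KeyError for a byte that has no entry in the table, i.e. a
    character that is undefined in the source charset.
    """
    return "".join(chr(table[b]) for b in data).encode("utf-8")

# Illustrative entry: byte 0xC1 -> U+0410 (CYRILLIC CAPITAL LETTER A)
TABLE = build_table({0xC1: 0x0410})
result = to_utf8(b"A\xc1", TABLE)
```

Because every input byte is exactly one character, a 256-entry table is enough to describe the whole charset, which is presumably why 8-bit charsets are the common case here.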
EOL <type> converts End-Of-Line (EOL) sequences to the specified type. If not specified, no EOL conversion will happen. The EOL <type> argument must match one of the following keywords:
  • CR     - output EOL as CR   (0x0D,   "\r")   (Mac style).
  • LF     - output EOL as LF   (0x0A,   "\n")   (Amiga style).
  • CRLF - output EOL as CRLF (0x0D0A, "\r\n") (PC style).
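
The EOL option can be pictured with a short Python sketch. It assumes that any of CR, LF or CRLF in the input counts as one line ending, which the text above does not state, and it works on byte-oriented data only. `convert_eol` and `EOL_BYTES` are hypothetical names.

```python
import re

# Keyword -> byte sequence, as listed for the EOL argument.
EOL_BYTES = {"CR": b"\r", "LF": b"\n", "CRLF": b"\r\n"}

def convert_eol(data, eol=None):
    """Rewrite every line ending in data to the requested style.

    With eol=None the data is returned unchanged, mirroring the rule that
    no EOL conversion happens when the argument is omitted.
    """
    if eol is None:
        return data
    target = EOL_BYTES[eol.upper()]
    # CRLF is listed first so it is matched as one line ending, not two.
    return re.sub(rb"\r\n|\r|\n", lambda match: target, data)
```

For example, `convert_eol(b"a\r\nb\rc", "LF")` yields `b"a\nb\nc"`.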
CHARSETCONVERT returns one of the following result codes:
  • 20 (FAILURE) - a severe error occurred.
  • 10 (ERROR)   - the input file contains invalid data (NUL, or a character or encoding sequence that is undefined or invalid in <fromcharset>).
  • 5 (WARN)     - at least one input character could not be represented in <tocharset> and was replaced by a substitute sequence.
  • 0 (OK)       - all went well.
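
The way the result codes relate to the conversion can be sketched in Python. This is an illustration under several assumptions, not the command's actual logic: Python's own codecs stand in for the tables in L:Charsets/, an unknown charset name is treated as the "severe error" case, no output is produced on ERROR, and `"?"` is a placeholder for the substitute sequence. `convert` and the `RC_*` constants are hypothetical names.

```python
import codecs

RC_OK, RC_WARN, RC_ERROR, RC_FAILURE = 0, 5, 10, 20

def convert(data, from_charset, to_charset, substitute="?"):
    """Return (result_code, output_bytes) for one conversion."""
    try:
        codecs.lookup(from_charset)
        codecs.lookup(to_charset)
    except LookupError:
        return RC_FAILURE, b""   # assumption: unknown charset is a severe error
    try:
        text = data.decode(from_charset)
    except UnicodeDecodeError:
        return RC_ERROR, b""     # sequence invalid in the source charset
    if "\x00" in text:
        return RC_ERROR, b""     # NUL counts as invalid input
    out = bytearray()
    code = RC_OK
    for ch in text:
        try:
            out += ch.encode(to_charset)
        except UnicodeEncodeError:
            out += substitute.encode("ascii")  # placeholder substitute
            code = RC_WARN                     # conversion was lossy
    return code, bytes(out)
```

A caller would typically treat 0 and 5 as usable output (with 5 signalling a lossy conversion) and 10 or 20 as a failed run.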

Example 1:

3.OS4:> CHARSETCONVERT russian KOI8-R russian-ISO ISO-8859-5 EOL=LF
reads the text file russian, converts the charset from KOI8-R to ISO-8859-5, converts the EOL sequences to Amiga style (LF), and writes the result to russian-ISO.

Example 2:

3.OS4:> CHARSETCONVERT czech.txt X-ATO-E2 czech-ISO2.txt ISO-8859-2
reads the text file czech.txt, converts the charset from X-ATO-E2 (the charset of the OS3.x Czech catalog files) to ISO-8859-2, replaces unconvertible characters with a substitute sequence, and writes the result to czech-ISO2.txt.

Example 3:

3.OS4:> SETFONT topaz 8 CHARSET ISO-8859-16
3.OS4:> CHARSETCONVERT polish.txt X-ATO-PL TOCHARSET ISO-8859-16
sets the font of the current window to topaz.font, size 8, in ISO-8859-16, then reads the text file polish.txt, converts the charset from X-ATO-PL (the charset of the OS3.x Polish catalog files) to ISO-8859-16, and displays the result in the current window.

