The FTXT spec doesn't mention CSET at all; however, it does say that all characters in the CHRS chunk must be in the core character set of the ISO/DIS 6429.2 and ANSI X3.64-1979 standards.
The CSET spec is very minimal: it's a LONG to specify the charset (I'd imagine we are talking MIBenum here, as used elsewhere in OS4), followed by seven reserved LONGs. It relates to text in any IFF FORM, and I would agree this includes CHRS chunks in FTXT.
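For anyone who wants to see that layout spelled out, here is a minimal sketch in portable C. The struct name `CSetBody`, the helper `cset_decode()` and the 32-byte size constant are my own illustrative names, not taken from any SDK header; the only things assumed from the description above are one 32-bit charset value followed by seven reserved longs, stored big-endian as usual for IFF.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical in-memory form of a CSET chunk body, following the
 * description above: one 32-bit charset value, then seven reserved longs. */
struct CSetBody {
    uint32_t CodeSet;      /* assumed to be an IANA MIBenum; 0 = undefined */
    uint32_t Reserved[7];  /* reserved, expected to be zero */
};

#define CSET_BODY_SIZE 32u  /* 8 longs of 4 bytes each */

/* IFF stores integers big-endian, so decode byte by byte. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode a raw CSET chunk body.
 * Returns 0 on success, -1 if the arguments are invalid or the
 * buffer is shorter than the 32 bytes the layout needs. */
int cset_decode(const uint8_t *data, size_t len, struct CSetBody *out)
{
    if (data == NULL || out == NULL || len < CSET_BODY_SIZE)
        return -1;

    out->CodeSet = read_be32(data);
    for (int i = 0; i < 7; i++)
        out->Reserved[i] = read_be32(data + 4 + 4 * i);
    return 0;
}
```

Decoding byte by byte keeps the sketch independent of host endianness, so the same code would behave identically on a big-endian PPC machine and on a little-endian development box.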
I haven't seen Detlef's updated CSET spec, as the public SDK is typically out of date and doesn't contain it, or a number of other functions I'd quite like to use.
Explanation of possible CodeSet values:
---------------------------------------
0 = undefined, has to be interpreted as ECMA-94 Latin 1. ECMA-94 Latin 1 was the only standard Amiga charset before AmigaOS4, it is identical to ISO-8859-1.
See the autodoc for diskfont.library/ObtainCharsetInfo() how to interpret non-zero values.
See Documentation/Charsets.doc on the OS4 CD for an explanation of character sets.
Here is an incomplete(!) list of character sets which are supported by some parts of AmigaOS4.
   3 - US-ASCII
   4 - ISO-8859-1
   5 - ISO-8859-2
   6 - ISO-8859-3
   7 - ISO-8859-4
   8 - ISO-8859-5
   9 - ISO-8859-6    Caution, ISO-8859-6-E has value 81, ISO-8859-6-I has 82.
  10 - ISO-8859-7
  11 - ISO-8859-8    Caution, ISO-8859-8-E has value 84, ISO-8859-8-I has 85.
  12 - ISO-8859-9
  13 - ISO-8859-10
       ISO-8859-11 is not an official IANA charset. Use TIS-620 instead.
 109 - ISO-8859-13
 110 - ISO-8859-14
 111 - ISO-8859-15
 112 - ISO-8859-16
 106 - UTF-8         Can already be used in catalogs and keymaps.
1012 - UTF-7         Supported by the CharsetConvert command.
1013 - UTF-16BE      Supported by the CharsetConvert command.
1014 - UTF-16LE      Supported by the CharsetConvert command.
1018 - UTF-32BE      Supported by the CharsetConvert command.
1019 - UTF-32LE      Supported by the CharsetConvert command.
2042 - IBM CodePage 437
2084 - KOI8-R
2104 - Amiga-1251    See SDK:Documentation/Localization/Charsets/Amiga-1251/
2250 - windows-1250
2251 - windows-1251
2252 - windows-1252
2253 - windows-1253
2254 - windows-1254
2255 - windows-1255
2256 - windows-1256
2257 - windows-1257
2258 - windows-1258
2259 - TIS-620
AmigaOS may support additional character sets in future. IANA may define additional character sets in future.
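To make the "0 means Latin 1" rule and the value list concrete, here is a rough portable sketch of a lookup. The table holds only a handful of the values quoted above, and `codeset_name()` and `known_charsets` are names made up for this example. On OS4 the proper route is presumably diskfont.library/ObtainCharsetInfo() as the spec says; this is only a stand-in to show the mapping.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical lookup entry: CodeSet value and its charset name. */
struct CharsetName {
    uint32_t    mib;
    const char *name;
};

/* A deliberately small subset of the list quoted above. */
static const struct CharsetName known_charsets[] = {
    {    3, "US-ASCII"     },
    {    4, "ISO-8859-1"   },
    {    5, "ISO-8859-2"   },
    {  106, "UTF-8"        },
    {  111, "ISO-8859-15"  },
    { 2084, "KOI8-R"       },
    { 2104, "Amiga-1251"   },
    { 2252, "windows-1252" },
};

/* Map a CodeSet value to a charset name.
 * 0 is "undefined" and has to be read as ECMA-94 Latin 1, which is
 * identical to ISO-8859-1. Returns NULL for values not in the table,
 * so the caller can decide on a fallback. */
const char *codeset_name(uint32_t codeset)
{
    if (codeset == 0)
        return "ISO-8859-1";

    for (size_t i = 0; i < sizeof known_charsets / sizeof known_charsets[0]; i++) {
        if (known_charsets[i].mib == codeset)
            return known_charsets[i].name;
    }
    return NULL;
}
```

Returning NULL for unknown values rather than guessing seems the safer choice, given that both AmigaOS and IANA may add character sets later.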
Ah, that spec is the same as the one I have then, except with the extra explanation and clarification of valid values for CodeSet (and the change of LONG to uint32, which seems sensible).
I know that; my point is that the clipboard itself is not a problem, and OWB can start putting a CSET chunk on the clipboard right now.
The sooner apps start doing that, the better.
But Shinkuro's original problem was with text copied into OWB. Solving that would still require that the other program knew about the CSET chunk and set it correctly.
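As a rough illustration of what "putting a CSET chunk on the clipboard" amounts to at the byte level, here is a portable sketch that assembles a FORM FTXT containing a CSET chunk followed by a CHRS chunk. The function name `ftxt_build()` is made up for this example, and placing CSET before CHRS is my assumption, not something the spec text above states. A real application would more likely let iffparse.library do the work (PushChunk(), WriteChunkBytes(), PopChunk()) rather than build the bytes by hand.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Store a 32-bit value big-endian, as IFF requires. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Build "FORM FTXT { CSET, CHRS }" into out.
 * Returns the number of bytes written, or 0 if the buffer is too
 * small or the arguments are invalid. */
size_t ftxt_build(uint8_t *out, size_t cap, uint32_t codeset,
                  const char *text, size_t text_len)
{
    if (out == NULL || (text == NULL && text_len != 0) || text_len > 0x7FFFFF00u)
        return 0;

    size_t pad   = text_len & 1;  /* chunk bodies are padded to even length */
    size_t total = 12             /* "FORM" + size + "FTXT" */
                 + 8 + 32         /* CSET header + 8 longs */
                 + 8 + text_len + pad;
    if (cap < total)
        return 0;

    uint8_t *p = out;
    memcpy(p, "FORM", 4);
    put_be32(p + 4, (uint32_t)(total - 8));  /* size excludes "FORM" + size field */
    memcpy(p + 8, "FTXT", 4);
    p += 12;

    memcpy(p, "CSET", 4);
    put_be32(p + 4, 32);
    p += 8;
    memset(p, 0, 32);        /* the seven reserved longs stay zero */
    put_be32(p, codeset);    /* first long carries the charset value */
    p += 32;

    memcpy(p, "CHRS", 4);
    put_be32(p + 4, (uint32_t)text_len);  /* size field excludes the pad byte */
    p += 8;
    if (text_len != 0)
        memcpy(p, text, text_len);
    p += text_len;
    if (pad)
        *p++ = 0;

    return (size_t)(p - out);
}
```

For example, `ftxt_build(buf, sizeof buf, 106, "abc", 3)` would tag the three bytes as UTF-8. The receiving program still has to look for the CSET chunk, which is exactly the remaining problem.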
"Never ascribe to malice that which is adequately explained by incompetence." (Napoléon Bonaparte) "I would love to change the world, but they won't give me the source code." (Unknown)
I see another problem here - Datatypes textclass doesn't have any charset information, and (I'm guessing) the FTXT datatype doesn't look at the CSET chunk.
This isn't a problem for most applications that use the clipboard, as they will be reading and writing clips directly or through iffparse rather than using Datatypes.
However, if the clipboard is viewed through Multiview using Datatypes, or an FTXT file not on the clipboard is read through Datatypes, the charset will be ignored. This probably isn't relevant to OWB, but it does need fixing (adding a tn_CharSet value to the Text structure, plus a flag to indicate that it is available, would be a good start).
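For what it's worth, the reading side is not much work either. Below is a sketch of what any reader that wants to honour CSET would have to do, whether that is an application parsing the clip itself or a datatype: walk the chunks of the FORM FTXT and pick up the first CSET it finds. `ftxt_find_codeset()` is a made-up name, the code only looks at top-level chunks (nested FORMs are skipped), and returning 0 when no CSET is present relies on 0 meaning "undefined, treat as Latin 1" as quoted earlier.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a big-endian 32-bit value. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Scan a FORM FTXT held in memory for a CSET chunk.
 * Returns the charset value from the first CSET chunk found, or 0
 * (undefined, i.e. Latin 1) if there is none or the data is malformed. */
uint32_t ftxt_find_codeset(const uint8_t *buf, size_t len)
{
    if (buf == NULL || len < 12 ||
        memcmp(buf, "FORM", 4) != 0 || memcmp(buf + 8, "FTXT", 4) != 0)
        return 0;

    /* Never trust the FORM size beyond what is actually in the buffer. */
    uint32_t form_size = get_be32(buf + 4);
    size_t end = (form_size > len - 8) ? len : 8 + (size_t)form_size;

    size_t pos = 12;
    while (pos + 8 <= end) {
        uint32_t size = get_be32(buf + pos + 4);
        size_t body = pos + 8;

        if (size > end - body)
            break;  /* truncated chunk, stop scanning */

        if (memcmp(buf + pos, "CSET", 4) == 0 && size >= 4)
            return get_be32(buf + body);

        pos = body + size + (size & 1);  /* skip the pad byte after odd sizes */
    }
    return 0;
}
```

The result could then be stored in something like the proposed tn_CharSet field, so that whoever renders the text knows how to interpret the bytes.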