What is OS4's default character set?

Posted on: 2014/11/10 10:48 #1

Home away from home

Assuming I am using a standard (British or USA) English installation of AmigaOS4, does it USUALLY use the ISO-8859-1 (Latin 1) character set?

Wikipedia states that AmigaDOS (1.x) uses ISO-8859-1 (Latin 1), so I'm guessing that's what future versions of AmigaOS stuck to for English users?

BTW, I see a Libs:Charsets folder - is there a standard Amiga library which allows converting between charsets? The SDKBrowser wasn't much help in answering this question.

Author of the PortablE programming language.

Re: What is OS4's default character set?

Posted on: 2014/11/10 12:20 #2

Just popping in

That depends on your locale settings. For plain English the system charset should be ISO-8859-1, but for Czech or other Slavic languages, for example, it will be ISO-8859-2.

These two lines will give you the currently used charset:

LONG default_charset = GetDiskFontCtrl(DFCTRL_CHARSET);
char *charset = (char *)ObtainCharsetInfo(DFCS_NUMBER, default_charset, DFCS_MIMENAME);

Both are functions of diskfont.library.

Conversion is best done by codesets.library.

Why stop it now, just when I am hating it?

Thore Böckelmann

Re: What is OS4's default character set?

Posted on: 2014/11/10 12:24 #3

Just can't stay away

@ChrisH

You can use the iconv functions from newlib.library to convert between charsets.
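
A minimal sketch of what an iconv-based conversion might look like, assuming newlib's iconv follows the usual POSIX signature and recognises both charset names (which names the OS4 build accepts is not confirmed here). The helper name convert_charset() is made up for illustration:

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string `src` from charset
 * `from` to charset `to`, writing a NUL-terminated result into `dst`.
 * Returns the number of bytes written (excluding the NUL), or (size_t)-1
 * if the conversion could not be opened or failed. */
size_t convert_charset(const char *from, const char *to,
                       const char *src, char *dst, size_t dst_size)
{
    iconv_t cd;
    char *in = (char *)src;      /* iconv() takes a non-const pointer */
    char *out = dst;
    size_t in_left, out_left, rc;

    if (dst_size == 0)
        return (size_t)-1;

    cd = iconv_open(to, from);   /* note the order: destination first */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    in_left = strlen(src);
    out_left = dst_size - 1;     /* keep room for the terminator */

    rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;       /* invalid input or output buffer too small */

    *out = '\0';
    return (size_t)(out - dst);
}
```

For example, convert_charset("ISO-8859-15", "UTF-8", "\xA4", buf, sizeof buf) should leave the three-byte UTF-8 Euro sign in buf on a system whose iconv knows both names.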

Re: What is OS4's default character set?

Posted on: 2014/11/10 12:37 #4

Amigans Defender

@ChrisH

Don't assume a charset. I use ISO-8859-15, otherwise the Euro symbol is missing (I know we don't use the Euro here, but that doesn't mean I don't want or need to type it...!).

There are various ways of finding the current charset. I use the same as tboeckel's code above, although somebody did mention that I shouldn't be using that, but the alternative looked convoluted, not that I can remember what it was.

Re: What is OS4's default character set?

Posted on: 2014/11/10 12:53 #5

Home away from home

struct Locale *locale = ILocale->OpenLocale(NULL);

charset = locale->loc_CodeSet;

where

uint32 loc_CodeSet
Specifies the code set required by this locale. Before V50, this
value was always 0. Since V50, this is the IANA charset number
(see L:CharSets/character-sets). For compatibility, 0 should be
handled as equal to 4, both meaning ISO-8859-1 Latin1.
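
To illustrate the "0 should be handled as 4" rule, here is a small sketch that maps a MIBenum number to a MIME name. The function name mib_to_name() and the table are hypothetical and cover only a handful of entries from the IANA registry; on OS4 itself the name would normally come from the system rather than from a hard-coded table.

```c
#include <stddef.h>

/* A tiny, illustrative subset of the IANA character-set registry. */
struct mib_entry {
    unsigned long mib;   /* IANA MIBenum value */
    const char *name;    /* preferred MIME name */
};

static const struct mib_entry mib_table[] = {
    {   3, "US-ASCII"    },
    {   4, "ISO-8859-1"  },
    {   5, "ISO-8859-2"  },
    {  12, "ISO-8859-9"  },
    { 106, "UTF-8"       },
    { 111, "ISO-8859-15" },
};

/* Hypothetical helper: return the MIME name for a loc_CodeSet-style value,
 * or NULL if the value is not in the table above. */
const char *mib_to_name(unsigned long mib)
{
    size_t i;

    if (mib == 0)
        mib = 4;   /* pre-V50 locales report 0, meaning ISO-8859-1 */

    for (i = 0; i < sizeof mib_table / sizeof mib_table[0]; i++) {
        if (mib_table[i].mib == mib)
            return mib_table[i].name;
    }
    return NULL;
}
```

A caller would pass in locale->loc_CodeSet and treat a NULL result as "unknown charset" rather than guessing.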

Blender For OS4.x : Blues : Walker Broad

Re: What is OS4's default character set?

Posted on: 2014/11/10 12:54 #6

Home away from home

or

Starting with V50, locale.library maintains a global environment
variable called "Charset" which contains the MIME name of the
current default charset as used in the system. This is the name
of the charset associated with the Locale structure returned by
OpenLocale(NULL).
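
A rough sketch of reading that variable from C, assuming the C runtime's getenv() can see AmigaOS global environment variables (that is an assumption, not something the autodoc states). The function name current_charset_name() is invented for this example, and the ISO-8859-1 fallback is just a last resort for when the variable is missing:

```c
#include <stdlib.h>

/* Hypothetical helper: return the MIME name stored in the "Charset"
 * environment variable, or "ISO-8859-1" if it is unset or empty. */
const char *current_charset_name(void)
{
    const char *name = getenv("Charset");

    if (name == NULL || name[0] == '\0')
        return "ISO-8859-1";   /* last-resort fallback, not a guarantee */

    return name;
}
```

The returned pointer belongs to the runtime's environment, so copy the string if it has to outlive later environment changes.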

Blender For OS4.x : Blues : Walker Broad

Re: What is OS4's default character set?

Posted on: 2014/11/10 12:55 #7

Just popping in

I want to use my AmigaOS4.1 in English but set the fonts to Turkish. I still haven't figured this out.

Re: What is OS4's default character set?

Posted on: 2014/11/10 12:55 #8

Home away from home

PS: It took me two minutes from not knowing, to RTFM, to finding out, why autodoc authors even bother?

Blender For OS4.x : Blues : Walker Broad

Re: What is OS4's default character set?

Posted on: 2014/11/10 13:02 #9

Home away from home

@broadblues Quote:

charset = locale->loc_CodeSet;

On its own a "MIBenum" number doesn't look terribly useful. I'll have to see if there is a way to get a meaningful name from it... (Maybe GetDiskFontCtrl() will do the job.)

Quote:

PS: It took me two minutes from not knowing, to RTFM, to finding out, why autodoc authors even bother?

You didn't even answer my main question (i.e. is ISO-8859-1 the default for English), so no need to be grumpy. Wikipedia was literally the ONLY website with any information.

Author of the PortablE programming language.

Re: What is OS4's default character set?

Posted on: 2014/11/10 13:15 #10

Home away from home

@ChrisH

Quote:

You didn't even answer my main question (i.e. is ISO-8859-1 the default for English),

There is no "default"; always use Locale to determine the charset. For example, once ancalimon creates his own custom locale for English in a Turkish character set, the mere fact that it's using English would mess you up.

And as the other Chris said, many English users use ISO-8859-15 these days.

Quote:

so no need to be grumpy.

You beat me to the edit, where I was about to say that I'm only saying this because all the contributors to the thread are established and experienced developers, who should at least know how to read the autodocs. I of course wouldn't say it to a newbie dev, and if it came over as overly grumpy then sorry.

Quote:

Wikipedia was literally the ONLY website with any information.

I would trust wiki.amigaos.net over Wikipedia in such matters any day.

Blender For OS4.x : Blues : Walker Broad

Re: What is OS4's default character set?

Posted on: 2014/11/10 13:25 #11

Home away from home

@tboeckel Quote:

For plain english the system charset should be ISO-8859-1

Thanks! It's amazing how this seems to be assumed as common knowledge, but doesn't actually seem to be stated anywhere (apart from Wikipedia, the unreliable font of all knowledge).

Quote:

LONG default_charset = GetDiskFontCtrl(DFCTRL_CHARSET);
char *charset = (char *)ObtainCharsetInfo(DFCS_NUMBER, default_charset, DFCS_MIMENAME);

Both are functions of diskfont.library.

Thanks. I may use that eventually - at the moment I just want to get something working with the common case (aka ISO-8859-1).

Quote:

Conversion is best done by codesets.library.

Which means I probably won't

Author of the PortablE programming language.

Re: What is OS4's default character set?

Posted on: 2014/11/10 13:39 #12

Home away from home

@ChrisH

Quote:

I'll have to see if there is a way to get a meaningful name from it... (Maybe GetDiskFontCtrl() will do the job.)

No, put the number from struct Locale into ObtainCharsetInfo()

Blender For OS4.x : Blues : Walker Broad

Re: What is OS4's default character set?

Posted on: 2014/11/10 14:17 #13

Home away from home

@tboeckel Quote:

LONG default_charset = GetDiskFontCtrl(DFCTRL_CHARSET);
char *charset = (char *)ObtainCharsetInfo(DFCS_NUMBER, default_charset, DFCS_MIMENAME);

Both of those appear to be V50, so looks like I'll still have to assume ISO-8859-1 for AmigaOS 3.x (and probably MOS+AROS until I can be bothered to find out how they do it).

Author of the PortablE programming language.

Re: What is OS4's default character set?

Posted on: 2014/11/10 14:31 #14

Just popping in

Take a look at the source of codesets.library, function getSystemCodeset():

http://sourceforge.net/p/codesetslib/ ... EAD/tree/trunk/src/init.c

There you will find how codesets.lib supports all systems to obtain the currently active charset. Eventually it falls back to ISO-8859-1 if all other attempts fail.

Why stop it now, just when I am hating it?

Thore Böckelmann

Re: What is OS4's default character set?

Posted on: 2014/11/10 14:52 #15

Amigans Defender

@ChrisH

Quote:

Thanks. I may use that eventually - at the moment I just want to get something working with the common case (aka ISO-8859-1).

No. DO NOT ASSUME A CERTAIN CHARSET IS IN USE.

Re: What is OS4's default character set?

Posted on: 2014/11/10 15:41 #16

Home away from home

@ChrisH

What are you interested in the character set for?

If you want consistency, you should store any text string as UTF-8 and convert it to the character set used by the user.

At least if it is a language file.

In an 8-bit character set, codes 0 to 127 are the usual English characters (7-bit ASCII), while codes 128 to 255 are language-specific characters. The symbols for these are not the same between languages; they are controlled by the codeset that the user has selected.

If it's English you want, it makes little difference which charset you use, apart from the "€" symbol.

The codeset defines which symbol the OS should show, depending on the language. The symbols also correspond to values in the UTF-32 table used by the fonts.

Therefore there is no “default character set”; character sets are irrelevant when it comes to 7-bit ASCII.
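
As a small illustration of the point about codes 128 to 255, the sketch below maps one byte to a Unicode code point under two charsets. The function names are invented for this example; the eight positions listed are the ones where ISO-8859-15 differs from ISO-8859-1.

```c
#include <stdint.h>

/* ISO-8859-1: every byte value equals its Unicode code point. */
uint32_t latin1_to_codepoint(unsigned char c)
{
    return c;
}

/* ISO-8859-15: identical to ISO-8859-1 except for eight positions. */
uint32_t latin9_to_codepoint(unsigned char c)
{
    switch (c) {
    case 0xA4: return 0x20AC;  /* EURO SIGN (currency sign in Latin-1) */
    case 0xA6: return 0x0160;  /* S with caron */
    case 0xA8: return 0x0161;  /* s with caron */
    case 0xB4: return 0x017D;  /* Z with caron */
    case 0xB8: return 0x017E;  /* z with caron */
    case 0xBC: return 0x0152;  /* OE ligature */
    case 0xBD: return 0x0153;  /* oe ligature */
    case 0xBE: return 0x0178;  /* Y with diaeresis */
    default:   return c;       /* everything else matches Latin-1 */
    }
}
```

So byte 0x41 is "A" either way, while byte 0xA4 is "¤" under ISO-8859-1 and "€" under ISO-8859-15.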

Edited by LiveForIt on 2014/11/10 15:57:30

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: What is OS4's default character set?

Posted on: 2014/11/10 18:50 #17

Home away from home

@ancalimon

Quote:

I want to use my AmigaOS4.1 in English but set the fonts to Turkish. I still haven't figured this out.

You will need to set your keyboard map to turkish to get your charset, then your prefered language to english to get you language. They may or may not produce the effect you requIre?Iim testIng that concept as I type and these arenit typos! There agIn I donit know If the odd characters from they board wIll get trasmItted by AWeb????

BeIng unable to read turkIsh Iim unable to verIfy If that results In turkIsh language beIng dIsplayed correctly?

Blender For OS4.x : Blues : Walker Broad

Re: What is OS4's default character set?

Posted on: 2014/11/12 3:41 #18

Just popping in

@All,

The only "common" information is... RKRM 3rd Edition based,

ISO-Latin-1 (this is ISO-8859-1 through ISO-8859-15 collectively)

You can only trust the character codes up to code 127 (DEL).

Anything above character 127 is subject to change at the user's whim.

Additionally ... I am working with UTF-8 as the codeset of choice for my own projects.

Use locale.library to get the MIBenum value, then look it up in the on-disk reference file that maps these values to names (look in S: and L:).

DiskFont.Library will only tell you about what is currently displayed (and I am having fun and games with *multiple* keymaps along with chording whole typed words for presenting small menus of options... 3000+ "daily Kanji" with readings anywhere from 1 through to 8 syllables for common and up to 16 syllables for uncommon readings, where each "syllable" is equal to 2 or 3 English letters... and that is only for the Japanese).

I wonder how anyone will cope when the "system default" is set for Unicode and there is no "upper limit" for Character codes (when a 32bit CodePoint IS reasonable).

Assumptions == Screwups of the worst kind... good to ask and definitely double-check before cutting code out of the frypan :P

Re: What is OS4's default character set?

Posted on: 2015/1/18 10:54 #19

Home away from home

Thanks again for everyone's suggestions on how to determine the current character set... even though it wasn't originally my intention to ask for that! Having finally got through a large list of things which were more necessary for my new program to function, I've now looked through those suggestions again, and implemented a hopefully good way of getting the current character set.

@Chris
Quote:

No. DO NOT ASSUME A CERTAIN CHARSET IS IN USE.

I don't see what's wrong with writing a "stub" function (which always returns ISO-8859-1), until my program becomes functional enough that it's worth finding out how to do it properly. Us solo programmers need to pick our fights carefully, and avoid extra work which isn't strictly necessary:
http://www.lispcast.com/how-to-write-software
(I agree with virtually everything he writes, apart from the part where he says to spend ages ensuring you write something 100% perfect the first time around.)

@broadblues
Thanks for your suggestion of "locale->loc_CodeSet". At the moment I'm using that first, and only if it fails for some reason do I fall back to using "GetDiskFontCtrl(DFCTRL_CHARSET)".

Quote:

I would trust wiki.amigaos.net over Wikipedia in such matters any day.

Of course. But Google didn't find the info I was after on wiki.amigaos.net .

@tboeckel
Thanks for both of your suggestions. getSystemCodeset() was helpful in seeing how to do it on MorphOS & AROS.

Quote:

Conversion is best done by codesets.library.

I ended up writing my own code to convert to/from other charsets, and read/write UTF-8 (the latter being somewhat time consuming since I wasn't familiar with how UTF-8 worked before). One benefit of doing it with my own code is that it will work on Windows/etc without any extra effort. Another benefit is that I can convert to/from UTF-8 while simultaneously converting encoded XML characters (rather than doing it less efficiently in two separate passes).
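
For anyone else who hasn't met UTF-8 before, the encoding side is roughly the following. This is only a sketch of the standard algorithm in plain C, not the actual PortablE code, and the function name utf8_encode() is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8.
 * Writes 1 to 4 bytes into `out` and returns how many were written,
 * or 0 if `cp` is a surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 7-bit ASCII: one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not valid */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {                   /* four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* out of Unicode range */
}
```

Since every ISO-8859-1 byte equals its code point, converting Latin-1 text to UTF-8 amounts to calling this once per input byte.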

Author of the PortablE programming language.

Re: What is OS4's default character set?

Posted on: 2015/1/18 11:08 #20

Home away from home

@LiveForIt Quote:

What are you interested in the character set for?

Text downloaded from the internet comes in all sorts of encodings, and displaying them correctly is tricky.

Quote:

If you want consistency, you should store any text string as UTF-8 and convert it to the character set used by the user.

That is in fact what I settled on doing, otherwise things get too complicated. Luckily XML tends to be UTF8 in the first place.

@Belxjander
Quote:

DiskFont.Library will only tell you about what is currently displayed

I'm afraid I don't understand how that might be a problem. Would "locale->loc_CodeSet" (after "locale=OpenLocale(NULL);") be better than "GetDiskFontCtrl(DFCTRL_CHARSET)"?

Quote:

I wonder how anyone will cope when the "system default" is set for Unicode and there is no "upper limit" for Character codes (when a 32bit CodePoint IS reasonable).

I don't see how AmigaOS can support Unicode as the system default. Even using UTF-8 would cause problems for many programs, which assume 1 byte is 1 character.

About the only solution I *can* see for AmigaOS, would be to have new functions which were explicitly UTF-8 (possibly also allowing UTF-16), and then have all legacy OS functions automatically convert UTF-8 to/from a "legacy character set". Anything which can't be converted would get replaced by a question mark or whatever (which apparently isn't advised for security reasons, but I can't see a better solution).
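
To make the "replace with a question mark" idea concrete, here is a sketch of a lossy UTF-8 to ISO-8859-1 conversion in plain C. It is not an AmigaOS API; the function name utf8_to_latin1() is hypothetical, and ISO-8859-1 merely stands in for whatever the "legacy character set" would be.

```c
#include <stddef.h>

/* Convert NUL-terminated UTF-8 in `src` to ISO-8859-1 in `dst`.
 * Anything that cannot be represented (or is malformed) becomes one '?'.
 * `dst` must hold at least strlen(src) + 1 bytes, since the output is
 * never longer than the input. Returns the output length without the NUL. */
size_t utf8_to_latin1(const char *src, char *dst)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    while (*s) {
        unsigned char b = *s++;

        if (b < 0x80) {
            /* 7-bit ASCII passes straight through */
            dst[n++] = (char)b;
        } else if ((b == 0xC2 || b == 0xC3) && (*s & 0xC0) == 0x80) {
            /* U+0080..U+00FF: the only two-byte sequences Latin-1 can hold */
            dst[n++] = (char)(((b & 0x03) << 6) | (*s++ & 0x3F));
        } else {
            /* Not representable or malformed: skip any continuation
             * bytes that follow and emit a single replacement character */
            while ((*s & 0xC0) == 0x80)
                s++;
            dst[n++] = '?';
        }
    }
    dst[n] = '\0';
    return n;
}
```

With this, the UTF-8 text "café €5" would come out as "café ?5" in Latin-1, which shows both the appeal and the information loss of the approach.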

Author of the PortablE programming language.
