Unicode support in future os4 updates?


Re: Unicode support in future os4 updates?

Posted on: 2007/7/14 13:13 #21

Home away from home

@ZeroG

Quote:

Where is the problem in displaying Swedish characters using the Swedish charset, once the UTF-8 string has been converted to the normal Swedish Amiga charset?

On AmigaOS4 with ReAction there is no problem. On OS4 with MUI you have to use workarounds (fontname_MIME-charset-name.font). If Swedish doesn't use ISO-8859-1, it's not possible in MUI programs on AmigaOS 3.x: on AmigaOS <= 3.9 the charset is always ISO-8859-1. You can use something else only if you use Unicode and the bullet API and render all text yourself, not through any GUI system.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/14 13:33 #22

Home away from home

@TetiSoft

Quote:

I've heard that the IRC protocol doesn't include any MIME
specification of the used charset. The user is responsible
for knowing which charset is used by the other user and for
sending the text he typed in the charset which is expected by
the other user.

Yes, and it's even worse than that: originally IRC used the 7-bit IBM national charset of Norway (IIRC, or wherever else it was invented).

But since IRC has no charset specification, using anything but UTF-8 (or US-ASCII) doesn't make sense.

Quote:

Or in other words, you have to tell the other users that they shall send ISO-8859-1 or -15.

Except for private chats you usually can't do that; the networks and/or channels define a charset all users should use.

Quote:

Or you use an IRC client
which is able to decode UTF-8 and to convert it to the
current OS4 system default charset before displaying the
text.

And which can convert everything you type from the system default charset to UTF-8 before sending it. It should support converting from/to other 8-bit charsets as well, since UTF-8 isn't used everywhere. All IRC clients, except for the AmigaOS ones (AFAIK WookieChat is the only one still developed), support that, and most try to auto-detect UTF-8 when receiving text even if you have configured them to use an 8-bit charset.
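
To make the client behaviour described above concrete, here is a minimal sketch in Python (not Amiga code). The function names and the choice of ISO-8859-15 as the configured 8-bit charset are assumptions for illustration only; this is not how WookieChat or any particular client is implemented.

```python
# Hypothetical helpers: names and defaults are illustrative assumptions.

def decode_incoming(raw: bytes, fallback: str = "iso-8859-15") -> str:
    """Try strict UTF-8 first; if the bytes are not valid UTF-8,
    fall back to the configured 8-bit charset."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


def encode_outgoing(text: str, wire_charset: str = "utf-8") -> bytes:
    """Convert typed text to whatever charset the channel expects.
    Characters the target charset lacks become '?'."""
    return text.encode(wire_charset, errors="replace")


greeting = "Hallå, världen"
# The same text arrives readable whether the sender used UTF-8 or Latin-9.
print(decode_incoming(greeting.encode("utf-8")))
print(decode_incoming(greeting.encode("iso-8859-15")))
```

The detection can be fooled by 8-bit text that happens to form valid UTF-8 sequences, which is why the configured fallback still matters.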

Re: Unicode support in future os4 updates?

Posted on: 2007/7/14 20:36 #23

Just can't stay away

@joerg

Quote:

joerg wrote:
@TetiSoft
Yes, and it's even worse than that: originally IRC used the 7-bit IBM national charset of Norway (IIRC, or wherever else it was invented).

In Oulu, Finland ;) The guy is a Ph.D. nowadays.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/15 0:11 #24

Quite a regular

@keisangi

If you want to be taken seriously, you should really adapt your comment style.

TetiSoft is an expert when it comes to charsets; you'd be much better off listening to him instead of making unqualified comments.

Edited by orgin on 2007/7/15 8:31:26

Seriously, if you do want to contact me write me a mail. You're more likely to get a reply then.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 15:59 #25

Home away from home

@joerg

Quote:

you'd have to (re)implement all gadgets

UTF8 is not magic. It's easily detectable because of the way it is encoded: you can check whether a string is in valid UTF-8 format, and if it is not, treat it as ASCII. For storage it only requires a zero-terminated text string. The problem arises when a program requires ASCII (8-bit) and needs to read attributes from a gadget class that supports UTF-8: you can't do that with the existing tags, therefore explicit UTF-8 support tags must be added.

UTF8 support can be extended one class at a time.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 16:17 #26

Home away from home

@TetiSoft

Quote:

which is attached to the serial port

Well, that's one of your excuses, I believe, because you know UTF-8 will need to be extended system-wide, and you know it's a big job to add it.

It's not like you can't convert UTF-8 back to 8-bit using the code page before sending it over the serial port.

Quote:

or logged in via net.

Again, I don't see how this is relevant. SSH and other shells can be supported simply by character conversion from UTF-8.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 16:51 #27

Home away from home

@TetiSoft

Quote:

a simple conversion from UTF-8, to the current system default charset is enough to talk
swedish with swedish friends.

Well, in that process you're losing lots of information; better to convert 64 bit UTF-8 and display the chars directly.

Your option is somewhat of a hack, if you ask me. If you'd like to display the text in the right format, you first need to analyze the UTF-8 data to detect what language it was typed in, then switch the code page to the correct format and convert the text to 8-bit; then you can render the data and switch the code page back to the one you were using. From a developer's point of view, does that make sense?

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 17:05 #28

Home away from home

@TetiSoft

Quote:

I would not even think about allowing AmigaOS to display
UTF-8 until most parts of AmigaOS are charset aware

Because, as you and I know, adding UTF-8 support to AmigaOS is a big task, it's unrealistic to think that UTF-8 will be added to all corners of the OS without first having some basic UTF-8 support.

In no way is it appropriate to add UTF-8 support as a hack that conflicts with major parts of the OS. Therefore UTF-8 must be added as an optional feature that can be extended to supported components one by one: something that can be used by anything that supports it, but is not used by anything that doesn't. UTF-8 support should therefore start with displaying UTF-8 text in a simple way; we are not talking about the bullet API.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 18:14 #29

Just popping in

@LiveForIt

Quote:

UTF8 is not magic

We are aware of that already.
Quote:

it's easily detectable because of the way it is encoded:
you can check whether a string is in valid UTF-8 format, and
if it is not, treat it as ASCII. For storage it only requires
a zero-terminated text string. The problem arises when a
program requires ASCII (8-bit) and needs to read attributes
from a gadget class that supports UTF-8: you can't do that
with the existing tags, therefore explicit UTF-8 support
tags must be added.

OS4 has used explicit charset tags for years
and you suggest we should drop that and try to use UTF-8
autodetection? No, thanks.

Please read the docs again.
The existing tag value for UTF-8 is 106, as specified by
IANA in L:charsets/character-sets.
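
As a small illustration of charset tag values: the IANA character set registry assigns each charset a number (the "MIBenum"), and 106 is the one for UTF-8. The Python sketch below maps a handful of those registry numbers to Python codec names. The dictionary and the function name are assumptions for illustration; the sketch does not read or reproduce the format of L:charsets/character-sets.

```python
# A few IANA MIBenum values mapped to Python codec names.
# The table is a hand-picked subset, not the OS4 file.
MIBENUM_TO_CODEC = {
    3: "ascii",          # US-ASCII
    4: "iso-8859-1",     # Latin-1
    10: "iso-8859-7",    # Greek
    106: "utf-8",
    111: "iso-8859-15",  # Latin-9
    2084: "koi8-r",
}


def decode_tagged(raw: bytes, mibenum: int) -> str:
    """Decode bytes whose charset is given as a numeric tag.
    A tag missing from the table raises KeyError instead of guessing."""
    return raw.decode(MIBENUM_TO_CODEC[mibenum])


# The same byte means different things under different tags.
print(decode_tagged(b"\xe1", 4))    # Latin-1 reading
print(decode_tagged(b"\xe1", 10))   # Greek reading
```

With an explicit tag, nothing has to be detected: the caller states the charset and the receiver acts on it.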

Quote:

UTF8 support can be extended one class at a time.

I don't see a reason why any class should handle UTF-8
differently from ISO-8859-7. OK, when it tries to interpret
the text to e.g. force underlining of shortcut characters
in labels it needs some adjustments, but for simply
displaying text there should IMHO be no difference between
any 8-bit charset and UTF-8.
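
The adjustment needed for shortcut underlining comes down to byte offsets versus character offsets. The Python sketch below assumes, purely for illustration, that a label marks its shortcut with a preceding underscore; the function name is hypothetical and no ReAction class is being described.

```python
def shortcut_offsets(label: str, charset: str):
    """Return (character_offset, byte_offset) of the text before the
    '_' shortcut marker when the label is stored in `charset`."""
    marker = label.index("_")
    prefix = label[:marker]
    return len(prefix), len(prefix.encode(charset))


label = "Öppna _fil"
# In an 8-bit charset both offsets agree; in UTF-8 they drift apart
# as soon as a non-ASCII character precedes the shortcut.
print(shortcut_offsets(label, "iso-8859-1"))
print(shortcut_offsets(label, "utf-8"))
```

Code that only passes the bytes on to the renderer never notices the difference; code that counts characters to place the underline does.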

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 18:54 #30

Just popping in

@LiveForIt

Quote:

Well, that's one of your excuses, I believe,

Of course I've been sitting here for years doing nothing,
enjoying the Ferrari I bought from the OS4 sales,
and formulating lame excuses when somebody asks for
"UTF-8 support in OS4" without giving details where
and why he needs exactly what. In real life the Ferrari
is an Opel, and I don't know how to pay its insurance.

Quote:

because you know UTF-8 will need to be extended system-wide,
and you know it's a big job to add it.

UTF-8 shall be used system-wide? And it's my job to do that?
Is it your job to do the user support hotline when the
users can't read their existing text files anymore?

Quote:

It's not like you can't convert UTF-8 back to 8-bit using the
code page before sending it over the serial port.

You don't know what you are talking about, sorry.
I am talking about adding support for UTF-8 keymaps
to keymap.library. Yes, it can already read UTF-8 keymap
text files, just in case you missed it. But if it
created UTF-8 keymaps in memory, this would break nearly
every existing shell/console/terminal/KingCON/whatever.
The UTF-8 encoded strings for non-ASCII characters would
be no problem in most cases, BUT the special keys
(cursor, function keys, ESC etc) are used unchanged by
most software; do you volunteer to create new shells/
consoles/con-handlers etc which can handle ESC[ or UTF-8
escape sequences, not only CSI escape sequences? No?
Any excuse? Well, if you don't, why should I?
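
The special-key problem can be shown with a few bytes. In 8-bit mode the control sequence introducer (CSI) is the single byte 0x9B; its 7-bit equivalent is ESC followed by `[`; and in UTF-8 the same control character U+009B is encoded as the two bytes 0xC2 0x9B. A lone 0x9B is not valid UTF-8 at all. The Python sketch below uses "cursor up" (CSI A) as the example key; the helper names are assumptions and nothing here is taken from keymap.library or any con-handler.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """True if the bytes form a well-formed UTF-8 string."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def csi_length(raw: bytes) -> int:
    """Length of the control sequence introducer at the start of `raw`
    in any of its three spellings, or 0 if there is none."""
    for intro in (b"\x1b[", b"\xc2\x9b", b"\x9b"):
        if raw.startswith(intro):
            return len(intro)
    return 0


cursor_up_8bit = b"\x9bA"                   # what old software expects
cursor_up_7bit = b"\x1b[A"                  # ESC[ form, pure ASCII
cursor_up_utf8 = "\u009bA".encode("utf-8")  # b"\xc2\x9bA"

for seq in (cursor_up_8bit, cursor_up_7bit, cursor_up_utf8):
    print(seq, is_valid_utf8(seq), csi_length(seq))
```

A program that compares incoming bytes against 0x9B only recognises the first spelling, which is why a UTF-8 keymap would need either ESC[ or UTF-8 aware consumers.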

Quote:

Again, I don't see how this is relevant.
SSH and other shells can be supported
simply by character conversion from UTF-8.

SSH is a shell and either already knows that the next generation
OS4 will use UTF-8 encoded escape sequences or doesn't interpret
escape sequences at all? And you already modified its source code
to tell keymap.library which charset shall be used?

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 19:18 #31

Just popping in

@LiveForIt

Quote:

Well, in that process you're losing lots of information;
better to convert 64 bit UTF-8 and display the chars directly.

64 bit UTF-8? Which RFC is that exactly?

When typing Swedish to a Swedish friend, you can't lose any
information when you don't support more than ISO-8859-1
or -15, because those ISO standards were created for Swedish
people. When you lose something it must have been something
which was definitely not Swedish...
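
Both points can be checked quickly. UTF-8 is built from 8-bit units and uses one to four of them per character, and Swedish text survives a round trip through ISO-8859-1 or -15 unchanged. The Python sketch below is only a demonstration; the helper names and sample strings are made up for it.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


def survives(text: str, charset: str) -> bool:
    """True if `text` can be converted to `charset` and back unchanged."""
    try:
        return text.encode(charset).decode(charset) == text
    except UnicodeEncodeError:
        return False


for ch in ("a", "å", "€", "😀"):
    print(repr(ch), utf8_length(ch))   # 1, 2, 3 and 4 bytes

swedish = "Räksmörgås på Öland"
print(survives(swedish, "iso-8859-1"), survives(swedish, "iso-8859-15"))
print(survives("Привет", "iso-8859-1"))  # Cyrillic does not fit
```

Anything lost in the conversion was, as stated above, not Swedish to begin with.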

Quote:

Your option is somewhat of a hack, if you ask me. If you'd
like to display the text in the right format, you first need
to analyze the UTF-8 data to detect what language it was
typed in, then switch the code page to the correct format
and convert the text to 8-bit; then you can render the data
and switch the code page back to the one you were using.
From a developer's point of view, does that make sense?

We are NOT talking about word processors or text layout
engines or lexical applications here; the context was
an IRC client. Be assured that I never ever typed a single
character in an IRC client, but I've seen some IRC logs
already, and I'm sure that the average IRC user doesn't
care much about spelling errors, upper or lower case,
apostrophe or single quotation mark or double quotation
mark etc., that in at least 99% of all cases the
received UTF-8 text can be displayed in at least one
8-bit charset, and that the experienced user will have
no problem either switching his system to Greek before
chatting with Greeks or telling his IRC client to use
a Greek font with the already mentioned charset hack.

What you are proposing is IMHO too complicated for the
average IRC client software. If you really want to handle
the full Unicode repertoire, don't forget the combining
characters, the non-spacing characters, the bidirectional
writing direction, the Unicode characters which require
a fixed-width font, etc. It's really enough if it decodes
to the current system default 8-bit charset, and it would
be fine if it allowed choosing any 8-bit charset
supported by OS4, but full Unicode is IMHO not needed (yet)
for an IRC client. For an office application, maybe.
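
Combining characters are a good example of the extra work. The letter "å" can arrive either as one code point or as "a" followed by a combining ring, and only the first form maps directly to an 8-bit charset. The Python sketch below composes the text (Unicode normalization form NFC) before converting; the function name and the choice of ISO-8859-1 are assumptions for illustration.

```python
import unicodedata


def to_8bit(text: str, charset: str = "iso-8859-1") -> bytes:
    """Compose base letter + combining mark pairs where a precomposed
    character exists, then convert; unmappable characters become '?'."""
    composed = unicodedata.normalize("NFC", text)
    return composed.encode(charset, errors="replace")


decomposed = "a\u030a"  # 'a' followed by COMBINING RING ABOVE

# Without composing first, the ring is lost and replaced by '?'.
print(decomposed.encode("iso-8859-1", errors="replace"))
# With composing, the pair becomes the single Latin-1 byte for 'å'.
print(to_8bit(decomposed))
```

This covers only the simplest case; bidirectional text and characters without a precomposed form still need a real layout engine.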

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 19:42 #32

Just popping in

@LiveForIt

Quote:

In no way is it appropriate to add UTF-8 support as a hack
that conflicts with major parts of the OS. Therefore UTF-8
must be added as an optional feature that can be extended
to supported components one by one: something that can be
used by anything that supports it, but is not used by
anything that doesn't. UTF-8 support should therefore start
with displaying UTF-8 text in a simple way; we are not
talking about the bullet API.

Just in case you missed it, before OS4 nearly no AmigaOS
application supported Greek, Cyrillic, Czech etc. With
OS4 most applications support it AFAIK, even those
broken word processors which were unable to speak Unicode
when using the bullet API are magically fixed by ft2.library,
and in the meantime even the PostScript and most PCL printer
drivers support Greek.

So I still think we should try
to make it possible to use UTF-8 with pre-OS4 applications.

Especially the fact that many OS4 applications are not
charset aware anyway would lead to the conclusion that
UTF-8 support which needs a new API would be used by
only three new programs per year so it would not be
worth the effort to implement it at all...

At the moment my interest in adding more UTF-8 support is
rather low, one blocker is the missing ESC[ support in
console handlers (according to the standards both ESC[
and CSI should be supported).

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 20:17 #33

Home away from home

@TetiSoft

Quote:

OS4 has used explicit charset tags for years and you suggest we should drop that and try to use UTF-8

Yes, because then you can copy any mixture of languages into the string gadget without caring what is what.

Quote:

autodetection? No, thanks

It's a bit useless, is it not? But let's say you have some text in UTF8 format. The old crappy program doesn't know it's UTF-8, but uses an old class tag for the string gadget; then SetAttrsA() won't accidentally convert the UTF-8 string into a corrupted UTF-8 string.

Quote:

The existing tag value for UTF-8 is 106, as specified by
IANA in L:charsets/character-sets.

Well, I was reading the sdk:documentations/Autodocs /#?.gc, or something like that, before my computer stopped working, and did not notice any tags for UTF8 support for reading or setting the string attributes of the gadgets.

Quote:

OS4 has used explicit charset tags for years
and you suggest we should drop that and try to use UTF-8

Yes and no. We can't drop legacy, can we? But it's not possible to preserve Unicode in ASCII, so UTF8 must be the default format. Any program that requests ASCII using the old tags will get an ASCII string converted from the original UTF8 string. All buffers are freed when the classes are disposed of. The ASCII buffer is provided only when needed: it is simply a null pointer until the UTF8 has to be converted to ASCII and the buffer has to be created.
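
A rough sketch of the buffer scheme proposed above, written in Python rather than as a gadget class: the text is kept as UTF-8, and the converted 8-bit copy is created only when a legacy caller asks for it. The class name, method names and the ISO-8859-1 default are all hypothetical; this is one reading of the idea, not an existing OS4 or ReAction interface.

```python
class StringAttr:
    """Holds text as UTF-8; builds an 8-bit copy lazily for old callers."""

    def __init__(self, utf8: bytes):
        self._utf8 = utf8
        self._legacy = None          # stays None until first requested
        self._legacy_charset = None

    def set_utf8(self, utf8: bytes) -> None:
        self._utf8 = utf8
        self._legacy = None          # cached copy is now stale

    def get_utf8(self) -> bytes:
        return self._utf8

    def get_legacy(self, charset: str = "iso-8859-1") -> bytes:
        """What a program using the old tag would receive."""
        if self._legacy is None or self._legacy_charset != charset:
            text = self._utf8.decode("utf-8")
            self._legacy = text.encode(charset, errors="replace")
            self._legacy_charset = charset
        return self._legacy


attr = StringAttr("smörgås Привет".encode("utf-8"))
print(attr.get_legacy())   # Cyrillic letters degrade to '?'
print(attr.get_utf8().decode("utf-8"))
```

The conversion is lossy in one direction only: the UTF-8 original is untouched, while the legacy view replaces whatever the 8-bit charset cannot hold.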

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 20:17 #34

Just popping in

@TetiSoft

My IRC client defaults to latin0 (or whatever I set as default charset), but recognises utf8 as well when it sees it. Of course it can be tricked, but I've never seen that happen unintentionally.

And yes, the need for mixing old charsets is there. I frequently mix latin0 and Cyrillic, for example, writing both Russian and Norwegian at once. "8bit-apps" and codeset translation are not very useful, since you're not able to, for example, transcode utf8 to both latin0 and koi8-r at once.
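
The limitation is easy to demonstrate. The Python sketch below (the function name and the sample text are made up for illustration) checks which of several charsets can represent a string completely; latin0 is taken to mean ISO-8859-15.

```python
def encodable_in(text: str, charsets):
    """Return the charsets from `charsets` that can hold all of `text`."""
    usable = []
    for charset in charsets:
        try:
            text.encode(charset)
            usable.append(charset)
        except UnicodeEncodeError:
            pass
    return usable


candidates = ["iso-8859-15", "koi8-r", "utf-8"]
print(encodable_in("Blåbærsyltetøy", candidates))         # Norwegian only
print(encodable_in("борщ", candidates))                   # Russian only
print(encodable_in("Blåbærsyltetøy и борщ", candidates))  # mixed line
```

Each language on its own fits one 8-bit charset, but a line containing both fits neither, so only UTF-8 remains.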

-- kolla

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 20:45 #35

Home away from home

@TetiSoft

Quote:

create UTF-8 keymaps in memory, this would break nearly
every existing shell/console/terminal/KingCON/whatever.

Yes, it will break KingCON, but so did SnoopDOS.
Console/Shell will need UTF8 support anyway.

Terminal / whatever is the problem, really.

What you need is a legacy API and a new-style API,
so programs that don't know of UTF-8 don't notice it.

Quote:

do you volunteer to create new shells/ consoles/con-handlers etc which can handle ESC[ or UTF-8 escape sequences

Yes. Development is a hobby for me, not a job, so I would even do it for fun, and even add the KingCON scrollbar.

I need the source-code of the OS4 con-handler.

Quote:

Any excuse?

I don't make excuses, so why should you?

Edited by LiveForIt on 2007/7/16 22:13:56

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 21:05 #36

Home away from home

@TetiSoft

Quote:

So I still think we should try to make it possible to use UTF-8 with pre-OS4 applications

UTF8 is detectable, so to some degree it is possible. Some programs will break. If older programs do need UTF8 and are still in use today, I'm quite sure they will be replaced or updated at some point.

Quote:

API would be used by only three new programs per year so it would not be worth the effort to implement it at all...

Well, some like to use IRC, some need it for web browsing and e-mail, and some would just like to preserve their original filenames. Then we have Japan, China and a few other countries that have no support for their symbols.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 21:06 #37

Not too shy to talk

@LiveForIt

Quote:

Well, I was reading the sdk:documentations/Autodocs /#?.gc, or something like that, before my computer stopped working, and did not notice any tags for UTF8 support for reading or setting the string attributes of the gadgets.

Read again:
Quote:

LAYOUT_CharSet (ULONG) (V51)
The character set the layout group and all its members should
display their text in, regardless of the particular font used.
If zero, no character set will be explicitly enforced.

Defaults to zero.

Applicability is (OM_NEW, OM_SET, OM_GET)

Quote:

REQ_CharSet (ULONG) (V51.11)
The character set for the requester's text and gadgets.

Defaults to 0, meaning no character set is required.

Applicability is (OM_NEW, OM_SET, OM_GET, RM_OPENREQ)

...

Quote:

WINDOW_CharSet (ULONG)
The charset of the WINDOW_NewMenu menu strings and the
WINDOW_HintInfo help strings. Should be specified with
e.g. the cat_CodeSet value of the catalog your application
opened. (V51.11)

Defaults to 0.

Applicability is (OM_NEW, OM_SET)

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 21:18 #38

Home away from home

@ZeroG

I will have a look at that when I get my computer up and running again.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 21:22 #39

Just popping in

@LiveForIt

Quote:

Quote:

OS4 has used explicit charset tags for years and you suggest we should drop that and try to use UTF-8

Yes, because then you can copy any mixture of languages into the
string gadget without caring what is what.

This implies that somebody created a charset-aware
clipboard.device, which doesn't exist yet. No, it can't
imply that the user is able to type multiple languages
with one keyboard and one keymap but only when he is able
to use more than an 8-bit charset. All keymaps I'm aware of
can be described with an 8-bit charset, minus the special
cases where both Euro and currency sign are present
(nobody ever needed the currency sign) and minus the
special cases where a special input method is needed
to handle the keyboard anyway (Japanese for example).

But back on topic: of course you need a charset tag
to specify the font that shall be used to display the
text typed in the string gadget, and of course you need
a charset tag to specify the charset of the keymap
of the string gadget. Currently both keymap.library
and diskfont.library don't accept UTF-8 as a charset,
but I've avoided writing anywhere that this limitation
exists now or will still exist in future.

Quote:

It's a bit useless, is it not? But let's say you have some
text in UTF8 format. The old crappy program doesn't know
it's UTF-8, but uses an old class tag for the string gadget;
then SetAttrsA() won't accidentally convert the UTF-8
string into a corrupted UTF-8 string.

Any crappy old program will not specify any charset tag
at all. Ergo it will continue to work with UTF-8. Switch
your system to Greek and your old text editor still works
(but in Greek, though it was written for Latin), so why
should it behave differently with UTF-8? When it tries to
interpret C1 control sequences it's broken (IMHO).

Of course the user will be responsible for ensuring the old
program is running in a Latin environment before trying
to feed it a Latin text and in a UTF-8 environment before
trying to feed it a UTF-8 text. Exactly the same as
with Greek and Cyrillic.

Quote:

Well, I was reading the sdk:documentations/Autodocs /#?.gc,
or something like that, before my computer stopped working,
and did not notice any tags for UTF8 support for reading or
setting the string attributes of the gadgets.

The word "charset" appears about 140 times in my SDK:Autodocs
directory, the word "codeset" about 12 times.

1> grep -i charset sdk:include/include_h/*/*.h | wc -l
70

Quote:

Yes and no. We can't drop legacy, can we? But it's not
possible to preserve Unicode in ASCII, so UTF8 must be the
default format. Any program that requests ASCII using the
old tags will get an ASCII string converted from the
original UTF8 string. All buffers are freed when the classes
are disposed of. The ASCII buffer is provided only when
needed: it is simply a null pointer until the UTF8 has to be
converted to ASCII and the buffer has to be created.

It's absolutely unusual to change the language or charset or
font of an already existing gadget, and until now nobody
wanted that IIRC. When you change the text and have no idea
about charsets, you also had no idea about charsets when
creating the gadget, ergo the default has not changed, no
conversion necessary.

Re: Unicode support in future os4 updates?

Posted on: 2007/7/16 21:43 #40

Just popping in

@kolla

Quote:

My IRC client defaults to latin0 (or whatever I set as
default charset), but recognises utf8 as well when it sees
it. Of course it can be tricked, but I've never seen that
happen unintentionally.

And yes, the need for mixing old charsets is there.
I frequently mix latin0 and Cyrillic, for example,
writing both Russian and Norwegian at once.
"8bit-apps" and codeset translation are not very useful,
since you're not able to, for example, transcode utf8 to
both latin0 and koi8-r at once.

You are typing Russian and Norwegian in the exact same
IRC session?

With two sessions, two windows, two fonts and two keymaps
you could do it in OS4 (if there existed an IRC
client which is charset and UTF-8 aware and supports
specifying the keymap and charset).

With one session you have to wait until either the IRC
client supports the bullet API to display full Unicode or
OS4 supports displaying UTF-8 in Text(), and you'd
need support in keymap.library to create UTF-8 keymaps.
