Perception IME - Extending locale for Input


Belxjander

Posted on: 2012/1/26 4:18 #1

Just popping in

I've mentioned "Perception IME" and tried to explain it before,

I am simply trying again... hopefully without confusing everyone in the process...

first a little background: with many languages being spoken around the world, there are also many ways to write ideas,

each of these ways is picture-based in some way...

the question is then ... how many pictures?

in English this is somewhat limited, 26 letters, 10 numbers and various symbols with specific meanings.

I will choose Japanese as an example language to compare,

the majority of keyboards are "qwerty" or "azerty" based,
with some being Dvorak or another Keymap.

Generally with 1 symbol per key or a modified meaning when
used with the Control / Alt / Shift or Command keys.

in Japanese there are 50 basic sounds,
and you have 2 sound-based symbol groups to type them,

But there is a major catch to this... both symbol sets use
the *same* English forms.

there is a further complication as well,
a third set of symbols.

these three sets of symbols are known as "Hiragana" "Katakana" and "Kanji".

The Hiragana and Katakana symbols are the easiest to type
and are generally widespread for use to enter material.

The challenge for entering Japanese is to do with selecting Kanji

While it IS possible to create a keymap that makes either Hiragana or Katakana usable on a keyboard, you can't successfully create a single keymap for both Hiragana AND Katakana, as each symbol in
these two sets uses between one and three English characters
for a basic sound.

Examples are A I U E O becoming あいうえお or アイウエオ,

and かが　さざ　ただ　な　はばぱ　ま　や　ら　わ　ん
kaga saza tada na habapa ma ya ra wa n
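To make the "one to three English characters per symbol" point concrete, here is a minimal sketch of a greedy longest-match converter. It is not how Perception IME works (no implementation has been shown in this thread); the table is a tiny hand-picked subset, and the names `HIRAGANA`, `romaji_to_hiragana` and `to_katakana` are made up for illustration. It does show why one set of English spellings can feed both kana sets: the conversion to Katakana is a separate step applied afterwards.

```python
# Toy table: a handful of syllables only, chosen for illustration.
HIRAGANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ga": "が", "sa": "さ", "za": "ざ",
    "ta": "た", "da": "だ", "na": "な", "ha": "は",
    "ba": "ば", "pa": "ぱ", "ma": "ま", "ya": "や",
    "ra": "ら", "wa": "わ", "n": "ん", "shi": "し", "ji": "じ",
}


def romaji_to_hiragana(text: str) -> str:
    """Convert romaji to hiragana, trying 3-, then 2-, then 1-letter chunks."""
    out, i = [], 0
    while i < len(text):
        for size in (3, 2, 1):
            chunk = text[i:i + size]
            if chunk in HIRAGANA:
                out.append(HIRAGANA[chunk])
                i += len(chunk)
                break
        else:
            # Unknown letter: pass it through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def to_katakana(hiragana: str) -> str:
    """Shift hiragana to katakana; in Unicode the two blocks are 0x60 apart."""
    return "".join(
        chr(ord(c) + 0x60) if "ぁ" <= c <= "ゖ" else c for c in hiragana
    )


print(romaji_to_hiragana("aiueo"))               # あいうえお
print(to_katakana(romaji_to_hiragana("aiueo")))  # アイウエオ
print(romaji_to_hiragana("kanji"))               # かんじ
```

A real table would also need the small kana, doubled consonants and long vowels, which this sketch ignores.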

okay... skipping any further details ... we also have Kanji

Now this is where things get interesting...

Kanji don't work on a "single press for a single Kanji" entry rule.

Hell... you can't simply say a single word in japanese without context.

I'll use the word "Kanji" itself as a prime example of this.

there are 14 pairs of Kanji options in the menu when triggering the Windows IME
after typing "kanji" without the Hiragana or Katakana written forms.
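The candidate menu described above can be sketched as a lookup from a kana reading to a list of written forms. This is only an illustration under stated assumptions: the dictionary holds four real words read "kanji" rather than the full list the Windows IME offers, and `CANDIDATES`, `candidates_for` and `select` are hypothetical names, not part of any existing IME.

```python
# Tiny illustrative dictionary: four words that are all read "kanji".
CANDIDATES = {
    "かんじ": ["漢字", "感じ", "幹事", "監事"],
}


def candidates_for(reading: str) -> list:
    """Return the written forms for a reading, with the plain kana last."""
    return CANDIDATES.get(reading, []) + [reading]


def select(reading: str, index: int) -> str:
    """Pick a candidate; the index wraps around like repeated key presses."""
    options = candidates_for(reading)
    return options[index % len(options)]


print(candidates_for("かんじ"))  # ['漢字', '感じ', '幹事', '監事', 'かんじ']
print(select("かんじ", 1))       # 感じ
```

The point is that the reading alone cannot pick the word; the user (or the surrounding context) has to.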

[edit]so the whole point of my trying to explain this is to find out who else wants this kind of support being available?

Does anyone know of languages other than Japanese that need such functions added properly to the OS?

I also plan on making this an open to all project once the initial main library is
built and accessible from two or more language packs.

I will be adding a "nihongo.language" library for locale support and I will additionally
be adding the IME commodity support through the way locale.library and its preferences get applied.

I'm still in the middle of sorting out a proper development machine and I've already started making a list of beta testers,
and I would be more than appreciative of any messages in this thread towards support or interest in testing.

I will not ask for anything more than a listing of files and their versions installed from what I package.
followed up by a description of actions triggering any particular error.

Thank you also in advance to everyone who has shown support already as well

Edited by Belxjander on 2012/1/26 7:31:41

trixie

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/26 7:51 #2

Amigans Defender

So IME stands for "Input Method Editor", right? Sounds interesting enough for me.

The Rear Window blog

AmigaOne X5000 @ 2GHz / 4GB RAM / Radeon RX 560 / ESI Juli@ / AmigaOS 4.1 Final Edition
SAM440ep-flex @ 667MHz / 1GB RAM / Radeon 9250 / AmigaOS 4.1 Final Edition

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/26 10:23 #3

Just popping in

@trixie

Yes, IME is Input Method Editor,

[edit]removed an extra chapter of book materials[/edit]

no point in wasting everyone's time in repeating why I won't use any of the Emulators or cross-compilers and my negative experiences with them.

nbache

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/26 22:14 #4

Just can't stay away

Quote:

Belxjander wrote:

I will be adding a "nihongo.language" library for locale support

I know, this is a minor technical detail at this point, but better now than too late: As of OS4, languages and their locales are no longer implemented under their native names, but the English ones (because, depending on the system charset chosen by each user, some of the old file names might not be readable at all), so you should probably plan on calling it "japanese.language" instead of "nihongo.language".

Best regards,

Niels

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/27 5:34 #5

Just popping in

@nbache

Is the "use the English name" rule recent for AOS4.x?

I have no experience coding for newer than Amiga OS 3.9
which was a lot of experimental materials

but if that is what is needed and documented for AOS4.x
then I'll put it together as "Japanese.language"
along with "chinese.language" for Chinese

trixie

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/27 7:19 #6

Amigans Defender

Quote:

Belxjander wrote:

Is the "use the English name" rule recent for AOS4.x?

Yes, this usage was introduced in 4.x. As there's no Unicode support in the OS, accented characters used in certain native language names (like Czech, Turkish or Catalan) may display as gibberish if the system codeset does not include these characters. So English names are a safe bet.
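The codeset problem can be checked with a short sketch. It uses Python's standard codecs as stand-ins for Amiga system charsets, and the file name "čeština.language" is a hypothetical example of a native-language name rather than one taken from an actual OS release.

```python
def representable(name: str, codeset: str) -> bool:
    """Return True if every character of the name exists in the codeset."""
    try:
        name.encode(codeset)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical native name: fine in ISO 8859-2, not in ISO 8859-1.
print(representable("čeština.language", "iso8859-2"))  # True
print(representable("čeština.language", "iso8859-1"))  # False
# A plain English name survives in every ASCII-compatible codeset.
print(representable("czech.language", "ascii"))        # True
```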

How do you plan to implement cooperation between the IME and other programs, say, text editors?

DAX

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/27 9:04 #7

Not too shy to talk

I could certainly Beta Test the Japanese IME myself ;)

It would be great if it could work just like the Windows one, i.e. you type the word in kana and use the space-bar as the "henkan" (変換) button (I was used to electronic dictionaries ^_^).
The best thing would be to be able to use Japanese inside an AmigaOS document (for example, a Japanese-language page's HTML text file) and then have this file render correctly in a web browser.
Right now if you do this in Windows it works (I believe it's because of Unicode) but what about Amiga? Is it gonna be a specialized char-set that when brought to a browser turns to gibberish?
And what about vice versa (when I copy Japanese text from the web to my AOS document)? Will it display correctly?

If it could be done so that text can be easily interchanged between OS and Browser it would be great (of course it should also be possible to type in Japanese inside web forms and so on).

SamFlex Complete System + AmigaOS4.1 Update 4
Amiga 2000 GVP GForce-040 Picasso II AmigaOS3.9 BB2
Amiga CD-32

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/27 10:08 #8

Just popping in

@trixie: I will be hooking into the standard text input routines and "buffering" the Input stream of characters...

Intuition will only see the "result string" I output from my own Input Handler (commodities filter level)
where commodities and other tools will remain unaffected

I will be doing my utmost best to handle input properly
and as patchless as possible until I need otherwise.
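The buffering idea can be sketched without any Amiga API at all. The following is a plain Python model, not the actual input handler: `PreeditBuffer` is a made-up name, the commit and delete keys are assumptions, and `convert` stands in for whatever conversion step the IME would run. Keys are held back, and only the finished result string is passed downstream.

```python
class PreeditBuffer:
    """Holds raw keystrokes until the user commits them."""

    def __init__(self, convert):
        self.convert = convert  # hypothetical callable: raw text -> result
        self.pending = []

    def feed(self, key: str):
        """Return the string to pass downstream, or None while buffering."""
        if key == "\n":  # commit key (an assumption for this sketch)
            result = self.convert("".join(self.pending))
            self.pending.clear()
            return result
        if key == "\b":  # delete the last buffered key, if any
            if self.pending:
                self.pending.pop()
            return None
        self.pending.append(key)
        return None


buf = PreeditBuffer(str.upper)  # str.upper stands in for a real converter
for key in "abx\b":
    buf.feed(key)               # nothing reaches the application yet
print(buf.feed("\n"))           # AB
```

In this model the application never sees the intermediate keystrokes, which matches the description of Intuition only seeing the result string.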

@DAX certainly... right now I am only delayed in needing
to get hold of my developer machine

And I can certainly accept extra beta-testers.

Especially since the 4th edition of Cygnus Ed will be one
of my own primary test Applications.

Chris

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/27 18:05 #9

Amigans Defender

@DAX

Quote:

Right now if you do this in Windows it works (I believe it's because of Unicode) but what about Amiga? Is it gonna be a specialized char-set that when brought to a browser turns to gibberish?
And what about vice versa (when I copy Japanese text from the web to my AOS document)? Will it display correctly?

Copying of Japanese text is a different subject relating to UTF-8 clipboard contents, and another thread (which this is kind of a spin-off of) deals with that. If applications support UTF-8 clipboard it works, if they don't it will be gibberish.

@belxjander

I can see a problem with pushing events into the input stream, when the total number of characters in your charset exceeds 256 (which I think Japanese does). All the current charsets are 8-bit, and although you could write a UTF-8.charset (or some other charset exceeding 256 characters), I'm not sure how other bits of the OS will cope with that. So, unfortunately, you might need to patch a few functions unless you can persuade OS4 developers to fix anything that has a fixed 8-bit character limit.
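A quick check of the 8-bit limit, as a sketch: the helper names `fits_in_8bit` and `utf8_length` are made up here, and the code only shows the arithmetic, not how any part of the OS behaves. A character fits an 8-bit charset slot only if its code is below 256; Kanji are far above that, so UTF-8 spends several bytes on each one.

```python
def fits_in_8bit(text: str) -> bool:
    """True if every character has a code point below 256."""
    return all(ord(c) < 256 for c in text)


def utf8_length(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


print(fits_in_8bit("Amiga"))  # True
print(fits_in_8bit("漢字"))    # False
print(utf8_length("漢字"))     # 6 (three bytes per character)
```

So anything that assumes one byte equals one character would see six "characters" here instead of two.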

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/1/28 8:18 #10

Just popping in

@DAX: one of the reasons why I will need to find out what is needing to be patched and what will be
functional without patching, with a basic avoidance of patching anything unless absolutely needed.

I will need to find a list of beta-testers who have a range of software so that anything I introduce
does not force any major changes to anyone else's software.

The input changes will be on a highly limited basis and definitely open to suggestions

DAX

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/7 16:12 #11

Not too shy to talk

@Chris
But is this solely dependent on the App or is it some sort of system-wide OS feature that new apps can take advantage of?

@Belxjander
As I said I would love to test it for you, however it would seem we need UTF-8 so my next question is:

@All
What are the apps already supporting UTF-8, aside from web browsers?
I seem to remember PageStream does (am I wrong?) Others?

Chris

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/7 18:12 #12

Amigans Defender

@DAX

The clipboard is IFF FTXT. All that the current crop of UTF-8 clipboard apps do is add a new chunk (CSET) specifying the character set. Any app can take advantage of this. Any app that isn't UTF-8 clipboard aware will assume the clipboard contents are in the local charset, which generally means ASCII will come through but little else.
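As a rough sketch of what such a clip looks like on the byte level, the following builds an IFF FORM of type FTXT holding a CSET chunk and a CHRS text chunk. The generic IFF framing (4-byte ID, big-endian 32-bit length, pad byte after odd-sized data) is standard; the 32-byte CSET layout (one codeset number plus seven reserved longwords) and the value 106, the IANA MIBenum for UTF-8, are assumptions about what these apps write and should be checked against real clipboard data.

```python
import struct


def chunk(chunk_id: bytes, data: bytes) -> bytes:
    """Wrap data in an IFF chunk: ID, big-endian length, data, even padding."""
    pad = b"\x00" if len(data) % 2 else b""
    return chunk_id + struct.pack(">I", len(data)) + data + pad


def build_ftxt(text: str, codeset: int = 106) -> bytes:
    """Build a FORM FTXT with a CSET chunk followed by UTF-8 CHRS text.

    Assumption: CSET is 8 longwords, the first being the codeset number.
    """
    cset = struct.pack(">8I", codeset, 0, 0, 0, 0, 0, 0, 0)
    body = b"FTXT" + chunk(b"CSET", cset) + chunk(b"CHRS", text.encode("utf-8"))
    return b"FORM" + struct.pack(">I", len(body)) + body


clip = build_ftxt("漢字")
print(clip[:4], clip[8:12], clip[12:16])  # b'FORM' b'FTXT' b'CSET'
```

A reader that does not know about CSET skips the chunk and treats the CHRS bytes as local charset, which is where the gibberish comes from.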

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/8 20:56 #13

Just popping in

@Chris - Well I will definitely need to test and find out what works.

I'll get the IME framework into place first, and only ask the devs about changes once they are proven to be needed.

Until that time I will try to do my best with what is already existing.

DAX

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/11 10:43 #14

Not too shy to talk

Thanks for the explanation.

I hope this goes through. I have a ton of "Mangajin" magazines here (the American publication about learning Japanese through manga translations) and there are a ton of Apple ads boasting about the Mac's easier access to Japanese input and general usage.
Nowadays PC folks can laugh at those ads; we, in the Amiga community, still cannot

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/11 11:16 #15

Just popping in

I also have personal (real life) reasons to see this through
not just get it started but a complete Japanese locale
with both Language presentation (TrueType/OpenType fonts)
and language Entry (Complete IME Commodity & docky toolkit)

But I am also trying to make sure the framework allows for other languages as well, equally.
not just Japanese only, but a more generic IME.

That way Chinese and other languages can be more readily added as well.

I will be blogging and documenting everything as I work on this particular project.

Due to current changes in my life, I will also be actively working from in Japan directly and possibly also showing off the machine I am building as well.
I just need to finish paying for the core parts.

see os4coding.net for some of my blogged ramblings about this.

I've only just opened the door and begun things...
and I am stubborn enough to see things through

ChrisH

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/12 10:08 #16

Home away from home

@Belxjander Quote:

I will also be actively working from in Japan directly

Consider me jealous

Quote:

I am stubborn enough to see things through

I think that's a key requirement for a programmer

P.S. Sadly my Japanese is actually very limited, but one day I will surely find Japanese support for OS4 to be very useful.

Author of the PortablE programming language.

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/12 12:23 #17

Just popping in

Well I will try to deal with making it as simple to use as possible,

But I already have in mind the most simplistic form of UI already...
but I will be dealing with "docky" material after I learn about programming in Amiga OS 4.x

I may yet need PortablE to deal with the Application end of things,
(I actually paid for Wouter's 3.3 release of AmigaE)

But I know I will have to mix in at least some assembler for one project,
the catch being it will be assembler for multiple CPUs...

Does anyone here really grok x86 and MIPS or ARM at the assembler level?
I can do the 680x0 and PPC without difficulties...

But I also have to wade into Java and Smalltalk opcodes as well...
hopefully I get to keep which processor is which sorted out!

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/20 7:50 #18

Just popping in

**Bumping the Thread**
Status Updates : I'm in Japan for a month,
actively setting up for living here long-term.

Anyone who wants to can help towards my Amiga Developments,

My own documentation style does appear initially random,
And I have already decided that the *entirety* of the
Perception IME project will be fully published and documented
on os4coding.net for my own Blogging... along with at least
two source repositories, OpenAmiga.org for anything possibly sensitive and initial test materials,
and another site (undecided) for everyone to work from.

My other project "Polymorph" will only be partially documented.
It is a LOT more difficult to explain since it is based
on how some *simple* technical abstractions work together
to make one pretty damned quick algorithm.

There is *nothing* new inside the way Polymorph works,
the only thing new is the way things are put together
using standard Amiga techniques.

DAX

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/23 18:11 #19

Not too shy to talk

Great to hear, being in Japan will indeed reinforce the notion that we absolutely need the IME!

And besides, how will we conquer that market otherwise?? :D ^__^

Belxjander

Re: Perception IME - Extending locale for Input

Posted on: 2012/2/24 3:03 #20

Just popping in

I've also managed to work up to half-paid on my Sam purchase,
just organizing a last payment at the moment.

PayPal problems making delays again! ugh :|

anyway... I have to run...literally.

If anyone wants to help with testing or donations...
Message me and I'll get back to you.

I have had a couple of donations so far that helped get past
the PayPal quirks... smaller donations are preferred,
but anything donated will go to what it is donated for

Definitely an incentive to get Japanese support completely
functional by actually being here!!
