Mosaic-L10N: User's Guide

Selecting Character Set

The "Fonts" menu in the "Options" menubar has been extended so that you can select among multiple character sets. Choose the appropriate character set.

Note: You can specify the default character set with a command line argument or with X resources.

Note: When the Mosaic*simpleInterface resource (see here) is set to True, the localization facilities cannot be used.

Setting Accept Languages

There is a new "Accept Languages" entry in the "Options" menubar. If you choose this entry, the "Accept Languages Window" (like this) pops up. The string you type in here is passed directly to the HTTP server as the Accept-Language: request header field.
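As a rough illustration of what "passed directly" means, the following Python sketch attaches a typed-in string, unmodified, to a request as the Accept-Language header. It is not Mosaic's own code; the function name build_request is hypothetical, and no network access is performed.

```python
from urllib.request import Request


def build_request(url: str, accept_languages: str) -> Request:
    """Build a request that carries the typed-in string as Accept-Language.

    The string is not validated or rewritten, mirroring the behaviour
    described above (hypothetical helper, not part of Mosaic).
    """
    return Request(url, headers={"Accept-Language": accept_languages})


# Example: ask for the Japanese version of one of the sample documents.
req = build_request("http://palomine.nttam.com/mldocs/hello.html", "ja")
```

Calling urllib.request.urlopen(req) would then send the request with that header field; the sketch stops short of doing so.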

As of today, to the best of my knowledge, the strict interpretation of this request header field has not been determined yet, but the use of a language code, that is, an ISO 639 language code with an optional ISO 3166 country code to specify a national variant, is currently encouraged. For example, en_UK means that the content of the message is in British English, while en means that the language is English in one of its forms (for technical details, see here).
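The structure of such a language code can be sketched in Python as follows. This is only an illustration under the assumption that the two parts are joined by an underscore, as in the en_UK example; the names LanguageCode and parse_language_code are hypothetical.

```python
from typing import NamedTuple, Optional


class LanguageCode(NamedTuple):
    language: str           # ISO 639 language code, e.g. "en"
    country: Optional[str]  # optional ISO 3166 country code, e.g. "UK"


def parse_language_code(code: str) -> LanguageCode:
    """Split a code such as "en_UK" into its language and country parts."""
    language, separator, country = code.strip().partition("_")
    # Without an underscore there is no national variant.
    return LanguageCode(language.lower(), country.upper() if separator else None)
```

With this sketch, parse_language_code("en_UK") yields the language "en" with country "UK", while parse_language_code("en") yields "en" with no country part.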

A few samples for this language preference feature are provided. Try these URLs

    http://palomine.nttam.com/mldocs/hello.html
    http://palomine.nttam.com/mldocs/radio-j.html
    http://palomine.nttam.com/Mule/FAQ.txt

with the following language codes. If there is no document in the language you specified, the English version is returned.

    en,ja,ru,el,fr,de,es,it,in,sw,ms,sv,ko,iw,zh,
    zh_CN,zh_TW,zh_HK

(BTW, do you know which languages all of these codes stand for? The answer is here :-)

Furthermore, you can specify a list of preferred languages like this:

    xx; yy; de; zz

In this case, we have no documents in languages such as xx, yy and zz, so you get German.
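The fallback behaviour described above can be sketched in Python. The server's actual selection logic is not documented here, so this is only an assumed model: preferences are separated by semicolons as in the example, the first available language wins, and English is returned otherwise. The function names are hypothetical.

```python
from typing import Iterable, List


def parse_preferences(value: str) -> List[str]:
    """Split a preference string such as "xx; yy; de; zz" into codes."""
    return [item.strip() for item in value.split(";") if item.strip()]


def choose_language(preferences: str, available: Iterable[str],
                    default: str = "en") -> str:
    """Return the first preferred language that is available.

    Falls back to the default (English) when nothing matches.
    """
    available_set = set(available)
    for code in parse_preferences(preferences):
        if code in available_set:
            return code
    return default


# Assumed set of available document languages, for illustration only.
chosen = choose_language("xx; yy; de; zz", ["en", "ja", "de"])
```

Here chosen is "de", and a request listing only unavailable languages falls back to "en".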

Note: You can specify the default accept languages with a command line argument or with X resources.

Selecting Bi-directionality

There is a new "Bi-directionality" menu in the "Options" menubar. Choose Visual or Implicit. Samples for this bi-directionality feature are provided. Try these two URLs.

    http://www.ntt.com/Mosaic-l10n/hebrew-visual.html
    http://www.ntt.com/Mosaic-l10n/hebrew-implicit.html

Note: Currently, the implementation of Implicit directionality is still immature, and Explicit directionality is not supported yet. For technical details, see RFC-1556.

Note: You can specify the default bi-directionality with a command line argument or with X resources.

Automatic character set detection

L10N-enhanced Mosaic supports a subset of the ISO 2022 codeset designation escape sequences. For example, documents encoded in ISO-2022-JP (RFC-1468) or ISO-2022-KR (RFC-1557) are automatically displayed with Japanese or Korean fonts, without any font setting via the Fonts menu.

However, for documents in ISO 8859-X, EUC-C/J/K, KOI and Big5, the character set in use cannot be determined without external information. This enhancement therefore supports the following ISO 2022 initial designation sequences. (Note: We always use G0 and G1 for GL and GR, respectively.)

 "<ESC> - A"	designate right-hand part of ISO 8859-1 into G1
 "<ESC> - B"	designate right-hand part of ISO 8859-2 into G1
 "<ESC> - C"	designate right-hand part of ISO 8859-3 into G1
 "<ESC> - D"	designate right-hand part of ISO 8859-4 into G1
 "<ESC> - L"	designate right-hand part of ISO 8859-5 into G1
 "<ESC> - F"	designate right-hand part of ISO 8859-7 into G1
 "<ESC> - H"	designate right-hand part of ISO 8859-8 into G1
 "<ESC> - M"	designate right-hand part of ISO 8859-9 into G1
 "<ESC> $ ) A"	designate GB 2312-1980 into G1
 "<ESC> $ ) B"	designate JIS X 0208-1983 into G1
 "<ESC> $ ) C"	designate KSC 5601-1987 into G1
 "<ESC> ( B"	designate 7-bit ASCII graphics into G0
 "<ESC> $ B"	designate JIS X 0208-1983 into G0

If a document includes one of these escape sequences, it is displayed with the appropriate fonts without any font setting via the Fonts menu. These examples are encoded in this way. Please check them.
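The designation table above can be read as a simple lookup from byte sequences to character sets. The Python sketch below scans a document for those sequences; it is not Mosaic's implementation, the name find_designations is hypothetical, and it assumes that the spaces in the table are for readability only and do not occur in the actual bytes.

```python
ESC = b"\x1b"

# Escape sequence -> (character set, target), taken from the table above.
DESIGNATIONS = {
    ESC + b"-A": ("ISO 8859-1 right-hand part", "G1"),
    ESC + b"-B": ("ISO 8859-2 right-hand part", "G1"),
    ESC + b"-C": ("ISO 8859-3 right-hand part", "G1"),
    ESC + b"-D": ("ISO 8859-4 right-hand part", "G1"),
    ESC + b"-L": ("ISO 8859-5 right-hand part", "G1"),
    ESC + b"-F": ("ISO 8859-7 right-hand part", "G1"),
    ESC + b"-H": ("ISO 8859-8 right-hand part", "G1"),
    ESC + b"-M": ("ISO 8859-9 right-hand part", "G1"),
    ESC + b"$)A": ("GB 2312-1980", "G1"),
    ESC + b"$)B": ("JIS X 0208-1983", "G1"),
    ESC + b"$)C": ("KSC 5601-1987", "G1"),
    ESC + b"(B": ("7-bit ASCII graphics", "G0"),
    ESC + b"$B": ("JIS X 0208-1983", "G0"),
}


def find_designations(data: bytes):
    """Return (offset, character set, target) for each known sequence."""
    found = []
    position = data.find(ESC)
    while position != -1:
        for sequence, (charset, target) in DESIGNATIONS.items():
            if data.startswith(sequence, position):
                found.append((position, charset, target))
                break
        # Unknown escape sequences are skipped rather than treated as errors.
        position = data.find(ESC, position + 1)
    return found


# An ISO-2022-JP style fragment: switch to JIS X 0208, then back to ASCII.
sample = ESC + b"$B" + b"F|K\\8l" + ESC + b"(B"
result = find_designations(sample)
```

For the sample fragment, result reports JIS X 0208-1983 designated into G0 at offset 0 and 7-bit ASCII designated into G0 at offset 9. A browser would use the first such designation to pick the font instead of relying on the menu setting.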

Note: We also have a short note on Japanese encoding methods and a related problem. It may be helpful when considering how to handle multi-byte characters in WWW. Please check it out here.

Note: L10N-enhanced Mosaic remembers the character set specified for each document in the Window History (not the Hotlist), and if you revisit the document, the current character set is changed automatically. If you don't like this, set the Mosaic*keepDocumentCharset resource to False (see here).

TAKADA Toshihiro