   from S.  There are different principles for how this inevitable
   difference should be handled.  A choice between them should be made,
   depending on the purpose and requirements of the conversion.  Where
   possible, the client application should be given mechanisms to
   determine what has been done to the text.

3.5.2.1:  Length-modifying conversion for human display

   When the length of the target text T is allowed to differ from the
   length of the source text S, one should use a conversion method in
   which each source character is converted to one or several target
   character(s), using a best-resemblance criterion in the choice of
   the target character(s).

   Examples:
      LATIN CAPITAL LETTER [*] ->  AE
      COPYRIGHT SIGN       [*] -> (c)
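
   The length-modifying approach can be sketched as a lookup table in
   which a single source character may expand to several target
   characters.  This is a minimal illustration, not part of the
   workshop report: the table FALLBACKS, the function name, and the "?"
   default are hypothetical, and the first entry assumes the example
   character is LATIN CAPITAL LETTER AE (U+00C6), as the "AE" target
   suggests.

```python
# Hypothetical fallback table: each character the target cannot hold maps
# to one or more target characters chosen by visual resemblance.
FALLBACKS = {
    "\u00c6": "AE",   # assumed: LATIN CAPITAL LETTER AE
    "\u00a9": "(c)",  # COPYRIGHT SIGN
}


def convert_length_modifying(source: str, target_encoding: str = "ascii") -> str:
    """Return text in which unsupported characters are expanded."""
    out = []
    for ch in source:
        try:
            ch.encode(target_encoding)  # representable in the target: keep
            out.append(ch)
        except UnicodeEncodeError:
            # Not representable: substitute, possibly changing the length.
            out.append(FALLBACKS.get(ch, "?"))
    return "".join(out)
```

   The result may be longer than the input, which is why this method
   only suits cases where the length of T is free to differ from S.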

3.5.2.2:  Length-preserving conversion for human display

   Where the text T must be presented and the length of T cannot differ
   from the length of S, one should use a conversion method where each
   source character is converted to one target character, using some
   kind of best-resemblance criterion in the choice of target character.


RFC 2130             Character Set Workshop Report            April 1997


   Examples:
     LATIN CAPITAL LETTER  [*] -> A
     COPYRIGHT SIGN        [*] -> C
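
   A length-preserving conversion differs only in that every
   substitute must be exactly one character.  The sketch below assumes
   a hypothetical table SINGLE_CHAR_FALLBACKS and measures length in
   characters rather than octets; as before, the first entry assumes
   the example character is LATIN CAPITAL LETTER AE (U+00C6).

```python
# Hypothetical table: every substitute is exactly one character long.
SINGLE_CHAR_FALLBACKS = {
    "\u00c6": "A",  # assumed: LATIN CAPITAL LETTER AE
    "\u00a9": "C",  # COPYRIGHT SIGN
}


def convert_length_preserving(source: str, target_encoding: str = "ascii",
                              unknown: str = "?") -> str:
    """Return text of the same character length as the input."""
    out = []
    for ch in source:
        try:
            ch.encode(target_encoding)  # representable: keep unchanged
            out.append(ch)
        except UnicodeEncodeError:
            out.append(SINGLE_CHAR_FALLBACKS.get(ch, unknown))
    result = "".join(out)
    # Holds as long as every substitute is a single character.
    assert len(result) == len(source)
    return result
```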

3.5.2.3:  Conversion without data loss

   Where the conversion of the text S into T must be completely
   reversible, apply a Character Encoding Syntax or other reversible
   transformation method.  This case is most frequently met in data
   storage requirements.

   Examples:
     LATIN CAPITAL LETTER [*] -> &AE
     COPYRIGHT SIGN       [*] -> &(C
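
   A reversible escape of this kind can be sketched as follows.  The
   "&" introducer and the two-character codes mirror the examples
   above, but the table MNEMONICS, the function names, and the choice
   of "&&" to stand for a literal "&" are assumptions made for
   illustration only.

```python
INTRO = "&"  # introducer; a literal "&" must itself be escaped

# Hypothetical two-character codes, following the examples above.
MNEMONICS = {
    "\u00c6": "AE",  # assumed: LATIN CAPITAL LETTER AE
    "\u00a9": "(C",  # COPYRIGHT SIGN
}
REVERSE = {code: ch for ch, code in MNEMONICS.items()}


def escape_text(source: str) -> str:
    """Encode source into ASCII without losing information."""
    out = []
    for ch in source:
        if ch == INTRO:
            out.append(INTRO + INTRO)          # "&" becomes "&&"
        elif ch in MNEMONICS:
            out.append(INTRO + MNEMONICS[ch])  # e.g. "&AE"
        elif ord(ch) < 128:
            out.append(ch)                     # plain ASCII passes through
        else:
            raise ValueError(f"no mnemonic defined for {ch!r}")
    return "".join(out)


def unescape_text(target: str) -> str:
    """Invert escape_text; raises KeyError on an unknown code."""
    out = []
    i = 0
    while i < len(target):
        if target[i] != INTRO:
            out.append(target[i])
            i += 1
        elif target[i + 1:i + 2] == INTRO:
            out.append(INTRO)                  # "&&" decodes to "&"
            i += 2
        else:
            out.append(REVERSE[target[i + 1:i + 3]])
            i += 3
    return "".join(out)
```

   Escaping the introducer itself is what makes the round trip exact:
   a source text that already contains "&AE" comes back unchanged.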

   An alternate method can be used if the size of Rep(CCS(T)) is greater
   than or equal to the size of Rep(CCS(S)): for each character in
   Rep(CCS(S)) which is not present in Rep(CCS(T)), define a mapping
   into a character in Rep(CCS(T)) which is not present in Rep(CCS(S)).

   Examples:
     LATIN CAPITAL LETTER  [*] -> CYRILLIC CAPITAL LETTER [*]
     COPYRIGHT SIGN  [*] -> PARTIAL DIFFERENTIAL SIGN [*]
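
   The alternate method amounts to a one-to-one pairing between the
   characters found only in S and those found only in T.  The sketch
   below assumes two 8-bit character sets (ISO 8859-1 as S and ISO
   8859-5 as T, chosen only for illustration) and pairs characters in
   code point order, which is an arbitrary choice and not the pairing
   shown in the examples above.  The function names are hypothetical.

```python
def repertoire(encoding: str) -> set:
    """Characters reachable from single octets in an 8-bit encoding."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            pass  # octet is unassigned in this encoding
    return chars


def build_swap_tables(source_encoding: str, target_encoding: str):
    """Return (forward, backward) tables for use with str.translate."""
    rep_s = repertoire(source_encoding)
    rep_t = repertoire(target_encoding)
    only_s = sorted(rep_s - rep_t)  # needs a stand-in in T
    only_t = sorted(rep_t - rep_s)  # free to serve as a stand-in
    if len(only_t) < len(only_s):
        raise ValueError("target repertoire has too few spare characters")
    forward = dict(zip(only_s, only_t))
    backward = {t: s for s, t in forward.items()}
    return str.maketrans(forward), str.maketrans(backward)


to_t, to_s = build_swap_tables("iso-8859-1", "iso-8859-5")
converted = "\u00c6 \u00a9".translate(to_t)
restored = converted.translate(to_s)
```

   The converted text is not meaningful to a human reader of T; it is
   only a carrier that the reverse table turns back into S.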

   Note that conversion without data loss requires redefining some
   member of T to indicate "the introduction of character data outside
   T".  This effectively adds another level of CES on top of CES(T).

4: Presentation issues

   There are a number of considerations to make in selecting the base
   character set.  One such consideration is the protocol's convenience
   to users with limited equipment (for example only ISO 8859-1 or a
   keyboard without the ability to enter all the characters in ISO
   10646).  Alternative representation should be considered for these
   users, both for input and output.  Possible options for the
   representation of characters that cannot be displayed include
   transliteration (a la CEN/TC304 or ISO TC46/SC2), RFC 1345
   [RFC-1345] representative icons, or the WG2 short name (u+xxxx).
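
   As an illustration of the last option, the sketch below replaces
   each character that a limited display cannot show with its short
   name in the u+xxxx form.  The function name is hypothetical, and
   the use of lowercase hexadecimal with at least four digits is an
   assumption about the exact notation.

```python
def render_for_display(text: str, display_encoding: str = "iso-8859-1") -> str:
    """Replace characters the display cannot show with a u+xxxx name."""
    out = []
    for ch in text:
        try:
            ch.encode(display_encoding)  # displayable: keep as is
            out.append(ch)
        except UnicodeEncodeError:
            out.append(f"u+{ord(ch):04x}")  # assumed short-name notation
    return "".join(out)
```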

5: Open issues

   In addition to the issues declared out of scope and enumerated in
   section 2.1, the following issues are still open and will need to be
   addressed in other forums.  These issues (language tags, public
   identifiers such as URL names, and bi-directionality) are briefly
   discussed below, as they repeatedly encroached on the discussion.

5.1: Language tags

   Although the workshop decided not to explicitly address the so-called
   "CJK issue", a few members felt it was necessary to have some
   mechanism to address the problem of correct Han character display in
   the ISO-10646 issue, and that saying that it was a "font issue" would
   not suffice.

   The "CJK issue" refers to the extended discussion about "Han
   unification", the use of a single ISO-10646 codepoint to represent