There is no recommended default for this level. For plain text
oriented protocols, the bytestream transport format should be 8-bit
clean, possibly with normalization of end-of-line indicators. Some
special cases could be made for protocols that are not 8-bit clean,
such as encoding the text for transport over 7-bit connections. For
binary data, the same recommendation holds as above. The
specification technique should either be defined in the protocol, if
only one way is permitted, or use MIME content-transfer-encoding
(CTE) techniques with IANA registered values.
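
As an illustration of the 7-bit special case, the following minimal
sketch encodes 8-bit octets with two registered CTE values,
"quoted-printable" and "base64", using the Python standard library.
The function names to_7bit and from_7bit are hypothetical and are not
part of any protocol or of this report.

```python
import base64
import quopri


def to_7bit(octets: bytes, cte: str) -> bytes:
    """Encode 8-bit octets so that every output octet is below 128."""
    if cte == "quoted-printable":
        # Keeps ASCII readable; octets above 127 become =XX escapes.
        return quopri.encodestring(octets)
    if cte == "base64":
        # Suitable for arbitrary binary data.
        return base64.b64encode(octets)
    raise ValueError(f"unsupported CTE in this sketch: {cte}")


def from_7bit(encoded: bytes, cte: str) -> bytes:
    """Reverse to_7bit for the same CTE value."""
    if cte == "quoted-printable":
        return quopri.decodestring(encoded)
    if cte == "base64":
        return base64.b64decode(encoded)
    raise ValueError(f"unsupported CTE in this sketch: {cte}")


# "cafe" with an acute accent, as ISO-8859-1 octets (an assumed example).
octets = b"caf\xe9"
for cte in ("quoted-printable", "base64"):
    encoded = to_7bit(octets, cte)
    assert all(b < 128 for b in encoded)
    assert from_7bit(encoded, cte) == octets
```

Quoted-printable is usually the better fit for text that is mostly
ASCII, while base64 is the usual choice for binary data.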
RFC 2130 Character Set Workshop Report April 1997
3.4.5: Default Language
There is no recommended default for the language level. For human
readable text, there should always be a way to specify the natural
language. The specification technique should be a MIME identifier
with IANA registered values for languages. If headers are used, the
header should be 'Content-Language'.
3.4.6: Default Locale
The default should be the POSIX locale. The specification technique
should use the Cultural register of CEN ENV 12005 [CEN] for the
values. If headers are used, the header should be 'Content-Locale'.
3.4.7: Default Culture
There is no recommended default for the Culture level. The
specification technique should be a MIME or MIME-like identifier
(e.g. Content-Culture) and should use the Cultural register of CEN
ENV 12005 for its values.
3.4.8: Default Presentation
There is no recommended default for the Presentation level. The
specification technique should be a MIME or MIME-like identifier
(e.g. Content-Layout) and use the glyph register of ISO 10036 and
other registers for its values.
3.4.9: Multiplexing
In some cases, text transmission may require the use of a number of
different values for a given parameter; for example, English
annotation of Japanese text might well require shifting the Content-
Language parameter. The way to switch the value of parameters within
a single body of text depends on the application. For instance, the
HTML I18N [I18N] work defines a language attribute on most of its
elements for the purpose of switching between different languages.
When only one value is needed, this value should be as general as
possible, and specified in the protocol standard with reference to
the IANA or other registry value. All levels should be specified
explicitly.
3.4.10: Storage
Because text may very well be stored without any of the additional
information necessary for decoding, stored text SHOULD be
tagged in a MIME compliant fashion. This alleviates the problem of
being unable to interpret text which has been stored for a long time,
or text whose provenance is not available.
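
One possible way to tag stored text in a MIME compliant fashion is
sketched below with the Python standard library email package. The
helper names tag_text and read_tagged, the sample text, and the
choice of quoted-printable as the transfer encoding are assumptions
made for illustration only.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def tag_text(text: str, charset: str, language: str) -> bytes:
    """Wrap text in MIME headers that record how to decode it."""
    msg = EmailMessage()
    # Records Content-Type (with charset) and Content-Transfer-Encoding.
    msg.set_content(text, charset=charset, cte="quoted-printable")
    msg["Content-Language"] = language
    return msg.as_bytes()


def read_tagged(blob: bytes):
    """Recover the text and its tags from the stored octets alone."""
    msg = message_from_bytes(blob, policy=policy.default)
    return (
        msg.get_content(),
        msg.get_content_charset(),
        str(msg["Content-Language"]),
    )


stored = tag_text("Gr\u00fc\u00dfe", "iso-8859-1", "de")
text, charset, language = read_tagged(stored)
# set_content terminates the body with a newline, so strip it here.
assert text.rstrip("\n") == "Gr\u00fc\u00dfe"
assert (charset, language) == ("iso-8859-1", "de")
```

Because the charset, transfer encoding, and language travel with the
octets, a later reader needs no outside knowledge to decode the text.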
3.5: Guidelines for conversions between coded character sets
This section covers various algorithms to convert a source text S,
encoded in the coded character set CCS(S), to a target text T,
encoded in the coded character set CCS(T).
Rep(X) is the character repertoire of coded character set X, i.e. the
set of characters which can be represented with X.
3.5.1: Exact conversion
When Rep(CCS(S)) and Rep(CCS(T)) are equal or Rep(CCS(S)) is a subset
of Rep(CCS(T)), exact conversion is possible; i.e. T contains the
same sequence of characters as S. Only the octets need to be
remapped. The algorithm for performing this remapping is simple if
the IANA-registered definition tables for CCS(S) and CCS(T) are
available.
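
The subset test and the remapping can be sketched as follows. This
is a minimal illustration, assuming Python's built-in codec tables
stand in for the IANA-registered definition tables and using Unicode
code points as the intermediate form. The function names repertoire
and exact_convert are hypothetical, and repertoire only handles
single-octet coded character sets.

```python
def repertoire(codec: str) -> set:
    """Return Rep(X) for a single-octet coded character set X."""
    chars = set()
    for octet in range(256):
        try:
            chars.add(bytes([octet]).decode(codec))
        except UnicodeDecodeError:
            pass  # this octet is unassigned in the coded character set
    return chars


def exact_convert(source: bytes, src_codec: str, dst_codec: str) -> bytes:
    """Remap the octets of S into CCS(T).

    Raises UnicodeEncodeError if a character of S is outside
    Rep(CCS(T)), i.e. when exact conversion is not possible.
    """
    return source.decode(src_codec).encode(dst_codec)


# Rep(US-ASCII) is a subset of Rep(ISO-8859-1), so conversion is exact.
assert repertoire("ascii") <= repertoire("iso-8859-1")
assert exact_convert(b"text", "ascii", "iso-8859-1") == b"text"

# ISO-8859-1 to UTF-8 is also exact: same characters, different octets.
assert exact_convert(b"caf\xe9", "iso-8859-1", "utf-8") == b"caf\xc3\xa9"
```

In the reverse direction (ISO-8859-1 to US-ASCII) the subset
condition fails, and exact_convert raises an error for any character
above 127 instead of silently altering the text.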
3.5.2: Approximate conversion
In all other cases, any conversion creates a text T which differs