Character set
The IBM1047 character set [21], excluding NUL, is used for the
definition of meta-variables, header fields, values, TEXT strings
and the PATH_TRANSLATED value. The newline (NL) sequence is LF;
servers should also accept CR LF as a newline.
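As an illustration of the newline rule, the following minimal sketch
splits a block of header text on either LF or CR LF. It assumes the
script output has already been decoded into a Python string (the
IBM1047 decoding step is not shown), and the function name
split_header_lines is hypothetical.

```python
def split_header_lines(text):
    """Split decoded header text into lines, accepting LF or CR LF."""
    # Normalise CR LF to LF first so that both forms split identically.
    return text.replace("\r\n", "\n").split("\n")
```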
media-type charset default
The default charset value for text (and other implementation-
defined) media types is IBM1047.
8. Implementation
8.1. Recommendations for Servers
Although the server and the CGI script need not be consistent in
their handling of URL paths (client URLs and the PATH_INFO data,
respectively), server authors may wish to impose consistency. The
server implementation should therefore specify its behaviour in the
following cases:
1. define any restrictions on allowed path segments, in particular
whether non-terminal NULL segments are permitted;
RFC 3875 CGI Version 1.1 October 2004
2. define the behaviour for "." or ".." path segments; i.e.,
whether they are prohibited, treated as ordinary path segments
or interpreted in accordance with the relative URL
specification [2];
3. define any limits of the implementation, including limits on
path or search string lengths, and limits on the volume of
header fields the server will parse.
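The three items above could be recorded and enforced along the
following lines. This is a sketch only: the PathPolicy record, the
accepts function and every numeric limit are assumptions made for
illustration and are not defined by this specification. The path is
assumed to begin with "/".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathPolicy:
    """Hypothetical record of the choices a server would document."""
    allow_empty_segments: bool = False  # item 1: non-terminal NULL segments
    allow_dot_segments: bool = False    # item 2: "." and ".." as ordinary segments
    max_path_length: int = 2048         # item 3: assumed limit
    max_query_length: int = 4096        # item 3: assumed limit
    max_header_bytes: int = 16384       # item 3: assumed limit


def accepts(policy, path, query="", header_bytes=0):
    """Return True if the request stays within the documented policy."""
    if len(path) > policy.max_path_length:
        return False
    if len(query) > policy.max_query_length:
        return False
    if header_bytes > policy.max_header_bytes:
        return False
    segments = path.split("/")[1:]      # drop the empty string before the leading "/"
    for seg in segments[:-1]:           # non-terminal segments only
        if seg == "" and not policy.allow_empty_segments:
            return False
    if not policy.allow_dot_segments:
        if any(seg in (".", "..") for seg in segments):
            return False
    return True
```

A server that chose to interpret "." and ".." as relative references
rather than prohibit them would need a third setting; it is omitted
here to keep the sketch short.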
8.2. Recommendations for Scripts
If the script does not intend to process the PATH_INFO data, then it
should reject the request with 404 Not Found if PATH_INFO is not
NULL.
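A script that ignores PATH_INFO might apply this check as sketched
below. The function name path_info_response and the response body are
illustrative assumptions; the meta-variables are passed in as a
mapping so that the function can be exercised without a server.

```python
def path_info_response(environ):
    """Return a CGI 404 response if PATH_INFO is not NULL, else None."""
    if environ.get("PATH_INFO", ""):
        # The Status CGI header field carries the 404 back to the server.
        return (
            "Status: 404 Not Found\n"
            "Content-Type: text/plain\n"
            "\n"
            "Not Found\n"
        )
    return None
```

In a real script the mapping would typically be os.environ, and a
non-None result would be written to standard output before exiting.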
If the output of a form is being processed, check that CONTENT_TYPE
is "application/x-www-form-urlencoded" [18] or "multipart/form-data"
[16]. If CONTENT_TYPE is blank, the script can reject the request
with a 415 'Unsupported Media Type' error, where supported by the
protocol.
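The CONTENT_TYPE check can be sketched as follows. The function name
form_content_type_status is hypothetical. Note one assumption beyond
the text above: the sketch returns 415 not only for a blank
CONTENT_TYPE but also for any value other than the two form media
types.

```python
FORM_TYPES = ("application/x-www-form-urlencoded", "multipart/form-data")


def form_content_type_status(environ):
    """Return None if CONTENT_TYPE is a form media type, else a 415 status."""
    content_type = environ.get("CONTENT_TYPE", "")
    # Discard parameters such as "; boundary=..." and compare the media
    # type case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in FORM_TYPES:
        return None
    return "415 Unsupported Media Type"
```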
When parsing PATH_INFO, PATH_TRANSLATED or SCRIPT_NAME, the script
should be careful of void path segments ("//") and special path
segments ("." and ".."). They should either be removed from the path
before use in OS system calls, or the request should be rejected with
404 'Not Found'.
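Both options can be illustrated with one small function. In this
sketch, whose name clean_path and flag reject_special are
hypothetical, void segments are always removed, while "." and ".."
segments either cause rejection (the default) or are removed. The
".." segments are dropped, not resolved against the preceding
segment.

```python
def clean_path(path, reject_special=True):
    """Return a cleaned path, or None when the request should get 404."""
    cleaned = []
    for seg in path.split("/"):
        if seg == "":
            continue  # void segment from "//", or a leading/trailing "/"
        if seg in (".", ".."):
            if reject_special:
                return None  # caller responds with 404 'Not Found'
            continue  # otherwise drop the special segment
        cleaned.append(seg)
    # A trailing "/" in the input is not preserved by this sketch.
    return "/" + "/".join(cleaned)
```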
When returning header fields, the script should try to send the CGI
header fields as soon as possible, and should send them before any
HTTP header fields. This may help reduce the server's memory
requirements.
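One way to follow this ordering is to sort the fields before writing
them, as in the sketch below. It treats Content-Type, Location and
Status as the CGI header fields; the function names are hypothetical,
and LF is assumed as the newline sequence.

```python
CGI_FIELDS = ("content-type", "location", "status")


def order_header_fields(fields):
    """Return (name, value) pairs with the CGI header fields first."""
    cgi = [f for f in fields if f[0].lower() in CGI_FIELDS]
    other = [f for f in fields if f[0].lower() not in CGI_FIELDS]
    return cgi + other


def format_header(fields):
    """Serialise the ordered fields, ending with the blank line."""
    lines = [f"{name}: {value}\n" for name, value in order_header_fields(fields)]
    return "".join(lines) + "\n"
```

A script would write the result of format_header to standard output
and flush it before producing the message body.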
Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST
meta-variables (see sections 4.1.8 and 4.1.9) may not identify the
ultimate source of the request. They identify the client for the
immediate request to the server; that client may be a proxy, gateway,
or other intermediary acting on behalf of the actual source client.
9. Security Considerations
9.1. Safe Methods
As discussed in the security considerations of the HTTP
specifications [1], [4], the convention has been established that the
GET and HEAD methods should be 'safe' and 'idempotent' (repeated
requests have the same effect as a single request). See section 9.1
of RFC 2616 [4] for a full discussion.
9.2. Header Fields Containing Sensitive Information
Some HTTP header fields may carry sensitive information which the
server should not pass on to the script unless explicitly configured
to do so. For example, if the server protects the script by using
the Basic authentication scheme, then the client will send an
Authorization header field containing a username and password. The
server validates this information and so it should not pass on the
password via the HTTP_AUTHORIZATION meta-variable without careful
consideration. This also applies to the Proxy-Authorization header
field and the corresponding HTTP_PROXY_AUTHORIZATION meta-variable.
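A server might apply this rule while building the protocol-specific
meta-variables, as in the following sketch. The function name
http_meta_variables and the pass_sensitive switch are assumptions
standing in for whatever explicit configuration a real server
provides.

```python
SENSITIVE_FIELDS = {"authorization", "proxy-authorization"}


def http_meta_variables(headers, pass_sensitive=False):
    """Map request header fields to HTTP_* meta-variables.

    Authorization and Proxy-Authorization are dropped unless the
    (hypothetical) pass_sensitive option is explicitly enabled.
    """
    meta = {}
    for name, value in headers.items():
        if name.lower() in SENSITIVE_FIELDS and not pass_sensitive:
            continue
        # Upper-case the field name, replace "-" with "_", prefix HTTP_.
        meta["HTTP_" + name.upper().replace("-", "_")] = value
    return meta
```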
9.3. Data Privacy