Web Server
A web server is an Internet server that communicates with clients via the
Hypertext Transfer Protocol (HTTP). There are more than 20 web
servers on the market. The most popular one is Apache, which is free.
Below is the latest web server survey, for March, from Netcraft:
Server          Sites     Share    Change
Apache          7870864   60.05%   (+1.97%)
Microsoft IIS   2742931   20.93%   (-1.00%)
Netscape         955148    7.29%   (-0.48%)
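To make the HTTP exchange between client and server concrete, here is a minimal sketch in Python of the text a client sends and the status line a server sends back. The function names build_request and parse_status_line are illustrative only and do not belong to any server discussed here. The sketch assumes HTTP/1.0, the version that appears in the access log entries later in this document.

```python
from typing import Tuple


def build_request(host: str, path: str = "/") -> str:
    """Return the text of a simple HTTP/1.0 GET request.

    Each header line ends with CRLF, and an empty line ends the header block.
    """
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"


def parse_status_line(response: str) -> Tuple[str, int, str]:
    """Split the first line of a response into (version, status code, reason)."""
    first_line = response.split("\r\n", 1)[0]
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    # Hypothetical host and a canned response, used only for illustration.
    print(repr(build_request("www.smallco.com", "/index.html")))
    canned = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
    print(parse_status_line(canned))
```

A real client would send the request over a TCP connection to port 80 and read the response from the same socket; that part is left out to keep the sketch self-contained.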
Netscape
In the survey, Netscape includes Enterprise Server, FastTrack, Commerce,
Communications, Netsite-Commerce & Netsite-Communications.
It does not run on Linux, but it runs on all major Unix platforms and NT.
It does not come with the OS, and you can install it into any
directory that you prefer.
IIS
In the survey, Microsoft IIS includes IIS, IIS-W, PWS-95, and PWS. IIS comes with NT, and
after installing NT you may install it into any directory
that you prefer.
Apache
Apache was originally based on code and ideas found in
NCSA httpd 1.3 (early 1995). It has since evolved
into a far superior system that can rival or surpass
almost any other web server in
terms of functionality, efficiency, and speed. In
addition, it has been ported to the Win32 environment, to BS2000/OSD
on IBM 390-compatible processors, and to the AS/400. OpenLinux 2.3
installs Apache under /home/apache. It is easy to compile
and install the
latest version of Apache with the
Apache-XML (Extensible Markup Language) parser to
handle
XML. Typically, Apache
is installed under /usr/local/apache.
Apache also has the
Jserv project, which provides servlet and
RMI capability.
Server Directory Layout and Configuration
Directory Layout
- bin: httpd, startup script and other utilities
- cgi-bin: default CGI program directory
- conf: configuration files
- htdocs: document root, the top directory of the home page
- icons: images used in htdocs
- include: header files for building the web server, i.e. httpd
- libexec: shared objects, which include the runtime library and modules
- logs: access and error logs, the server's record of who requested what
- man: manual pages
- proxy: proxy cache directory, present when compiled with --enable-module=proxy
Configuration (httpd.conf)
ServerRoot "/usr/local/apache"
# Dynamic Shared Object (DSO) Support
LoadModule servletexec_module libexec/mod_servletexec.so
Port 80
User nobody
Group nobody
ServerAdmin root@1sa11.bitmotel.com
ServerName 1sa11.bitmotel.com
DocumentRoot "/usr/local/apache/htdocs"
Options Indexes FollowSymLinks ExecCGI
<Directory>
    Options FollowSymLinks ExecCGI
    AllowOverride None
</Directory>
ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/"
AddHandler cgi-script .cgi
Virtual Host
<VirtualHost www.smallco.com>
    ServerAdmin webmaster@mail.smallco.com
    DocumentRoot /groups/smallco/www
    ServerName www.smallco.com
    ErrorLog /groups/smallco/logs/error_log
    TransferLog /groups/smallco/logs/access_log
</VirtualHost>
Redirect / http://www.apache.org/
Most server configuration on Unix platforms can be accomplished
with a text editor (such as emacs or vi). There are also
GUI tools
for cross-platform configuration and management.
Apache Modules
Apache was designed with a
modular architecture in mind, which means that
new functions can be added to the server as modules.
- mod_proxy:
This module allows Apache to act as a caching proxy server.
The proxy fetches the document and returns it to the client.
Like your browser, a proxy server
keeps a copy of each document it fetches to
reduce Internet traffic, which also improves
response time for the user. Most importantly, companies use proxy
servers to keep track of which URLs users download. Corporations
typically set up a firewall that allows the proxy server
to connect to the Internet without exposing internal networks.
The most popular free caching proxy is
Squid, which
caches DNS lookups and keeps
metadata and especially hot objects in RAM.
A browser can be configured
for auto-proxy or manual proxy.
In auto-proxy mode, the browser loads a JavaScript file that designates a proxy server
or a direct connection for each request.
A reverse proxy allows the public
to access your company's web servers that are not directly
accessible from the Internet (documents from the internal web servers
are fetched and cached on the reverse proxy).
- mod_ssl:
SSLeay is a free implementation of Netscape's Secure Sockets Layer (SSL).
When a connection starts, certificates are checked, and the client and server
agree on a new session key using public key encryption.
- mod_cgi: provides
CGI capability (default).
suEXEC gives users the ability to run CGI and SSI
programs under a uid different from the id of the calling
web server (i.e. nobody). Refer to effective uid/gid in
kernel upgrade.
- mod_jserv: a Java engine that supports
servlets. There are also commercial Java engines, such as
Java server and
Jrun, which let server-side Java programs provide the same kind of
functionality as CGI does.
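The caching behavior described under mod_proxy can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how mod_proxy or Squid is implemented: the class name CachingProxy, the fetch_origin callback, and the max_age parameter are all hypothetical, and the cache is a plain in-memory dictionary with a fixed freshness window.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachingProxy:
    """Toy caching proxy (hypothetical; not part of mod_proxy or Squid)."""

    def __init__(self, fetch_origin: Callable[[str], bytes], max_age: float = 60.0):
        self.fetch_origin = fetch_origin   # function that contacts the origin server
        self.max_age = max_age             # seconds a cached copy is considered fresh
        self.cache: Dict[str, Tuple[float, bytes]] = {}
        self.request_log: List[str] = []   # every URL requested through the proxy

    def get(self, url: str) -> bytes:
        """Return the document for url, using the cached copy when it is fresh."""
        self.request_log.append(url)
        now = time.monotonic()
        entry = self.cache.get(url)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]                # cache hit: no traffic to the origin
        body = self.fetch_origin(url)      # cache miss: fetch and remember
        self.cache[url] = (now, body)
        return body


if __name__ == "__main__":
    def fake_origin(url: str) -> bytes:
        # Stand-in for a real HTTP fetch, so the sketch runs without a network.
        print("fetching from origin:", url)
        return b"<html>" + url.encode() + b"</html>"

    proxy = CachingProxy(fake_origin)
    proxy.get("http://www.smallco.com/")
    proxy.get("http://www.smallco.com/")   # served from the cache
    print(proxy.request_log)
```

The request_log list mirrors the point that a proxy sees every URL its users request, whether or not the document comes from the cache.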
Access Log
maturana.excite.com - - [31/Mar/2000:13:33:59 -0500] "GET /notes/webintc/tsld011.htm HTTP/1.0" 200 900
scooter2.sv.alta-vista.net - - [31/Mar/2000:13:43:01 -0500] "HEAD /jobs/maryland.htm HTTP/1.0" 200 0
curly.bitmotel.com - - [31/Mar/2000:13:54:32 -0500] "GET /notes/cgiperl HTTP/1.0" 301 188
extfw.dcgov.org - - [31/Mar/2000:13:56:16 -0500] "GET /images/formosa.jpg HTTP/1.0" 304 -
extfw.dcgov.org - - [31/Mar/2000:13:56:23 -0500] "GET /sinbuun.html HTTP/1.0" 304 -
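The entries above follow the Common Log Format: client host, identity, authenticated user, timestamp, request line, status code, and response size in bytes (a "-" means no body was sent). As a rough sketch, a line can be split into those fields with a regular expression in Python. The name parse_log_line and the field names are chosen here for illustration, and the pattern assumes the simple layout shown above.

```python
import re
from typing import Dict, Optional

# One named group per Common Log Format field.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line: str) -> Optional[Dict[str, object]]:
    """Parse one access log line; return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields: Dict[str, object] = dict(match.groupdict())
    fields["status"] = int(match.group("status"))
    # A size of "-" means no response body, so treat it as zero bytes.
    size = match.group("size")
    fields["size"] = 0 if size == "-" else int(size)
    return fields


if __name__ == "__main__":
    sample = ('maturana.excite.com - - [31/Mar/2000:13:33:59 -0500] '
              '"GET /notes/webintc/tsld011.htm HTTP/1.0" 200 900')
    print(parse_log_line(sample))
```

In the sample entries, status 200 is a successful response, 301 is a redirect, and 304 tells the client that its cached copy is still current.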
CGI
The Common Gateway Interface (CGI) is a standard for
interfacing external applications with Web servers.
A plain HTML document that the web server daemon
retrieves is static, which means it exists in a constant state:
a text file that doesn't change. A CGI program, on the other hand, is
executed in real time, so it can output dynamic information.
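A minimal sketch of such a program, written in Python rather than the Perl discussed in this section, is shown here. It assumes the usual CGI convention that the program writes a Content-Type header, a blank line, and then the document body to standard output. The function name render_response is made up for this example.

```python
#!/usr/bin/env python3
import sys
import time
from typing import Optional


def render_response(now: Optional[time.struct_time] = None) -> str:
    """Build a CGI response whose body changes on every request."""
    if now is None:
        now = time.localtime()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", now)
    body = "<html><body><p>Server time: %s</p></body></html>\n" % stamp
    # Header block first, then an empty line, then the document itself.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The web server runs the script and relays whatever it prints.
    sys.stdout.write(render_response())
```

With the configuration shown earlier, a script like this would be placed under the cgi-bin directory, or given a .cgi extension, and made executable.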
A CGI program may be written in many languages. The most popular
is the scripting language
Perl.
To simplify programming, many libraries have been written for parsing
the variables passed from the browser. One of the earliest and most
popular is cgi-lib.pl.
This library has become the de facto standard library
for creating Common Gateway Interface (CGI) scripts in the Perl language.
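To illustrate what parsing the variables passed from the browser involves, here is a rough Python sketch of the same idea. It is not a port of cgi-lib.pl, and the function name read_form is hypothetical. It relies on the standard CGI environment variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH, and on the standard library function urllib.parse.parse_qs to decode the URL-encoded pairs.

```python
import os
import sys
from typing import Dict, List, Mapping, Optional, TextIO
from urllib.parse import parse_qs


def read_form(environ: Optional[Mapping[str, str]] = None,
              stdin: Optional[TextIO] = None) -> Dict[str, List[str]]:
    """Return the form variables of a CGI request as name -> list of values."""
    environ = os.environ if environ is None else environ
    stdin = sys.stdin if stdin is None else stdin
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST data arrives on standard input; CONTENT_LENGTH gives its size.
        # URL-encoded form data is ASCII, so characters and bytes coincide.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET data arrives in the part of the URL after the "?".
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    # Simulated request environment, used only for illustration.
    fake_env = {"REQUEST_METHOD": "GET",
                "QUERY_STRING": "name=Ada+Lovelace&lang=perl&lang=python"}
    print(read_form(fake_env))
```

Each name maps to a list because a form may send the same variable more than once, as with the lang field in the simulated request.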
Tutorial Examples
You may download some simple examples from the
cgi-lib web site.