= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
In This Chapter
Download and Install The Apache Package
Configuration – Multiple Sites And IP Addresses
Using Data Compression On Web Pages
Apache Running On A Server Behind A Firewall
How To Protect Web Page Directories With Passwords
Issues When Upgrading To Apache 2.0
© Peter Harrison, www.linuxhomenetworking.com
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
This page outlines how to create multiple websites using a single IP address for a basic home configuration. The Apache online documentation covers the same ground, but it gets a little complicated.
Most RedHat Linux software products are available in the RPM format. Downloading and installing RPMs isn’t hard. If you need a refresher, the chapter on RPMs covers how to do this in detail. It is best to use the latest version of Apache.
· For example, the RedHat 8.0 RPM as of this writing was:
httpd-2.0.40-8.i386.rpm
· Install the package using the rpm command
[root@bigboy tmp]# rpm -Uvh httpd-2.0.40-8.i386.rpm
By default, Apache expects its HTML files to be located in the /var/www/html directory.
· Use the chkconfig command to configure Apache to start at boot:
[root@bigboy tmp]# chkconfig --level 35 httpd on
· Use the httpd init script in the /etc/init.d directory to start/stop/restart Apache after booting
[root@bigboy tmp]# /etc/init.d/httpd start
[root@bigboy tmp]# /etc/init.d/httpd stop
[root@bigboy tmp]# /etc/init.d/httpd restart
· You can test whether the Apache process is running with the following command. You should get a response of plain old process ID numbers:
[root@bigboy tmp]# pgrep httpd
Remember that you will never receive the correct traffic unless you have configured DNS for your domain to make your new Linux box web server the target of the DNS domain's www entry. See either the Static DNS or Dynamic DNS pages on how to do this.
The configuration file used by Apache is /etc/httpd/conf/httpd.conf. Examples of this will follow.
You can make your web server host more than one site per IP address by using Apache's "named virtual hosting" feature. The NameVirtualHost directive in the /etc/httpd/conf/httpd.conf file is used to tell Apache the IP addresses which will participate in this feature. Here is the format:
NameVirtualHost 97.158.253.26
The <VirtualHost> sections in the file then tell Apache where it should look for the web pages used on each web site. You must specify the IP address for which each <VirtualHost> section applies. Here is the format:
<VirtualHost 97.158.253.26>
Directives for site #1
</VirtualHost>
<VirtualHost 97.158.253.26>
Directives for site #2
</VirtualHost>
Within each <VirtualHost> section you then specify the primary website domain name for that IP address with the ServerName directive. The directory where the index page for that site is located is defined with the DocumentRoot directive.
You can also list secondary domain names which will serve the same content as the primary ServerName using the ServerAlias directive.
As explained on the Apache website: "When a request arrives, the server will first check if it is using an IP address that matches the NameVirtualHost. If it is, then it will look at each <VirtualHost> section with a matching IP address and try to find one where the ServerName or ServerAlias matches the requested hostname. If it finds one, then it uses the configuration for that server. If no matching virtual host is found, then the first listed virtual host that matches the IP address will be used."
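The matching order quoted above can be sketched as a small shell function. This is only an illustration of the lookup logic, not Apache's own code: the table format, the select_vhost name, and listing the default site with an explicit IP address instead of the "*" wildcard are all assumptions made to keep the example short. The sketch also compares names exactly as typed.

```shell
#!/bin/sh
# Hypothetical table of virtual hosts in configuration order.
# Each line is: ip|DocumentRoot|space-separated ServerName and ServerAlias names
VHOSTS='97.158.253.26|/var/www/html/site1|
97.158.253.26|/var/www/html/site2|www.my-site.com my-site.com www.my-cool-site.com
97.158.253.26|/var/www/html/site3|www.test-site.com
97.158.253.27|/var/www/html/site4|www.my-other-site.com'

# select_vhost IP HOSTNAME: print the DocumentRoot that would serve the request
select_vhost() {
    req_ip=$1
    req_host=$2
    first_match=""      # first section listed for this IP, used as the fallback
    result=""
    while IFS='|' read -r ip docroot names; do
        [ "$ip" = "$req_ip" ] || continue
        [ -n "$first_match" ] || first_match=$docroot
        for name in $names; do
            if [ "$name" = "$req_host" ]; then
                result=$docroot
                break 2     # leave both the for loop and the while loop
            fi
        done
    done <<EOF
$VHOSTS
EOF
    echo "${result:-$first_match}"
}

select_vhost 97.158.253.26 www.test-site.com      # prints /var/www/html/site3
select_vhost 97.158.253.26 www.default-site.com   # no name matches, prints /var/www/html/site1
```

The second call shows the fallback rule: no section names www.default-site.com, so the first section listed for that IP address answers.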
The other virtual hosting option is to have one IP address per website, which is also known as IP based virtual hosting. In this case you will not have a NameVirtualHost directive for the IP address, and you must have only a single <VirtualHost> section per IP address.
It is common for system administrators to replace the IP address in the <VirtualHost> and NameVirtualHost directives with the “*” (all IP addresses) wildcard character. This makes configuration easier.
If you installed Apache with support for secure HTTPS / SSL, which is used frequently in credit card and shopping cart web pages, then wild cards won’t work. The Apache SSL module demands at least one explicit <VirtualHost> directive for IP based virtual hosting. When you use wild cards, Apache interprets it as an overlap of name based and IP based <VirtualHost> directives and will give errors like this because it can’t make up its mind about which method to use:
Starting httpd: [Sat Oct 12 21:21:49 2002] [error] VirtualHost _default_:443 -- mixing * ports and non-* ports with a NameVirtualHost address is not supported, proceeding with undefined results
If you try to load any webpage on your web server you’ll also notice an error like this:
Bad request!
Your browser (or proxy) sent a request that this server could not understand.
If you think this is a server error, please contact the webmaster
You have two options to overcome this problem.
o Continue using wildcards and disable SSL.
o Run Apache with more careful use of wildcards
If you wish to host a basic home SOHO website in which secure connections for credit card payments are unnecessary then you have the option of disabling SSL altogether. This can be done by not loading all the modules from the /etc/httpd/conf.d directory. By default, all the modules in this directory are loaded with the following directive in the /etc/httpd/conf/httpd.conf file:
Include conf.d/*.conf
You can therefore do a listing of all the files in this directory and specifically load all except ssl. In this case we load only the php and perl modules.
Include conf.d/perl.conf
Include conf.d/php.conf
You will then have to restart Apache for the changes to take effect.
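If the conf.d directory holds many files, the explicit Include lines can be generated rather than typed. The sketch below assumes the SSL settings live in a file named ssl.conf; the list_includes name is made up for this example, and the output is meant to be pasted into httpd.conf in place of the wildcard Include.

```shell
#!/bin/sh
# list_includes DIR: print one Include line for each *.conf file in DIR,
# skipping ssl.conf (assumed name of the SSL configuration file).
list_includes() {
    for f in "$1"/*.conf; do
        [ -e "$f" ] || continue            # DIR has no .conf files at all
        name=$(basename "$f")
        [ "$name" = "ssl.conf" ] && continue
        echo "Include conf.d/$name"
    done
}

# Demonstration on a scratch directory standing in for /etc/httpd/conf.d
demo_dir=$(mktemp -d)
touch "$demo_dir/perl.conf" "$demo_dir/php.conf" "$demo_dir/ssl.conf"
list_includes "$demo_dir"
# prints:
#   Include conf.d/perl.conf
#   Include conf.d/php.conf
```

On a real server you would run list_includes /etc/httpd/conf.d instead of using the scratch directory.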
The other choice is not to use virtual hosting statements with wild cards. The only exception would be the very first <VirtualHost> directive which defines the web pages to be displayed when matches to the other <VirtualHost> directives cannot be found.
By default, Apache will search the DocumentRoot directory for an index or “home” page named index.html. So for example, if you have a VirtualHost site of www.my-site.com with a DocumentRoot directory of /home/www/site1/, Apache will display the contents of the file /home/www/site1/index.html when you enter http://www.my-site.com in your browser.
Some editors like Microsoft FrontPage will create files with an “.htm”, not “.html” extension. This isn’t usually a problem if all your HTML files have hyperlinks pointing to files ending in “.htm” as FrontPage does. The problem occurs with Apache not recognizing the topmost index.htm page. The easiest solution is to create a symbolic link (“shortcut” for Windows users) called index.html pointing to the file index.htm. This will then allow you to edit/copy the file index.htm with index.html being updated automatically. You’ll almost never have to worry about index.html and Apache again!
[root@bigboy tmp]# cd /home/www/site1
[root@bigboy site1]# ln -s index.htm index.html
[root@bigboy site1]# ll index.*
-rw-rw-r-- 1 root home 48590 Jun 18 23:43 index.htm
lrwxrwxrwx 1 root root 9 Jun 21 18:05 index.html -> index.htm
[root@bigboy site1]#
The “l” at the very beginning of the index.html entry signifies a link and the “->” points to the link target.
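To convince yourself that edits to index.htm show up through the link, you can try the same steps in a scratch directory first. This is a minimal sketch; the temporary directory and the page contents are stand-ins, not part of the real site.

```shell
#!/bin/sh
# Try the index.htm -> index.html link in a scratch directory.
site_dir=$(mktemp -d)                      # stand-in for a DocumentRoot
echo '<html><body>version 1</body></html>' > "$site_dir/index.htm"
ln -s index.htm "$site_dir/index.html"     # relative link, as in the example above

# Edit the real file; the link itself needs no update.
echo '<html><body>version 2</body></html>' > "$site_dir/index.htm"

readlink "$site_dir/index.html"            # prints the link target: index.htm
cat "$site_dir/index.html"                 # prints the version 2 page
```

Because the link target is relative, the whole directory can be copied or moved and the link keeps working.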
What follows are snippets of the section of the /etc/httpd/conf/httpd.conf file you'll need to edit. In this scenario:
· The systems administrator for the server has previously created DNS entries for www.my-site.com, my-site.com, www.my-cool-site.com, www.default-site.com and www.test-site.com to map to an IP address 97.158.253.26 on this web server. The domain www.my-other-site.com was also configured to point to alias IP address 97.158.253.27.
· Traffic to www.my-site.com, my-site.com, www.my-cool-site.com must get content from sub-directory site2. Hitting these URLs will cause Apache to display the contents of file index.html in this directory.
· Traffic to www.test-site.com must get content from sub-directory site3.
· Named virtual hosting will be required for 97.158.253.26 as in this case we have a single IP address serving different content for a variety of domains. A NameVirtualHost directive for 97.158.253.26 is therefore required.
· Traffic going to www.my-other-site.com will get content from directory site4.
· There is no ServerName directive for www.default-site.com, and so traffic going to this domain is handled by the default rule described in the next point.
· All other domains pointing to this server that don’t have a matching ServerName directive will get web pages from the directory defined in the very first <VirtualHost> section. In this case it is directory site1. Site www.default-site.com falls in this category.
Web Hosting Scenario Summary

Domain                                             | IP address    | Directory | Type of Virtual Hosting
www.my-site.com, my-site.com, www.my-cool-site.com | 97.158.253.26 | site2     | Name based
www.test-site.com                                  | 97.158.253.26 | site3     | Name based
www.my-other-site.com                              | 97.158.253.27 | site4     | IP based
www.default-site.com, all other domains            | 97.158.253.26 | site1     | Name based

A sample snippet of a working httpd.conf file is listed below. The statements listed would normally be found at the very bottom of the file where virtual hosting statements normally reside. The last section of this configuration snippet has some additional statements to ensure read-only access to your web pages with the exception of web based forms using POSTs (pages with “submit” buttons). Remember to restart Apache every time you update the conf file for the changes to take effect on the running process.
ServerName localhost
NameVirtualHost 97.158.253.26
#
# Match a webpage directory with each website
#
<VirtualHost *>
DocumentRoot /var/www/html/site1
</VirtualHost>
<VirtualHost 97.158.253.26>
DocumentRoot /var/www/html/site2
ServerName www.my-site.com
ServerAlias my-site.com www.my-cool-site.com
</VirtualHost>
<VirtualHost 97.158.253.26>
DocumentRoot /var/www/html/site3
ServerName www.test-site.com
</VirtualHost>
<VirtualHost 97.158.253.27>
DocumentRoot /var/www/html/site4
ServerName www.my-other-site.com
</VirtualHost>
#
# Make sure the directories specified above
# have restricted access to read-only.
#
<Directory "/var/www/html/*">
Order allow,deny
Allow from all
AllowOverride FileInfo AuthConfig Limit
Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
<Limit GET POST OPTIONS>
Order allow,deny
Allow from all
</Limit>
<LimitExcept GET POST OPTIONS>
Order deny,allow
Deny from all
</LimitExcept>
</Directory>
You will have to configure your DNS server to point to the correct IP address used for each of the websites you host. The chapter on static DNS shows you how to configure multiple domains such as my-site.com and my-other-site.com on your DNS server.
Be careful if you create subdirectories under your DocumentRoot directory. Apache will serve the files in a subdirectory as expected when you link to them directly, but a subdirectory without its own index.html page can expose a listing of everything in it.
Say for example we create a subdirectory named /home/www/site1/example under www.my-site.com’s DocumentRoot of /home/www/site1/. Now we’ll be able to view the contents of the file my-example.html in this subdirectory if we point our browser to:
http://www.my-site.com/example/my-example.html
If a curious surfer decides to see what the index page is for www.my-site.com/example, they would type the link:
http://www.my-site.com/example
Apache will list all the files in the “example” directory if it can’t find the index.html file. You can disable the directory listing by using a “-Indexes” option in the <Directory> directive for the DocumentRoot like this:
<Directory "/home/www/*">
…
…
…
Options MultiViews -Indexes SymLinksIfOwnerMatch IncludesNoExec
</Directory>
Remember to restart Apache after the changes. Users attempting to access the nonexistent index page will now get a “403 Forbidden” message.
You can tell Apache to display a pre-defined HTML file whenever a surfer attempts to access a non-index page that doesn’t exist. You can place this statement in the httpd.conf file, which will make Apache display the contents of missing.htm instead of a generic “404 Not Found” message.
ErrorDocument 404 /missing.htm
Remember to put a file with this name in each DocumentRoot directory. You can see the missing.htm file I use by clicking the non-existent link below. You’ll notice that this gives the same output as http://www.linuxhomenetworking.com/missing.htm.
http://www.linuxhomenetworking.com/bogus-file.htm
Apache also has the ability to dynamically compress static web pages into gzip format and then send the result to the remote web surfers’ web browser. Most current web browsers support this format and will transparently uncompress the data and present it on the screen. This can significantly reduce bandwidth charges if you are paying for internet access by the megabyte.
First you need to load Apache version 2’s deflate module in your httpd.conf file and then use Location directives to specify what type of files to compress. After making these modifications and restarting Apache you will be able to verify from your /var/log/httpd/access_log file that the sizes of the transmitted HTML pages have shrunk.
Here is a comparison of the file sizes in the Apache logs and the document directory: 78,350 bytes shrunk to 15,190 bytes, a reduction of just over 80%.
Log File
67.119.25.115 - - [15/Feb/2003:23:06:51 -0800] "GET /dns-static.htm HTTP/1.1" 200 15190 "http://www.siliconvalleyccie.com/sendmail.htm" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; AT&T CSM6.0; YComp 5.0.2.6)"
Corresponding Directory Listing
[root@bigboy tmp]# ll /home/www/ccie/dns-static.htm
-rw-r--r-- 1 user group 78350 Feb 15 00:53 /home/www/ccie/dns-static.htm
[root@bigboy tmp]#
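The saving in this example can be worked out directly from the two sizes: the byte count in the log entry and the byte count in the directory listing. The savings_pct function name is only for illustration.

```shell
#!/bin/sh
# savings_pct ORIGINAL_BYTES SENT_BYTES: percentage saved by compression
savings_pct() {
    LC_ALL=C awk -v o="$1" -v s="$2" 'BEGIN { printf "%.1f\n", (1 - s / o) * 100 }'
}

savings_pct 78350 15190    # prints 80.6
```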
You can insert these statements just before your virtual hosting section of your httpd.conf file to activate the compression of static pages. Remember to restart Apache when you do.
LoadModule deflate_module modules/mod_deflate.so
<Location />
# Insert filter
SetOutputFilter DEFLATE
# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html
# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE masquerades as Netscape, but it is fine
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
# Don't compress images
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary
# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
</Location>
If your web server is behind a firewall, and you are logged on to a machine behind the firewall as well, then you may find problems when trying to access www.my-site.com or www.my-other-site.com. The reason for this is that due to NAT (Network Address Translation), firewalls frequently won't allow access from their protected network to IP addresses that they masquerade on the outside.
For example, in this case, Linux web server bigboy has an internal IP address of 192.168.1.100, but the firewall presents it to the world with an external IP address of 97.158.253.26 via NAT/masquerading. If you are on the inside, 192.168.1.X network, you may find it impossible to hit URLs that resolve in DNS to 97.158.253.26.
This problem can also be solved with virtual hosting. You can configure Apache to serve the correct content when accessing www.my-site.com or www.my-other-site.com from the outside, and also when accessing the specific IP address 192.168.1.100 from the inside. Fortunately Apache allows you to specify multiple IP addresses in the <VirtualHost> statements to help you overcome this problem. Here is an example:
NameVirtualHost 192.168.1.100
NameVirtualHost 97.158.253.26
<VirtualHost 192.168.1.100 97.158.253.26>
DocumentRoot /www/server1
ServerName www.my-site.com
ServerAlias bigboy www.my-site-192-168-1-100.com
</VirtualHost>
Remember that if you get a "permissions" error in your web browser after trying to browse your newly configured website, then you need to ensure that "others" have read and execute access to every directory on the path from the root directory "/" down to the target sub-directory, and read access to the files themselves.
The appendix has a short script that you can use to recursively set the file permissions in a directory to match those expected by Apache.
You may also have to use the "Directory" directive to make Apache serve the pages once the file permissions have been correctly set. If you have your files in the default /var/www/html directory then this second step becomes unnecessary.
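A quick way to find the directory that is blocking Apache is to walk up the path and look at the permission bits for "others". The sketch below is not the script from the appendix; the check_world_access name is hypothetical, and it only checks the execute (search) bit, which is the minimum needed to pass through a directory.

```shell
#!/bin/sh
# check_world_access PATH
# Walk from PATH up to "/" and report every directory that "others"
# cannot enter. PATH must be absolute.
check_world_access() {
    case $1 in
        /*) dir=$1 ;;
        *)  echo "need an absolute path" >&2; return 2 ;;
    esac
    status=0
    while :; do
        others=$(ls -ld "$dir" | cut -c8-10)    # the "others" rwx triplet
        case $others in
            ??x|??t) ;;                           # search bit is set
            *) echo "no search permission for others: $dir"; status=1 ;;
        esac
        [ "$dir" = "/" ] && break
        dir=$(dirname "$dir")
    done
    return $status
}

# Example: check_world_access /var/www/html/site1
```

The function prints nothing and returns 0 when every directory on the path can be entered by "others".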
You can password protect content in both the main and sub-directories of your DocumentRoot fairly easily. I know of cases where persons will allow normal access to their regular web pages, but require passwords for directories / pages that show MRTG or Webalizer data. In this example we'll show how to password protect the /var/www/html directory.
· Apache has a password utility called "htpasswd" which can create "username password" combinations independent of your system login password for web page access. You have to specify the location of the password file, and if it doesn't yet exist, you'll have to include a "-c" or "create" switch on the command line. I recommend placing the file in your /etc/httpd/conf directory, away from the DocumentRoot tree where web users could possibly view it. Here is an example for a first user named "peter" and a second named "paul":
[root@bigboy tmp]# htpasswd -c /etc/httpd/conf/.htpasswd peter
New password:
Re-type new password:
Adding password for user peter
[root@bigboy tmp]#
[root@bigboy tmp]# htpasswd /etc/httpd/conf/.htpasswd paul
New password:
Re-type new password:
Adding password for user paul
[root@bigboy tmp]#
· Make the .htpasswd file readable by all users.
[root@bigboy tmp]# chmod 644 /etc/httpd/conf/.htpasswd
· Create a .htaccess file in the directory to which you want password control with the following entries. Remember this will password protect this directory and all its sub directories.
AuthUserFile /etc/httpd/conf/.htpasswd
AuthGroupFile /dev/null
AuthName EnterPassword
AuthType Basic
require user peter
· The AuthUserFile tells Apache to use the “.htpasswd” file.
· The "require user" tells Apache that only user "peter" in the “.htpasswd” file should have access. If you wanted all “.htpasswd” users to have access then you'd replace this line with "require valid-user".
· "AuthType Basic" instructs Apache to accept basic unencrypted passwords from the remote user's web browser.
· Set the correct file protections on your new .htaccess file in the directory /var/www/html.
[root@bigboy tmp]# chmod 644 /var/www/html/.htaccess
· Make sure your /etc/httpd/conf/httpd.conf file has an AllowOverride statement in a <Directory> directive that covers /var/www/html, either for that directory itself or for one above it in the tree. In the example below, we want /var/www/html and all directories below it to honor the password settings in .htaccess.
<Directory /var/www/html>
AllowOverride AuthConfig
</Directory>
· You must also ensure that you have a <VirtualHost> directive that defines access to /var/www/html or another directory higher up in the tree.
<VirtualHost *>
ServerName 97.158.253.26
DocumentRoot /var/www/html
</VirtualHost>
· Restart Apache. Try accessing the web site and you'll be prompted for a password.
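To see why "AuthType Basic" passwords count as unencrypted: the browser sends the username and password joined by a colon and base64 encoded, which anyone who captures the request can reverse. The sketch below uses a made-up password "secret" for user peter, the basic_auth_header function name is hypothetical, and it assumes the base64 utility from GNU coreutils.

```shell
#!/bin/sh
# basic_auth_header USER PASSWORD: print the request header a browser would send
basic_auth_header() {
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header peter secret
# prints: Authorization: Basic cGV0ZXI6c2VjcmV0

# Reversing it needs no key at all
printf '%s' 'cGV0ZXI6c2VjcmV0' | base64 -d    # prints peter:secret
```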
Your old configuration files will be incompatible when upgrading from Apache version 1.3 to Apache 2.X. The new version 2.X default configuration file is stored in /etc/httpd/conf/httpd.conf.rpmnew. For the simple virtual hosting example above, it would be easiest to:
o Save the old httpd.conf file with another name, httpd.conf-version-1.x for example. Copy the ServerName, NameVirtualHost, and VirtualHost sections from the old file and place them in the new file httpd.conf.rpmnew
o Copy the httpd.conf.rpmnew file and name it httpd.conf
o Restart Apache
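The file shuffling in these steps can be sketched as two small shell functions with the manual edit in between. The function names are made up for this example, and the directory is passed as an argument so you can rehearse on a copy before touching /etc/httpd/conf.

```shell
#!/bin/sh
# Step 1: keep the 1.3 configuration under another name.
save_old_conf() {
    cp -p "$1/httpd.conf" "$1/httpd.conf-version-1.x"
}

# Step 2 is a manual edit: copy the ServerName, NameVirtualHost and
# <VirtualHost> sections from httpd.conf-version-1.x into httpd.conf.rpmnew.

# Step 3: make the edited 2.x file the live configuration.
activate_new_conf() {
    cp -p "$1/httpd.conf.rpmnew" "$1/httpd.conf"
}

# Typical use, as root (restart Apache afterwards):
#   save_old_conf /etc/httpd/conf
#   ... edit /etc/httpd/conf/httpd.conf.rpmnew ...
#   activate_new_conf /etc/httpd/conf
```

Using cp rather than mv leaves httpd.conf.rpmnew in place, so the untouched 2.x defaults plus your edits remain available for comparison.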