One of the most common needs Webmasters have is to cause the Web server to
handle all the documents in a particular directory, or tree of directories,
in the same way -- such as requiring a password before granting access to any
file in the directory, or allowing (or disallowing) directory listings. However,
this need often extends to more than just the Webmaster; consider students on
a departmental Web server at a university, or individual customers of an ISP,
or clients of a Web-hosting company. This article describes how the Webmaster
can extend permission to tailor Apache's behaviour to users, allowing them to
have some control over how it handles their own sub-areas of its total Web-space.
This article shows how you can use per-directory configuration files,
called .htaccess files, to customise Apache behaviour --
or allow your users to do so for their own documents.
Apache's configuration system addresses the need to group documents by directory
in a straightforward manner. To apply controls to a particular directory tree,
for instance, you can use the <Directory> container directive
in the server's configuration files:
<Directory "C:/Program Files/Apache Group/Apache/htdocs">
AllowOverride None
Options None
</Directory>
This has the advantage of keeping control in the Webmaster's hands;
there's no need to worry about any of the server's users being able
to change the settings, since the server configuration files are
generally not modifiable by anyone except the admin. Unfortunately,
it has the disadvantages of
requiring a restart of Apache any time the config file is changed,
and that it can become truly burdensome to add all the
<Directory> containers that might be needed
for all the users that have special requirements.
An alternative method for supplying the desired granularity of
Apache configuration -- down to the directory level -- is to use
special partial config files in each directory with special
requirements.
An .htaccess file is simply a text file containing Apache directives.
Those directives apply to the documents in the directory where the .htaccess
file is located, and to all subdirectories under it as well. Other .htaccess
files in subdirectories may change or nullify the effects of those in parent
directories; see the section on merging for more information.
Because they are plain text files, you can use whatever text editor you like
to create or change .htaccess files.
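As a minimal sketch of what such a file can look like, the following .htaccess file asks for a username and password before serving anything in its directory. The realm name and the password-file path are placeholders invented for this illustration, and the directives only take effect if the server configuration permits the AuthConfig override category for that directory.

```apache
# Hypothetical .htaccess file; the realm name and file path are assumptions.
AuthType Basic
AuthName "Members Only"
# Password file created with the htpasswd utility; keep it outside the Web space.
AuthUserFile "C:/Program Files/Apache Group/Apache/passwd/members"
Require valid-user
```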
These files are called '.htaccess files' because that's what
they're typically named. This naming scheme has its roots in the NCSA Web server
and the Unix file system; files whose names begin with a dot are often considered
to be 'hidden' and aren't displayed in a normal directory listing. The NCSA
developers chose the name '.htaccess' so that a control file in
a directory would have a fairly reasonable name ('ht' for 'hypertext') and not
clutter up directory listings. Plus, there's a long history of Unix utilities
storing their preferences information in such 'hidden' files.
The name '.htaccess' isn't universally acceptable, though. Sometimes
it can be quite difficult to persuade a system to let you create or edit a file
with such a name. For this reason, you can change the name that Apache will
use when looking for these per-directory config files by using the AccessFileName
directive in your server's httpd.conf file. For instance,
AccessFileName ht.acl
will cause Apache to look for files named ht.acl instead of .htaccess.
They'll be treated the same way, though, and they're still called '.htaccess
files' for convenience.
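If you do rename the file, it may be worth checking that the new name is still protected from being served to browsers. The default configuration files in some Apache releases deny requests for files whose names begin with '.ht', and a name such as ht.acl would not match that pattern. A sketch for httpd.conf, assuming the ht.acl name used above:

```apache
# Look for per-directory settings in ht.acl instead of .htaccess.
AccessFileName ht.acl

# Assumption: you want to keep clients from downloading the control files.
<Files ht.acl>
    Order allow,deny
    Deny from all
</Files>
```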
When Apache determines that a requested resource actually represents a file
on the disk, it starts a process called the 'directory walk.' This involves
checking through its internal list of <Directory> containers
to find those that apply, and possibly searching the directories on the filesystem
for .htaccess files.
Each time the directory walk finds a new set of directives that apply to the
request, they are merged with the settings already accumulated. The result
is a collection of settings that apply to the final document, culled from all
of its ancestor directories and the server's config files.
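As an illustration of merging, suppose two hypothetical .htaccess files sit in a parent directory and one of its subdirectories, and that the Options override category is allowed for both. The directory names here are invented for the example.

```apache
# --- htdocs/docs/.htaccess (hypothetical parent directory) ---
# Turn on automatic directory listings for this tree.
Options +Indexes

# --- htdocs/docs/private/.htaccess (hypothetical subdirectory) ---
# Merged after the parent's settings: listings are switched off again
# for this directory and everything beneath it.
Options -Indexes
```

A request for a document in htdocs/docs/private sees both files in turn, and the setting from the deeper one wins; documents elsewhere under htdocs/docs still get listings.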
When searching for .htaccess files, Apache starts at the top
of the filesystem. (On Windows, that usually means 'C:\'; otherwise,
the root directory '/'.) It then walks down the directories to
the one containing the final document, processing and merging any .htaccess
files it finds that the config files say should be processed. (See the section
on overrides for more information on how the server
determines whether an .htaccess file should be processed or not.)
This can be an intensive process. Consider a request for <URI:http://your.host.com/foo/bar/gritch/x.html>
which resolves to the file
C:\Program Files\Apache Group\Apache\htdocs\foo\bar\gritch\x.html
Unless instructed otherwise, Apache is going to look for each of the following
.htaccess files, and process any it finds:
- C:\.htaccess
- C:\Program Files\.htaccess
- C:\Program Files\Apache Group\.htaccess
- C:\Program Files\Apache Group\Apache\.htaccess
- C:\Program Files\Apache Group\Apache\htdocs\.htaccess
- C:\Program Files\Apache Group\Apache\htdocs\foo\.htaccess
- C:\Program Files\Apache Group\Apache\htdocs\foo\bar\.htaccess
- C:\Program Files\Apache Group\Apache\htdocs\foo\bar\gritch\.htaccess
That's a lot of work just to return a single file! And the server will repeat
this process each and every time the file is requested. See the overrides
section for a way to reduce this overhead with the AllowOverride None
directive.
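One way this might look in httpd.conf, using the installation path from the example above: overrides are switched off from the top of the filesystem and re-enabled only for the document tree. The choice of AuthConfig here is just an assumption for illustration.

```apache
# Don't look for .htaccess files anywhere by default.
<Directory />
    AllowOverride None
</Directory>

# Re-enable a limited set of overrides for the document tree only.
<Directory "C:/Program Files/Apache Group/Apache/htdocs">
    AllowOverride AuthConfig
</Directory>
```

With settings like these, the request for x.html would only need to check the last four locations in the list: htdocs, foo, bar, and gritch.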
Because .htaccess files are evaluated for each request, you don't
need to reload the Apache server whenever you make a change. This makes them
particularly well suited for environments with multiple groups or individuals
sharing a single Web server system; if the Webmaster allows, they can exercise
control over their own areas without nagging the Webmaster to reload Apache
with each change. Also, if there's a syntax error in an .htaccess
file, it only affects a portion of the server's Web space, rather than keeping
the server from running at all (which is what would happen if the error was
in the server-wide config files).
Not all directives will work in .htaccess files; for example,
it makes no sense to allow a ServerName directive to appear in
one, since the server is already running and knows its name -- and cannot change
it -- by the time a request would cause the .htaccess file to be
read. Other directives aren't allowed because they deal with features that are
server-wide, or perhaps are too sensitive.
However, most directives are allowed in .htaccess files.
If you're not sure, take a look at the directive's documentation. Figure 1 is
a sample extracted from the Apache documentation. You can see where the text
says 'Context' that .htaccess is listed; that means this directive
can be used in the per-directory config files.
The SetEnvIf Directive
Syntax: SetEnvIf attribute regex envar[=value] [...]
Default: none
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_setenvif
Compatibility: Apache 1.3 and above; the Request_Protocol keyword
and environment-variable matching are only available with 1.3.7 and
later; use in .htaccess files only supported with 1.3.13 and later
Figure 1: Directive Documentation
Note, however, that there's more information on the Compatibility line;
it says that this directive can only be used in .htaccess files
if you're running Apache version 1.3.13 or later.
If you try to include a directive in an .htaccess file that isn't
permitted there, any requests for documents under that directory will result
in a '500 Server Error' error page and a message in the
server's error log.
If your .htaccess file contains directives that aren't covered
by the current set of override categories, they won't cause an error
-- the server will just ignore them. So your file can contain directives in
any -- or all -- of the categories, and only those in the categories listed
in the AllowOverride list will be processed. All of the others
will be checked for syntax, but otherwise not interpreted.
Apache directives fall into seven different categories, and all can appear
in the server-wide config files. Only five of the categories can be used in
.htaccess files, though, and in order for Apache to accept a directive
in a per-directory file, the settings for the directory must permit the
directive's category to be overridden.
The five categories of directives are:
AuthConfig
- This category is intended to be used to control directives that have to
do with Web page security, such as the AuthName, Satisfy,
and Require directives. This is the most common category to allow
to be overridden, as it allows users to protect their own documents.
FileInfo
- Directives that control how files are processed are in this category,
such as AddType, Header, and Action.
Indexes
- Directives that affect file listings should be in this category. It includes
IndexOptions, AddDescription, and DirectoryIndex,
for example.
Limit
- This category is similar to the AuthConfig one in that the directives it
covers are typically related to security. However, they usually involve
involuntary controls, such as controlling access by IP address. Directives
in this category include Order, Allow, and Deny.
Options
- The Options category is intended for directives that support
miscellaneous options, such as ContentDigest, XBitHack,
and Options itself.
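To make the categories a little more concrete, here is a sketch of an .htaccess file that uses only Limit directives to restrict a directory to one network. The address is a made-up example, and the file would only be honoured where AllowOverride includes Limit.

```apache
# Host-based access control; these directives belong to the Limit category.
# Deny rules are evaluated first, then Allow rules can override them.
Order deny,allow
Deny from all
# Assumption: 192.168.1 stands in for your own network prefix.
Allow from 192.168.1
```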
A special directive, which is usable only in the server-wide configuration
files, dictates which categories may be overridden in any particular directory
tree. The AllowOverride directive accepts two special keywords
in addition to the category names listed above:
All
- This is a shorthand way of listing all of the categories; the two
statements below are equivalent:
AllowOverride AuthConfig FileInfo Indexes Limit Options
AllowOverride All
None
- This keyword totally disables the processing of .htaccess files
for the specified directory and its descendants (unless another AllowOverride
directive for a subdirectory is defined in the server config files). 'Disabled'
means that Apache won't even look for .htaccess files,
much less process them. This can result in a performance savings, and is why
the default httpd.conf file includes such a directive for the
top-level system directory. .htaccess processing is disabled
for all directories by default by that directive, and is only selectively
enabled for those trees where it makes sense.
As shown above, the AllowOverride directive takes a whitespace-separated
list of category names as its argument.
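For example, a configuration for per-user Web directories might grant just two categories. The directory pattern below is a hypothetical layout, not something taken from a particular installation.

```apache
# Hypothetical per-user document directories on a Unix-style system.
<Directory /home/*/public_html>
    # Users may password-protect documents and adjust directory listings,
    # but nothing else.
    AllowOverride AuthConfig Indexes
</Directory>
```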
By allowing the use of .htaccess files in user (or customer or
client) directories, you're essentially extending a bit of your Webmaster privileges
to anyone who can edit those files. So if you choose to do this, you should
consider occasionally performing an audit to make sure the files are appropriately
protected -- and, if you're really ambitious, that they contain only settings
of which you approve.
Because of the very coarse granularity of the possible override categories,
it's quite possible that by granting a user the ability to override one set of
directives you're inadvertently delegating more power than you anticipate. For
instance, you might want to include an "AllowOverride FileInfo"
directive for user directories so that individuals can use the AddType
directive to label documents with MIME types that aren't in the server-wide
list -- but were you aware when you did this that you were also giving them
access to the Redirect, Header, Action,
and Rewrite* directives as well? Directives are associated with
override categories on a per-module basis, so tracking down what's permitted
by allowing a particular category of override can be a tedious process.
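To illustrate, here is a sketch of a user's .htaccess file in a directory where only FileInfo may be overridden. The MIME type, header name, and URLs are invented, and the Header line assumes mod_headers is part of the server.

```apache
# What the Webmaster had in mind: labelling an unusual file type.
AddType application/x-example .xmp

# Also accepted under FileInfo: adding arbitrary response headers
# (assumes mod_headers is available).
Header set X-Example "anything the user likes"

# Also accepted under FileInfo: sending visitors somewhere else
# entirely (mod_alias).
Redirect /~user/old.html http://www.example.com/new.html
```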
The ultimate answer to what directives are in which categories is the source
code. If you really want to know, examine the source for the following
strings:
String | Corresponding AllowOverride Keyword
OR_AUTHCFG | AllowOverride AuthConfig
OR_FILEINFO | AllowOverride FileInfo
OR_INDEXES | AllowOverride Indexes
OR_LIMIT | AllowOverride Limit
OR_OPTIONS | AllowOverride Options
(See the previous section for a description of what
the different override categories mean.)
As you can see, with the exception of the AuthConfig/AUTHCFG keywords, the
source keywords are identical to the directive keywords. This is convenient!
Before enabling .htaccess files, consider the advantages and
disadvantages. On servers I run myself, with no users, I tend to use .htaccess
files for testing and debugging, and when I have a configuration I like, I move
the directives into a <Directory> container in the httpd.conf
file and delete the .htaccess file. For this reason, I have overrides
enabled just about everywhere. This allows me to balance the convenience of
.htaccess files against their performance impact.
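As a sketch of that workflow, suppose a tested .htaccess file in a hypothetical htdocs/reports directory held two directives, one turning off directory listings and one naming a default document. The directory and file names are made up. Moved into httpd.conf, they might look like this:

```apache
<Directory "C:/Program Files/Apache Group/Apache/htdocs/reports">
    # Settings formerly kept in htdocs/reports/.htaccess
    Options -Indexes
    DirectoryIndex summary.html
</Directory>
```

Unlike the .htaccess version, this takes effect only after Apache is restarted.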
On some of my servers I have some user accounts for people I know and trust,
and in those environments I'm more cautious and don't allow all overrides globally.
I do tend to allow whatever overrides my friends need for their own directories,
though.
And in some cases I have real 'user' accounts, for people I do not
know as well -- and on those servers AllowOverride None is
the rule. I occasionally allow .htaccess files in their private
directories, but I carefully audit the possible effects before granting an override
category.
The two main disadvantages to using .htaccess are the performance
impact and the extension of control to others. The former is somewhat
manageable through the judicious use of the AllowOverride directive,
and the latter is a matter of establishing trust -- and performing risk assessment.
What mix works best in your environment is something you'll need to determine
for yourself.
Here are some of the most common problems I've seen people have (or have had
myself) with .htaccess files. One thing I should stress first,
though: the server error log is your friend. You should always consult
the error log when things don't seem to be functioning correctly. If it doesn't
say anything about your problem, try boosting the message detail by changing
your LogLevel directive to debug. (Or add a LogLevel debug
line if you don't have a LogLevel directive already.)
'Internal Server Error' page is displayed when a document is requested
- This indicates a problem with your configuration. Check the Apache error
log file for a more detailed explanation of what went wrong. You probably
have used a directive that isn't allowed in .htaccess files,
or have a directive with incorrect syntax.
.htaccess file doesn't seem to change anything
- It's possible that the directory is within the scope of an AllowOverride None
directive. Try putting a line of gibberish in the .htaccess file
and force a reload of the page. If you still get the same page instead of
an 'Internal Server Error' display, then this is probably the cause
of the problem. Another slight possibility is that the document you're requesting
isn't actually controlled by the .htaccess file you're editing;
this can sometimes happen if you're accessing a document with a common name,
such as index.html. If there's any chance of this, try changing
the actual document and requesting it again to make sure you can see the change.
I've added some security directives to my .htaccess file, but I'm not getting challenged for a username and password
- The most common cause of this is having the .htaccess directives
within the scope of a Satisfy Any directive. Explicitly
disable this by adding a Satisfy All to the .htaccess
file, and try again.
Once you've first tasted success with your use of .htaccess files,
you'll probably want to do more, and more, and ... Here are some pointers to
resources for further investigation:
- The main Apache Web site, of course: <URL:http://www.apache.org/>
- The documentation for Apache and its modules: <URL:http://www.apache.org/docs/>
- The canonical email response page: <URL:http://www.apache.org/foundation/preFAQ.html>
(This page is normally used to respond to email requests for support, but
there are lots of good resources listed on it.)
Apache provides two main ways of controlling its behaviour on a per-directory
level: <Directory> containers in the server-wide configuration
files, and .htaccess files in each directory where they're needed.
Each method has its advantages and its disadvantages; you, as the Webmaster,
need to balance these against each other to decide what mix of the techniques
is best for your environment.
If you do decide to permit the use of .htaccess files,
be sure to limit them to appropriate areas and improve your performance by using
AllowOverride None elsewhere. This will save unnecessary disk
activity.
Got a Topic You Want Covered?
If you have a particular Apache-related topic that you'd like covered in a
future article in this column, please let me know; drop me an email at <coar@Apache.Org>.
I do read and answer my email, usually within a few hours (although
a few days may pass if I'm travelling or my mail volume is 'way up). If I don't
respond within what seems to be a reasonable amount of time, feel free to ping
me again.
About the Author
Ken Coar is a member of the Apache Group
and a director and vice president of the Apache
Software Foundation. He is also a core member of the
Jikes open-source Java compiler project, a contributor to the PHP
project, the author of
Apache Server for Dummies, a lead author of
Apache Server Unleashed, and is currently working with Ryan
Bloom on a book for Addison-Wesley tentatively entitled Apache Module
Development in C. He can be reached via email at <coar@apache.org>.