Introduction
The main purpose of HTML is to enable web authors to specify structural information
about their pages - for example tables, paragraphs, images and so on. However,
it also provides a way of adding information about the page and its content.
Such information is known as metadata, and is added through the use
of the <META> tag.
The tag can also be used to create the equivalent of HTTP (HyperText Transfer
Protocol) headers, which can provide instructions to the browser.
This article will describe the most common uses of the META tag. It is not
intended to be a definitive list; new metadata is being created and used all
the time, for various purposes. If you are interested in metadata, you may
like to read Janus Boye's article "RDF - What's in it for us?".
Using the META tag to provide information
All META tags should be placed within the HEAD section of your web pages. When
using META to provide information about your page, the general syntax is as
follows:
<META NAME="dataname" CONTENT="datavalue">
In the above line, dataname is a specific identifier for the information
you are providing. The browser (or other program) looks at this name and then
decides how to treat the data. The information itself is represented by datavalue.
In this section we'll be looking at the most common types of metadata and how
they are typically used.
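As a rough sketch of where such a tag sits in a page, the skeleton below places the generic tag from above inside the HEAD section. The title and body text are placeholders invented for illustration; dataname and datavalue stand in for a real name and value, as before.

```html
<HTML>
<HEAD>
<TITLE>Example page</TITLE>
<!-- META tags belong here, inside HEAD, not in BODY -->
<META NAME="dataname" CONTENT="datavalue">
</HEAD>
<BODY>
<P>Page content goes here.</P>
</BODY>
</HTML>
```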
Description
This is simply a basic description of the page's content, in a few sentences.
Search engines often use it to display a brief page summary in the search results.
An example might be:
<META NAME="description" CONTENT="An article about metadata and the META tag.">
Author
Another self-explanatory one; this is usually used to display the name of the
person who wrote the page's content (or, in some cases, the designer of the
page, if this is different).
<META NAME="author" CONTENT="Michael Bednarek">
Keywords
This is the metadata that everyone is talking about, although in reality its
effectiveness is overhyped. The Keywords item allows you to specify a number
of themed words and phrases which may be associated with that web page in some
particular way. For example, some keywords associated with this article might
be: metadata, META tag, search engines, HTTP and HTML.
When you search for something in a search engine, you generally type in a few
words (or perhaps a phrase) related to what you are looking for. The engine
then matches up the keywords you have entered with the pages stored in its database.
This is why people have been going nuts over the META keywords tag; they think
that if you don't have it on each page, and don't have an extensive list of
words, then you won't get very good search engine results.
In fact, only three of the major search engines (AltaVista, InfoSeek and
HotBot) give any importance to META tags - the others base their results
upon a page's actual content. However, it is still worth adding META keywords
to your pages if you want good results in those engines. Words and phrases
are treated differently - so for example, you would need to include the
phrase "web authoring" as well as the two individual words "web" and
"authoring" for best results. Here's an example tag:
<META NAME="keywords" CONTENT="metadata, META tag, meta, keywords, search engines">
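Putting the three items covered so far together, the HEAD of this article might look something like the sketch below. The values are the ones used in the examples above; the title is an assumption, since it isn't quoted here.

```html
<HEAD>
<TITLE>Using the META tag</TITLE>
<META NAME="description" CONTENT="An article about metadata and the META tag.">
<META NAME="author" CONTENT="Michael Bednarek">
<!-- Phrases and single words are listed separately, divided by commas -->
<META NAME="keywords" CONTENT="metadata, META tag, meta, keywords, search engines">
</HEAD>
```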
Robots
Finally, there is the Robots item. This is also related to search engines,
in a way. A robot is a program which will visit a web page, index it,
and then follow all the hyperlinks on that page, indexing each page it finds.
It may continue in this fashion indefinitely, or it may stop after it has
followed links to a certain depth. Search engines often send a robot round
to your site, in order to add all its pages to their database.
Sometimes you may not want certain pages on your site to appear in a search
engine. These might be pages containing sensitive information, or those which
should not be viewed outside of a frameset. You can use the META tag to provide
instructions to robots visiting a page - you can tell them not to index the
page, or not to follow any of the links on it, or both.
Here are examples of some of the combinations you can use:
<META NAME="robots" CONTENT="NOINDEX,NOFOLLOW">
<META NAME="robots" CONTENT="NOINDEX,FOLLOW">
<META NAME="robots" CONTENT="INDEX,NOFOLLOW">
<META NAME="robots" CONTENT="INDEX,FOLLOW">
The last line in this list is in fact the default setting, so you wouldn't
ever need to use it in practice.
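To illustrate the frameset case mentioned above, here is a sketch of a navigation page that is meant to be seen only inside a frameset. It assumes a hypothetical site with pages named news.html and contact.html and a frame named main. NOINDEX keeps the navigation page itself out of the search results, while FOLLOW still lets the robot reach the pages it links to.

```html
<HTML>
<HEAD>
<TITLE>Navigation</TITLE>
<!-- Do not list this frame on its own, but do follow its links -->
<META NAME="robots" CONTENT="NOINDEX,FOLLOW">
</HEAD>
<BODY>
<!-- Hypothetical pages, opened in a hypothetical frame named "main" -->
<A HREF="news.html" TARGET="main">News</A>
<A HREF="contact.html" TARGET="main">Contact</A>
</BODY>
</HTML>
```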
Using the META tag to control the browser
As I mentioned in the introduction to this article, the META tag can also be
used to generate the equivalent of HTTP headers. In practice, this means you
can control the behaviour of the user's browser.
Preventing a page from being cached
There may be occasions when you would want to prevent a page from being cached
locally on the user's computer, and thus force the browser to load a fresh copy
each time. One example of this might be a webcam which is automatically updated
every few seconds - if the user visited at a later date, the browser might show
them the cached (and therefore outdated) version.
There are in fact three META tag variants which you should use to cause this
behaviour, because different browsers recognise different ones. The first,
Expires, is actually supposed to specify an expiry date for the web page. However,
if you set the value to 0, the browser treats it as "now", and therefore asks
for a new version of the page every time. The other two, Pragma and Cache-Control,
are specifically designed to prevent (or control) caching, and should take a
value of "no-cache". So, to prevent your page being cached in most browsers,
you should use the following lines:
<META HTTP-EQUIV="Expires" CONTENT="0">
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
<META HTTP-EQUIV="Cache-Control" CONTENT="no-cache">
Incidentally, if you wish for your page to expire at a later date, you can
specify the date in the following format (using GMT):
<META HTTP-EQUIV="Expires" CONTENT="Tue, 03 Aug 1999 09:30:00 GMT">
Redirecting the browser to another URL
With web hosting services becoming cheaper and cheaper, many people decide
to change the location of their web sites from free ISP space to professional
server space. This relocation obviously causes confusion, because visitors to
the site will still try to use the old URL. What many people do is keep one
page at the old address which, when visited, will automatically send the user
on to the site's new home. This can be easily achieved using the META tag, which
will redirect the user either instantly or after a given time delay.
For example, the line below will redirect the visitor to irt.org after ten
seconds, giving them enough time to read about what's going on:
<META HTTP-EQUIV="Refresh" CONTENT="10;URL=http://www.irt.org">
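A complete holding page built around this tag might look like the sketch below. The wording is invented for illustration; the ordinary link is there as a fallback, on the assumption that some browsers may ignore the Refresh instruction or have it switched off.

```html
<HTML>
<HEAD>
<TITLE>This site has moved</TITLE>
<!-- Wait 10 seconds, then load the new address -->
<META HTTP-EQUIV="Refresh" CONTENT="10;URL=http://www.irt.org">
</HEAD>
<BODY>
<P>This site has moved. You will be taken to the new address in ten seconds.</P>
<!-- Fallback for browsers that do not act on Refresh -->
<P>If nothing happens, please follow this link:
<A HREF="http://www.irt.org">http://www.irt.org</A></P>
</BODY>
</HTML>
```

For an instant redirect, the delay at the start of the CONTENT value would be 0 rather than 10.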
Conclusion
META tags can be very useful additions to your web pages, especially if you
would like more control over how the search engines treat your site. Although
the keywords and description tags are the most well known, the other forms of
metadata can also prove to be helpful.
References