An In-Depth Discussion of Virtual Host Matching - Apache HTTP Server Version 2.4

Apache Server 2.4

Apache HTTP Server Version 2.4

<-

An In-Depth Discussion of Virtual Host Matching

This document attempts to explain exactly what Apache HTTP Server does when deciding what virtual host to serve a request from.

Most users should read about Name-based vs. IP-based Virtual Hosts to decide which type they want to use, then read more about name-based or IP-based virtualhosts, and then see some examples.

If you want to understand all the details, then you can come back to this page.

top

Configuration File

There is a main server which consists of all the definitions appearing outside of <VirtualHost> sections.

There are virtual servers, called vhosts, which are defined by <VirtualHost> sections.

Each VirtualHost directive includes one or more addresses and optional ports.

Hostnames can be used in place of IP addresses in a virtual host definition, but they are resolved at startup and if any name resolutions fail, those virtual host definitions are ignored. This is, therefore, not recommended.

The address can be specified as *, which will match a request if no other vhost has the explicit address on which the request was received.

The address appearing in the VirtualHost directive can have an optional port. If the port is unspecified, it is treated as a wildcard port, which can also be indicated explicitly using *. The wildcard port matches any port.

(Port numbers specified in the VirtualHost directive do not influence what port numbers Apache will listen on, they only control which VirtualHost will be selected to handle a request. Use the Listen directive to control the addresses and ports on which the server listens.)

Collectively the entire set of addresses (including multiple results from DNS lookups) are called the vhost's address set.

Apache automatically discriminates on the basis of the HTTP Host header supplied by the client whenever the most specific match for an IP address and port combination is listed in multiple virtual hosts.

The ServerName directive may appear anywhere within the definition of a server. However, each appearance overrides the previous appearance (within that server). If no ServerName is specified, the server attempts to deduce it from the server's IP address.

The first name-based vhost in the configuration file for a given IP:port pair is significant because it is used for all requests received on that address and port for which no other vhost for that IP:port pair has a matching ServerName or ServerAlias. It is also used for all SSL connections if the server does not support Server Name Indication.

The complete list of names in the VirtualHost directive are treated just like a (non wildcard) ServerAlias (but are not overridden by any ServerAlias statement).

For every vhost various default values are set. In particular:

  1. If a vhost has no ServerAdmin, Timeout, KeepAliveTimeout, KeepAlive, MaxKeepAliveRequests, ReceiveBufferSize, or SendBufferSize directive then the respective value is inherited from the main server. (That is, inherited from whatever the final setting of that value is in the main server.)
  2. The "lookup defaults" that define the default directory permissions for a vhost are merged with those of the main server. This includes any per-directory configuration information for any module.
  3. The per-server configs for each module from the main server are merged into the vhost server.

Essentially, the main server is treated as "defaults" or a "base" on which to build each vhost. But the positioning of these main server definitions in the config file is largely irrelevant -- the entire config of the main server has been parsed when this final merging occurs. So even if a main server definition appears after a vhost definition it might affect the vhost definition.

If the main server has no ServerName at this point, then the hostname of the machine that httpd is running on is used instead. We will call the main server address set those IP addresses returned by a DNS lookup on the ServerName of the main server.

For any undefined ServerName fields, a name-based vhost defaults to the address given first in the VirtualHost statement defining the vhost.

Any vhost that includes the magic _default_ wildcard is given the same ServerName as the main server.

top

Virtual Host Matching

The server determines which vhost to use for a request as follows:

IP address lookup

When the connection is first received on some address and port, the server looks for all the VirtualHost definitions that have the same IP address and port.

If there are no exact matches for the address and port, then wildcard (*) matches are considered.

If no matches are found, the request is served by the main server.

If there are VirtualHost definitions for the IP address, the next step is to decide if we have to deal with an IP-based or a name-based vhost.

IP-based vhost

If there is exactly one VirtualHost directive listing the IP address and port combination that was determined to be the best match, no further actions are performed and the request is served from the matching vhost.

Name-based vhost

If there are multiple VirtualHost directives listing the IP address and port combination that was determined to be the best match, the "list" in the remaining steps refers to the list of vhosts that matched, in the order they were in the configuration file.

If the connection is using SSL, the server supports Server Name Indication, and the SSL client handshake includes the TLS extension with the requested hostname, then that hostname is used below just like the Host: header would be used on a non-SSL connection. Otherwise, the first name-based vhost whose address matched is used for SSL connections. This is significant because the vhost determines which certificate the server will use for the connection.

If the request contains a Host: header field, the list is searched for the first vhost with a matching ServerName or ServerAlias, and the request is served from that vhost. A Host: header field can contain a port number, but Apache always ignores it and matches against the real port to which the client sent the request.

The first vhost in the config file with the specified IP address has the highest priority and catches any request to an unknown server name, or a request without a Host: header field (such as a HTTP/1.0 request).

Persistent connections

The IP lookup described above is only done once for a particular TCP/IP session while the name lookup is done on every request during a KeepAlive/persistent connection. In other words, a client may request pages from different name-based vhosts during a single persistent connection.

Absolute URI

If the URI from the request is an absolute URI, and its hostname and port match the main server or one of the configured virtual hosts and match the address and port to which the client sent the request, then the scheme/hostname/port prefix is stripped off and the remaining relative URI is served by the corresponding main server or virtual host. If it does not match, then the URI remains untouched and the request is taken to be a proxy request.

Observations

  • Name-based virtual hosting is a process applied after the server has selected the best matching IP-based virtual host.
  • If you don't care what IP address the client has connected to, use a "*" as the address of every virtual host, and name-based virtual hosting is applied across all configured virtual hosts.
  • ServerName and ServerAlias checks are never performed for an IP-based vhost.
  • Only the ordering of name-based vhosts for a specific address set is significant. The one name-based vhosts that comes first in the configuration file has the highest priority for its corresponding address set.
  • Any port in the Host: header field is never used during the matching process. Apache always uses the real port to which the client sent the request.
  • If two vhosts have an address in common, those common addresses act as name-based virtual hosts implicitly. This is new behavior as of 2.3.11.
  • The main server is only used to serve a request if the IP address and port number to which the client connected does not match any vhost (including a * vhost). In other words, the main server only catches a request for an unspecified address/port combination (unless there is a _default_ vhost which matches that port).
  • You should never specify DNS names in VirtualHost directives because it will force your server to rely on DNS to boot. Furthermore it poses a security threat if you do not control the DNS for all the domains listed. There's more information available on this and the next two topics.
  • ServerName should always be set for each vhost. Otherwise a DNS lookup is required for each vhost.
top

Tips

In addition to the tips on the DNS Issues page, here are some further tips:

  • Place all main server definitions before any VirtualHost definitions. (This is to aid the readability of the configuration -- the post-config merging process makes it non-obvious that definitions mixed in around virtual hosts might affect all virtual hosts.)