The Domain, DomainJavascript (DJ), Host, HostJavascript (HJ), and URL directives in config.txt have both unique functions and overlapping functions.
There are two separate instances in which EZproxy uses these entries to decide whether or not to proxy a particular hostname: while processing a starting point URL and when encountering hostnames that appear within web pages that are being proxied. An explanation of which directives affect which of these instances follows.
The following terms are used when referencing portions of a URL. These definitions are simplified: they are adequate for understanding this document but are overgeneralized relative to their exact meanings.
URL 1 | http://www.somedb.com/ |
scheme | http |
hostname | www.somedb.com |
port | 80 |
origin | http://www.somedb.com:80 |
path | / |
query | |
fragment |
URL 2 | http://www.somedb.com:80 |
scheme | http |
hostname | www.somedb.com |
port | 80 |
origin | http://www.somedb.com:80 |
path | / |
query | |
fragment |
URL 3 | http://www.somedb.com/search?q=ancient |
scheme | http |
hostname | www.somedb.com |
port | 80 |
origin | http://www.somedb.com:80 |
path | /search |
query | ?q=ancient |
fragment |
URL 4 | https://www.somedb.com/search?q=ancient |
scheme | https |
hostname | www.somedb.com |
port | 443 |
origin | https://www.somedb.com:443 |
path | /search |
query | ?q=ancient |
fragment |
URL 5 | http://www.somedb.com:8080/history?era=darkages |
scheme | http |
hostname | www.somedb.com |
port | 8080 |
origin | http://www.somedb.com:8080 |
path | /history |
query | ?era=darkages |
fragment |
URL 6 | http://search.somedb.com:8080/history?era=darkages |
scheme | http |
hostname | search.somedb.com |
port | 8080 |
origin | http://search.somedb.com:8080 |
path | /history |
query | ?era=darkages |
fragment |
URL 7 | http://search.somedb.com:8080/history#?modern |
scheme | http |
hostname | search.somedb.com |
port | 8080 |
origin | http://search.somedb.com:8080 |
path | /history |
query | |
fragment | #?modern |
URLs 1 and 2 are functionally equivalent even though URL 1 uses the default port and URL 2 uses the default path.
URLs 1, 2 and 3 all use the same origin, even though 1 and 3 depend on the default port, whereas 2 has an explicit port, and 3 has a path other than the default.
URLs 3 and 4 are not functionally equivalent as they use different schemes.
URLs 5 and 6 are not functionally equivalent as they use different hostnames.
URL 7 does not have a query since the first question mark (?) appears after the first hash (#).
In general, EZproxy ignores the path, query and fragment when making proxying decisions. These details are only used when generating the URLs shown in the default menu page and the server status page.
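The decomposition used in the table above can be reproduced with a short Python sketch. This is only an illustration built on the standard library's urlsplit; the helper names origin and describe and the DEFAULT_PORTS table are assumptions made for this example and are not part of EZproxy.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed on this page.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return scheme://hostname:port, filling in the default port if absent."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return f"{parts.scheme}://{parts.hostname}:{port}"

def describe(url):
    """Split a URL into the terms used in the table above.

    Note: urlsplit returns the query without its leading "?" and the
    fragment without its leading "#", unlike the table.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        "port": parts.port or DEFAULT_PORTS[parts.scheme],
        "origin": origin(url),
        "path": parts.path or "/",   # an empty path means the default path
        "query": parts.query,
        "fragment": parts.fragment,
    }

if __name__ == "__main__":
    for url in (
        "http://www.somedb.com/",
        "http://www.somedb.com:80",
        "https://www.somedb.com/search?q=ancient",
        "http://search.somedb.com:8080/history#?modern",
    ):
        print(url, "->", describe(url))
```

Because only the origin matters for proxying decisions, two URLs can be compared simply by comparing the values returned by origin.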
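Users are routed to specific databases using starting point URLs. A starting point URL is formed by appending the destination URL to the /login?url= address of your EZproxy server. In the examples that follow, http://www.somedb.com/index.html is the destination URL to which the user should be proxied.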
When processing a starting point URL, EZproxy decides whether or not to allow access by taking the origin of the request URL (e.g., http://www.somedb.com:80) and trying to find a URL, Host, or HostJavascript (HJ) directive with the identical origin. The Domain and DomainJavascript (DJ) directives are not directly involved in this processing. Any of the following directives in config.txt would be considered a match that authorizes the starting point URL, since they share the origin http://www.somedb.com:80 with the starting point URL.
All three of these URL directives would authorize http://www.somedb.com/index.html for access, since all of them have the origin http://www.somedb.com:80, even though they have different paths.
Host and HostJavascript directives default to the http scheme, making the two Host directives equivalent and the two HostJavascript directives equivalent.
There is an exception in which Domain and DomainJavascript (DJ) directives are indirectly involved in authorizing hostnames for use in starting point URLs, but this occurs by a fluke that is discussed at the end of this page. The recommended method for configuring EZproxy is to keep in mind that the origin of any starting point URL must match the origin of a URL, Host, or HostJavascript line.
The previous examples all assume that the destination URLs are of the form http://www.somedb.com, using http:// at the start and no port at the end. If a destination URL uses https:// or includes a non-default port number, then it will only match a URL, Host, or HostJavascript (HJ) line with the same information. For instance:
has the origin http://www.somedb.com:8080 which does NOT match the origin of any of the previous examples, but would match:
Likewise, the starting point URL:
has the origin https://www.somedb.com:443 which does NOT match any of the previous examples, but would match:
Note in these examples that there is no simple "Host www.somedb.com" style of entry as that basic form of entry defaults to http, not https.
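The origin-matching rule for starting point URLs can be sketched in Python as follows. This is a simplified model of the behavior described above, not EZproxy's actual implementation: the function names are hypothetical, directive options are not handled, and only the URL, Host, HostJavascript, and HJ keywords named on this page are recognized.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

# Directives that can authorize a starting point URL, per the text above.
ORIGIN_DIRECTIVES = {"url", "host", "hostjavascript", "hj"}

def origin(url):
    """Return (scheme, hostname, port) with the default port filled in."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme])

def directive_origin(line):
    """Origin of a URL/Host/HJ config line, or None for any other line."""
    keyword, _, value = line.strip().partition(" ")
    if keyword.lower() not in ORIGIN_DIRECTIVES:
        return None
    value = value.strip()
    if "://" not in value:
        value = "http://" + value  # bare hostnames default to http
    return origin(value)

def authorizes_starting_point(config_lines, destination):
    """True if any line has exactly the same origin as the destination URL."""
    target = origin(destination)
    return any(directive_origin(line) == target for line in config_lines)

if __name__ == "__main__":
    lines = ["URL http://www.somedb.com/search", "Domain somedb.com"]
    print(authorizes_starting_point(lines, "http://www.somedb.com/index.html"))
    print(authorizes_starting_point(lines, "https://www.somedb.com/index.html"))
    print(authorizes_starting_point(lines, "http://www.somedb.com:8080/"))
```

In this model the path of the directive is irrelevant, a Domain line never authorizes a starting point URL, and a destination that uses https or port 8080 needs a directive that states the same scheme and port.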
Starting point URLs inject the user into the proxying process. Once a user starts requesting web pages through EZproxy, EZproxy will start retrieving and processing web pages from remote servers. As a web page is retrieved, EZproxy must decide whether or not to rewrite web page links that it encounters.
As EZproxy encounters each URL, it will choose to proxy that URL if it matches under the starting point URL logic described above. In addition, EZproxy will consider Domain and DomainJavascript (DJ) directives. When attempting to match a Domain or DomainJavascript (DJ) line, EZproxy ignores the scheme (http:// or https://) and port. EZproxy considers a hostname to match a Domain or DomainJavascript (DJ) directive if the hostname matches the domain name or ends in the domain name. For instance, if EZproxy encounters:
this would not origin match any of the previous URL, Host, or HostJavascript directives, since http://www.history.somedb.com:80 does not exactly match the origin of any of those directives, but it would match:
since in each of these examples, the hostname www.history.somedb.com either matches the specified domain exactly or ends with a period followed by one of the specified domains.
Domain and DomainJavascript (DJ) directives allow EZproxy to automatically proxy all of the additional hosts that are used by a database vendor without requiring you to predict all the hostnames that might be encountered.
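The hostname test for Domain and DomainJavascript (DJ) lines can be sketched as below. The function name matches_domain is an assumption for illustration, and the sketch follows the rule as stated on this page (exact match, or ending in a period followed by the domain); it does not cover the asterisk wildcard used for IP addresses.

```python
from urllib.parse import urlsplit

def matches_domain(hostname, domain):
    """True if hostname is the domain itself or a host beneath it."""
    hostname = hostname.lower()
    domain = domain.lower()
    return hostname == domain or hostname.endswith("." + domain)

def link_matches_domain(url, domain):
    """Apply the test to a link found in a page; scheme and port are ignored."""
    return matches_domain(urlsplit(url).hostname, domain)

if __name__ == "__main__":
    link = "http://www.history.somedb.com/"
    for domain in ("somedb.com", "history.somedb.com", "www.history.somedb.com"):
        print(domain, link_matches_domain(link, domain))
    # A hostname that merely ends in the same letters is not a match.
    print(matches_domain("notsomedb.com", "somedb.com"))
```

Requiring the period before the domain keeps an unrelated hostname such as notsomedb.com from matching a line written for somedb.com.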
Some vendors use IP addresses as hostnames. In such an instance, the rules for URL, Host, and HostJavascript are exactly the same, using an exact match. To match a series of IP addresses with a Domain or DomainJavascript (DJ) directive, you must introduce an asterisk wildcard, such as:
The HostJavascript (HJ) and DomainJavascript (DJ) directives indicate that when EZproxy is proxying a web page from a matching server, additional JavaScript processing should be performed. For example:
Title Some Database
URL http://www.somedb.com
DJ somedb.com
indicates that all hosts that end with or are somedb.com should have additional JavaScript processing performed. In a mixture of JavaScript and non-JavaScript directives, the JavaScript directives take priority. For example, in:
Title Some Database
URL http://www.somedb.com
Host search.somedb.com
DJ somedb.com
when search.somedb.com is proxied, JavaScript processing will be enabled since its name matches the "DJ somedb.com" directive.
When developing a database stanza, the recommendation is to start with the normal form of the directives. If you see the user slipping away from proxying, try the JavaScript counterparts to see if they resolve the issue.
In some instances, a particular hostname may match multiple database stanzas in config.txt. For instance, consider http://search.somedb.com against these entries:
Since http://search.somedb.com has an origin match to the second URL line, it can be used in a starting point URL.
Since http://search.somedb.com matches the second URL directive, the Domain directive, and the DJ directive, it would be rewritten if encountered in a web page.
Since EZproxy bases its proxying behavior on the very first database stanza that matches via URL, Host, HostJavascript, Domain, or DomainJavascript directives, ignoring subsequent stanzas, the first stanza controls proxying behavior. As such, the proxying of http://search.somedb.com will NOT have additional JavaScript processing, with the subsequent "DJ somedb.com" effectively ignored due to the earlier stanza.
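The first-match rule can be illustrated with a small Python model. Everything here is an assumption made for the sketch: the stanza contents are hypothetical stand-ins chosen to fit the description above (a plain Domain line in the first stanza, a DJ line in the second), and the function names are not EZproxy's own.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return (scheme, hostname, port); bare hostnames default to http."""
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme])

def directive_matches(keyword, value, url):
    """Origin match for URL/Host/HJ lines, hostname match for Domain/DJ lines."""
    if keyword in ("URL", "Host", "HJ"):
        return origin(value) == origin(url)
    if keyword in ("Domain", "DJ"):
        host = origin(url)[1]
        return host == value or host.endswith("." + value)
    return False

def controlling_stanza(stanzas, url):
    """Return the first stanza, in config order, with any matching directive."""
    for stanza in stanzas:
        if any(directive_matches(k, v, url) for k, v in stanza["directives"]):
            return stanza
    return None

def javascript_processing(stanzas, url):
    """True only if the controlling stanza matches the URL through HJ or DJ."""
    stanza = controlling_stanza(stanzas, url)
    if stanza is None:
        return False
    return any(k in ("HJ", "DJ") and directive_matches(k, v, url)
               for k, v in stanza["directives"])

# Hypothetical stanzas, listed in the order they would appear in config.txt.
STANZAS = [
    {"title": "Some Database",
     "directives": [("URL", "http://www.somedb.com"), ("Domain", "somedb.com")]},
    {"title": "Some Database Search",
     "directives": [("URL", "http://search.somedb.com"), ("DJ", "somedb.com")]},
]

if __name__ == "__main__":
    url = "http://search.somedb.com"
    print(controlling_stanza(STANZAS, url)["title"])
    print(javascript_processing(STANZAS, url))
    print(javascript_processing(list(reversed(STANZAS)), url))
```

In this model, swapping the order of the two stanzas changes the outcome, which is why stanza order in config.txt matters when directives overlap.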
Consider the database stanza:
By everything discussed thus far, this stanza would allow this URL to work:
but would cause this URL to fail:
since there is no URL, Host, or HostJavascript (HJ) line that matches the origin http://search.somedb.com:80. Yet, in practice, you will encounter scenarios where this will appear to work correctly.
This happens when one of your users starts out at http://www.somedb.com. As EZproxy retrieves that page, it encounters a link to http://search.somedb.com. At that point, search.somedb.com matches the Domain line, so EZproxy proxies the link and creates a virtual web server for http://search.somedb.com. Once this happens, EZproxy will accept a starting point URL to http://search.somedb.com.
In this instance, http://search.somedb.com is working only as a side effect. It is a bad idea to depend on this type of behavior, as EZproxy can discard that information over time, so links that work one day may stop working the next if the ezproxy.hst file is reset.
As a result, any URL that appears in a starting point URL should always origin match with a URL, Host, or HostJavascript (HJ) directive.
If you have any questions, comments or suggestions, from the smallest typo to the biggest problem, please send them to info@UsefulUtilities.com.