LASP version 1.0


Lightweight ACLs for the Squid Proxy
LASP is a simple "helper" application for the Squid proxy that provides fast matching of URLs against an Access Control List (ACL). It uses a simple language to define the hosts, domains, URLs and substrings to be blocked or allowed. It also implements a simple priority-level scheme that controls the order in which rules are searched for a match.

The LASP suite consists of three applications:

  • lasp - the redirector for Squid
  • lasp_check - an application that helps optimize rule sets by comparing rules and reporting any that are repeated.
  • lasp_cgi - an example CGI application (for Apache) to be used as the target for redirected bad requests. It is intended as a quick template that you can customize.

Supported Platforms
LASP is known to compile and run on the following platforms:
  • RedHat Linux 8.0 and Enterprise Server (tested in a production environment)
  • FreeBSD 5.1 (tested in a production environment)
  • Solaris 2.8 (development environment)
  • Windows Services for UNIX on Windows XP (should work with Cygwin too)
  • NetBSD 1.5 (tested outside of Squid)

Quick Start
To get started quickly, read the Compiling and Installation sections below. The default rules files will probably be good enough for most sites.


Compiling
LASP should compile out of the box on FreeBSD, Solaris and RedHat Linux. Before building, however, please read all of these instructions. In addition, if you decide to use the lasp_cgi application, you will need to edit lasp_cgi.c to use locations and filenames that exist on your system (see line 12 in src/lasp_cgi.c).

Unpack the LASP_1.0.tar.gz file in a suitable place to build it. Assuming you want to install LASP into /usr/local, run the following commands. To install in another location, change the target location in the second command below.

% cd LASP_1.0
% ./configure -prefix=/usr/local/
% make

Installation
NOTE: If you are planning to install LASP on an already running proxy server, it is STRONGLY recommended that you practice this setup on a separate test instance before deploying it on your live server. Also remember to keep backups of your squid.conf (and other configuration) files in case you need to restore them.

Installing LASP requires a little work by hand to add its configuration to the Squid proxy; however, most of the installation is automated.

% make install

If you are using the example lasp_cgi unmodified, copy the ERR_ACCESS_DENIED file to the place you configured for it. Typically this is /var/www/html. You will also need to copy bin/lasp_cgi to your web server's cgi_bin directory.

You now need to modify your lasp.conf file, particularly to change the target URL for failed matches. See below for more extensive instructions on this. For now, however, all you need to do is modify the final line of the file so that the URL points to a valid target (normally this is the CGI script mentioned above, but it can be any URL).
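
As an illustration of what such a target can do, here is a minimal Python sketch of a block page. It is not the shipped lasp_cgi.c. It assumes the blocked URL arrives as the raw query string, as in the redirect shown in the Installation Check transcript (http://somehost.domain.com/cgi_bin/lasp_cgi?http://www.sex.com/). The function name and the page wording are hypothetical.

```python
import html


def render_block_page(query_string):
    """Build a CGI response naming the blocked URL.

    Assumption: the blocked URL is passed as the whole query string.
    """
    # Escape the URL so it cannot inject markup into the page.
    blocked = html.escape(query_string or "(unknown URL)")
    return (
        "Content-Type: text/html\r\n\r\n"
        "<html><body><h1>Access denied</h1>"
        "<p>The proxy blocked: " + blocked + "</p></body></html>"
    )
```

A web server running this as a CGI script would pass the QUERY_STRING environment variable to render_block_page and write the result to standard output.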

Installation Check
You should now check that lasp is working by running it from the command line. Once it has started, you can enter requests manually to make sure blocking works. If you are using the default rules provided as an example, the following is a good test. In the transcript below, each line that starts with a URL and ends in "aaa aaa" is user input, and the line that follows it is the response from LASP (an empty response means the URL was not blocked):
# /usr/local/bin/lasp
http://www.theregiester.co.uk/ 1.2.3.4/- aaa aaa

http://www.sex.com/ 1.2.3.4/- aaa aaa
http://somehost.domain.com/cgi_bin/lasp_cgi?http://www.sex.com/

You can exit LASP by pressing ctrl-c or ctrl-d once you have verified that things are working. You should also check the log files to make sure that LASP is logging the requests correctly. For the default installation these are:
  • /usr/local/lasp/logs/lasp.log - General information, including startup messages, URLs accessed and other minor debug information.
  • /usr/local/lasp/logs/lasp_block.log - List of blocked URLs (providing: date/time, host, URL and rule that caused it to be blocked)
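
The exchange in the transcript above follows Squid's classic redirector interface: each request line carries the URL, the client address as ip/fqdn, an ident field and a method, and the reply is either an empty line (leave the URL alone) or a replacement URL. The Python sketch below illustrates that exchange only. It is not LASP's implementation; the function names and the is_blocked callback are hypothetical.

```python
def parse_request(line):
    """Split one request line into its four whitespace-separated fields."""
    url, client, ident, method = line.split()[:4]
    ip, _, fqdn = client.partition("/")
    return {"url": url, "ip": ip, "fqdn": fqdn, "ident": ident, "method": method}


def respond(line, is_blocked, target):
    """Return the reply: an empty string to allow, a redirect URL to block."""
    request = parse_request(line)
    if is_blocked(request["url"]):
        # Assumption: the blocked URL is appended to the target as a query string.
        return target + "?" + request["url"]
    return ""


def serve(lines, out, is_blocked, target):
    """Answer each request line in turn, flushing so the proxy never waits."""
    for line in lines:
        if line.strip():
            out.write(respond(line, is_blocked, target) + "\n")
            out.flush()
```

Calling serve(sys.stdin, sys.stdout, ...) with a suitable is_blocked function would give an interactive session similar to the one above.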

Hooking into Squid
In order to hook LASP into Squid, you will need to modify your squid.conf file. Assuming you installed LASP into /usr/local/bin, add or modify the following lines in squid.conf:
redirect_program /usr/local/bin/lasp
redirect_children 10

You may need to experiment with the redirect_children setting. If your proxy is giving warnings about pending queued requests, you should set this number higher. Typically 10-20 has been found to work well.

If you have already been using ACLs with the Squid native ACL handling, you will need to remove them from the squid.conf file. Don't forget, you might also need to add or modify:

http_access allow all

Finishing Up the Install
You can now restart your proxy. Normally you can do this by using:
# squid -k reconfigure

You can now test a few URLs to make sure it's all working.

Using the lasp.conf file
The lasp.conf file is the file that lasp reads to set up its main preferences. It is typically located at /usr/local/lasp/lasp.conf. The file contains a number of directives. With the exception of the rulesfile directive, if a directive appears more than once in the file, its last occurrence takes precedence.

lasp.conf directives
The following directives are understood by lasp in the lasp.conf file.

logfile <filename>
The name of the logfile to which all informational logging is written. Typically this is /usr/local/lasp/logs/lasp.log. Most of the information in this file can already be found in the Squid access.log file, so its main use is for debugging. You may want to change this setting to /dev/null to turn off the basic logging.

blockfile <filename>
The name of the file to which the list of blocked requests is written. This file will contain a line giving the date and time, source IP, URL and the rule location for each blocked request. Generally this points to /usr/local/lasp/logs/lasp_block.log. If you do not want this information, set it to /dev/null to turn off this logging feature.

highpriority <number>
lowpriority <number>
These define the High and Low priority levels you are planning to use. Typically rules are defined in the range 1 (high) through 9 (low). Lasp will always take a lower number as a higher priority when processing rules files. In most cases the default values in the provided lasp.conf file will be sufficient.

acldefault <allow|deny>
This specifies the default behaviour for a URL that doesn't match any rule in the rules files. If set to allow, all URLs that do not match will be permitted by lasp and thus by Squid. If set to deny, all non-matched URLs will be blocked.

rulesfile <filename>
This specifies a file to load containing the rules for URLs to block or allow. See below for the syntax of these files. You may specify multiple rules files in the lasp.conf file if you wish to divide your rule sets up in a more logical manner. Lasp will read the rules in the order in which you specify these files.

One final note on the lasp.conf file
The rule verification program, lasp_check, also uses the lasp.conf file to determine which rule files to load. See the section below on the lasp_check application.
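
The precedence behaviour described above (the last occurrence of a directive wins, while every rulesfile line is kept in order) can be sketched in Python as follows. This is only an illustration and not LASP's own parser. It assumes each line is a directive name followed by whitespace and a value, and the function name is hypothetical.

```python
def parse_lasp_conf(text):
    """Parse 'directive value' lines; last value wins, except rulesfile."""
    conf = {"rulesfile": []}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or incomplete lines
        name, value = parts[0], parts[1].strip()
        if name == "rulesfile":
            conf["rulesfile"].append(value)  # every rules file is kept, in order
        else:
            conf[name] = value  # a later line overrides an earlier one
    return conf
```

Given two logfile lines and two rulesfile lines, the result holds only the second logfile value but both rules files, in the order they were listed.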

Writing ACL files
The ACLs used by lasp are stored in their own configuration files. There may be many of these files or just one. Using multiple files is recommended because it makes maintenance easier, but for a small number of entries you may prefer to use a single file.

The structure of the file is fairly straightforward:

<priority> <access> <command> <string>

priority defines the priority level for the rule. Rules are searched by priority level (with a lower number being searched before a higher number) and then in the order they are defined. That is, a rule at level 2 would be searched before a rule at level 5, but whether it is searched before or after other rules at the same level depends upon the order they appear in the configuration files.

access defines whether to allow or deny access via this rule. If access is denied, the URL is blocked; if it is allowed, the URL is permitted. Allowing access is helpful for unblocking URLs that would otherwise be blocked by subsequent rules.

command is the type of rule that this is:

  • host - uses the host name in the URL
  • domain - matches the domain name (i.e. everything after the first '.' in the URL's host name)
  • substr - matches if the supplied string occurs as a substring of the URL
  • URL - matches the entire URL

string is the keyword or string that you wish to match against.
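
To make the search order concrete, here is a rough Python sketch of how rules with these four fields could be evaluated. It is not LASP's implementation. It assumes the fields are whitespace-separated in the order priority, access, command, string, that the access keywords are allow and deny, and that host, domain and URL rules use exact comparison; all function names are hypothetical.

```python
from urllib.parse import urlsplit


def load_rules(lines):
    """Parse rule lines and order them by priority (lower number first)."""
    rules = []
    for line in lines:
        fields = line.split(None, 3)
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        priority, access, command, string = fields
        rules.append((int(priority), access, command, string.strip()))
    # sorted() is stable, so rules at the same level keep their file order.
    return sorted(rules, key=lambda rule: rule[0])


def rule_matches(command, string, url):
    """Apply one rule type to a URL."""
    host = urlsplit(url).hostname or ""
    if command == "host":
        return host == string
    if command == "domain":
        return host.partition(".")[2] == string  # text after the first '.'
    if command == "substr":
        return string in url
    if command.lower() == "url":
        return url == string
    return False


def decide(rules, url, acldefault="allow"):
    """Return the access of the first matching rule, else the default."""
    for _priority, access, command, string in rules:
        if rule_matches(command, string, url):
            return access
    return acldefault
```

With the hypothetical rules "5 deny domain example.com" and "2 allow host www.example.com", a request for www.example.com is allowed because the level 2 rule is searched first, while other hosts under example.com are denied.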

Examples
The following rules are included as examples. They may or may not be useful for your location:
THIS SECTION IS STILL UNDER DEVELOPMENT


The lasp_check application
The lasp_check application can be used to optimize your rule sets. It uses the same configuration files as lasp (lasp.conf and the specified rule files). The application reads in the rules one at a time and searches the previously read rules for a match. If a match is found, the two matching rules are printed on the console (including their file names and line numbers), allowing the user to build a list of duplicate rules and modify the rules files accordingly.
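
The read-and-compare loop described above can be sketched in Python as follows. This simplified version only detects exact repeats of the same command and string, whereas the real lasp_check may treat more cases as a match. It assumes whitespace-separated rule lines with the fields priority, access, command, string, and the function name is hypothetical.

```python
def find_duplicates(files):
    """Report repeated rules across files.

    files is a list of (filename, list_of_lines) pairs. Returns a list of
    ((first_file, first_line), (repeat_file, repeat_line)) pairs.
    """
    seen = {}
    duplicates = []
    for filename, lines in files:
        for lineno, line in enumerate(lines, start=1):
            fields = line.split(None, 3)
            if len(fields) != 4:
                continue  # skip blank or malformed lines
            key = (fields[2], fields[3].strip())  # command and string
            here = (filename, lineno)
            if key in seen:
                duplicates.append((seen[key], here))
            else:
                seen[key] = here
    return duplicates
```

Each returned pair gives the file name and line number of the earlier rule and of its repeat, which is the information needed to tidy the rules files by hand.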


To Do List for Subsequent releases
This list represents features that we would like to see in future releases:
  • Better handling of the "domain" rule, using reverse rather than forward searching
  • Better internal hashing routines. Right now the hash routine is fast but does not produce a unique value for each ACL entry. A fast, unique hash scheme would increase performance for about 90% of the ACL tests.
  • Possible inclusion of a real regexp engine if a fast enough one can be found. For now, however, you can still get regexp blocking by using the Squid ACL mechanism in conjunction with LASP.
  • Documentation improvements. Break this document up into 2 or 3 documents and improve the installation instructions.
  • Regular expression matches rather than substrings if a fast enough RegExp engine could be found
  • Rewrite the configuration code and improve startup time for large numbers of ACLs (right now we are at about 20,000 ACLs per second on startup).
  • Internal statistics dump (how often is an ACL entry hit, etc)
  • Logfile rotation
  • Better lasp_check application that can do a more intelligent parse of the rule files and write them back out in a more optimal form. In addition, the lasp_check program currently optimizes on the fly so it does not easily allow for the case of higher priority rules being specified after lower priority ones.

LASP_1.0 version c - lasp_dev@nc.com