You can get the latest version of FilterProxy, along with some instructions, at
http://draal.physics.wisc.edu/FilterProxy/
FilterProxy is a personal filtering proxy. It is unique in that it allows
"Modules" to be installed that can perform arbitrary transformations on HTML
(or any other MIME type). Currently it filters ads by rewriting HTML,
compresses HTML content (for a roughly 5:1 speedup on modems!), and
de-animates animated GIFs. Configuration is done with web forms.
Modules currently supplied and tested are:
* Rewrite: allows removal and modification of arbitrary parts of an
  HTML file using a configurable set of 'rules'.
* XSLT: Extensible Stylesheet Language Transformations. XSLT is a W3C
  recommendation: a language for transforming XML documents into
  other XML documents.
* Header: can strip or add headers by regex.
* Compress: uses gzip to compress HTML. (This gives a 4-5 times speed
  improvement for HTML; your browser uncompresses it.)
* DeAnim: de-animates animated GIFs, and removes other "extension
  blocks", which often reduces the size of GIFs.
* Skeleton: a barebones, heavily commented module for people wanting
  to write new modules. See the TODO file for a list of work that
  needs to be done to extend this program.
* ImageComp: a module which uses ImageMagick to recompress various image
  formats to reduce their size. (INCOMPLETE - Volunteers needed)
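To get a rough feel for the kind of size reduction the Compress module aims
at, you can gzip some HTML by hand. The sketch below is not part of
FilterProxy. It generates a small, repetitive HTML table (made-up content;
real pages will compress differently) and compares the raw and compressed
sizes using the standard gzip tool.

```shell
# Build a small, repetitive HTML sample (made-up content, for illustration only).
sample=$(mktemp)
{
    echo '<html><body><table>'
    i=0
    while [ "$i" -lt 200 ]; do
        printf '<tr><td class="item">Row %d</td><td><a href="/page/%d">link</a></td></tr>\n' "$i" "$i"
        i=$((i + 1))
    done
    echo '</table></body></html>'
} > "$sample"

# Compare the raw size with the gzip-compressed size.
orig_size=$(($(wc -c < "$sample")))
gz_size=$(($(gzip -c "$sample" | wc -c)))
echo "original: $orig_size bytes, gzipped: $gz_size bytes"

rm -f "$sample"
```

The ratio you see depends heavily on the input, so treat the output as an
illustration rather than a benchmark of the module.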
Where to run FilterProxy:
-------------------------
There are two basic ways to run FilterProxy.
The first is where FilterProxy is running on the same machine you are
browsing from, and that machine is connected to the net via a slow interface
(e.g. a modem). In this case it makes sense to use the following modules:
    Rewrite
    XSLT
    Mirror (when/if written)
I also suggest enabling "localhost only" in this mode (for security).
The second is where FilterProxy is running on a computer with a relatively
fast connection, and you are using it from a different computer, over a modem.
In this case it makes sense to use the following modules:
    Compress (this will give you a ~5x speed improvement for HTML!)
    Rewrite
    XSLT
A variation is where FilterProxy is running on a computer with a fast
connection, and you are browsing from the same computer. This is basically
the same as the first case above. Again, using "localhost only" is
recommended.
Make sure to install FilterProxy on a relatively fast computer. Don't
put it on your OpenBSD firewall that's got a Pentium 90 in it. Parsing
and filtering HTML is a computationally intensive task, and requires
a reasonable amount of CPU. On my 533 MHz Alpha, most pages get filtered
in under 0.5 seconds. On an 800 MHz Athlon I have access to, most
pages get filtered in under 0.2 seconds. But on an older computer
it could take many seconds, introducing a noticeable delay. (This applies
only to HTML; images are usually very fast.)
If you're installing from the RPM, FilterProxy will install itself in
/home/filterproxy, create a user for itself, and create an init script,
/etc/rc.d/init.d/filterproxy. If you wish to start FilterProxy on bootup, you
should create a link to this script from /etc/rc5.d/ (or whatever your default
runlevel directory is).
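Creating that link might look like the sketch below. The helper name
link_init_script and the link name S99filterproxy are assumptions: the
S<number><name> pattern follows the usual SysV init convention for start
links, and the priority of 99 (start late) is a guess, not something the RPM
specifies. Run it as root and adjust the runlevel directory for your system.

```shell
# link_init_script INIT_SCRIPT RUNLEVEL_DIR [PRIORITY]
# Creates a SysV-style start link (S<priority>filterproxy) in RUNLEVEL_DIR.
# The default priority of 99 is an assumption, not taken from the RPM.
link_init_script() {
    init_script=$1
    runlevel_dir=$2
    priority=${3:-99}
    ln -s "$init_script" "$runlevel_dir/S${priority}filterproxy"
}

# Typical use, as root (paths from the text above):
#   link_init_script /etc/rc.d/init.d/filterproxy /etc/rc5.d
```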
FilterProxy also supports the following command line options:
# FilterProxy.pl -h
Options recognized by FilterProxy:
    -h          Print this help message
    -k          Kill an already running copy of FilterProxy
    -f <file>   Specify an alternate config file
                (default is `pwd`/FilterProxy.conf)
    -p <port>   Specify the port to which FilterProxy will bind
                (default is 8888)
    -n          Do not daemonize: stay connected to the terminal from which
                it was started and print debugging messages.
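As an illustration of combining these options, here is a small wrapper that
stops any running copy and starts a fresh one with an explicit config file
and port. The function name restart_filterproxy and the example paths are
hypothetical. Running -k as a separate invocation, and ignoring its exit
status, are assumptions about how the option behaves.

```shell
# restart_filterproxy PROGRAM CONFIG [PORT]
# Stops any running copy with -k, then starts a new one with -f and -p.
restart_filterproxy() {
    program=$1
    config=$2
    port=${3:-8888}     # 8888 is the documented default port

    # Assumption: -k may exit non-zero if nothing was running, so ignore it.
    "$program" -k || true
    "$program" -f "$config" -p "$port"
}

# Example (paths are placeholders):
#   restart_filterproxy ./FilterProxy.pl /home/filterproxy/FilterProxy.conf 8080
```

Add -n to the second invocation if you want FilterProxy to stay attached to
the terminal and print debugging messages instead of daemonizing.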
If you wish to use *another* proxy in addition to FilterProxy, you may
set the environment variable http_proxy to point to the other proxy.
It is also possible to set this from the CGI config page, FilterProxy.html.
For instance, if your ISP runs a caching proxy, set something like:
    # setenv http_proxy http://your.isp.here:1234                (csh syntax)
    # http_proxy=http://your.isp.here:1234; export http_proxy    (sh syntax)
where 1234 is the port your other proxy runs on, and your.isp.here is
the hostname or IP address of the proxy. (I have not tested this very well,
but I have reports that it works as of ~0.15.) If the upstream proxy requires
authentication, this information can be entered on the main FilterProxy
config page. (This only works with Basic authentication right now.)
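In Bourne-style shells, a plain assignment is only passed on to programs
started afterwards if the variable is exported. The sketch below shows this
with a throwaway child sh standing in for FilterProxy; the proxy address is
the placeholder used above, not a real host.

```shell
# Placeholder proxy address from the text above.
http_proxy=http://your.isp.here:1234
export http_proxy

# A child process (here a throwaway sh, standing in for FilterProxy) sees it.
seen_by_child=$(sh -c 'printf %s "$http_proxy"')
echo "child sees: $seen_by_child"

# Alternatively, set it for a single command only:
#   http_proxy=http://your.isp.here:1234 ./FilterProxy.pl
```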
The reason I wrote FilterProxy is to fix some problems with the web (in
general) and brain-dead web-site designers (specifically). Modules that
I would like to see in the future:
    Cookie      Filter cookies by server (i.e. do not send any cookies to
                ad servers, while still allowing cookies for other
                sites). Allow for sophisticated cookie management
                (check out HTTP::Cookies).
    Anchorizer  Wrap identifiable URLs in a web page in <a href="...">
                tags when they are not already links.
    Clean       Clean up HTML (specifically, remove MS's attempts at
                redefining ASCII by adding forward and back quotes,
                which appear on many browsers as '?') (use HTML::Clean