From Mandriva Community Wiki

POPFile Filtering

The installation instructions on the POPFile site are written for Windows® users. The instructions below are for Linux users.

These instructions will cause POPFile to add "[spam]" to the subject line of every incoming message that it classifies as spam. Genuine e-mail (or "ham") will remain unaltered. There are other ways to mark incoming spam, but this method works with any e-mail client that can filter on the subject line.


Install and run POPFile - original instructions

To install the latest version of POPFile, you'll need to have the perl-BerkeleyDB rpm already installed; it can be found on the "contrib" mirrors for your Mandriva Linux release level. If you already have a "contrib" mirror defined for urpmi/rpmdrake, installing it is as easy as running (as root):

urpmi perl-BerkeleyDB

See the instructions on setting up a contrib source.

  1. As root, make a home for POPFile: mkdir /usr/local/bin/popfile
  2. cd to your new directory and download POPFile: wget http://aleron.dl.sourceforge.net/sourceforge/popfile/popfile-0.20.1.zip
  3. Unzip it: unzip popfile-0.20.1.zip
  4. Correct the permissions if need be: chmod 755 popfile.pl
  5. Now run it: ./popfile.pl
  6. To run POPFile automatically at boot time, add this to /etc/rc.d/rc.local:
cd /usr/local/bin/popfile; ./popfile.pl
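
Step 6 can be applied more than once without duplicating the entry. The following is a minimal sketch, assuming a POSIX shell; the helper name ensure_line is hypothetical and is not part of POPFile or Mandriva. It appends a line to a file only when an identical line is not already there.

```shell
# ensure_line FILE LINE
# Hypothetical helper: append LINE to FILE unless an identical line exists.
ensure_line() {
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (as root), using the line from step 6:
# ensure_line /etc/rc.d/rc.local 'cd /usr/local/bin/popfile; ./popfile.pl'
```

The sketch assumes the file already ends with a newline; if it does not, the appended text would be joined to the last existing line.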


Configure POPFile

  1. Open any Web browser and point it to http://localhost:8080
  2. Click the Buckets tab. In the bottom left corner, use the Create field and button to identify at least two e-mail categories: spam and ham ("ham" being genuine e-mail).
  3. Click the Magnets tab and make simple rules to identify e-mails that need not be analyzed such as the e-mail addresses of friends and relatives, and the subject lines of list messages.
  4. Click the Configuration tab. In the E-Mail Text Insertion box, see that it's turned on for the subject, and off for the other two.

Note: Step #4 above will cause the subject line of each message that POPFile classifies to be modified, inserting "[ham] " or "[spam] " at the beginning of that message's subject. While this is necessary for e-mail clients that cannot sort on specific headers (e.g. Outlook Express), many users will be using more sophisticated mail clients and will find this behavior unpleasant, as it will also affect the subject lines of any replies they make to these messages. For this reason, if you are using a clueful client that can filter on the "X-Text-Classification:" header (e.g. Kmail, Evolution, mutt, Pine; in fact, most Linux and many Win32 clients), you should probably turn the first choice here off, and the second one on.
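
To illustrate what filtering on the "X-Text-Classification:" header means, here is a small sketch that reads one message on standard input and prints the bucket name found in that header. The function name classification_of is hypothetical, and the sketch assumes the header appears on a single line in the header block; it is not how any particular mail client implements its filters.

```shell
# classification_of: read one message on stdin and print the value of its
# X-Text-Classification header (prints nothing if the header is absent).
classification_of() {
    awk '
        /^\r?$/ { exit }                        # first blank line ends the headers
        tolower($0) ~ /^x-text-classification:/ {
            sub(/^[^:]*:[ \t]*/, "")            # drop the header name and colon
            sub(/[ \t\r]+$/, "")                # drop trailing whitespace or CR
            print
            exit
        }
    '
}

# Example:
# if [ "$(classification_of < message.eml)" = "spam" ]; then echo "spam"; fi
```

Only the header block is inspected, so a matching line quoted in the message body is ignored.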


Configure your e-mail browser

  1. Within your e-mail browser, make a spam folder, and then make a new filter rule that moves anything with "[spam]" in the subject line to the new spam folder. If you're using the "X-Text-Classification:" header, have your rule filter on the value of that header instead (see Note above).
  2. Open the dialog that lets you change your account settings.
  3. Alter the login account name by prefixing it with your ISP's POP server and a colon. For example, change "robin" to "pop.mailbox.com:robin".
  4. Change the POP/receiving server from your ISP's address to "127.0.0.1". Save the new configuration--you're now ready to go.


Training POPFile

POPFile must be trained to tell spam from ham. For the first day, train it every few hours; once a day should be fine thereafter.

  1. Open your Web browser and point it to http://localhost:8080.
  2. Click the History tab.
  3. If you spot unclassified or mis-classified messages, select the correct category using that message's pulldown menu. When you're done classifying and re-classifying all the messages on the page, click "Reclassify" at the top or bottom of that column.
  4. Once all the messages on the page are correctly classified, you can delete them by clicking Remove Page.

POPFile will only store a limited amount of e-mail, so don't worry about it filling your hard drive. You can change the number of storage days under the Configuration tab.


Using upgrade to change to an orthodox install

The latest version (as of February 2005) is 0.22.2, which now uses an SQLite database. The documentation is much improved: although much of it is still devoted to Windows installations, the cross-platform version is now reasonably well documented. The instructions at http://popfile.sourceforge.net/cgi-bin/wiki.pl?HowTos/Mandrake should work for a new install. However, if, like me, you followed the previous set of instructions, you have what the developers consider an unorthodox install, and upgrading according to the POPFile instructions will require some additional steps. What follows should put everything where the developers expect to find it, and will also preserve and import your corpus (the information about your buckets and the parameters used for classification).


Prepare

IMPORTANT: Before starting an upgrade, back up your popfile.cfg, popfile.db and stopwords files.
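
One way to take that backup is to bundle the three files into a single archive. This is a minimal sketch, assuming tar with gzip support is available; the function name backup_popfile and the archive path in the example are hypothetical.

```shell
# backup_popfile SRC_DIR ARCHIVE
# Archive popfile.cfg, popfile.db and stopwords from SRC_DIR into ARCHIVE.
# Fails without writing anything if one of the three files is missing.
backup_popfile() {
    src=$1
    archive=$2
    for f in popfile.cfg popfile.db stopwords; do
        [ -e "$src/$f" ] || { echo "missing: $src/$f" >&2; return 1; }
    done
    # -C changes into SRC_DIR so the archive holds bare file names
    tar -czf "$archive" -C "$src" popfile.cfg popfile.db stopwords
}

# Example, for the install location used earlier on this page:
# backup_popfile /usr/local/bin/popfile /root/popfile-backup.tar.gz
```

Stop POPFile before running it, so that popfile.db is not being written while it is archived.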

Follow the instructions on http://popfile.sourceforge.net/cgi-bin/wiki.pl?HowTos/Mandrake to make a standard install. If problems arise, try the following.


Troubleshooting

If the following directories do not exist, create them:

/var/log/popfile
/var/lib/popfile

You need to find the directory that the original install uses for the POPFILE_USER variable. In a standard install that would be

/var/lib/popfile

Yours may be

POPFILE_ROOT = /usr/local/bin/popfile
POPFILE_USER = /var/lib/popfile
LOGS = /var/log/popfile

The easiest way to check is to run slocate popfile.db

All the software is installed in POPFILE_ROOT, and all the data resides in POPFILE_USER (/var/lib/popfile in a standard install).

Get the essential files to the correct place

Assuming that your install is in /usr/local/bin/popfile, copy the following files to the correct place:

cp /usr/local/bin/popfile/popfile.cfg /var/lib/popfile
cp /usr/local/bin/popfile/popfile.db /var/lib/popfile
cp /usr/local/bin/popfile/stopwords /var/lib/popfile
cp -r /usr/local/bin/popfile/messages /var/lib/popfile
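
The same copy can be written as a small function that skips anything your old install does not have. This is a sketch only; the function name migrate_popfile_data is hypothetical, and it assumes that messages is a directory in your install, so the copy is done recursively.

```shell
# migrate_popfile_data OLD_DIR NEW_DIR
# Copy the POPFile data items from OLD_DIR to NEW_DIR, creating NEW_DIR if
# needed. Items that do not exist in OLD_DIR are reported and skipped.
migrate_popfile_data() {
    old=$1
    new=$2
    mkdir -p "$new" || return 1
    for item in popfile.cfg popfile.db stopwords messages; do
        if [ -e "$old/$item" ]; then
            # -R copies a directory with its contents, -p keeps timestamps
            cp -Rp "$old/$item" "$new/" || return 1
        else
            echo "skipping $item: not found in $old" >&2
        fi
    done
}

# Example (as root):
# migrate_popfile_data /usr/local/bin/popfile /var/lib/popfile
```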

Restart POPFile

Make sure that no version is running: in a root console type

/etc/init.d/popfile stop

For this first time, start POPFile from a root console, so that you can see what is happening:

export POPFILE_USER=/var/lib/popfile
export POPFILE_ROOT=/usr/bin/popfile
perl $POPFILE_ROOT/popfile.pl

POPFile should start up and begin the conversion process. This can take some time, but you will see output scrolling while it happens. Once that is complete you should be able to point your browser at http://localhost:8080 or http://127.0.0.1:8080 and begin working as usual.

If anything fails at any stage, take the exact error report to the POPFile forum on SourceForge, where the developers will be happy to help.


Advanced POPFile Configuration

Coexisting with a local POP3 server

If you run a POP3 server already on your machine (such as the "ipop3" portion of the UW-IMAP server), it will take over port 110, and this will interfere with the operation of POPfile, which defaults to using that same port. For them to run simultaneously, POPfile will need to be told to use a different one.

  1. Stop your POP3 server. With UW-IMAP, this is done by issuing the following two commands at a root prompt:
chkconfig ipop3 off
service xinetd restart
  2. Start POPfile.
  3. Open your Web browser and point it to http://localhost:8080.
  4. Click the Configuration tab.
  5. Under "Module Options," change the value of "POP3 listen port" from 110 to 8110, then click "Apply."
  6. Restart your POP3 server. For UW-IMAP, this would be:
chkconfig ipop3 on
service xinetd restart

POPfile will now listen on port 8110, and the local POP3 server will continue to use port 110. Adjust your email clients' configurations accordingly.
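
To confirm the new port without opening the Web interface, the value can be read back from popfile.cfg. The sketch below rests on two assumptions that you should check against your own file: that popfile.cfg stores one "key value" pair per line, and that the POP3 listen port is stored under a key named pop3_port. The function name popfile_cfg_get is hypothetical.

```shell
# popfile_cfg_get CFG_FILE KEY
# Print the value stored for KEY in CFG_FILE (nothing if KEY is absent).
# ASSUMPTION: one "key value" pair per line, separated by whitespace.
popfile_cfg_get() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Example (the key name pop3_port is an assumption, see above):
# popfile_cfg_get /var/lib/popfile/popfile.cfg pop3_port
```

Only the first whitespace-separated word of the value is printed, which is enough for a port number.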


Running POPfile as a service

Note: Get POPfile configured, and ensure that it works properly, before continuing. POPfile should not be running when you follow these steps.

POPfile can be run as a service. This is preferable to the above method of running it from a line in the /etc/rc.d/rc.local script, for two reasons:

  1. The rc.local script will not complete properly while POPfile is running; any commands later on in that script will not run until POPfile has ended.
  2. This allows better control, such as through the Services menu in Mandriva Linux Control Center and/or the chkconfig and service commands.

To run in this fashion, it needs an initscript - so we'll make one for it. Open your favorite text editor, and copy the following script into it:

#!/bin/sh
#
# Version: 1.0
#
# chkconfig: 345 89 37
# description:   Starts and stops the POPfile POP3 proxy daemon
#
# config:   $POPFILEDIR/popfile.cfg

POPFILEDIR="/var/spool/popfile"


# Source function library.
. /etc/rc.d/init.d/functions

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
[ "${NETWORKING}" = "no" ] && exit 0

# Check that config file exists.
[ -s ${POPFILEDIR}/popfile.cfg ] || exit 0


RETVAL=0

# See how we were called.
case "$1" in
  start)
   if [ ! -f /var/lock/subsys/popfile ]; then
      gprintf "Starting POPfile services: "
      cd "$POPFILEDIR" && nohup ./popfile.pl >/dev/null 2>&1 &
      RETVAL=$?
      if [ $RETVAL -eq 0 ]; then
         touch /var/lock/subsys/popfile
         success "POPfile startup"
         echo
      else
         failure "POPfile startup"
         echo
      fi
   else
      RETVAL=1
   fi
   ;;
  stop)
   if [ -f /var/lock/subsys/popfile ]; then
      gprintf "Shutting POPfile services: "
      lynx -dump http://127.0.0.1:8080/shutdown >/dev/null 2>&1
      RETVAL=$?
      if [ $RETVAL -eq 0 ]; then
         rm -f /var/lock/subsys/popfile >/dev/null 2>&1
         success "POPfile shutdown"
         echo
      else
         failure "POPfile shutdown"
         echo
      fi
   else
      RETVAL=1
   fi   
   ;;
  restart)
   $0 stop
   $0 start
   RETVAL=$?
   ;;
  status)
   status ${POPFILEDIR}/popfile.pl
   RETVAL=$?
   ;;
  *)
   gprintf "Usage: %s {start|stop|restart|status}\n" "$0"
   exit 1
esac

exit $RETVAL

You may need to edit the POPFILEDIR= line (near the top of the script) to reflect where you installed the program; if you put it into /usr/local/bin/popfile, as Miark suggests above, change "/var/spool/popfile" to "/usr/local/bin/popfile" on this line. You'll also need to have the "lynx" package installed for this script to work.

Save the script with the name "popfile". Make it executable with the following command:

chmod +x popfile

Also go to the directory where popfile is installed and make sure that popfile.pl is executable:

chmod +x popfile.pl

Now "su" to become root, and run the following three commands (in order):

cp popfile /etc/rc.d/init.d/
chkconfig --add popfile
service popfile start

POPfile is now running as a service, and will start at every boot (in run levels 3, 4 and 5). If you have placed a line in /etc/rc.d/rc.local to launch it (as described above), delete that line from that file - it is no longer needed.


Using Fetchmail with POPfile

As one of the primary uses of Fetchmail is as a POP3 client, it is able to use POPfile as well. Here is how to edit your fetchmailrc file to support this. The file is called ~/.fetchmailrc if it is a personal one (i.e. you run fetchmail as a user), or /etc/fetchmailrc if fetchmail is run as a system-wide service.

Here are two "poll stanzas" from a typical fetchmailrc file:

poll mail.myisp.com with proto POP3
   user "johndoe" with password "GuEsS@iT" is "john" here
   user "janedoe" with password "FsCkOfF&DiE" is "jane" here
poll mail.otherisp.com with proto POP3
   user "jdoe" with password "MaKe$FaSt" is "john" here

To alter them to perform each fetch by way of POPfile, edit them to look like so:

poll 127.0.0.1 with proto POP3
   user "mail.myisp.com:johndoe" with password "GuEsS@iT" is "john" here
   user "mail.myisp.com:janedoe" with password "FsCkOfF&DiE" is "jane" here
   user "mail.otherisp.com:jdoe" with password "MaKe$FaSt" is "john" here

Note that the second "poll" line is now redundant, and has been removed; all polls now point to the local machine, and the destination server's name is now included in the "user" parameter's value.

If you have POPfile listening on port 8110 (as described above), the "poll" line should instead be made to read:

poll 127.0.0.1 port 8110 with proto POP3

You do not need to restart fetchmail if it is currently running; it checks for changes to the fetchmailrc file before each poll attempt, and will automatically reload if it determines that the fetchmailrc file in use has changed since the last poll.
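
The rewrite described in this section can be sketched as a filter. This is an illustration only: the function name fetchmailrc_via_popfile is hypothetical, and it handles just the simple layout shown above (a "poll HOST ..." line followed by indented user lines, all using POP3). Review the output before replacing your real fetchmailrc.

```shell
# fetchmailrc_via_popfile [PROXY]
# Read a simple fetchmailrc on stdin and write a version that polls POPFile.
# PROXY defaults to 127.0.0.1; pass "127.0.0.1 port 8110" for the other port.
fetchmailrc_via_popfile() {
    awk -v proxy="${1:-127.0.0.1}" '
        $1 == "poll" {
            host = $2                       # remember the real server name
            if (!printed) {                 # emit a single poll line for the proxy
                print "poll " proxy " with proto POP3"
                printed = 1
            }
            next
        }
        $1 == "user" && host != "" {
            sub(/user "/, "user \"" host ":")   # prefix the login with SERVER:
            print
            next
        }
        { print }                           # pass everything else through
    '
}

# Example:
# fetchmailrc_via_popfile < ~/.fetchmailrc > ~/.fetchmailrc.new
```

Because fetchmail requires restrictive permissions on its run control file, apply the same permissions to the new file before putting it in place.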


Related Pages

MiarK - 21 Aug 2003 (Unless otherwise noted, this page was written by Miark miark@gardnerbusiness(nospam).com)

The "Advanced POPFile Configuration" section was written by Bill Mullen.