Archive for July, 2010

Postfix Rate Limiting

Since I had to figure out how to limit outbound traffic by domain today, I thought I would post the procedure for everyone to enjoy. Listed below are the configuration changes that I made to our main Postfix gateway server.

Add the following lines to /etc/postfix/master.cf. You could also copy the smtp line and rename it to something else. I use the term slow in this example.

# Outbound rate limiting
slow      unix  -       -       n       -       1       smtp
  -o syslog_name=postfix-slow

Now add the following line to /etc/postfix/transport. You can rate limit as many individual domains as you wish using the transport file. Don’t forget to run postmap /etc/postfix/transport when you are finished. You should also have transport_maps set in /etc/postfix/main.cf (for example, transport_maps = hash:/etc/postfix/transport).

domain.com slow:

The last step is to add the following block of code to /etc/postfix/main.cf:

# Outbound rate limiting
slow_destination_rate_delay = 120
slow_destination_concurrency_limit = 5
slow_destination_recipient_limit = 100
slow_connection_cache_time_limit = 0
slow_never_send_ehlo = yes
slow_connect_timeout = 5

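These settings enforce a delay of 120 seconds between deliveries to the same destination. They also cap the transport at five concurrent connections to any one destination, although the 1 in the process-limit column of the master.cf entry already restricts the slow transport to a single smtp process. Postfix’s stock default_destination_concurrency_limit is 20 (postconf -d default_destination_concurrency_limit will show the value on your system). I’m not sure I would go lower than three for an organization of our size. Each delivery is also limited to 100 recipients. As far as I know, only the slow_destination_* lines use Postfix’s per-transport naming; the last three settings are SMTP client options, which Postfix normally expects as -o smtp_... overrides on the master.cf entry, so confirm that they actually take effect. Don’t forget to reload or restart the postfix daemon after making these changes!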
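
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes Postfix’s documented behavior that a non-zero destination rate delay allows at most one delivery per interval to a given destination. The function names are made up for illustration and are not part of Postfix.

```python
# Rough ceiling implied by the "slow" transport settings.
# Assumption: a non-zero destination rate delay allows at most one
# delivery per delay interval to each rate-limited destination.


def max_deliveries_per_hour(rate_delay_seconds: int) -> int:
    """Upper bound on deliveries to one destination in an hour."""
    return 3600 // rate_delay_seconds


def max_recipients_per_hour(rate_delay_seconds: int, recipient_limit: int) -> int:
    """Upper bound on recipients reached at one destination in an hour."""
    return max_deliveries_per_hour(rate_delay_seconds) * recipient_limit


print(max_deliveries_per_hour(120))       # 30
print(max_recipients_per_hour(120, 100))  # 3000
```

With a 120 second delay and a 100 recipient limit, the slow transport tops out at roughly 30 deliveries, or at most 3,000 recipients, per hour for each rate-limited domain.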

Concurrent Connections

I have noticed over the last few days that messages bound for Roadrunner have been stuck in the queue on our main e-mail gateway with the following error:

56FE71501A1C 1695 Fri Jul 30 08:32:19 noreply@nowhere.com
(host hrndva-smtpin02.mail.rr.com[71.74.56.244] refused to talk to me: 421 4.7.1 - Connection refused - - Too many concurrent connections from source IP)

Due to Roadrunner’s rather draconian inbound rate limit policy, we see this error almost constantly. A fairly large share of the e-mail sent through our servers is headed for Roadrunner’s system. Given that much of the traffic is generated by local church mailing lists, a delay of a few hours usually isn’t that big of a deal. At least, it hasn’t been up until this week. When I got in this morning, however, I saw that we had almost 5,000 messages in the queue with the same error. Alarm bells went off and the IT Office shifted into DEFCON 1.

Searching through the 5,000 messages quickly showed me that one sender was responsible for most of them. Grepping through /var/log/maillog then showed me which server the traffic was coming from. Once I had this information handy I shut down the main gateway server to stop the bleeding until I could figure out what was going on. I headed over to the server in question and started searching through those logs as well.
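
For anyone who wants to repeat the sender tally, it can be sketched in a few lines of Python. This is a minimal example that assumes the usual postqueue -p (mailq) listing, where each entry starts with a queue ID, size, arrival time and sender on one line. The count_senders name and the regular expression are illustrative, not a Postfix tool.

```python
import re
from collections import Counter

# Header line of a queue entry as printed by mailq / postqueue -p:
# queue ID (optionally followed by * or !), size, arrival time, sender.
ENTRY = re.compile(
    r"^[0-9A-Fa-f]+[*!]?\s+\d+\s+\w{3}\s+\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+(\S+)"
)


def count_senders(queue_listing: str) -> Counter:
    """Count queued messages per envelope sender."""
    senders = Counter()
    for line in queue_listing.splitlines():
        match = ENTRY.match(line)
        if match:
            senders[match.group(1).lower()] += 1
    return senders


# Example use against a live queue (needs Postfix installed):
#   import subprocess
#   listing = subprocess.run(["postqueue", "-p"],
#                            capture_output=True, text=True).stdout
#   print(count_senders(listing).most_common(5))
```

The reason and recipient lines of each entry do not begin with a queue ID, so they are skipped and every message is counted once.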

It turns out that a spammer was using an old SquirrelMail plugin to generate plain-text spam messages internally. Specifically, these were Nigerian phishing scheme messages. Because they were plain text, went to and from legitimate addresses and changed constantly, they were very difficult for our spam filters to stop. Once I updated the plugins, cleaned out all of the mail queues in question and restarted the affected services, we were back in action.

The only problem remaining was to convince my fellow e-mail administrators that we were no longer sending spam. All of the major ISPs use different filtering systems, real-time blacklists (RBLs) and their own internal policies. This meant visiting each ISP that showed up in the queue with a refused message and reporting to them individually that the issue was resolved.

Along the way I noticed that there are two agencies that many of the ISPs use as a clearinghouse for e-mail system reputation. You can sign up your organization and they will independently verify that you are not a spammer. Once you are on their list, the ISPs that subscribe to their services can verify that you are a legitimate sender. It amounts to a shakedown, more or less. You have to pay to play. Given the importance of our e-mail systems I decided to go ahead and sign us up. If you need to do the same with your mail servers then visit these companies:

  • E-mailReg.org – This service charges $20 for one year of service. They say approval takes several days once domain ownership has been verified.
  • ReturnPath.net – This is the big service that a lot of the major ISPs use. They charge based on how big your organization is. We received the non-profit discount and paid a $200 application fee with no monthly fee. Verification takes several weeks, so you really should sign up for this service when times are good.

All is well now. The spam flow has been stopped and all of our queues are cleaned out. All we can do now is wait on Roadrunner’s rate limits to time out and allow us to resume sending messages.

Required Mailman Permissions

I have been spending a good deal of time in our mailing list server archives trying to run down several permissions-related problems.  After a great deal of searching I realized that there was no place online that listed all of the required permissions for the /var/lib/mailman/archives and /var/lib/mailman/lists folders.  I spent a few hours today blindly stumbling through the permissions before I got them right, so I thought I would print them here for reference.  This is by no means a comprehensive list of the official permissions.  It is, however, what is working for me.

/var/lib/mailman/archives/private/listname

drwxrwsr-x  50 root mailman 4.0K Jul 26 13:17 .
drwxrwx--- 312 root mailman  20K Jul 29 14:04 ..
drwxrwxr-x   2 root mailman 4.0K Jul 29 13:36 2010-April
-rw-rw-r--   1 root mailman  13K Jul 29 13:35 2010-April.txt
drwxrwxr-x   2 root mailman 4.0K Jul 29 13:36 2010-February
-rw-rw-r--   1 root mailman 8.7K Jul 29 13:35 2010-February.txt
drwxrwxr-x   2 root mailman 4.0K Jul 29 13:36 2010-January
-rw-rw-r--   1 root mailman  21K Jul 29 13:35 2010-January.txt
drwxrwxr-x   2 root mailman 4.0K Jul 29 13:36 2010-July
-rw-rw-r--   1 root mailman  34K Jul 29 13:35 2010-July.txt
drwxrwxr-x   2 root mailman 4.0K Jul 29 13:36 2010-June
-rw-rw-r--   1 root mailman  25K Jul 29 13:35 2010-June.txt
drwxrwxr-x   2 root mailman 4.0K Jul 29 13:36 2010-March
-rw-rw-r--   1 root mailman  24K Jul 29 13:35 2010-March.txt
drwxrwxr-x   2 root mailman 4.0K Jul 29 13:36 2010-May
-rw-rw-r--   1 root mailman  22K Jul 29 13:35 2010-May.txt
drwxrwxr-x 569 root mailman  20K Jul 29 13:35 attachments
drwxrwx---   2 root mailman  24K Jul 29 13:36 database
-rw-rw-r--   1 root mailman  38K Jul 29 13:36 index.html
-rw-rw----   1 root mailman 2.7K Jul 29 13:36 pipermail.pck

/var/lib/mailman/archives/private/listname/2010-July/

drwxrwxr-x  2 root mailman 4.0K Jul 29 13:36 .
drwxrwxr-x 94 root mailman  12K Jul 29 13:36 ..
-rw-rw-r--  1 root mailman 2.5K Jul 29 13:36 002505.html
-rw-rw-r--  1 root mailman 2.2K Jul 29 13:36 002506.html
-rw-rw-r--  1 root mailman 2.5K Jul 29 13:36 002507.html
-rw-rw-r--  1 root mailman 4.4K Jul 29 13:36 author.html
-rw-rw-r--  1 root mailman 4.4K Jul 29 13:36 date.html
lrwxrwxrwx  1 root mailman   11 Jul 29 13:35 index.html -> thread.html
-rw-rw-r--  1 root mailman 4.4K Jul 29 13:36 subject.html
-rw-rw-r--  1 root mailman 5.1K Jul 29 13:36 thread.html

/var/lib/mailman/archives/private/listname/database/

drwxrwx---  2 root mailman  24K Jul 29 13:36 .
drwxrwxr-x 94 root mailman  12K Jul 29 13:36 ..
-rw-rw----  1 root mailman  31K Jul 29 13:36 2010-July-article
-rw-rw----  1 root mailman 4.4K Jul 29 13:36 2010-July-author
-rw-rw----  1 root mailman 3.9K Jul 29 13:36 2010-July-date
-rw-rw----  1 root mailman 4.6K Jul 29 13:36 2010-July-subject
-rw-rw----  1 root mailman 3.9K Jul 29 13:36 2010-July-thread

/var/lib/mailman/lists/listname

drwxrwsr-x   2 root    mailman 4.0K Jul 29 13:17 .
drwxrwsr-x 194 root    mailman  12K Jul  6 21:51 ..
-rw-rw----   1 root    mailman 1.7K Jul  6 21:51 admindbpreamble.html
-rw-rw----   1 root    mailman 8.9K Jul  6 21:51 config.db
-rw-rw----   1 root    mailman 8.9K Jul  6 21:51 config.db.last
-rw-rw----   1 apache  mailman  14K Jul 29 13:17 config.pck
-rw-rw----   1 mailman mailman  14K Jul 29 00:54 config.pck.last
-rw-rw----   1 root    mailman  12K Jul 27 18:42 digest.mbox
-rw-rw----   1 root    mailman  189 Jul  6 21:51 handle_opts.html
-rw-rw----   1 root    mailman 1.1K Jul  6 21:51 headfoot.html
-rw-rw----   1 root    mailman 3.1K Jul  6 21:51 listinfo.html
-rw-rw----   1 root    mailman 4.1K Jul  6 21:51 options.html
-rw-rw----   1 mailman mailman   46 Jul 29 00:54 pending.pck
-rw-rw----   1 root    mailman    2 Jul  6 21:51 request.db
-rw-rw----   1 mailman mailman  13K Jul  6 21:51 request.pck
-rw-rw----   1 root    mailman 1.2K Jul  6 21:51 roster.html
-rw-rw----   1 root    mailman  198 Jul  6 21:51 subscribe.html

After setting these permissions the mailman server resumed normal operations.  It looks like apache will take over ownership of the files that are edited directly from the web interface.  That should be OK.  The main problem is giving mailman read/write access to the files that it needs to update and maintain the mailing list archives.  Trust me, if mailman can’t access any of these files it will quietly move the message over to the /var/spool/mailman/shunt directory.  Nobody wants that.  Once you resolve any permissions issues be sure to restart the mailman daemon.  To put the shunted e-mail back into the queue for processing, run /usr/lib/mailman/bin/unshunt.
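
Because shunting happens quietly, it helps to check the shunt directory from time to time. The following is a minimal Python sketch, assuming the queue files carry the .pck extension that Mailman 2.1 uses. The count_shunted helper is hypothetical and not part of Mailman.

```python
import os


def count_shunted(shunt_dir: str = "/var/spool/mailman/shunt") -> int:
    """Count queue files waiting in Mailman's shunt directory.

    Assumption: queue entries are stored as *.pck files, as in Mailman 2.1.
    """
    return sum(
        1
        for name in os.listdir(shunt_dir)
        if name.endswith(".pck") and os.path.isfile(os.path.join(shunt_dir, name))
    )


# Example use (needs read access to the directory):
#   waiting = count_shunted()
#   if waiting:
#       print(f"{waiting} message(s) are sitting in the shunt directory")
```

Anything above zero means a runner hit an error, so the Mailman error log is the next place to look.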

Mailman Archives Issue Resolved

I have been battling a weird archives issue with our GNU Mailman mailing list server.  We have some lists that archive properly when e-mail is sent to them.  We have other lists where the e-mail is delivered but does not show up in the archives.  We also have lists where e-mail sent to them disappears and is never heard from again.  I have been hassling with this permissions issue literally for years now.  I picked the baton up again today and decided to try to bring this one home.  First I started in the mailman error logs:

/var/log/mailman/error

Jul 26 12:25:43 2010 (2755) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman/archives/private/listname/index.html'
Jul 26 12:25:43 2010 (2755) Traceback (most recent call last):
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 112, in _oneloop
    self._onefile(msg, msgdata)
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 170, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/usr/lib/mailman/Mailman/Queue/ArchRunner.py", line 73, in _dispose
    mlist.ArchiveMail(msg)
  File "/usr/lib/mailman/Mailman/Archiver/Archiver.py", line 217, in ArchiveMail
    h.close()
  File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 324, in close
    self.write_TOC()
  File "/usr/lib/mailman/Mailman/Archiver/HyperArch.py", line 1094, in write_TOC
    toc = open(os.path.join(self.basedir, 'index.html'), 'w')
IOError: [Errno 13] Permission denied: '/var/lib/mailman/archives/private/listname/index.html'
Jul 26 12:25:43 2010 (2755) SHUNTING: 1280155615.876646+a19c8dce602a83897d29592d36d618fc80195ec7
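
When the error log is long, pulling out just the offending paths makes the pattern easier to see. Here is a small Python sketch that does this. The denied_paths name is made up for illustration, and it assumes the "Permission denied: 'path'" wording shown in the log above.

```python
import re
from typing import List

# Accept straight and typographic quotes, since logs pasted through a
# web page or e-mail often have their quote characters converted.
DENIED = re.compile(r"Permission denied: ['‘](.+?)['’]")


def denied_paths(log_text: str) -> List[str]:
    """Return the distinct paths named in 'Permission denied' errors."""
    seen: List[str] = []
    for match in DENIED.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


# Example use:
#   with open("/var/log/mailman/error") as fh:
#       for path in denied_paths(fh.read()):
#           print(path)
```
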
I didn’t remember seeing this particular error message before (of course I didn’t write down the old ones) so I started over again with fresh eyes.  After googling for a long time I ran across this nugget:
>>I ran several times check_perms -f and it says all is fixed.
>
>
> check_perms is lying (actually, there are many files, as opposed to
> directories, that check_perms doesn’t check). The above file and all
> files in /var/lib/mailman/archives/private/ excluding those in
> /var/lib/mailman/archives/private/*/database/ need to be group
> writable.
>
> Once you fix these permissions, you could run bin/unshunt to add the
> shunted messages to the archive, but there is an issue in that the
> messages have been successfully added to
> /var/lib/mailman/archives/private/mylist.mbox/mylist.mbox, and
> unshunting will add them again.
>
> Rather than trying to fix archive permissions, I suggest you verify
> that /var/lib/mailman/archives/private/mylist.mbox/mylist.mbox
> contains all the lists posts from inception to date, and mayby verify
> there are no stray “From ” lines in message bodies with bin/cleanarch,
> and then stop Mailman and rebuild the archive with
>
>  bin/arch --wipe listname
>
> and then start Mailman. This way, the pipermail archive will be
> completely rebuilt with correct permissions.
>
> This is one reason why I always recommend when moving lists to just
> move the LISTNAME.mbox/LISTNAME.mbox file and build the archive on the
> new machine with bin/arch.
>
> Note if you do this, remove the shunted messages from qfiles/shunt/ so
> they don’t accidently get unshunted in the future.

What?  The check_perms utility doesn’t fix all of the permissions that mailman needs in order to run properly?  Why won’t it complain when the daemon starts up then?  Why won’t it at least say that some of the permissions have been corrected but not all of them?  I’ve been running this mailman installation for several years now and I (and all my buddies who also run mailman) have always held up check_perms as the gold standard for making sure that mailman works properly.  I wish I had known about this a few weeks ago when I was moving hundreds of gigabytes of files from one server to another.  The bit about moving lists would have saved me (and my end users) a lot of time.
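
Since check_perms skips many files, a separate pass over the archive tree can catch what it misses. The sketch below follows the rule from the quoted reply: everything under archives/private should be group writable, with the database directories left out of the check. The not_group_writable helper and its skip_dirs parameter are illustrative and not part of Mailman.

```python
import os
import stat
from typing import Iterable, List


def not_group_writable(root: str, skip_dirs: Iterable[str] = ("database",)) -> List[str]:
    """List files and directories under root that lack group write permission."""
    skip = set(skip_dirs)
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            if stat.S_ISLNK(info.st_mode):
                continue  # symlinks such as index.html -> thread.html
            if not info.st_mode & stat.S_IWGRP:
                problems.append(path)
    return sorted(problems)


# Example use:
#   for path in not_group_writable("/var/lib/mailman/archives/private"):
#       print(path)
```

This only reports the group write bit. It does not check that the group is actually mailman, so compare the results against a directory listing before changing anything.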

What’s The Frequency?

Cell phone coverage in our new building has proven to be quite a challenge.  It would appear that big steel buildings such as ours do not permit easy wireless transmission.  I have noticed issues with 802.11g and 802.11n signals as well as the signals that make our phones work.  The overhead wireless network was set up with a few more draft N access points to compensate for the signal troubles.  Unfortunately we can’t install a few more cell phone towers around the property to improve the signal.

Since I have been working at the new building for several months ahead of the rest of the staff I am acutely aware of the cell phone trouble.  I remember fondly having to run outside several times to make quick phone calls over the course of troubleshooting an issue.  The problem isn’t that I don’t want to fix it.  The problem is that there simply aren’t very many available options.  I called AT&T, Sprint/Nextel and Verizon to discuss the problem.  I discovered that both Verizon and Sprint/Nextel offer corporate level IT solutions for repeating their wireless network signal through a facility such as ours (approx. 35,000 sq. ft.).  AT&T, unfortunately, has no similar solution.

After discussing this internally we determined that since the majority of our staff are Verizon subscribers we would go with the Verizon solution first.  The only system that is rated to work with Verizon’s towers is the Juni JR-20 Plus.  This system has two antennas that will each cover approx. 20,000 sq. ft.  Once that system has been installed we can review its effectiveness and determine where to go from there.

Juni JR-20 Plus

On a related note, both Verizon and Sprint/Nextel use CDMA networks.  My hope is that Sprint/Nextel subscribers can roam off of these repeaters while they are inside the building.  If that works out then we won’t have to look at investing in any more major equipment purchases.  This stuff ain’t cheap!

To further complicate things, AT&T and T-Mobile use GSM.  GSM and CDMA are not compatible with each other.  There are smaller wireless range extenders available from AT&T but they are primarily designed for home usage.  As such, they are not well suited for use in a facility our size.  I’ll keep posting updates as my research progresses and will let you know how the installation goes.

Faith & Technology

I submitted my first article for our new Faith & Technology blog today.  We are starting this up as a collaborative effort between several clergy and lay people around our Conference.  Its purpose is to discuss technology and the various ways in which it impacts faith and our local church community.  This is my first foray into scheduled writing for another publication (except for this blog of course).  My first article discusses what happens to clergy with full friends lists when they leave a church.  I am very excited about this opportunity.  Be sure to subscribe.  We have lots of interesting things coming!

Practice Safe Searching

As I was navigating the internet today at work I noticed a new Google feature that seems rather interesting:

Safe Searching!

Google has released the beta version of their https encrypted search website.  Google probably didn’t encrypt their searches from the beginning due to the increased overhead of the https protocol.  After several test searches I cannot tell any difference from the http equivalent.  Now we can all search whatever we want in the coffee shop without much worry of wireless sniffers watching our every move.  Very impressive!

Mailing List Blues

The Conference mailing list server is down again. I’ve been monitoring disk utilization on the list server for a while now in an attempt to keep the server up until after the building move. Once I realized that we were going to run out of space again I decided to take the server down preemptively.

GNU Mailman

We have a long running history with this software.  Sometime around 2003 I was tasked with setting up a mailing list solution for the Conference.  Several of our local churches had also expressed interest so I had to find something cheap, scalable and fast.  GNU Mailman was the perfect choice.  It’s free and open source software, you can continue throwing lists at it and it supports lists of all sizes.

The list server is my oldest Linux installation.  I was a lowly Windows admin at the time so my good friend Alan Swartz helped me with the original Red Hat installation.  My how times have changed.  Back then I had a brief list of commands to create new lists, reboot the server and perform a few basic administrative tasks.  I wish I had kept a copy of that original handwritten list but alas, it is lost to the sands of time.  This software has proven robust over the years as it has moved across several physical computers and 3-4 different Linux distributions.

Victims of our own success.

It would seem that too much of a good thing always leads to problems.  The mailing list server maintains an archive of all of the e-mail that is sent over each of the mailing lists.  These files grow as new messages are sent, and over time disk space can become a problem.  It took us several years of constant usage to amass a corpus of around 80 gigabytes (GB).  Mailman must have changed how it stores e-mail because over the course of a year or so we shot up to around 280 GB.  Maybe people realized that you can send attachments to the lists?  Once things get back to normal I plan to dig into why these list archives are growing so quickly.

Everyone loves a good history lesson but why is the server down today?  The simple answer is that the hard drive is full (again).  Once it fills up the mailman daemon stops responding.  Since I am out of the office it could take me a good while to discover that the system is down.  That’s why I decided to go ahead and replace it.
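
A simple disk check run from cron could give earlier warning than waiting for the daemon to stop. Here is a minimal Python sketch, assuming a 90 percent threshold picked purely for illustration. The percent_used and needs_attention names are hypothetical.

```python
import shutil


def percent_used(path: str) -> float:
    """Percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo-filesystems report no capacity
    return 100.0 * usage.used / usage.total


def needs_attention(path: str, threshold: float = 90.0) -> bool:
    """True when usage on path's filesystem is at or above threshold percent."""
    return percent_used(path) >= threshold


# Example use:
#   if needs_attention("/var/lib/mailman"):
#       print("List server disk is nearly full")
```

The threshold is a judgment call. A lower value gives more time to react when archives are growing quickly.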

On June 17th the system went down due to a full hard drive.  With the building move coming up soon I decided to try temporarily removing the larger archives from the internal mailing lists.  This would free up enough hard disk space to keep the server running (hopefully) until well after the building move when I could properly schedule an outage.  I’ve been monitoring the disk utilization since then, moving archives as I can.  Unfortunately, I’ve moved all of the larger ones and was forced to move forward with plans to switch the drive.
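
Finding the larger archives is easy to script. The sketch below ranks the directories directly under an archive root by the total size of the files they contain. The largest_archives and tree_size names are made up for illustration, and the sizes are apparent file sizes rather than exact on-disk usage.

```python
import os
from typing import List, Tuple


def tree_size(path: str) -> int:
    """Total apparent size in bytes of the regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.lstat(full).st_size
    return total


def largest_archives(root: str, top: int = 10) -> List[Tuple[str, int]]:
    """Return (directory name, bytes) pairs for the biggest subdirectories of root."""
    sizes = [
        (entry.name, tree_size(entry.path))
        for entry in os.scandir(root)
        if entry.is_dir(follow_symlinks=False)
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]


# Example use:
#   for name, size in largest_archives("/var/lib/mailman/archives/private"):
#       print(f"{size / 1e9:8.2f} GB  {name}")
```

Running this periodically and comparing the results would also show which lists are growing fastest.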

Last night I pulled the 320 GB drive and replaced it with a 1.5 Terabyte (TB) drive.  It takes a while to copy the archives back to the new drive, however.  Overnight, 60 GB of the 280 GB data store copied.  I expect progress all day and will bring the system back online as soon as I can.  Hopefully this will buy us a good bit of time before I have to permanently retire some of the archives.

Update: Friday, July 9th 2010 @ 12:17 PM
The list server is back up!  We have plenty of available disk space now.  I’m hoping that this one will last us a good while.  I still need to research what is eating so much disk space but moving forward we should be in good shape!
