
Courier-MTA failing to deliver to remote MTAs


dpuryear
04-03-2003, 10:36 AM
We have a new DSM install. Everything works fine except for Courier. Here are some cases and their results:

1. Deliver from local user to local user: works. So if I mail from root to bob@myserver.com then it works.
2. Deliver from local user to remote user: fails. If I mail from root or bob@myserver.com to bob@yahoo.com it fails. This fails for any recipient and any sender.
3. Deliver from remote user to local user: fails. If I mail from bob@yahoo.com to bob@myserver.com it fails.

For cases 2 and 3 I see a consistent pattern. In /var/log/maillog I get:

Apr 3 09:08:46 www courieresmtp: id=001BB224.3E8C4E7E.00004158,from=<root@www.example.com>,addr=<dpuryear@usa.net>: DNS lookup failed.
Apr 3 09:08:46 www courieresmtp: id=001BB224.3E8C4E7E.00004158,from=<root@www.example.com>,addr=<dpuryear@usa.net>,status: deferred
Apr 3 09:08:46 www courierd: completed,id=001BB224.3E8C4E7E.00004158
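
To check how widespread the deferrals are, the log lines above can be pulled apart with a short script. This is a minimal sketch: the field layout is inferred only from the three sample lines shown here, not from Courier's documentation, and the names parse_line and deferred_reasons are made up for illustration.

```python
import re

# Field layout inferred from the sample courieresmtp lines in this thread.
# This is an assumption, not a documented Courier log format.
LINE_RE = re.compile(
    r"courieresmtp: id=(?P<id>[^,]+),from=<(?P<sender>[^>]*)>,addr=<(?P<rcpt>[^>]*)>"
    r"(?:,status: (?P<status>\w+)|: (?P<message>.*))$"
)

def parse_line(line):
    """Return the fields of one courieresmtp line, or None if it does not match."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items() if value is not None}

def deferred_reasons(lines):
    """Map (message id, recipient) to the diagnostic logged before its deferral."""
    reasons = {}
    deferred = {}
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # courierd lines and anything else are ignored
        key = (record["id"], record["rcpt"])
        if "message" in record:
            reasons[key] = record["message"]
        elif record.get("status") == "deferred":
            deferred[key] = reasons.get(key, "unknown")
    return deferred

SAMPLE = [
    "Apr 3 09:08:46 www courieresmtp: id=001BB224.3E8C4E7E.00004158,"
    "from=<root@www.example.com>,addr=<dpuryear@usa.net>: DNS lookup failed.",
    "Apr 3 09:08:46 www courieresmtp: id=001BB224.3E8C4E7E.00004158,"
    "from=<root@www.example.com>,addr=<dpuryear@usa.net>,status: deferred",
    "Apr 3 09:08:46 www courierd: completed,id=001BB224.3E8C4E7E.00004158",
]

if __name__ == "__main__":
    for (msg_id, rcpt), reason in deferred_reasons(SAMPLE).items():
        print(msg_id, rcpt, reason)
```

Feeding it the whole of /var/log/maillog instead of SAMPLE would show whether every deferral carries the same DNS diagnostic.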

(I am using "www.example.com" to protect the innocent.)

Okay, so perhaps DNS is failing? No:

# nslookup
Note: nslookup is deprecated and may be removed from future releases.
Consider using the `dig' or `host' programs instead. Run nslookup with
the `-sil[ent]' option to prevent this message from appearing.
> set q=mx
> usa.net
Server: 0.0.0.0
Address: 0.0.0.0#53

Non-authoritative answer:
usa.net mail exchanger = 10 mxpool01.netaddress.usa.net.

Authoritative answers can be found from:
usa.net nameserver = cndvg054.usa.net.
usa.net nameserver = vndvg055.usa.net.
mxpool01.netaddress.usa.net internet address = 165.212.8.32
vndvg055.usa.net internet address = 165.212.55.55
> exit

DNS resolves ANY domain I try from the command line, yet the same lookups fail inside Courier. This is consistent!

I asked on courier-users and was told that this may be due to IPv6 being compiled in. Possibly a known issue. Has anyone else seen this? I hate to modify the RPMs installed by DSM, so I'd rather get a fix directly from Zervex if possible.

Any help is appreciated. Thanks!

dpuryear
04-03-2003, 03:11 PM
It was suggested to me that, since we are using a Courier compiled with IPv6 support, this might be the issue. I was told someone else (not someone running DSM, by the way) had experienced the same problem and that compiling Courier without IPv6 support fixed it.

This sounds somewhat reasonable if it's a bug. However, my concern is that this seems to only be affecting us instead of everyone using DSM. I assume most of us are using the same software versions under Linux (Courier and BIND), so I don't see why we would be the only ones affected.

But you never know with computers.

Any thoughts on this?

dpuryear
04-05-2003, 01:00 AM
I resolved this issue. First, you apparently need IPv6 support either compiled into your kernel or loaded as a module. Second, there was still a DNS resolution error, but I fixed that too. Here is the note I sent to Zervex technical support. I am placing it here in case this bites anyone else.

...
Interesting problem and resolution.

After further investigation I did find that the Courier you are using has IPv6 support built in. Of course, as we don't use IPv6, we don't have it built into the kernel. We did have IPv6 support built as a module, but it wasn't loaded. As a test I loaded the ipv6 module. Courier then stopped giving me the "DNS lookup failed" messages; however, delivery still failed.
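
A quick way to see whether a box is in the same state (IPv6 available only as an unloaded module) is to look for the module and try to open an IPv6 socket. The sketch below assumes a Linux system with /proc/modules; the function names are made up, and whether Courier fails at exactly this socket call is a guess, not something the note confirms. Note that a kernel with IPv6 compiled in will not list an ipv6 module at all, so the socket probe is the more telling of the two checks.

```python
import socket

def module_listed(proc_modules_text, name="ipv6"):
    """Return True if `name` is the first field of any line of /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False

def can_open_ipv6_socket():
    """Try to create an AF_INET6 datagram socket; False if that is refused."""
    if not socket.has_ipv6:  # this Python build itself lacks IPv6 support
        return False
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    except OSError:
        return False
    sock.close()
    return True

if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            print("ipv6 module listed:", module_listed(handle.read()))
    except OSError:
        print("/proc/modules is not readable on this system")
    print("AF_INET6 socket can be created:", can_open_ipv6_socket())
```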

Next, I ran /usr/lib/courier/testmxlookup and noticed that the program was hanging. I used strace to see what the hell was going on and noticed that it was in fact trying to use IPv6 first. It was also hanging after a sendto(), and I noticed that after a while it would try again with a longer timeout passed to select() while waiting for the response. That is normal resolver behavior, but it told me that testmxlookup wasn't getting a response back from the DNS server, or wasn't contacting it properly.

Since testmxlookup only tests the DNS lookup of the MX record, my next thought was that the remaining problem (after having fixed the IPv6 issue) was a DNS resolution issue. In /etc/resolv.conf we had:

nameserver 0.0.0.0

This is the commonly advised way to set up /etc/resolv.conf when you run a local DNS server. I replaced it with an actual IP address defined on a network interface, and now everything works.
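
The two observations above (a resolver that resends with a growing select() timeout, and a wildcard nameserver entry) can be sketched in a few lines. This is an illustration under stated assumptions, not what Courier or the system resolver actually does: the timeout values are arbitrary, the payload is any byte string rather than a real DNS query, and all function names are made up.

```python
import ipaddress
import select
import socket

def nameservers(resolv_conf_text):
    """Return the addresses on `nameserver` lines, skipping comment lines."""
    found = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            found.append(fields[1])
    return found

def unspecified(addresses):
    """Return the entries that are the wildcard address (0.0.0.0 or ::)."""
    flagged = []
    for text in addresses:
        try:
            if ipaddress.ip_address(text).is_unspecified:
                flagged.append(text)
        except ValueError:
            pass  # not a literal IP address; ignored in this sketch
    return flagged

def query_with_backoff(server, payload, timeouts=(1.0, 2.0, 4.0)):
    """Send payload over UDP, waiting longer after each unanswered attempt.

    Returns (reply, attempts); reply is None if every attempt timed out.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for attempt, timeout in enumerate(timeouts, start=1):
            sock.sendto(payload, server)
            readable, _, _ = select.select([sock], [], [], timeout)
            if readable:
                reply, _ = sock.recvfrom(512)
                return reply, attempt
        return None, len(timeouts)
    finally:
        sock.close()

if __name__ == "__main__":
    text = "nameserver 0.0.0.0\n"  # the entry quoted in the note above
    print("wildcard nameservers:", unspecified(nameservers(text)))
```

A server that never answers makes query_with_backoff return (None, 3) after about seven seconds with the default timeouts, which is the kind of hang strace showed.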

So this is something odd going on in Courier.

Anyway, it works now.
...