The NAT Router House of Pain
Adam Ierymenko edited this page 2016-02-10 16:12:41 -08:00

This page is used to track workarounds and weird hacks that have been added to the ZeroTier code base to deal with broken, restrictive, or otherwise sketchy NATs, routers, firewalls, and other network hardware.

Confirmed Cases

Crappy apartment NAT ignores zero-byte keepalives

Apparently some NATs ignore zero-byte UDP keepalive packets. The customer in question reported a NAT (unknown brand) at their apartment complex with a 60-second UDP timeout that ignored zero-byte keepalives, causing TCP failover and route flapping.

2015-09-22 1.0.6-dev: Changed NAT keepalives to carry a randomized four-byte junk payload as a workaround for this. Also fixed a timing bug that caused them to go out a bit less frequently than intended.
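As a rough illustration of the workaround (not the actual ZeroTier code, which is C++), here is a minimal Python sketch of a keepalive that carries four random bytes instead of an empty datagram. The function names and the constant are made up for this example.

```python
import os
import socket
from typing import Tuple

# Hypothetical constant: four bytes of junk, as described in the note above.
KEEPALIVE_PAYLOAD_LEN = 4


def make_keepalive_payload() -> bytes:
    """Return a fresh random junk payload for one keepalive packet."""
    return os.urandom(KEEPALIVE_PAYLOAD_LEN)


def send_keepalive(sock: socket.socket, peer: Tuple[str, int]) -> int:
    """Send one non-empty keepalive datagram; returns the number of bytes sent."""
    return sock.sendto(make_keepalive_payload(), peer)
```

The receiver can simply discard these packets. Their only job is to look like real traffic to a NAT that refuses to refresh its mapping on an empty datagram.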

Ubiquiti Edge UPnP port reuse fail

Ubiquiti Networks Edge Router Lite (ERLite-3, firmware 1.6.0) appears to have at least one and possibly two bugs:

  1. Attempting to do normal NAT-t (hole punching) on a port also used for UPnP mapping apparently causes the router to stop accepting traffic from the outside world at all, period, on any port or mapping.

  2. I also suspect that mapping UPnP to multiple hosts behind the router on the same UDP port causes issues, though I have not been able to confirm this. #1 is definitely true.

2015-09-23 1.0.6-dev: OneService now creates a second UDP socket on a random port if UPnP/NAT-PMP port mapping is enabled and uses this socket for that purpose, leaving the regular port for normal non-mapped operation. Thus UPnP/NAT-PMP will not map to the same port as normal traffic.
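The two-socket arrangement can be sketched roughly as follows. This is a Python illustration, not OneService itself: the function name is hypothetical, and binding to port 0 (letting the OS pick a free port) stands in for whatever random port selection the real code uses.

```python
import socket
from typing import Tuple


def open_sockets(primary_port: int,
                 bind_addr: str = "0.0.0.0") -> Tuple[socket.socket, socket.socket]:
    """Open the regular UDP socket plus a separate one reserved for port mapping."""
    # Regular socket: used for normal traffic and NAT-t hole punching only.
    primary = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    primary.bind((bind_addr, primary_port))

    # Mapping socket: port 0 asks the OS for any free port (an assumption of
    # this sketch). Only this port would be handed to UPnP/NAT-PMP.
    mapping = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    mapping.bind((bind_addr, 0))

    return primary, mapping
```

Because the primary socket is already bound when the second one is created, the OS cannot hand out the same port on the same address, so the mapped port and the hole-punching port never collide.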

UPnP port reservation fails unless lease time is zero ("forever")

UPnP is a terrible, awful, evil, disgusting protocol that sort of looks like what you'd get if you printed out and ate a bunch of XML and then induced vomiting several hours later. One of its apparent failure modes is that a non-trivial proportion of routers ignore port assignments that are not "forever," and interpret this in various ways: actual forever, random forgetfulness, or some hard-coded expiration time.

As a result, ZeroTier now (as of 1.1.0) uses a deterministic port for UPnP and reserves it "forever," since this is the only way that works reliably.
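To make "deterministic port, reserved forever" concrete, here is a hedged Python sketch. It assumes the port is derived from some stable per-node identifier. The derivation, the `base` and `span` values, and the function names are all hypothetical and are not taken from the ZeroTier source. The dictionary keys are a subset of the argument names of the UPnP `AddPortMapping` action, where a lease duration of 0 requests a mapping that does not expire.

```python
from typing import Dict, Union

# In UPnP IGD, NewLeaseDuration = 0 asks for a mapping with no expiration.
UPNP_LEASE_FOREVER = 0


def deterministic_mapping_port(node_id: int, base: int = 20000, span: int = 45000) -> int:
    """Map a stable node identifier to the same UDP port on every run.

    base and span are illustrative values chosen so the result stays
    below 65536; they are assumptions of this sketch.
    """
    return base + (node_id % span)


def build_mapping_request(node_id: int) -> Dict[str, Union[int, str]]:
    """Build the port-related arguments for an AddPortMapping call."""
    port = deterministic_mapping_port(node_id)
    return {
        "NewExternalPort": port,
        "NewInternalPort": port,
        "NewProtocol": "UDP",
        "NewLeaseDuration": UPNP_LEASE_FOREVER,
    }
```

Using the same port across restarts means a mapping left over from a previous run is reused instead of piling up as a stale entry on the router.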

Multiple devices using same port

This one got escalated from unconfirmed. Apparently some NATs, possibly including the AirPort Extreme, choke if more than one device behind them is using the same local port. Since ZT uses 9993 by default, this means more than one ZT device behind the NAT makes it do strange things.

You can launch with -p to pick a different port, but it's nice to have a default. But it's also nice if things work by default. This is ugly and nasty and evil and bears more investigation.


Unconfirmed Reports

Broken routers that blacklist ports forever on failed NAT-t

We have at least one report of this. The only conceivable workaround is UPnP or relaying. In practice this also means that the number of usable UDP ports on these routers will shrink over time, even from normal packet loss and other edge cases, causing DNS and other UDP protocols to break until the router is reset. There's only so much we can do to support total broke-etude.

FIFO and random port expiration behavior

See issue #296 -- https://github.com/zerotier/ZeroTierOne/issues/296