Challenges of Network Address Translation in P2P networks

Ramakrishnan Muthukrishnan

https://mastodon.radio/@vu3rdd

2019/Nov/07

Networks

The modern internet is built mostly around centralized services: Google, Twitter, Facebook...

Origins of the internet

The internet was conceived in the late 1960s, fueled by the Cold War and the need to communicate in the event of nuclear war. It was also meant for sharing computer resources. It was very much a network of networks. Protocols like email, Usenet and IRC are all designed to run in a decentralized manner.

It was designed in such a way that new "nodes" could be added. Nodes were the equivalent of today's routers, and there were a few tens or hundreds of users (with "hosts" connected to the nodes) behind each node.

Modern Internet

Email and FTP made the internet very appealing to users outside the government. Later, hyperlinks and the World Wide Web made the internet explode in popularity among the non-scientific community. People started building commerce around the internet.

The modern internet is a highly commercialized system where corporations like Facebook, Google and Twitter run highly centralized services. Most of the computers on the internet do not provide any service; they just use services. Such computers are called "clients" and the ones providing a service are called "servers".

Every computer "on the internet" needs to have an IP address. An IP address is a 32-bit field in the header of an IP packet, usually written as a group of four 8-bit numbers, A.B.C.D. The original designers did not anticipate that the world would need more than the ~4.3 billion addresses that 32 bits allow. But now we have computers of various sizes, like desktops, laptops, smartphones, internet-connected doorbells and fridges, and each of them needs an IP address.
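As a small illustration of the A.B.C.D notation, the sketch below uses Python's standard ipaddress module to show that a dotted-quad address is just a 32-bit integer. The address 192.0.2.1 is only a documentation example, not one taken from the talk.

```python
import ipaddress

# An IPv4 address is a 32-bit integer; A.B.C.D is only a display format.
addr = ipaddress.IPv4Address("192.0.2.1")
as_int = int(addr)

# Recover the four 8-bit groups A, B, C, D from the integer.
octets = [(as_int >> shift) & 0xFF for shift in (24, 16, 8, 0)]

# Total number of distinct 32-bit addresses.
total_addresses = 2 ** 32

print(as_int)           # 3221225985
print(octets)           # [192, 0, 2, 1]
print(total_addresses)  # 4294967296
```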

As the popularity of the internet increased, the 32-bit addresses started running out. Since most computers were only acting as "clients", it was decided that something needed to be done, both in the short term and in the long term.

P2P

The ideal state of the web would be peer to peer. This would eliminate the use of large data centers, censorship, algorithmic timelines, the manipulation of people's minds and many other evils caused by corporate control of the internet.

In the ideal case, all data would be content addressable, replicated and seeded (a la bittorrent, ipfs, tahoe-lafs). Each peer would make its own decisions about what data to share and with whom.

IPv4/IPv6

In the mid-1990s, people recognized the problem and proposed two RFCs, RFC 1631 and RFC 1918, as a stopgap until a new addressing scheme could be designed and implemented.

These two RFCs are the origins of network address translation (NAT). RFC 1631 specifies a process to translate between a single public address and multiple private addresses. RFC 1918 specifies the private IP address ranges and their layouts.
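For reference, RFC 1918 reserves three private blocks: 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. The sketch below checks whether an address falls in one of them; the helper name is_rfc1918 is made up for this example.

```python
import ipaddress

# The three private address blocks reserved by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(address: str) -> bool:
    """Return True if `address` lies inside one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)

print(is_rfc1918("192.168.1.10"))  # True
print(is_rfc1918("172.32.0.1"))    # False, just outside 172.16.0.0/12
print(is_rfc1918("8.8.8.8"))       # False
```

Addresses in these blocks are not routed on the public internet, which is why a NAT has to rewrite them on the way out.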

The new addressing scheme was IPv6, which uses 128-bit addresses, written as 8 groups of four hexadecimal digits (e.g. 2001:0db8:0000:0000:0000:8a2e:0370:7334). Unfortunately, IPv6 and IPv4 do not interoperate, so direct communication between IPv4 hosts and IPv6 hosts is not possible.
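The same standard ipaddress module can be used to look at the IPv6 example above. This is only a quick sketch of the notation: eight 16-bit groups, with a shorter compressed form that drops leading zeros and collapses one run of all-zero groups into "::".

```python
import ipaddress

addr = ipaddress.IPv6Address("2001:0db8:0000:0000:0000:8a2e:0370:7334")

# The fully written-out form has 8 groups of 4 hex digits (8 x 16 = 128 bits).
groups = addr.exploded.split(":")

# The compressed form is what you usually see in practice.
short_form = addr.compressed

print(len(groups))   # 8
print(short_form)    # 2001:db8::8a2e:370:7334
```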

So, we are still stuck with IPv4 and NAT.

Network Address Translation

This is somewhat similar to an office telephone system: one public number with a number of extensions. An external caller cannot reach an internal extension directly. However, any internal caller can call an external number, and the conversation can proceed after that.

The new internet topology

Hosts within a private network can talk to each other, but hosts in two different private networks cannot talk to each other! This breaks P2P.
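The behaviour can be sketched with a toy translation table. This is a simplified model, not how any particular router is implemented: the class name ToyNAT, the starting port 40000 and the addresses are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)

@dataclass
class ToyNAT:
    public_ip: str
    next_port: int = 40000  # arbitrary starting point for public ports
    # public port -> private (ip, port) that opened it
    table: Dict[int, Endpoint] = field(default_factory=dict)

    def outbound(self, private_src: Endpoint) -> Endpoint:
        """Rewrite the source of an outgoing packet and remember the mapping."""
        for port, owner in self.table.items():
            if owner == private_src:
                return (self.public_ip, port)
        port = self.next_port
        self.next_port += 1
        self.table[port] = private_src
        return (self.public_ip, port)

    def inbound(self, public_port: int) -> Optional[Endpoint]:
        """Return the private destination, or None if nobody opened this port."""
        return self.table.get(public_port)

nat = ToyNAT("203.0.113.5")
before = nat.inbound(40000)                   # no mapping yet: packet is dropped
mapped = nat.outbound(("192.168.1.10", 5000)) # internal host talks first
after = nat.inbound(40000)                    # replies can now be delivered

print(before)  # None
print(mapped)  # ('203.0.113.5', 40000)
print(after)   # ('192.168.1.10', 5000)
```

Like the external caller in the telephone analogy, an outside host has nothing to dial until the internal host has made the first move.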

Network Address Translation

NAT Variations (1)

Full Cone NAT (one-to-one NAT)

NAT Variations (2)

Address Restricted Cone NAT

NAT Variations (3)

Port Restricted Cone NAT

NAT Variations (4)

Symmetric NAT

In other words, the approaches above (#1-#3) keep one external port mapping for a given internal (IP, port), whatever the destination. A symmetric NAT instead chooses a new, often random, port for every new connection. This makes port prediction very hard, and most NAT hole punching techniques fail in this case.

If both peers are located behind NATs, neither can contact the other to learn its mapping.
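The four variations can be summarised in a toy model. It assumes the classic descriptions from RFC 3489, which the slides do not spell out: the variations differ in which remote endpoints may send packets back through a mapping, and in whether the mapping depends on the destination. The function names are made up for this sketch.

```python
from typing import Set, Tuple

Endpoint = Tuple[str, int]  # (ip, port)

def allows_inbound(nat_type: str, contacted: Set[Endpoint], remote: Endpoint) -> bool:
    """Would a mapping that has sent packets to `contacted` accept one from `remote`?"""
    if nat_type == "full_cone":
        return True                                      # anyone may reply
    if nat_type == "address_restricted":
        return remote[0] in {ip for ip, _ in contacted}  # same IP, any port
    if nat_type in ("port_restricted", "symmetric"):
        return remote in contacted                       # same IP and same port
    raise ValueError(f"unknown NAT type: {nat_type}")

def mapping_key(nat_type: str, internal: Endpoint, destination: Endpoint) -> tuple:
    """The key the NAT uses to look up (or allocate) an external port."""
    if nat_type == "symmetric":
        return (internal, destination)  # a fresh mapping per destination
    return (internal,)                  # cone NATs: one mapping for all destinations

contacted = {("198.51.100.7", 3478)}
same_ip_other_port = ("198.51.100.7", 9999)

print(allows_inbound("full_cone", contacted, ("192.0.2.99", 1234)))       # True
print(allows_inbound("address_restricted", contacted, same_ip_other_port)) # True
print(allows_inbound("port_restricted", contacted, same_ip_other_port))    # False
```

Because a symmetric NAT keys its mapping on the destination as well, the public port a rendezvous server observes is not the port a peer would have to send to.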

NAT: A few usage scenarios

Both users are behind the same NAT.

            Internet          
                |             
                |             
   userA <---> NAT <---> userB

Direct connections work, so there is no need to use the relay.

NAT: A few usage scenarios (Contd)

One of the users is outside the NAT, on the public internet.

         Internet
             |\
             | \
userA <---> NAT \
                 \
                 userB

Let us say user A is behind a NAT and user B is outside the NAT, on the public internet.

Direct connections still work, with no need to use the relay. Why? Each peer-to-peer node starts both a server and a client, and each node tries to connect to the other. Here A's outbound connection to B succeeds even though B cannot connect in to A. Whichever socket succeeds is used for transferring files.
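The "start a server, then try to connect out" idea can be sketched with plain TCP sockets. This is a simplified, sequential illustration on the loopback interface, not the code of any particular P2P tool; start_server and first_reachable are hypothetical helper names.

```python
import socket
from typing import Iterable, Optional, Tuple

Endpoint = Tuple[str, int]  # (host, port)

def start_server(host: str = "127.0.0.1") -> socket.socket:
    """The 'server' half of a node: listen on a port chosen by the OS."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))
    srv.listen(1)
    return srv

def first_reachable(candidates: Iterable[Endpoint],
                    timeout: float = 1.0) -> Optional[socket.socket]:
    """The 'client' half: try each address, return the first socket that connects."""
    for host, port in candidates:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError:
            continue  # unreachable (e.g. behind a NAT); try the next candidate
    return None

def demo() -> None:
    # B is publicly reachable; A, behind a NAT, simply connects out to B.
    b_server = start_server()
    conn = first_reachable([b_server.getsockname()])
    print("connected" if conn else "no direct path")
    if conn:
        conn.close()
    b_server.close()

if __name__ == "__main__":
    demo()
```

A real node would run both halves concurrently and keep whichever connection completes first.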

NAT: A few usage scenarios (Contd)

          Internet
          /       \
         /         \
     userA         userB

Both users have a public IP address. No problem.

NAT: A few usage scenarios (Contd)

            Internet
               |
               |
               |
              / \
             /   \
            /     \
           /       \
userA <-> NAT      NAT <-> userB
        

Each user is behind a different NAT. This is the most common scenario. Here we will need to attempt hole punching.

NAT: A few usage scenarios (Contd)

            Internet                                   Internet
               |                                          |
               |                                          |
               |                                         / \
              NAT1 (ISP)                                /   \
               |                                      NAT1  NAT2
               |                                       |     |
               |                                       |     |
              NAT2 (home router)                      /       \
              / \                                   NAT1'     NAT2'
             /   \                                  |           |
            /     \                                 |           |
           /       \                                |           |
      userA         userB                       userA            userB

ISPs hand NAT'ed IP addresses to their customers, and each customer has a wireless router that is itself another NAT. The source IP gets rewritten every time a packet passes through a NAT router.

NAT Traversal techniques

The most popular one is hole punching, so we will discuss that first.

Hole punching

Hole punching relies on a rendezvous server (or an introducer) that knows both parties, has a session with each of them, and hence knows their public and private (IP, port) pairs.
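The exchange can be walked through with a small simulation. It assumes both peers sit behind port-restricted cone NATs and models packets as plain function calls; the classes PortRestrictedNAT and Rendezvous, the addresses and the ports are all hypothetical, chosen only for this sketch.

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]  # (ip, port)

class PortRestrictedNAT:
    """Toy NAT: one mapping per internal endpoint; inbound packets are
    accepted only from endpoints that mapping has already sent to."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self.next_port = 40000
        self.by_private: Dict[Endpoint, int] = {}
        self.by_port: Dict[int, Endpoint] = {}
        self.contacted: Dict[int, Set[Endpoint]] = {}

    def send(self, private_src: Endpoint, dst: Endpoint) -> Endpoint:
        """Outbound packet: allocate or reuse a mapping and note the destination."""
        if private_src not in self.by_private:
            self.by_private[private_src] = self.next_port
            self.by_port[self.next_port] = private_src
            self.contacted[self.next_port] = set()
            self.next_port += 1
        port = self.by_private[private_src]
        self.contacted[port].add(dst)
        return (self.public_ip, port)

    def receive(self, src: Endpoint, public_port: int) -> Optional[Endpoint]:
        """Inbound packet: deliver only if this mapping has sent to `src`."""
        if src in self.contacted.get(public_port, set()):
            return self.by_port[public_port]
        return None

class Rendezvous:
    """Introducer that records each peer's public and private endpoints."""

    def __init__(self) -> None:
        self.peers: Dict[str, Tuple[Endpoint, Endpoint]] = {}

    def register(self, name: str, public: Endpoint, private: Endpoint) -> None:
        self.peers[name] = (public, private)

    def lookup(self, name: str) -> Tuple[Endpoint, Endpoint]:
        return self.peers[name]

SERVER: Endpoint = ("198.51.100.1", 3478)
A_PRIV: Endpoint = ("192.168.1.10", 5000)
B_PRIV: Endpoint = ("10.0.0.20", 6000)
nat_a, nat_b = PortRestrictedNAT("203.0.113.5"), PortRestrictedNAT("203.0.113.9")
server = Rendezvous()

# 1. Each peer contacts the server, which sees the NAT-assigned public endpoint.
server.register("A", nat_a.send(A_PRIV, SERVER), A_PRIV)
server.register("B", nat_b.send(B_PRIV, SERVER), B_PRIV)
a_pub, _ = server.lookup("A")
b_pub, _ = server.lookup("B")

# 2. A sends to B's public endpoint. B's NAT drops it, but A's NAT now has a hole for B.
first_try = nat_b.receive(nat_a.send(A_PRIV, b_pub), b_pub[1])

# 3. B sends to A's public endpoint. A's NAT lets it through.
b_to_a = nat_a.receive(nat_b.send(B_PRIV, a_pub), a_pub[1])

# 4. A's next packet now passes B's NAT as well.
a_to_b = nat_b.receive(nat_a.send(A_PRIV, b_pub), b_pub[1])

print(first_try)  # None
print(b_to_a)     # ('192.168.1.10', 5000)
print(a_to_b)     # ('10.0.0.20', 6000)
```

The first packet in step 2 is lost on purpose: its only job is to open the hole in A's own NAT.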

Hole punching, peers behind a common NAT

Hole punching, peers behind different NATs

Hole punching, peers behind multiple levels of NATs

NAT Traversal techniques (Continued)

Traversal Using Relays around NAT - TURN

NAT Traversal techniques (Continued)

Session Traversal Utilities for NAT - STUN

Interactive Connectivity Establishment - ICE

Others

Challenges in TCP NAT traversal.

Thanks

Questions?

A few relevant RFCs and references to read