alice_bob_1

WireGuard

WireGuard is a next-generation, cross-platform VPN technology created by Jason A. Donenfeld that has quickly become a popular alternative to the beefy, complex IPsec and SSL VPN solutions used for years. As a testament to its success, it was recently merged into the Linux kernel, as of v5.6. It is also available as a kernel module or as a user space application written in Go or Rust. If you are new to WireGuard, I recommend checking out the conceptual overview, as it will be helpful for following along through the rest of the post.

Dynamically Addressed Peers

One of the genius parts of WireGuard is its concept of crypto-key routing. Crypto-key routing defines an association between a public key and a list of IP addresses (Allowed IPs). The list of IP addresses permits (inbound) and routes (outbound) packets inside of a WireGuard tunnel. This association is the minimum configuration for a peer; no static peer endpoint IP address is required from a tunnel validation standpoint. This allows for built-in IP roaming on both sides, assuming peer addresses don’t change simultaneously OR there is a reliable means for signaling when they do.

DNS can be leveraged to support dynamically addressed peers, as various WireGuard utilities will resolve DNS names when configuring a peer, and there are supporting scripts that can be used to periodically re-resolve peer addresses. Awesome! This sounds promising… however:

What happens if both peers are behind a NAT that we don’t control? i.e. no static port-forwarding.
How can we discover not only IP addresses but ports? The existing utilities do not support this.

In this post we will set out to establish a WireGuard tunnel between dynamically addressed peers that are both sitting behind a NAT. One of our primary goals is to stick with WireGuard in its purest form: the code that now ships with the Linux kernel. We do not want to compromise it in any fashion to achieve our goals, although we could get very creative with its user space implementation.

hub-and-spoke

Some of you may be thinking: why not use a hub-and-spoke model? Surely we can just create tunnels from Alice and Bob to a statically addressed, NAT-free central hub. Alice and Bob can route through the hub.

hub_spoke

This is a totally valid approach and one that is widely used today. However, for our use case we are not interested for the following reasons:

With many peers, a hub becomes a vertical scaling bottleneck (think IoT, connected cars, robotics, etc.).
Sending all of our data through the hub may be costly.
The hub may introduce considerable latency between peers.

Traversing the NAT

Now that we’ve outlined the problem, it’s time to dive in. If we are going to establish a WireGuard tunnel directly between Alice and Bob we need to be able to traverse the NATs in front of them. Since WireGuard works over UDP, UDP hole punching is our best bet for accomplishing this.

UDP hole punching exploits the fact that most NATs are lenient when matching inbound packets against existing “connections”. This allows us to re-use port state for punching our way back in. If Alice sends a UDP packet to a new host, Carol, and Bob has knowledge of the outbound source IP and port Alice’s NAT used during translation, Bob can reach Alice by sending a UDP packet towards this IP:port pair (2.2.2.2:7777 in the illustration below).

hole_punch

* Our hole punching example describes a full-cone NAT. There are limitations with other, less common NAT types where this method does not work.
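The full-cone behavior described above can be modeled in a few lines of Go. This is a toy sketch under simplifying assumptions: the `fullConeNAT` type, its methods, and Alice's internal address 192.168.1.10 are invented for illustration, and real NATs also track protocols and timeouts.

```go
package main

import "fmt"

// endpoint is an IP:port pair written as a string, e.g. "2.2.2.2:7777".
type endpoint string

// fullConeNAT is a toy model: once an internal endpoint has sent a packet out,
// its external mapping accepts packets from any remote host.
type fullConeNAT struct {
	externalIP string
	nextPort   int
	outbound   map[endpoint]endpoint // internal -> external
	inbound    map[endpoint]endpoint // external -> internal
}

func newFullConeNAT(externalIP string, firstPort int) *fullConeNAT {
	return &fullConeNAT{
		externalIP: externalIP,
		nextPort:   firstPort,
		outbound:   map[endpoint]endpoint{},
		inbound:    map[endpoint]endpoint{},
	}
}

// send records (or reuses) the mapping for src and returns the translated source.
func (n *fullConeNAT) send(src endpoint) endpoint {
	if ext, ok := n.outbound[src]; ok {
		return ext
	}
	ext := endpoint(fmt.Sprintf("%s:%d", n.externalIP, n.nextPort))
	n.nextPort++
	n.outbound[src] = ext
	n.inbound[ext] = src
	return ext
}

// receive returns the internal endpoint that a packet addressed to dst is
// forwarded to. The remote sender is deliberately ignored: that is what makes
// this model "full-cone".
func (n *fullConeNAT) receive(dst endpoint) (endpoint, bool) {
	in, ok := n.inbound[dst]
	return in, ok
}

func main() {
	nat := newFullConeNAT("2.2.2.2", 7777)
	ext := nat.send("192.168.1.10:51820") // Alice -> Carol opens the hole
	fmt.Println("Alice's external endpoint:", ext)
	in, ok := nat.receive(ext) // Bob -> 2.2.2.2:7777 reaches Alice
	fmt.Println(in, ok)
}
```

A more restrictive NAT would also compare the remote sender in `receive`, which is where plain hole punching starts to run into trouble.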

STUN

So we now know how UDP hole punching works. Great, but that still leaves us with open questions.

How does Alice discover her external IP:port?
How does Alice communicate this to Bob?
How do we make this work in the context of WireGuard?

RFC5389 Session Traversal Utilities for NAT (STUN) defines a protocol to answer some of these questions. It’s a lengthy RFC, so I’ll do my best to summarize. It’s important to note that STUN is not a drop-in solution to the problem we are trying to solve:

STUN by itself is not a solution to the NAT traversal problem. Rather, STUN defines a tool that can be used inside a larger solution. The term “STUN usage” is used for any solution that uses STUN as a component.

— RFC5389¹

stun

STUN is a client/server protocol. In the example above Alice is acting as the client and Carol is the server. Alice sends a STUN Binding request to Carol. When the Binding request passes through Alice’s NAT, the source IP:port gets rewritten. Once Carol receives the Binding request, she copies the source IP:port from the layer 3 and layer 4 headers into the payload of the Binding response and sends it to Alice. The Binding response passes back through Alice’s NAT at which point the destination IP:port gets rewritten, but the payload remains untouched. Alice receives the Binding response and learns that her external IP:port for this socket is 2.2.2.2:7777.

As previously pointed out, STUN is not a complete solution. STUN provides a mechanism for an application to learn its external IP:port when behind a NAT, but it does not provide a method for signaling this to interested parties. If we were writing an application from the ground up that required NAT traversal capabilities, STUN is a component we should consider. But we are not writing WireGuard; it already exists, and it’s not something we can modify (see the goal about leaving its source untouched). So where does that leave us? We can certainly take some concepts from STUN and use them to achieve our goal. We clearly need an external, statically addressed host for discovering UDP holes that we can punch through.

An existing PoC

Back in August 2016, the creator of WireGuard shared a NAT hole punching PoC/example on the WireGuard mailing list. Jason’s example contains a client and server application. The client is intended to be run alongside WireGuard, and the server runs on the statically addressed host for IP:port discovery. The client uses a raw socket for communicating with the server:

/* We use raw sockets so that the WireGuard interface can actually own the real socket. */
sock = socket(AF_INET, SOCK_RAW, IPPROTO_UDP);
if (sock < 0) {
	perror("socket");
	return errno;
}

As pointed out in the comment, WireGuard owns the “real socket”. By using a raw socket the client is able to spoof the source port used by WireGuard when communicating with the server. This ensures the source IP:port seen at the server will map back to the WireGuard socket on the NAT when punching back in.

The client uses a classic BPF filter on its raw socket for filtering replies from the server destined to the WireGuard port:

static void apply_bpf(int sock, uint16_t port, uint32_t ip)
{
	struct sock_filter filter[] = {
		BPF_STMT(BPF_LD + BPF_W + BPF_ABS, 12 /* src ip */),
		BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ip, 0, 5),
		BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20 /* src port */),
		BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, PORT, 0, 3),
		BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 22 /* dst port */),
		BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, port, 0, 1),
		BPF_STMT(BPF_RET + BPF_K, -1),
		BPF_STMT(BPF_RET + BPF_K, 0)
	};
	struct sock_fprog filter_prog = {
		.len = sizeof(filter) / sizeof(filter[0]),
		.filter = filter
	};
	if (setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, &filter_prog, sizeof(filter_prog)) < 0) {
		perror("setsockopt(bpf)");
		exit(errno);
	}
}

The data communicated between the client and server is defined in the packet and reply structures:

struct {
    struct udphdr udp;
    uint8_t my_pubkey[32];
    uint8_t their_pubkey[32];
} __attribute__((packed)) packet = {
    .udp = {
        .len = htons(sizeof(packet)),
        .dest = htons(PORT)
    }
};
struct {
    struct iphdr iphdr;
    struct udphdr udp;
    uint32_t ip;
    uint16_t port;
} __attribute__((packed)) reply;

The client works by iterating over the configured WireGuard peers (wg show <interface> peers) and sending a packet to the server for each one. The my_pubkey and their_pubkey fields are populated appropriately. When the server receives a packet from the client, it performs an upsert of an entry against an in-memory table of peers keyed by public key, using the my_pubkey field. It then performs a lookup against the same table using the their_pubkey field. If an entry is found from the second lookup, the peer’s IP and port are sent in a reply to the client. When a reply is received by the client, the ip and port fields are unpacked and the peer’s endpoint address is configured (wg set <interface> peer <key> <options...> endpoint <ip>:<port>).

The entry structure:

struct entry {
	uint8_t pubkey[32];
	uint32_t ip;
	uint16_t port;
};

The IP and port fields in the entry structure are populated from the IP and UDP headers in the packet received from the client. Each time a client requests the IP and port information for a peer, its own IP and port information is refreshed in the peer table.
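The server's upsert-then-lookup behavior is small enough to sketch in Go. This is my own restatement of the logic described above, not a port of the PoC: the `registry` type and `handle` method are hypothetical names, and all socket handling is left out.

```go
package main

import "fmt"

type pubkey [32]byte

// entry holds the endpoint observed for a peer, taken from the IP and UDP
// headers of its most recent request.
type entry struct {
	ip   [4]byte
	port uint16
}

// registry is the in-memory peer table keyed by public key.
type registry struct {
	peers map[pubkey]entry
}

// handle first upserts the sender's observed endpoint under myKey, then looks
// up theirKey. The second return value reports whether a reply should be sent.
func (r *registry) handle(myKey, theirKey pubkey, srcIP [4]byte, srcPort uint16) (entry, bool) {
	r.peers[myKey] = entry{ip: srcIP, port: srcPort}
	e, ok := r.peers[theirKey]
	return e, ok
}

func main() {
	alice, bob := pubkey{1}, pubkey{2} // placeholder keys
	r := &registry{peers: map[pubkey]entry{}}

	// Alice asks for Bob before Bob has checked in: no reply yet.
	_, ok := r.handle(alice, bob, [4]byte{2, 2, 2, 2}, 7777)
	fmt.Println("reply to Alice:", ok)

	// Bob asks for Alice: her endpoint is already in the table.
	e, ok := r.handle(bob, alice, [4]byte{3, 3, 3, 3}, 8888)
	fmt.Println("reply to Bob:", ok, e.ip, e.port)
}
```

Because every request refreshes the sender's own entry, a peer whose NAT mapping changes is corrected the next time it asks about anyone.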

The example client and server applications are great examples of how WireGuard can be part of a UDP hole punching solution. We know it’s possible, so let’s build on this knowledge and move closer to something that is more suited for the real world. Specifically, let’s aim to do this in a way that is cross-platform and does not require a custom wire protocol for IP:port discovery. Our solution should be as simple as possible. Opening a raw socket may not be possible on all of our peers, and we may not be able to leverage BPF filters, either. A custom wire protocol is difficult to debug without custom debugging tools, so let’s use something that is mature with existing, widespread tooling.

Doubling down on WireGuard

In the previous section we explored an example client that used Linux-specific networking features to spoof packets from the socket owned by WireGuard. Instead of building a custom application around WireGuard to open and observe IP:port mappings on a NAT, let’s just use WireGuard. WireGuard tunnels are extremely lightweight, so we can simply build a tunnel to our statically addressed peer for IP:port discovery.

alice_bob_registry_1

The figure above may remind you of the hub-and-spoke diagram. The main difference here is that we do not intend to route through the Registry peer. The Registry peer has a WireGuard interface with an IPv4 address of 10.0.0.254/32. Alice’s and Bob’s configurations have been updated to reflect this. They will only accept packets from and route packets to 10.0.0.254/32 via this peer.

The WireGuard tunnel with the Registry peer opens up a hole on each of Alice’s and Bob’s NATs for them to connect with each other. Now we need a way to query the IP:port of those holes from the Registry peer. For this purpose we are going to use the DNS protocol. DNS is relatively simple, mature (circa 1987), cross-platform, and happens to define a record type, the SRV (service) record, for locating services, i.e. identifying their IP address and port. DNS-Based Service Discovery (RFC6763) expands on this record type with a concrete structure and query pattern for discovering services under a given domain. We can leverage these semantics for our use case.

CoreDNS

Now that we have a service discovery protocol picked out, we need a way to connect it with WireGuard. CoreDNS is a plugin-based DNS server written in Go. It’s a graduated CNCF project and happens to be the DNS server for Kubernetes. Let’s write a CoreDNS plugin that takes DNS-SD queries and returns information about associated WireGuard peers. Public keys will be used for the record names (Alice & Bob in our example), and jordanwhited.net will serve as the zone. For those familiar with BIND-style zone files, you can expect zone data something like this:

_wireguard._udp         IN PTR          alice._wireguard._udp.jordanwhited.net.
_wireguard._udp         IN PTR          bob._wireguard._udp.jordanwhited.net.
alice._wireguard._udp   IN SRV 0 1 7777 alice.jordanwhited.net.
alice                   IN A            2.2.2.2
bob._wireguard._udp     IN SRV 0 1 8888 bob.jordanwhited.net.
bob                     IN A            3.3.3.3
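The naming pattern in the zone data above is mechanical, so it can be generated from a list of peers. Here is a short Go sketch that renders the three record types with fully qualified names. The `peer` type and `records` function are illustrative only and have nothing to do with how a DNS server actually stores or serves records.

```go
package main

import (
	"fmt"
	"strings"
)

// service is the DNS-SD service and protocol label pair used throughout this post.
const service = "_wireguard._udp"

// peer is a hypothetical type for this example.
type peer struct {
	label string // "alice" here; an encoded public key in practice
	ip    string
	port  int
}

// records renders a PTR, SRV, and A record per peer using fully qualified names.
func records(zone string, peers []peer) []string {
	zone = strings.TrimSuffix(zone, ".") + "." // normalize to a trailing dot
	var out []string
	for _, p := range peers {
		instance := fmt.Sprintf("%s.%s.%s", p.label, service, zone)
		host := fmt.Sprintf("%s.%s", p.label, zone)
		out = append(out,
			fmt.Sprintf("%s.%s IN PTR %s", service, zone, instance),
			fmt.Sprintf("%s IN SRV 0 1 %d %s", instance, p.port, host),
			fmt.Sprintf("%s IN A %s", host, p.ip),
		)
	}
	return out
}

func main() {
	peers := []peer{
		{"alice", "2.2.2.2", 7777},
		{"bob", "3.3.3.3", 8888},
	}
	for _, line := range records("jordanwhited.net", peers) {
		fmt.Println(line)
	}
}
```

The PTR records enumerate service instances, each SRV record supplies the port and a target host name, and the A record for that host name supplies the IP address.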

Base64 vs Base32 representation of public keys

Up until this point we have been using the pseudonyms Alice and Bob in place of WireGuard public keys. WireGuard public keys are Base64 encoded where a textual representation is required (configuration files, utility output, etc…). So instead of Alice, we would have a string that is 44 bytes long:

$ wg genkey | wg pubkey
UlVJVmPSwuG4U9BwyVILFDNlM+Gk9nQ7444HimPPgQg=

The Base 64 encoding is designed to represent arbitrary sequences of octets in a form that allows the use of both upper- and lowercase letters but that need not be human readable.

— RFC4648²

Unfortunately for us, DNS has case-insensitive behavior where we intend to put public keys:

Each node in the DNS tree has a name consisting of zero or more labels [STD13, RFC1591, RFC2606] that are treated in a case insensitive fashion.

— RFC4343³

Base32 on the other hand, while producing a slightly longer string (56 bytes), will allow us to represent WireGuard public keys inside of the DNS:

The Base 32 encoding is designed to represent arbitrary sequences of octets in a form that needs to be case insensitive but that need not be human readable.

— RFC4648⁴

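You can convert back and forth between encoding formats at the command line using the base32 and base64 utilities. The example below was run on macOS, where base64 decodes with -D; GNU coreutils base64 uses -d instead. For example: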

$ wg genkey | wg pubkey > pub.txt
$ cat pub.txt
O9rAAiO5qTejOEtFbsQhCl745ovoM9coTGiprFTaHUE=
$ cat pub.txt | base64 -D | base32
HPNMAARDXGUTPIZYJNCW5RBBBJPPRZUL5AZ5OKCMNCU2YVG2DVAQ====
$ cat pub.txt | base64 -D | base32 | base32 -d | base64
O9rAAiO5qTejOEtFbsQhCl745ovoM9coTGiprFTaHUE=

We now have a case-insensitive public key encoding that is DNS compatible.

Building the plugin

CoreDNS has documentation on writing plugins. Besides setup and configuration parsing, a plugin must implement the plugin.Handler interface:

type Handler interface {
    ServeDNS(context.Context, dns.ResponseWriter, *dns.Msg) (int, error)
    Name() string
}

I’ve gone ahead and implemented a CoreDNS plugin called wgsd that provides WireGuard peer information via DNS-SD semantics. wgsd is an “external” plugin and needs to be enabled at compile-time. The CoreDNS documentation describes two methods for loading external plugins. We will use the “Build with compile-time configuration file” method.

First we need to enable the plugin in plugin.cfg:

$ diff -u plugin.cfg.orig plugin.cfg
--- plugin.cfg.orig     2020-05-13 20:32:56.000000000 -0700
+++ plugin.cfg  2020-05-13 12:24:22.000000000 -0700
@@ -54,6 +54,7 @@
 k8s_external:k8s_external
 kubernetes:kubernetes
 file:file
+wgsd:github.com/jwhited/wgsd
 auto:auto
 secondary:secondary
 etcd:etcd

Then we can build CoreDNS:

$ go generate
$ go build
$ ./coredns -plugins | grep wgsd
  dns.wgsd

Once CoreDNS is compiled, the wgsd plugin can be configured as follows:

.:53 {
  wgsd <zone> <wg device>
}

If this looks foreign to you, check out the CoreDNS configuration manual.

We’re ready to test the plugin! Here’s our configuration:

$ cat Corefile
.:53 {
  debug
  wgsd jordanwhited.net. utun4
}

For our test we have a single WireGuard peer configured with an endpoint of 3.3.3.3:8888:

$ sudo wg show
interface: utun4
  listening port: 52022

peer: mvplwow3agnGM8G78+BiJ3tmlPf9gDtbJ2NdxqV44D8=
  endpoint: 3.3.3.3:8888
  allowed ips: 10.0.0.2/32

Let’s list the available peers:

$ dig @127.0.0.1 _wireguard._udp.jordanwhited.net. PTR +noall +answer +additional

; <<>> DiG 9.10.6 <<>> @127.0.0.1 _wireguard._udp.jordanwhited.net. PTR +noall +answer +additional
; (1 server found)
;; global options: +cmd
_wireguard._udp.jordanwhited.net. 0 IN  PTR     TL5GLQUMG5VATRRTYG57HYDCE55WNFHX7WADWWZHMNO4NJLY4A7Q====._wireguard._udp.jordanwhited.net.

Now that we’ve discovered a peer, we can fetch its endpoint information:

$ dig @127.0.0.1 TL5GLQUMG5VATRRTYG57HYDCE55WNFHX7WADWWZHMNO4NJLY4A7Q====._wireguard._udp.jordanwhited.net. SRV +noall +answer +additional

; <<>> DiG 9.10.6 <<>> @127.0.0.1 TL5GLQUMG5VATRRTYG57HYDCE55WNFHX7WADWWZHMNO4NJLY4A7Q====._wireguard._udp.jordanwhited.net. SRV +noall +answer +additional
; (1 server found)
;; global options: +cmd
tl5glqumg5vatrrtyg57hydce55wnfhx7wadwwzhmno4njly4a7q====._wireguard._udp.jordanwhited.net. 0 IN SRV 0 0 8888 TL5GLQUMG5VATRRTYG57HYDCE55WNFHX7WADWWZHMNO4NJLY4A7Q====.jordanwhited.net.
TL5GLQUMG5VATRRTYG57HYDCE55WNFHX7WADWWZHMNO4NJLY4A7Q====.jordanwhited.net. 0 IN A 3.3.3.3

🎉 🎉 🎉 It works! 🎉 🎉 🎉

Let’s double check to make sure our keys match up:

$ sudo wg show utun4 peers
mvplwow3agnGM8G78+BiJ3tmlPf9gDtbJ2NdxqV44D8=
$ dig @127.0.0.1 _wireguard._udp.jordanwhited.net. PTR +short | cut -d. -f1 | base32 -d | base64
mvplwow3agnGM8G78+BiJ3tmlPf9gDtbJ2NdxqV44D8=

👍 👍 👍

Reviewing the communications flow

We’re almost there. Here’s a recap of the communications flow:

Alice starts off by establishing a tunnel with the Registry. Bob simultaneously does the same thing. Next, the wgsd-client (still to be implemented) on Alice queries our CoreDNS plugin (wgsd) running on the Registry. The plugin retrieves Bob’s endpoint information from WireGuard and returns it to the wgsd-client. The wgsd-client then sets Bob’s endpoint value. Finally, a WireGuard tunnel is established directly between Alice and Bob.

Any reference to an “established tunnel” simply means a handshake occurred and that packets may flow between peers. Note that while WireGuard does have a handshake mechanism, it is more of a connection-less protocol than you may think:

Any secure protocol require some state to be kept, so there is an initial very simple handshake that establishes symmetric keys to be used for data transfer. This handshake occurs every few minutes, in order to provide rotating keys for perfect forward secrecy. It is done based on time, and not based on the contents of prior packets, because it is designed to deal gracefully with packet loss.

— wireguard.com/protocol⁵

We are now ready to implement our final piece, the wgsd-client.

Implementing wgsd-client

The wgsd-client is responsible for keeping peer endpoint configuration up to date. It retrieves the list of configured peers, queries CoreDNS for matching public keys, and then sets the endpoint value for each peer if needed. Our initial implementation is intended to be run periodically via cron or a similar scheduling mechanism. It checks all peers once in a serialized fashion and then exits; it is not a daemon. We can improve upon this later, but for now let’s start with something simple.
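The control flow described above can be sketched roughly as follows. This is not the actual wgsd-client source: `lookupFunc` and `setFunc` are hypothetical stand-ins for the DNS SRV query and for setting a peer's endpoint in WireGuard, injected as functions so the loop can be exercised without a device or a DNS server.

```go
package main

import (
	"fmt"
	"sort"
)

// lookupFunc is a hypothetical stand-in for querying the registry's DNS server
// for a peer's SRV record; found is false when no record exists.
type lookupFunc func(pubKey string) (endpoint string, found bool, err error)

// setFunc is a hypothetical stand-in for
// "wg set <interface> peer <key> endpoint <ip>:<port>".
type setFunc func(pubKey, endpoint string) error

// syncOnce checks every configured peer once, serially, and returns how many
// endpoints it changed. current maps public keys to their configured endpoint
// ("" when none is set).
func syncOnce(current map[string]string, lookup lookupFunc, set setFunc) (int, error) {
	keys := make([]string, 0, len(current))
	for k := range current {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic order for this example

	changed := 0
	for _, k := range keys {
		ep, found, err := lookup(k)
		if err != nil {
			return changed, err
		}
		if !found || ep == current[k] {
			continue // nothing published, or already up to date
		}
		if err := set(k, ep); err != nil {
			return changed, err
		}
		changed++
	}
	return changed, nil
}

func main() {
	current := map[string]string{"bob-key": "", "registry-key": "4.4.4.4:51820"}
	lookup := func(k string) (string, bool, error) {
		if k == "bob-key" {
			return "3.3.3.3:51820", true, nil
		}
		return "", false, nil // the Registry has no record for itself
	}
	set := func(k, ep string) error {
		fmt.Printf("set %s endpoint %s\n", k, ep)
		return nil
	}
	changed, err := syncOnce(current, lookup, set)
	fmt.Println(changed, err)
}
```

Running something like this from cron every minute or so gives the "periodic, not a daemon" behavior described above.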

The source for wgsd-client is available in the same repo as wgsd under cmd/wgsd-client.

We’re ready to test out our solution. In our tests we will have Alice and Bob behind NAT, and a Registry peer with no NAT. Alice is connected to an LTE provider, Bob is connected to a residential ISP, and Registry is an EC2 instance. Here are the public keys and tunnel addresses of all three peers:

Peer      Public Key                                    Tunnel Address
Alice     xScVkH3fUGUv4RrJFfmcqm8rs3SEHr41km6+yffAHw4=  10.0.0.1
Bob       syKB97XhGnvC+kynh2KqQJPXoOoOpx/HmpMRTc+r4js=  10.0.0.2
Registry  JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=  10.0.0.254

And here is the initial WireGuard configuration & state for each peer:

Alice

jwhited@Alice:~$ sudo cat /etc/wireguard/utun4.conf
[Interface]
Address = 10.0.0.1/32
PrivateKey = 0CtieMOYKa2RduPbJss/Um9BiQPSjgvHW+B7Mor5OnE=
ListenPort = 51820

# Registry
[Peer]
PublicKey = JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=
Endpoint = 4.4.4.4:51820
PersistentKeepalive = 5
AllowedIPs = 10.0.0.254/32

# Bob
[Peer]
PublicKey = syKB97XhGnvC+kynh2KqQJPXoOoOpx/HmpMRTc+r4js=
PersistentKeepalive = 5
AllowedIPs = 10.0.0.2/32
jwhited@Alice:~$ sudo wg show
interface: utun4
  public key: xScVkH3fUGUv4RrJFfmcqm8rs3SEHr41km6+yffAHw4=
  private key: (hidden)
  listening port: 51820

peer: JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=
  endpoint: 4.4.4.4:51820
  allowed ips: 10.0.0.254/32
  latest handshake: 48 seconds ago
  transfer: 1.67 KiB received, 11.99 KiB sent
  persistent keepalive: every 5 seconds

peer: syKB97XhGnvC+kynh2KqQJPXoOoOpx/HmpMRTc+r4js=
  allowed ips: 10.0.0.2/32
  persistent keepalive: every 5 seconds

Bob

jwhited@Bob:~$ sudo cat /etc/wireguard/wg0.conf
[sudo] password for jwhited:
[Interface]
Address = 10.0.0.2/32
PrivateKey = cIN5NqeWcbreXoaIhR/4wgrrQJGym/E7WrTttMtK8Gc=
ListenPort = 51820

# Registry
[Peer]
PublicKey = JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=
Endpoint = 4.4.4.4:51820
PersistentKeepalive = 5
AllowedIPs = 10.0.0.254/32

# Alice
[Peer]
PublicKey = xScVkH3fUGUv4RrJFfmcqm8rs3SEHr41km6+yffAHw4=
PersistentKeepalive = 5
AllowedIPs = 10.0.0.1/32
jwhited@Bob:~$ sudo wg show
interface: wg0
  public key: syKB97XhGnvC+kynh2KqQJPXoOoOpx/HmpMRTc+r4js=
  private key: (hidden)
  listening port: 51820

peer: JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=
  endpoint: 4.4.4.4:51820
  allowed ips: 10.0.0.254/32
  latest handshake: 26 seconds ago
  transfer: 1.54 KiB received, 11.75 KiB sent
  persistent keepalive: every 5 seconds

peer: xScVkH3fUGUv4RrJFfmcqm8rs3SEHr41km6+yffAHw4=
  allowed ips: 10.0.0.1/32
  persistent keepalive: every 5 seconds

Registry

jwhited@Registry:~$ sudo cat /etc/wireguard/wg0.conf
[Interface]
Address = 10.0.0.254/32
PrivateKey = wLw2ja5AapryT+3SsBiyYVNVDYABJiWfPxLzyuiy5nE=
ListenPort = 51820

# Alice
[Peer]
PublicKey = xScVkH3fUGUv4RrJFfmcqm8rs3SEHr41km6+yffAHw4=
AllowedIPs = 10.0.0.1/32

# Bob
[Peer]
PublicKey = syKB97XhGnvC+kynh2KqQJPXoOoOpx/HmpMRTc+r4js=
AllowedIPs = 10.0.0.2/32
jwhited@Registry:~$ sudo wg show
interface: wg0
  public key: JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=
  private key: (hidden)
  listening port: 51820

peer: xScVkH3fUGUv4RrJFfmcqm8rs3SEHr41km6+yffAHw4=
  endpoint: 2.2.2.2:41424
  allowed ips: 10.0.0.1/32
  latest handshake: 6 seconds ago
  transfer: 510.29 KiB received, 52.11 KiB sent

peer: syKB97XhGnvC+kynh2KqQJPXoOoOpx/HmpMRTc+r4js=
  endpoint: 3.3.3.3:51820
  allowed ips: 10.0.0.2/32
  latest handshake: 1 minute, 46 seconds ago
  transfer: 498.04 KiB received, 50.59 KiB sent

With active tunnels between Alice - Registry and Bob - Registry we should be able to query endpoint information:

jwhited@Alice:~$ dig @4.4.4.4 -p 5353 _wireguard._udp.jordanwhited.net. PTR +noall +answer +additional

; <<>> DiG 9.10.6 <<>> @4.4.4.4 -p 5353 _wireguard._udp.jordanwhited.net. PTR +noall +answer +additional
; (1 server found)
;; global options: +cmd
_wireguard._udp.jordanwhited.net. 0 IN  PTR     YUTRLED535IGKL7BDLERL6M4VJXSXM3UQQPL4NMSN27MT56AD4HA====._wireguard._udp.jordanwhited.net.
_wireguard._udp.jordanwhited.net. 0 IN  PTR     WMRID55V4ENHXQX2JSTYOYVKICJ5PIHKB2TR7R42SMIU3T5L4I5Q====._wireguard._udp.jordanwhited.net.
jwhited@Alice:~$ dig @4.4.4.4 -p 5353 YUTRLED535IGKL7BDLERL6M4VJXSXM3UQQPL4NMSN27MT56AD4HA====._wireguard._udp.jordanwhited.net. SRV +noall +answer +additional

; <<>> DiG 9.10.6 <<>> @4.4.4.4 -p 5353 YUTRLED535IGKL7BDLERL6M4VJXSXM3UQQPL4NMSN27MT56AD4HA====._wireguard._udp.jordanwhited.net. SRV +noall +answer +additional
; (1 server found)
;; global options: +cmd
yutrled535igkl7bdlerl6m4vjxsxm3uqqpl4nmsn27mt56ad4ha====._wireguard._udp.jordanwhited.net. 0 IN SRV 0 0 41424 YUTRLED535IGKL7BDLERL6M4VJXSXM3UQQPL4NMSN27MT56AD4HA====.jordanwhited.net.
YUTRLED535IGKL7BDLERL6M4VJXSXM3UQQPL4NMSN27MT56AD4HA====.jordanwhited.net. 0 IN A 2.2.2.2

Perfect, now we can run the wgsd-client on Alice & Bob.

jwhited@Alice:~$ sudo ./wgsd-client -device=utun4 -dns=4.4.4.4:5353 -zone=jordanwhited.net.
2020/05/20 13:24:02 [JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=] no SRV records found
jwhited@Alice:~$ ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2): 56 data bytes
64 bytes from 10.0.0.2: icmp_seq=0 ttl=64 time=173.260 ms
^C
jwhited@Alice:~$ sudo wg show
interface: utun4
  public key: xScVkH3fUGUv4RrJFfmcqm8rs3SEHr41km6+yffAHw4=
  private key: (hidden)
  listening port: 51820

peer: syKB97XhGnvC+kynh2KqQJPXoOoOpx/HmpMRTc+r4js=
  endpoint: 3.3.3.3:51820
  allowed ips: 10.0.0.2/32
  latest handshake: 2 seconds ago
  transfer: 252 B received, 264 B sent
  persistent keepalive: every 5 seconds

peer: JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=
  endpoint: 4.4.4.4:51820
  allowed ips: 10.0.0.254/32
  latest handshake: 1 minute, 19 seconds ago
  transfer: 184 B received, 1.57 KiB sent
  persistent keepalive: every 5 seconds

jwhited@Bob:~$ sudo ./wgsd-client -device=wg0 -dns=4.4.4.4:5353 -zone=jordanwhited.net.
2020/05/20 13:24:04 [JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=] no SRV records found
jwhited@Bob:~$ sudo wg show
interface: wg0
  public key: syKB97XhGnvC+kynh2KqQJPXoOoOpx/HmpMRTc+r4js=
  private key: (hidden)
  listening port: 51820

peer: xScVkH3fUGUv4RrJFfmcqm8rs3SEHr41km6+yffAHw4=
  endpoint: 2.2.2.2:41424
  allowed ips: 10.0.0.1/32
  latest handshake: 22 seconds ago
  transfer: 392 B received, 9.73 KiB sent
  persistent keepalive: every 5 seconds

peer: JeZlz14G8tg1Bqh6apteFCwVhNhpexJ19FDPfuxQtUY=
  endpoint: 4.4.4.4:51820
  allowed ips: 10.0.0.254/32
  latest handshake: 1 minute, 14 seconds ago
  transfer: 2.08 KiB received, 17.59 KiB sent
  persistent keepalive: every 5 seconds

wgsd-client discovered the endpoint addresses and configured them. The tunnel between Alice and Bob is operational!

Summary

alice_bob_registry_2

We’ve successfully established a WireGuard tunnel directly between two peers constrained by NAT. Our solution uses pre-existing protocols and service discovery techniques, along with pluggable server code. Debugging it is as simple as running dig or nslookup. It’s cross-platform and does not interfere with or require modifications to WireGuard proper.

We’ve only gotten started, however. While fully functional, our CoreDNS plugin can certainly be improved, unit tested, and documented further. Same with the client. Extending this model for full-blown WireGuard configuration management is worthy of its own post.

Security considerations need to be made as well. Should the CoreDNS server only be available over the tunnel with the Registry? Should the zone be signed? Do we need to query WireGuard peer information for every DNS query or can this be cached? All questions worth thinking about.

The code for the solution is open source and available on GitHub. Please try it out, report bugs, and contribute!

WireGuard Endpoint Discovery and NAT Traversal using DNS-SD

2020/05/20