Reverse lookup with systemd-resolved and Consul as secondary DNS Server

This is not getting any attention at https://github.com/hashicorp/consul/issues/6462
Maybe someone here can take a look and point me to a configuration issue on my system?

Overview of the Issue

We are running a Consul server on Ubuntu 18.04 in AWS. The systemd-resolved setup follows this guide: https://learn.hashicorp.com/consul/security-networking/forwarding#systemd-resolved-setup

We don't have any issues resolving names in the AWS domain or the Consul domain; the issue only affects reverse lookups. Occasionally the instance resolves its own FQDN with the Consul domain instead of the AWS domain.

Example of the same dig command run twice within a few seconds, returning different answers:

ip-172-31-28-9:~$ dig -x 172.31.28.9

; <<>> DiG 9.11.3-1ubuntu1.8-Ubuntu <<>> -x 172.31.28.9
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43381
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;9.28.31.172.in-addr.arpa.	IN	PTR

;; ANSWER SECTION:
9.28.31.172.in-addr.arpa. 0	IN	PTR	ip-172-31-28-9.us-east-1.compute.internal.

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Fri Sep 06 18:45:48 UTC 2019
;; MSG SIZE  rcvd: 108

ip-172-31-28-9:~$ dig -x 172.31.28.9

; <<>> DiG 9.11.3-1ubuntu1.8-Ubuntu <<>> -x 172.31.28.9
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63509
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;9.28.31.172.in-addr.arpa.	IN	PTR

;; ANSWER SECTION:
9.28.31.172.in-addr.arpa. 0	IN	PTR	ip-172-31-28-9.node.dc1.consul.

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Fri Sep 06 18:45:49 UTC 2019
;; MSG SIZE  rcvd: 97

The same thing happens when getting the FQDN in Python with socket.getfqdn():

ip-172-31-28-9:~$ cat test.py
import socket
fullname = socket.getfqdn()
print(fullname)
ip-172-31-28-9:~$ python test.py
ip-172-31-28-9.node.dc1.consul
ip-172-31-28-9:~$ python test.py
ip-172-31-28-9.us-east-1.compute.internal

Config

/etc/resolv.conf

nameserver 127.0.0.53
search us-east-1.compute.internal

/etc/systemd/resolved.conf.d/10-consul.conf

[Resolve]
DNS=127.0.0.1
Domains=~consul

/etc/consul.d/agent/config.json

{
  "disable_update_check": true,
  "disable_remote_exec": true,
  "domain": "consul",
  "data_dir": "/var/lib/consul",
  "enable_syslog": true,
  "leave_on_terminate": true,
  "recursors": ["172.31.0.2"]
}

systemd-resolve --status

Global
         DNS Servers: 127.0.0.1
          DNS Domain: ~consul
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 2 (eth0)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 172.31.0.2
          DNS Domain: us-east-1.compute.internal

/etc/iptables/rules.v4

*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
-A OUTPUT -d 127.0.0.1/32 -p tcp -m tcp --dport 53 -j REDIRECT --to-ports 8600
-A OUTPUT -d 127.0.0.1/32 -p udp -m udp --dport 53 -j REDIRECT --to-ports 8600
COMMIT
*raw
:PREROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A PREROUTING -p tcp -m multiport ! --dports 80,443,3006,8080 -j NOTRACK
-A OUTPUT -s 127.0.0.0/8 -d 127.0.0.0/8 -p udp -m udp --sport 8600 -j ACCEPT
-A OUTPUT -s 127.0.0.0/8 -d 127.0.0.0/8 -p udp -m udp --dport 53 -j ACCEPT
-A OUTPUT -s 127.0.0.0/8 -d 127.0.0.0/8 -p tcp -m tcp --sport 8600 -j ACCEPT
-A OUTPUT -s 127.0.0.0/8 -d 127.0.0.0/8 -p tcp -m tcp --dport 53 -j ACCEPT
-A OUTPUT -s 127.0.0.0/8 -d 127.0.0.0/8 -j NOTRACK
COMMIT
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m limit --limit 5/min -m tcp --dport 22 -j LOG --log-prefix "iptables-dropped: "
COMMIT

Reproduction Steps

OS: Ubuntu 18.04
Infrastructure: AWS EC2 instance with default DNS

systemd-resolved and iptables set up as in the guide: https://learn.hashicorp.com/consul/security-networking/forwarding#systemd-resolved-setup

Run dig -x your_ip_address +short in a loop (over 100 runs, about 10 answers came back with the Consul domain)

OR

Use Python's socket.getfqdn() to get the FQDN
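
To put a number on how often each domain comes back, the loop can be scripted. This is only a minimal sketch, assuming Python 3; the helper names domain_of and tally_domains are made up for illustration and are not part of the original setup. It calls socket.getfqdn() repeatedly and counts the domain part of each answer.

```python
import socket
from collections import Counter
from typing import Callable


def domain_of(fqdn: str) -> str:
    """Return everything after the first label, e.g. 'node.dc1.consul'."""
    _, _, domain = fqdn.rstrip(".").partition(".")
    return domain


def tally_domains(lookup: Callable[[], str] = socket.getfqdn,
                  attempts: int = 100) -> Counter:
    """Call `lookup` repeatedly and count the domain part of each result."""
    counts = Counter()
    for _ in range(attempts):
        counts[domain_of(lookup())] += 1
    return counts


if __name__ == "__main__":
    # One line per distinct domain seen, most frequent first.
    for domain, count in tally_domains().most_common():
        print(f"{count:4d}  {domain or '(no domain)'}")
```

On an affected host this should list both us-east-1.compute.internal and node.dc1.consul; on a healthy host only the first would be expected. The lookup argument is a parameter so that a different resolver call, for example one wrapping socket.gethostbyaddr() with a fixed IP address, can be swapped in.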

Consul info for both Client and Server

Client info
output from client 'consul info' command here
Server info
agent:
	check_monitors = 0
	check_ttls = 0
	checks = 2
	services = 1
build:
	prerelease =
	revision = 944cc710
	version = 1.6.0
consul:
	acl = disabled
	bootstrap = true
	known_datacenters = 1
	leader = true
	leader_addr = 10.42.10.40:8300
	server = true
raft:
	applied_index = 5732
	commit_index = 5732
	fsm_pending = 0
	last_contact = 0
	last_log_index = 5732
	last_log_term = 8
	last_snapshot_index = 0
	last_snapshot_term = 0
	latest_configuration = [{Suffrage:Voter ID:e5552921-5a4d-3180-8e8d-8275b5151833 Address:10.42.10.40:8300}]
	latest_configuration_index = 1
	num_peers = 0
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Leader
	term = 8
runtime:
	arch = amd64
	cpu_count = 4
	goroutines = 80
	max_procs = 4
	os = linux
	version = go1.12.8
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 1
	event_time = 2
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 8
	members = 1
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 5
	members = 1
	query_queue = 0
	query_time = 1

Operating system and Environment details

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"

systemd --version
systemd 237
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid

AWS instance with default DNS.

Resolved by adding dnsmasq between systemd-resolved and Consul as per https://github.com/hashicorp/consul/issues/4155#issuecomment-465928992