Securing home (part 3)
Or Setting up a gaslighting DNS
Published on 2023-01-03
Home is finally equipped with serious networking equipment™, but I was missing one core service: DNS.
"Gaslighting is a colloquialism, loosely defined as manipulating someone so as to make them question their own reality." (according to Wikipedia)
§ everyone gets a lie
OpenBSD and FreeBSD both ship unbound(8) in their base:
- since 2013 for FreeBSD
- since 2014 for OpenBSD
Unbound since 1.16 can handle tags and views, which come in handy for serving different client hosts differently (the lying/gaslighting part). The idea is simple: a single unbound daemon runs on my home router and serves all the networks I have at home (grown-ups, kids, guests and IoT). I already had a script that created a file that can be included in unbound.conf(5), so I tried patching it for this new environment, and it has been quite an adventure.
After a bit of experimenting, here’s an unbound.conf(5) example that does exactly what I need it to:
server:
interface: 10.28.56.1
interface: 10.29.58.1
# performance, see https://nlnetlabs.nl/documentation/unbound/howto-optimise/
prefetch: yes
prefetch-key: yes
serve-expired: yes
rrset-cache-size: 100m
msg-cache-size: 50m
#crontab(5) contains:
# ftp -o /var/unbound/db/root.hints https://www.internic.net/domain/named.cache
root-hints: "/var/unbound/db/root.hints"
hide-identity: yes
hide-version: yes
# Perform DNSSEC validation.
auto-trust-anchor-file: "/var/unbound/db/root.key"
val-log-level: 2
# Synthesize NXDOMAINs from DNSSEC NSEC chains.
# https://tools.ietf.org/html/rfc8198
aggressive-nsec: yes
# define all tags
define-tag: "bad gambling nsfw home_whitelist iot_blacklist iot_whitelist"
# sane defaults
access-control: 0.0.0.0/0 deny
# 10.28.56.0/24 querying "bad" domains get a specific reply
# no specifics for nsfw or gambling domains
# using different A replies helps identify what went well/wrong
access-control-tag: 10.28.56.0/24 "bad"
access-control-tag-data: 10.28.56.0/24 "bad" "A 127.0.56.1"
# 10.29.58.0/24 querying "bad or nsfw" domains get a specific reply, but we
# will answer truthfully for domains with the home_whitelist tag
access-control-tag: 10.29.58.0/24 "bad nsfw home_whitelist"
access-control-tag-action: 10.29.58.0/24 "home_whitelist" always_transparent
access-control-tag-data: 10.29.58.0/24 "bad" "A 127.0.58.1"
access-control-tag-data: 10.29.58.0/24 "nsfw" "A 127.0.58.2"
# 10.30.59.0/24 are only allowed a few domains (whitelist), but not tracking
access-control-tag: 10.30.59.0/24 "bad iot_whitelist iot_blacklist"
local-zone-tag: . "iot_blacklist"
local-zone: . redirect
access-control-tag-action: 10.30.59.0/24 "iot_whitelist" transparent
access-control-tag-data: 10.30.59.0/24 "bad" "A 127.0.59.1"
access-control-tag-data: 10.30.59.0/24 "iot_blacklist" "A 127.0.59.2"
# break (NXDOMAIN) use-application-dns.net (DoH canary domain)
local-zone: use-application-dns.net static
# unbreak laposte.fr/suivi, because they are outsourcing core functionality;
local-zone-tag: cdn.tagcommander.com "home_whitelist"
local-zone: cdn.tagcommander.com redirect
# NB: tagcommander.com ends up with the "bad" tag, but our setup above
# overrides that for 10.29.58.0/24
# The generated file is included after the rest
include: out.lie-to-us
include: iot_whitelist.conf
remote-control:
control-enable: yes
control-interface: /var/run/unbound.sock
lie-to-us produces a file that looks like:
...
local-zone: tgoogle.com redirect
local-zone-tag: tgoogle.com "bad"
local-zone: translategoogle.com redirect
local-zone-tag: translategoogle.com "bad"
local-zone: translatorgoogle.com redirect
local-zone-tag: translatorgoogle.com "bad"
local-zone: tuyulz-blogspot.googlecode.com redirect
local-zone-tag: tuyulz-blogspot.googlecode.com "bad"
local-zone: vaderkalendern.segoogle.com redirect
local-zone-tag: vaderkalendern.segoogle.com "bad"
...
#TV
local-zone-tag: netflix.com "iot_whitelist"
local-zone: netflix.com redirect
local-zone-tag: nflximg.com "iot_whitelist"
local-zone: nflximg.com redirect
...
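To check that each network really gets its own lie, a small helper along these lines could be run from a client on the relevant network. This is only a sketch: check_lie is a hypothetical name, it assumes dig(1) is available on the client, and the addresses in the usage comment come from the example config above.

```shell
# check_lie SERVER NAME EXPECTED_A
# Hypothetical helper: succeeds when SERVER answers NAME with EXPECTED_A.
# Assumes dig(1) is installed on the machine running the check.
check_lie() {
  _answer=$(dig +short @"$1" "$2" A | head -n 1)
  if [ "$_answer" = "$3" ]; then
    printf 'ok   %s -> %s\n' "$2" "$_answer"
  else
    printf 'FAIL %s -> %s (expected %s)\n' "$2" "${_answer:-no answer}" "$3"
    return 1
  fi
}

# Example, to be run from a client on 10.28.56.0/24:
#   check_lie 10.28.56.1 tgoogle.com 127.0.56.1
```

Because every network gets a different A record for the same tag, a failing check also hints at which access-control-tag-data line was (or was not) applied.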
§ but not too fast
FreeBSD and OpenBSD don’t ship GNU’s bash in their base, and I like to write scripts that “just work”™. Using only POSIX-ish shell is usually how I achieve this goal, but this time it wasn’t possible:
openbsd% time lie-to-us -o out.lie-to-us
lie-to-us -o out.lie-to-us 360.48s user 1458.17s system 87% cpu 34:49.89 total
The output file was around 2 million lines, and OpenBSD’s sh(1) obviously had serious issues looping over that many lines (while IFS= read -r _first _second _rest; do ...; done < input). Linux’ bash didn’t (it completed the run in less than 3 minutes).
§ what do I actually do?
lie-to-us did two things:
- fetch and sanitize domain lists for various tags
- merge the domain -> tag mapping to the output file format (local-zone-tag:)
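The article does not show the sanitizing step itself, so the following is only an illustration of what it might involve, not the actual lie-to-us code. It assumes the fetched lists are either plain domain lists or hosts-style files ("0.0.0.0 domain"), possibly with "#" comments and CRLF line endings; sanitize_domains is a hypothetical name.

```shell
# Hypothetical sanitizer: reads a raw list on stdin, prints one domain per line.
# Assumption: input is a plain domain list or a hosts-style file.
sanitize_domains() {
  tr -d '\r' | awk '
    { sub(/#.*/, "") }          # strip comments
    NF == 0 { next }            # skip blank lines
    {
      d = tolower($NF)          # "0.0.0.0 ads.example" or just "ads.example"
      sub(/\.$/, "", d)         # drop a trailing root dot
      # keep only plausible host names whose last label starts with a letter
      if (d ~ /^[a-z0-9_-]+(\.[a-z0-9_-]+)*\.[a-z][a-z0-9-]*$/) print d
    }' | LC_ALL=C sort -u
}
```

Rejecting names whose last label starts with a digit is a cheap way to keep stray IP addresses out of the domain list.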
How can I speed that up?
§ reducing input
Less data to comb through means it goes fast™, right? unbound(8) is a recursive resolver, and a redirect local-zone also answers for every name below it, so if it serves a lie for malware.tld, we don’t need to have specific data for foo.malware.tld. The grand plan is as follows:
- Sort domains
- Read the file line by line, remove the current line if it ends with the previously kept domain (current line is $prefix.$previous_line)
Which yields:
# sort domains so that subdomains are below their parent domain
rev input | sort | rev > input.sorted
_prev_domain=""
while IFS= read -r _domain; do
  case "$_domain" in
    *."${_prev_domain:-thisshouldntmatch}")
      : ;; # subdomain of the last kept domain: skip it
    *)
      # print the last kept domain (there is none on the first line)
      if [ -n "$_prev_domain" ]; then
        printf '%s %s\n' "$_prev_domain" "$_tag" # _tag is set beforehand
      fi
      _prev_domain="$_domain" ;;
  esac
done < input.sorted > output
# don't forget the last domain!
printf '%s %s\n' "$_prev_domain" "$_tag" >> output
Except that it’s very, very slow. On a 954k-line input, it took some 16 minutes.
Let’s look elsewhere!
awk — pattern-directed scanning and processing language
Sounds promising, even if the syntax is a bit weird for a newcomer like me. Let’s go.
BEGIN {
  dom = ""; domregex="thisshouldntmatch"
}
$0 !~ domregex {
  if(dom != "") {
    printf("%s %s\n", dom, tag)
  };
  domregex=".*\\."$0; dom=$0
}
END {
  printf("%s %s\n", dom, tag)
}
- The BEGIN block sets some variables.
- The $0 !~ part checks if the current line does not match the previous domain at all; if it doesn’t match: printf() the domain (and not the “subdomain”), update variables. This code section repeats for all lines of the input.
- The END block deals with the end situation.
If we redirect the output to our destination and feed awk(1) with a tag and the input, it’s all good!
openbsd% time awk -v tag=tag 'BEGIN { dom = ""; domregex="thisshouldntmatch" }
$0 !~ domregex { if(dom != "") { printf("%s %s\n", dom, tag) }; domregex=".*\\."$0; dom=$0 }
END { printf("%s %s\n", dom, tag) }' < input.sorted > output
awk -v tag=tag < input.sorted > output 17.88s user 1.79s system 100% cpu 19.669 total
A 50× speedup, not too shabby.
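For comparison, here is a variant of the same idea that swaps the regex match for a literal suffix comparison, so the dots in a domain are never interpreted as regex wildcards. It is a sketch rather than the code lie-to-us uses: dedupe_subdomains is a hypothetical name, and forcing the C locale for sort(1) is my assumption to keep the ordering predictable.

```shell
# Hypothetical helper: keep only the topmost domains and tag them.
# Usage: dedupe_subdomains TAG < domains.txt
dedupe_subdomains() {
  # reverse each line so that a parent domain sorts right before its subdomains
  rev | LC_ALL=C sort -u | rev | awk -v tag="$1" '
    {
      suffix = "." kept
      n = length($0) - length(suffix)
      # skip the line when it literally ends with ".<kept>"
      if (kept != "" && n > 0 && substr($0, n + 1) == suffix) next
      kept = $0
      printf("%s %s\n", kept, tag)
    }'
}
```

Since a kept domain can be printed as soon as it is seen, no END block is needed. One caveat of the sorting trick: in the C locale "-" sorts before ".", so a hyphenated sibling such as x-malware.tld lands between malware.tld and its subdomains, and a few redundant subdomains may survive. That only costs some extra lines; the answers stay the same.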
§ squashing lists
After our tagged domains are all neatly ordered with their tag alongside them, we need to create a list of domains with all their tags. For example, this input:
porn.tld nsfw
nsfwmalware.tld bad
nsfwmalware.tld nsfw
googleadservices.com bad
should become:
local-zone: porn.tld redirect
local-zone-tag: porn.tld "nsfw"
local-zone: nsfwmalware.tld redirect
local-zone-tag: nsfwmalware.tld "bad nsfw"
local-zone: googleadservices.com redirect
local-zone-tag: googleadservices.com "bad"
Again, the shell version of the loop was excruciatingly slow. The awk version is incredibly fast.
$1 == domain { tags = tags " " $2 }
$1 != domain {
  if (domain != "") {
    printf("local-zone: %s redirect\nlocal-zone-tag: %s \"%s\"\n", domain, domain, tags)
  }
  domain = $1; tags = $2
}
END {
  if (domain != "") {
    printf("local-zone: %s redirect\nlocal-zone-tag: %s \"%s\"\n", domain, domain, tags)
  }
}
If our current line is about the same domain as the previous line, append the current tag to tags; else format domain and tags for the output, and update domain and tags to that of the current line. Don’t forget the last line.
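The squashing program only works if all lines for a given domain are adjacent. A minimal way to try it on the sample above is to wrap it in a shell function that sorts first; squash_tags is a hypothetical name, and the sort -u step (which also drops duplicate "domain tag" pairs) is my addition, not necessarily what lie-to-us does.

```shell
# Hypothetical wrapper: read "domain tag" lines, write unbound.conf(5) lines.
squash_tags() {
  # sorting makes all lines of one domain adjacent, with tags in sorted order
  LC_ALL=C sort -u | awk '
    $1 == domain { tags = tags " " $2 }
    $1 != domain {
      if (domain != "") {
        printf("local-zone: %s redirect\nlocal-zone-tag: %s \"%s\"\n", domain, domain, tags)
      }
      domain = $1; tags = $2
    }
    END {
      if (domain != "") {
        printf("local-zone: %s redirect\nlocal-zone-tag: %s \"%s\"\n", domain, domain, tags)
      }
    }'
}

# Example:
#   printf '%s\n' 'porn.tld nsfw' 'nsfwmalware.tld bad' 'nsfwmalware.tld nsfw' | squash_tags
```

Note that the output is ordered by domain name, so it will not follow the order of the input lines.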
§ final code
Heavily inspired by my previous work on lie-to-me, lie-to-us has a very simple interface, but only targets unbound.conf(5) this time. I also dropped lots of difficult things, such as dealing with NXDOMAINs, which can be handwritten by the operator (see example config above, or lie-to-us’ help text).
lie-to-us [-d] [-o out] [-i "domain [domain [...]]"] [tag!URL[!IP][(^|\n)tag!URL[!IP]...]]
lie-to-us -h
lie-to-us is also quite speedy now, completing its run in about 1 minute on my OpenBSD router (34× speedup!) or 47 seconds on my FreeBSD server (the one hosting this blog).
§ lessons learnt
- unbound(8) rocks! It has some very powerful features but the documentation doesn’t cover edge-cases (multiple tags that match, precedence, etc.) (the config in this article has been checked for validity and does what I expect it to)
- awk(1) is an incredibly speedy way of parsing long files, though it too comes with its share of portability issues (gawk(1), nawk(1), …) (1, 2). However, those UNIX greybeards did know what they were doing when first creating those tools!