(new Soapbox())->shout(array_map('strtoupper', $opinions)); //Shaun's blog





chrony improves client stats output for easier abuse detection

Posted August 25, 2020 by shaun

A new pre-release version of chrony (4.0-pre3) is out today, with some neat improvements to the client statistics output. Specifically, the chronyc clients command accepts three new options:

  • -p, minimum packet threshold
  • -r, reset client stat counters
  • -k, replace control stats with NTS stats

The first two will be useful to anyone who uses chronyd for a public-facing time server.

One constant challenge of operating any public service is detecting and reacting to abuse. What happens when some gomer decides to synchronize his clock to your NTP server 100 times a second, or a vendor bug floods you with 20Kpps at random intervals? How — and how quickly — would you notice? chronyd can be configured to enforce rate limiting, but it still has to receive the packets and decide to ignore them. That's a drain on resources which is better handled by locating abusive hosts and blocking them at the firewall.

In this post I'll compare a couple of approaches to spotting excessive NTP queries, and demonstrate how the new chronyc clients parameters can be used to simplify monitoring.

chrony 3.x abuse detection (the old way)

Previously, my process for keeping tabs on NTP involved this nightly cron job:

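#!/usr/bin/bash
# Date stamp from 12 hours ago, so a run in the small hours is labeled
# with the previous day
STAMP=$(date -d "12 hours ago" +"%Y-%m-%d")

# Dump stats for every known client in CSV format
/usr/bin/chronyc -c clients > "/tmp/ntp/ntpclients.${STAMP}.csv"

# Keep the first six fields, rank by NTP packets received (field 2),
# and mail any rows containing a run of five or more digits before a comma
cut -d',' -f1-6 "/tmp/ntp/ntpclients.${STAMP}.csv" \
    | sort -t, -k2,2 -nr \
    | grep -E "[[:digit:]]{5,}," \
    | mail -s "[$(hostname)] Aggressive NTP client report: ${STAMP}" root

# Restart chronyd to flush the counters
/usr/bin/systemctl restart chronyd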

This dumps chrony's client stats to a CSV file, ranks the clients by packets received, and mails a report showing which NTP clients sent 10,000 or more packets yesterday. (10K isn't necessarily excessive; it was just easier to match 5 digits than to perform numeric comparisons.) The notification message looks like this:

From: root@server.example
To: root@server.example
Subject: [server.example] Aggressive NTP client report: 2020-08-24

63.98.240.2,149756,0,-5,127,11896
4.79.238.58,19035,0,-5,127,51468
4.14.252.174,18674,0,-5,127,51468
12.183.201.66,137475,0,-5,127,28980
198.35.18.181,12576,0,-2,127,43594
...

Despite flagging some obvious undesirables, this approach has a couple of weaknesses.

The first problem is that the service has to be restarted to flush the counters each night. The second is that, with chrony configured for clientloglimit 16777216, stats are only retained for ~100,000 clients at a time. I never bothered digging into chrony's eviction logic, so I don't know which client stats are being dropped, just that hundreds of thousands are. If it's frecency-based, then short-lived egregious abuse early in the day might disappear from the stats by the time the job runs, and I'd never be the wiser.
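
As an aside, the numeric comparison that the 5-digit grep sidesteps is not much more work with awk. The following is only a sketch, not part of the original job: filter_clients is a hypothetical helper name, and it assumes the CSV layout shown in the sample report, with the NTP packet count in the second field.

```shell
# Hypothetical helper (not from the original cron job): read client CSV rows
# on stdin and print those whose NTP packet count (field 2) is at least the
# given threshold, highest count first.
filter_clients() {
    local threshold="$1"
    # "+ 0" forces awk to compare the values as numbers, not strings
    awk -F',' -v min="$threshold" '$2 + 0 >= min + 0' \
        | sort -t',' -k2,2nr
}

# Possible use in the nightly job, in place of the sort/grep pair:
#   cut -d',' -f1-6 "/tmp/ntp/ntpclients.${STAMP}.csv" | filter_clients 10000
```

With a real comparison, the threshold can be any number rather than a power of ten.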

chrony 4.0 abuse detection (the new way)

The new chronyc clients options available in the 4.0-pre3 release are helpful for spotting abuse in a more timely fashion. The -p flag, combined with a numeric value, will only display clients who have sent at least the specified number of packets to the server. And the -r flag will reset the packet counters immediately, obviating the need to restart the service.

Here's a quick look at how -p works. First, let's see how many total clients are in the stats cache:

[root@tock /home/files/chrony-4.0-pre3]# chronyc -n clients | wc -l
102179

With prior versions of chrony, listing these clients was an all-or-nothing affair; you'd have to dump all 102,179 of those records somewhere and process them, as in the cron job above. Now, it's easy to list only those clients that meet a certain threshold. For example, here's who has sent at least 1,000 packets to the service:

[root@tock /home/files/chrony-4.0-pre3]# chronyc -n clients -p 1000
Hostname                      NTP   Drop Int IntL Last     Cmd   Drop Int  Last
===============================================================================
12.190.239.218               2031      0  -2   -     0       0      0   -     -
45.228.139.65                1220      0   0   -     1       0      0   -     -
187.210.9.188                1628      0  -2   -   234       0      0   -     -

The size of the output here is much easier to work with, and since it doesn't have to be parsed or sliced up, the pretty formatting can stay intact. Given a reasonable threshold value, this is also suitable for watch-ing in a terminal.

Using the new -p and -r options, I was able to refine and improve my abuse monitoring process. Instead of doing a stats run once per day, and potentially losing visibility into a portion of the day's activity, I've now set up an hourly job:

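#!/usr/bin/bash
STAMP=$(date +"%Y-%m-%d-%H:%M")

# The report starts with two header lines; any line after them is a client
# that has sent at least 6,000 packets since the last reset
if /usr/bin/chronyc -n clients -p 6000 | tail -n +3 | grep -q .; then
    # Mail the report, resetting the counters at the same time
    /usr/bin/chronyc -n clients -r -p 6000 \
        | mail -s "[$(hostname)] Aggressive NTP client report: ${STAMP}" root
else
    # Nothing to report; just reset the counters
    /usr/bin/chronyc -n clients -r >/dev/null
fi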

This looks for clients that sent at least 6,000 NTP packets in the last hour, and mails a notice if any were found. In either case, the -r option resets the stat counters to 0, so each execution only examines an hour's worth of data. Not only will potential abuse get detected more rapidly, but I also don't have to restart chronyd anymore.
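
Since the end goal is blocking abusive hosts at the firewall, it can help to reduce a report to bare addresses. This is a rough sketch under stated assumptions: list_offenders is a hypothetical helper name, and it assumes the two-line header and column layout shown in the chronyc -n clients -p 1000 output earlier.

```shell
# Hypothetical helper (not from the original post): read `chronyc -n clients`
# output on stdin and print the first column (the client address) of every
# non-empty row after the two header lines.
list_offenders() {
    awk 'NR > 2 && NF > 0 { print $1 }'
}

# Possible use (the -p option needs chrony 4.0-pre3 or later):
#   /usr/bin/chronyc -n clients -p 6000 | list_offenders
```

The resulting list, one address per line, can then be reviewed by hand or fed to whatever firewall tooling is in use.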


