(new Soapbox())->shout(array_map('strtoupper', $opinions)); //Shaun's blog

Website integrity monitoring through version control

Posted September 20, 2018 by shaun

This month has brought news of two very high-profile website breaches involving credit card theft:

  • On September 6th, British Airways announced that its payment website had been compromised for a 15-day period spanning late August into September.

  • On September 19th, it was revealed that PC retailer Newegg suffered a nearly identical breach, one that affected transactions for more than a month.

In these attacks, a group dubbed MageCart first gained illicit access to each victim's web servers, then inserted malicious code into JavaScript files included by their payment processing applications. That extra code secretly siphoned customer payment data off to a separate server under the attackers' control. Between the two companies, it's likely that credit card details were stolen from more than a million transactions.

Nobody noticed? For weeks?

MageCart has racked up many other victims, often by attacking third-party JavaScript libraries. It's understandable, if not entirely forgivable, that a company may not notice tampering with a vendor library hosted off-site. What struck me about the British Airways and Newegg attacks is that, in both cases, the compromised code appears to have gone undetected on their own servers for several weeks. My comment on Twitter sums up my reaction:

It's unnerving that these injections go undetected for so long. Do folks not monitor their deployments? Everything I'm responsible for, right down to my dinky blog, sends me hourly emails if anything is out of sync with its corresponding repository. | 12:14 PM - 19 Sep 2018

I wanted to expand upon that thought here, and present the strategy I use a bit more thoroughly than Twitter allows.

There are probably a hundred ways to monitor a deployment for unwanted changes: inotify and pyinotify and gamin and monocle and on down the list. All of these utilities will do the job, but my approach is to monitor using tools that are already available on my servers: version control to check for tampering, and PHP to automate those checks.

Every website should be a repository

More specifically, every deployed instance of your website should be a checked-out copy of the trunk (or master) of a repository. Git, Subversion, it doesn't really matter which system you use, so long as you're using one and the canonical copy of your website lives there. Setting up a version control server is beyond the scope of this document; there are plenty of tutorials covering every VCS/OS combination.

As with any other project, development on your website takes place in branches while the trunk (or master) holds the current live version. With the exception of tweaking config files, no changes should ever be made manually inside a document root. The only way you touch any files there should be with svn up, git pull, or equivalent. If your infrastructure is cloud-based, the script or playbook you use to deploy new instances should include a step to automatically perform a fresh checkout.

What goes into the website repository? As a rule of thumb, almost everything your team creates should get versioned: HTML, application code (PHP, JSP, Go, etc.), CSS style sheets, JavaScript, configuration files, and any images that comprise the site design. Leave out user-generated content, and leave out any large binary data like inventory images or product manuals. If those things are served out of the same document root, set your svn:ignore property or .gitignore to disregard them; better yet, shuffle them off to a static subdomain.

Deny web access to repository metadata

If you check out a repository into your document root, that path may contain a .git or .svn directory (or a bunch of them, if you use an older version of svn). Make sure your httpd process is explicitly forbidden from serving these directories.

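In Apache, use a <DirectoryMatch> block in the server or virtual host configuration. A <Files> block matches only the final filename component, so it would not cover the files inside these directories:

    <DirectoryMatch "/\.(git|svn)(/|$)">
        Require all denied
    </DirectoryMatch>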

In nginx, use a location stanza:

    location ~ /\.(git|svn) {
        deny all;
    }
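
Once a deny rule is in place, it is worth confirming from the outside that it works. The following is a minimal sketch, assuming Python 3 is available on the machine doing the checking; the function names and the list of probe paths are illustrative choices, not part of any existing tool. It requests a few well-known metadata files and reports any that the server still hands out.

```python
import urllib.error
import urllib.request

# Well-known files inside git and Subversion metadata directories.
# This list is an illustrative assumption; extend it as needed.
METADATA_PATHS = (".git/HEAD", ".git/config", ".svn/entries", ".svn/wc.db")


def metadata_urls(base_url):
    """Return the probe URLs for a site, e.g. https://example.com/.git/HEAD."""
    return [base_url.rstrip("/") + "/" + path for path in METADATA_PATHS]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 403, 404, and similar responses raise HTTPError


def exposed_paths(base_url, fetch=fetch_status):
    """List the probe URLs that the server answers with status 200."""
    return [url for url in metadata_urls(base_url) if fetch(url) == 200]
```

Calling exposed_paths("https://example.com") should return an empty list when the rules are doing their job. A site that answers every unknown URL with a 200 page would produce false positives, so treat a non-empty result as a prompt to look closer rather than as proof.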

Set up automated version control status checks over SSH

Your version control system has a built-in way to figure out which files have been modified in a given working copy: svn status, git status, hg status, and so on. While these commands are normally used to determine what changes need to be committed, they're also perfectly suitable for the opposite task, making sure nothing has changed. If an attacker fiddles with any of your website's files, this will be signaled in the output of a status check.

To that end, I suggest creating a cron job that remotely runs git status or svn status over SSH against every single deployed copy of your repository on an hourly basis, if not more frequently. When the output from any command indicates file changes, interpret that scenario as potential tampering. The script should send an alert to your team, your ticket tracking system, your emergency pager alias, wherever it's going to be seen and acted upon quickly.
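
To make the alerting step concrete, here is a rough sketch in Python of turning a set of failed checks into an email. It is not the script described later in this post; the function names, the shape of the findings dictionary, and the assumption of a mail server listening on localhost are all mine.

```python
import smtplib
from email.message import EmailMessage


def build_alert(findings, sender, recipient):
    """Build an email listing every deployment whose status check was not clean.

    findings maps a label such as "user@example.com:/foo/bar" to the raw
    status output for that working copy. Returns None when it is empty.
    """
    if not findings:
        return None
    message = EmailMessage()
    message["Subject"] = f"Possible tampering in {len(findings)} deployment(s)"
    message["From"] = sender
    message["To"] = recipient
    sections = []
    for label, output in sorted(findings.items()):
        sections.append(f"== {label} ==\n{output.strip()}\n")
    message.set_content("\n".join(sections))
    return message


def send_alert(message, smtp_host="localhost"):
    """Hand the message to a mail server (assumed to accept mail on port 25)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(message)
```

Keeping the message construction separate from the delivery makes it easy to swap email for a ticket tracker or a pager gateway later.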

Note: For the following commands to work seamlessly, you need to get your SSH keys in order to enable passwordless SSH across all of the involved systems. A utility called ssh-copy-id exists on many systems to facilitate this process. If you don't set this up properly, your script will hang or time out waiting for a password.
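
If a key does go missing, an unattended job is better off failing quickly than sitting at a password prompt. One way to get that behavior, sketched here in Python with hypothetical function names, is to pass OpenSSH's BatchMode and ConnectTimeout options on every call.

```python
import subprocess


def build_ssh_command(target, remote_command, connect_timeout=10):
    """Build an ssh argument list that fails fast instead of prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                      # never ask for a password
        "-o", f"ConnectTimeout={connect_timeout}",  # give up on unreachable hosts
        target,
        remote_command,
    ]


def run_remote(target, remote_command, timeout=60):
    """Run remote_command on target and return (exit_code, stdout).

    subprocess.TimeoutExpired is raised if the whole call exceeds timeout.
    """
    result = subprocess.run(
        build_ssh_command(target, remote_command),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout
```

ssh exits with status 255 when the connection or authentication itself fails, so a monitoring job can tell "could not check" apart from "checked and found changes" and alert on both.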

The syntax to check a repository's status remotely is:

    ssh -t user@example.com "cd /foo/bar && git status" 2>/dev/null

    ssh -t user@example.com "svn status /foo/bar/" 2>/dev/null

When a subversion repository is clean and unaltered, the command will produce no output at all.

When a git repository is clean and unaltered, the output looks something like this (the exact wording varies between git versions):

    # On branch master
    nothing to commit (working directory clean)

Any divergence from these results should raise a red flag and trigger an alert.
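
Because git's human-readable message changes between versions, a script may find it easier to ask for machine-readable output. git status --porcelain prints one line per modified or untracked path and nothing at all when the working copy is clean, which matches the behavior of svn status. The sketch below, with hypothetical function names, builds the remote command for either system and decides whether the result is clean.

```python
import shlex


def status_command(vcs, path):
    """Return the shell command to run on the remote host for a working copy."""
    quoted = shlex.quote(path)
    if vcs == "git":
        return f"cd {quoted} && git status --porcelain"
    if vcs == "svn":
        return f"svn status {quoted}"
    raise ValueError(f"unsupported VCS: {vcs}")


def changed_entries(status_output):
    """Return the non-blank lines of the status output, one per changed path."""
    return [line.rstrip() for line in status_output.splitlines() if line.strip()]


def is_clean(status_output):
    """A working copy is clean when the status command printed nothing."""
    return not changed_entries(status_output)
```

Untracked files count as changes here, which is the desired behavior: a file an attacker drops into the document root is just as interesting as one they modify.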

Sample scripts

You can build your own monitoring job, or feel free to use mine.


I'm not so good at bash scripting, so I use a PHP script called remote-repository-check to automate this whole process. You don't need to have PHP installed on any of the target servers, just the one where this job runs. Configuration is straightforward; just define a new element of the $repos array for each repository you want to monitor.

I have this set to run every hour through cron.


While not directly a part of repository monitoring, I have another utility called find-repos that might be useful, especially if you deploy multiple sites to each server. This script will find all of the root-level git and subversion repositories on a server. You can use its output to assist in configuring the remote-repository-check script.
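
For readers who want to see the idea rather than the tool, here is a small Python sketch of the same kind of search. It is not the find-repos script itself; the function name and the choice to stop descending once a repository is found are my assumptions about what "root-level" means.

```python
import os

# Directory names that mark the top of a working copy.
METADATA_DIRS = {".git", ".svn"}


def find_repositories(top):
    """Yield (path, vcs) for each outermost git or Subversion working copy under top."""
    for current, dirnames, _filenames in os.walk(top):
        found = sorted(METADATA_DIRS.intersection(dirnames))
        if found:
            # Prune the walk so nested checkouts (vendored libraries, old-style
            # per-directory .svn folders) are not reported separately.
            dirnames[:] = []
            yield current, found[0].lstrip(".")
```

Running list(find_repositories("/var/www")) on a web server would produce pairs such as a path and "git", ready to paste into a monitoring configuration. The path /var/www is only an example.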

Don't sweat the overhead

As a final thought, if you're concerned about the performance impact of running so many repository checks, don't be. Modern version control is fast and efficient at comparing deltas. As an example, as of this writing, the FreeBSD source tree holds 199,984 files. It only takes ~6 seconds to run svn status against the entire collection, and that's on hardware from 2010. Your repository probably isn't that big, and your equipment probably isn't that old.
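
If you would rather measure than take my word for it, timing a status check on your own tree takes only a few lines. This is a minimal Python sketch with a hypothetical helper name; point it at whatever command and path you actually use.

```python
import subprocess
import time


def time_command(args):
    """Run a command and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(args, capture_output=True, text=True)
    return time.perf_counter() - start, result.returncode
```

For example, time_command(["svn", "status", "/foo/bar/"]) returns the wall-clock time for one full check of that working copy. The first run after a reboot is usually the slowest, since nothing is cached yet.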
