File Change Detection and Remediation

Today we are going to talk about something a bit less technical than our last post, but also very important for anyone who manages a website: how to detect and address file changes. Why? Reviewing your file changes is one of the best ways to catch infections and hacks on your website early. When a website is hacked, the attacker or the malware either changes original files on the site or adds new ones. Monitoring file changes on your server allows you to take action right away when anything suspicious happens. In this post we’ll discuss exactly how to do that and share some tips and tricks for staying on top of changing files.

Fortunately, we don’t have to do this manually. Today there are tools like CodeGuard and many plugins that can help you track file changes automatically for sites built with WordPress, Joomla, Drupal, or anything else.

A little bit about hashing

To explain how file monitoring software works, I’d like to first talk about cryptographic hash functions, a.k.a. hashing. You might not know what this is, but I can assure you that you have used it at least once in your life.

“A hash function is any function that can be used to map data of arbitrary size to data of fixed size” – Wikipedia

What a hash function does is produce a fixed-length string of letters and numbers that represents the input file and is, for practical purposes, unique to it. If you’ve heard the terms checksum, digest or signature, the value that the hash function returns is similar. These functions are very common in the security world, where they are used for integrity checks and authentication. The most common hashing algorithms are MD5 and the SHA family, which includes SHA-1, SHA-2 and SHA-3, the last of which was released by NIST in 2015. Although MD5 and SHA-1 are no longer considered safe to use as cryptographic hash functions, they are still widely used as checksums to verify data integrity, which is exactly what we need here.

Visual representation of hash function – source: https://i.stack.imgur.com/WEKK4.png

Another characteristic is that they are considered one-way functions, meaning that if you only have the hash value or message digest, you won’t be able to determine what text or file generated it. Also, the algorithm is designed in a way that, if you change a single bit (bit, not byte) of the file or the string you are using, the hash value should be completely different from the last one.
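To make the single-bit claim concrete, here is a minimal sketch using Python’s standard hashlib module. The input string and the helper name sha256_hex are only illustrative choices, and SHA-256 is picked as one example of a SHA-2 function.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = b"hello world"
# Flip the lowest bit of the first byte: 'h' (0x68) becomes 'i' (0x69).
modified = bytes([original[0] ^ 0x01]) + original[1:]

h1 = sha256_hex(original)
h2 = sha256_hex(modified)

# Count how many of the 256 output bits differ between the two digests.
differing_bits = bin(int(h1, 16) ^ int(h2, 16)).count("1")

print(h1)
print(h2)
print(f"{differing_bits} of 256 bits differ")
```

Running it prints two digests of the same length that look unrelated, even though the inputs differ by exactly one bit.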

One common place you may have seen these values used is with file downloads. Some software providers, like Ubuntu, will provide hash values for their releases. After downloading these large files, you can calculate the hash locally and make sure it matches the one that the vendor published.

Ubuntu files MD5 hashes
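Checking a download against a published hash can be sketched in a few lines of Python. The function names file_digest and matches_published are hypothetical helpers written for this post, and the default of SHA-256 is an assumption; use whichever algorithm the vendor actually publishes.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large downloads never sit fully in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="sha256"):
    """Return True if the local file's hash equals the published value."""
    actual = file_digest(path, algorithm)
    # Normalize whitespace and case before comparing the hex strings.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

You would call matches_published with the path of the downloaded file and the hash string copied from the vendor’s page. A False result means the file is corrupted, incomplete, or not the file the vendor published.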

To summarize, the ideal hash function should have these five attributes:

  • The same message must always result in the same hash value.
  • It should be quick to generate the hash value for any size of message or file.
  • It has to be infeasible to generate the original message from its hash value except by brute-forcing all possible messages.
  • Any change to a message should change the hash value completely.
  • It should be infeasible to find two different messages with the same hash value.

File Monitoring Locally

File monitoring software usually uses hash functions to detect whether a file has changed. It generates a list of hashes for every file on your server, then periodically recalculates each hash and compares it with the stored one. If they don’t match, the file was modified in some way.

If you don’t have proper logs on your system, it will be hard to tell exactly what was changed and by whom. But if you don’t remember changing the file recently, take a look and compare it with the latest copy you have in your backups. With WordPress core files, for example, you can just compare the file with a fresh copy from the same version you have on your site, since those files aren’t supposed to be modified.

It’s worth noting that changing only the name of a file won’t change the hash value, since the hash is based on the content of the file, not the name itself.
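The baseline-and-compare idea can be sketched as follows. This is a simplified illustration rather than how any particular plugin works, and the function names hash_file, snapshot and compare are hypothetical.

```python
import hashlib
from pathlib import Path


def hash_file(path):
    """Return the SHA-256 hex digest of one file's content."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map every file under root (by relative path) to its content hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(old, new):
    """Report which paths were added, removed, or modified between snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(
            path for path in old.keys() & new.keys() if old[path] != new[path]
        ),
    }
```

Because the hash depends only on content, a renamed file shows up here as one removed path and one added path that share the same hash value.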

For WordPress, there are plugins that will do that for you, such as WordPress File Monitor. There are similar plugins for other content management systems. You can also use logging software and intrusion detection systems (IDS) like OSSEC. They are a little harder to install and configure but will do much more than just monitor files.

File Monitoring at CodeGuard

At CodeGuard, we take a similar approach to monitoring files, but there are some important differences since we monitor remotely via FTP or SFTP. Calculating file hashes requires direct access to the file content. The software and plugins mentioned previously run on the server where the content is hosted, so they have direct access to the files. With a remote protocol like FTP or SFTP, the client has to download a file from the host before it can calculate the hash. Doing that for all file content from every website for every daily backup would be time-consuming and would require a substantial amount of bandwidth. To ensure fast and reliable backups, we look at file metadata to generate a list of possible file changes. Then we download only that subset of possibly changed files for closer inspection and hashing. This allows us to perform backups with industry-leading speed and performance without impacting the underlying host.
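The two-stage idea (cheap metadata check first, hashing only for candidates) can be illustrated with a short sketch. This is not CodeGuard’s actual implementation. It assumes that size and modification time are the metadata being compared, it reads a local directory as a stand-in for a remote FTP or SFTP listing, and all three function names are hypothetical.

```python
import hashlib
from pathlib import Path


def list_metadata(root):
    """Return {relative_path: (size, mtime_ns)} without reading file content."""
    root = Path(root)
    meta = {}
    for p in root.rglob("*"):
        if p.is_file():
            st = p.stat()
            meta[p.relative_to(root).as_posix()] = (st.st_size, st.st_mtime_ns)
    return meta


def candidates(old_meta, new_meta):
    """Paths that are new or whose size or modification time differs."""
    return sorted(
        path for path, m in new_meta.items() if old_meta.get(path) != m
    )


def confirm_changes(root, paths, old_hashes):
    """Read and hash only the candidate files; keep those whose content changed."""
    changed = []
    for rel in paths:
        digest = hashlib.sha256((Path(root) / rel).read_bytes()).hexdigest()
        if old_hashes.get(rel) != digest:
            changed.append(rel)
    return changed
```

In this sketch, a file whose timestamp changed but whose content did not is filtered out in the second stage. The trade-off of any metadata-first check is that an edit which preserves both size and modification time would not become a candidate at all.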

When relevant file changes are detected during a backup, CodeGuard will send a ChangeAlert email with a prioritized list of the files that have changed.

Suspicious File Changes

At CodeGuard we have a list of Top Suspicious File Changes, which are reported by customers through our ChangeAlert email notifications. Below are the files that our customers have found most concerning recently. If you’re not sure which files to focus on in your ChangeAlert emails or the file changes reported by another system, these files are a good place to start.

A recent list of suspicious file reports.

As you can see, the index.php and .htaccess files are the most reported ones. Those files can change for legitimate reasons, but they are also very powerful and a frequent target of malicious actors, so you have to watch them closely. In general, if these files change and the change is not directly related to something you did or the result of a controlled update, then you should consider remediation. The same applies to any other files that appear to be related to plugins, themes or content management system modules. Those files should change only during a controlled automatic or manual update, or when an authorized individual edits them.

Remediation

Ultimately, if you see changes that were not made by you, by someone who manages your website, or by your host, either manually or via an automatic update of some kind, then you should take corrective action. For CodeGuard customers, that means using a backup to restore to a prior date. For those who don’t use CodeGuard, you can attempt to manually revert the change by comparing the file with one that is known to be good, or work with a developer to review the changed file.
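For the manual route, comparing a suspect file with a known-good copy can be sketched with Python’s standard difflib module. The function name diff_against_known_good is hypothetical, and the sketch assumes both files are text that can be read as UTF-8.

```python
import difflib
from pathlib import Path


def diff_against_known_good(known_good, suspect):
    """Return a unified diff of suspect against known_good ('' if identical)."""
    good_lines = Path(known_good).read_text(
        encoding="utf-8", errors="replace"
    ).splitlines()
    suspect_lines = Path(suspect).read_text(
        encoding="utf-8", errors="replace"
    ).splitlines()
    diff = difflib.unified_diff(
        good_lines,
        suspect_lines,
        fromfile=str(known_good),
        tofile=str(suspect),
        lineterm="",
    )
    return "\n".join(diff)
```

Lines that start with + in the output exist only in the suspect file, so they are the ones to review first. An empty result means the two files have the same lines.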

Regardless of how you remediate the file change, it’s good practice to immediately update all of the components on your website (plugins, themes, libraries, modules, etc.) and on any other websites in the same hosting account. If you believe the change was malicious, you should also consider changing the passwords for your hosting account, FTP/SFTP, and any applications on the host, such as content management systems.
