Monday, September 16, 2019

Understanding and transitioning to ClamAV's new On-Access scanner

We have a new On-Access scanner for ClamAV that separates functionality from clamd into a new application called clamonacc.

This post is for technically inclined users who have used ClamAV’s On-Access scanner in the past (0.99 - 0.101.3), and wish to transition to a newer version (>= 0.102.0). While this overview may be somewhat useful for new On-Access users, we first recommend setting up your environment using the official documentation, then returning here only if your use case is not met.

This post is also for anyone who may simply be trying to install ClamAV from source, on an older system, with an older version of Curl. If that’s the case, skip ahead to the section titled The Breakdown for your fix.

Things That Haven’t Changed

With a change this big, it’s easier to start with what’s the same. Here’s a list of the important things:
  • Fanotify and inotify still required
  • Clamd still needs to be run and clamd.conf still used by default
  • All working “OnAccessXYZ” clamd.conf configuration options still valid and work as expected
  • Only Linux systems are supported

The New Stuff

Now for the real reason you’re here: what’s different and how that affects you. Well, let’s start with the differences, and then I’ll break down each item to help you gain a fuller understanding of the new system.
  • Curl (version >= 7.45) required for installation
  • VirusEvent and Extra Scanning features re-enabled
  • Client application called clamonacc which interfaces with a clamd server
  • Command-line options
  • Separate and cleaner logging
  • Configuration option for excluding users via username
  • Configurable multi-threaded event handling architecture
  • Configuration options which allow tweaks to network communication and error handling

The Breakdown

Curl (version >= 7.45) required for installation:

This is only relevant if you are installing from source, but it is worth noting. If your curl version is out of date, the installation will fail with an error message stating that you need a version of curl >= 7.45 when you run:

$ > ./configure

If your OS package maintainers do not provide a version of curl newer than 7.45, we recommend installing the latest version of curl (and its headers) from the source.

Alternatively, if you don’t need On-Access capabilities, you can skip installation on your system using the “./configure” flag “--disable-clamonacc”. If you are using a non-Linux system, installation of clamonacc will automatically be disabled.

VirusEvent and Extra Scanning features re-enabled:

Previous versions of the On-Access Scanner had disabled the VirusEvent and Extra Scanning features. The VirusEvent feature allowed users to kick-off a custom shell script whenever clamd found a malicious object. Extra Scanning was a feature tied to inotify which used its expanded and more mature event detection to fill the gaps left (at the time) by fanotify event coverage. With Extra Scanning enabled, users can catch "create" and "move to" events, which up until kernel version 5.1, were not available for capture with the fanotif api. Without Extra Scanning, On-Access scanning will capture all "access" and "open" events only.

Both VirusEvent and Extra Scanning features were disabled due to resource consumption issues when running the On-Access Scanner for long periods of time. However, the new On-Access Scanner has been re-architected with a long-running use-case at the forefront. As a result, it is more reliable, error tolerant, and much, much better at cleaning up after itself. All of this allows us to re-enable the Extra Scanning feature with confidence.

Similarly, due to the new separation between clamd and clamonacc, VirusEvent scripts should now work as expected. This is not so much, “re-enabling the feature” as it is a direct (albeit planned and intended) result of this new separation. Like with clamdscan, VirusEvent will be kicked off by the clamd process, not the new clamonacc application.

Client application called clamonacc which interfaces with a clamd server:

The biggest change to On-Access Scanning is its separation from the clamd server application. With this separation comes more flexibility in deployment options, better stability and up-time for both applications, and a much improved potential attack surface.

Regarding flexibility, the application can be run on the same machine as a clamd instance, or for resource sensitive deployments clamonacc can “phone home” to a central clamd instance. Even better, multiple clamonacc instances on multiple systems can all receive verdicts from a single, centrally located clamd instance. This offloads verdicts to a single location, and scanning/protection tasking to a much lighter-weight application.

However, while such a deployment is possible, it requires streaming over a TCP socket connection, which comes with a number of drawbacks. First, this version of ClamAV requires users to secure their own TCP sockets. We are moving to change this in the future, (the new curl requirement is a step in that direction) but it’s still important to note. Second, the version of clamonacc (and clamd) released with 0.102.0 is not optimized for sending files and receiving verdicts via a network stream. While there are plans to alleviate this, expect full file contents to be sent across the configured socket each time clamonacc requires a clamd verdict. This will obviously have a network impact on a distributed deployment. Third, and finally, caching still needs to be implemented on the clamonacc client side to reduce the number of overall network scan requests.

All that said, smart network engineering, and a targeted clamonacc configuration which only watches necessary files/directory and excludes the right UIDs and/or unames might let you mitigate or overcome these hurdles quite nicely.

Another benefit to this separation is increased stability for both clamd and clamonacc. During our testing, clamonacc was able to recover gracefully from just about every issue that arose--whether anticipated or not--while still providing necessary protections. Similarly during the course of development and testing, clamd was not affected by any clamonacc failure. That said, this does not mean that clamonacc cannot affect clamd at all, or vice-versa. These applications do not exist in a vaccuum and must necessarily interact with one another during normal operation.

With that in mind, one major goal of this rework was improving clamd’s security posture. In versions prior to 0.102, On-Access Scanning was tied directly into clamd, and thus required users to run clamd with elevated privileges (often root). This came with a host of security concerns given the size of clamd's attack surface. By separating clamonacc from clamd, a system admin need only ensure clamd has the read and access permissions necessary to deal with any file descriptors clamonacc may pass along. Of course, clamonacc still requires elevated permissions due to the fanotify interfaces used, but compared to clamd, clamonacc's attack surface is much smaller.

Command-line options:

In order of appearance when you run clamonacc with “--help” these are the command line options and their uses:

   --help

As one would expect, prints the version number, a command line usage example, and a very abbreviated explanation of each available command line option, alongside their shorter forms.

    --version              

Attempts a connection to the clamd server and requests clamd’s version, such that a version mismatch between server and client might be identified. If a clamd server is not found, the local client version is printed to the console instead.

    --verbose

This is akin to clamd’s or clamscan’s --debug option, but isn’t quite so noisy as either of those. By default, clamonacc does not print any output after daemonizing, so you will have to pair this option with --log or --foreground to use it.

    --log=FILE

FILE should be a full path to the logfile you wish clamonacc to use. Without this option, clamonacc will not keep a log. With this option, clamonacc will only output some information to the console if --foreground is enabled. As of the release of 0.102, it is a known bug that clamonacc lacks log rotation.

    --foreground

Forces clamonacc not to daemonize into the background and instead print output and verdicts to the console.

    --watch-list=FILE

This is the command line analogue to the “OnAccessIncludePath” configuration option. The file provided via FILE will be parsed at startup and all valid paths will have watch points placed on them. FILE must be a proper path, it must be a text file, and each path in the text file must be a full path to a valid directory. You must separate multiple paths in the text file with a newline. If you run clamonacc with --verbose, it will let you know if you got any of this wrong, but it will still startup, choosing to ignore invalid input instead of failing out.

    --exclude-list=FILE

This is the command line analogue for “OnAccessExcludePath”. Everything that holds true for --watch-list holds true for --exclude-list, except the end result is that the provided paths within the text file will not have watch points placed on them when clamonacc starts up.

    --remove

Works the same way as clamdscan’s --remove option. In the event that a file is found to be malicious, clamonacc will make a best attempt at removal.

    --move=DIRECTORY
    --copy=DIRECTORY

Works as you'd expect, each also sharing clamdscan’s core functionality. If clamd returns with a malicious, the clamonacc process will either move or copy it into the given path. These three options are mutually exclusive.

    --config-file=FILE

When loading configuration options, clamonacc checks for clamd.conf in ClamAV’s default install location. You can force clamonacc to use a configuration file in a location of your choice by using this option instead. This option is especially useful if you have broken up clamd and clamonacc configuration options into their own separate files.

    --allmatch

Every time a scan request is made, clamonacc will tell the clamd server to run in all-match mode when rendering verdicts.

    --fdpass

This is a niche option with an unclear usecase, but we preserved in case older clamdscan users may know of a specific usecase we do not. Generally, if you are running clamd on the same system as clamonacc, you will be using a local unix socket and file descriptor passing is enabled by default. One theoretical (untested) use, is passing file descriptors along a socket between containers or between a container and the host.

    --stream 

Typically, the only time you would use this option is when you could otherwise pass file descriptors instead. Even if clamonacc and clamd were optimized for streaming, file descriptor passing would be the better, and faster method. It’s only use (besides debugging), is avoiding permission issues that arise when passing file descriptors to clamd.

Separate and cleaner logging:

On-Access Scanning no longer uses the same log file as clamd. To make clamonacc print its output to a logfile, run clamonacc with the command “--log=FILE” where “FILE” is the name you wish to give the log file. Without this command, by default, clamonacc will fork into the background without printing any output. Regardless of whether a log file has been specified, Clamonacc will still protect your system according to any configurations made and all command line options passed. And no matter the logging situation, all VirusEvents will trigger from clamd as expected.

If you do choose to enable logging, know that On-Access logging has been cleaned up considerably in the move from 0.101 to 0.102. After startup, you will see only verdicts for malicious files and errors in their log. That’s it.

If the “--verbose” command is supplied at startup, significantly more output will be available to you. This information is primarily useful for troubleshooting purposes and developers. Therefore, only consider using it if you run into a recurring problem during application runtime.

Configuration option for excluding users via username:

A feature included on user request, this allows simple exclusion of any user and more flexible permission management. The option to use this feature is called “OnAccesExcludeUname” and you can use it as many times as you’d like.

Another exclusion useful option that existed in 0.101 and continues to exist in 0.102, but may seem out of place to some users, is “OnAccessExcludeRootUID”, which is a boolean option that--as it says on the box--will exclude all events triggered by a processes under the root UID “0” from being scanned. This option was added strictly as a workaround to an option parsing limitation, which entirely disabled the “OnAccessExcludeUID” option when set to “0”.

Configurable multi-threaded event handling architecture:

Clamonacc has been re-architectured to follow a multi-supplier, single-consumer queue model for event processing. It accomplishes this by keeping an active thread pool to handle verdict receipts, which is managed by a thread that kicks off work for the pool whenever new entries are added to the event queue it maintains. Currently, that event queue is set up to be fed and grown with distilled information from fanotify, and inotify event monitoring threads, but in theory, the event queue could very easily be fed from other sources down the road--should the need or desire arise.

The clamonacc will startup with five worker threads available to consume events from the queue. However, if your system has the resources for it, you can drastically improve the performance of clamonacc by raising that number with the “OnAccessMaxThreads” options. If you do this, you will likely also want to increase values on “MaxThreads” and “MaxQueue” as well to ensure your clamd instance can keep up.

Configuration options which allow tweaks to network communication and error handling:

With the separation came increased inter-process complexity. And with that complexity arose more potential error cases. Of particular note are the new configuration options surrounding network communications between clamd and clamonacc applications. Two options are provided for tweaking network communication behavior to better suit your environment:
  • OnAccessCurlTimeout
  • OnAccessRetryAttempts
By default, each connection attempt made by clamonacc will timeout after five seconds and will not attempt to reconnect. In case of connection failure or timeout due to known, intermittent network constraints, you may force clamonacc to reattempt the connection by setting the OnAccessRetryAttempts to the number of retries you’d like clamonacc to make before giving up and reporting an error.

Users experienced with the prevention will now be wondering what happens in such a case? Will the file remain locked? Will clamonacc release its access hold automatically in case of failure?

Clamonacc is configured to allow all access attempts if an error occurs while prevention is enabled. However, you can change this behavior by enabling the “OnAccessDenyOnError” configuration option. When this option is enabled alongside “OnAccessPrevention”, clamonacc will deny process access to a file if any error is encountered during the scanning process.

As you can imagine, this is potentially a very dangerous setting and must be used with care to avoid locking your system out of important resources due to something so mundane as a clamd permission issue, or a brief network outage.

Wrap Up

That’s the bulk of it. A lot has changed from a technical standpoint, and while the amount of information shared above might seem overwhelming at first glance, from an operational standpoint there isn’t too much more you need to worry about. Be mindful of your UID/uname excludes, make sure clamd has the right permissions, lock down your TCP ports, be aware of your resource limitations and the knobs you’ve been given to tweak software performance, and you should have your deployment up in no time.

Finally, as I said before, if there’s anything that changed which I didn’t go over above, please leave a comment below so I can address your concern.

Happy clamming.