Tuesday, December 3, 2019

By Micah Snyder.

Today I'm very excited, and a little bit nervous, to unveil Mussels. Mussels is a cross-platform, general-purpose dependency build automation tool. You might compare it with Vcpkg, Conan, or Buildout. It serves a similar purpose, but the approach is a little different.


How to get Mussels

The Mussels project is hosted on GitHub under Cisco-Talos/Mussels.

By the time this blog post is published, you should be able to install Mussels from PyPI using Pip. You may also clone the Mussels Git repository and use Pip to install it locally.

Install Mussels from PyPI:

python3 -m pip install --user mussels

Install Mussels from a Git clone:

python3 -m pip install --user .

The origin of Mussels

Mussels is something I've been crafting as a hobby project in support of ClamAVⓇ. The need for a dependency management tool became obvious as we were actively engaged in upgrading ClamAV's external dependencies, specifically for Windows builds. Historically, ClamAV maintained a collection of third-party code that was copy-pasted into our own repository with custom Visual Studio project files created to build these libraries. This approach worked and made it simple enough to compile the project.

Maintenance, however, was a bit of a nightmare, so we decided to separate these dependencies out. This meant we'd have new things to build as dependencies of ClamAV, and things only got worse the moment we decided to add libcurl as a hard requirement in support of HTTP 1.1/2.0 and TLS/SSL.

All of a sudden, we went from ClamAV requiring just OpenSSL to be built separately from ClamAV, to having to build:
  • zlib
  • bzip2
  • pthread-w32
  • libjson-c
  • OpenSSL (depends on zlib)
  • libxml2 (depends on zlib)
  • libpcre2 (depends on bzip2, zlib)
  • libssh2 (depends on OpenSSL, zlib)
  • NGHTTP2 (depends on libxml2, OpenSSL, zlib)
  • libcurl (depends on OpenSSL, NGHTTP2, libssh2, zlib)
It clearly isn't feasible to build all of this without some sort of automation. Most of these libraries are actively maintained projects that see new releases every couple of months. And it certainly isn't something we wanted to maintain source code copies of inside the ClamAV source repository.
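
To make the ordering problem concrete, here is a minimal Python sketch (not part of Mussels itself) that computes one valid build order for the list above. The graph is copied from the list; the function name build_order is invented for this illustration, and the standard-library graphlib module requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter

# Dependency graph copied from the list above.
# Each key depends on every name in its set.
DEPENDENCIES = {
    "zlib": set(),
    "bzip2": set(),
    "pthread-w32": set(),
    "libjson-c": set(),
    "OpenSSL": {"zlib"},
    "libxml2": {"zlib"},
    "libpcre2": {"bzip2", "zlib"},
    "libssh2": {"OpenSSL", "zlib"},
    "NGHTTP2": {"libxml2", "OpenSSL", "zlib"},
    "libcurl": {"OpenSSL", "NGHTTP2", "libssh2", "zlib"},
}


def build_order(graph):
    """Return one order in which every library comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


# Print the libraries in an order that is safe to build them in.
for name in build_order(DEPENDENCIES):
    print(name)
```

Several orders are valid; the only guarantee is that a library never appears before something it depends on.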

Mussels started out as something really simple that was largely focused on building dependencies for ClamAV on Windows. There were a couple of occasions, however, when I needed to build some new combination of libraries. I quickly threw together a few Mussels recipes to make it happen. With Mussels, things just worked. I'd type mussels build clamav_deps and it was off to the races while I'd wander away to grab some tea and marvel at how easy it all was. That was when I realized I really wanted to productize this Mussels-thing and make it available for the general public. I've spent as much time as I could afford to make Mussels ready for public consumption once I received approval to make it open-source.

How Mussels works

Mussels is intended to simplify the process of building complex applications that have lengthy dependency chains without having to write all new CMake, Meson, Bazel, Xcode, or Visual Studio project files. Instead, you write (and share) simple recipes that leverage the original build systems intended by the authors of your external library dependencies.

Recipes are YAML files that detail how to build a given library or application. A recipe defines where to get the software source archive, what other recipes the software depends on for the build, what tools are required to build the software, and of course what commands to run to perform the build.

A simple example recipe:

name: zlib
version: "1.2.11"
url: https://www.zlib.net/zlib-1.2.11.tar.gz
mussels_version: "0.1"
type: recipe
platforms:
  Linux:
    host:
      build_script:
        configure: |
          cmake . -DCMAKE_INSTALL_PREFIX="{install}/{target}"
        make: |
          cmake --build . --config Release
        install: |
          make install
      dependencies: []
      install_paths:
        license/zlib:
          - README
      required_tools:
        - cmake
        - make
        - gcc
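
The {install} and {target} placeholders in the configure step above are filled in before the command runs. The following Python sketch shows one way such an expansion could work. It is an assumption for illustration only, not the Mussels implementation: the function name expand and the example directory are hypothetical.

```python
def expand(script, variables):
    """Replace each {name} placeholder that has a known value; leave the rest untouched."""
    for name, value in variables.items():
        script = script.replace("{" + name + "}", value)
    return script


# The configure line from the recipe above.
configure = 'cmake . -DCMAKE_INSTALL_PREFIX="{install}/{target}"'

# Hypothetical values chosen only for this example.
print(expand(configure, {"install": "/tmp/example/install", "target": "host"}))
```

Plain string replacement is used here instead of str.format so that any other braces in a build script are left alone.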


Like recipes, tools may be defined in YAML as well. A tool definition describes how to identify whether a given build tool (like GCC, CMake, Python, Java, etc.) exists on the current machine.

A simple tool definition:

name: gcc
version: ""
mussels_version: "0.1"
type: tool
platforms:
  Posix:
    path_checks:
      - gcc
    command_checks:
      - command: "gcc --version"
        output_has: "gcc"
    file_checks:
      - /usr/local/bin/gcc
      - /usr/bin/gcc
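
The three kinds of checks in the tool definition above can be sketched in Python roughly as follows. This is a hedged illustration, not the Mussels source: the function name tool_available is invented, and treating the tool as found when any single check succeeds is an assumption.

```python
import os
import shlex
import shutil
import subprocess


def tool_available(path_checks=(), command_checks=(), file_checks=()):
    """Return True if any check finds the tool (assumed "any match" policy)."""
    # path_checks: program names looked up on the PATH.
    if any(shutil.which(name) for name in path_checks):
        return True

    # command_checks: run a command and look for a substring in its output.
    for check in command_checks:
        try:
            result = subprocess.run(
                shlex.split(check["command"]),
                capture_output=True,
                text=True,
                timeout=10,
            )
        except (OSError, subprocess.TimeoutExpired):
            continue  # command missing or hung: this check fails
        if check["output_has"] in result.stdout + result.stderr:
            return True

    # file_checks: absolute paths that must exist on disk.
    return any(os.path.exists(path) for path in file_checks)
```

With the gcc definition above, the call would look like tool_available(path_checks=["gcc"], command_checks=[{"command": "gcc --version", "output_has": "gcc"}], file_checks=["/usr/local/bin/gcc", "/usr/bin/gcc"]).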


Recipes and tools may be shared in Git repositories that we call "cookbooks."

Cookbooks can either be public Git repositories, private Git repositories, or simply a local directory containing recipe and tool YAML files. It's really easy to create a new recipe or tool by cloning existing recipes and tools to your local directory and then customizing them to suit your project's needs.

Mussels works on macOS, Linux/Unix, and Windows operating systems. Though it was originally written in support of building C-based application libraries, it's flexible and can be extended to build and assemble any software package.

Running a build with Mussels is as simple as this:
  1. Identify the recipe you wish to build using:
    mussels update
    mussels list -a
  2. Verify that the recipe will suit your needs with:
    mussels recipe show <recipe-name> -V
  3. Clone the recipe to your current directory:
    mussels recipe clone <recipe-name>

    Or, choose to trust the cookbook which provides the recipe:
    mussels cookbook trust <cookbook-name>
  4. Do a dry run to see what will be built:
    mussels build <recipe-name> --dry-run
    If you are missing tools required for the build, Mussels will tell you what you will need.
  5. Run the build!
    mussels build <recipe-name>
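
If you would rather script the steps above, a small Python wrapper could look like the sketch below. The mussels commands are exactly the ones listed; the function names workflow_commands and run_workflow are invented for this example, and the sketch uses the clone route from step 3 and not the cookbook trust alternative. It assumes Python 3.9 or newer.

```python
import subprocess


def workflow_commands(recipe):
    """Return the Mussels commands from the steps above, in order."""
    return [
        ["mussels", "update"],
        ["mussels", "list", "-a"],
        ["mussels", "recipe", "show", recipe, "-V"],
        ["mussels", "recipe", "clone", recipe],
        ["mussels", "build", recipe, "--dry-run"],
        ["mussels", "build", recipe],
    ]


def run_workflow(recipe):
    """Run each step, stopping at the first command that fails."""
    for command in workflow_commands(recipe):
        print("$", " ".join(command))
        subprocess.run(command, check=True)


# Example (requires Mussels to be installed):
# run_workflow("clamav_deps")
```
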
Hop over to the Mussels README page on GitHub to learn more.
 
You can also join us on Discord.

I hope you will give Mussels a try, and I hope you'll find a way to use Mussels to make development on your project a little bit easier.

For those who have been interested in building the latest version of ClamAV on Windows from source without having to build and assemble all of the dependencies by hand, we've also added the ClamAV Mussels cookbook to the Mussels bookshelf, making our own recipes publicly available for anyone to use or copy.

Friday, November 22, 2019

This serves as notice that we are planning on publishing a new main.cvd and a cdiff Monday, November 25, 2019.

In the past we notified our mirror maintainers to let them know it was going to be a hit on their bandwidth, but now that we have a CDN, the effect should be minimal.  However, we still wanted to give our end users a heads up just in case any questions come up around why ClamAV is taking a bit longer to reload that particular update.

After that update, restarts and reloads should happen much faster, and the daily downloads will again shrink.

Wednesday, November 20, 2019

Today we are publishing two patch versions, 0.102.1 and 0.101.5.  Both of these can be found on ClamAV's downloads page, with 0.102.1 as the main release and 0.101.5 under "Previous Stable Releases."

0.102.1

ClamAV 0.102.1 is a security patch release to address the following issues.


  • Fix for the following vulnerability affecting 0.102.0 and 0.101.4 and prior:
    • CVE-2019-15961:
      • A Denial-of-Service (DoS) vulnerability may occur when scanning a specially crafted email file as a result of excessively long scan times. The issue is resolved by implementing several maximums in parsing MIME messages and by optimizing use of memory allocation. Reported by Joran Dirk Greef, Ronomon, Cape Town
  • Build system fixes to build clamav-milter, to correctly link with libxml2 when detected, and to correctly detect fanotify for on-access scanning feature support.
  • Signature load time is significantly reduced by changing to a more efficient algorithm for loading signature patterns and allocating the AC trie. Patch courtesy of Alberto Wu.
  • Introduced a new configure option to statically link libjson-c with libclamav. Static linking with libjson is highly recommended to prevent crashes in applications that use libclamav alongside another JSON parsing library.
  • Null-dereference fix in email parser when using the --gen-json metadata option.
  • Fixes for Authenticode parsing and certificate signature (.crb database) bugs.


Special thanks to the following for code contributions and bug reports:

- Alberto Wu
- Joran Dirk Greef
- Reio Remma

0.101.5

ClamAV 0.101.5 is a security patch release that addresses the following issues.


  • Fix for the following vulnerability affecting 0.102.0 and 0.101.4 and prior:
    • CVE-2019-15961:
      • A Denial-of-Service (DoS) vulnerability may occur when scanning a specially crafted email file as a result of excessively long scan times. The issue is resolved by implementing several maximums in parsing MIME messages and by optimizing use of memory allocation.
  • Added the zip scanning improvements found in v0.102.0 where it scans files using zip records from a sorted catalogue which provides deduplication of file records resulting in faster extraction and scan time and reducing the likelihood of alerting on non-malicious duplicate file entries as overlapping files.
  • Signature load time is significantly reduced by changing to a more efficient algorithm for loading signature patterns and allocating the AC trie. Patch courtesy of Alberto Wu.
  • Introduced a new configure option to statically link libjson-c with libclamav. Static linking with libjson is highly recommended to prevent crashes in applications that use libclamav alongside another JSON parsing library.
  • Null-dereference fix in email parser when using the --gen-json metadata option.


Special thanks to the following for code contributions and bug reports:

- Alberto Wu
- Joran Dirk Greef

Please join us on the ClamAV mailing lists for further discussion!  Thanks!

Wednesday, October 2, 2019

Today we are excited to release ClamAV 0.102.0.

Users who tested the 0.102.0 release candidate may note that the 0.102.0 release includes a handful of minor bug fixes and improvements over the release candidate. These include:
  • Improved zlib and iconv detection when running ./configure.
  • Fixed detection of the libcurl version and c-ares dependency required for the LocalIP freshclam config option.
  • Fixed bug in file copy routine that caused a failure when attempting to update freshclam using a DatabaseCustomURL with "file://".
  • Added ./configure --enable-libclamav-only option, for those wishing to bypass building of libfreshclam and the ClamAV CLI applications. This option also bypasses the libcurl dependency requirement.
Release materials for ClamAV 0.102.0 can be found on ClamAV's downloads site.

Release Notes

ClamAV 0.102.0 includes an assortment of improvements and a few significant changes.

Major changes

  • The On-Access Scanning feature has been migrated out of clamd and into a brand new utility named clamonacc. This utility is similar to clamdscan and clamav-milter in that it acts as a client to clamd. This separation from clamd means that clamd no longer needs to run with root privileges while scanning potentially malicious files. Instead, clamd may drop privileges to run under an account that does not have super-user privileges. In addition to improving the security posture of running clamd with On-Access enabled, this update fixed a few outstanding defects:
    • On-Access scanning for created and moved files (Extra-Scanning) is fixed.
    • VirusEvent for On-Access scans is fixed.
    • With clamonacc, it is now possible to copy, move, or remove a file if the scan triggered an alert, just like with clamdscan.
    • For details on how to use the new clamonacc On-Access scanner, please refer to the user manual on ClamAV.net, and please read our blog post entitled "Understanding and transitioning to ClamAV's new On-Access scanner."
  • The freshclam database update utility has undergone a significant update. This includes:
    • Added support for HTTPS.
    • Support for database mirrors hosted on ports other than 80.
    • Removal of the mirror management feature (mirrors.dat).
    • An all new libfreshclam library API.

Notable changes

  • Added support for extracting ESTsoft .egg archives. This feature is new code developed from scratch using ESTsoft's Egg-archive specification and without referencing the UnEgg library provided by ESTsoft. This was necessary because the UnEgg library's license includes restrictions limiting the commercial use of the UnEgg library.
  • The documentation has moved:
    • Users should navigate to ClamAV.net to view the documentation online.
    • The documentation will continue to be provided in HTML format with each release for offline viewing in the docs/html directory.
    • The new home for the documentation markdown is in our ClamAV FAQ GitHub repository.
  • To remediate future denial of service conditions caused by excessive scan times, we introduced a scan time limit. The default value is two minutes (120,000 milliseconds).

    To customize the time limit:
    • use the clamscan --max-scantime option
    • use the clamd MaxScanTime config option
  • Libclamav users may customize the time limit using the cl_engine_set_num function. For example:

    cl_engine_set_num(engine, CL_ENGINE_MAX_SCANTIME, time_limit_milliseconds);
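
For the clamscan and clamd settings named above, the following Python sketch shows the same limit expressed both ways. The option names and the 120,000 millisecond default come from the notes above; the function name scantime_settings and the rejection of negative values are choices made for this illustration.

```python
def scantime_settings(limit_ms=120_000):
    """Express one scan time limit as a clamscan argument and a clamd.conf line."""
    if limit_ms < 0:
        raise ValueError("the time limit cannot be negative")
    return {
        # Command-line form for clamscan.
        "clamscan": ["clamscan", f"--max-scantime={limit_ms}"],
        # Config-file form for clamd.
        "clamd.conf": f"MaxScanTime {limit_ms}",
    }


# Five minutes, converted to milliseconds.
print(scantime_settings(5 * 60 * 1000))
```
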

Other improvements

  • Improved Windows executable Authenticode handling, enabling both whitelisting and blacklisting of files based on code-signing certificates. Additional improvements to Windows executable (PE file) parsing. Work courtesy of Andrew Williams.
  • Added support for creating bytecode signatures for Mach-O and ELF executable unpacking. Work courtesy of Jonas Zaddach.
  • Re-formatted the entire ClamAV code-base using clang-format in conjunction with our new ClamAV code style specification. See the clamav.net blog post for details.
  • Integrated ClamAV with Google's OSS-Fuzz automated fuzzing service with the help of Alex Gaynor. This work has already proven beneficial, enabling us to identify and fix subtle bugs in both legacy code and newly developed code.
  • The clamsubmit tool is now available on Windows.
  • The clamscan metadata feature (--gen-json) is now available on Windows.
  • Significantly reduced number of warnings generated when compiling ClamAV with "-Wall" and "-Wextra" compiler flags and made many subtle improvements to the consistency of variable types throughout the code.
  • Updated the majority of third-party dependencies for ClamAV on Windows. The source code for each has been removed from the clamav-devel repository. This means that these dependencies have to be compiled independently of ClamAV. The added build process complexity is offset by significantly reducing the difficulty of releasing ClamAV with newer versions of those dependencies.
  • During the 0.102 development period, we've also improved our Continuous Integration (CI) processes. Most recently, we added a CI pipeline definition to the ClamAV Git repository. This chains together our build and quality assurance test suites and enables automatic testing of all proposed changes to ClamAV, with customizable parameters to suit the testing needs of any given code change.
  • Added a new clamav-version.h generated header to provide version number macros in text and numerical format for ClamAV, libclamav, and libfreshclam.
  • Improved cross-platform buildability of libxml2. Work courtesy of Eneas U de Queiroz with supporting ideas pulled from the work of Jim Klimov.

Bug fixes

  • Fix to prevent a possible crash when loading LDB type signature databases and PCRE is not available. Patch courtesy of Tomasz Kojm.
  • Fixes to the PDF parser that will improve PDF malware detection efficacy. Patch courtesy of Clement Lecigne.
  • Fix for regular expression phishing signatures (PDB R-type signatures).
  • Various other bug fixes.

New Requirements

  • Libcurl has become a hard-dependency. Libcurl enables HTTPS support for freshclam and clamsubmit as well as communication between clamonacc and clamd.
  • Libcurl version >= 7.45 is required when building ClamAV from source with the new On-Access Scanning application (clamonacc). Users on Linux operating systems that package older versions of libcurl (e.g. all versions of CentOS and Debian versions <= 8) have a number of options:
    • Wait for your package maintainer to provide a newer version of libcurl.
    • Install a newer version of libcurl from source.
    • Disable installation of clamonacc and On-Access Scanning capabilities with the ./configure flag --disable-clamonacc.
  • Non-Linux users do not need to take any action, as they are unaffected by this new requirement.

Acknowledgements

The ClamAV team would like to thank the following individuals for their code submissions:
  • Alex Gaynor
  • Andrew Williams
  • Carlo Landmeter
  • Chips
  • Clement Lecigne
  • Eneas U de Queiroz
  • Jim Klimov
  • Joe Cooper
  • Jonas Zaddach
  • Markus Kolb
  • Orion Poplawski
  • Ørjan Malde
  • Paul Arthur
  • Rick Wang
  • Romain Chollet
  • Rosen Penev
  • Thomas Jarosch
  • Tomasz Kojm
  • Tuomo Soini

Monday, September 16, 2019

We have a new On-Access scanner for ClamAV that separates functionality from clamd into a new application called clamonacc.

This post is for technically inclined users who have used ClamAV’s On-Access scanner in the past (0.99 - 0.101.3), and wish to transition to a newer version (>= 0.102.0). While this overview may be somewhat useful for new On-Access users, we first recommend setting up your environment using the official documentation, then returning here only if your use case is not met.

This post is also for anyone who may simply be trying to install ClamAV from source, on an older system, with an older version of Curl. If that’s the case, skip ahead to the section titled The Breakdown for your fix.

Things That Haven’t Changed

With a change this big, it’s easier to start with what’s the same. Here’s a list of the important things:
  • Fanotify and inotify still required
  • Clamd still needs to be run and clamd.conf still used by default
  • All working “OnAccessXYZ” clamd.conf configuration options still valid and work as expected
  • Only Linux systems are supported

The New Stuff

Now for the real reason you’re here: what’s different and how that affects you. Well, let’s start with the differences, and then I’ll break down each item to help you gain a fuller understanding of the new system.
  • Curl (version >= 7.45) required for installation
  • VirusEvent and Extra Scanning features re-enabled
  • Client application called clamonacc which interfaces with a clamd server
  • Command-line options
  • Separate and cleaner logging
  • Configuration option for excluding users via username
  • Configurable multi-threaded event handling architecture
  • Configuration options which allow tweaks to network communication and error handling

The Breakdown

Curl (version >= 7.45) required for installation:

This is only relevant if you are installing from source, but it is worth noting. If your curl version is out of date, the installation will fail with an error message stating that you need a version of curl >= 7.45 when you run:

$ ./configure

If your OS package maintainers do not provide curl version 7.45 or newer, we recommend installing the latest version of curl (and its headers) from source.

Alternatively, if you don’t need On-Access capabilities, you can skip installation on your system using the “./configure” flag “--disable-clamonacc”. If you are using a non-Linux system, installation of clamonacc will automatically be disabled.
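
If you want to check ahead of time, a quick Python sketch like the one below compares a curl version string against the 7.45 minimum. Note the assumptions: it parses the first line printed by the curl command-line tool ("curl --version"), which may not match the libcurl headers that ./configure actually inspects, and the function names are invented for this example.

```python
import re

# Minimum (major, minor) version required, per the text above.
MINIMUM = (7, 45)


def parse_curl_version(text):
    """Extract (major, minor) from output such as 'curl 7.29.0 (x86_64-...)'."""
    match = re.search(r"curl (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no curl version found")
    return int(match.group(1)), int(match.group(2))


def curl_is_new_enough(text, minimum=MINIMUM):
    """Tuple comparison handles both the major and the minor number."""
    return parse_curl_version(text) >= minimum


# Example with live output (requires curl to be installed):
# import subprocess
# out = subprocess.run(["curl", "--version"], capture_output=True, text=True).stdout
# print(curl_is_new_enough(out))
```
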

VirusEvent and Extra Scanning features re-enabled:

Previous versions of the On-Access Scanner had disabled the VirusEvent and Extra Scanning features. The VirusEvent feature allowed users to kick off a custom shell script whenever clamd found a malicious object. Extra Scanning was a feature tied to inotify which used its expanded and more mature event detection to fill the gaps left (at the time) by fanotify event coverage. With Extra Scanning enabled, users can catch "create" and "move to" events, which, up until kernel version 5.1, were not available for capture with the fanotify API. Without Extra Scanning, On-Access scanning will capture "access" and "open" events only.

Both VirusEvent and Extra Scanning features were disabled due to resource consumption issues when running the On-Access Scanner for long periods of time. However, the new On-Access Scanner has been re-architected with a long-running use-case at the forefront. As a result, it is more reliable, error tolerant, and much, much better at cleaning up after itself. All of this allows us to re-enable the Extra Scanning feature with confidence.

Similarly, due to the new separation between clamd and clamonacc, VirusEvent scripts should now work as expected. This is not so much, “re-enabling the feature” as it is a direct (albeit planned and intended) result of this new separation. Like with clamdscan, VirusEvent will be kicked off by the clamd process, not the new clamonacc application.

Client application called clamonacc which interfaces with a clamd server:

The biggest change to On-Access Scanning is its separation from the clamd server application. With this separation comes more flexibility in deployment options, better stability and up-time for both applications, and a much smaller potential attack surface.

Regarding flexibility, the application can be run on the same machine as a clamd instance, or for resource sensitive deployments clamonacc can “phone home” to a central clamd instance. Even better, multiple clamonacc instances on multiple systems can all receive verdicts from a single, centrally located clamd instance. This offloads verdicts to a single location, and scanning/protection tasking to a much lighter-weight application.

However, while such a deployment is possible, it requires streaming over a TCP socket connection, which comes with a number of drawbacks. First, this version of ClamAV requires users to secure their own TCP sockets. We are moving to change this in the future, (the new curl requirement is a step in that direction) but it’s still important to note. Second, the version of clamonacc (and clamd) released with 0.102.0 is not optimized for sending files and receiving verdicts via a network stream. While there are plans to alleviate this, expect full file contents to be sent across the configured socket each time clamonacc requires a clamd verdict. This will obviously have a network impact on a distributed deployment. Third, and finally, caching still needs to be implemented on the clamonacc client side to reduce the number of overall network scan requests.

All that said, smart network engineering and a targeted clamonacc configuration which only watches necessary files/directories and excludes the right UIDs and/or unames might let you mitigate or overcome these hurdles quite nicely.

Another benefit to this separation is increased stability for both clamd and clamonacc. During our testing, clamonacc was able to recover gracefully from just about every issue that arose--whether anticipated or not--while still providing necessary protections. Similarly, during the course of development and testing, clamd was not affected by any clamonacc failure. That said, this does not mean that clamonacc cannot affect clamd at all, or vice-versa. These applications do not exist in a vacuum and must necessarily interact with one another during normal operation.

With that in mind, one major goal of this rework was improving clamd’s security posture. In versions prior to 0.102, On-Access Scanning was tied directly into clamd, and thus required users to run clamd with elevated privileges (often root). This came with a host of security concerns given the size of clamd's attack surface. By separating clamonacc from clamd, a system admin need only ensure clamd has the read and access permissions necessary to deal with any file descriptors clamonacc may pass along. Of course, clamonacc still requires elevated permissions due to the fanotify interfaces used, but compared to clamd, clamonacc's attack surface is much smaller.

Command-line options:

In order of appearance when you run clamonacc with “--help” these are the command line options and their uses:

   --help

As one would expect, prints the version number, a command line usage example, and a very abbreviated explanation of each available command line option, alongside their shorter forms.

    --version              

Attempts a connection to the clamd server and requests clamd’s version, such that a version mismatch between server and client might be identified. If a clamd server is not found, the local client version is printed to the console instead.

    --verbose

This is akin to clamd’s or clamscan’s --debug option, but isn’t quite so noisy as either of those. By default, clamonacc does not print any output after daemonizing, so you will have to pair this option with --log or --foreground to use it.

    --log=FILE

FILE should be a full path to the logfile you wish clamonacc to use. Without this option, clamonacc will not keep a log. With this option, clamonacc will only output some information to the console if --foreground is enabled. As of the release of 0.102, it is a known bug that clamonacc lacks log rotation.

    --foreground

Forces clamonacc not to daemonize into the background and instead print output and verdicts to the console.

    --watch-list=FILE

This is the command line analogue to the “OnAccessIncludePath” configuration option. The file provided via FILE will be parsed at startup and all valid paths will have watch points placed on them. FILE must be a proper path to a text file, and each path in the text file must be a full path to a valid directory. You must separate multiple paths in the text file with a newline. If you run clamonacc with --verbose, it will let you know if you got any of this wrong, but it will still start up, choosing to ignore invalid input instead of failing out.
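
To check a watch-list file before handing it to clamonacc, you could use something like this Python sketch. It applies the rules described above (one path per line, each a full path to an existing directory) and, like clamonacc, sets invalid entries aside without failing. The function name read_watch_list is invented here, and clamonacc's own parsing may differ in details such as whitespace handling.

```python
import os


def read_watch_list(list_path):
    """Split the file's lines into (valid, ignored) lists of paths."""
    valid, ignored = [], []
    with open(list_path, encoding="utf-8") as handle:
        for line in handle:
            path = line.strip()
            if not path:
                continue  # skip blank lines
            # A usable entry is a full (absolute) path to an existing directory.
            if os.path.isabs(path) and os.path.isdir(path):
                valid.append(path)
            else:
                ignored.append(path)
    return valid, ignored
```

Printing the ignored list gives roughly the kind of feedback --verbose provides at startup.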

    --exclude-list=FILE

This is the command line analogue for “OnAccessExcludePath”. Everything that holds true for --watch-list holds true for --exclude-list, except the end result is that the provided paths within the text file will not have watch points placed on them when clamonacc starts up.

    --remove

Works the same way as clamdscan’s --remove option. In the event that a file is found to be malicious, clamonacc will make a best attempt at removal.

    --move=DIRECTORY
    --copy=DIRECTORY

Work as you'd expect, each also sharing clamdscan’s core functionality. If clamd returns a malicious verdict, the clamonacc process will either move or copy the file into the given directory. The --remove, --move, and --copy options are mutually exclusive.

    --config-file=FILE

When loading configuration options, clamonacc checks for clamd.conf in ClamAV’s default install location. You can force clamonacc to use a configuration file in a location of your choice by using this option instead. This option is especially useful if you have broken up clamd and clamonacc configuration options into their own separate files.

    --allmatch

Every time a scan request is made, clamonacc will tell the clamd server to run in all-match mode when rendering verdicts.

    --fdpass

This is a niche option with an unclear use case, but we preserved it in case older clamdscan users know of a specific use case we do not. Generally, if you are running clamd on the same system as clamonacc, you will be using a local Unix socket and file descriptor passing is enabled by default. One theoretical (untested) use is passing file descriptors along a socket between containers or between a container and the host.

    --stream 

Typically, the only time you would use this option is when you could otherwise pass file descriptors instead. Even if clamonacc and clamd were optimized for streaming, file descriptor passing would be the better and faster method. Its only use (besides debugging) is avoiding permission issues that arise when passing file descriptors to clamd.

Separate and cleaner logging:

On-Access Scanning no longer uses the same log file as clamd. To make clamonacc print its output to a logfile, run clamonacc with the option “--log=FILE” where “FILE” is the name you wish to give the log file. Without this option, by default, clamonacc will fork into the background without printing any output. Regardless of whether a log file has been specified, clamonacc will still protect your system according to any configurations made and all command line options passed. And no matter the logging situation, all VirusEvents will trigger from clamd as expected.

If you do choose to enable logging, know that On-Access logging has been cleaned up considerably in the move from 0.101 to 0.102. After startup, you will see only verdicts for malicious files and errors in the log. That’s it.

If the “--verbose” option is supplied at startup, significantly more output will be available to you. This information is primarily useful for troubleshooting and for developers. Therefore, only consider using it if you run into a recurring problem during application runtime.

Configuration option for excluding users via username:

A feature included on user request, this allows simple exclusion of any user and more flexible permission management. The option to use this feature is called “OnAccessExcludeUname” and you can use it as many times as you’d like.

Another useful exclusion option that existed in 0.101 and continues to exist in 0.102, but may seem out of place to some users, is “OnAccessExcludeRootUID”, which is a boolean option that--as it says on the box--will exclude all events triggered by processes under the root UID “0” from being scanned. This option was added strictly as a workaround to an option parsing limitation, which entirely disabled the “OnAccessExcludeUID” option when set to “0”.

Configurable multi-threaded event handling architecture:

Clamonacc has been re-architected to follow a multi-supplier, single-consumer queue model for event processing. It accomplishes this by keeping an active thread pool to handle verdict receipts, which is managed by a thread that kicks off work for the pool whenever new entries are added to the event queue it maintains. Currently, that event queue is set up to be fed and grown with distilled information from fanotify and inotify event monitoring threads, but in theory, the event queue could very easily be fed from other sources down the road--should the need or desire arise.

Clamonacc will start up with five worker threads available to consume events from the queue. However, if your system has the resources for it, you can drastically improve the performance of clamonacc by raising that number with the “OnAccessMaxThreads” option. If you do this, you will likely also want to increase the values of “MaxThreads” and “MaxQueue” to ensure your clamd instance can keep up.
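
As a rough mental model of this architecture, the Python sketch below has several source threads feed one shared queue while a pool of workers consumes from it. This is a simplified illustration, not clamonacc's code: the workers pull from the queue directly instead of being dispatched by a managing thread, and every name in it is invented. The default of five workers mirrors the default described above.

```python
import queue
import threading


def run_pipeline(sources, handle_event, workers=5):
    """Feed events from several sources into one queue drained by a worker pool."""
    events = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def produce(items):
        # Stand-in for an event monitoring thread (fanotify or inotify).
        for item in items:
            events.put(item)

    def consume():
        while True:
            event = events.get()
            if event is None:  # sentinel: no more work for this worker
                break
            verdict = handle_event(event)
            with results_lock:
                results.append((event, verdict))

    consumers = [threading.Thread(target=consume) for _ in range(workers)]
    producers = [threading.Thread(target=produce, args=(items,)) for items in sources]
    for thread in consumers + producers:
        thread.start()
    for thread in producers:
        thread.join()
    for _ in consumers:
        events.put(None)  # one sentinel per worker, queued after all real events
    for thread in consumers:
        thread.join()
    return results
```

Raising the workers argument plays the same role as raising OnAccessMaxThreads: more events can wait on verdicts at once, provided whatever answers them can keep up.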

Configuration options which allow tweaks to network communication and error handling:

With the separation came increased inter-process complexity. And with that complexity arose more potential error cases. Of particular note are the new configuration options surrounding network communications between clamd and clamonacc applications. Two options are provided for tweaking network communication behavior to better suit your environment:
  • OnAccessCurlTimeout
  • OnAccessRetryAttempts
By default, each connection attempt made by clamonacc will time out after five seconds and will not attempt to reconnect. In case of connection failure or timeout due to known, intermittent network constraints, you may force clamonacc to reattempt the connection by setting OnAccessRetryAttempts to the number of retries you’d like clamonacc to make before giving up and reporting an error.
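
The interaction between the two options can be sketched in Python like this. The defaults (a five second timeout, zero retries) come from the paragraph above; the function name scan_with_retries and the request_scan callable are hypothetical stand-ins for the real network call, and the exception types are simply Python's built-ins.

```python
def scan_with_retries(request_scan, timeout_seconds=5, retry_attempts=0):
    """Try once, then retry up to retry_attempts more times on network failure."""
    if retry_attempts < 0:
        raise ValueError("retry_attempts cannot be negative")
    last_error = None
    for _attempt in range(1 + retry_attempts):
        try:
            # request_scan is expected to honor the timeout it is given.
            return request_scan(timeout_seconds)
        except (ConnectionError, TimeoutError) as error:
            last_error = error  # remember the failure and try again
    # Every attempt failed: report the most recent error.
    raise last_error
```

With the default of zero retries, the first failure is reported immediately, which matches the behavior described above.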

Users experienced with prevention mode may now be wondering what happens in such a case. Will the file remain locked? Will clamonacc release its access hold automatically in case of failure?

Clamonacc is configured to allow all access attempts if an error occurs while prevention is enabled. However, you can change this behavior by enabling the “OnAccessDenyOnError” configuration option. When this option is enabled alongside “OnAccessPrevention”, clamonacc will deny process access to a file if any error is encountered during the scanning process.

As you can imagine, this is potentially a very dangerous setting and must be used with care to avoid locking your system out of important resources due to something as mundane as a clamd permission issue or a brief network outage.
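Put as a decision table, the behavior looks roughly like the Python sketch below. The function name `allow_access` and the outcome strings are hypothetical and exist only for illustration; the sketch also assumes that, without prevention enabled, access is never blocked.

```python
def allow_access(scan_outcome, prevention=True, deny_on_error=False):
    """Decide whether a process may access a file.

    scan_outcome is one of "clean", "infected", or "error" (hypothetical
    labels). prevention and deny_on_error stand in for the OnAccessPrevention
    and OnAccessDenyOnError settings.
    """
    if not prevention:
        return True  # assumption: without prevention, access is never blocked
    if scan_outcome == "clean":
        return True
    if scan_outcome == "infected":
        return False
    if scan_outcome == "error":
        # Default: fail open. With deny-on-error enabled: fail closed.
        return not deny_on_error
    raise ValueError("unknown scan outcome: %r" % (scan_outcome,))
```

The last branch is the one to think hard about: failing closed is safer against malware, but any clamd hiccup then blocks file access.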

Wrap Up

That’s the bulk of it. A lot has changed from a technical standpoint, and while the amount of information shared above might seem overwhelming at first glance, from an operational standpoint there isn’t too much more you need to worry about. Be mindful of your UID/uname excludes, make sure clamd has the right permissions, lock down your TCP ports, be aware of your resource limitations and the knobs you’ve been given to tweak software performance, and you should have your deployment up in no time.

Finally, as I said before, if there’s anything that changed which I didn’t go over above, please leave a comment below so I can address your concern.

Happy clamming.

*This article was accidentally withdrawn and is being re-published so it is available for historical reference.

Today we are publishing the release candidate for ClamAV 0.102.0 (clamav-0.102.0-rc).

There have been some bug fixes and minor improvements since the 0.102.0 beta.  We do not expect any additional changes to be necessary before publishing the 0.102.0 stable release.

Please take this opportunity to validate that the 0.102.0 release candidate works for your application and that there are no major issues blocking your upgrade to 0.102.0.

Release materials for 0.102.0-rc can be found on ClamAV's downloads site.
 

Release Notes

ClamAV 0.102.0 includes an assortment of improvements and a couple of significant changes.

Major changes

  • The On-Access Scanning feature has been migrated out of clamd and into a brand new utility named clamonacc. This utility is similar to clamdscan and clamav-milter in that it acts as a client to clamd. This separation from clamd means that clamd no longer needs to run with root privileges while scanning potentially malicious files. Instead, clamd may drop privileges to run under an account that does not have super-user privileges. In addition to improving the security posture of running clamd with On-Access enabled, this update fixed a few outstanding defects:
    • On-Access scanning for created and moved files (Extra-Scanning) is fixed.
    • VirusEvent for On-Access scans is fixed.
    • With clamonacc, it is now possible to copy, move, or remove a file if the scan triggered an alert, just like with clamdscan. For details on how to use the new clamonacc On-Access scanner, please refer to the user manual on ClamAV.net, and keep an eye out for a new blog post on the topic.
  • The freshclam database update utility has undergone a significant update. This includes:
    • Added support for HTTPS.
    • Support for database mirrors hosted on ports other than 80.
    • Removal of the mirror management feature (mirrors.dat).
    • An all new libfreshclam library API.

Notable changes

  • Added support for extracting ESTsoft .egg archives. This feature is new code developed from scratch using ESTsoft's Egg-archive specification and without referencing the UnEgg library provided by ESTsoft. This was necessary because the UnEgg library's license includes restrictions limiting the commercial use of the UnEgg library.
  • The documentation has moved!
    • Users should navigate to ClamAV.net to view the documentation online.
    • The documentation will continue to be provided in HTML format with each release for offline viewing in the docs/html directory.
    • The new home for the documentation markdown is in our ClamAV FAQ Github repository.
  • To remediate future denial of service conditions caused by excessive scan times, we introduced a scan time limit. The default value is 2 minutes (120000 milliseconds).

    To customize the time limit:
    • use the clamscan --max-scantime option
    • use the clamd MaxScanTime config option
  • Libclamav users may customize the time limit using the cl_engine_set_num function. For example:

    cl_engine_set_num(engine, CL_ENGINE_MAX_SCANTIME, time_limit_milliseconds);
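For readers who want a feel for how a scan time limit behaves, here is a small Python sketch of a deadline check. It only illustrates the general idea of aborting work once a time budget is spent and makes no claim about how libclamav enforces the limit internally; `scan_with_time_limit`, `ScanTimeout`, and the `inspect` callback are hypothetical names.

```python
import time


class ScanTimeout(Exception):
    """Raised when the scan exceeds its time budget."""


def scan_with_time_limit(chunks, inspect, time_limit_ms=120000,
                         clock=time.monotonic):
    """Inspect chunks one by one, giving up once the time limit has passed.

    The default budget mirrors the 2 minute (120000 millisecond) default.
    Returns the number of chunks inspected.
    """
    deadline = clock() + time_limit_ms / 1000.0
    inspected = 0
    for chunk in chunks:
        if clock() >= deadline:
            raise ScanTimeout("time limit of %d ms exceeded" % time_limit_ms)
        inspect(chunk)
        inspected += 1
    return inspected
```

The `clock` parameter is there only so the deadline logic can be exercised without actually waiting two minutes.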

Other improvements

  • Improved Windows executable Authenticode handling, enabling both whitelisting and blacklisting of files based on code-signing certificates. Additional improvements to Windows executable (PE file) parsing. Work courtesy of Andrew Williams.
  • Added support for creating bytecode signatures for Mach-O and ELF executable unpacking. Work courtesy of Jonas Zaddach.
  • Re-formatted the entire ClamAV code-base using clang-format in conjunction with our new ClamAV code style specification. See the clamav.net blog post for details.
  • Integrated ClamAV with Google's OSS-Fuzz automated fuzzing service with the help of Alex Gaynor. This work has already proven beneficial, enabling us to identify and fix subtle bugs in both legacy code and newly developed code.
  • The clamsubmit tool is now available on Windows.
  • The clamscan metadata feature (--gen-json) is now available on Windows.
  • Significantly reduced the number of warnings generated when compiling ClamAV with "-Wall" and "-Wextra" compiler flags and made many subtle improvements to the consistency of variable types throughout the code.
  • Updated the majority of third-party dependencies for ClamAV on Windows. The source code for each has been removed from the clamav-devel repository. This means that these dependencies have to be compiled independently of ClamAV. The added build process complexity is offset by significantly reducing the difficulty of releasing ClamAV with newer versions of those dependencies.
  • During the 0.102 development period, we've also improved our Continuous Integration (CI) processes. Most recently, we added a CI pipeline definition to the ClamAV Git repository. This chains together our build and quality assurance test suites and enables automatic testing of all proposed changes to ClamAV, with customizable parameters to suit the testing needs of any given code change.
  • Added a new clamav-version.h generated header to provide version number macros in text and numerical format for ClamAV, libclamav, and libfreshclam.
  • Improved cross-platform buildability of libxml2. Work courtesy of Eneas U de Queiroz with supporting ideas pulled from the work of Jim Klimov.

Bug fixes

  • Fix to prevent a possible crash when loading LDB type signature databases and PCRE is not available. Patch courtesy of Tomasz Kojm.
  • Fixes to the PDF parser that will improve PDF malware detection efficacy. Patch courtesy of Clement Lecigne.
  • Fix for regular expression phishing signatures (PDB R-type signatures).
  • Various other bug fixes.

New Requirements

  • Libcurl has become a hard-dependency. Libcurl enables HTTPS support for freshclam and clamsubmit as well as communication between clamonacc and clamd.
  • Libcurl version >= 7.45 is required when building ClamAV from source with the new On-Access Scanning application (clamonacc). Users on Linux operating systems that package older versions of libcurl (e.g. all versions of CentOS and Debian versions <= 8) have a number of options:
    • Wait for your package maintainer to provide a newer version of libcurl.
    • Install a newer version of libcurl from source.
    • Disable installation of clamonacc and On-Access Scanning capabilities with the ./configure flag --disable-clamonacc.
  • Non-Linux users do not need to take any action, as they are unaffected by this new requirement.

Acknowledgements

The ClamAV team thanks the following individuals for their code submissions:
  • Alex Gaynor
  • Andrew Williams
  • Carlo Landmeter
  • Chips
  • Clement Lecigne
  • Eneas U de Queiroz
  • Jim Klimov
  • Joe Cooper
  • Jonas Zaddach
  • Markus Kolb
  • Orion Poplawski
  • Ørjan Malde
  • Paul Arthur
  • Rick Wang
  • Romain Chollet
  • Rosen Penev
  • Thomas Jarosch
  • Tomasz Kojm

Finally, we'd like to thank Joe McGrath for building our quality assurance test suite and for working diligently to ensure knowledge transfer up until his last day on the team. Working with you was a pleasure, Joe, and we wish you the best of luck in your next adventure!

Wednesday, August 21, 2019

Today we have published the ClamAV 0.101.4 security patch release.

0.101.4


ClamAV 0.101.4 is a security patch release that addresses the following issues.
  •  An out-of-bounds write was possible within ClamAV's NSIS bzip2 library when attempting decompression in cases where the number of selectors exceeded the max limit set by the library (CVE-2019-12900). The issue has been resolved by respecting that limit.

    Thanks to Martin Simmons for reporting the issue here.
  •  The zip bomb vulnerability mitigated in 0.101.3 has been assigned the CVE identifier CVE-2019-12625. Unfortunately, a workaround for the zip-bomb mitigation was immediately identified. To remediate the zip-bomb scan time issue, a scan time limit has been introduced in 0.101.4. This limit now resolves ClamAV's vulnerability to CVE-2019-12625.

    The default scan time limit is 2 minutes (120000 milliseconds).

    To customize the time limit:
    - use the clamscan  --max-scantime option
    - use the clamd  MaxScanTime config option

    Libclamav users may customize the time limit using the cl_engine_set_num function. For example:

        cl_engine_set_num(engine, CL_ENGINE_MAX_SCANTIME, time_limit_milliseconds);


    Thanks to David Fifield for reviewing the zip-bomb mitigation in 0.101.3 and reporting the issue.
As usual, ClamAV may be downloaded from https://www.clamav.net/downloads, and discussion should take place on the ClamAV-Users list.  Thanks!

Monday, August 5, 2019

We are pleased to introduce the ClamAV 0.101.3 security patch release and a beta for the upcoming 0.102 feature release.

Both of these can be found on ClamAV's downloads site, with 0.101.3 in the "latest stable release" section and 0.102.0-beta in the beta section.

0.101.3

ClamAV 0.101.3 is a patch release to address a vulnerability to non-recursive zip bombs.

A Denial-of-Service (DoS) vulnerability may occur when scanning a zip bomb as a result of excessively long scan times. The issue is resolved by detecting the overlapping local file headers which characterize the non-recursive zip bomb described by David Fifield.
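The overlap check itself is simple to picture. The Python sketch below is not ClamAV's implementation; it assumes each zip entry has already been reduced to a hypothetical `(offset, size)` byte range covering its local file header and data, and it reports whether any two ranges overlap.

```python
def has_overlapping_entries(entries):
    """Return True if any two (offset, size) byte ranges overlap.

    In a well-formed archive each entry occupies its own region of the file.
    The non-recursive zip bomb reuses the same compressed data for many
    entries, so their ranges overlap.
    """
    furthest_end = 0
    for offset, size in sorted(entries):
        if offset < furthest_end:
            return True  # this entry starts inside an earlier one
        furthest_end = max(furthest_end, offset + size)
    return False
```

Sorting by offset first means a single pass is enough, no matter how many entries the archive claims to contain.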

Thank you to Hanno Böck for reporting the issue as it relates to ClamAV, here.

Also included in 0.101.3:
  • Update of the bundled libmspack library from 0.8alpha to 0.10alpha, to address a buffer overflow vulnerability in libmspack < 0.9.1α.


0.102-beta

ClamAV 0.102.0 includes an assortment of improvements and a couple of significant changes.

Major changes

  • The On-Access Scanning feature has been migrated out of clamd and into a brand new utility named clamonacc. This utility is similar to clamdscan and clamav-milter in that it acts as a client to clamd. This separation from clamd means that clamd no longer needs to run with root privileges while scanning potentially malicious files. Instead, clamd may drop privileges to run under an account that does not have super-user privileges. In addition to improving the security posture of running clamd with On-Access enabled, this update fixed a few outstanding defects:
    • On-Access scanning for created and moved files (Extra-Scanning) is fixed.
    • VirusEvent for On-Access scans is fixed.
    • With clamonacc, it is now possible to copy, move, or remove a file if the scan triggered an alert, just like with clamdscan. For details on how to use the new clamonacc On-Access scanner, please refer to the user manual on ClamAV.net, and keep an eye out for a new blog post on the topic.
  • The freshclam database update utility has undergone a significant update. This includes:
    • Added support for HTTPS.
    • Support for database mirrors hosted on ports other than 80.
    • Removal of the mirror management feature (mirrors.dat).
    • An all new libfreshclam library API.

Notable changes

  • Added support for extracting ESTsoft .egg archives. This feature is new code developed from scratch using ESTsoft's Egg-archive specification and without referencing the UnEgg library provided by ESTsoft. This was necessary because the UnEgg library's license includes restrictions limiting the commercial use of the UnEgg library.
  • The documentation has moved!
    • Users should navigate to ClamAV.net to view the documentation online.
    • The documentation will continue to be provided in HTML format with each release for offline viewing in the docs/html directory.
    • The new home for the documentation markdown is in our ClamAV FAQ Github repository.

Other improvements

  • Improved Windows executable Authenticode handling, enabling both whitelisting and blacklisting of files based on code-signing certificates. Additional improvements to Windows executable (PE file) parsing. Work courtesy of Andrew Williams.
  • Added support for creating bytecode signatures for Mach-O and ELF executable unpacking. Work courtesy of Jonas Zaddach.
  • Re-formatted the entire ClamAV code-base using clang-format in conjunction with our new ClamAV code style specification. See the clamav.net blog post for details.
  • Integrated ClamAV with Google's OSS-Fuzz automated fuzzing service with the help of Alex Gaynor. This work has already proven beneficial, enabling us to identify and fix subtle bugs in both legacy code and newly developed code.
  • The clamsubmit tool is now available on Windows.
  • The clamscan metadata feature (--gen-json) is now available on Windows.
  • Significantly reduced the number of warnings generated when compiling ClamAV with "-Wall" and "-Wextra" compiler flags and made many subtle improvements to the consistency of variable types throughout the code.
  • Updated the majority of third-party dependencies for ClamAV on Windows. The source code for each has been removed from the clamav-devel repository. This means that these dependencies have to be compiled independently of ClamAV. The added build process complexity is offset by significantly reducing the difficulty of releasing ClamAV with newer versions of those dependencies.
  • During the 0.102 development period, we've also improved our Continuous Integration (CI) processes. Most recently, we added a CI pipeline definition to the ClamAV Git repository. This chains together our build and quality assurance test suites and enables automatic testing of all proposed changes to ClamAV, with customizable parameters to suit the testing needs of any given code change.

Bug fixes

  • Fix to prevent a possible crash when loading LDB type signature databases and PCRE is not available. Patch courtesy of Tomasz Kojm.
  • Fixes to the PDF parser that will improve PDF malware detection efficacy. Patch courtesy of Clement Lecigne.
  • Fix for regular expression phishing signatures (PDB R-type signatures).
  • Various other bug fixes.

New Requirements

  • Libcurl has become a hard-dependency. Libcurl enables HTTPS support for freshclam and clamsubmit as well as communication between clamonacc and clamd.
  • Libcurl version >= 7.45 is required when building ClamAV from source with the new On-Access Scanning application (clamonacc). Users on Linux operating systems that package older versions of libcurl (e.g. all versions of CentOS and Debian versions <= 8) have a number of options:
    • Wait for your package maintainer to provide a newer version of libcurl.
    • Install a newer version of libcurl from source.
    • Disable installation of clamonacc and On-Access Scanning capabilities with the ./configure flag --disable-clamonacc.

      Non-Linux users do not need to take any action, as they are unaffected by this new requirement.

Acknowledgements

The ClamAV team thanks the following individuals for their code submissions:
  • Alex Gaynor
  • Andrew Williams
  • Carlo Landmeter
  • Chips
  • Clement Lecigne
  • Paul Arthur
  • Jonas Zaddach
  • Ørjan Malde
  • Rick Wang
  • Rosen Penev
  • Thomas Jarosch
  • Tomasz Kojm

Finally, we'd like to thank Joe McGrath for building our quality assurance test suite and for working diligently to ensure knowledge transfer up until his last day on the team. Working with you was a pleasure, Joe, and we wish you the best of luck in your next adventure!

Tuesday, March 26, 2019

ClamAV 0.101.2

ClamAV 0.101.2 is a patch release to address a handful of security related bugs.

This patch release is being released alongside the 0.100.3 patch so that users
who are unable to upgrade to 0.101 due to libclamav API changes are protected.

This release includes 3 extra security related bug fixes that do not apply to
prior versions.  In addition, it includes a number of minor bug fixes and
improvements.

- Fixes for the following vulnerabilities affecting 0.101.1 and prior:
  - CVE-2019-1787:
    An out-of-bounds heap read condition may occur when scanning PDF
    documents. The defect is a failure to correctly keep track of the number
    of bytes remaining in a buffer when indexing file data.
  - CVE-2019-1789:
    An out-of-bounds heap read condition may occur when scanning PE files
    (i.e. Windows EXE and DLL files) that have been packed using Aspack as a
    result of inadequate bound-checking.
  - CVE-2019-1788:
    An out-of-bounds heap write condition may occur when scanning OLE2 files
    such as Microsoft Office 97-2003 documents. The invalid write happens when
    an invalid pointer is mistakenly used to initialize a 32-bit integer to
    zero. This is likely to crash the application.

- Fixes for the following vulnerabilities affecting 0.101.1 and 0.101.0 only:
  - CVE-2019-1786:
    An out-of-bounds heap read condition may occur when scanning malformed PDF
    documents as a result of improper bounds-checking.
  - CVE-2019-1785:
    A path-traversal write condition may occur as a result of improper input
    validation when scanning RAR archives. Issue reported by aCaB.
  - CVE-2019-1798:
    A use-after-free condition may occur as a result of improper error
    handling when scanning nested RAR archives. Issue reported by David L.

- Fixes for the following assorted bugs:
  - Added checks to prevent shifts from causing undefined behavior in HTML
    normalizer, UPX unpacker, ARJ extractor, CPIO extractor, OLE2 parser,
    LZW decompressor used in the PDF parser, Xz decompressor, and UTF-16 to
    ASCII transcoder.
  - Added checks to prevent integer overflow in UPX unpacker.
  - Fix for minor memory leak in OLE2 parser.
  - Fix to speed up PDF parser when handling truncated (or malformed) PDFs.
  - Fix for memory leak in ARJ decoder failure condition.
  - Fix for potential memory and file descriptor leak in HTML normalization code.

- Removed use of problematic feature that converted file descriptors to
  file paths. The feature was intended to improve performance when scanning
  file types, notably RAR archives, for which the API requires a file path.
  This feature caused issues in environments where the ClamAV engine is run
  in a low-permissions or sandboxed process. RAR archives are still supported
  with this change, but performance may suffer slightly if the file path is not
  provided in calls to `cl_scandesc_callback()`.
  - Added filename and tempfile names to scandesc calls in clamd.
  - Added general scan option `CL_SCAN_GENERAL_UNPRIVILEGED` to treat the scan
    engine as unprivileged, meaning that the scan engine will not have read
    access to the file. Provided file paths are for logging purposes only.
  - Added ability to create a temp file when scanning RAR archives when the
    process does not have read access to the file path provided (i.e.
    unprivileged is set, or an access check fails).

Thank you to the Google OSS-Fuzz project for identifying and reporting many of
the bugs patched in this release.

Additional thanks to the following community members for submitting bug reports:

- aCaB
- David L.

ClamAV 0.100.3

ClamAV 0.100.3 is a patch release to address a few security related bugs.

This patch release is being released alongside the 0.101.2 patch so that users
who are unable to upgrade to 0.101 due to libclamav API changes are protected.

The bug fixes in this release are limited to security-related bugs only.
Users are encouraged to upgrade to 0.101.2 for additional improvements.

- Fixes for the following vulnerabilities:
  - CVE-2019-1787:
    An out-of-bounds heap read condition may occur when scanning PDF
    documents. The defect is a failure to correctly keep track of the number
    of bytes remaining in a buffer when indexing file data.
  - CVE-2019-1789:
    An out-of-bounds heap read condition may occur when scanning PE files
    (i.e. Windows EXE and DLL files) that have been packed using Aspack as a
    result of inadequate bound-checking.
  - CVE-2019-1788:
    An out-of-bounds heap write condition may occur when scanning OLE2 files
    such as Microsoft Office 97-2003 documents. The invalid write happens when
    an invalid pointer is mistakenly used to initialize a 32-bit integer to
    zero. This is likely to crash the application.

Thank you to the Google OSS-Fuzz project for identifying and reporting the bugs
patched in this release.

Both of these can be found on ClamAV's downloads site, with 0.101.2 as the main release and 0.100.3 under "Previous Stable Releases".

Wednesday, February 20, 2019


The ClamAV Development team is looking to add a new C programmer to the team, located in Fulton, Maryland USA!

Our team is a small and friendly one. At present, we have two software engineers and one quality assurance engineer. All three of us work in the Cisco office in Fulton, MD where we occupy a comfortable corner complete with our own snack area and Nerf gun arsenal. And we're not simply legacy software maintainers: we actively design and engineer new features while improving existing ones. If you’re seeking a long-term, head-in-the-sand, maintenance job, look elsewhere. Our team is constantly engaged. From discussing new ways to improve our processes, to opening ourselves to new concepts, our willingness to learn and explore new ideas leads us to write better code, test more effectively, and ultimately build a better product.

As a member of the ClamAV team, you wouldn’t be working in a vacuum, and you wouldn’t get stuck fixing bugs in legacy software day-in and day-out. The team has an ever-evolving list of cool, new ideas to work on. Project planning is a collaborative effort. We derive requirements from community requests as well as from our partners in Talos Malware Research, Talos Web Team, and stakeholders across a handful of Cisco product groups. We work as a team to find the best way to tackle the more difficult challenges. The ClamAV Dev team is part of a larger group of skilled developers, researchers, and analysts, with whom we collaborate daily. Working on ClamAV, you'll always have a friendly and highly-skilled talent pool to draw upon and share ideas with.

Our team performs for a wide audience. ClamAV is oft regarded as an anti-malware product, but our tools do more than protect endpoint devices. In fact, ClamAV is run on most mail servers and many web servers on the internet, protecting the world from malware transmitted via email and file upload. What’s more, our open source scanning and matching technology is baked into many of Cisco's core products. As a consequence, our team is responsible for maintaining a healthy relationship not only with the open source community but also with those Cisco product development teams that integrate ClamAV into their software.

Speaking of Talos, the ClamAV team is a core component of the largest and coolest commercial threat intelligence organization on the planet. Our offices are new and shiny. Our workforce is skilled, but also casual. We enjoy our work, and we know how to relax, too. That means our office has an active roster of video gamers, board gamers, ping pong players, and more.

Ok … so the work sounds good to you, and the office sounds great too… but your schedule can be a little crazy, and sometimes you need to stay home in your pajamas. No problem. Our team generally works from home once a week. Usually we work from home on Fridays, but things can be adjusted as needed to accommodate life's inescapable work-week appointments.

Does this job sound too good to be true? Don't take my word for it. This year, Cisco placed #6 in Fortune’s rankings of the 100 best companies to work for! Our offices in Fulton, MD and in San Jose, CA are no exception.

To see the required skills, please check out the job listing on our careers page on TalosIntelligence.com.

Friday, February 15, 2019

We are currently experiencing issues with the safebrowsing publishing system.  This means that we are unable to publish a new safebrowsing.cvd file for the time being.

We are working on resolving the issue and will post an update when this is corrected.


ClamAV has adopted the use of Clang-Format for the purposes of improving the readability and maintainability of our code base.

Contributors to ClamAV, and those wishing to adopt the same format rules for their own projects, are advised to use the latest version of Clang-Format available. Some of the rules we have selected require version 7 as a minimum. Details are included below on how to install Clang-Format, including recommended plugins for popular text editors.

Some of our readers may be concerned that auto-formatting the code base will make it difficult to identify when a bug was introduced. The concern is valid, but we have decided that the benefits in code readability outweigh the extra steps that may be needed when viewing the git history. As a reminder, `git blame` and `git diff` have the `-w` option allowing you to ignore changes to white-space, which can help tremendously when analyzing changes. The ignore-whitespace feature can also be enabled in most Git plugins for text editors, such as VSCode’s GitLens extension.

Auto-formatting exemptions

In a few instances, we had to modify the code to ensure that Clang-Format would not incorrectly format the code. This is because there are still some cases where manually formatted code is more legible than Clang-Format auto-formatted code.

Situations where formatting requires some manual finesse:

  1. The `try` keyword may not be used as a variable name because Clang-Format will interpret it in the context of exception handling. Avoid naming variables with reserved words such as `try` to alleviate such issues.
  2. Programmers often choose to align consecutive macros to make related constants easier to identify at a glance. The AlignConsecutiveMacros Clang-Format feature is still a work in progress so we are not yet able to automatically align consecutive macros.
  3. A similar issue to the alignment of consecutive macros is the alignment of large array and large struct declarations. Clang-Format generally does an okay job formatting single and multi-dimensional array and/or structure declarations. However, manual formatting in this case is generally better.

To address items 2 and 3 above, we have resorted to using the `// clang-format on/off` markers to protect specific blocks of code from being modified by Clang-Format.

ClamAV code examples where manual formatting was better than Clang-Format:
Preparatory steps taken to ensure correct formatting when working with Clang-Format can be found here.

This commit in our Git history implemented our Clang-Format rules for the first time. You may find it interesting to peruse to see how the rules affected the code base.

Clang-Format Usage

To support our chosen code format style, we have included the following in the clamav-devel dev/0.102 branch:


Formatting options:
  • Run the following in a directory containing a “.clang-format” ruleset. It will apply the rules in-place to the file or directory you specified:

    ~/workspace/clamav-devel$ clang-format -i <filepath>
  • If you’re working regularly with ClamAV, or another project configured to use Clang-Format, you may wish to set the workspace settings for your editor to format-on-save automatically for you.
    IMPORTANT: Be sure that you do NOT enable format-on-save globally for all projects, or you may inadvertently make large changes to code in projects that do not provide a Clang-Format ruleset.
  • If you have made many changes across the ClamAV code base, you may wish to run our clam-format convenience script:

    ~/workspace/clamav-devel$ ./clam-format
    This script does three things:
      1. Re-generate the .clang-format config, specifying the specific format style rules our team has chosen to enforce (as described in the next section).
      2. Update the format for all ClamAV-owned code in the repository in-place.
      3. Undo changes from step (2) to specific files that should not be reformatted.

The Clam C/C++ Format Style

ClamAV’s format style was chosen not only because we find it attractive and legible, but also because it aligned best with the existing ClamAV code prior to enacting the new rules. We have done our best to minimize frivolous changes to the code base using the options that Clang-Format provides.

The Clam C/C++ Format Style uses the following Clang-Format options:

Language: Cpp, 

These rules only apply to C & C++ code.

UseTab: Never,
IndentWidth: 4,

Code must not use tabs for indentation, and must use 4 spaces for indentation.

AlignTrailingComments: true,

Comments after code on consecutive lines will be aligned.

Example:
int a;     // My comment a
int b = 2; // comment b


AlignConsecutiveAssignments: true,

Variable assignments on consecutive lines will be aligned.

Example:
int aaaa = 12;
int b    = 23;

int ccc  = 23;

AlignAfterOpenBracket: true,

Function arguments split onto multiple lines will be aligned.

Example:

someLongFunction(argument1,
                 argument2);


AlignEscapedNewlines: Left, 

Backslashes for escaped newlines will be aligned as far left as possible.

Example:
#define A     \
    int aaaa; \
    int b;    \
    int dddddddddd;


AlignOperands: true,

Operands split onto multiple lines will be aligned.

Example:
int aaa = bbbbbbbbbbbbbbb +
          ccccccccccccccc;


AllowShortFunctionsOnASingleLine: Empty,

Only empty functions may be defined on a single line.

Example:
void f() {}
void f2() {
    bar2();
}


AllowShortIfStatementsOnASingleLine: true,
Short if-statements may be written on a single line.

Example:
if (a) return;

AllowShortLoopsOnASingleLine: true,
Short loops may be written on a single line.

Example:
while (true) continue;

BreakBeforeBraces: Linux,

The Linux style of breaking before braces will be used. In this style, there is a new line before the opening brace on function, namespace and class definitions, but control statements such as if, else, for, do, while, and try, as well as braced declarations such as enum and struct, do not get a new line before the opening brace.

Example:
try {
    foo();
} catch () {
}

void foo() { bar(); }

class foo
{
};

if (foo()) {
} else {
}

enum X : int { A, B };


BreakBeforeTernaryOperators: true,

Multiline ternary statements will place the ternary operators before the values, otherwise clang-format will reformat the statement into a single line.

Example:

veryVeryVeryVeryVeryVeryVeryVeryVeryVeryVeryLongDescription
    ? firstValue
    : SecondValueVeryVeryVeryVeryLong;


ColumnLimit: 0,

A column limit of 0 means that there is no column limit. In this case, Clang-Format will respect the input’s line breaking decisions within statements unless they contradict other rules.

FixNamespaceComments: true,

The closing brace of a namespace will include a comment providing the namespace name.

Example:
namespace a {
foo();
} // namespace a


SortIncludes: false,

Header #include statements will not be automatically rearranged by alphabetical order, because the specific order of header includes may inadvertently or intentionally affect compilation.

MaxEmptyLinesToKeep: 1,

Only 1 empty line may exist between lines of code.

SpaceBeforeParens: ControlStatements,

Control statements will have space before the opening paren. Functions and macros will not.

Example:
void f() {
    if (true) {
        f();
    }
}


IndentCaseLabels: true,

Case labels in switch statements will be indented.

Example:
switch (fool) {
    case 1:
        bar();
        break;
    default:
        plop();
}

DerivePointerAlignment: true

Pointer alignment preference is up to the author of the file but must be consistent across the file. Inconsistencies in a specific file will be reformatted to match the rest of the file.

Example pointer alignment possibilities:
  • int * a;
  • int* b;
  • int *c;
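
As a rough illustration of how several of these options combine, here is a minimal sketch of a C function laid out the way the rules above describe: 4-space indentation, the function brace on its own line, attached braces for control statements, aligned assignments and trailing comments, a short if on one line, and indented case labels. The function classify and its threshold are hypothetical and exist only to show the layout. Actual clang-format output may differ in small details depending on the version.

```c
#include <stdint.h>

/* Hypothetical function; it exists only to illustrate the formatting. */
static int32_t classify(int32_t value)
{
    int32_t result = 0;   // default classification
    int32_t big    = 100; // hypothetical threshold

    if (value < 0) return -1;

    switch (value) {
        case 0:
            result = 0;
            break;
        default:
            result = (value > big) ? 2 : 1;
            break;
    }

    return result;
}
```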

Install Clang-Format


Clang-Format is easy to install on most operating systems. Note, however, that due to choices we’ve made in our format specification, Clang-Format version 7 or higher is required for ClamAV. If the Clang-Format tool is not provided as a separate package, it can be obtained by installing Clang or LLVM.


CentOS:

sudo yum install epel-release && sudo yum install clang

Fedora:

sudo dnf install clang

Ubuntu & Debian:

sudo apt install clang-format

macOS:

First, install Homebrew.

Then:

brew install clang-format

Windows:

First, install Chocolatey.

Then:

C:\> choco install llvm

Plugins for popular text editors and IDEs


In addition to the above manual command-line usage, clang-format plugins are readily available for most popular text editors, enabling you to auto-format as you edit using commands in your editor, or to auto-format each time you save a file.

VSCode:

https://marketplace.visualstudio.com/items?itemName=xaver.clang-format

Sublime Text:

https://packagecontrol.io/packages/Clang%20Format

VIM & Emacs:

http://clang.llvm.org/docs/ClangFormat.html#vim-integration

https://github.com/rhysd/vim-clang-format

Atom Editor:

https://atom.io/packages/clang-format

Visual Studio:

https://marketplace.visualstudio.com/items?itemName=LLVMExtensions.ClangFormat

Emacs:

https://www.reddit.com/r/emacs/comments/7uq9w1/replace_emacs_c_autoformatting_with_clangformat/

Conventions and recommendations beyond Clang-Format

In addition to the rules enforced by Clang-Format, we have some additional guidelines to use when developing code for ClamAV.

Disclaimer: As a project with a history dating back almost two decades, not all code in ClamAV adheres to these guidelines. New code should, however. When working with older code, update it to follow these guidelines as far as is feasible and as time permits.

Integer Types


Integer variables should use types from stdint that are explicit about their width (int8_t, uint32_t, int64_t, etc.) wherever possible. For portability, these types should be included using #include "clamav-types.h", because not all operating systems or toolchains provide stdint.h or inttypes.h.

Notable exceptions to the above:

ptrdiff_t may be used to hold the signed numerical value of the difference of two pointers. Similarly, intptr_t and uintptr_t may be used to store and perform arithmetic on pointers specifically because they match the width of a pointer on a given machine.

size_t may be used to store unsigned size values as needed.

off_t, ssize_t, int, long, and other poorly defined types of yore may be required (at least temporarily) by library APIs. These should be avoided in general, favoring ptrdiff_t (signed), size_t (unsigned), or an even more explicit type, such as uint32_t or int64_t. The older integer types are not well defined, and their sizes may vary by compiler, OS, and system architecture. Values of these types should be converted to better-defined types as soon as is practical.
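
A minimal sketch of what "converting as soon as is practical" can look like, assuming stdint.h as a stand-in for clamav-types.h. Both helpers, long_to_u32 and span_length, are hypothetical names used for illustration and are not part of ClamAV.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h> /* stand-in for clamav-types.h in this sketch */

/* Hypothetical helper: narrow a `long` returned by a legacy API to uint32_t.
 * Returns false if the value does not fit, leaving *out untouched. */
static bool long_to_u32(long value, uint32_t *out)
{
    if (value < 0 || (unsigned long)value > UINT32_MAX) {
        return false;
    }
    *out = (uint32_t)value;
    return true;
}

/* Hypothetical helper: ptrdiff_t holds the signed distance between two
 * pointers into the same buffer. */
static ptrdiff_t span_length(const uint8_t *begin, const uint8_t *end)
{
    return end - begin;
}
```

Checking the range once, at the boundary with the library call, means the rest of the code can work with a type whose width is known.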

Integer Types in Format Strings

The format characters for stdint variable types are defined in clamav-types.h as of ClamAV 0.101.1.

Example format string usage:
int64_t obj_id = 0;
printf("Object ID: %" PRIi64 "\n", obj_id);


Beyond the stdint types, it is easy to forget how to print size_t, ptrdiff_t, ssize_t, and off_t in a way that is portable across all systems.

For reference, these are:
  • size_t val
    • "%zu", val
  • ptrdiff_t val
    • "%td", val
  • off_t val
    • "%lld", (long long)val
  • ssize_t val
    • "%lld", (long long)val
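
A minimal sketch that combines these specifiers in one call, assuming a POSIX system where sys/types.h provides off_t and ssize_t, and using inttypes.h as a stand-in for clamav-types.h. The function format_values is hypothetical. Note that the PRI macros expand to the conversion letters only, so the "%" must be written in the format string.

```c
#include <inttypes.h>  /* stand-in for clamav-types.h in this sketch */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h> /* off_t, ssize_t (POSIX assumption) */

/* Hypothetical helper: format one value of each type into buf.
 * Returns the snprintf() result. */
static int format_values(char *buf, size_t buf_len, int64_t obj_id, size_t len, off_t offset, ssize_t nread)
{
    return snprintf(buf, buf_len,
                    "id=%" PRIi64 " len=%zu off=%lld read=%lld",
                    obj_id, len, (long long)offset, (long long)nread);
}
```

The casts to long long matter because the widths of off_t and ssize_t differ between platforms, while "%lld" always expects a long long.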

Inline Documentation / Function Comments

Functions should have Doxygen-style comment blocks. These comment blocks should be located where other developers can easily read them when working with the functions.

That is to say that:

  • For library APIs, the Doxygen comment block should appear above the function prototype in the .h header file. For an example, see clamav.h.
  • For almost all other function types, it is best to place the Doxygen comment block above the implementation in the .c / .cc / .cpp file.

Tip: Many text editors offer Doxygen extensions that significantly assist in writing function comments. For VSCode, the Doxygen Documentation Generator extension is quite helpful.

Example comment block including a brief, description, in, out, and in/out parameters:

/**
 * @brief Scan a file, given a file descriptor.
 *
 * This callback variant allows the caller to provide a context structure that caller provided callback functions can interpret.
 *
 * @param desc File descriptor of an open file. The caller must provide this or the map.
 * @param filename (optional) Filepath of the open file descriptor or file map.
 * @param[out] virname Will be set to a statically allocated (i.e. needs not be freed) signature name if the scan matches against a signature.
 * @param[out] scanned The number of bytes scanned.
 * @param engine The scanning engine.
 * @param scanoptions Scanning options.
 * @param[in,out] context An opaque context structure allowing the caller to record details about the sample being scanned.
 * @return cl_error_t CL_CLEAN, CL_VIRUS, or an error code if an error occurred during the scan.
 */
extern int cl_scandesc_callback(int desc, const char *filename, const char **virname, unsigned long int *scanned, const struct cl_engine *engine, struct cl_scan_options *scanoptions, void *context);
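
For functions that are not part of a library API, the comment block sits above the implementation instead. A minimal sketch of that case follows. The function checksum_update is hypothetical and not part of ClamAV. It uses Doxygen's direction attributes [in], [out], and [in,out] on every parameter.

```c
#include <stddef.h>
#include <stdint.h>

/**
 * @brief Add the bytes of a buffer to a running checksum.
 *
 * Hypothetical helper used only to illustrate comment placement.
 *
 * @param[in] data        Buffer to read.
 * @param[in] len         Number of bytes in data.
 * @param[in,out] sum     Running checksum; updated in place.
 * @param[out] processed  Set to the number of bytes consumed.
 * @return int 0 on success, -1 if a required pointer is NULL.
 */
static int checksum_update(const uint8_t *data, size_t len, uint32_t *sum, size_t *processed)
{
    size_t i;

    if (NULL == data || NULL == sum || NULL == processed) {
        return -1;
    }

    for (i = 0; i < len; i++) {
        *sum += data[i];
    }

    *processed = len;
    return 0;
}
```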

Packed Struct Definitions

Some structs require packing to ensure that they represent data identically when written to or read from disk or the network.

Take the example struct:
struct example {
    short y;
    int x;
    char z;
};

Without packing, the compiler might optimize execution time by representing the struct on disk as the following bytes:
y y _ _ x x x x z _ _ _

(Each _ is a padding byte.)

In reality, the programmer probably envisioned the struct without all that extra padding:
y y x x x x z


To force the compiler to properly pack a struct so that it is portable, surround the struct definition with the following:

#ifndef HAVE_ATTRIB_PACKED
#define __attribute__(x)
#endif

#ifdef HAVE_PRAGMA_PACK
#pragma pack(1)
#endif

#ifdef HAVE_PRAGMA_PACK_HPPA
#pragma pack 1
#endif

/* struct definition */
struct example_struct
{
    ...
} __attribute__((packed));

#ifdef HAVE_PRAGMA_PACK
#pragma pack()
#endif

#ifdef HAVE_PRAGMA_PACK_HPPA
#pragma pack
#endif


Note that all THREE preprocessor directive forms are necessary for cross-platform compatible code, as per https://bugzilla.clamav.net/show_bug.cgi?id=1752.
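
To see the effect of packing, here is a minimal sketch that compares the two layouts. It assumes a GCC- or Clang-compatible compiler that accepts both #pragma pack(1) and __attribute__((packed)), and it omits the HAVE_* guards shown above for brevity. It also swaps short and int for int16_t and int32_t so the field widths are fixed. The struct names and the function example_is_tightly_packed are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural alignment: the compiler is free to insert padding. */
struct example_unpacked {
    int16_t y;
    int32_t x;
    char z;
};

/* Packed: no padding between or after the members. */
#pragma pack(1)
struct example_packed {
    int16_t y;
    int32_t x;
    char z;
} __attribute__((packed));
#pragma pack()

/* Returns 1 when the packed struct has the 7-byte layout y y x x x x z. */
static int example_is_tightly_packed(void)
{
    return sizeof(struct example_packed) == 7 &&
           offsetof(struct example_packed, x) == 2 &&
           offsetof(struct example_packed, z) == 6;
}
```

The size of the unpacked struct is up to the compiler; on common platforms it is larger than 7 bytes because x is aligned to a 4-byte boundary.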