ClamAV® blog: Brief Re-introduction to ClamAV Bytecode Signatures

Wednesday, November 19, 2014

Brief Re-introduction to ClamAV Bytecode Signatures

Bytecode signatures are a specialized type of ClamAV signature which is able to perform additional processing of the scanned file and allow for more robust detection. Unlike the standard ClamAV signature types, bytecode signatures have a number of unique distinctions which need to be respected for their effective usage.

Bytecode Signature Generation

The major distinction between bytecode signatures and the other ClamAV based standard signature languages is that bytecode signatures are actually compiled from a user-written source file, similar to Java bytecode. The tool used to generate bytecode signatures from source is the clambc-compiler which is a separate project from ClamAV.

You can get it by using one of these commands:
git clone git://github.com/vrtadmin/clamav-bytecode-compiler (recommended for git <1.7)
git clone https://github.com/vrtadmin/clamav-bytecode-compiler (recommended for git 1.7+)

The repository can also be browsed online here:
https://github.com/vrtadmin/clamav-bytecode-compiler

Once the source is acquired, read the README in the project to compile and install the clambc-compiler. To compile the bytecode signature, simply run the command:

clambc-compiler [options] [source]

For information on how to write bytecode source, please refer to the clambc-compiler documentation or other blog posts.

Running Bytecode Signatures in ClamAV

Due to the nature of how bytecode signatures are ran in ClamAV, there are a number of pre-cautions taken to ensure safety of the bytecode signature execution.

Trust

Bytecode signatures, by default, are considered untrusted. In fact, only bytecode signatures published by Cisco, in the bytecode.cvd are considered “trusted”. This means that the ClamAV engine will, by default, never load, trigger or execute untrusted bytecodes. One can bypass this safety mechanism by specifying the bytecode unsigned option to the engine but it should be noted that it is up to the user’s discretion on using untrusted bytecode signatures.

For clamscan, the command line option is --bytecode-unsigned.

For clamd, one would need to specify BytecodeUnsigned yes to clamd.conf.

Timeout

Bytecode signatures are designed to only run for a limited amount of time designated by an internal timeout value. If execution time exceeds the value, the bytecode signature’s execution is terminated and the user is notified. The bytecode signature timeout value can be set by the user.

For clamscan, the command line is --bytecode-timeout=[time in ms].

For clamd, one would specify BytecodeTimeout [time in ms] to clamd.conf.

Bytecode Databases

Bytecode signatures are stored in a separate database from the standard ClamAV signatures. In fact, it is impossible to generate database files (with sigtool) that contain both bytecode signatures and standard signatures. Bytecode databases are generated by building a database that is named “bytecode.*” which triggers specialized handling by sigtool. Sigtool will add all bytecode signatures in the specified directory regardless if the name of the signature matches the name of the database. Bytecode signatures thus can be named anything provided the extension is “.cbc”.

Bytecode databases generated this way are still considered untrusted (unless published by Cisco) which means one needs to still specify the appropriate flags to use the database.

Issue Reporting

If anyone encounters issue with bytecode signatures, whether within the clambc-compiler or within ClamAV, they can report them to https://bugzilla.clamav.net/. Be sure to include the bytecode signature, bytecode source(if possible), and any other pieces of useful information.