After a lot of hard work by our teams, and with RSA just a few days away, we are proud to announce that, in addition to Cisco and Sourcefire's corporate teams being present at RSA, we will for the first time be holding an Open Source Community Meeting!
Matt Watchinski (Director of the Vulnerability Research Team) and I, Joel Esler (Open Source Manager), will be presenting on the state of our Open Source projects at Sourcefire, the state of Open Source now that we are Cisco, and some future developments, followed of course by open Q&A!
Here are the attendance details:
Open Source Community Meeting
Executive Conference Center
55 4th Street -- Level 2
San Francisco, CA 94103
Wednesday, February 26th, 2014
12:00pm - 2:00pm
Lunch will be provided on site.
We also have some exclusive swag giveaways that aren't available anywhere else! They are available to the first 40 people who come through the door (if we have your size).
We'll have room for about 50 people on site, so it's first come, first served. Let's make this a repeating event!
We look forward to seeing you there!
Wednesday, February 19, 2014
Tuesday, February 18, 2014
I recently wrote a blog post, "Generating ClamAV Signatures with IDAPython and MySQL". In the comments, I was asked for more details on how the script generate_sigs.py groups binaries by functions.
The three tables used were shared in the previous post, but for convenience, here they are again.
binaries - stores information about each sample seen
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| md5   | varchar(32) | NO   |     | NULL    |                |
| size  | int(11)     | NO   |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
functions - stores information about each function seen
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| md5   | varchar(32) | NO   |     | NULL    |                |
| size  | int(11)     | NO   |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
link_table - associates each binary with a set of functions
+---------+---------+------+-----+---------+-------+
| Field   | Type    | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| prog_id | int(11) | NO   | PRI | NULL    |       |
| fn_id   | int(11) | NO   | PRI | NULL    |       |
+---------+---------+------+-----+---------+-------+
To get the groups of binaries you find binaries that share a number of common functions. I'll build out the MySQL query so it is understandable. The inner query is here:
SELECT fn_id,
       group_concat(prog_id ORDER BY prog_id) AS bn_list,  # get a list of binaries
       count(*) AS pcnt                                    # count the binaries in that list
FROM link_table
GROUP BY fn_id
HAVING pcnt > 2                                            # filter the results
ORDER BY bn_list;
This gives a list of functions and their associated binaries if more than 2 binaries are associated with that function.
+-------+----------------------------------------------------------+------+
| fn_id | bn_list                                                  | pcnt |
+-------+----------------------------------------------------------+------+
|   993 | 10,16,63,74,76,87,92,93,124,126,129,135,145              |   13 |
|   994 | 10,16,63,74,76,87,92,93,124,126,129,135,145              |   13 |
|   995 | 10,16,63,74,76,87,92,93,124,126,129,135,145              |   13 |
|  1021 | 11,15,28,77,86,91,136                                    |    7 |
|  1116 | 11,15,28,86,136                                          |    5 |
|  1258 | 12,20,22,127                                             |    4 |
|  1118 | 12,22,127                                                |    3 |
|  1364 | 14,24,140                                                |    3 |
|  1434 | 18,59,68,71,73,83,84,110,119,120,137,138,148,150,154,157 |   16 |
|  1425 | 18,68,71,83,84,110,119,120,138,148,150,154,157           |   13 |
|  1426 | 18,68,71,83,84,110,119,120,138,148,150,154,157           |   13 |
|  1427 | 18,68,71,83,84,110,119,120,138,148,150,154,157           |   13 |
|  1428 | 18,68,71,83,84,110,119,120,138,148,150,154,157           |   13 |
|  1429 | 18,68,71,83,84,110,119,120,138,148,150,154,157           |   13 |
|  1430 | 18,68,71,83,84,110,119,120,138,148,150,154,157           |   13 |
|  1436 | 18,68,71,83,84,110,119,120,138,148,150,154,157           |   13 |
...
This list is quite long, so I've truncated it. As an example, function 1425 has 13 binaries associated with it, and those binaries' ids are listed. That's great, but what we really want is a list of binaries together with the list of functions that associate those binaries. So, we now embed the original query in a similar query that creates a list of functions grouped by the bn_list field.
SELECT bn_list,
       group_concat(fn_id ORDER BY fn_id) AS fn_list  # get a list of functions for each bn_list
FROM (SELECT fn_id,
             group_concat(prog_id ORDER BY prog_id) AS bn_list,
             count(*) AS pcnt
      FROM link_table
      GROUP BY fn_id
      HAVING pcnt > 1
      ORDER BY bn_list) AS t
GROUP BY bn_list
HAVING count(*) > 4;  # get groups connected by > 4 functions
I also added count(*) < 23 to the last line of this query to get readable output. The resulting table is split and truncated below. Each row in bn_list corresponds to the same row in fn_list.
+--------------------------------------------------------+
| bn_list                                                |
+--------------------------------------------------------+
| 121,131                                                |
| 18,68,71,83,84,110,119,120,138,148,150,154,157,167,182 |
| 18,84,119,138,150,157,182                              |
| 19,81,115,173                                          |
| 26,95,142,146,165,183                                  |
| 27,128                                                 |
| 27,37,53,70,79,172                                     |
| 30,64,100                                              |
| 48,50,69,147,168                                       |
| 59,73                                                  |
| 96,105,181                                             |
| 96,181                                                 |
+--------------------------------------------------------+
+--------------------------------------------------------+
| fn_list                                                |
+--------------------------------------------------------+
| 37061,37062,37063,37064,37065,37066,37067,37068        |
| 1425,1426,1427,1428,1429,1430,1436                     |
| 1419,1420,1421,1422,1423,1424,1431,1432,1433,1435      |
| 1437,1438,1439,1440,1441,1442,1443,1444,1445,1446,...  |
| 4359,4360,4361,4362,4363                               |
| 4572,4576,4577,4580,4634,4635,4644                     |
| 4482,4483,4559,4560,4608,4622,4623,4624                |
| 4779,4780,4781,4782,4783,4784,4785,4786,4787,4788,...  |
| 12054,12086,12087,12102,12103,12105,12108,12109,...    |
| 21291,21292,21293,21294,21295,21296,21297,21301,...    |
| 31659,31661,31665,31671,31673                          |
| 31651,31653,31655,31657,31663,31667,31669,31685,31687  |
+--------------------------------------------------------+
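For readers who want to follow the logic outside of MySQL, here is a minimal Python sketch of the same two-level grouping. It is not part of generate_sigs.py; the function name group_binaries and its parameters are made up for illustration, and the thresholds mirror the HAVING clauses of the nested query.

```python
from collections import defaultdict

def group_binaries(links, min_binaries=2, min_functions=5):
    """Group binaries by shared functions.

    links is an iterable of (prog_id, fn_id) rows, as stored in link_table.
    Returns a dict mapping a sorted tuple of binary ids to the sorted list
    of function ids shared by exactly that set of binaries.
    """
    # Inner query: for each function, collect the binaries that contain it.
    fn_to_bins = defaultdict(set)
    for prog_id, fn_id in links:
        fn_to_bins[fn_id].add(prog_id)

    # Outer query: group the functions by their identical binary list.
    bins_to_fns = defaultdict(list)
    for fn_id, bins in fn_to_bins.items():
        if len(bins) >= min_binaries:            # HAVING pcnt > 1
            bins_to_fns[tuple(sorted(bins))].append(fn_id)

    # Keep groups tied together by enough functions (HAVING count(*) > 4).
    return {bn: sorted(fns) for bn, fns in bins_to_fns.items()
            if len(fns) >= min_functions}
```

Calling group_binaries with the rows of link_table should produce the same pairing of bn_list and fn_list that the query returns, just as Python tuples and lists instead of comma-separated strings.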
This leaves us with a list of binaries grouped by more than 4 shared functions. This isn't perfect for creating signatures because some binary lists are contained completely in other binary lists. For example:
+--------------------------------------------------------+
| bn_list                                                |
+--------------------------------------------------------+
| 18,68,71,83,84,110,119,120,138,148,150,154,157,167,182 |
| 18,84,119,138,150,157,182                              |
+--------------------------------------------------------+
+--------------------------------------------------------+
| fn_list                                                |
+--------------------------------------------------------+
| 1425,1426,1427,1428,1429,1430,1436                     |
| 1419,1420,1421,1422,1423,1424,1431,1432,1433,1435      |
+--------------------------------------------------------+
Since one entry's binary list is a subset of the other entry's binary list, we can delete the shorter list and avoid having largely duplicate functionality. This is done programmatically once the query is returned to the Python script. I hope this fills in any blanks on grouping the functions. After this, another script is called to extract basic blocks from common functions and generate a signature with those bytes. The original post goes into a bit more detail on that, so I will end here.
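The subset removal could look something like the sketch below. This is an illustration rather than the code from generate_sigs.py; prune_subsets is a hypothetical name, and it assumes the query rows have already been split into tuples of integer ids.

```python
def prune_subsets(groups):
    """Drop every group whose binary list is contained in another group's.

    groups is a list of (bn_list, fn_list) pairs, each a tuple of ids.
    If two groups have identical binary lists, only the first is kept.
    """
    sets = [set(bn) for bn, _ in groups]
    kept = []
    for i, (bn, fn) in enumerate(groups):
        contained = any(
            i != j and (sets[i] < sets[j] or (sets[i] == sets[j] and j < i))
            for j in range(len(groups))
        )
        if not contained:
            kept.append((bn, fn))
    return kept
```

With the two rows from the example above, the shorter list (18,84,119,138,150,157,182) is dropped because every one of its binaries also appears in the longer list.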
Monday, February 17, 2014
I am pleased to announce the creation of a new ClamAV signatures contribution program. My name is Alain Zidouemba and I will be managing this program.
If you would like to submit a ClamAV signature, you may do so by emailing community-sigs [at] lists [dot] clamav [dot] net. We will require that each signature:
- not be a hash-based signature
- be accompanied by an MD5/SHA1/SHA256 hash of a sample the signature is meant to detect
- come with a brief description of the threat the signature is trying to detect and what the signature is looking for
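If you need to produce the hashes for a sample, a short Python sketch such as the following would do. The helper name sample_hashes is only an example; it relies solely on the standard hashlib module and reads the file in chunks so that large samples do not have to fit in memory.

```python
import hashlib

def sample_hashes(stream, chunk_size=65536):
    """Return the MD5, SHA1 and SHA256 hex digests of a binary stream."""
    digests = [hashlib.md5(), hashlib.sha1(), hashlib.sha256()]
    # Read fixed-size chunks until read() returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for d in digests:
            d.update(chunk)
    return {d.name: d.hexdigest() for d in digests}
```

Open the sample in binary mode, for example with open(path, "rb") as fh, and pass fh to sample_hashes.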
Please DO NOT attach malware to your email. Instead, submit your sample here.
Signatures submitted will be tweaked if necessary in order to conform to our standards. After the signature passes quality assurance testing, it will be released with proper attribution unless you prefer to remain anonymous.
You can subscribe to the mailing list here. More information about this program will be added in the FAQ in a few days.
We look forward to a fruitful collaboration on community-sigs [at] lists [dot] clamav [dot] net.
Wednesday, February 12, 2014
Covering malware is a constant fight and the more automation you can integrate, the easier life becomes. This post will go over a relatively easy setup for generating ClamAV signatures based on a set of samples.
I chose to work with OSX malware, specifically targeting Mach-O files. This would give me a relatively small sample set to work with. I downloaded the files from VirusTotal using the search type:macho positives:5+. At the time of download, this yielded 239 samples.
The first problem was grouping samples. Grouping the samples would allow me to generate a single signature for multiple samples. One signature for each sample is costly and leads to a bloated signature set. For this, I set up three MySQL tables.
binaries - stores information about each sample seen
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| md5   | varchar(32) | NO   |     | NULL    |                |
| size  | int(11)     | NO   |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
functions - stores information about each function seen
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| md5   | varchar(32) | NO   |     | NULL    |                |
| size  | int(11)     | NO   |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
link_table - associates each binary with a set of functions
+---------+---------+------+-----+---------+-------+
| Field   | Type    | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| prog_id | int(11) | NO   | PRI | NULL    |       |
| fn_id   | int(11) | NO   | PRI | NULL    |       |
+---------+---------+------+-----+---------+-------+
The table binaries stores a hash of each program, a unique id, and the program's size. The table functions stores the md5sum of the bytes comprising the function, a unique id, and the size of the function. The table link_table links each binary to the functions it contains. The grouping is done based on common functions between binaries.
In order to populate these tables I wrote an IDAPython script. It iterates through the functions of the program, calculates their md5sum, and then inserts that information into the functions table if its length is greater than 19. The value 19 was selected after some light analysis in order to filter out functions that only consisted of a few instructions. Here is the snippet that populates functions and link_table.
# imports assumed by this snippet (IDA 6.x API); cursor, cnx and prog_id
# are set up earlier in the script
from hashlib import md5
import idaapi
from idautils import Functions
from idc import GetManyBytes

# for all function offsets
for fn_ea in Functions():
    if fn_ea is None:
        continue
    # get function from offset
    f = idaapi.get_func(fn_ea)
    # get function bytes
    start = f.startEA
    size = f.endEA - start
    bytes = GetManyBytes(start, size)
    # if the function is sufficiently long
    if bytes is not None and len(bytes) > 19:
        fn_hash = md5(bytes).hexdigest().upper()
        fn_size = str(len(bytes))
        fn_data = (fn_hash, fn_size)
        # get function id, or insert and get function id
        fn_id = get_fn_id(cursor, cnx, fn_data)
        # link binary to function
        link_query = 'REPLACE INTO link_table (prog_id, fn_id) VALUES (%s, %s)'
        link_data = (prog_id, fn_id)
        cursor.execute(link_query, link_data)
        cnx.commit()
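The helper get_fn_id is not shown in this post. As a rough idea of what such a helper might do, here is a sketch that looks up a function by its md5 and inserts it when it is missing. The body is an assumption, as is the extra placeholder parameter, which exists only so the sketch can also be tried against a driver that uses "?" instead of "%s".

```python
def get_fn_id(cursor, cnx, fn_data, placeholder="%s"):
    """Return the id of the row in functions matching fn_data, inserting it if needed.

    fn_data is the (md5, size) tuple built in the snippet above.
    cursor and cnx are a DB-API cursor and connection.
    """
    fn_hash, fn_size = fn_data
    select = "SELECT id FROM functions WHERE md5 = {0}".format(placeholder)
    cursor.execute(select, (fn_hash,))
    row = cursor.fetchone()
    if row is not None:
        return row[0]
    # Not seen before: insert it and return the auto-generated id.
    insert = "INSERT INTO functions (md5, size) VALUES ({0}, {0})".format(placeholder)
    cursor.execute(insert, (fn_hash, fn_size))
    cnx.commit()
    return cursor.lastrowid
```

Looking up by md5 alone assumes that two functions with the same hash are the same function, which is the premise of the whole grouping approach.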
IDA and this script are called by a batch script for every target binary. Once these tables are populated, another script is run, generate_sigs.py. This script uses the MySQL functionality group_concat to group binaries, based on their common functions, into a list. The problem with this approach is that if binaries A, B, and C share functions x, y, and z, and binaries A and C share functions w, x, y, and z, then we will have duplicates in the list returned. To remedy this problem, the script simply loops through the rows returned and, if any list of binaries is completely contained in another list, removes it. Any binary not in these groupings is marked to get its own signature.
Next, the md5sums of the functions common to each group are added to the table communicate. This was the best way for me to pass this information between scripts. Once this table is populated, another IDAPython script is called on the first binary in a group. This script iterates through the functions in the binary and if the function's md5sum matches one in the list of shared functions, its basic blocks are loaded into a table basic_blocks. This table stores the parent function's md5sum, the bytes that comprise the basic block, the basic block's md5sum, the size of the basic block, and its entropy. The byte_ prefix is used to differentiate between attributes of the raw data and the hex encoded version used in the ClamAV signatures.
communicate - used to pass the md5s of common functions
+--------+-------------+------+-----+---------+-------+
| Field  | Type        | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| fn_md5 | varchar(32) | NO   | PRI | NULL    |       |
+--------+-------------+------+-----+---------+-------+
basic_blocks - stores basic block information from functions
+--------------+-------------+------+-----+---------+-------+
| Field        | Type        | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| fn_md5       | varchar(32) | NO   | PRI | NULL    |       |
| hex_bytes    | mediumtext  | NO   |     | NULL    |       |
| bb_md5       | varchar(32) | NO   | PRI | NULL    |       |
| byte_size    | int(11)     | NO   |     | NULL    |       |
| byte_entropy | double      | NO   |     | NULL    |       |
+--------------+-------------+------+-----+---------+-------+
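The post does not say how byte_entropy is calculated. A common choice, and the one assumed in this sketch, is Shannon entropy over the raw bytes of the basic block, measured in bits per byte.

```python
import math
from collections import Counter

def byte_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(bytearray(data))   # frequency of each byte value
    total = float(len(data))
    return -sum((n / total) * math.log(n / total, 2) for n in counts.values())
```

A block made of a single repeated byte scores 0, while a block in which every byte value is equally likely approaches 8, so multiplying by size favors blocks that are both long and varied.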
Once the basic blocks are stored, the IDAPython script completes and returns control to the signature generation script. The basic blocks are then queried and sorted by their parent function and by the metric entropy * size. The script then iterates through the functions and selects the best basic block based on that metric. It continues to do this until it has a sufficient number of bytes. It then constructs an LDB signature.
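One possible reading of that selection step is sketched below. The names pick_blocks and build_ldb, the min_total_bytes threshold, and the choice to take the strongest functions first are all assumptions for illustration. The Engine and Target values are copied from the Flashback signature shown later in this post, and each chosen block becomes one subsignature joined with a logical AND.

```python
def pick_blocks(blocks, min_total_bytes=64):
    """Pick the best basic block per function until enough bytes are collected.

    blocks is a list of dicts with the keys fn_md5, hex_bytes, byte_size
    and byte_entropy, mirroring the columns of basic_blocks.
    """
    best = {}
    for b in blocks:
        score = b["byte_entropy"] * b["byte_size"]
        current = best.get(b["fn_md5"])
        if current is None or score > current[0]:
            best[b["fn_md5"]] = (score, b)
    chosen, total = [], 0
    # Take the highest-scoring functions first (an assumption of this sketch).
    for score, b in sorted(best.values(), key=lambda sb: sb[0], reverse=True):
        if total >= min_total_bytes:
            break
        chosen.append(b)
        total += b["byte_size"]
    return chosen

def build_ldb(name, chosen, target=9, engine="51-255"):
    """Join the chosen blocks into a logical signature line."""
    logic = "&".join(str(i) for i in range(len(chosen)))
    subsigs = ";".join(b["hex_bytes"] for b in chosen)
    return "{0};Engine:{1},Target:{2};{3};{4}".format(
        name, engine, target, logic, subsigs)
```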
With my newly created signatures, I ran a test on all the samples I had downloaded.
----------- SCAN SUMMARY -----------
Known viruses: 107
Engine version: 0.98.1
Scanned directories: 1
Scanned files: 239
Infected files: 190
Data scanned: 78.35 MB
Data read: 81.36 MB (ratio 0.96:1)
Time: 2.332 sec (0 m 2 s)
The interesting lines are highlighted. Since this script should give near total coverage, a detection rate of 190/239, while impressive, did not meet my expectations. Something was amiss. My colleague Shaun Hurley noticed that 64 bit Mach-O files were being neglected. Thinking about it, this made sense. IDA has different versions for 32 bit and 64 bit files. I modified the scripts to use idaw64.exe and reran them on the 64 bit binaries. The combined signature set was more impressive.
----------- SCAN SUMMARY -----------
Known viruses: 155
Engine version: 0.98.1
Scanned directories: 1
Scanned files: 239
Infected files: 232
Data scanned: 78.82 MB
Data read: 81.36 MB (ratio 0.97:1)
Time: 2.535 sec (0 m 2 s)
Great success!
This method does have some drawbacks. Since I was running it in a VM, concerns about hard disk space influenced the choice to group based on functions rather than grouping based on basic blocks. This will be fixed by offloading MySQL to a more dedicated machine whose hard drive I can fill up. As well, only common functions between the binaries are considered when selecting basic blocks. This was an oversight on my part since other functions may not be exact matches but could share a lot of common code. With the extra database space, I do not think grouping based on basic blocks is an unreasonable task for these relatively small sets of samples. Building in automatic identification of 32 bit and 64 bit files would remove some manual effort from the process.
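Automatic identification could be as simple as checking the magic number at the start of each file before choosing which IDA executable to launch. The sketch below is not part of the current scripts, and macho_kind is a hypothetical name. It relies on the Mach-O magic values 0xFEEDFACE (32-bit) and 0xFEEDFACF (64-bit), which may appear in either byte order, and on 0xCAFEBABE for fat (universal) binaries.

```python
import struct

MH_MAGIC = 0xFEEDFACE      # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O
FAT_MAGIC = 0xCAFEBABE     # fat/universal binary (Java class files share this value)

def macho_kind(header):
    """Classify a file from its first four bytes."""
    if len(header) < 4:
        return "unknown"
    big = struct.unpack(">I", header[:4])[0]
    little = struct.unpack("<I", header[:4])[0]
    if MH_MAGIC in (big, little):
        return "32-bit"
    if MH_MAGIC_64 in (big, little):
        return "64-bit"
    if big == FAT_MAGIC:
        return "fat"
    return "unknown"
```

Reading four bytes with open(path, "rb") and passing them to macho_kind is enough to route a sample. Fat binaries contain several architectures and would need their slices inspected separately.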
A good example of a signature generated for multiple samples is this one for Flashback:
Osx.Trojan.Flashback-16;Engine:51-255,Target:9;0&1&2&3&4&5&6&7&8;5531C089E58B550885D2740B;5589E583EC28895DF48B5D088975F88B750C897DFC8B7D1085DB0F94C285F60F94C031C908C20F858A000000;5589E58B450885C0740B;5589E583EC18C744240801000000C744240414000000C7042400000000E83F020000C74008000000;5589E557565383EC2C8B45100FB7550C8B5D188945E08B451489542404895C24088945E48B451C89;8D4208894424088B450C89142489442404E818FFFFFF0FBEC0;5589E557565383EC1C8B7D0885FF7432;8B450C894210B801000000;C744240801000000C74424040C000000C7042400000000E877010000893089C28B073B03751E
While that signature is just extracted x86, it alerts on the following 15 samples:
B5942F202930DAFF45C79BDC7871C088
548981EF3FCB813FCD3ED2EBAB8102D7
C067B84DC59C93C1363FD9FC56CD2918
B0199B369A3FCC71653ED8A9F7990AFC
4E855DD770680F80A30B9805262BBEE6
EF2DB2EEB040BDF1D0A9A18F2775149B
9272778BB6FBC00131FFCECE51388ACB
BE1B0DB89A4798E6C11E4EBFB6B479AE
CED7C97304BFFD932822565E99460213
B94BF524A537C02DDA4CD047F61E00C4
14DE914B0101C0E7A2C7CF521557E747
657E5A48CEC24F0C6F516CA55581550F
647AF7013D0DA77B6E74D3C692B1B6C3
84352BF4A2FA95FC51AD0781000AA864
93734AEBC1670C22A79F08D1A0FCBD8F
Overall, I'm very happy with these results. Since IDA Pro is used to extract everything, this work will translate well to the other binary types that IDA is capable of parsing, most importantly portable executables.
Tuesday, February 11, 2014
Kaspersky Lab released a report that covers in detail a piece of malware known as "Careto" or "The Mask". The report included several MD5 hashes of samples and related files, IP addresses and domain information. Typically with ClamAV, a hash signature targeting an entire file is formatted as follows:
MD5:FileSize:Name
The samples for Careto and therefore their sizes were unavailable to us at the time of this blog post, making it impossible to release hash-based coverage. However, as of ClamAV 0.98, a hash signature can be written with a wildcard for the file size. The format for such a signature is:
MD5:*:Name:73
The 73 on the end will prevent the signature from being loaded by an older ClamAV engine that doesn't support this signature format.
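Turning a list of published MD5 hashes into signature lines in either format is easy to script. The helper below is a small sketch, not a ClamAV tool; the name hash_signature is made up, and it only formats the two layouts described above.

```python
import string

def hash_signature(md5_hex, name, size=None):
    """Format an MD5 hash signature line.

    With a known size the result is MD5:FileSize:Name. Without one, the
    size becomes a wildcard and the trailing 73 is appended.
    """
    md5_hex = md5_hex.strip()
    if len(md5_hex) != 32 or any(c not in string.hexdigits for c in md5_hex):
        raise ValueError("not an MD5 hex digest: %r" % md5_hex)
    if size is None:
        return "{0}:*:{1}:73".format(md5_hex, name)
    return "{0}:{1}:{2}".format(md5_hex, int(size), name)
```

For example, hash_signature(digest, "Example.Name-1") yields a wildcard line when only the hash is known, and passing size= produces the classic three-field form.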
The Mask is a combination of tools that cover 32-bit and 64-bit Windows, Mac OS X and Linux. Kaspersky also identified potential Android and Apple iOS variants. Their analysis indicates it can intercept many different forms of communication from the victim machine, exfiltrate data and provide remote access to the attacker.
This signatures file can be used to detect the sample discussed in the article. Just download it and put it in the same folder where you have your ClamAV signatures. If any alerts are generated from these please let us know by emailing research < at > sourcefire (dot) com.
Thursday, February 6, 2014
This notice is for the members of the ClamAV mailing lists found here:
http://lists.clamav.net/mailman/listinfo/clamav-users
On Monday, February 10th, 2014 starting at 10am EST, the ClamAV Mailing lists will be moving to new server hardware. We anticipate this outage to last approximately four (4) hours. We will be notifying everyone when the new server is up and operational.
Thank you for your patience.
Joel Esler
Threat Intelligence Team Lead
Open Source Manager
Vulnerability Research Team
Tuesday, January 21, 2014
Sourceforge has fired up their monthly "Project of the Month" process again, and they were kind enough to choose ClamAV for this month's vote!
You can read more about the process on their blog post here: https://sourceforge.net/blog/revival-of-weekly-featured-projects-and-project-of-the-month-voting/
And you can cast your vote here: https://sourceforge.net/p/potm/discussion/vote/thread/7d522915/
Thanks to everyone who supports the ClamAV project, get out and vote!
(Note: You must be a member of Sourceforge, and must be logged in, to vote.)
Labels: clamav, sourceforge
Tuesday, January 14, 2014
ClamAV 0.98.1 provides improved support for the Mac OS X platform, support for new file types, and
quality improvements. These include:
- Extraction, decompression, and scanning of files within Apple Disk Image (DMG) format.
- Extraction, decompression, and scanning of files within Extensible Archive (XAR) format.
XAR format is commonly used for software packaging, such as PKG and RPM, as well as
general archival.
- Decompression and scanning of files in "Xz" compression format.
- Improvements and fixes to extraction and scanning of OLE formats.
- Option to force all scanned data to disk. This impacts only a few file types where
some embedded content is normally scanned in memory. Enabling this option
ensures that a file descriptor exists when callback functions are used, at a small
performance cost. This should only be needed when callback functions are used
that need file access.
- Various improvements to ClamAV configuration, support of third party libraries,
and unit tests.
There are also fixes for other minor issues and code quality changes. Please
see the ChangeLog file for details.
--
The ClamAV team (http://www.clamav.net/team)
Tuesday, October 8, 2013
In July we told you about Sourcefire’s agreement to be acquired by Cisco, and today that acquisition has closed – we are now one company. This means that we are also now one community, and Cisco has reiterated its commitment to maintaining our innovation and support of Snort, ClamAV and other open source projects, as well as its own projects. As Marty Roesch wrote on our corporate blog:
"I can tell you with certainty that this is a great match for Sourcefire, for Cisco and, ultimately, for our customers, partners and open source communities… Beyond the technology, one of the things that is important to me is that Cisco and Sourcefire both share key values that transcend our company names, HQ locations and number of employees."
I’m also happy to report that there will be no changes to how our communities are run or our communications, including mailing lists, snort.org, clamav.net or social media sites. Please visit the corporate blog for more details and, as always, reach out to me with questions. I will still be your community manager and I look forward to many more years of being a part of this community.
Labels: cisco, Sourcefire
Friday, October 4, 2013
Not everyone who reads this blog reads all of the other Sourcefire/Vulnerability Research Team (VRT) blogs, so I thought I'd add a quick note here about one of the Malware Research Team's articles over on our VRT Blog.
The article is entitled "Android Basic Block Signatures", and it goes over some good syntax for the ClamAV signature language.
Please check it out here: http://blog.talosintel.com/2013/10/android-basic-block-signatures.html
Labels: clamav