Thursday, October 16 2014

Mitigating Poodle SSLv3 vulnerability on a Go server

If you run a Go server that supports SSL/TLS, you should update your configuration to disable SSLv3 today. The sample code below sets the minimal accepted version to TLSv1, and reorganizes the default ciphersuite to match Mozilla's Server Side TLS guidelines.

Thank you to @jrconlin for the code cleanup!

package main

import (
    "crypto/rand"
    "crypto/tls"
    "fmt"
)

func main() {
    certificate, err := tls.LoadX509KeyPair("server.pem", "server.key")
    if err != nil {
        panic(err)
    }
    config := tls.Config{
        Certificates:             []tls.Certificate{certificate},
        MinVersion:               tls.VersionTLS10,
        PreferServerCipherSuites: true,
        CipherSuites: []uint16{
            tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
            tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
            tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
            tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
            tls.TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,
            tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,
            tls.TLS_RSA_WITH_AES_128_CBC_SHA,
            tls.TLS_RSA_WITH_AES_256_CBC_SHA,
            tls.TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,
            tls.TLS_RSA_WITH_3DES_EDE_CBC_SHA},
    }
    config.Rand = rand.Reader

    netlistener, err := tls.Listen("tcp", "127.0.0.1:50443", &config)
    if err != nil {
        panic(err)
    }
    newnetlistener := tls.NewListener(netlistener, &config)
    fmt.Println("I am listening...")
    for {
        newconn, err := newnetlistener.Accept()
        if err != nil {
            fmt.Println(err)
        }
        fmt.Printf("Got a new connection from %s. Say Hi!\n", newconn.RemoteAddr())
        newconn.Write([]byte("ohai"))
        newconn.Close()
    }
}

Run the server above with $ go run tls_server.go and test the output with cipherscan:

$ ./cipherscan 127.0.0.1:50443
........
Target: 127.0.0.1:50443

prio  ciphersuite                  protocols              pfs_keysize
1     ECDHE-RSA-AES128-GCM-SHA256  TLSv1.2                ECDH,P-256,256bits
2     ECDHE-RSA-AES128-SHA         TLSv1,TLSv1.1,TLSv1.2  ECDH,P-256,256bits
3     ECDHE-RSA-AES256-SHA         TLSv1,TLSv1.1,TLSv1.2  ECDH,P-256,256bits
4     AES128-SHA                   TLSv1,TLSv1.1,TLSv1.2
5     AES256-SHA                   TLSv1,TLSv1.1,TLSv1.2
6     ECDHE-RSA-DES-CBC3-SHA       TLSv1,TLSv1.1,TLSv1.2  ECDH,P-256,256bits
7     DES-CBC3-SHA                 TLSv1,TLSv1.1,TLSv1.2

Certificate: UNTRUSTED, 2048 bit, sha1WithRSAEncryption signature
TLS ticket lifetime hint: None
OCSP stapling: not supported
Server side cipher ordering

Thursday, October 9 2014

Automated configuration analysis for Mozilla's TLS guidelines

Last week, we updated Mozilla's Server Side TLS guidelines to add a third recommended configurations. Each configuration maps to a target compatibility level:

  1. Old supports Windows XP pre-SP2 with IE6 and IE7. Those clients do not support AES ciphers, and for them we need to maintain a configuration that accepts 3DES, SSLv3 and SHA-1 certificates.
  2. Intermediate is the new default, and supports clients from Firefox 1 until now. Unlike the old configuration, SSLv3, 3DES and SHA-1 are disabled. We also recommend using a Diffie-Hellman parameter of 2048 bits when PFS DHE ciphers are in use (note that java 6 fails with a DH param > 1024 bits, use the old configuration if you need java 6 compatibility).
  3. Modern is what we would really love to enable everywhere, but is not yet supported by enough clients. This configuration only accepts PFS ciphers and TLSv1.1+. Unfortunately, clients older than Firefox 27 will fail to negotiate this configuration, so we reserve it for services that do not need backward compatibility before FF27 (webrtc, sync1.5, ...).

Three recommended configurations means more choice, but also more work to evaluate a given endpoint.To help with the analysis of real-world TLS setups, we rely on cipherscan, a wrapper to openssl s_client that quickly pulls TLS configuration from a target. I wrote the initial version of cipherscan last year, and I'm very happy to see it grow with major contributions from Hubert Kario (Red Hat) and a handful of other people.

Today I'm releasing an extension to cipherscan that evaluates a scan result against our guidelines. By running it against a target, it will tell you what the current configuration level is, and what should be changed to reach the next level.

$ ./analyze.py -t jve.linuxwall.info
jve.linuxwall.info:443 has intermediate tls

Changes needed to match the old level:
* consider enabling SSLv3
* add cipher DES-CBC3-SHA
* use a certificate with sha1WithRSAEncryption signature
* consider enabling OCSP Stapling

Changes needed to match the intermediate level:
* consider enabling OCSP Stapling

Changes needed to match the modern level:
* remove cipher AES128-GCM-SHA256
* remove cipher AES256-GCM-SHA384
* remove cipher AES128-SHA256
* remove cipher AES128-SHA
* remove cipher AES256-SHA256
* remove cipher AES256-SHA
* disable TLSv1
* consider enabling OCSP Stapling

The analysis above evaluates my blog. I'm aiming for intermediate level here, and it appears that I reach it. I could improve further by enabling OCSP Stapling, but that's not a hard requirement.

If I wanted to reach modern compatibility, I would need to remove a few ciphers that are not PFS, disable TLSv1 and, again, enable OCSP Stapling. I would probably want to update my ciphersuite to the one proposed on Server Side TLS #Modern compatibility.

Looking at another site, twitter.com, the script return "bad ssl". This is because twitter still accepts RC4 ciphers, and in the opinion of analyze.py, this is a bad thing to do. We really don't trust RC4 anymore.

$ ./analyze.py -t twitter.com
twitter.com:443 has bad ssl

Changes needed to match the old level:
* remove cipher ECDHE-RSA-RC4-SHA
* remove cipher RC4-SHA
* remove cipher RC4-MD5
* use a certificate with sha1WithRSAEncryption signature
* consider enabling OCSP Stapling

Changes needed to match the intermediate level:
* remove cipher ECDHE-RSA-RC4-SHA
* remove cipher RC4-SHA
* remove cipher RC4-MD5
* remove cipher DES-CBC3-SHA
* disable SSLv3
* consider enabling OCSP Stapling

Changes needed to match the modern level:
* remove cipher ECDHE-RSA-RC4-SHA
* remove cipher AES128-GCM-SHA256
* remove cipher AES128-SHA
* remove cipher RC4-SHA
* remove cipher RC4-MD5
* remove cipher AES256-SHA
* remove cipher DES-CBC3-SHA
* disable TLSv1
* disable SSLv3
* consider enabling OCSP Stapling

The goal of analyze.py is to help operators define a security level for their site, and use this script to verify their configuration. If you want to check compatibility with a target level, you can use the -l flag to specify the level you want:

$ ./analyze.py -t stooge.mozillalabs.com -l modern
stooge.mozillalabs.com:443 has modern tls

Changes needed to match the modern level:
* consider enabling OCSP Stapling

Our guidelines are opinionated, and you could very well disagree with some of the recommendations. The discussion is open on the Talk section of the wiki page, I'm always happy to discuss them, and make them helpful to as many people as possible.

You can get cipherscan and analyze.py from the github repository at https://github.com/jvehent/cipherscan.

Thursday, September 25 2014

Shellshock IOC search using MIG

Shellshock is being exploited. People are analyzing malwares dropped on systems using the bash vulnerability.

I wrote a MIG Action to check for these IOCs. I will keep updating it with more indicators as we discover them. To run this on your Linux 32 or 64 bits system, download the following archive: mig-shellshock.tar.xz 

Download the archive and run mig-agent as follow:

$ wget https://jve.linuxwall.info/ressources/taf/mig-shellshock.tar.xz
$ sha256sum mig-shellshock.tar.xz 
0b527b86ae4803736c6892deb4b3477c7d6b66c27837b5532fb968705d968822  mig-shellshock.tar.xz
$ tar -xJvf mig-shellshock.tar.xz 
shellshock_iocs.json
mig-agent-linux64
mig-agent-linux32
$ ./mig-agent-linux64 -i shellshock_iocs.json
This will output results in JSON format. If you grep for "foundanything" and both values return "false", it means no IOC was found on your system. If you get "true", look at the detailed results in the JSON output to find out what exactly was found.
$ ./mig-agent-linux64 -i shellshock_iocs.json|grep foundanything
    "foundanything": false,
    "foundanything": false,
The full action for MIG is below. I will keep updating it over time, I recommend you use the one below instead of the one in the archive.
{
"name": "Shellshock IOCs (nginx and more)",
"target": "os='linux' and heartbeattime \u003e NOW() - interval '5 minutes'",
"threat": {
"family": "malware",
"level": "high"
},
"operations": [
{
"module": "filechecker",
"parameters": {
"searches": {
"iocs": {
"paths": [
"/usr/bin",
"/usr/sbin",
"/bin",
"/sbin",
"/tmp"
],
"sha256": [
"73b0d95541c84965fa42c3e257bb349957b3be626dec9d55efcc6ebcba6fa489",
"ae3b4f296957ee0a208003569647f04e585775be1f3992921af996b320cf520b",
"2d3e0be24ef668b85ed48e81ebb50dce50612fb8dce96879f80306701bc41614",
"2ff32fcfee5088b14ce6c96ccb47315d7172135b999767296682c368e3d5ccac",
"1f5f14853819800e740d43c4919cc0cbb889d182cc213b0954251ee714a70e4b"
],
"regexes": [
"/bin/busybox;echo -e '\\\\147\\\\141\\\\171\\\\146\\\\147\\\\164'"
]
}
}
}
},
{
"module": "netstat",
"parameters": {
"connectedip": [
"108.162.197.26",
"162.253.66.76",
"89.238.150.154",
"198.46.135.194",
"166.78.61.142",
"23.235.43.31",
"54.228.25.245",
"23.235.43.21",
"23.235.43.27",
"198.58.106.99",
"23.235.43.25",
"23.235.43.23",
"23.235.43.29",
"108.174.50.137",
"201.67.234.45",
"128.199.216.68",
"75.127.84.182",
"82.118.242.223",
"24.251.197.244",
"166.78.61.142"
]
}
}
],
"description": {
"author": "Julien Vehent",
"email": "ulfr@mozilla.com",
"revision": 201409252305
},
"syntaxversion": 2
}

Monday, September 22 2014

Batch inserts in Go using channels, select and japanese gardening

I was recently looking into the DB layer of MIG, searching for ways to reduce the processing time of an action running on several thousands agents. One area of the code that was blatantly inefficient concerned database insertions.

When MIG's scheduler receives a new action, it pulls a list of eligible agents, and creates one command per agent. One action running on 2,000 agents will create 2,000 commands in the ''commands'' table of the database. The scheduler needs to generate each command, insert them into postgres and send them to their respective agents.

MIG uses separate goroutines for the processing of incoming actions, and the storage and sending of commands. The challenge was to avoid individually inserting each command in the database, but instead group all inserts together into one big operation.

Go provides a very elegant way to solve this very problem.

At a high level, MIG Scheduler works like this:

  1. a new action file in json format is loaded from the local spool and passed into the NewAction channel
  2. a goroutine picks up the action from the NewAction channel, validates it, finds a list of target agents and create a command for each agent, which is passed to a CommandReady channel
  3. a goroutine listens on the CommandReady channel, picks up incoming commands, inserts them into the database and sends them to the agents (plus a few extra things)
The CommandReady goroutine is where optimization happens. Instead of processing each command as they come, the goroutine uses a select() statement to either pick up a command, or timeout after one second of inactivity.
// Goroutine that loads and sends commands dropped in ready state
// it uses a select and a timeout to load a batch of commands instead of
// sending them one by one
go func() {
    ctx.OpID = mig.GenID()
    readyCmd := make(map[float64]mig.Command)
    ctr := 0
    for {
        select {
        case cmd := <-ctx.Channels.CommandReady:
            ctr++
            readyCmd[cmd.ID] = cmd
        case <-time.After(1 * time.Second):
            if ctr > 0 {
                var cmds []mig.Command
                for id, cmd := range readyCmd {
                    cmds = append(cmds, cmd)
                    delete(readyCmd, id)
                }
                err := sendCommands(cmds, ctx)
                if err != nil {
                    ctx.Channels.Log <- mig.Log{OpID: ctx.OpID, Desc: fmt.Sprintf("%v", err)}.Err()
                }
            }
            // reinit
            ctx.OpID = mig.GenID()
            ctr = 0
        }
    }
}()

As long as messages are incoming, the select() statement will elect the case when a message is received, and store the command into the readyCmd map.

When messages stop coming for one second, the select statement will fall into its second case: time.After(1 * time.Second).

In the second case, the readyCmd map is emptied and all commands are sent as one operation. Later in the code, a big INSERT statement that include all commands is executed against the postgres database.

In essence, this algorithm is very similar to a Japanese Shishi-odoshi.

shishi-odoshi.gif

The current logic is not yet optimal. It does not currently set a maximum batch size, mostly because it does not currently need to. In my production environment, the scheduler manages  about 1,500 agents, and that's not enough to worry about limiting the batch size.

Wednesday, August 27 2014

Postgres multicolumn indexes to save the day

I love relational databases. Well designed, they are the most elegant and efficient way to store data. Which is why MIG uses Postgresql, hosted by Amazon RDS.

It's the first time I use RDS for anything more than a small website. I discover its capabilities along the way.  Over the past few days, I've been investigating performance issues. The database was close to 100% CPU, and the number of DB connections maintained by the Go database package was varying a lot. Something was off.

I have worked as a junior Oracle & Postgres DBA in the past. In my limited experience, database performances are almost always due to bad queries, or bad schemas. When you wrote the queries, however, this is what you blame last, after spending hours looking for a bug in any other components outside of your control.

Eventually, I re-read my queries, and found one that looked bad enough:

// AgentByQueueAndPID returns a single agent that is located at a given queueloc and has a given PID
func (db *DB) AgentByQueueAndPID(queueloc string, pid int) (agent mig.Agent, err error) {
	err = db.c.QueryRow(`SELECT id, name, queueloc, os, version, pid, starttime, heartbeattime,
		status FROM agents WHERE queueloc=$1 AND pid=$2`, queueloc, pid).Scan(
		&agent.ID, &agent.Name, &agent.QueueLoc, &agent.OS, &agent.Version, &agent.PID,
		&agent.StartTime, &agent.HeartBeatTS, &agent.Status)
	if err != nil {
		err = fmt.Errorf("Error while retrieving agent: '%v'", err)
		return
	}
	if err == sql.ErrNoRows {
		return
	}
	return
}
The query locates an agent using its queueloc and pid values. Which is necessary to properly identify an agent, except that neither queueloc nor pid have indexes, resulting in a sequential scan of the table:
mig=> explain SELECT * FROM agents WHERE queueloc='xyz' AND pid=1234;
QUERY PLAN                                                   
--------------------------------------------------------------
 Seq Scan on agents  (cost=0.00..3796.20 rows=1 width=161)
   Filter: (((queueloc)::text = 'xyz'::text) AND (pid = 1234))
(2 rows)

This query is called ~50 times per second, and even with only 45,000 rows in the agents table, that is enough to burn all the CPU cycles on my RDS instance.

Postgres supports multicolumn indexes. The fix is simple enough: create an index on the columns queueloc and pid together.

mig=> create index agents_queueloc_pid_idx on agents(queueloc, pid);
CREATE INDEX

Which results in an immediate, drastic, reduction of the cost of the query, and CPU usage of the instance.


mig=> explain SELECT * FROM agents WHERE queueloc='xyz' AND pid=1234;
QUERY PLAN                                                     
---------------------------------------------------------------------------------------
 Index Scan using agents_queueloc_pid_idx on agents  (cost=0.41..8.43 rows=1 width=161)
   Index Cond: (((queueloc)::text = 'xyz'::text) AND (pid = 1234))
(2 rows)

migdbcpuusage.png

Immediate performance gain for a limited effort. Gotta love Postgres !

- page 2 of 32 -