testing logstash configs with rspec

At work I’m supporting a rails app, developed by an external company. That app logs a lot of useful performance information, so I’m using logstash to grab that data and send it to statsd+graphite.

I’ve been nagging the developers for more debugging information in the log file, and they’ve now added “enhanced logging”.

Since the log format is changing, I’m taking the opportunity to clean up our logstash configuration. The result is an automated testing framework for the logstash configs.

Managing config files

Logstash allows you to point to a directory containing many config files, so I’ve used this feature to split up the config into smaller parts. Each input has its own config file, a filter config for that input, and a related statsd output config. I also have an output config for elasticsearch.

Because I use grep to tag log messages and then run a grok based on those tags, all filters for an input have to live in a single file. Otherwise the filter ordering can get messed up, because you can’t guarantee the order in which logstash reads the files.

If you want to break out your filters in multiple files but need the filters to be loaded in a certain order, then prefix their names with numbers to explicitly specify the order.

100-filter-one.conf
101-filter-two.conf
...
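
A small Ruby sketch of why the prefixes matter, assuming the files are picked up in plain lexical name order (the same kind of sort the spec further down applies with files.sort). The third file name is made up for the example. It shows that prefixes of different widths do not sort numerically, so it is worth keeping them the same length:

```ruby
# Hypothetical file names; only the sorting behaviour is the point here.
names = ["101-filter-two.conf", "100-filter-one.conf", "99-filter-zero.conf"]

# A plain string sort compares character by character, so "99-..." ends up
# AFTER "100-..." and "101-...".
lexical = names.sort

# Zero-padding the numeric prefix to a fixed width restores the intended order.
padded = names.map { |n| n.sub(/\A(\d+)/) { $1.rjust(3, "0") } }.sort

puts lexical.inspect
puts padded.inspect
```

Starting the numbering at 100, as in the listing above, sidesteps the problem for up to 900 files.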

Filter config

The application logfile has several different kinds of messages that I want to extract data from. There are rails controller logs, CRUD requests generated by javascript, SQL requests, passenger logs and memcached logs.

So when I define the input, I give those log messages the type ‘rails’.

The first filter that gets applied is a grok filter which processes the common fields such as timestamp, server, log priority etc. If this grok is matched, the log message is tagged with ‘rails’.

Messages tagged ‘rails’ are subject to several grep filters that differentiate between types of log message. For example, a message could be tagged as ‘rails_controller’, ‘sql’, or ‘memcached’.

Then, each message type tag has a grok filter that extracts all the relevant data out of the log entry.

One of the key things I’m pulling out of the log is the response time, so there are some additional greps and tags for responses that take longer than we consider acceptable.
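
The flow described above (common grok, grep tagging, per-type grok, slow-response tagging) can be illustrated in plain Ruby. This is not logstash code and not the real configuration: the regexes, the ‘slow’ tag name and the 100 ms threshold are assumptions made up for the sketch.

```ruby
# Plain-Ruby illustration of the tagging flow; this is NOT logstash code.
# Stage 1: fields shared by every line (timestamp, priority, server, program).
COMMON = /\A(?<timestamp>\S+) (?<logpriority>[A-Z]+) (?<server>\S+) (?<program>\w+)\[(?<pid>\d+)\]: (?<rest>.*)\z/

# Stage 2: grep-style rules that add a tag when the pattern matches.
# The sql and memcached patterns are hypothetical.
TYPE_RULES = {
  "rails_controller" => /\A\w+Controller\.\w+: /,
  "sql"              => /\ASQL /,
  "memcached"        => /\Amemcached /
}

# Stage 3: one extraction pattern per message-type tag.
EXTRACTORS = {
  "rails_controller" => /\A(?<controller>\w+)\.(?<action>\w+): (?<time>[\d.]+)ms/
}

SLOW_MS = 100.0 # hypothetical "too slow" threshold

def process(line)
  event = { "tags" => [], "fields" => {} }
  common = COMMON.match(line)
  unless common
    # Nothing matched: mark the event the way logstash's grok filter would.
    event["tags"] << "_grokparsefailure"
    return event
  end
  event["tags"] << "rails"
  common.names.each { |name| event["fields"][name] = common[name] }

  rest = common[:rest]
  TYPE_RULES.each { |tag, re| event["tags"] << tag if rest =~ re }

  # Only run an extractor if the matching type tag was applied earlier.
  EXTRACTORS.each do |tag, re|
    next unless event["tags"].include?(tag)
    m = re.match(rest)
    m.names.each { |name| event["fields"][name] = m[name] } if m
  end

  time = event["fields"]["time"]
  event["tags"] << "slow" if time && time.to_f > SLOW_MS
  event
end

line = "2013-01-20T13:14:01+0000 INFO server rails[12345]: RailsController.index: 123.1234ms db=railsdb"
event = process(line)
puts event["tags"].inspect
puts event["fields"]["controller"]
```

Because each stage depends on tags set by the previous one, a change to an early pattern can silently stop later stages from running, which is exactly the kind of breakage a test suite should catch.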

When constructing the grep filters, I debug the regexes with http://www.rubular.com/, and for the grok filters http://grokdebug.herokuapp.com/ is a massively useful tool.

However, each of these web tools only looks at a single log message or regex. I want to test my whole filter configuration: how entries are directed through the filter logic, and whether a change breaks a dependency in another part of the configuration.

rspec tests

Since logstash 1.1.5 it’s been possible to run rspec tests using the monolithic jar:

java -jar logstash-monolithic.jar rspec <filelist>

So, given I have a log message that looks like this:

2013-01-20T13:14:01+0000 INFO server rails[12345]: RailsController.index: 123.1234ms db=railsdb request_id=fe3c217 method=GET status=200 uri=/page/123 user=johan

Then I would write a spec test that looks like:

File: spec/logstash.rb
# Assumption: logstash's bundled spec helper, which defines LogStash::RSpec.
require "test_utils"

# Read every filter config, in sorted filename order, into one string.
files = Dir['../configs/filter*.conf']
@@configuration = String.new
files.sort.each do |file|
  @@configuration << File.read(file)
end

describe "my first logstash rspec test" do
  extend LogStash::RSpec

  # Load the combined filter configuration for the tests below.
  config(@@configuration)

  message = %(2013-01-20T13:14:01+0000 INFO server rails[12345]: RailsController.index: 123.1234ms db=railsdb request_id=fe3c217 method=GET status=200 uri=/page/123 user=johan)

  # Push one event through the filters and check the resulting tags and fields.
  sample("@message" => message, "@type" => "rails") do
    insist { subject.type } == "rails"
    insist { subject.tags }.include?("user")
    reject { subject.tags }.include?("_grokparsefailure")
    insist { subject["TIMESTAMP_ISO8601"] } == "2013-01-20T13:14:01+0000"
    insist { subject["logpriority"] } == "INFO"
    insist { subject["logsource"] } == "rails"
    insist { subject["railscontroller"] } == "RailsController"
    insist { subject["railscontrolleraction"] } == "index"
    insist { subject["time"] } == "123.1234"
    insist { subject["database"] } == "railsdb"
    insist { subject["request_id"] } == "fe3c217"
    insist { subject["method"] } == "GET"
    insist { subject["status"] } == "200"
    insist { subject["uri"] } == "/page/123"
    insist { subject["user"] } == "johan"
  end
end

So, this dynamically pulls in all the filter configurations from my logstash configuration directory. Then I define a known log message and what I expect the outputs to be: the tags that should and shouldn’t be there, and the content of the fields pulled out of the log message.

Develop - Verify workflow

Before writing any filter config, I take sample log messages and write up rspec tests of what I expect to pull out of those log entries. When I run those tests the first time, they fail.

Then I’ll use the grokdebug website to construct my grok statements. Once they’re working, I’ll update the logstash filter config files with the new grok statements, and run the rspec test suite.

If the tests are failing, I’ll often output subject.inspect within the sample block to show how logstash has processed the log event. I remove these debug messages once the tests are passing, so the output stays clean for automated testing.

When all the tests are passing I’ll deploy the configs to the logstash server and restart logstash.

java -jar /usr/share/logstash/logstash-monolithic.jar rspec examples.rb
..........................

Finished in 0.23 seconds
26 examples, 0 failures

Automating with Jenkins

Now we have a base config in place, I want to automate testing and deploying new configurations. To do this I use my good friend Jenkins.

All my spec tests and configs are stored in a single git repository. Whenever I push my repo to the git server, a post-receive hook is executed that starts a Jenkins job.

This job will fetch the repository and run the logstash rspec tests on a build server. If these pass, then the configs are copied to the logstash server and the logstash service is restarted.

If the tests fail, then a human needs to look at the errors and fix the problem.
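
The gate described above can be sketched in Ruby. It only illustrates the logic and is not the actual Jenkins job: the rsync and ssh commands, the host name and the target path are hypothetical placeholders, and only the rspec invocation comes from this post.

```ruby
# Sketch of a "test, then deploy" gate that a CI job could run.
TEST_CMD = %w[java -jar /usr/share/logstash/logstash-monolithic.jar rspec examples.rb]

# Hypothetical deploy steps: copy the configs, then restart the service.
DEPLOY_CMDS = [
  %w[rsync -a configs/ logstash-server:/etc/logstash/conf.d/],
  %w[ssh logstash-server sudo service logstash restart]
]

# Runs the tests and, only if they pass, each deploy step in order.
# `runner` takes a command array and returns a truthy value on success.
# It defaults to Kernel#system, but can be swapped out for a dry run.
def test_and_deploy(test_cmd, deploy_cmds, runner = ->(cmd) { system(*cmd) })
  return :tests_failed unless runner.call(test_cmd)
  deploy_cmds.each do |cmd|
    return :deploy_failed unless runner.call(cmd)
  end
  :deployed
end

# Dry run: record the commands instead of executing them.
ran = []
dry_run = lambda do |cmd|
  ran << cmd.join(" ")
  true
end

result = test_and_deploy(TEST_CMD, DEPLOY_CMDS, dry_run)
puts result
puts ran
```

Passing the runner in as an argument keeps the gate itself testable without a logstash jar or a reachable server.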

Integrating with Configuration Management

You’ll notice my logstash configs are stored in a git repo as files, rather than being generated by configuration management. That’s a choice I made in this situation as it was easier to manage within our environment.

If you manage your logstash configs via CM, then a possible approach would be to apply your CM definition to a test VM and then run your rspec tests within that VM.

Alternatively, the whole logstash conf.d directory could be synced by your CM tool. Then you could grab just that directory for testing, rather than having to do a full CM run.

Catching problems you haven’t written tests for

I send the number of messages tagged _grokparsefailure to statsd. This highlights any log message formats that I haven’t considered, and it shows me when the log format has changed and I need to update my grok filters.
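
In my setup logstash does the counting and sending itself. The plain-Ruby sketch below only illustrates what such a counter amounts to on the wire. The metric name, the sample events, and the host and port defaults are assumptions for the example.

```ruby
require "socket"

# Builds a statsd counter datagram in the form "<bucket>:<count>|c".
def statsd_counter(bucket, count = 1)
  "#{bucket}:#{count}|c"
end

# Counts the events that carry the _grokparsefailure tag.
def grok_failures(events)
  events.count { |e| e.fetch("tags", []).include?("_grokparsefailure") }
end

# Sends one UDP datagram to statsd. Host and port are assumptions
# (8125 is the usual statsd default). Defined here but not called below.
def send_to_statsd(payload, host = "localhost", port = 8125)
  socket = UDPSocket.new
  socket.send(payload, 0, host, port)
ensure
  socket.close if socket
end

# Hypothetical, already-processed events.
events = [
  { "tags" => ["rails", "rails_controller"] },
  { "tags" => ["_grokparsefailure"] },
  { "tags" => ["rails", "sql"] }
]

# Hypothetical metric name.
payload = statsd_counter("logstash.rails.grokparsefailure", grok_failures(events))
puts payload
```

Graphing this counter makes a sudden jump in parse failures easy to spot after an application release.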