Power of the graylog2 web interface

I was blown away by the power of the graylog2 web interface after the backend switched to elasticsearch. With custom fields in the GELF message holding valuable data and elasticsearch providing the right kind of search capabilities, it is up to the users' imagination what can be achieved.

Let me take you through an overview of how we can use the graylog2 web interface. The rest of this post assumes that you have read the previous post on logstash and graylog2.

Armed with custom fields, let's take a common scenario: we want to quickly find broken URLs. Log in to the graylog2 web interface. Once logged in, the messages tab is highlighted by default and shows the quick filter.

Now click on the 'Quickfilter' option and select 'Add additional field' from the expanded options.

Enter '_response' in the first text box and the value '404' in the next one, then hit 'Run filter'. There you go: the list of all broken URLs. You can select a message from the results, and the details of that message are shown on the right side.

Each field value is clickable, which means you can quickly filter all the messages with that value. For example, you find that an old URL is being hit and has returned 404. Just click the value in the request field and you will be taken to a results page listing all messages with the same request URL.

Say you want to know the referrers for all the requests to this URL. We can use the analytics shell to figure this out:

all.distribution({_httpreferrer}, _request = "/old_url")

We would typically go to an analytics tool like Google Analytics to figure out what browsers were used to reach the site and the number of requests from each of them. This can be mined from the logs themselves using the analytics shell with a simple query:

all.distribution({_httpuseragent})

Similarly, to find the incoming referral URLs and the request count per referral URL, we can use something like:

all.distribution({_httpreferrer})

With the graylog2 custom fields we defined in the previous post, we can find the rendering time of a rails view template, the time taken by a rails action, and so on. We could easily enhance this with fields for queries and their execution time. With that in place, the logs provide vital clues about application performance and user behavior.
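For example, assuming the default rails SQL log line format (something like: User Load (0.3ms)  SELECT ...), a grok filter along these lines could capture the execution time and the query text. This is only a sketch; the pattern and the field names querytime and query would need to be adapted to your actual log format:

 grok {
   type => app
   pattern => "\(%{NUMBER:querytime}ms\)\s+%{GREEDYDATA:query}"
 }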

Just to give an idea of the power that is available, let me recount a real incident. We had an interesting user who entered secret information in the designated field, but reported that the private info was compromised and shown in another, public field. We were clueless about how on earth this could occur. Finally, log mining came to our rescue. We pulled up all the requests made by that user that day, took the unique rails request id of each of those requests and fetched all their log lines. We then matched the httpreferrer of one request against the request field of the others, which gave us the exact user flow through the application. Once we had that, we replicated the exact scenario and found that when a validation error occurred, the focus was not being set properly. The debugging became a very simple exercise because we could map the exact user journey through the application; otherwise it would have been a daunting task to figure out what was happening.
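For instance, a query like the following (the IP value is only illustrative) gives the distribution of request URLs hit from that user's client IP:

all.distribution({_request}, _clientip = "203.0.113.7")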

From here on, it is just a matter of imagination: how much value you want to derive from your logs and how you want to achieve it. The tools are available right at your doorstep.

After using the tool extensively, I have only one request to the graylog2 team: consider providing a plugin architecture so the web interface can be extended. I am also planning to work on the graylog2 web UI to add some more power and customization capabilities, when time permits.

In case of questions, please do reach out through twitter or in the comments.


logstash + graylog2 – can't ask for more in logging

Everyone knows how important logging is for a web application. It is a shelter we build for a rainy day. But the logs of any application also hold an enormous amount of valuable information that can give interesting perspectives on the application. As applications grow in complexity and scale, logs become spread across multiple machines and files, and soon developers and support people are overwhelmed by the sheer number of log files to search through when tracking issues. To overcome this situation, and to have clear, simplified oversight of the system, a central logging system becomes necessary.

I was part of a team responsible for building a highly distributed and complex system. Just to give an idea of the kind of complexity involved, here is a brief summary of the tech components: the project was developed in three languages (one dynamic and two static) and uses a SQL and a NoSQL database, a queuing engine, and so on. To consolidate the logs and get a central view, we evaluated Splunk. Unfortunately it was extremely costly, so we were forced to look at open source alternatives, and we finally selected logstash and graylog2.

Looking back, that turned out to be a great decision. I will share some insights into how logstash and graylog2 can be configured.

First we need to download the logstash jar to each of the machines where log files are generated.
Then we configure the input files for logstash to process. It's pretty simple: we provide the paths to the files and also group them under different types.

input {
  file {
    type => nginx_web
    path => ["/opt/nginx/logs/access.log", "/opt/nginx/logs/error.log"]
  }
}

Graylog2 uses a log format called the Graylog Extended Log Format (GELF). Using this format, we can send custom fields along with the log. The format also does not have a log size limitation. We can preprocess the logs and define custom fields based on the line logged, before they are sent to the graylog2 server. Below is an example of an nginx log being parsed using grok patterns (regular expressions), with fields populated from the matched values. Grok patterns are extremely powerful and deserve a separate, detailed post… but that's for later.

filter {
 grok {
   type => nginx_web
   pattern => "%{IP:clientip} (?:%{HOST:clienthost}|-) (?:%{USER:clientuser}|-) \[%{HTTPDATE:time}\] \"(?:%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}|%{DATA:unparsedrq})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QUOTEDSTRING:httpreferrer} %{QUOTEDSTRING:httpuseragent}"
 }
}

The corresponding nginx log format is:

log_format  main  '$remote_addr $host $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $ssl_cipher $request_time';

The above grok filter will parse each log entry and fill in values for the following fields: clientip, clienthost, clientuser, time, verb, request, httpversion, unparsedrq, response, bytes, httpreferrer, and httpuseragent.

The advantage of splitting the input log entry into multiple fields is that custom rules for searching and grouping can be applied to each of these fields in the graylog2 web interface.
A sample grok pattern for a rails 3 project is:

 grok {
   type => app
   pattern => "\[%{WORD:requesttoken}\] \[%{IP:clientip}\] (?:%{DATA:logstmt}|-)"
 }
 grok {
   type => app
   pattern => "Rendered %{DATA:renderedtemplate} \(%{DATA:rendertime}\)"
 }

The above grok filters parse the rails 3 app log and define the following fields: requesttoken, clientip, logstmt, renderedtemplate, and rendertime.

Next we need to configure the output. This is where we specify the output format as GELF and the graylog2 server to contact.
We can also specify custom fields; here we are sending a custom field named environment with the value 'uat'.
Another predefined field in the GELF format is 'facility', and here we are setting it to the value of the 'type' field.

output {
  gelf {
    host => "graylog-server-ip"
    facility => "%{@type}"
    custom_fields => ["environment", "uat"]
  }
}

A complete sample configuration for the logstash agent is:

input {
  file {
    type => pg91
    path => ["/var/lib/pgsql/9.1/pgstartup.log", "/var/lib/pgsql/9.1/data/pg_log/*.log"]
  }
  file {
    type => app
    path => ["/app/employeemgmt/shared/log/uat.log"]
  }
  file {
    type => nginx_web
    path => ["/opt/nginx/logs/access.log", "/opt/nginx/logs/error.log"]
  }
}

filter {
 grok {
   type => nginx_web
   pattern => "%{IP:clientip} (?:%{HOST:clienthost}|-) (?:%{USER:clientuser}|-) \[%{HTTPDATE:time}\] \"(?:%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}|%{DATA:unparsedrq})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QUOTEDSTRING:httpreferrer} %{QUOTEDSTRING:httpuseragent}"
 }
 grok {
   type => app
   pattern => "\[%{WORD:requesttoken}\] \[%{IP:clientip}\] (?:%{DATA:logstmt}|-)"
 }
 grok {
   type => app
   pattern => "Rendered %{DATA:renderedtemplate} \(%{DATA:rendertime}\)"
 }
}

output {
  gelf {
    host => "graylog-server-ip"
    facility => "%{@type}"
    custom_fields => ["environment", "uat"]
  }
}

All of this configuration goes into a single conf file.
Once the conf file is ready, we need to install the graylog2 server. Installation is pretty simple: again, download the server jar file and run it. The graylog2 wiki will help you with that.
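As a rough example (the exact jar name depends on the graylog2 release you download), starting the server looks something like:

java -jar graylog2-server.jar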
Finally we need to start the logstash agent using:

java -jar logstash-1.1.0-monolithic.jar agent -f logstash-agent.conf

Enjoy: all your logs are now in the central graylog2 server. Watch out for the next post on the graylog2 web interface and how we can use it.

Ant parallel execution – Error with emma code coverage

In our project we tried to run ant tasks in parallel; we were parallelizing the emma code coverage task with some other independent build tasks.
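Roughly, the build looked like this, using ant's parallel task (the target names here are illustrative, not from our actual build file):

<target name="coverage-and-checks">
  <parallel>
    <antcall target="emma-coverage"/>
    <antcall target="static-checks"/>
  </parallel>
</target>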
While running these tasks in parallel, we were getting this error:

../project/build.xml:157: java.lang.ExceptionInInitializerError
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:116)
at org.apache.tools.ant.Task.perform(Task.java:348)
at org.apache.tools.ant.Target.execute(Target.java:357)
at org.apache.tools.ant.Target.performTasks(Target.java:385)
at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1337)
...
at org.apache.tools.ant.launch.Launcher.run(Launcher.java:257)
at org.apache.tools.ant.launch.Launcher.main(Launcher.java:104)
Caused by: java.lang.ExceptionInInitializerError
at com.vladium.emma.Processor.run(Processor.java:40)
at com.vladium.emma.instr.instrTask.execute(instrTask.java:77)
at com.vladium.emma.emmaTask.execute(emmaTask.java:57)
...
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
... 31 more
Caused by: java.lang.IllegalArgumentException: null or corrupt input: out
at com.vladium.logging.Logger.create(Logger.java:74)
at com.vladium.logging.Logger.<init>(Logger.java:604)
... 39 more

On analysing this, we found an open defect in emma, which is linked here:
http://sourceforge.net/tracker/index.php?func=detail&aid=1641403&group_id=108932&atid=651897

On further debugging, I found that the logging code in ant was obtaining and releasing locks, and that the defect was actually in ant. We were running ant 1.7.1 at the time; upgrading to ant 1.8.1 solved the problem.

In ant 1.8.1, the locks that were obtained and released during logging were removed and a lock-free way of logging was introduced.

I hope this is useful for others debugging this issue.

Grails controller integration testing

I wanted to test a grails controller using a unit test. We had a render call whose arguments needed to be verified. Initially we added a test case extending ControllerUnitTestCase and tried hard with the renderArgs provided by the unit test case. We found that redirectArgs worked properly, but renderArgs kept returning null.

After a bit of a struggle, we found that the render arguments are only available through the Spring ModelAndView object that gets constructed when you call render.

Finally, when we changed our assertions to use the following,

controller.modelAndView.viewName
controller.modelAndView.model

everything started to work.
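A rough sketch of how such a test looks when extending ControllerUnitTestCase (the action name and view name here are purely illustrative):

void testShowRendersDetailView() {
    controller.show()
    assertEquals "detail", controller.modelAndView.viewName
    assertNotNull controller.modelAndView.model
}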

This is documented in the grails docs, but in a completely different section; the section on testing seems misleading.

Mockito mock child classes too…

Recently, I had a requirement to mock out the child objects of a mocked parent class. We typically reach this kind of situation when we want to mock out a view and its components.

I was able to put together a small snippet to solve this problem:

import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

import org.mockito.invocation.InvocationOnMock;
import org.mockito.stubbing.Answer;

import static org.mockito.Mockito.mock;

protected <T> T createMock(Class<T> clazz) {
    return mock(clazz, createAnswer());
}

Answer<Object> createAnswer() {
    return new Answer<Object>() {
        // cache child mocks so the same method always returns the same instance
        private final Map<Method, Object> methodObjectHashMap = new HashMap<Method, Object>();

        @Override
        public Object answer(InvocationOnMock invocationOnMock) throws Throwable {
            Method method = invocationOnMock.getMethod();
            // void methods have nothing to return
            if (method.getReturnType().getName().equalsIgnoreCase("void")) {
                return null;
            }
            // lazily create and cache a mock of the method's return type
            if (methodObjectHashMap.get(method) == null) {
                methodObjectHashMap.put(method, mock(method.getReturnType()));
            }
            return methodObjectHashMap.get(method);
        }
    };
}

Instead of using mockito's mock method directly to create a mock object of a class, we can use this createMock method, which returns the required mock object with each of its properties also returned as mocked objects.
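Usage then looks something like this; DashboardView and HeaderPanel are made-up classes, used purely for illustration:

// create the parent mock with the recursive answer
DashboardView view = createMock(DashboardView.class);

// the component returned by the parent mock is itself a Mockito mock,
// so it can be stubbed or verified like any other mock
HeaderPanel header = view.getHeader();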

The world is at fault

I recently came across a situation where every problem was attributed to the technology. It started with the criticism that all the problems we had were due to groovy, and then it matured into blaming grails.

The latest entrant into this club is JUnit 4. There was an issue when we ran the grails test suite: one of the integration tests failed. It turned out that the mocking in one of the unit tests was wrong and had left a few entries in the database.

What?

A mocked unit test leaving entries in the DB.

That's the same surprise I had too. This led to us claiming that JUnit 4 was having issues.

I never believed that, and on further investigation it turned out that when we run the grails unit and integration tests together:

  • The test classes are loaded and mocked.
  • The unit test added rows to the mocked class.
  • Since the classes are loaded in common for both (some part of the loading is custom), the mocked class was the same one the integration test also used.

That was the reason there were so-called records left behind in the DB. Actually, nothing ever went to the DB itself.

So what made this go away when we moved back to JUnit 3? Grails adds a metaClass to GroovySystem.metaClassRegistry when mockDomain and the other mock… methods are called for the corresponding class. These metaClasses are cleared by default by the tearDown method in GrailsUnitTestCase.

So when we take over the test setup in JUnit 4, we are also given the responsibility of cleaning up afterwards, which can be achieved with a simple call to remove the metaClass from GroovySystem.metaClassRegistry, or by calling the original tearDown itself.
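A minimal sketch of that cleanup in a JUnit 4 grails test, assuming a domain class named Employee was mocked:

import org.junit.After

@After
void cleanUpMockedMetaClasses() {
    // remove the metaClass registered by mockDomain(Employee) so later tests see the real class
    GroovySystem.metaClassRegistry.removeMetaClass(Employee)
}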

Ok… how many times have we been in this kind of situation and simply blamed the technology for problems we ourselves introduced…

The problem with this kind of thinking is:

  • The real problem won't be identified
  • Different forms of the same problem will keep occurring
  • There is nothing to learn from such situations…

To address this effectively, we have to move towards a thought process where the problem and its root causes take centre stage, rather than opinions that are not backed by data…