Parse logs using GROK pattern

escoder · Jun 7, 2021

The GROK filter plugin parses unstructured data and converts it into a structured format. Once the data is structured (e.g., as JSON), it becomes much easier to query and to process further.

In other words, the GROK filter matches a line against a given pattern and maps each matched part of the line to a specific field.

The general syntax of a GROK pattern is: %{PATTERN:FIELDNAME}

Here PATTERN is the name of the pattern used to match a part of the line, and FIELDNAME is the identifier under which the matched text is stored.
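
To make the syntax concrete, here is a minimal Python sketch of how a %{PATTERN:FIELDNAME} token can be thought of as a named piece of a regular expression. It is an illustration only, not how the GROK plugin is implemented: the PATTERNS table holds simplified stand-ins for a few predefined patterns, and grok_to_regex and grok_match are hypothetical helper names.

```python
import re

# Simplified stand-ins for a few predefined patterns. These are assumptions
# for illustration; the real definitions are stricter.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "TIMESTAMP_ISO8601": r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:[.,]\d+)?",
    "LOGLEVEL": r"DEBUG|INFO|WARN|ERROR|FATAL",
    "WORD": r"\w+",
}

# Matches one %{PATTERN:FIELDNAME} token.
TOKEN = re.compile(r"%\{(\w+):([\w.]+)\}")


def grok_to_regex(grok):
    """Replace each %{PATTERN:FIELDNAME} token with a named regex group."""
    fields = []  # FIELDNAMEs in the order their tokens appear

    def expand(token):
        pattern, field = token.groups()
        fields.append(field)
        # Field names such as "host.ip" are not valid Python group names,
        # so each group gets a positional alias (f0, f1, ...).
        return "(?P<f%d>%s)" % (len(fields) - 1, PATTERNS[pattern])

    return re.compile(TOKEN.sub(expand, grok)), fields


def grok_match(grok, line):
    """Return {FIELDNAME: matched text}, or None if the line does not match."""
    regex, fields = grok_to_regex(grok)
    match = regex.search(line)
    if match is None:
        return None
    return {field: match.group("f%d" % i) for i, field in enumerate(fields)}


line = "55.3.244.1 2021-02-25T18:43:18,222 INFO India"
print(grok_match("%{LOGLEVEL:level}", line))  # {'level': 'INFO'}
```

Everything outside the %{...} tokens is passed through as ordinary regular-expression text, which is why literal spaces, \s*, and escaped brackets can sit between tokens in a pattern.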

Note: Several GROK patterns are already predefined; you can find them here. You can also modify these patterns to create your own custom patterns.

You can build and test your GROK filters using this grok debug tool.

GROK pattern examples

  1. The GROK pattern below breaks the given line into host, timestamp, level, and country fields. It uses only predefined patterns to parse the input data.
    55.3.244.1 2021-02-25T18:43:18,222 INFO India
    
    Pattern:
    %{IP:host.ip} %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{WORD:country}
    
  2. To extract only a particular part of the line, write a pattern for just that part. For example, to capture only the log level and skip the rest of the input data, use:
    %{LOGLEVEL:level}
    
  3. Use %{SPACE} or \s* to match whitespace between parts of the input data.
  4. Square brackets and other special characters in the data line must be escaped with a backslash, and any whitespace between them can be matched with \s*:
    [55.3.244.1] [2021-02-25T18:43:18,222]
    
    \[%{IP:host.ip}\]\s*\[%{TIMESTAMP_ISO8601:timestamp}\]
    
  5. To make a field optional, enclose the complete field in ( ) and append the ? quantifier. In the example below, the timestamp field is optional:

    [55.3.244.1]
    

    Pattern:

    \[%{IP:host.ip}\](\[%{TIMESTAMP_ISO8601:timestamp}\])?
    

  6. By default, the GROK filter returns every field as a string. You can cast a field to int or float by appending the type after the field name:

    523
    
    %{NUMBER:num:int}
    

    Without type conversion, the pattern would be:

    %{NUMBER:num}
    

    In that case, num is returned as a string rather than as an integer.
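
The optional-field and type-conversion examples above can be mimicked in plain Python. This is a rough sketch under stated assumptions, not the plugin's actual behavior: the PATTERNS entries are simplified stand-ins for the predefined patterns, and parse_line, CASTS, and TOKEN are hypothetical names introduced only for illustration.

```python
import re

# Simplified stand-ins for the predefined patterns (assumptions for
# illustration, not the official definitions).
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "TIMESTAMP_ISO8601": r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:[.,]\d+)?",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
}
CASTS = {"int": int, "float": float}

# Matches %{PATTERN:FIELDNAME} with an optional :int or :float suffix.
TOKEN = re.compile(r"%\{(\w+):([\w.]+)(?::(int|float))?\}")


def parse_line(grok, line):
    """Hypothetical helper: return the matched fields, cast where requested."""
    fields = []  # (field name, cast function) for each token, in order

    def expand(token):
        pattern, field, cast = token.groups()
        fields.append((field, CASTS.get(cast, str)))  # default: keep string
        return "(?P<f%d>%s)" % (len(fields) - 1, PATTERNS[pattern])

    match = re.search(TOKEN.sub(expand, grok), line)
    if match is None:
        return None
    result = {}
    for i, (field, cast) in enumerate(fields):
        text = match.group("f%d" % i)
        if text is not None:  # an optional group that did not match is skipped
            result[field] = cast(text)
    return result


optional = r"\[%{IP:host.ip}\](\[%{TIMESTAMP_ISO8601:timestamp}\])?"
print(parse_line(optional, "[55.3.244.1]"))    # {'host.ip': '55.3.244.1'}
print(parse_line("%{NUMBER:num:int}", "523"))  # {'num': 523}
print(parse_line("%{NUMBER:num}", "523"))      # {'num': '523'}
```

The last two calls show the difference described in example 6: with the :int suffix the value comes back as the integer 523, and without it the value stays the string '523'.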