Parsing configuration example

You can configure and customize most stages of the log parsing pipeline in the custom-values.yaml file.

Sample log parser configuration
logs-parser:
  kf_parsing_config:
    configPath: "/conf"
    config: |-
      parser_patterns:
        - dissect:
            timestamp_pat: "<REGEX>"
        - grok:
            NGINX_HOST: (?:%{IP:destination_ip}|%{NGINX_NOTSEPARATOR:destination_domain})(:%{NUMBER:destination_port})?
      - remap:
          args:
            kf_source:
              - "$.logSource"
          conditions:
            - matcher: "__kf_agent"
              value: "fluent-bit"
              op: "=="
      - parser:
          dissect:
            args:
              - tokenizer: '%{timestamp} %{level} [LLRealtimeSegmentDataManager_%{segment_name}]'
            conditions:
              - matcher: "%kf_msg"
                value: "LLRealtimeSegmentDataManager_"
                op: "contains"

Configuration details

There are several important configuration details.

Kloudfuse parsing configuration

You must specify all parsing configurations under the kf_parsing_config key.

Configuration path

By default, we write the configuration as a YAML file in the conf/ directory. To use a different directory, specify it in configPath:

kf_parsing_config:
  configPath: "<CUSTOM_CONFIG_DIR>"
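
To show where the override sits in the full values file, here is a minimal sketch that reuses the structure of the sample configuration. The directory name /custom-conf is a hypothetical placeholder chosen for illustration, not a documented value.

```yaml
logs-parser:
  kf_parsing_config:
    # Hypothetical directory; replace with the path you want to use.
    configPath: "/custom-conf"
    config: |-
      parser_patterns:
        - dissect:
            timestamp_pat: "<REGEX>"
```

The configPath key is a sibling of config, so changing the directory does not require any change to the parsing rules themselves.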

Parser patterns

Two kinds of patterns define the parsing grammar: dissect and grok. You can reuse a pattern definition in other parts of the configuration by referring to its key, such as timestamp_pat or NGINX_HOST in the sample.

Consider the pattern definitions from the Sample log parser configuration:

Parser patterns example
      parser_patterns:
        - dissect:
            timestamp_pat: "<REGEX>"
        - grok:
            NGINX_HOST: (?:%{IP:destination_ip}|%{NGINX_NOTSEPARATOR:destination_domain})(:%{NUMBER:destination_port})?

Functions

Specify each function as a separate item in the list. Be sure to include the required arguments for the function.

Each function can have a list of conditions that determine whether the system should run the function for a specific log line. If a function has no attached conditions, the system always runs it. A condition definition is a triplet made up of a matcher, a value, and an operator (op).

Functions example
logs-parser:
  kf_parsing_config:
    configPath: "/conf"
    config: |-
      parser_patterns:
        . . .
      - <FUNC_NAME>:
          args:
            - <FUNC_ARGS>
          conditions:
            # condition 1
            - matcher: <LABEL|FACET|FIELD|JSON_PATH|__kf_agent>
              value: "<EXPECTED_VALUE>"
              op: "<OP_VALUE>"
            # condition 2
            - matcher: <LABEL|FACET|FIELD|JSON_PATH|__kf_agent>
              value: "<EXPECTED_VALUE>"
              op: "<OP_VALUE>"
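
As a concrete instance of this template, the sketch below shows the remap function from the sample configuration in two alternative forms. The fragments belong inside the config block of kf_parsing_config; they are shown without the surrounding indentation for readability, and they are alternatives, not entries meant to appear together.

```yaml
# Variant A: no conditions attached, so the function runs for every log line.
- remap:
    args:
      kf_source:
        - "$.logSource"

# Variant B: the same function, run only when the log agent is fluent-bit.
- remap:
    args:
      kf_source:
        - "$.logSource"
    conditions:
      - matcher: "__kf_agent"
        value: "fluent-bit"
        op: "=="
```

Both variants use only values that already appear in the sample log parser configuration.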

matcher

The matcher can be any one of these:

Label

Name must be prefixed with #.

Log facet

Name must be prefixed with @.

Field

Name must be prefixed with %.

The only supported field is kf_msg.

JSONPath

A field from the incoming JSON payload, specified as a JSONPath expression.

Supported only for the remap function.

__kf_agent

A static string literal that identifies the log agent type.

Supported only for the remap function.
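
The matcher kinds can be sketched as individual condition entries. The label name service, the facet name status, and the compared values nginx and 500 are hypothetical and chosen only for illustration; kf_msg, $.logSource, and __kf_agent come from the sample configuration. Each entry illustrates one matcher kind on its own; the list is not meant to be attached to a single function as written.

```yaml
conditions:
  # Label: name prefixed with "#" ("service" is a hypothetical label).
  - matcher: "#service"
    value: "nginx"
    op: "=="
  # Log facet: name prefixed with "@" ("status" is a hypothetical facet).
  - matcher: "@status"
    value: "500"
    op: "=="
  # Field: name prefixed with "%"; kf_msg is the only supported field.
  - matcher: "%kf_msg"
    value: "LLRealtimeSegmentDataManager_"
    op: "contains"
  # JSONPath into the incoming JSON payload (remap function only).
  - matcher: "$.logSource"
    value: "nginx"
    op: "=="
  # Log agent type (remap function only).
  - matcher: "__kf_agent"
    value: "fluent-bit"
    op: "=="
```

Quote matcher strings that start with a prefix character. In YAML, an unquoted # after whitespace begins a comment, and a plain (unquoted) scalar cannot start with @ or %.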

value

The expected value that the matcher is compared against. Depending on the operator, the value must be a literal string or a regex; the =~ and !~ operators take a regex.

op

The operator for the condition.

Kloudfuse supports these conditional operators:

==

String equality

!=

String inequality

=~

Regex match

!~

Negated regex match

contains

Checks whether a string contains a sequence of characters.

startsWith

Checks whether a string starts with a sequence of characters.

endsWith

Checks whether a string ends with a sequence of characters.

in

Checks whether the string is one of several values; specify multiple values as a single comma-separated string.
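
A few operators in use, as a hedged sketch. The label name env and all compared values are hypothetical. This section does not state the regex dialect or whether spaces are allowed around the commas for the in operator, so the example sticks to a simple regex and a comma-separated string without spaces.

```yaml
conditions:
  # =~ : the value is a regex; matches messages that begin with ERROR or FATAL.
  - matcher: "%kf_msg"
    value: "^(ERROR|FATAL)"
    op: "=~"
  # startsWith: the value is a literal string, not a regex.
  - matcher: "%kf_msg"
    value: "WARN"
    op: "startsWith"
  # in: multiple candidate values in one comma-separated string
  # ("env" is a hypothetical label name).
  - matcher: "#env"
    value: "staging,production"
    op: "in"
```

As in the previous sketches, each entry illustrates one operator independently of the others.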