Parsing configuration example
You can configure and customize most stages of the log parsing pipeline in the custom-values.yaml file.
logs-parser:
  kf_parsing_config:
    configPath: "/conf"
    config: |-
      parser_patterns:
        - dissect:
            timestamp_pat: "<REGEX>"
        - grok:
            NGINX_HOST: (?:%{IP:destination_ip}|%{NGINX_NOTSEPARATOR:destination_domain})(:%{NUMBER:destination_port})?
      - remap:
          args:
            kf_source:
              - "$.logSource"
          conditions:
            - matcher: "__kf_agent"
              value: "fluent-bit"
              op: "=="
      - parser:
          dissect:
            args:
              - tokenizer: '%{timestamp} %{level} [LLRealtimeSegmentDataManager_%{segment_name}]'
            conditions:
              - matcher: "%kf_msg"
                value: "LLRealtimeSegmentDataManager_"
                op: "contains"
Configuration details
The following sections describe each part of the example configuration.
Kloudfuse parsing configuration
You must specify all parsing configurations under the kf_parsing_config key.
Configuration path
By default, we write the configuration as a YAML file in the conf/ directory. You can override the directory by specifying a different configPath:
kf_parsing_config:
  configPath: "<CUSTOM_CONFIG_DIR>"
Parser patterns
We support two kinds of pattern definition: dissect and grok. You can reuse these definitions in other parts of the configuration by referring to them by key, for example timestamp_pat or NGINX_HOST.
Consider the pattern definitions from the parsing configuration example:
parser_patterns:
  - dissect:
      timestamp_pat: "<REGEX>"
  - grok:
      NGINX_HOST: (?:%{IP:destination_ip}|%{NGINX_NOTSEPARATOR:destination_domain})(:%{NUMBER:destination_port})?
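As a rough sketch of how a named definition might be reused, the fragment below adds a second grok pattern that refers to NGINX_HOST with the standard grok %{NAME} reference syntax. The NGINX_UPSTREAM name and its pattern text are hypothetical, and placing two keys under a single grok entry is an assumption, so verify both against your Kloudfuse release before relying on them.

```yaml
parser_patterns:
  - grok:
      NGINX_HOST: (?:%{IP:destination_ip}|%{NGINX_NOTSEPARATOR:destination_domain})(:%{NUMBER:destination_port})?
      # Hypothetical pattern (not part of the original example) that reuses
      # NGINX_HOST by its key. Quoted so that YAML treats it as a plain string.
      NGINX_UPSTREAM: 'upstream: %{NGINX_HOST}'
```

Defining the shared fragment once and referring to it by key keeps longer patterns readable and avoids copying the same expression into several places.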
Functions
Specify each function as a separate item in the list. Be sure to include the required arguments for the function.
Each function can have a list of conditions that determine whether the system should run the function for a specific log line. If a function has no attached conditions, the system always runs it. A condition definition is a triplet made up of a matcher, a value, and an operator (op).
logs-parser:
  kf_parsing_config:
    configPath: "/conf"
    config: |-
      parser_patterns:
        . . .
      - <FUNC_NAME>:
          args:
            - <FUNC_ARGS>
          conditions:
            # condition 1
            - matcher: <LABEL|FACET|FIELD|JSON_PATH|__kf_agent>
              value: "<EXPECTED_VALUE>"
              op: "<OP_VALUE>"
            # condition 2
            - matcher: <LABEL|FACET|FIELD|JSON_PATH|__kf_agent>
              value: "<EXPECTED_VALUE>"
              op: "<OP_VALUE>"
- matcher: Can be any one of these:
    - Label: Name must be prefixed with #.
    - Log facet: Name must be prefixed with @.
    - Field: Name must be prefixed with %. The only supported field is kf_msg.
    - JSONPath: Field from the incoming JSON payload, specified as a JSONPath. Only for the remap function.
    - __kf_agent: A static string literal that specifies the log agent type. Only for the remap function.
- value: Depending on the operator, the value must be a literal string or a regex.
- op: The operator for the condition. Kloudfuse supports these conditional operators:
    - ==: String equality
    - !=: String inequality
    - =~: Regex match
    - !~: Regex not match
    - contains: Checks whether a string contains a sequence of characters.
    - startsWith: Checks whether a string starts with a sequence of characters.
    - endsWith: Checks whether a string ends with a sequence of characters.
    - in: Checks whether the string is one of a set of values; specify multiple values as a single comma-separated string.
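To make the matcher prefixes and operators more concrete, here is a minimal sketch of a conditions list that uses one label, one facet, and the kf_msg field. The label name source, the facet name level, and all expected values are hypothetical, and <FUNC_NAME> and <FUNC_ARGS> remain placeholders as in the template above. The sketch makes no assumption about how multiple conditions are combined.

```yaml
- <FUNC_NAME>:
    args:
      - <FUNC_ARGS>
    conditions:
      # Label matcher (# prefix). "source" is a hypothetical label name;
      # the "in" operator takes its candidates as one comma-separated string.
      - matcher: "#source"
        value: "nginx,apache"
        op: "in"
      # Log facet matcher (@ prefix). "level" is a hypothetical facet name;
      # "=~" expects the value to be a regex.
      - matcher: "@level"
        value: "^(WARN|ERROR)$"
        op: "=~"
      # Field matcher (% prefix). kf_msg is the only supported field.
      - matcher: "%kf_msg"
        value: "healthcheck"
        op: "contains"
```

Quoting each matcher keeps YAML from reading a leading #, @, or % as a comment or reserved indicator.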