Derive log facets
You can add a tokenizer to derive log facets during ingestion into the Kloudfuse observability platform.
In addition to auto-extracted log facets, the system can derive log facets during ingestion through a customized tokenizer. A typical use case is capturing a string in the log event as a named log facet when that string is not auto-derived.
When you parse this log line:
10.12.0.35 - - [26/May/2021:18:59:10 +0000] "GET /unavailable HTTP/1.1" 503 21 "-" "hey/0.0.1"
Using this tokenizer:
'%{sourceIp} - - [%{timestamp}] "%{requestMethod} %{uri} %{_}" %{responseCode} %{contentLength}'
Kloudfuse generates these log facets: sourceIp: 10.12.0.35, requestMethod: GET, responseCode: 503, and contentLength: 21.
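To make the matching concrete, here is a minimal Python sketch of how a dissect-style tokenizer can be evaluated. It is not the logs-parser implementation: the `dissect` helper is hypothetical, and it assumes that `%{_}` skips text without producing a facet and that a field with no literal text after it stops at the next whitespace (which is consistent with the contentLength: 21 result above).

```python
import re
from typing import Dict, Optional

# A field looks like %{name}; everything else in the tokenizer is literal text.
FIELD = re.compile(r"%\{(\w+)\}")


def dissect(tokenizer: str, line: str) -> Optional[Dict[str, str]]:
    """Return the named fields that `tokenizer` extracts from `line`, or None."""
    parts = []
    pos = 0
    for field in FIELD.finditer(tokenizer):
        # Literal text between fields must match exactly.
        parts.append(re.escape(tokenizer[pos:field.start()]))
        name = field.group(1)
        # Assumption: a field with no literal after it stops at whitespace.
        is_trailing = field.end() == len(tokenizer)
        body = r"\S+" if is_trailing else r".+?"
        # Assumption: %{_} matches text but is not kept as a facet.
        parts.append(f"(?:{body})" if name == "_" else f"(?P<{name}>{body})")
        pos = field.end()
    parts.append(re.escape(tokenizer[pos:]))
    match = re.match("".join(parts), line)
    return match.groupdict() if match else None


TOKENIZER = ('%{sourceIp} - - [%{timestamp}] "%{requestMethod} %{uri} %{_}" '
             '%{responseCode} %{contentLength}')
LOG_LINE = ('10.12.0.35 - - [26/May/2021:18:59:10 +0000] '
            '"GET /unavailable HTTP/1.1" 503 21 "-" "hey/0.0.1"')

facets = dissect(TOKENIZER, LOG_LINE)
print(facets)
```

The sketch returns every named key in the tokenizer, so its result also includes timestamp and uri alongside the four facets listed above.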
Apply tokenizer
You can apply the tokenizer to incoming log lines based on source and line filters by configuring the logs-parser values.yaml file, and then performing a helm upgrade.
This values.yaml demonstrates how to apply a conditional pattern to an incoming log event. Add the pipeline: section from this example alongside any values that already exist in the logs-parser section.
logs-parser: # (1)
  pipeline: # (2)
    configPath: "/conf" # (3)
    config: |- # (4)
      - nginx: # (5)
          - pipeline: # (6)
              - func: dissect # (7)
                params:
                  - tokenizer: '%{sourceIp} - - [%{timestamp}] "%{requestMethod} %{uri} %{_}" %{responseCode} %{contentLength}' # (8)
      - pinot: # (9)
          - pipeline:
              - if: 'msg contains "LLRealtimeSegmentDataManager_"' # (10)
                then:
                  - func: dissect
                    params:
                      - tokenizer: '%{timestamp} %{level} [LLRealtimeSegmentDataManager_%{segment_name}]'
(1) logs-parser specifies the values for the logs-parser helm chart, a sub-chart of the Kloudfuse stack.
(2) pipeline represents the values for the logs-parser pipeline definition. This pipeline definition holds across all sources. A pipeline is a sequence of functions that are applied to an incoming log event to extract and process it.
(3) configPath represents the path where the pipeline file is loaded into the logs-parser pod.
(4) config represents the YAML that is written to the pipeline file at configPath.
(5) nginx represents the pipeline applied to events with the label source=nginx.
(6) pipeline represents the pipeline definition for the given source; in this case, nginx.
(7) func: dissect instructs the logs-parser to apply the dissect function, which matches the user-provided pattern against the incoming log line.
(8) tokenizer: 'pattern' is a required argument to func: dissect. It defines the pattern.
(9) pinot is another source-specific pipeline, applied to events with the label source=pinot.
(10) if: 'msg contains line-filter' is an optional line filter. When present, the tokenizer is applied only to lines that match the filter.
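The per-source and conditional behavior described in the callouts can be sketched in Python as follows. This is an illustration only, not the logs-parser code: `PIPELINES`, `run_pipeline`, the `if_contains` key, and the compact `dissect` stand-in are all hypothetical names, and the pinot log line is made up to fit the tokenizer.

```python
import re
from typing import Dict, Optional


def dissect(tokenizer: str, line: str) -> Optional[Dict[str, str]]:
    """Hypothetical stand-in for func: dissect (literal text plus %{name} fields)."""
    parts, pos = [], 0
    for field in re.finditer(r"%\{(\w+)\}", tokenizer):
        parts.append(re.escape(tokenizer[pos:field.start()]))
        name = field.group(1)
        # Assumption: a field at the very end of the tokenizer stops at whitespace.
        body = r"\S+" if field.end() == len(tokenizer) else r".+?"
        # Assumption: %{_} skips text without producing a facet.
        parts.append(f"(?:{body})" if name == "_" else f"(?P<{name}>{body})")
        pos = field.end()
    parts.append(re.escape(tokenizer[pos:]))
    match = re.match("".join(parts), line)
    return match.groupdict() if match else None


# One pipeline per source label; a step may carry an optional line filter.
PIPELINES = {
    "nginx": [
        {"tokenizer": '%{sourceIp} - - [%{timestamp}] "%{requestMethod} '
                      '%{uri} %{_}" %{responseCode} %{contentLength}'},
    ],
    "pinot": [
        {"if_contains": "LLRealtimeSegmentDataManager_",
         "tokenizer": "%{timestamp} %{level} "
                      "[LLRealtimeSegmentDataManager_%{segment_name}]"},
    ],
}


def run_pipeline(source: str, msg: str) -> Dict[str, str]:
    """Apply the steps configured for `source` to `msg` and collect facets."""
    facets: Dict[str, str] = {}
    for step in PIPELINES.get(source, []):
        needle = step.get("if_contains")
        if needle is not None and needle not in msg:
            continue  # line filter did not match, so skip this tokenizer
        facets.update(dissect(step["tokenizer"], msg) or {})
    return facets


# Made-up sample line for illustration; real pinot logs may look different.
PINOT_LINE = ("2021-05-26T18:59:10Z INFO "
              "[LLRealtimeSegmentDataManager_events__0__12] Starting consumption")
print(run_pipeline("pinot", PINOT_LINE))
```

A line from source=pinot that does not contain the filter string yields no derived facets in this sketch, and a source without a configured pipeline is passed through untouched.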