Logs fingerprinting

Application and infrastructure logs are voluminous and highly repetitive. In a traditional logging system, practitioners aggressively filter and limit the logs that are actively indexed and queried. When troubleshooting, however, it is best to have all applicable logs indexed and ready to examine.

The Kloudfuse logging system extracts a unique fingerprint from the raw logs and exploits the repetitive nature of log events to reduce the storage footprint while preserving a high level of searchability. A typical log line contains both static strings that developers write and strings and IDs generated at runtime, so the Kloudfuse platform builds the fingerprint by splitting each log line into its static and dynamic parts. By storing the static fingerprints and the dynamic values separately, we index the log events thoroughly while storing them cost-effectively.

Generating a log fingerprint

Consider the following log line:

ts=2023-02-01T22:51:33Z caller=logging.go:29 method=Authorise result=false took=9.775µs


The fingerprint is:

ts=<v_0> caller=<v_1> method=<v_2> result=<v_3> took=<v_4>

The auto-extracted log facets are:

caller: logging.go:29
method: Authorise
result: false
took: 9.775µs
ts: 2023-02-01T22:51:33Z
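The split shown above can be approximated in a few lines of Python. This is only a minimal sketch, assuming logfmt-style key=value lines with no spaces inside values; it is not the Kloudfuse implementation, and the names `PAIR` and `fingerprint` are hypothetical.

```python
import re

# Assumption: each dynamic value appears as key=value, separated by whitespace.
PAIR = re.compile(r"(\w+)=(\S+)")

def fingerprint(line):
    """Return (fingerprint, facets) for a key=value log line.

    The fingerprint keeps the static keys and replaces each value with a
    numbered placeholder; the facets map each key to its original value.
    """
    values = []   # dynamic values, in order of appearance
    facets = {}   # key -> value

    def replace(match):
        key, value = match.group(1), match.group(2)
        placeholder = f"{key}=<v_{len(values)}>"
        values.append(value)
        facets[key] = value
        return placeholder

    return PAIR.sub(replace, line), facets

line = "ts=2023-02-01T22:51:33Z caller=logging.go:29 method=Authorise result=false took=9.775µs"
fp, facets = fingerprint(line)
print(fp)
for key in sorted(facets):
    print(f"{key}: {facets[key]}")
```

Two log lines that differ only in their values produce the same fingerprint in this sketch, which is the property that allows the static part to be stored once while the values are stored per event.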

Beyond automatic facet generation, you can define additional facets at ingestion; see Derive log facets.