User Guide to Data Scrubbing
Overview
The Scrubbing feature allows you to:
- Permanently delete telemetry data based on configurable filters
- Clean up Logs, APM, Metrics, and Events data selectively
- Apply label-based filters to target specific data
- Select time ranges for data deletion
- Monitor scrubbing job progress and history
- View detailed logs and statistics for each scrubbing operation
Accessing the Scrubbing Tool
The Scrubbing tool is accessible from the Admin section:
- Navigate to the Admin section in the left sidebar
- Click on Scrubbing from the admin menu options
- The main dashboard displays all scrubbing jobs with their current status
Understanding the Scrubbing Jobs Dashboard
The Scrubbing Jobs dashboard provides a comprehensive view of all scrubbing operations:
Dashboard Columns
| Column | Description |
|---|---|
| Job Name | User-defined identifier for the scrubbing job |
| Filters | Label filters applied to the data |
| Stream | Type of data being scrubbed (Logs, APM, Metrics, or Events) |
| Status | Current job status (Incomplete, Completed, or Cancelled) |
| Progress | Visual progress bar and count of processed items |
| Time Range | Date and time range of data being scrubbed |
| Created | When the scrubbing job was initiated |
Creating a Scrubbing Job
To create a new scrubbing job:
- Click the Start scrubbing job button in the top-right corner
- The confirmation dialog opens with configuration options
Step 1: Select Stream Type
Choose the data stream to scrub:
- Logs: Application and system log data
- APM: Application Performance Monitoring traces and spans
- Metrics: Time-series metric data
- Events: System and application events
Step 2: Select Time Range
Configure the time period for data deletion:
- Use the Last hour dropdown for quick time selections
- The interface shows "Select the time range for data that will be permanently deleted"
- Default selection is "Last hour"
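
To make the default selection concrete, the sketch below computes the window that a "Last hour" choice would cover. It is illustrative only: the function name `last_hour_range` is hypothetical, and the use of UTC is an assumption, since this guide does not state which time zone the tool applies.

```python
from datetime import datetime, timedelta, timezone


def last_hour_range(now=None):
    """Return (start, end) covering the hour that ends at `now`.

    Hypothetical helper; UTC is an assumption, not documented behavior.
    """
    # `now` can be passed in so the result is reproducible.
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(hours=1)
    return start, end


# Illustrative timestamp only.
start, end = last_hour_range(datetime(2025, 9, 18, 17, 38, 4, tzinfo=timezone.utc))
print(start.isoformat(), "->", end.isoformat())
```

Everything with a timestamp inside that window and matching the label filters is a candidate for deletion, so it is worth confirming the window before moving on.
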
Step 3: Configure Label Filters
Label filters determine which data will be deleted. The interface shows:
- Label Filters (applies to all [stream-type]): Define filters using label-value pairs
- Two dropdown selectors laid out as "Select…" = "Select…": the first chooses the label, the second chooses its value
- The heading text changes with the selected stream type (e.g., "applies to all logs")
- Examples from actual data: kube_service = "opbeans-go", action = "add_client", app_shipping_zip_code = "95054"
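
As a mental model for label filters, the sketch below treats each record as a set of labels and selects it when every filter pair matches exactly. This is an assumption for illustration: the guide does not say how multiple filters are combined, and `matches_filters` and the sample records are hypothetical.

```python
def matches_filters(labels, filters):
    """Return True when every filter label is present with the same value.

    Assumes filters are combined with AND and compared by exact equality;
    the tool's actual matching rules are not documented here.
    """
    return all(labels.get(name) == value for name, value in filters.items())


# Hypothetical records; only the kube_service value comes from the guide.
records = [
    {"kube_service": "opbeans-go", "level": "info"},
    {"kube_service": "another-service", "level": "error"},
]
filters = {"kube_service": "opbeans-go"}

selected = [r for r in records if matches_filters(r, filters)]
print(len(selected), "of", len(records), "records selected")
```

Note that in this sketch an empty filter set matches every record, which is one reason to check the preview before confirming a job.
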
Step 4: Preview Data
Before confirming the scrubbing job, review the preview of the data that will be deleted.
Logs Preview Section
- Shows sample logs matching your filter criteria with the text "Sample logs matching your filter criteria ([time-range])"
- Displays Log line count with exact numbers (e.g., "2,039,047")
- Includes detailed statistics breakdown:
  - Total: Overall count (e.g., "2.04M")
  - debug: Specific count (e.g., "918")
  - error: Specific count (e.g., "9.31K")
  - fatal: Specific count (e.g., "2")
  - info: Specific count (e.g., "1.99M")
  - notice: Specific count (e.g., "115")
  - trace: Specific count (e.g., "10.92K")
  - warn: Specific count (e.g., "30.98K")
- Time-based histogram showing data distribution with "Compare" option and "Same time yesterday"
- Paginated table view with detailed log entries showing:
  - Date: Timestamp (e.g., "2025-09-18 17:38:04")
  - Container Name: Service name (e.g., "imageprovider", "cartservice")
  - Host: Kubernetes node (e.g., "gke-moscatel-moscatel-np-us-west1-a-df65173f-9zjj")
  - Kube Namespace: Namespace (e.g., "otel-trace")
  - Kube Service: Service identifier (often shows "-")
  - Kube Cluster Name: Cluster name (e.g., "moscatel")
  - Message: Full log message content
  - Pod Name: Complete pod name (e.g., "my-otel-demo-imageprovider-597ff6cd84-vqkvb")
  - Source: Log source (e.g., "imageprovider", "cartservice")
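
The preview shows the same quantity in two forms: an exact count ("2,039,047") and an abbreviated one ("2.04M"). The sketch below shows one way to relate them. The helper `abbreviate_count` is hypothetical, and the rounding rule (two decimals, trailing zeros dropped, K and M suffixes only) is an assumption inferred from the examples above rather than documented behavior.

```python
def abbreviate_count(n):
    """Abbreviate a non-negative integer with a K or M suffix.

    Assumed rule: two decimal places with trailing zeros removed.
    """
    for threshold, suffix in ((1_000_000, "M"), (1_000, "K")):
        if n >= threshold:
            text = f"{n / threshold:.2f}".rstrip("0").rstrip(".")
            return text + suffix
    return str(n)


exact = 2_039_047
print(f"{exact:,}")             # exact form with thousands separators
print(abbreviate_count(exact))  # abbreviated form
```

Because abbreviated values are rounded, the per-level counts in the breakdown will not necessarily add up to the abbreviated total; use the exact line count when the precise number matters.
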
Step 5: Review Table Data
The preview section includes a paginated table with navigation controls:
- Rows per page: Configurable (default appears to be 10)
- Page navigation: Numbered page controls (1, 2, 3, 4, 5) and "Go to next page"
- Additional action: "Open in Logs page" button for detailed log exploration
Monitoring Scrubbing Progress
The Scrubbing Jobs dashboard shows all jobs with detailed progress information:
Job Status Types
- Incomplete: Job is currently running (orange status, shows current progress like "0% 0 / 566")
- Completed: Job finished successfully (green status, shows "100%" with final counts)
- Cancelled: Job was stopped before completion (shows the progress reached when it was cancelled, e.g., "0%")
Progress Display
Each job in the dashboard shows:
- Progress Bar: Visual green progress bar showing completion percentage
- Percentage: Numeric percentage (0% to 100%)
- Item Count: Fractional display showing "processed / total" items
- Real Examples from System:
  - Active job: "0% 0 / 566" (incomplete job in progress)
  - Small completed job: "100% 49 / 49" (quoteservice logs)
  - Medium completed job: "100% 2,877 / 2,877" (metrics with add_client action)
  - Large completed job: "100% 601,874 / 601,874" (otel-demo logs)
  - Very large cancelled job: "0% 0 / 14,401,108" (cancelled due to size)
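
The progress strings above combine a percentage with a "processed / total" count. The sketch below reproduces that layout for illustration; `format_progress` is a hypothetical helper, and rounding the percentage down to a whole number is an assumption, since the guide only shows 0% and 100%.

```python
def format_progress(processed, total):
    """Build a progress string such as '100% 2,877 / 2,877'.

    Assumes the percentage is rounded down; a total of 0 is shown as 0%.
    """
    percent = processed * 100 // total if total > 0 else 0
    return f"{percent}% {processed:,} / {total:,}"


print(format_progress(0, 566))
print(format_progress(2_877, 2_877))
```
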
Best Practices
Before Creating a Scrubbing Job
- Verify Filters: Double-check label filters to ensure correct data selection
- Review Time Range: Confirm the date range matches your intention
- Check Preview: Always review the preview data before confirming
- Document Purpose: Use descriptive job names for audit trails
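
Teams that plan deletions ahead of time may find it useful to write the checklist above down as an explicit pre-flight check. The sketch below is one way to do that outside the tool. `ScrubJobSpec` and `preflight_issues` are hypothetical names and do not correspond to any API of the Scrubbing feature; only the four stream names come from this guide.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Stream types listed in this guide.
STREAMS = {"Logs", "APM", "Metrics", "Events"}


@dataclass
class ScrubJobSpec:
    """Hypothetical record of a planned scrubbing job."""
    name: str
    stream: str
    start: datetime
    end: datetime
    filters: dict = field(default_factory=dict)


def preflight_issues(spec):
    """Return a list of problems to resolve before starting the job."""
    issues = []
    if not spec.name.strip():
        issues.append("job name is empty")
    if spec.stream not in STREAMS:
        issues.append(f"unknown stream: {spec.stream}")
    if not spec.filters:
        issues.append("no label filters set")
    if spec.start >= spec.end:
        issues.append("time range is empty or reversed")
    return issues


spec = ScrubJobSpec(
    name="remove opbeans-go logs",
    stream="Logs",
    start=datetime(2025, 9, 18, 16, 0),
    end=datetime(2025, 9, 18, 17, 0),
)
print(preflight_issues(spec))
```

A check like this does not replace the preview step; it only catches obvious omissions before the dialog is opened.
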
Common Use Cases
Compliance and Data Retention
- Remove data that has exceeded its retention period
- Delete sensitive information from specific services
- Clean up test or development data from production