Configure AWS Kinesis Firehose for Kloudfuse

AWS Kinesis Firehose is the transport layer used by both the CloudWatch Metrics and CloudWatch Logs integrations. This page covers the one-time setup of the Firehose delivery streams and the IAM role that both integrations share.

Overview

You need two Kinesis Firehose delivery streams — one for metrics and one for logs — each pointing to the corresponding Kloudfuse ingestion endpoint. Both streams use the same IAM role for permissions.

The Kloudfuse ingestion endpoints follow this pattern:

Signal    Endpoint
Metrics   https://<kloudfuse-ingester-host>/ingester/kinesis/metrics
Logs      https://<kloudfuse-ingester-host>/ingester/kinesis/logs

Replace <kloudfuse-ingester-host> with the hostname or IP of your Kloudfuse ingester service. This is typically the external load balancer address created during installation.
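The endpoint pattern can be captured in a small shell function so that the same hostname is reused for both signals. This is a minimal sketch: the function name kf_endpoint and the hostname kfuse.example.com are illustrative placeholders, not part of Kloudfuse or the AWS CLI.

```shell
# Hypothetical helper: print the Kloudfuse Firehose endpoint URL for a signal.
# $1 = ingester hostname or IP, $2 = signal ("metrics" or "logs")
kf_endpoint() {
  case "$2" in
    metrics|logs) printf 'https://%s/ingester/kinesis/%s\n' "$1" "$2" ;;
    *) echo "unknown signal: $2 (expected metrics or logs)" >&2; return 1 ;;
  esac
}

# Example with a placeholder hostname:
kf_endpoint kfuse.example.com metrics
kf_endpoint kfuse.example.com logs
```

Rejecting anything other than metrics or logs catches a mistyped path before it ends up in a delivery stream definition.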

Prerequisites

  • A running Kloudfuse installation with an externally reachable HTTPS ingestion endpoint

  • AWS CLI configured with sufficient permissions to create Firehose streams and IAM roles

  • The Kloudfuse ingester hostname or IP address

Step 1: Create the IAM Role

Both delivery streams share a single IAM role that Firehose and CloudWatch Logs assume to deliver records.

Create the trust policy document:

cat > firehose-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "firehose.amazonaws.com",
          "logs.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

Create the IAM role:

aws iam create-role \
  --role-name KloudfuseFirehoseRole \
  --assume-role-policy-document file://firehose-trust-policy.json

Create and attach the permissions policy:

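# The second statement lets Firehose write failed records to the S3 backup bucket
# referenced by the delivery streams. Replace <backup-bucket-name> with your bucket.
cat > firehose-permissions-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "firehose:PutRecord",
        "firehose:PutRecordBatch"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::<backup-bucket-name>",
        "arn:aws:s3:::<backup-bucket-name>/*"
      ]
    }
  ]
}
EOF

aws iam put-role-policy \
  --role-name KloudfuseFirehoseRole \
  --policy-name KloudfuseFirehosePermissions \
  --policy-document file://firehose-permissions-policy.json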

Note the role ARN returned by aws iam create-role; you will need it when creating the delivery streams.
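If the create-role output is no longer on screen, the ARN can be rebuilt from the account ID. The sketch below assumes the standard aws partition and the role name used on this page; kf_role_arn is a hypothetical helper, and 123456789012 is a placeholder account ID.

```shell
# Hypothetical helper: build the IAM role ARN from a 12-digit AWS account ID.
# $1 = account ID, $2 = role name (defaults to the role created in Step 1)
# Assumes the standard "aws" partition.
kf_role_arn() {
  case "$1" in
    ''|*[!0-9]*) echo "account ID must be numeric" >&2; return 1 ;;
  esac
  [ "${#1}" -eq 12 ] || { echo "account ID must be 12 digits" >&2; return 1; }
  printf 'arn:aws:iam::%s:role/%s\n' "$1" "${2:-KloudfuseFirehoseRole}"
}

# With the AWS CLI configured, the account ID can be looked up rather than typed:
#   ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
#   ROLE_ARN=$(kf_role_arn "$ACCOUNT_ID")
# Alternatively, ask IAM for the ARN directly:
#   aws iam get-role --role-name KloudfuseFirehoseRole --query Role.Arn --output text
kf_role_arn 123456789012
```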

Step 2: Create the Metrics Delivery Stream

aws firehose create-delivery-stream \
  --delivery-stream-name kloudfuse-metrics \
  --delivery-stream-type DirectPut \
  --http-endpoint-destination-configuration '{
    "EndpointConfiguration": {
      "Url": "https://<kloudfuse-ingester-host>/ingester/kinesis/metrics",
      "Name": "kloudfuse-metrics"
    },
    "RoleARN": "arn:aws:iam::<account-id>:role/KloudfuseFirehoseRole",
    "BufferingHints": {
      "SizeInMBs": 4,
      "IntervalInSeconds": 60
    },
    "RequestConfiguration": {
      "ContentEncoding": "GZIP"
    },
    "S3BackupMode": "FailedDataOnly",
    "S3DestinationConfiguration": {
      "RoleARN": "arn:aws:iam::<account-id>:role/KloudfuseFirehoseRole",
      "BucketARN": "arn:aws:s3:::<backup-bucket-name>",
      "Prefix": "kloudfuse-metrics-failures/"
    }
  }'

Replace <kloudfuse-ingester-host>, <account-id>, and <backup-bucket-name> with your values. The S3 backup bucket is only used for failed deliveries.

Step 3: Create the Logs Delivery Stream

aws firehose create-delivery-stream \
  --delivery-stream-name kloudfuse-logs \
  --delivery-stream-type DirectPut \
  --http-endpoint-destination-configuration '{
    "EndpointConfiguration": {
      "Url": "https://<kloudfuse-ingester-host>/ingester/kinesis/logs",
      "Name": "kloudfuse-logs"
    },
    "RoleARN": "arn:aws:iam::<account-id>:role/KloudfuseFirehoseRole",
    "BufferingHints": {
      "SizeInMBs": 4,
      "IntervalInSeconds": 60
    },
    "RequestConfiguration": {
      "ContentEncoding": "GZIP"
    },
    "S3BackupMode": "FailedDataOnly",
    "S3DestinationConfiguration": {
      "RoleARN": "arn:aws:iam::<account-id>:role/KloudfuseFirehoseRole",
      "BucketARN": "arn:aws:s3:::<backup-bucket-name>",
      "Prefix": "kloudfuse-logs-failures/"
    }
  }'
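The metrics and logs stream definitions in Steps 2 and 3 differ only in the signal name, so the JSON can be generated from one template. This is a sketch under the same settings shown above; kf_destination_config is a hypothetical helper, and the hostname, account ID, and bucket name in the example are placeholders.

```shell
# Hypothetical helper: print the HTTP endpoint destination configuration
# for one signal, using the same settings as Steps 2 and 3.
# $1 = signal (metrics|logs), $2 = ingester host, $3 = account ID, $4 = backup bucket
kf_destination_config() {
  cat <<EOF
{
  "EndpointConfiguration": {
    "Url": "https://$2/ingester/kinesis/$1",
    "Name": "kloudfuse-$1"
  },
  "RoleARN": "arn:aws:iam::$3:role/KloudfuseFirehoseRole",
  "BufferingHints": { "SizeInMBs": 4, "IntervalInSeconds": 60 },
  "RequestConfiguration": { "ContentEncoding": "GZIP" },
  "S3BackupMode": "FailedDataOnly",
  "S3DestinationConfiguration": {
    "RoleARN": "arn:aws:iam::$3:role/KloudfuseFirehoseRole",
    "BucketARN": "arn:aws:s3:::$4",
    "Prefix": "kloudfuse-$1-failures/"
  }
}
EOF
}

# Possible usage (all values are placeholders):
#   aws firehose create-delivery-stream \
#     --delivery-stream-name kloudfuse-logs \
#     --delivery-stream-type DirectPut \
#     --http-endpoint-destination-configuration \
#       "$(kf_destination_config logs kfuse.example.com 123456789012 my-backup-bucket)"
kf_destination_config logs kfuse.example.com 123456789012 my-backup-bucket
```

Generating both definitions from one template keeps the buffering and backup settings of the two streams from drifting apart.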

Verify the Delivery Streams

Confirm Streams are Active

Confirm both streams are in ACTIVE state:

aws firehose describe-delivery-stream \
  --delivery-stream-name kloudfuse-metrics \
  --query 'DeliveryStreamDescription.DeliveryStreamStatus'

aws firehose describe-delivery-stream \
  --delivery-stream-name kloudfuse-logs \
  --query 'DeliveryStreamDescription.DeliveryStreamStatus'

Both commands should return "ACTIVE".
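A new stream typically spends some time in CREATING before it becomes ACTIVE, so the status check may need to be repeated. The following sketch wraps the describe-delivery-stream call in a polling loop; kf_wait_active and its default attempt count and delay are illustrative choices, not AWS defaults.

```shell
# Hypothetical helper: poll a delivery stream until it reports ACTIVE.
# $1 = stream name, $2 = max attempts (default 30), $3 = seconds between attempts (default 10)
kf_wait_active() {
  kf_attempt=1
  while [ "$kf_attempt" -le "${2:-30}" ]; do
    kf_status=$(aws firehose describe-delivery-stream \
      --delivery-stream-name "$1" \
      --query 'DeliveryStreamDescription.DeliveryStreamStatus' \
      --output text) || return 1
    echo "$1: $kf_status"
    case "$kf_status" in
      ACTIVE) return 0 ;;
      CREATING_FAILED) return 1 ;;   # no point in waiting any longer
    esac
    sleep "${3:-10}"
    kf_attempt=$((kf_attempt + 1))
  done
  echo "$1 did not become ACTIVE after ${2:-30} attempts" >&2
  return 1
}
```

A possible invocation is kf_wait_active kloudfuse-metrics && kf_wait_active kloudfuse-logs, which returns non-zero if either stream fails to reach ACTIVE.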

Confirm Data is Being Delivered

  1. In the AWS Console, open Amazon Kinesis and select the kloudfuse-metrics delivery stream.

  2. Click the Monitoring tab and check the following graphs over the last 15 minutes:

    Metric                                What to look for
    IncomingRecords                       Non-zero: confirms CloudWatch Metrics Stream is publishing to Firehose
    DeliveryToHttpEndpoint.Success        Non-zero: confirms records are reaching the Kloudfuse ingester
    DeliveryToHttpEndpoint.DataFreshness  Low value (seconds); high values indicate a delivery backlog or failures
    FailedConversionRecords               Zero: any non-zero value indicates a record format problem

  3. Repeat for the kloudfuse-logs delivery stream.

  4. To confirm from the CLI, retrieve recent delivery statistics:

    aws cloudwatch get-metric-statistics \
      --namespace AWS/Firehose \
      --metric-name DeliveryToHttpEndpoint.Success \
      --dimensions Name=DeliveryStreamName,Value=kloudfuse-metrics \
      --start-time "$(date -u -v-1H +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -d '-1 hour' +%Y-%m-%dT%H:%M:%SZ)" \
      --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
      --period 300 \
      --statistics Sum

Next Steps

With the delivery streams in place, configure the CloudWatch Metrics and CloudWatch Logs integrations to send data to them.

Troubleshooting

Data Not Arriving in Kloudfuse

If a stream is ACTIVE but data is not appearing in Kloudfuse:

  1. Check the Monitoring tab in the AWS Console. Open the delivery stream in the Kinesis console and look for a falling DeliveryToHttpEndpoint.Success rate or a rising DeliveryToHttpEndpoint.DataFreshness value. These are the primary indicators of a delivery problem.

  2. Inspect the S3 failure backup. Failed records are written to S3 under the prefix configured when the stream was created (kloudfuse-metrics-failures/ or kloudfuse-logs-failures/). Download a failed record and inspect the error message:

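    aws s3 ls s3://<backup-bucket-name>/kloudfuse-metrics-failures/ --recursive | tail -5
    aws s3 cp s3://<backup-bucket-name>/kloudfuse-metrics-failures/<path-to-record> /tmp/failed-record
    # Backup objects are gzip-compressed only if the stream's S3 configuration enables compression,
    # so fall back to plain cat when the file is not gzip data
    gunzip -c < /tmp/failed-record 2>/dev/null || cat /tmp/failed-record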

    Common error messages and their causes:

    Error message                              Cause
    Connection refused / Connection timed out  Kloudfuse ingester endpoint is unreachable: check security groups, ingress, and the endpoint URL
    HTTP 401 Unauthorized                      API key is missing or incorrect in the Firehose HTTP endpoint access key configuration
    HTTP 403 Forbidden                         The IAM role does not have permission to write to the S3 backup bucket
    HTTP 413 Request Entity Too Large          Request body is too large: reduce the SizeInMBs buffering hint (try 2 MB)
    HTTP 503 Service Unavailable               Kloudfuse ingester is overloaded or restarting: check ingester pod status

  3. Verify the ingester pod is running:

    kubectl get pods -n kfuse | grep ingester
    kubectl logs -n kfuse deployment/ingester --tail=50

  4. Test the endpoint directly from a host that has network access to the Kloudfuse ingester:

    curl -v -X POST \
      https://<kloudfuse-ingester-host>/ingester/kinesis/metrics \
      -H "Content-Type: application/json" \
      -d '{"test": true}'

    A 400 Bad Request (not a network error) confirms the endpoint is reachable — the Kloudfuse ingester will reject the test payload but the connection succeeded.
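The curl test and the error table in step 2 can be combined into a quick lookup. The sketch below maps a status code to the cause given on this page; kf_explain_status is a hypothetical helper, and the code 000 is what curl prints for %{http_code} when it receives no HTTP response at all.

```shell
# Hypothetical helper: map an HTTP status code from the ingester endpoint
# to the likely cause listed in the error table in step 2.
kf_explain_status() {
  case "$1" in
    000) echo "endpoint unreachable: check security groups, ingress, and the endpoint URL" ;;
    400) echo "endpoint reachable: the ingester rejected the test payload" ;;
    401) echo "API key missing or incorrect in the Firehose access key configuration" ;;
    403) echo "access denied: review the IAM role's S3 backup bucket permissions" ;;
    413) echo "request too large: reduce the SizeInMBs buffering hint" ;;
    503) echo "ingester overloaded or restarting: check ingester pod status" ;;
    *)   echo "unexpected status $1: check the ingester logs" ;;
  esac
}

# Possible usage against the metrics endpoint (replace the hostname placeholder):
#   code=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#     "https://<kloudfuse-ingester-host>/ingester/kinesis/metrics" \
#     -H "Content-Type: application/json" -d '{"test": true}')
#   kf_explain_status "$code"
kf_explain_status 400
```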

Permission Errors

If the stream reports Access Denied errors:

  1. Confirm the KloudfuseFirehoseRole trust policy includes both firehose.amazonaws.com and logs.amazonaws.com as trusted principals:

    aws iam get-role \
      --role-name KloudfuseFirehoseRole \
      --query 'Role.AssumeRolePolicyDocument'

  2. Confirm the role has firehose:PutRecord and firehose:PutRecordBatch permissions:

    aws iam get-role-policy \
      --role-name KloudfuseFirehoseRole \
      --policy-name KloudfuseFirehosePermissions

  3. If using CloudWatch Logs subscription filters, the identity that creates the filter also needs logs:PutSubscriptionFilter on the log group and iam:PassRole on KloudfuseFirehoseRole.
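The trust policy check in step 1 can be scripted. This sketch does a plain text match on the policy document rather than parsing the JSON, so it only confirms that the two principal names appear somewhere in the document; kf_check_trust is a hypothetical helper.

```shell
# Hypothetical helper: read a trust policy document on stdin and report
# which of the two expected service principals are present.
# Text match only: it does not verify the Effect or Action of the statement.
kf_check_trust() {
  kf_policy=$(cat)
  kf_missing=0
  for kf_principal in firehose.amazonaws.com logs.amazonaws.com; do
    case "$kf_policy" in
      *"\"$kf_principal\""*) echo "ok: $kf_principal" ;;
      *) echo "missing: $kf_principal"; kf_missing=1 ;;
    esac
  done
  return "$kf_missing"
}

# Possible usage:
#   aws iam get-role --role-name KloudfuseFirehoseRole \
#     --query 'Role.AssumeRolePolicyDocument' --output json | kf_check_trust
```

The function returns non-zero when either principal is absent, so it can gate a later step in a setup script.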

Stream Stuck in CREATING State

A stream that stays in CREATING for more than a few minutes usually indicates an IAM role propagation delay. Re-run the describe-delivery-stream command at 30-second intervals to see whether the status changes. If the stream remains stuck, delete it and recreate it with the command from Step 2 or Step 3. The IAM role referenced by the stream must already exist when create-delivery-stream is called.

aws firehose delete-delivery-stream --delivery-stream-name kloudfuse-metrics

High Delivery Latency

Firehose buffers records before delivery based on the SizeInMBs and IntervalInSeconds hints. The configuration used in Steps 2 and 3 (4 MB or 60 seconds, whichever comes first) means data can be up to 60 seconds old when it arrives in Kloudfuse.

To reduce latency, lower the interval:

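# update-destination requires the stream's current version ID and destination ID,
# so look both up instead of hard-coding them
VERSION_ID=$(aws firehose describe-delivery-stream \
  --delivery-stream-name kloudfuse-metrics \
  --query 'DeliveryStreamDescription.VersionId' --output text)
DESTINATION_ID=$(aws firehose describe-delivery-stream \
  --delivery-stream-name kloudfuse-metrics \
  --query 'DeliveryStreamDescription.Destinations[0].DestinationId' --output text)

aws firehose update-destination \
  --delivery-stream-name kloudfuse-metrics \
  --current-delivery-stream-version-id "$VERSION_ID" \
  --destination-id "$DESTINATION_ID" \
  --http-endpoint-destination-update '{
    "BufferingHints": {
      "SizeInMBs": 1,
      "IntervalInSeconds": 15
    }
  }'

The minimum IntervalInSeconds is 15 seconds. Lowering the interval increases the number of HTTP requests to the Kloudfuse ingester.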
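To gauge the extra request volume, consider a stream with steady traffic that never fills the size buffer, so every delivery is triggered by the interval. The sketch below computes deliveries per hour for that case; kf_flushes_per_hour is a hypothetical helper, and real request counts will be higher whenever the size limit is reached first.

```shell
# Hypothetical helper: interval-triggered deliveries per hour for one stream,
# assuming steady traffic that never reaches the SizeInMBs limit.
# $1 = IntervalInSeconds (positive integer, no leading zeros)
kf_flushes_per_hour() {
  case "$1" in
    ''|*[!0-9]*|0*) echo "interval must be a positive integer" >&2; return 1 ;;
  esac
  echo $((3600 / $1))
}

kf_flushes_per_hour 60   # prints 60
kf_flushes_per_hour 15   # prints 240
```

Under this assumption, moving from 60 seconds to 15 seconds quadruples the number of requests each stream sends to the ingester.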