Configure the Object Store for Pinot
Configure the object store for Pinot on one of these platforms by adding custom configuration to the custom_values.yaml file that you use with the Helm installation of Kloudfuse.
Ensure that the deep store is in the same region that hosts the compute instances of the Kloudfuse stack.
GCP
Kloudfuse supports two options for Object Store configuration in GCP:
Option 1: Use a Service Account Key
Prerequisites
- Download the key for the GCP service account from the GCP console.
- Create a Kubernetes secret with the GCP service account credentials, which allow access to the GCS bucket. Be sure to name the key file secretKey.
  kubectl create secret generic pinot-sd-secret --from-file=./secretKey -n kfuse
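If you manage cluster resources declaratively, the same secret can be described as a manifest instead of being created with the kubectl command above. The following is a minimal sketch, assuming the kfuse namespace and the pinot-sd-secret name used in this section; the data value is a placeholder that you replace with the base64-encoded contents of the downloaded key file.

```yaml
# Sketch of a manifest equivalent to:
#   kubectl create secret generic pinot-sd-secret --from-file=./secretKey -n kfuse
apiVersion: v1
kind: Secret
metadata:
  name: pinot-sd-secret
  namespace: kfuse            # assumes the default Kloudfuse namespace
type: Opaque
data:
  # The key must be named secretKey, matching the file name required above.
  # Placeholder: replace with the base64-encoded contents of the key file.
  secretKey: REPLACE_WITH_BASE64_ENCODED_KEY_FILE
```

Avoid committing the populated manifest to source control, because the encoded value is not encrypted.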
Helm values
Add the following values to the custom_values.yaml file. Replace the GCS details as required.
pinot:
  deepStore:  # (1)
    enabled: true
    type: "gcs"
    useSecret: true
    createSecret: false
    secretName: "pinot-sd-secret"
    dataDir: "gs://[REPLACE BUCKET HERE]/kfuse/controller/data"  # (2)
    gcs:
      projectId: "REPLACE PROJECT ID HERE"
(1) deepStore: Enables or disables storage of Pinot segments in the deep store.
(2) dataDir: Bucket for deep storage.
Option 2: Use Google Cloud Workload Identity
Prerequisites
- Follow the steps in Authenticate to Google Cloud APIs from GKE workloads to create and associate a service account with the GKE cluster.
- Use the following values:
  - ROLE_NAME: roles/storage.admin. Alternatively, create a custom role with the following permissions:
    - storage.buckets.get
    - storage.objects.create
    - storage.objects.delete
    - storage.objects.get
    - storage.objects.getIamPolicy
    - storage.objects.list
    - storage.objects.update
  - NAMESPACE: kfuse. Create this namespace if you have not done so already. Alternatively, use the namespace of the Kloudfuse deployment.
  - KSA_NAME: default
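The custom role permissions listed above can be captured in a role definition file and passed to gcloud iam roles create through its --file flag. The following is a minimal sketch; the file name, title, and description are hypothetical placeholders rather than values that Kloudfuse prescribes, and only the permissions come from the list above.

```yaml
# role.yaml (hypothetical file name): custom role definition for deep store access.
# title and description are illustrative assumptions.
title: "Kloudfuse Pinot deep store"
description: "Access to the GCS bucket that Pinot uses for deep storage"
stage: "GA"
includedPermissions:
  - storage.buckets.get
  - storage.objects.create
  - storage.objects.delete
  - storage.objects.get
  - storage.objects.getIamPolicy
  - storage.objects.list
  - storage.objects.update
```

A command such as gcloud iam roles create ROLE_ID --project=PROJECT_ID --file=role.yaml would create the role, where ROLE_ID and PROJECT_ID are your own values. The resulting role can then be used for ROLE_NAME in place of roles/storage.admin.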
Helm values
Add the following values to the custom_values.yaml file. Replace the GCS details as required.
pinot:
  deepStore:  # (1)
    enabled: true
    type: "gcs"
    useSecret: false
    createSecret: false
    dataDir: "gs://[REPLACE BUCKET HERE]/kfuse/controller/data"  # (2)
    gcs:
      projectId: "REPLACE PROJECT ID HERE"
(1) deepStore: Enables or disables storage of Pinot segments in the deep store.
(2) dataDir: Bucket for deep storage.
AWS
Pinot must have an IAM policy with read and write permissions to the S3 bucket for deep storage. Currently, Kloudfuse supports these options for consuming this policy:
Option 1: Use an IAM User Secret Access Key
- Refer to the AWS document Create an IAM user in your AWS account for instructions on how to create an IAM user.
- Ensure that the user has the IAM policy for reading from and writing to the S3 bucket for deep storage.
- After creating the IAM user, generate access key credentials. Note the values of the access key and secret key.
- Add the following values to the custom_values.yaml file. Replace the S3 details with your values.
- Set createSecret and useSecret to true. The accessKey and secretKey are the credentials of the IAM user.
AWS Configuration with IAM User Secret Access Key
pinot:
  deepStore:  # (1)
    enabled: true
    type: "s3"
    useSecret: true  # (2)
    createSecret: true  # (3)
    dataDir: "s3://[REPLACE BUCKET HERE]/kfuse/controller/data"  # (4)
    serverSideEncryption: "aws:kms"  # (5)
    ssekmsKeyId: ""  # (6)
    ssekmsEncryptionContext: ""  # (7)
    s3:  # (8)
      region: "YOUR REGION"
      accessKey: "YOUR AWS ACCESS KEY"
      secretKey: "YOUR AWS SECRET KEY"
(1) deepStore: Enables or disables storage of Pinot segments in the deep store.
(2) useSecret: Set to true so that Pinot reads its credentials from a secret. Set to false when you do not need to pass a secret, typically because access to the deep store comes from node-level access credentials.
(3) createSecret: Set to true; creates a secret with the provided credentials.
(4) dataDir: Bucket for deep storage.
(5) serverSideEncryption: (Optional) The server-side encryption algorithm used when storing objects in Amazon S3; here, aws:kms.
(6) ssekmsKeyId: Required when serverSideEncryption is aws:kms; otherwise optional. Specifies the AWS KMS key ID to use for object encryption.
(7) ssekmsEncryptionContext: (Optional) Specifies the AWS KMS Encryption Context to use for object encryption. The value is a base64-encoded UTF-8 string holding JSON with the encryption context key-value pairs.
(8) s3: The AWS S3 region and credentials.
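Every AWS option in this section relies on an IAM policy with read and write access to the deep storage bucket. This document does not list the exact actions, so the following is only a sketch under assumptions: it expresses such a policy as a CloudFormation managed policy resource, the resource name PinotDeepStorePolicy is hypothetical, and the chosen S3 actions are a common read/write set rather than a list confirmed by Kloudfuse.

```yaml
# Sketch of an IAM policy for the deep storage bucket, as a CloudFormation resource.
# The action list is an assumption; adjust it to your security requirements.
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  PinotDeepStorePolicy:          # hypothetical logical name
    Type: AWS::IAM::ManagedPolicy
    Properties:
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          # Listing applies to the bucket itself.
          - Effect: Allow
            Action:
              - s3:ListBucket
            Resource: "arn:aws:s3:::REPLACE-BUCKET-HERE"
          # Reads, writes, and deletes apply to the objects in the bucket.
          - Effect: Allow
            Action:
              - s3:GetObject
              - s3:PutObject
              - s3:DeleteObject
            Resource: "arn:aws:s3:::REPLACE-BUCKET-HERE/*"
```

If you set serverSideEncryption to aws:kms, the identity that uses this policy most likely also needs permission to use the KMS key, such as kms:GenerateDataKey and kms:Decrypt on that key.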
Option 2: Attach the IAM policy to the NodeInstanceRole of the EKS Cluster Node Group
- Attach the IAM policy to the NodeInstanceRole of the node that runs the Kloudfuse stack. In the EKS console, open the node group detail page of the corresponding EKS cluster; under Node IAM role ARN, access the NodeInstanceRole.
- Add the following values to the custom_values.yaml file. Replace the S3 details with your values.
- Set both createSecret and useSecret to false.
AWS Configuration with Attached IAM Policy on the NodeInstanceRole of the EKS Cluster Node Group
pinot:
  deepStore:  # (1)
    enabled: true
    type: "s3"
    useSecret: false  # (2)
    createSecret: false  # (3)
    dataDir: "s3://[REPLACE BUCKET HERE]/kfuse/controller/data"  # (4)
    s3:  # (5)
      region: "YOUR REGION"
(1) deepStore: Enables or disables storage of Pinot segments in the deep store.
(2) useSecret: Set to false when you do not need to pass a secret, typically because access to the deep store comes from node-level access credentials.
(3) createSecret: Set to false. If true, creates a secret with the provided credentials.
(4) dataDir: Bucket for deep storage.
(5) s3: The AWS S3 details; this option sets only the region.
Option 3: Use a Kubernetes ServiceAccount Resource that Assumes an IAM Role
- Ensure that the Kubernetes cluster has a ServiceAccount that is associated with an IAM role with permissions to read from and write to S3. For information on how to create a ServiceAccount, see the AWS documentation on Assign IAM roles to Kubernetes service accounts.
- Ensure that Pinot is configured to use the ServiceAccount, and that deepStore is configured properly in the custom_values.yaml file. Make sure that both useSecret and createSecret are false.
AWS Configuration with a Kubernetes ServiceAccount Resource that Assumes an IAM Role
pinot:
  serviceAccountName: <REPLACE SERVICE ACCOUNT NAME HERE>
  deepStore:  # (1)
    enabled: true
    type: "s3"
    useSecret: false  # (2)
    createSecret: false  # (3)
    dataDir: "s3://[REPLACE BUCKET HERE]/kfuse/controller/data"  # (4)
    s3:  # (5)
      region: "YOUR REGION"
(1) deepStore: Enables or disables storage of Pinot segments in the deep store.
(2) useSecret: Set to false when you do not need to pass a secret; here, access to the deep store comes from the IAM role that the ServiceAccount assumes.
(3) createSecret: Set to false. If true, creates a secret with the provided credentials.
(4) dataDir: Bucket for deep storage.
(5) s3: The AWS S3 details; this option sets only the region.
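For the ServiceAccount option, the association between the Kubernetes ServiceAccount and the IAM role is usually expressed with the eks.amazonaws.com/role-arn annotation described in the AWS documentation. The following is a minimal sketch; the ServiceAccount name, the kfuse namespace, the account ID, and the role name are placeholders assumed for illustration.

```yaml
# Sketch of a ServiceAccount that assumes an IAM role on EKS.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pinot-deepstore-sa       # hypothetical name; use it for pinot.serviceAccountName
  namespace: kfuse               # assumes the default Kloudfuse namespace
  annotations:
    # Placeholder ARN: replace the account ID and role name with your own.
    eks.amazonaws.com/role-arn: "arn:aws:iam::111122223333:role/REPLACE-ROLE-NAME"
```

The annotation takes effect only if the cluster has an IAM OIDC provider and the trust policy of the role allows this ServiceAccount to assume it, as the AWS documentation explains.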
Azure
Prerequisites
- The storage account must be enabled with Azure Data Lake Storage Gen 2. When creating a storage account, select the Enable hierarchical namespace option in the Advanced section.
- Find the access key by navigating to Access Keys from the left pane of the storage account, in the Security + networking section.
- The fileSystemName refers to the container name. You can create a container by navigating to Containers from the left pane of the storage account, under the Data storage section.
Helm values
Add the following values to the custom_values.yaml file. Replace the Azure Data Lake details with your values.
pinot:
  deepStore:  # (1)
    enabled: true
    type: "adl2"
    dataDir: "adl2://[REPLACE CONTAINER NAME HERE]/kfuse/controller/data"  # (2)
    adl2:  # (3)
      accountName: "YOUR AZURE STORAGE ACCOUNT NAME"
      accessKey: "STORAGE ACCOUNT ACCESS KEY"
      fileSystemName: "STORAGE ACCOUNT CONTAINER NAME"
(1) deepStore: Enables or disables storage of Pinot segments in the deep store.
(2) dataDir: Container for deep storage.
(3) adl2: The Azure Data Lake Storage credentials.