Strimzi: Apache Kafka on Kubernetes Setup
Complete setup guide for Strimzi - an open-source Kubernetes operator for running Apache Kafka clusters in a cloud-native way. Originally created by Red Hat, it is now a CNCF incubating project. The guide covers installation, cluster configuration, topic management, and production best practices for running Kafka workloads on Kubernetes.
- Step 1
Understanding Strimzi and Kafka on Kubernetes
Strimzi runs Apache Kafka on Kubernetes using custom resources and operators. It manages Kafka brokers, ZooKeeper (or KRaft mode), Kafka Connect, MirrorMaker, and Kafka Bridge as native Kubernetes resources. The operator handles deployment, scaling, configuration changes, and upgrades declaratively, which removes much of the operational complexity of running distributed Kafka clusters by hand.
Strimzi Components:
- Cluster Operator: Deploys and manages Kafka clusters and related components (Connect, MirrorMaker, Bridge, Entity Operator)
- Entity Operator: Runs the Topic Operator and User Operator for a cluster
- Topic Operator: Synchronizes KafkaTopics with Kafka
- User Operator: Manages KafkaUsers and ACLs
- Bridge: HTTP-based Kafka client for browser/IoT devices

Kafka Resources Managed:
- Kafka: Broker cluster configuration
- KafkaTopic: Topic definitions
- KafkaUser: User credentials and ACLs
- KafkaConnect: Connect cluster for integrations
- KafkaMirrorMaker2: Cross-cluster replication
- KafkaBridge: HTTP REST API gateway

- Step 2
Prerequisites
You need a running Kubernetes cluster (1.25+ for the Strimzi release used in this guide) with kubectl configured. Minimum cluster resources: 4 CPU cores and 8GB RAM for a test cluster; production needs vary by workload. Persistent storage is required for anything beyond ephemeral test clusters (a StorageClass with dynamic provisioning is recommended). You'll need cluster-admin or equivalent permissions to install CRDs and create namespaces.
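The version requirement can be checked in a script rather than by eye. The sketch below is one possible way to do it, not part of Strimzi: the function names `k8s_minor_at_least` and `server_minor` are hypothetical, the required minor version is passed in as an argument, and the `sed` expression assumes the pretty-printed layout of `kubectl version -o json`.

```shell
# Returns 0 when the reported minor version is at least the required one.
# Managed clusters often report values such as "27+", so non-digits are stripped.
k8s_minor_at_least() {
  reported=$1   # e.g. "27" or "27+"
  required=$2   # e.g. 25
  minor=$(printf '%s' "$reported" | tr -cd '0-9')
  [ -n "$minor" ] && [ "$minor" -ge "$required" ]
}

# Prints the API server's minor version (needs cluster access).
# The client version appears first in the JSON, so the last match is the server.
server_minor() {
  kubectl version -o json | sed -n 's/.*"minor": *"\([^"]*\)".*/\1/p' | tail -n 1
}
```

A typical call would be `k8s_minor_at_least "$(server_minor)" 25 || echo "cluster too old"`.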
# Verify Kubernetes cluster access (prints client and server versions)
kubectl version
kubectl cluster-info

# Check available resources (requires metrics-server)
kubectl top nodes

# Verify StorageClass exists
kubectl get storageclass

# Create dedicated namespace
kubectl create namespace kafka
kubectl config set-context --current --namespace=kafka

# Verify namespace
kubectl get namespace kafka

⚠ Heads up: The Strimzi release used in this guide requires Kubernetes 1.25 or later. For production use, ensure your cluster has monitoring (Prometheus), logging, and backup strategies in place before deploying Kafka.

- Step 3
Install Strimzi Operator via YAML Manifests
The fastest way to install Strimzi is applying the release manifests directly. This creates all necessary CRDs (Custom Resource Definitions), the Cluster Operator deployment, and RBAC resources. The operator watches for Kafka custom resources and manages their lifecycle automatically.
# Install Strimzi operator
# (0.45.0 is the first release that supports Kafka 3.9.0, which this guide uses)
VERSION=0.45.0

# The release manifest hardcodes the namespace "myproject" in its RoleBindings,
# so rewrite it to the namespace the operator is installed into
curl -L https://github.com/strimzi/strimzi-kafka-operator/releases/download/$VERSION/strimzi-cluster-operator-$VERSION.yaml \
  | sed 's/namespace: .*/namespace: kafka/' \
  | kubectl create -f - -n kafka

# Verify CRDs are installed
kubectl get crd | grep strimzi
# Should show: kafkas, kafkatopics, kafkausers, kafkaconnects, etc.

# Wait for operator to be ready
kubectl wait --for=condition=available --timeout=300s deployment/strimzi-cluster-operator -n kafka

# Check operator logs
kubectl logs -l name=strimzi-cluster-operator -n kafka -f

# Verify operator is running
kubectl get deployment strimzi-cluster-operator -n kafka

⚠ Heads up: CRDs are always cluster-scoped, and the installation also creates ClusterRoles and ClusterRoleBindings, so it needs cluster-admin rights. By default the operator only watches the namespace it is installed in.

- Step 4
Alternative: Install via Helm
Helm provides a more flexible installation method with configurable values. This is recommended for production deployments where you need to customize operator settings, resource limits, or watchNamespaces. Helm also simplifies upgrades and rollbacks.
# Add Strimzi Helm repository
helm repo add strimzi https://strimzi.io/charts/
helm repo update

# Search available versions
helm search repo strimzi/strimzi-kafka-operator --versions

# Install with default values
helm install strimzi-kafka-operator strimzi/strimzi-kafka-operator \
  --namespace kafka \
  --create-namespace

# Or install with custom values
# (watchNamespaces lists additional namespaces; the release namespace is watched automatically)
cat <<EOF > values.yaml
watchNamespaces:
  - production
resources:
  limits:
    memory: 512Mi
    cpu: 500m
  requests:
    memory: 256Mi
    cpu: 100m
logLevel: INFO
EOF

# "upgrade --install" also works when the release already exists
helm upgrade --install strimzi-kafka-operator strimzi/strimzi-kafka-operator \
  --namespace kafka \
  --create-namespace \
  --values values.yaml

# Verify installation
helm list -n kafka
kubectl get pods -n kafka

- Step 5
Deploy Your First Kafka Cluster (Ephemeral)
An ephemeral cluster stores data in emptyDir volumes, so data is lost whenever a pod is deleted or rescheduled. This is fine for development and testing only. The example below deploys Kafka with ZooKeeper (KRaft mode is also supported). The Cluster Operator watches this custom resource and creates the necessary StrimziPodSets (Strimzi's replacement for StatefulSets), Services, and ConfigMaps.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  namespace: kafka
spec:
  kafka:
    version: 3.9.0
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.9"
    storage:
      type: ephemeral
  zookeeper:
    replicas: 3
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
    userOperator: {}

- Step 6
Apply Kafka Cluster Configuration
Save the Kafka resource definition to a file and apply it. The Cluster Operator detects the new resource and provisions the entire cluster: pod sets for the Kafka brokers and ZooKeeper nodes, plus Services for access. Initial deployment typically takes 2-5 minutes.
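In scripts and CI jobs it is usually more convenient to block until the cluster is ready than to watch the resource interactively. A minimal sketch, assuming the cluster name and namespace used in this guide; the function name `wait_for_kafka` and its default timeout are illustrative choices, while `kubectl wait --for=condition=Ready` relies on the Ready condition that Strimzi sets in the Kafka resource status.

```shell
# Blocks until the Kafka custom resource reports the Ready condition,
# or returns non-zero once the timeout expires.
wait_for_kafka() {
  cluster=${1:-my-cluster}
  namespace=${2:-kafka}
  timeout=${3:-300s}
  kubectl wait "kafka/${cluster}" --for=condition=Ready \
    --timeout="${timeout}" -n "${namespace}"
}
```

For example, `kubectl apply -f kafka-ephemeral.yaml -n kafka && wait_for_kafka my-cluster kafka 600s` applies the manifest and waits up to ten minutes.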
# Save the YAML from previous step as kafka-ephemeral.yaml

# Apply the configuration
kubectl apply -f kafka-ephemeral.yaml -n kafka

# Watch cluster creation progress
kubectl get kafka my-cluster -n kafka -w
# Wait until the READY column shows True

# Check all created resources
kubectl get all -l app.kubernetes.io/instance=my-cluster -n kafka

# Verify Kafka pods are running
kubectl get pods -l strimzi.io/cluster=my-cluster -n kafka
# Should show: 3 kafka pods, 3 zookeeper pods, entity-operator

# Check Kafka service endpoints
kubectl get svc -l strimzi.io/cluster=my-cluster -n kafka

# View cluster status
kubectl describe kafka my-cluster -n kafka

- Step 7
Production-Ready Persistent Cluster
For production use, configure persistent storage with appropriate IOPS and throughput. Use separate storage classes for Kafka (high IOPS) and ZooKeeper (lower latency). Add resource requests/limits based on your workload. Enable metrics collection and configure JVM tuning for optimal performance.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: production-cluster
  namespace: kafka
spec:
  kafka:
    version: 3.9.0
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
        authentication:
          type: tls
      - name: external
        port: 9094
        type: loadbalancer
        tls: true
        configuration:
          bootstrap:
            loadBalancerIP: <your-load-balancer-ip>
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      log.retention.hours: 168
      log.segment.bytes: 1073741824
      compression.type: producer
    storage:
      type: persistent-claim
      size: 100Gi
      class: fast-ssd
      deleteClaim: false
    resources:
      requests:
        memory: 4Gi
        cpu: "2"
      limits:
        memory: 8Gi
        cpu: "4"
    jvmOptions:
      "-Xms": 2048m
      "-Xmx": 4096m
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: kafka-metrics-config.yml
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 10Gi
      class: fast-ssd
      deleteClaim: false
    resources:
      requests:
        memory: 1Gi
        cpu: "500m"
      limits:
        memory: 2Gi
        cpu: "1"
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: zookeeper-metrics-config.yml
  entityOperator:
    topicOperator:
      resources:
        requests:
          memory: 256Mi
          cpu: "200m"
        limits:
          memory: 512Mi
          cpu: "500m"
    userOperator:
      resources:
        requests:
          memory: 256Mi
          cpu: "200m"
        limits:
          memory: 512Mi
          cpu: "500m"
  kafkaExporter:
    topicRegex: ".*"
    groupRegex: ".*"

⚠ Heads up: Persistent volume claims are not deleted when the cluster is removed unless deleteClaim is true. Keep it set to false in production to prevent accidental data loss. Always test backup/restore procedures before going live.

- Step 8
Create Kafka Topics Declaratively
Strimzi manages topics as Kubernetes resources. Create a KafkaTopic custom resource and the Topic Operator synchronizes it with Kafka. This provides GitOps-friendly topic management with version control and declarative configuration.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: orders
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 10
  replicas: 3
  config:
    retention.ms: 604800000   # 7 days
    segment.ms: 3600000       # 1 hour
    compression.type: lz4
    max.message.bytes: 1048576
    min.insync.replicas: 2
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: events
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 20
  replicas: 3
  config:
    retention.ms: 86400000    # 1 day
    cleanup.policy: delete
    compression.type: snappy

- Step 9
Apply Topic Configuration
Apply topic definitions using kubectl. The Topic Operator creates topics in Kafka and keeps them synchronized with the custom resource. Any changes to the KafkaTopic spec are reflected in Kafka automatically.
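When many topics share the same shape, the manifest can be generated from a few parameters instead of being written by hand. The following is a sketch under stated assumptions: `render_topic` and `apply_topic` are hypothetical helper names, the namespace is fixed to `kafka` as elsewhere in this guide, and the 60-second timeout is arbitrary. It relies on the Topic Operator setting a Ready condition on the KafkaTopic once the topic has been reconciled.

```shell
# Prints a KafkaTopic manifest for the given name, partition count and replica count.
render_topic() {
  name=$1
  partitions=$2
  replicas=$3
  cluster=${4:-my-cluster}
  cat <<EOF
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: ${name}
  namespace: kafka
  labels:
    strimzi.io/cluster: ${cluster}
spec:
  partitions: ${partitions}
  replicas: ${replicas}
EOF
}

# Applies the rendered manifest, then waits for the Topic Operator to reconcile it.
apply_topic() {
  render_topic "$@" | kubectl apply -f - &&
    kubectl wait "kafkatopic/$1" --for=condition=Ready --timeout=60s -n kafka
}
```

Running `render_topic payments 6 3` on its own prints the manifest, which is handy for committing the output to a Git repository; `apply_topic payments 6 3` applies it directly.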
# Apply topic definitions
kubectl apply -f topics.yaml -n kafka

# List KafkaTopic resources
kubectl get kafkatopics -n kafka

# Describe a specific topic
kubectl describe kafkatopic orders -n kafka

# Verify topics exist in Kafka (exec into broker pod)
kubectl exec -it my-cluster-kafka-0 -n kafka -- bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --list

# Get detailed topic info
kubectl exec -it my-cluster-kafka-0 -n kafka -- bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe \
  --topic orders

# Update topic (edit the KafkaTopic resource)
kubectl edit kafkatopic orders -n kafka
# Change partitions or config, save - Topic Operator applies changes

# Delete topic
kubectl delete kafkatopic orders -n kafka

⚠ Heads up: Topic deletion requires delete.topic.enable=true in Kafka config (enabled by default in Strimzi). Deleting a KafkaTopic resource deletes the actual topic and all its data.

- Step 10
Create Kafka Users with Authentication
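KafkaUser resources define users with TLS or SCRAM-SHA-512 authentication. Strimzi automatically generates certificates or passwords and stores them in Kubernetes Secrets. Users can have ACLs (Access Control Lists) for fine-grained authorization on topics and consumer groups. ACLs are enforced only when the Kafka resource enables authorization (spec.kafka.authorization with type: simple), and each user's authentication type must match a listener that is configured with the same type.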
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: producer-app
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: tls
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: orders
          patternType: literal
        operations:
          - Write
          - Describe
      - resource:
          type: group
          name: producer-group
          patternType: literal
        operations:
          - Read
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: consumer-app
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: orders
          patternType: literal
        operations:
          - Read
          - Describe
      - resource:
          type: group
          name: consumer-group
          patternType: prefix
        operations:
          - Read

- Step 11
Apply User Configuration and Access Credentials
Apply KafkaUser resources and retrieve generated credentials from Secrets. TLS users get a certificate/key pair; SCRAM-SHA users get a password. Mount these secrets in your application pods to authenticate with Kafka.
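Java clients usually want a truststore and keystore rather than PEM files. Besides the PEM entries, the Secrets generated by Strimzi also carry PKCS12 bundles and their passwords (`ca.p12` and `ca.password` in the cluster CA Secret, `user.p12` and `user.password` in a TLS user's Secret), so no conversion step is needed. The sketch below shows one way to pull them out; the helper names `secret_field` and `export_pkcs12` and the output file names are illustrative assumptions.

```shell
# Prints the decoded value of one key from a Secret.
secret_field() {
  secret=$1
  key=$2
  namespace=${3:-kafka}
  # Dots inside the key must be escaped in a jsonpath expression.
  escaped=$(printf '%s' "$key" | sed 's/\./\\./g')
  kubectl get secret "$secret" -n "$namespace" \
    -o "jsonpath={.data.${escaped}}" | base64 -d
}

# Writes the cluster CA truststore and a TLS user's keystore, with their
# passwords, to the current directory. Runs in a subshell so the restrictive
# umask does not leak into the calling shell.
export_pkcs12() (
  cluster=$1
  user=$2
  umask 077
  secret_field "${cluster}-cluster-ca-cert" ca.p12 > truststore.p12
  secret_field "${cluster}-cluster-ca-cert" ca.password > truststore.password
  secret_field "$user" user.p12 > keystore.p12
  secret_field "$user" user.password > keystore.password
)
```

Calling `export_pkcs12 my-cluster producer-app` leaves four files that can be mounted into a pod or referenced from a client configuration.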
# Apply user definitions
kubectl apply -f kafka-users.yaml -n kafka

# List KafkaUser resources
kubectl get kafkausers -n kafka

# Check generated secrets
kubectl get secrets -n kafka | grep producer-app
kubectl get secrets -n kafka | grep consumer-app

# View TLS certificate for producer-app
kubectl get secret producer-app -n kafka -o jsonpath='{.data.user\.crt}' | base64 -d

# View TLS key
kubectl get secret producer-app -n kafka -o jsonpath='{.data.user\.key}' | base64 -d

# View CA certificate (for trust)
kubectl get secret my-cluster-cluster-ca-cert -n kafka -o jsonpath='{.data.ca\.crt}' | base64 -d

# View SCRAM-SHA password for consumer-app
kubectl get secret consumer-app -n kafka -o jsonpath='{.data.password}' | base64 -d

# Extract credentials to files for application use
kubectl get secret producer-app -n kafka -o jsonpath='{.data.user\.crt}' | base64 -d > user.crt
kubectl get secret producer-app -n kafka -o jsonpath='{.data.user\.key}' | base64 -d > user.key
kubectl get secret my-cluster-cluster-ca-cert -n kafka -o jsonpath='{.data.ca\.crt}' | base64 -d > ca.crt

- Step 12
Test Producer and Consumer
Verify your Kafka cluster is working by producing and consuming messages. Use the kafka-console tools included in the Kafka container image to test connectivity and authentication.
# Create a test topic
kubectl apply -f - <<EOF
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: test-topic
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 3
  replicas: 3
EOF

# Start a producer (plain listener)
# The image tag must match an installed operator release and Kafka version
kubectl run kafka-producer -ti --image=quay.io/strimzi/kafka:0.45.0-kafka-3.9.0 \
  --rm=true --restart=Never -n kafka -- bin/kafka-console-producer.sh \
  --bootstrap-server my-cluster-kafka-bootstrap:9092 \
  --topic test-topic
# Type some messages, Ctrl+C when done

# Start a consumer in another terminal
kubectl run kafka-consumer -ti --image=quay.io/strimzi/kafka:0.45.0-kafka-3.9.0 \
  --rm=true --restart=Never -n kafka -- bin/kafka-console-consumer.sh \
  --bootstrap-server my-cluster-kafka-bootstrap:9092 \
  --topic test-topic \
  --from-beginning

# Test with TLS listener
# (the truststore must already exist at this path inside the pod,
# for example mounted from the cluster CA Secret)
kubectl run kafka-producer-tls -ti --image=quay.io/strimzi/kafka:0.45.0-kafka-3.9.0 \
  --rm=true --restart=Never -n kafka -- bin/kafka-console-producer.sh \
  --bootstrap-server my-cluster-kafka-bootstrap:9093 \
  --topic test-topic \
  --producer-property security.protocol=SSL \
  --producer-property ssl.truststore.location=/tmp/truststore.p12 \
  --producer-property ssl.truststore.password=<password>

- Step 13
Deploy Kafka Connect for Integrations
Kafka Connect provides scalable, reliable streaming integration between Kafka and external systems. Strimzi manages Connect clusters as custom resources. You can use pre-built connectors (databases, S3, Elasticsearch) or build custom ones.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster
  namespace: kafka
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.9.0
  replicas: 3
  bootstrapServers: my-cluster-kafka-bootstrap:9092
  config:
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: 3
    offset.storage.replication.factor: 3
    status.storage.replication.factor: 3
  build:
    output:
      type: docker
      image: <your-registry>/kafka-connect:latest
    plugins:
      # Confirm that each artifact version below exists on Maven Central before building
      - name: debezium-postgres-connector
        artifacts:
          - type: tgz
            url: https://repo1.maven.org/maven2/io/debezium/debezium-connector-postgres/2.8.1.Final/debezium-connector-postgres-2.8.1.Final-plugin.tar.gz
      - name: camel-http-connector
        artifacts:
          - type: tgz
            url: https://repo1.maven.org/maven2/org/apache/camel/kafkaconnector/camel-http-kafka-connector/4.4.3/camel-http-kafka-connector-4.4.3-package.tar.gz

- Step 14
Apply Connect Cluster and Deploy Connectors
Strimzi can build custom Connect images with your connectors using the build specification. For production, pre-build images and reference them. Then deploy connector instances using KafkaConnector resources.
# Apply Connect cluster
kubectl apply -f kafka-connect.yaml -n kafka

# Wait for Connect cluster to be ready
kubectl wait --for=condition=ready kafkaconnect/my-connect-cluster --timeout=600s -n kafka

# Check Connect pods
kubectl get pods -l strimzi.io/cluster=my-connect-cluster -n kafka

# Create a connector instance
kubectl apply -f - <<EOF
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: postgres-source
  namespace: kafka
  labels:
    strimzi.io/cluster: my-connect-cluster
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 2
  config:
    database.hostname: postgres.database.svc.cluster.local
    database.port: 5432
    database.user: kafka_user
    database.password: <your-password>
    database.dbname: production
    # Debezium 2.x uses topic.prefix (it replaced database.server.name)
    topic.prefix: prod-db
    table.include.list: public.orders,public.customers
    plugin.name: pgoutput
EOF

# List connectors
kubectl get kafkaconnectors -n kafka

# Check connector status
kubectl describe kafkaconnector postgres-source -n kafka

# View Connect logs
kubectl logs -l strimzi.io/cluster=my-connect-cluster -n kafka -f

- Step 15
Enable Monitoring with Prometheus and Grafana
Strimzi exposes JMX metrics from Kafka, ZooKeeper, and Connect via Prometheus exporters. Configure the metricsConfig in your Kafka resource to enable metrics collection. Deploy Prometheus and Grafana to visualize cluster health, throughput, latency, and consumer lag.
# Create Kafka metrics ConfigMap
# Note: this example file also defines a Kafka resource named my-cluster;
# review it and keep only the ConfigMap if you manage the cluster separately
kubectl apply -f https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/0.45.0/examples/metrics/kafka-metrics.yaml -n kafka

# Install Prometheus Operator (if not already installed)
# "create" is used because the bundled CRDs are too large for client-side apply
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml

# Create PodMonitor for Kafka
# (the metrics port is exposed on the pods, not on the Services)
kubectl apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-metrics
  namespace: kafka
spec:
  selector:
    matchLabels:
      strimzi.io/kind: Kafka
  podMetricsEndpoints:
    - port: tcp-prometheus
      path: /metrics
      interval: 30s
EOF

# Deploy Grafana
kubectl create deployment grafana --image=grafana/grafana:latest -n kafka
kubectl expose deployment grafana --type=LoadBalancer --port=3000 -n kafka

# Get Grafana URL
kubectl get svc grafana -n kafka

# Import Strimzi dashboards from
# https://github.com/strimzi/strimzi-kafka-operator/tree/main/examples/metrics/grafana-dashboards

# View Prometheus metrics directly from a broker pod
kubectl port-forward pod/my-cluster-kafka-0 9404:9404 -n kafka
# Visit http://localhost:9404/metrics

- Step 16
Configure External Access
Expose Kafka outside Kubernetes using LoadBalancer, NodePort, or Ingress listeners. Each broker gets a unique external address for client connections. Choose the listener type based on your cloud provider and networking setup.
# LoadBalancer listener (AWS, GCP, Azure)
listeners:
  - name: external
    port: 9094
    type: loadbalancer
    tls: true
    authentication:
      type: tls
    configuration:
      brokerCertChainAndKey:
        secretName: kafka-tls-cert
        certificate: tls.crt
        key: tls.key
---
# NodePort listener (on-prem, local)
listeners:
  - name: external
    port: 9094
    type: nodeport
    tls: true
    configuration:
      preferredNodePortAddressType: ExternalIP
      brokers:
        - broker: 0
          advertisedHost: <node-1-external-ip>
          nodePort: 32100
        - broker: 1
          advertisedHost: <node-2-external-ip>
          nodePort: 32101
        - broker: 2
          advertisedHost: <node-3-external-ip>
          nodePort: 32102
---
# Ingress listener (with NGINX or similar; TLS passthrough must be enabled)
listeners:
  - name: external
    port: 9094
    type: ingress
    tls: true
    configuration:
      bootstrap:
        host: kafka-bootstrap.example.com
      brokers:
        - broker: 0
          host: kafka-0.example.com
        - broker: 1
          host: kafka-1.example.com
        - broker: 2
          host: kafka-2.example.com
      class: nginx

- Step 17
Connect External Clients
Retrieve bootstrap addresses and certificates for external connections. Configure your Kafka clients with the appropriate security protocol and credentials. Test connectivity before deploying applications.
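Passing every TLS setting on the command line gets unwieldy and leaves passwords in shell history, so the same settings can be kept in a properties file. A minimal sketch, assuming a PKCS12 truststore and keystore named `truststore.p12` and `keystore.p12` already exist in the working directory; the function name `write_client_properties` and the file names are illustrative, while the property keys are standard Kafka client settings.

```shell
# Prints a Kafka client configuration for a mutual-TLS listener.
write_client_properties() {
  bootstrap=$1
  truststore_pass=$2
  keystore_pass=$3
  cat <<EOF
bootstrap.servers=${bootstrap}
security.protocol=SSL
ssl.truststore.type=PKCS12
ssl.truststore.location=truststore.p12
ssl.truststore.password=${truststore_pass}
ssl.keystore.type=PKCS12
ssl.keystore.location=keystore.p12
ssl.keystore.password=${keystore_pass}
EOF
}
```

Redirect the output to a file such as `client.properties` (with restrictive permissions) and pass it to the console tools with `--producer.config client.properties` or `--consumer.config client.properties`.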
# Get external bootstrap address
kubectl get kafka my-cluster -n kafka -o jsonpath='{.status.listeners[?(@.name=="external")].bootstrapServers}'

# Get CA certificate for TLS
kubectl get secret my-cluster-cluster-ca-cert -n kafka -o jsonpath='{.data.ca\.crt}' | base64 -d > ca.crt

# Get client certificate (if using mutual TLS)
kubectl get secret producer-app -n kafka -o jsonpath='{.data.user\.crt}' | base64 -d > client.crt
kubectl get secret producer-app -n kafka -o jsonpath='{.data.user\.key}' | base64 -d > client.key

# Test connection with kafka-console-producer (local machine)
# truststore.jks and keystore.jks must first be built from the files above (for example with keytool)
kafka-console-producer.sh \
  --bootstrap-server <external-bootstrap-address>:9094 \
  --topic test-topic \
  --producer-property security.protocol=SSL \
  --producer-property ssl.truststore.location=truststore.jks \
  --producer-property ssl.truststore.password=<password> \
  --producer-property ssl.keystore.location=keystore.jks \
  --producer-property ssl.keystore.password=<password>

# Java client configuration example
# properties.put("bootstrap.servers", "<external-address>:9094");
# properties.put("security.protocol", "SSL");
# properties.put("ssl.truststore.location", "/path/to/truststore.jks");
# properties.put("ssl.truststore.password", "password");
# properties.put("ssl.keystore.location", "/path/to/keystore.jks");
# properties.put("ssl.keystore.password", "password");

- Step 18
Upgrade Kafka Version
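Strimzi performs rolling upgrades, which keep the cluster available as long as topics are replicated and min.insync.replicas leaves room for one broker to be offline. Update the Kafka version in your Kafka resource and apply it. The operator restarts brokers one at a time. Raise inter.broker.protocol.version only after all brokers run the new version. Always check the Strimzi documentation for version compatibility and the supported upgrade path.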
# Check current Kafka version
kubectl get kafka my-cluster -n kafka -o jsonpath='{.spec.kafka.version}'

# Edit Kafka resource to update version
kubectl edit kafka my-cluster -n kafka
# Change spec.kafka.version from 3.8.0 to 3.9.0
# Keep spec.kafka.config.inter.broker.protocol.version at the old value for now

# Or patch directly (new binaries, old protocol version)
kubectl patch kafka my-cluster -n kafka --type=merge -p '{
  "spec": {
    "kafka": {
      "version": "3.9.0",
      "config": {
        "inter.broker.protocol.version": "3.8"
      }
    }
  }
}'

# Monitor upgrade progress
kubectl get pods -l strimzi.io/cluster=my-cluster -n kafka -w

# Check Kafka logs during upgrade
kubectl logs my-cluster-kafka-0 -n kafka -f

# Verify upgrade completed
kubectl get kafka my-cluster -n kafka -o yaml | grep version:

# After the upgrade is verified, raise the protocol version
# (this triggers a second rolling restart)
kubectl patch kafka my-cluster -n kafka --type=merge -p '{
  "spec": {
    "kafka": {
      "config": {
        "inter.broker.protocol.version": "3.9"
      }
    }
  }
}'

⚠ Heads up: Always upgrade Strimzi operator first, then Kafka version. Test upgrades in a non-production environment. Some version jumps require incremental upgrades (e.g., 3.6 → 3.7 → 3.8).

- Step 19
Backup and Disaster Recovery
Implement regular backups of Kafka topic data and cluster metadata. Use MirrorMaker2 for active-passive or active-active replication to a disaster recovery cluster. Back up PersistentVolumes and ZooKeeper state. Test restore procedures regularly.
# Deploy MirrorMaker2 for cross-cluster replication
kubectl apply -f - <<EOF
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: disaster-recovery-mirror
  namespace: kafka
spec:
  version: 3.9.0
  replicas: 1
  connectCluster: "target"
  clusters:
    - alias: "source"
      bootstrapServers: my-cluster-kafka-bootstrap:9092
    - alias: "target"
      bootstrapServers: dr-cluster-kafka-bootstrap:9092
  mirrors:
    - sourceCluster: "source"
      targetCluster: "target"
      sourceConnector:
        config:
          replication.factor: 3
          offset-syncs.topic.replication.factor: 3
          sync.topic.acls.enabled: "false"
      heartbeatConnector:
        config:
          heartbeats.topic.replication.factor: 3
      checkpointConnector:
        config:
          checkpoints.topic.replication.factor: 3
      topicsPattern: ".*"
      groupsPattern: ".*"
EOF

# Backup PersistentVolumes using Velero
# (a provider plugin image is required; pick the version matching your Velero release)
kubectl create ns velero
velero install --provider aws \
  --plugins velero/velero-plugin-for-aws:<plugin-version> \
  --bucket kafka-backups \
  --secret-file ./credentials-velero

# Create backup schedule
velero schedule create kafka-daily --schedule="0 2 * * *" --include-namespaces kafka

# Manual backup
velero backup create kafka-backup-$(date +%Y%m%d) --include-namespaces kafka

# List backups
velero backup get

# Restore from backup
velero restore create --from-backup kafka-backup-20260529

- Step 20
Security Hardening
Enable TLS for all listeners, use mutual TLS or SCRAM-SHA-512 authentication, implement network policies, enable authorization with ACLs, and use Pod Security Standards. Regular security audits and updates are essential for production clusters.
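# Network Policy to restrict Kafka access
# Note: Strimzi also generates its own NetworkPolicy for each cluster, and
# policies are additive. To narrow access to a listener, also set
# networkPolicyPeers on that listener in the Kafka resource.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: kafka-network-policy
  namespace: kafka
spec:
  podSelector:
    matchLabels:
      strimzi.io/name: my-cluster-kafka   # broker pods only
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: applications
      ports:
        - protocol: TCP
          port: 9092
        - protocol: TCP
          port: 9093
  egress:
    # Inter-broker traffic (control plane 9090, replication 9091)
    - to:
        - podSelector:
            matchLabels:
              strimzi.io/name: my-cluster-kafka
      ports:
        - protocol: TCP
          port: 9090
        - protocol: TCP
          port: 9091
    - to:
        - podSelector:
            matchLabels:
              strimzi.io/name: my-cluster-zookeeper
      ports:
        - protocol: TCP
          port: 2181
    # DNS lookups
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
# Pod Security Context for Kafka
template:
  pod:
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000
      fsGroup: 1000
      seccompProfile:
        type: RuntimeDefault
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          # Match broker pods only, so ZooKeeper and Entity Operator pods
          # are not forced onto separate nodes as well
          - labelSelector:
              matchExpressions:
                - key: strimzi.io/name
                  operator: In
                  values:
                    - my-cluster-kafka
            topologyKey: kubernetes.io/hostname

- Step 21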
Troubleshooting Common Issues
Common problems include pod restarts due to resource constraints, topic creation failures, authentication errors, and network connectivity issues. Always check operator logs first, then individual component logs. Describe resources to see events and status conditions.
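A quick first check is whether the Kafka resource itself reports Ready, since the status conditions usually point at the failing component. The sketch below wraps that lookup; the function names `kafka_ready_status` and `check_kafka` are hypothetical, and the default cluster name and namespace are the ones used in this guide. An empty result is treated as "unknown", which can happen while the cluster is still being created or when only a NotReady condition is present.

```shell
# Prints the status of the Ready condition ("True", or empty if absent).
kafka_ready_status() {
  cluster=${1:-my-cluster}
  namespace=${2:-kafka}
  kubectl get kafka "$cluster" -n "$namespace" \
    -o 'jsonpath={.status.conditions[?(@.type=="Ready")].status}'
}

# Prints a one-line verdict and returns non-zero when the cluster is not ready.
check_kafka() {
  status=$(kafka_ready_status "$@")
  if [ "$status" = "True" ]; then
    echo "ready"
  else
    echo "not ready (Ready=${status:-unknown}); inspect operator logs and events"
    return 1
  fi
}
```

Because `check_kafka` returns a non-zero exit code on failure, it can gate later steps in a script, for example `check_kafka my-cluster kafka || kubectl describe kafka my-cluster -n kafka`.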
# Check operator logs
kubectl logs -l name=strimzi-cluster-operator -n kafka --tail=100

# Check Kafka broker logs
kubectl logs my-cluster-kafka-0 -n kafka --tail=100

# Check ZooKeeper logs
kubectl logs my-cluster-zookeeper-0 -n kafka --tail=100

# Check entity operator logs (Topic/User operator)
kubectl logs -l strimzi.io/name=my-cluster-entity-operator -n kafka -c topic-operator --tail=100
kubectl logs -l strimzi.io/name=my-cluster-entity-operator -n kafka -c user-operator --tail=100

# Describe Kafka resource for status
kubectl describe kafka my-cluster -n kafka

# Check resource events
kubectl get events -n kafka --sort-by='.lastTimestamp'

# Verify storage is provisioned
kubectl get pvc -n kafka

# Check pod resource usage
kubectl top pods -n kafka

# Test network connectivity between pods
# (nc and nslookup may not be present in the Kafka image; use a debug pod if they are missing)
kubectl exec -it my-cluster-kafka-0 -n kafka -- nc -zv my-cluster-zookeeper-client 2181

# Verify DNS resolution
kubectl exec -it my-cluster-kafka-0 -n kafka -- nslookup my-cluster-kafka-bootstrap

# Check Kafka topic status
kubectl exec -it my-cluster-kafka-0 -n kafka -- bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe --topic <topic-name>

# View under-replicated partitions
kubectl exec -it my-cluster-kafka-0 -n kafka -- bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe --under-replicated-partitions

- Step 22
Performance Tuning
Optimize Kafka performance by tuning JVM settings, adjusting broker configurations, sizing persistent storage appropriately, and configuring producer/consumer clients correctly. Monitor key metrics like throughput, latency, and disk I/O to identify bottlenecks.
# Performance-tuned Kafka configuration
kafka:
  jvmOptions:
    "-Xms": 8192m
    "-Xmx": 8192m
    # -XX values must be strings
    "-XX":
      UseG1GC: "true"
      MaxGCPauseMillis: "20"
      InitiatingHeapOccupancyPercent: "35"
      G1HeapRegionSize: "16m"
  config:
    # Network tuning
    num.network.threads: 8
    num.io.threads: 16
    socket.send.buffer.bytes: 1048576
    socket.receive.buffer.bytes: 1048576
    socket.request.max.bytes: 104857600
    # Log tuning
    num.partitions: 16
    log.segment.bytes: 1073741824
    log.retention.check.interval.ms: 300000
    log.flush.interval.messages: 10000
    # Replication tuning
    replica.fetch.max.bytes: 1048576
    replica.lag.time.max.ms: 30000
    # Compression
    compression.type: lz4
  resources:
    requests:
      memory: 16Gi
      cpu: "4"
    limits:
      memory: 16Gi
      cpu: "8"
  # Use high-performance storage class
  storage:
    type: persistent-claim
    size: 500Gi
    class: high-iops-ssd

- Step 23
Deploying in KRaft Mode (ZooKeeper-less)
Kafka 3.3+ supports KRaft mode (Kafka Raft metadata mode), which removes the ZooKeeper dependency. Strimzi supports KRaft deployments (behind a feature gate in older releases). A KRaft cluster is declared with one or more KafkaNodePool resources, which carry the replica count, roles, and storage, plus the strimzi.io/kraft and strimzi.io/node-pools annotations on the Kafka resource. This simplifies the architecture and improves metadata scalability. For new deployments, consider starting with KRaft mode.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: dual-role
  namespace: kafka
  labels:
    strimzi.io/cluster: kraft-cluster
spec:
  replicas: 3
  roles:
    - controller
    - broker
  storage:
    type: persistent-claim
    size: 100Gi
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kraft-cluster
  namespace: kafka
  annotations:
    strimzi.io/kraft: enabled
    strimzi.io/node-pools: enabled
spec:
  kafka:
    version: 3.9.0
    # KRaft-specific: replicas and storage live in the node pool,
    # and no zookeeper section is needed
    metadataVersion: 3.9-IV0
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
  entityOperator:
    topicOperator: {}
    userOperator: {}

⚠ Heads up: Migration from ZooKeeper to KRaft is one-way and requires careful planning. Test thoroughly in non-production environments. Not all Strimzi features are available in KRaft mode yet - check documentation for current limitations.