Troubleshooting Guide

This guide helps you diagnose and resolve common issues you might encounter when self-hosting Traceloop.

Need immediate assistance? Schedule a support call with our team.

Connection Issues

ClickHouse Connection Failures

The Traceloop Helm chart expects specific configuration for external ClickHouse connections:

Required Configuration:

  • Fill out values-external-clickhouse.yaml with your ClickHouse host, port, and connection details
  • Ensure traceloop-clickhouse-secret exists with required credentials:
    • CLICKHOUSE_USERNAME
    • CLICKHOUSE_PASSWORD
    • Note: This secret is automatically created if using Traceloop Terraform, otherwise create it manually
# Verify the ClickHouse secret exists
kubectl get secret traceloop-clickhouse-secret -n traceloop

# Check secret keys (without exposing values)
kubectl describe secret traceloop-clickhouse-secret -n traceloop

Common issues and their solutions when connecting to ClickHouse:

  • Verify the host and port are correct
  • Ensure the database exists and the user has appropriate permissions
  • Check if SSL is required (secure: true)
  • Verify network connectivity and firewall rules
  • Ensure traceloop-clickhouse-secret contains correct credentials
  • Verify values-external-clickhouse.yaml has correct host and configuration

Kafka Connection Issues

The Traceloop Helm chart expects specific configuration for external Kafka connections:

Required Configuration:

  • Fill out values-external-kafka.yaml with your Kafka broker addresses, connection details, and credentials
  • Ensure the following Kafka topics exist: traces, spans, and metrics
    • Note: These topics are automatically created if using Traceloop Terraform, otherwise create them manually
# Verify your Kafka configuration is properly applied
kubectl get configmap -n traceloop | grep kafka

If you’re having trouble connecting to Kafka:

  • Confirm broker addresses are accessible from the Kubernetes cluster
  • Ensure required topics exist: traces, spans, and metrics
  • For Confluent Cloud:
    • Ensure SASL authentication is properly configured
    • Verify SSL is enabled
  • Check topic creation permissions
  • Verify network policies allow Kafka connectivity
  • Verify values-external-kafka.yaml has correct broker addresses, credentials, and configuration

PostgreSQL Connection Problems

The Traceloop Helm chart expects specific configuration for external PostgreSQL connections:

Required Configuration:

  • Fill out values-external-postgres.yaml with your PostgreSQL address and connection details
  • Ensure traceloop-postgres-secret exists with required credentials:
    • POSTGRES_DATABASE_USERNAME
    • POSTGRES_DATABASE_PASSWORD
    • Note: This secret is automatically created if using Traceloop Terraform, otherwise create it manually
# Verify the PostgreSQL secret exists
kubectl get secret traceloop-postgres-secret -n traceloop

# Check secret keys (without exposing values)
kubectl describe secret traceloop-postgres-secret -n traceloop

Common PostgreSQL connectivity issues:

  • Verify database exists and is accessible
  • Check user permissions (needs ability to create schemas and tables)
  • For SSL connections, ensure proper SSL mode is set
  • Verify PostgreSQL version compatibility (9.6 or higher)
  • Ensure traceloop-postgres-secret contains correct credentials
  • Verify values-external-postgres.yaml has correct database address and configuration

Kubernetes Deployment Issues

Pod Startup Failures

If pods aren’t starting properly:

# Check pod status
kubectl get pods -n traceloop

# Check pod events
kubectl describe pod -n traceloop <pod-name>

# Check pod logs
kubectl logs -n traceloop <pod-name>

Common solutions:

  • Verify resource requests and limits are appropriate
  • Ensure all secrets are properly created
  • Check container image pull permissions

Ingress Issues

Our Ingress Architecture: Traceloop uses a custom ingress setup rather than traditional Kubernetes ingress controllers. A load balancer handles SSL termination and forwards HTTP traffic directly to port 30800 on the Kubernetes cluster. This port maps to a Kong gateway pod, which serves as the unified ingress point for all HTTP endpoints.

If you can’t access Traceloop through your load balancer:

Load Balancer Configuration:

  • Verify the load balancer is properly provisioned
  • Check DNS records point to the correct load balancer endpoint
  • Verify SSL certificate is properly attached to load balancer for HTTPS termination
  • Ensure security groups allow traffic from load balancer to Kubernetes cluster on port 30800
  • Check load balancer target group health - targets should be healthy on port 30800

Kong Gateway Pod Health:

  • Verify Kong gateway pod is running and healthy
  • Check that Kong is listening on the correct port (default mapped to 30800)
  • Ensure Kong domain configuration is correct in values-customer.yaml - verify kong-gateway.kong.domain is properly filled
# Check Kong gateway pod status
kubectl get pods -n traceloop | grep kong

# Check Kong pod logs
kubectl logs -n traceloop -l app=kong

# Verify port mapping and service
kubectl get svc -n traceloop

Image Pull Failures

Traceloop images are stored on Docker Hub and require authentication. A regcred secret is provided by Traceloop and must be manually created in the traceloop namespace.

Setup Requirements:

  • Create the regcred secret in the traceloop namespace using credentials provided by Traceloop
  • Ensure the secret is properly referenced in your Helm values
# Verify the regcred secret exists
kubectl get secret regcred -n traceloop

# Check secret details (without exposing credentials)
kubectl describe secret regcred -n traceloop

Common Issues:

  • regcred secret not created in the traceloop namespace
  • Incorrect Docker Hub credentials in the secret
  • Secret not properly referenced in pod specifications
  • Network connectivity issues to Docker Hub

Common Error Messages

”Error: connection refused”

Check component connectivity:

# Test ClickHouse connectivity
kubectl exec -it -n traceloop deploy/traceloop-api -- nc -vz your-clickhouse-host 9000

# Test Kafka connectivity
kubectl exec -it -n traceloop deploy/traceloop-api -- nc -vz your-kafka-broker 9092

# Test PostgreSQL connectivity
kubectl exec -it -n traceloop deploy/traceloop-api -- nc -vz your-postgres-host 5432

“Error: authentication failed”

Traceloop uses PropelAuth as the authentication provider. Ensure proper configuration:

Required Configuration:

  • Fill out values-customer.yaml with PropelAuth settings:
    customerConfig:
      propelauth:
        authURL: "traceloop-provided"
    
    customerSecret:
      propelauth:
        verifierKey: "traceloop-provided"
        apiKey: "traceloop-provided"
    

Authentication issues:

  • Ensure service accounts have necessary permissions
  • Verify SSL/TLS configuration if required
  • Ensure PropelAuth configuration is correct in values-customer.yaml
  • Verify PropelAuth secrets contain the correct keys provided by Traceloop

”Error: insufficient permissions”

Permission-related problems:

  • Check RBAC configuration
  • Verify service account permissions
  • Ensure necessary Kubernetes roles are created
  • Check database user permissions

Quick Validation Checklist

  1. Component Connectivity

    • All infrastructure components are reachable
    • Proper ports are open
    • Network policies allow required traffic
  2. Authentication & Permissions

    • Database credentials are correct
    • Kubernetes secrets exist and are mounted
    • Service accounts have required permissions
  3. Infrastructure Health

    • Kubernetes nodes are healthy
    • Sufficient resources are available
    • Storage is properly configured
  4. Logging & Monitoring

    • Check pod logs: kubectl logs -n traceloop -l app=traceloop
    • Monitor resource usage
    • Review infrastructure metrics

Still having issues? We’re here to help: - Schedule a support call for personalized assistance - Join our community Slack for discussions and updates