Quick Start Guide

Purpose

The Replex Kubernetes Agent collects metadata from the local cluster via the Kubernetes API and metrics from a metric provider (a local Prometheus instance, Datadog, Stackdriver or Instana). It sends this information to the Replex pushgateway, which stores it for further use.

Requirements

  • A Kubernetes cluster to install the agent.

  • A metric provider, one of:

    • A Prometheus instance that is running in the cluster and accessible via URL.

    • A Thanos instance configured with a Querier component.

    • A Datadog account with API key and Application key provided.

    • A Stackdriver account.

    • An Instana account with an API token.

  • Helm 3 for the installation.

  • A Replex token.

Required parameters for the installation

If only the Replex agent is to be used for data collection in the clusters, it must be installed in each individual cluster. The kubernetesInfoProvider parameter can be left at the default value of kubernetes.

The helm installation requires a few mandatory parameters:

| Parameter | Description |
| --- | --- |
| cluster.id | A string that uniquely identifies the cluster. It distinguishes clusters that might share a name: for example, if a cluster called "development" is destroyed and later replaced by a new cluster also called "development", the two should have different IDs. |
| cluster.name | A human-readable name for the cluster. |
| replex.token | A token that is provided by Replex and is used by the pushgateway to authenticate requests from the agent. If the agent is installed in multiple clusters, the same token can be used for all deployments. |
| metrics.provider | The metric provider to be used. Can be prometheus, thanos, stackdriver, instana or datadog. |
| (Optional) pushgateway.url | |
| (Optional) onlyUseReadyNodes | Consider nodes without 'Ready' status as not running at all. |
| (Optional) useControlPlaneCost | Track costs of the Kubernetes Control Plane. |
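Because cluster.id has to stay unique even when a cluster name is reused, one option is to derive it from a value the cluster generates itself. The sketch below reads the UID of the kube-system namespace, which Kubernetes assigns when the namespace is created. This is a common convention and not something Replex prescribes; the function name cluster_uid is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Replex tooling): print the UID of
# the kube-system namespace. Kubernetes generates this UID when the
# namespace is created, so a rebuilt cluster that reuses an old name
# still gets a different value.
cluster_uid() {
  kubectl get namespace kube-system -o jsonpath='{.metadata.uid}'
}
```

The printed value can be copied into values.yaml as cluster.id, or passed on the command line with --set cluster.id="$(cluster_uid)".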

Metric Provider Parameters

  • prometheus or thanos:

    | Parameter | Description | Default |
    | --- | --- | --- |
    | prometheus.url | | |
    | prometheus.nodeLabel | Sets the value which represents the 'node label'. This label is usually node or instance. | node |
    | prometheus.containerLabel | Sets the value which represents the 'container label'. This label is usually container or container_name. | container |
    | prometheus.podLabel | Sets the value which represents the 'pod label'. This label is usually pod or pod_name. | pod |

    The prometheus.*Label parameters override the default label names to match your Prometheus installation. Set each one to the label that carries the actual entity name in Prometheus time series such as container_cpu_usage_seconds_total.

  • datadog:

    | Parameter | Description | Default |
    | --- | --- | --- |
    | datadog.apiKey | Your Datadog API Key. | |
    | datadog.applicationKey | Your Datadog Application Key. | |
    | (Optional) datadog.site | Either com if you are on Datadog US site or eu if you are on Datadog EU site. | com |

    Note: Retrieve your Datadog keys from the Datadog settings page.

  • stackdriver:

    | Parameter | Description | Default |
    | --- | --- | --- |
    | stackdriver.projectId | Your GCP project ID. The agent will authenticate with Google using the service account provided by the environment. | |

  • instana:

    | Parameter | Description | Default |
    | --- | --- | --- |
    | instana.baseUrl | | |
    | instana.apiToken | Valid Instana API token. | |
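For the prometheus and thanos providers, it can help to check which label names your time series actually carry before setting the prometheus.*Label parameters. The sketch below asks the Prometheus HTTP API for the label names present on one metric. It assumes a Prometheus (or Thanos Querier) version whose /api/v1/labels endpoint accepts a match[] selector; the function name prometheus_series_labels is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the label names found on a metric via the
# Prometheus HTTP API (/api/v1/labels with a match[] selector).
# $1 = Prometheus base URL
# $2 = metric name (defaults to the cAdvisor CPU counter named in this guide)
prometheus_series_labels() {
  prom_url="$1"
  metric="${2:-container_cpu_usage_seconds_total}"
  # -G sends the url-encoded data as a query string on a GET request.
  curl -sG "${prom_url}/api/v1/labels" --data-urlencode "match[]=${metric}"
}
```

If the JSON response lists container_name and pod_name instead of container and pod, set prometheus.containerLabel and prometheus.podLabel to those names.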

Instana Single-Agent Setup

This section is relevant if you already use Instana to monitor your Kubernetes clusters. With a single deployment of the Replex agent, Replex can pull all metrics and Kubernetes information for all your clusters from the Instana API, so you don't need to install the agent on every cluster whose costs you want to monitor.

To activate this feature, set the Helm parameter kubernetesInfoProvider to instana. The Instana base URL (instana.baseUrl) and API token (instana.apiToken) are also required.

For the single-agent setup, do not set the cluster.id and cluster.name parameters, since the agent automatically uses the cluster IDs from Instana.
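A values file for the single-agent setup might look like the sketch below. It follows the parameters named in this section and, as an assumption, also sets metrics.provider to instana so that metrics come from the same API. The file name is arbitrary and the placeholders must be replaced with real values.

```shell
#!/bin/sh
# Write a values file for the Instana single-agent setup.
# Assumption: metrics.provider is set to instana as well.
# cluster.id and cluster.name are deliberately left out, because the
# agent takes the cluster IDs from Instana in this mode.
cat > values-instana-single-agent.yaml <<'EOF'
kubernetesInfoProvider: instana
replex:
  token: <replex-token>
metrics:
  provider: instana
instana:
  baseUrl: <base-url>
  apiToken: <api-token>
EOF
```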

Installation

  1. Add the Replex Helm repository

    helm repo add replex https://registry.replex.io/chartrepo/public
  2. Create a namespace to install the agent in

    kubectl create namespace replex-k8s-agent
  3. Create a file called values.yaml to specify Helm parameters. The file should look like this:

    • prometheus or thanos:

      cluster:
        id: <cluster-id>
        name: <cluster-name>
      replex:
        token: <replex-token>
      metrics:
        provider: prometheus
      prometheus:
        url: <prometheus-url>
        containerLabel: <prometheus-container-label>
        podLabel: <prometheus-pod-label>
        nodeLabel: <prometheus-node-label>
    • datadog:

      cluster:
        id: <cluster-id>
        name: <cluster-name>
      replex:
        token: <replex-token>
      metrics:
        provider: datadog
      datadog:
        apiKey: <datadog-api-key>
        applicationKey: <datadog-application-key>
        site: <datadog-site>
    • stackdriver:

      cluster:
        id: <cluster-id>
        name: <cluster-name>
      replex:
        token: <replex-token>
      metrics:
        provider: stackdriver
      stackdriver:
        projectId: <gcp-project-id>
    • instana:

      cluster:
        id: <cluster-id>
        name: <cluster-name>
      replex:
        token: <replex-token>
      metrics:
        provider: instana
      instana:
        baseUrl: <base-url>
        apiToken: <api-token>
  4. Installation with Helm

    helm install <release-name> replex/replex-k8s-agent --namespace replex-k8s-agent -f values.yaml

    Here <release-name> is an arbitrary string that identifies the Helm installation, e.g. replex-agent.

  5. Installation without Helm

    You will still need Helm to render the template locally. However, the installation itself will be done with kubectl directly, not with Helm:

    helm template replex replex/replex-k8s-agent --namespace replex-k8s-agent -f values.yaml | kubectl apply -f -

    After these steps the agent will start sending data to the pushgateway.
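To confirm that the agent actually came up after either installation path, one option is to wait for its pod to become Ready. This is a minimal sketch that assumes the namespace from step 2 and that nothing else runs in it; the function name wait_for_agent is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: block until every pod in the agent namespace
# reports the Ready condition, or fail after two minutes.
# $1 = namespace (defaults to the one created in step 2)
wait_for_agent() {
  ns="${1:-replex-k8s-agent}"
  kubectl -n "$ns" wait --for=condition=Ready pod --all --timeout=120s
}
```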

Updating

To update the agent, follow these steps:

  1. Update the Helm repo.

    helm repo update
  2. Upgrade the replex release.

    helm upgrade <release-name> replex/replex-k8s-agent --namespace <namespace> -f values.yaml

Logging

The replex-kubernetes-agent runs as a pod in the namespace it was installed in and its logs can be consulted using kubectl.

For example, kubectl -n replex-k8s-agent logs <pod-name>
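If you do not know the pod name yet, it can be looked up first. The sketch below does both steps in one go; it assumes the namespace contains only the agent pod, and the function name agent_logs is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the last 100 log lines of the first pod
# found in the agent namespace.
# Assumption: the namespace holds only the agent pod.
# $1 = namespace (defaults to replex-k8s-agent)
agent_logs() {
  ns="${1:-replex-k8s-agent}"
  # "-o name" prints entries such as pod/<pod-name>, which "kubectl logs" accepts.
  pod="$(kubectl -n "$ns" get pods -o name | head -n 1)"
  kubectl -n "$ns" logs "$pod" --tail=100
}
```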

Retry policy

The Replex agent retries metric pushes that fail. The retry policy is enabled by default: if a push to the pushgateway fails, the metrics are cached on disk in the directory given by persistentVolume.mountPath. If retry.diskCache is false, the metrics are kept in the agent's memory instead. retry.intervalSeconds sets how often the agent tries to resend metrics from the cache; the default is 300 seconds (5 minutes).

The following three parameters configure the retry policy:

  • (Optional) retry.intervalSeconds: Interval, in seconds, between attempts to resend metrics that previously failed to push.

  • (Optional) retry.diskCache: Cache failed metrics on disk.

  • (Optional) persistentVolume.mountPath: Path where failed metrics are stored when retry.diskCache is true.
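As an illustration, the three retry parameters could be kept in a small override file like the one below. The 600-second interval is an arbitrary example value and the file name is hypothetical; the other two values match the chart defaults.

```shell
#!/bin/sh
# Write an override file for the retry policy.
# intervalSeconds: 600 is an illustrative value, not a recommendation.
# diskCache and mountPath are shown with their chart defaults.
cat > values-retry.yaml <<'EOF'
retry:
  intervalSeconds: 600
  diskCache: true
persistentVolume:
  mountPath: /data/metrics
EOF
```

Helm accepts several -f flags, with later files taking precedence, so the override can be applied on top of the main file with -f values.yaml -f values-retry.yaml.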

Chart Values

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| cloudProviderOverride | string | "" | force agent to use specified cloud provider |
| cluster.id | string | "" | Unique cluster identifier |
| cluster.name | string | "" | Cluster name displayed in dashboard |
| datadog.apiKey | string | "" | Your Datadog API Key |
| datadog.applicationKey | string | "" | Your Datadog Application Key |
| datadog.site | string | "com" | Either com if you are on Datadog US site or eu if you are on Datadog EU site. |
| extraInitContainers | object | {} | Add init containers to replex agent container |
| extraVolumeMounts | object | {} | Add extra volumeMounts to replex agent container |
| extraVolumes | list | [] | Add extra volumes to replex agent container |
| image.pullPolicy | string | "Always" | Image pull policy |
| image.repository | string | "registry.replex.io/public/replex-k8s-agent" | Image repository |
| image.tag | string | "" | Override AppVersion with specific tag from image repository |
| instana.apiToken | string | "" | Valid Instana API token |
| instana.baseUrl | string | "" | |
| kubernetesInfoProvider | string | "kubernetes" | Specify Kubernetes info provider. Options are 'kubernetes' or 'instana' |
| metrics.filesystem | string | "cadvisor" | |
| metrics.provider | string | "" | Metrics provider to use. |
| nodeSelector | object | {} | Pod node selector Key-Value pair |
| onlyUseReadyNodes | bool | false | Consider nodes without 'Ready' status as not running at all. |
| persistentVolume.mountPath | string | "/data/metrics" | Persistent volume mount path |
| persistentVolume.size | string | "10Gi" | Size of agent persistent volume |
| persistentVolume.storageClassName | string | "" | name of the storage class to configure the PVC on |
| prometheus.bearerToken | string | "" | Bearer token for prometheus server requests authentication. |
| prometheus.containerLabel | string | "container" | The label representing "container name" on prometheus time series. |
| prometheus.nodeLabel | string | "node" | The label representing "node name" on prometheus time series. |
| prometheus.podLabel | string | "pod" | The label representing "pod name" on prometheus time series. |
| prometheus.url | string | "" | URL of the Prometheus instance |
| pushgateway.url | string | "" | |
| replex.token | string | "" | Agent authentication token. |
| resources.requests.cpu | string | "50m" | Specify the cpu units requests of the agent container |
| resources.requests.memory | string | "100Mi" | Specify the memory bytes requests of the agent container |
| retry.diskCache | bool | true | |
| retry.intervalSeconds | int | 300 | |
| securityContext | object | {} | Deployment security context |
| sslCertificate | string | "" | SSL certificate string. Can be used to add a custom ssl certificate to the agent. |
| stackdriver.projectId | string | "" | Your GCP project ID |
| tokenSecret.create | bool | true | Set to 'false' to skip creation of Secret object |
| tokenSecret.key | string | "" | Override key in token Secret object (for custom secret keys) |
| tokenSecret.name | string | "" | Override name of token Secret object |
| tolerations | list | [] | array of tolerations for the pod scheduling |
| useControlPlaneCost | string | "" | (bool) Track costs of the Kubernetes Control Plane. |
