Quick Start Guide

Purpose

The Replex Kubernetes Agent collects metadata from the local cluster via the Kubernetes API and metrics from a metric provider (a local Prometheus instance, Thanos, Datadog, Stackdriver or Instana). It sends this information to the Replex pushgateway, which stores it for further use.

Requirements

  • A Kubernetes cluster to install the agent in.
  • A metric provider, one of:
    • A Prometheus instance that is running in the cluster and accessible via URL.
    • A Thanos instance configured with a Querier component.
    • A Datadog account with an API key and an Application key.
    • A Stackdriver account.
    • An Instana account with an API token.
  • Helm 3 for the installation.
  • A Replex token.

Required parameters for the installation

If the Replex agent is the only source of data collection for your clusters, it must be installed in each individual cluster. In that case the kubernetesInfoProvider parameter can be left at its default value of kubernetes.
The Helm installation requires a few mandatory parameters. Parameters marked (Optional) can be omitted:
| Parameter | Description |
|---|---|
| cluster.id | A string that uniquely identifies the cluster. It distinguishes two clusters that might have the same name. For example, if a cluster called "development" is destroyed and later replaced by a new cluster that is also called "development", the two should have different IDs. |
| cluster.name | A human-readable name for the cluster. |
| replex.token | A token that is provided by Replex and used by the pushgateway to authenticate requests from the agent. If the agent is installed in multiple clusters, the same token can be used for all deployments. |
| metrics.provider | The metric provider to be used. One of prometheus, thanos, stackdriver, instana or datadog. |
| (Optional) pushgateway.url | The full URL to the Replex pushgateway. Format: https://pushgateway.client.com/push (note the /push path). |
| (Optional) onlyUseReadyNodes | Consider nodes without 'Ready' status as not running at all. |
| (Optional) useControlPlaneCost | Track costs of the Kubernetes Control Plane. |
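
Because cluster.id has to stay unique even when a cluster name is reused, one option is to derive it from the name plus a random suffix. The sketch below is only an illustration: make_cluster_id is a hypothetical helper, not part of the Replex tooling, and any other scheme that yields a unique string works equally well.

```shell
# Hypothetical helper (not part of the Replex chart or agent):
# builds a cluster ID from the cluster name plus 8 random hex characters.
make_cluster_id() {
  name="$1"
  # Read 4 random bytes and print them as hex without spaces or newlines.
  suffix="$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')"
  printf '%s-%s\n' "$name" "$suffix"
}

# Example: two clusters that share the name "development" get different IDs.
make_cluster_id development
make_cluster_id development
```

The generated value can then be used for cluster.id, while cluster.name keeps the readable name.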

Metric Provider Parameters

  • prometheus or thanos:
    | Parameter | Description | Default |
    |---|---|---|
    | prometheus.url | The API URL of the local Prometheus instance, for example http://prometheus-server.monitoring.svc.cluster.local:9090 | |
    | prometheus.nodeLabel | The name of the label that identifies the node. This label is usually node or instance. | node |
    | prometheus.containerLabel | The name of the label that identifies the container. This label is usually container or container_name. | container |
    | prometheus.podLabel | The name of the label that identifies the pod. This label is usually pod or pod_name. | pod |
    The prometheus.*Label parameters override the default label names for Prometheus installations that use different ones. Each value must be the label that holds the actual entity name in a Prometheus time series such as container_cpu_usage_seconds_total.
  • datadog:
    | Parameter | Description | Default |
    |---|---|---|
    | datadog.apiKey | Your Datadog API Key. | |
    | datadog.applicationKey | Your Datadog Application Key. | |
    | (Optional) datadog.site | Either com if you are on the Datadog US site or eu if you are on the Datadog EU site. | com |
    Note: Retrieve your Datadog keys from the Datadog settings page.
  • stackdriver:
    | Parameter | Description | Default |
    |---|---|---|
    | stackdriver.projectId | Your GCP project ID. The agent will authenticate with Google using the service account provided by the environment. | |
  • instana:
    | Parameter | Description | Default |
    |---|---|---|
    | instana.baseUrl | The base URL of a tenant unit, e.g. https://test-example.instana.io. | |
    | instana.apiToken | A valid Instana API token. | |
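
To decide which values to use for the prometheus.*Label parameters, it helps to check which label names actually occur on your time series. The following is a minimal sketch under stated assumptions: first_matching_label is a hypothetical helper, and the label list is hard-coded here. In practice the list would come from your own Prometheus instance, which is not shown.

```shell
# Hypothetical helper: reads label names from stdin (one per line) and prints
# the first candidate, in argument order, that is present in that list.
first_matching_label() {
  labels="$(cat)"
  for candidate in "$@"; do
    # -x matches whole lines only, -F treats the candidate as a fixed string.
    if printf '%s\n' "$labels" | grep -qxF "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example with an assumed label set from an older Prometheus setup.
printf 'container_name\npod_name\ninstance\n' \
  | first_matching_label container container_name
# prints: container_name
```

In this example the value to set for prometheus.containerLabel would be container_name instead of the default container.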

Instana Single-Agent Setup

This section is relevant if you already use Instana to monitor your Kubernetes clusters. With a single deployment of the Replex agent, Replex can pull all metrics and Kubernetes information for all your clusters from the Instana API. This way, you don't need to install the Replex agent on every cluster whose costs you want to monitor.
To activate this feature, set the Helm parameter kubernetesInfoProvider to instana. The Instana base URL (instana.baseUrl) and API token (instana.apiToken) are also required.
For the single-agent setup, do not set the cluster.id and cluster.name parameters, because the agent automatically uses the cluster IDs from Instana.
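
As a rough illustration of the single-agent setup, the snippet below writes a values file that follows the rules above. The file name values-instana-single.yaml is arbitrary, the placeholders in angle brackets must be replaced, and setting metrics.provider to instana is an assumption based on the statement that metrics are also pulled from the Instana API.

```shell
# Write a values file for the Instana single-agent setup.
# The file name is a hypothetical choice; any name works with "helm install -f".
cat > values-instana-single.yaml <<'EOF'
kubernetesInfoProvider: instana
replex:
  token: <replex-token>
metrics:
  provider: instana   # assumption: metrics are read from Instana as well
instana:
  baseUrl: <base-url>
  apiToken: <api-token>
# cluster.id and cluster.name are intentionally left out in this setup
EOF
```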

Installation

  1. Add the Replex Helm repository:
    helm repo add replex https://registry.replex.io/chartrepo/public
  2. Create a namespace to install the agent in:
    kubectl create namespace replex-k8s-agent
  3. Create a file called values.yaml to specify the Helm parameters. Depending on the metric provider, the file should look like this:
    • prometheus or thanos:
      cluster:
        id: <cluster-id>
        name: <cluster-name>
      replex:
        token: <replex-token>
      metrics:
        provider: prometheus  # or thanos
      prometheus:
        url: <prometheus-url>
        containerLabel: <prometheus-container-label>
        podLabel: <prometheus-pod-label>
        nodeLabel: <prometheus-node-label>
    • datadog:
      cluster:
        id: <cluster-id>
        name: <cluster-name>
      replex:
        token: <replex-token>
      metrics:
        provider: datadog
      datadog:
        apiKey: <datadog-api-key>
        applicationKey: <datadog-application-key>
        site: <datadog-site>
    • stackdriver:
      cluster:
        id: <cluster-id>
        name: <cluster-name>
      replex:
        token: <replex-token>
      metrics:
        provider: stackdriver
      stackdriver:
        projectId: <gcp-project-id>
    • instana:
      cluster:
        id: <cluster-id>
        name: <cluster-name>
      replex:
        token: <replex-token>
      metrics:
        provider: instana
      instana:
        baseUrl: <base-url>
        apiToken: <api-token>
  4. Install with Helm:
    helm install <release-name> replex/replex-k8s-agent --namespace replex-k8s-agent -f values.yaml
    Here <release-name> is an arbitrary string that identifies the Helm installation, e.g. replex-agent.
  5. Alternatively, install without Helm. You still need Helm to render the template locally, but the installation itself is done with kubectl directly:
    helm template replex replex/replex-k8s-agent --namespace replex-k8s-agent -f values.yaml | kubectl apply -f -
After these steps the agent will start sending data to the pushgateway.
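
Before running the install command, it can be useful to confirm that values.yaml contains the mandatory parameters. The sketch below is a rough textual check only, not a YAML parser: check_values is a hypothetical helper that looks for the key names at the start of a line and does not verify nesting or values.

```shell
# Hypothetical helper: reports mandatory keys that do not appear in the file.
# Returns 0 if all keys were found, 1 otherwise.
check_values() {
  file="$1"
  missing=0
  for key in 'cluster:' 'id:' 'name:' 'replex:' 'token:' 'metrics:' 'provider:'; do
    # Match the key at the start of a line, allowing leading indentation.
    if ! grep -Eq "^[[:space:]]*${key}" "$file"; then
      echo "missing key: ${key%:}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (not executed here):
#   check_values values.yaml && helm install <release-name> replex/replex-k8s-agent \
#     --namespace replex-k8s-agent -f values.yaml
```

Note that for the Instana single-agent setup the cluster keys are deliberately absent, so this check does not apply there.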

Updating

To update the agent, follow these steps:
  1. Update the Helm repo:
    helm repo update
  2. Upgrade the Replex release:
    helm upgrade <release-name> replex/replex-k8s-agent --namespace <namespace> -f values.yaml

Logging

The replex-kubernetes-agent runs as a pod in the namespace it was installed in, and its logs can be viewed with kubectl. For example:
    kubectl -n replex-k8s-agent logs <pod-name>

Retry policy

The Replex agent has a retry policy for failed metric pushes. The retry policy is turned on by default: if a metric push to the pushgateway fails, the metrics are stored on disk in the persistentVolume.mountPath directory. If retry.diskCache is false, the metrics are kept in the agent's memory instead. retry.intervalSeconds is the interval at which the agent tries to resend metrics from the cache. The default is 5 minutes (300s).
The following three variables configure the retry policy:
  • (Optional) retry.intervalSeconds: Interval in seconds between attempts to resend metrics that previously failed to push.
  • (Optional) retry.diskCache: Cache failed metrics on disk.
  • (Optional) persistentVolume.mountPath: Path where metrics are stored if retry.diskCache is true.
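
As an illustration, the retry settings could be kept in a small values file of their own. The snippet below writes such a file with the default values stated in this guide. The file name retry-values.yaml is a hypothetical choice.

```shell
# Write the retry-related settings to a separate values file.
# All values shown are the chart defaults listed in this guide.
cat > retry-values.yaml <<'EOF'
retry:
  intervalSeconds: 300      # resend interval in seconds (5 minutes)
  diskCache: true           # false keeps failed pushes in memory instead
persistentVolume:
  mountPath: /data/metrics  # used for the cache when diskCache is true
EOF
```

Helm accepts several -f options, so this file can be passed alongside the main one, for example: helm upgrade <release-name> replex/replex-k8s-agent --namespace <namespace> -f values.yaml -f retry-values.yaml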

Chart Values

| Key | Type | Default | Description |
|---|---|---|---|
| cloudProviderOverride | string | "" | Force the agent to use the specified cloud provider |
| cluster.id | string | "" | Unique cluster identifier |
| cluster.name | string | "" | Cluster name displayed in dashboard |
| datadog.apiKey | string | "" | Your Datadog API Key |
| datadog.applicationKey | string | "" | Your Datadog Application Key |
| datadog.site | string | "com" | Either com if you are on the Datadog US site or eu if you are on the Datadog EU site. |
| extraInitContainers | object | {} | Add init containers to the Replex agent container |
| extraVolumeMounts | object | {} | Add extra volumeMounts to the Replex agent container |
| extraVolumes | list | [] | Add extra volumes to the Replex agent container |
| image.pullPolicy | string | "Always" | Image pull policy |
| image.repository | string | "registry.replex.io/public/replex-k8s-agent" | Image repository |
| image.tag | string | "" | Override AppVersion with a specific tag from the image repository |
| instana.apiToken | string | "" | Valid Instana API token |
| instana.baseUrl | string | "" | Base URL of a tenant unit, of the form https://test-example.instana.io |
| kubernetesInfoProvider | string | "kubernetes" | Specify the Kubernetes info provider. Options are 'kubernetes' or 'instana' |
| metrics.filesystem | string | "cadvisor" | Either cadvisor for using cAdvisor filesystem metrics or csi if using CSI drivers. csi requires https://github.com/kubernetes/kube-state-metrics. |
| metrics.provider | string | "" | Metrics provider to use. |
| nodeSelector | object | {} | Pod node selector key-value pair |
| onlyUseReadyNodes | bool | false | Consider nodes without 'Ready' status as not running at all. |
| persistentVolume.mountPath | string | "/data/metrics" | Persistent volume mount path |
| persistentVolume.size | string | "10Gi" | Size of the agent persistent volume |
| persistentVolume.storageClassName | string | "" | Name of the storage class to configure the PVC on |
| prometheus.bearerToken | string | "" | Bearer token for authenticating requests to the Prometheus server. |
| prometheus.containerLabel | string | "container" | The label representing "container name" on Prometheus time series. |
| prometheus.nodeLabel | string | "node" | The label representing "node name" on Prometheus time series. |
| prometheus.podLabel | string | "pod" | The label representing "pod name" on Prometheus time series. |
| prometheus.url | string | "" | URL of the Prometheus instance |
| pushgateway.url | string | "" | Full URL to the Replex pushgateway. Format: https://pushgateway.client.com/push (note the /push path). |
| replex.token | string | "" | Agent authentication token. |
| resources.requests.cpu | string | "50m" | CPU request of the agent container |
| resources.requests.memory | string | "100Mi" | Memory request of the agent container |
| retry.diskCache | bool | true | Cache failed metrics on disk. |
| retry.intervalSeconds | int | 300 | Interval between a retry to push metrics that failed previously. |
| securityContext | object | {} | Deployment security context |
| sslCertificate | string | "" | SSL certificate string. Can be used to add a custom SSL certificate to the agent. |
| stackdriver.projectId | string | "" | Your GCP project ID |
| tokenSecret.create | bool | true | Set to 'false' to skip creation of the Secret object |
| tokenSecret.key | string | "" | Override key in the token Secret object (for custom secret keys) |
| tokenSecret.name | string | "" | Override name of the token Secret object |
| tolerations | list | [] | Array of tolerations for pod scheduling |
| useControlPlaneCost | string | "" | (bool) Track costs of the Kubernetes Control Plane. |