Quick Start Guide
The Replex Kubernetes Agent collects metadata from the local cluster via the Kubernetes API and metrics from a metric provider (a local Prometheus instance, Datadog, Stackdriver or Instana). It sends this information to the replex pushgateway, which stores it for further use.
- A Kubernetes cluster to install the agent.
- A metric provider. Either:
  - A Prometheus instance that is running in the cluster and accessible via URL.
  - A Thanos instance configured with a querier component.
  - A Datadog account with API key and Application key provided.
  - A Stackdriver account.
  - An Instana account with an API token.
- Helm 3 for the installation.
- A replex token.
If only the Replex agent is to be used for data collection in the clusters, it must be installed in each individual cluster. The kubernetesInfoProvider parameter can be left at the default value of kubernetes.
The Helm installation requires a few mandatory parameters:
Parameter | Description |
cluster.id | A string that uniquely identifies the cluster. It distinguishes clusters that might share a name: for example, if a cluster called "development" is destroyed and later replaced by a new cluster also called "development", the two should have different IDs. |
cluster.name | A human readable name for the cluster. |
replex.token | A token that is provided by Replex and is used by the pushgateway to authenticate requests from the agent. If the agent is installed in multiple clusters the same token can be used for all deployments. |
metrics.provider | The metric provider to be used. Can be either prometheus, thanos, stackdriver, instana or datadog. |
(Optional) pushgateway.url | The full URL to the replex pushgateway. Format: https://pushgateway.client.com/push (Note the /push path). |
(Optional) onlyUseReadyNodes | Consider nodes without 'Ready' status as not running at all. |
(Optional) useControlPlaneCost | Track costs of the Kubernetes Control Plane. |
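The uniqueness requirement on cluster.id can be met in several ways. One common convention, which is an assumption here and not something Replex prescribes, is to reuse the UID of the kube-system namespace: Kubernetes generates it when the cluster is created, so a recreated cluster with the same name gets a different value. The helper name cluster_id is hypothetical.

```shell
# Hypothetical helper: print the UID of the kube-system namespace,
# which can serve as a candidate value for cluster.id.
# Assumes kubectl is configured for the target cluster.
cluster_id() {
  kubectl get namespace kube-system -o jsonpath='{.metadata.uid}'
}
```

Calling cluster_id prints the UID, which could then be used as the cluster.id value.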
- prometheus or thanos:
Parameter | Description | Default |
prometheus.url | The API URL of the local Prometheus instance, for example http://prometheus-server.monitoring.svc.cluster.local:9090 | |
prometheus.nodeLabel | Sets the value which represents the 'node label'. This label is usually node or instance. | node |
prometheus.containerLabel | Sets the value which represents the 'container label'. This label is usually container or container_name. | container |
prometheus.podLabel | Sets the value which represents the 'pod label'. This label is usually pod or pod_name. | pod |
Parameters prometheus.*Label allow overriding the default labels in a Prometheus installation. This is the label that contains the actual entity name in a Prometheus time series like container_cpu_usage_seconds_total.
- datadog:
Parameter | Description | Default |
datadog.apiKey | Your Datadog API Key. | |
datadog.applicationKey | Your Datadog Application Key. | |
(Optional) datadog.site | Either com if you are on Datadog US site or eu if you are on Datadog EU site. | com |
- stackdriver:
Parameter | Description | Default |
stackdriver.projectId | Your GCP project ID. The agent will authenticate with Google using the service account provided by the environment. | |
- instana:
Parameter | Description | Default |
instana.baseUrl | | |
instana.apiToken | Valid Instana API token. | |
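As an illustration of the prometheus.*Label overrides, the sketch below writes a values fragment for a Prometheus setup whose time series still carry the older pod_name and container_name labels and identify nodes by instance. The file name prometheus-labels.yaml and the choice of labels are assumptions; check the labels on a series such as container_cpu_usage_seconds_total in your own Prometheus before copying them.

```shell
# Write a values fragment that overrides the default Prometheus label names.
# The file name and the label values are illustrative assumptions.
cat > prometheus-labels.yaml <<'EOF'
prometheus:
  nodeLabel: instance
  containerLabel: container_name
  podLabel: pod_name
EOF
```

These keys would go into the Helm values file used for the installation, next to prometheus.url.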
This section is relevant if you already use Instana to monitor your Kubernetes clusters. With a single deployment of the replex Agent, replex can pull all metrics and kubernetes information from the Instana API for all your clusters. This way, you don't need to install the replex Agent on all the clusters that you want to monitor costs for.
To activate this feature it is required to set the Helm parameter kubernetesInfoProvider to instana. Additionally, the Instana base URL (instana.baseUrl) and API Token (instana.apiToken) are required.
For the single agent setup, the cluster.id and cluster.name parameters should not be set since the agent automatically uses the cluster IDs from Instana.
To install the agent, follow these steps:
- 1. Add the Replex Helm repository
    helm repo add replex https://registry.replex.io/chartrepo/public
- 2. Create a namespace to install the agent in
    kubectl create namespace replex-k8s-agent
- 3. Create a file called values.yaml to specify Helm parameters. The file should look like this:
  - prometheus or thanos:
        cluster:
          id: <cluster-id>
          name: <cluster-name>
        replex:
          token: <replex-token>
        metrics:
          provider: prometheus
        prometheus:
          url: <prometheus-url>
          containerLabel: <prometheus-container-label>
          podLabel: <prometheus-pod-label>
          nodeLabel: <prometheus-node-label>
  - datadog:
        cluster:
          id: <cluster-id>
          name: <cluster-name>
        replex:
          token: <replex-token>
        metrics:
          provider: datadog
        datadog:
          apiKey: <datadog-api-key>
          applicationKey: <datadog-application-key>
          site: <datadog-site>
  - stackdriver:
        cluster:
          id: <cluster-id>
          name: <cluster-name>
        replex:
          token: <replex-token>
        metrics:
          provider: stackdriver
        stackdriver:
          projectId: <gcp-project-id>
  - instana:
        cluster:
          id: <cluster-id>
          name: <cluster-name>
        replex:
          token: <replex-token>
        metrics:
          provider: instana
        instana:
          baseUrl: <base-url>
          apiToken: <api-token>
- 4. Installation with Helm
    helm install <release-name> replex/replex-k8s-agent --namespace replex-k8s-agent -f values.yaml
  Where release-name is any arbitrary string to identify the Helm installation, e.g. replex-agent.
- 5. Installation without Helm
  You will still need Helm to render the template locally. However, the installation itself will be done with kubectl directly, not with Helm:
    helm template replex replex/replex-k8s-agent --namespace replex-k8s-agent -f values.yaml | kubectl apply -f -
After these steps the agent will start sending data to the pushgateway.
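For the single-agent Instana setup described earlier, the values file would omit cluster.id and cluster.name and set kubernetesInfoProvider to instana. The sketch below writes such a file. The file name values-instana.yaml is arbitrary, and setting metrics.provider to instana alongside it is an assumption based on the parameter tables rather than something the text states explicitly.

```shell
# Write a values file for the single-agent Instana setup.
# Placeholders in angle brackets must be replaced with real values.
cat > values-instana.yaml <<'EOF'
# cluster.id and cluster.name are intentionally not set:
# the agent takes the cluster IDs from Instana.
kubernetesInfoProvider: instana
replex:
  token: <replex-token>
metrics:
  provider: instana   # assumption: metrics also come from Instana
instana:
  baseUrl: <base-url>
  apiToken: <api-token>
EOF
```

The resulting file would be passed to helm install with -f in place of values.yaml.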
To update the agent, follow these steps:
- 1. Update the Helm repo.
    helm repo update
- 2. Upgrade the replex release.
    helm upgrade <release-name> replex/replex-k8s-agent --namespace <namespace> -f values.yaml
The replex-kubernetes-agent runs as a pod in the namespace it was installed in and its logs can be consulted using kubectl.
For example:
    kubectl -n replex-k8s-agent logs <pod-name>
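If the pod name is not known, it can be looked up first. A minimal sketch, assuming kubectl points at the right cluster and the agent was installed in the replex-k8s-agent namespace; the function name agent_logs is hypothetical.

```shell
# Hypothetical helper: print the last 100 log lines of every pod
# in the agent namespace (defaults to replex-k8s-agent).
agent_logs() {
  ns="${1:-replex-k8s-agent}"
  # "get pods -o name" prints entries of the form pod/<pod-name>,
  # which "kubectl logs" accepts directly.
  for pod in $(kubectl -n "$ns" get pods -o name); do
    echo "== $pod"
    kubectl -n "$ns" logs "$pod" --tail=100
  done
}
```

Running agent_logs without arguments uses the default namespace from the installation steps; a different namespace can be passed as the first argument.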
The Replex agent has a retry policy for failed metric pushes. The retry policy is turned on by default: if a metric push to the pushgateway fails, the metrics are stored on disk in the persistentVolume.mountPath directory. If retry.diskCache is false, the metrics are kept in the agent's memory instead. retry.intervalSeconds is the interval at which the agent tries to resend metrics from the cache; the default is 5 minutes (300s).
The following three variables configure the retry policy:
- (Optional) retry.intervalSeconds: Interval between retries of metric pushes that previously failed.
- (Optional) retry.diskCache: Cache failed metrics on disk.
- (Optional) persistentVolume.mountPath: If retry.diskCache is true, the path where metrics are stored.
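The retry settings can be expressed as a small values fragment. The sketch below writes one that retries every 60 seconds and keeps failed metrics in memory instead of on disk; the file name retry-values.yaml and the chosen values are illustrative assumptions.

```shell
# Write a values fragment that tunes the retry policy.
# File name and values are illustrative; defaults are 300 and true.
cat > retry-values.yaml <<'EOF'
retry:
  intervalSeconds: 60   # retry failed pushes every minute
  diskCache: false      # keep failed metrics in memory, not on disk
EOF
```

The fragment could be applied on top of the main values file by passing both to helm upgrade with two -f flags; Helm gives precedence to the right-most file.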
Key | Type | Default | Description |
cloudProviderOverride | string | "" | force agent to use specified cloud provider |
cluster.id | string | "" | Unique cluster identifier |
cluster.name | string | "" | Cluster name displayed in dashboard |
datadog.apiKey | string | "" | Your Datadog API Key |
datadog.applicationKey | string | "" | Your Datadog Application Key |
datadog.site | string | "com" | Either com if you are on Datadog US site or eu if you are on Datadog EU site. |
extraInitContainers | object | {} | Add init containers to replex agent container |
extraVolumeMounts | object | {} | Add extra volumeMounts to replex agent container |
extraVolumes | list | [] | Add extra volumes to replex agent container |
image.pullPolicy | string | "Always" | Image pull policy |
image.repository | string | "registry.replex.io/public/replex-k8s-agent" | Image repository |
image.tag | string | "" | Override AppVersion with specific tag from image repository |
instana.apiToken | string | "" | Valid Instana API token |
instana.baseUrl | string | "" | |
kubernetesInfoProvider | string | "kubernetes" | Specify Kubernetes info provider. Options are 'kubernetes' or 'instana' |
metrics.filesystem | string | "cadvisor" | Either cadvisor for using cAdvisor filesystem metrics or csi if using CSI drivers. csi requires https://github.com/kubernetes/kube-state-metrics. |
metrics.provider | string | "" | Metrics provider to use. |
nodeSelector | object | {} | Pod node selector Key-Value pair |
onlyUseReadyNodes | bool | false | Consider nodes without 'Ready' status as not running at all. |
persistentVolume.mountPath | string | "/data/metrics" | Persistent volume mount path |
persistentVolume.size | string | "10Gi" | Size of agent persistent volume |
persistentVolume.storageClassName | string | "" | name of the storage class to configure the PVC on |
prometheus.bearerToken | string | "" | Bearer token for prometheus server requests authentication. |
prometheus.containerLabel | string | "container" | The label representing "container name" on prometheus time series. |
prometheus.nodeLabel | string | "node" | The label representing "node name" on prometheus time series. |
prometheus.podLabel | string | "pod" | The label representing "pod name" on prometheus time series. |
prometheus.url | string | "" | URL of the Prometheus instance |
pushgateway.url | string | "" | Full URL to the replex pushgateway. Format: https://pushgateway.client.com/push (Note the /push path). |
replex.token | string | "" | Agent authentication token. |
resources.requests.cpu | string | "50m" | Specify the cpu units requests of the agent container |
resources.requests.memory | string | "100Mi" | Specify the memory bytes requests of the agent container |
retry.diskCache | bool | true | |
retry.intervalSeconds | int | 300 | |
securityContext | object | {} | Deployment security context |
sslCertificate | string | "" | SSL certificate string. Can be used to add a custom ssl certificate to the agent. |
stackdriver.projectId | string | "" | Your GCP project ID |
tokenSecret.create | bool | true | Set to 'false' to skip creation of Secret object |
tokenSecret.key | string | "" | Override key in token Secret object (for custom secret keys) |
tokenSecret.name | string | "" | Override name of token Secret object |
tolerations | list | [] | array of tolerations for the pod scheduling |
useControlPlaneCost | string | "" | (bool) Track costs of the Kubernetes Control Plane. |