Found a great article for implementing Custom Metrics for HPA here: https://towardsdatascience.com/kubernetes-hpa-with-custom-metrics-from-prometheus-9ffc201991e
In order to use custom metrics for autoscaling, you need three things: an application that exposes the metric in Prometheus format, a Prometheus server scraping it, and the Prometheus adapter, which serves those metrics to the HPA through the Kubernetes custom metrics API.
A Helm chart is listed on the Kubeapps Hub as stable/prometheus-adapter and can be used to install the adapter:
helm install --name my-release-name stable/prometheus-adapter
(With Helm 3, the --name flag was removed and the deprecated stable repo has moved to the prometheus-community charts, so the equivalent is helm install my-release-name prometheus-community/prometheus-adapter.)
We will use the Prometheus instance at https://metrics.fabric.net as the metrics source (queries can be tested interactively in its UI at https://metrics.fabric.net/new/graph?g0.expr=xxxxx).
prometheus-adapter:
  prometheus:
    # The adapter expects the base URL of the Prometheus server,
    # not the /graph UI endpoint.
    url: https://metrics.fabric.net
    port: 8443
  rules:
    custom:
      - seriesQuery: '{__name__=~"myapplication_api_response_time_.*",namespace!="",pod!=""}'
        resources:
          overrides:
            namespace:
              resource: namespace
            pod:
              resource: pod
        name:
          matches: "^(.*)"
          as: "myapplication_api_response_time_avg"
        metricsQuery: 1000 * (sum(rate(myapplication_api_response_time_sum[5m]) > 0) by (<<.GroupBy>>) / sum(rate(myapplication_api_response_time_count[5m]) > 0) by (<<.GroupBy>>))
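The metricsQuery computes average per-request latency as rate of the summed durations divided by rate of the request count, scaled to milliseconds. A minimal sketch of that arithmetic (illustrative only; the names below are not part of the adapter):

```python
def avg_latency_ms(rate_of_sum_seconds: float, rate_of_count: float) -> float:
    """Average per-request latency in milliseconds, mirroring the query:
    1000 * rate(..._sum[5m]) / rate(..._count[5m])."""
    if rate_of_count == 0:
        raise ValueError("no requests observed in the window")
    return 1000 * rate_of_sum_seconds / rate_of_count

# e.g. 6 s of cumulative request latency over 500 requests -> 12 ms average
print(avg_latency_ms(6.0, 500.0))  # 12.0
```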
Check the value of the metric using the following command, which sends a raw GET request to the Kubernetes API server:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/myapplication/pods/*/myapplication_api_response_time_avg" | jq .
Response:
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/myapplication/pods/*/myapplication_api_response_time_avg"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "myapplication",
        "name": "myapplication-85cfb49cf6-54hhf",
        "apiVersion": "/v1"
      },
      "metricName": "myapplication_api_response_time_avg",
      "timestamp": "2020-06-24T07:24:13Z",
      "value": "10750m",
      "selector": null
    },
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "myapplication",
        "name": "myapplication-85cfb49cf6-kvl2v",
        "apiVersion": "/v1"
      },
      "metricName": "myapplication_api_response_time_avg",
      "timestamp": "2020-06-24T07:24:13Z",
      "value": "12",
      "selector": null
    }
  ]
}
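Note that the values come back as Kubernetes quantity strings: "10750m" means 10.75 (milli-units), while "12" is a plain 12. A small helper to normalize them, sketched against the response shape above:

```python
def parse_quantity(q: str) -> float:
    """Convert a Kubernetes quantity string such as '10750m'
    (milli-units) or '12' into a plain float."""
    if q.endswith("m"):
        return float(q[:-1]) / 1000
    return float(q)

def pod_values(resp: dict) -> dict:
    """Map pod name -> numeric metric value from a MetricValueList."""
    return {item["describedObject"]["name"]: parse_quantity(item["value"])
            for item in resp["items"]}

print(parse_quantity("10750m"))  # 10.75
print(parse_quantity("12"))      # 12.0
```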
Create an HPA that will scale up myapplication-deployment if the latency exposed by myapplication_api_response_time_avg goes over 500 ms. Every few seconds, the HPA fetches the myapplication_api_response_time_avg value from the custom metrics API.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: myapplication-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapplication-deployment
  minReplicas: 3
  maxReplicas: 15
  metrics:
  - type: Pods
    pods:
      metricName: myapplication_api_response_time_avg
      targetAverageValue: "500"
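For intuition, the HPA's core algorithm is desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue), clamped to the min/max bounds above. A rough sketch of that formula (not the controller's actual code, which also applies tolerances and stabilization windows):

```python
import math

def desired_replicas(current: int, current_avg: float, target: float,
                     min_r: int = 3, max_r: int = 15) -> int:
    """ceil(currentReplicas * currentMetricValue / targetValue),
    clamped to the HPA's minReplicas/maxReplicas."""
    desired = math.ceil(current * current_avg / target)
    return max(min_r, min(max_r, desired))

# average latency of 750 ms against a 500 ms target with 3 replicas:
print(desired_replicas(3, 750, 500))  # 5
```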
We may notice that the autoscaler doesn’t react immediately to latency spikes. By default, the metrics sync happens once every 30 seconds and scaling up and down can only happen if there was no rescaling within the last 3–5 minutes. In this way, the HPA prevents rapid execution of conflicting decisions and gives time for the Cluster Autoscaler to kick in.
You can inspect the HPA's current metric value and scaling events with:
kubectl describe hpa myapplication-hpa -n myapplication