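I have long felt that Mixer's features tend to be unstable. While verifying the examples from the book 《深入浅出 Istio》, I found that the Prometheus part no longer worked, so I spent today tracking down the cause. I learned a few things along the way, and this post is a brief record of them.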

The first thing I noticed was that the Prometheus resources installed by default in istio-system were gone:

$ kubectl get prometheus --all-namespaces
No resources found.

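Fortunately, the related Rules are still there, although they are written a little differently. For example, here is the definition of promtcp in istio-system: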

apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
...
  name: promtcp
  namespace: istio-system
spec:
  actions:
  - handler: prometheus
    instances:
    - tcpbytesent.metric
    - tcpbytereceived.metric
  match: context.protocol == "tcp"

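In the past, the handler field was usually written as handler.prometheus, meaning a resource of kind prometheus named handler. The official documentation, for example, uses this name.kind form: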

# Rule to send metric instances to a Prometheus handler
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: doubleprom
  namespace: istio-system
spec:
  actions:
  - handler: doublehandler.prometheus
    instances:
    - doublerequestcount.metric

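Clearly, the usage changed in 1.1. The new form does not say what kind of object it refers to; all we know is that its name is prometheus. Searching the Istio 1.1 Helm source for name: prometheus shows that helm/istio/charts/mixer/templates/config.yaml defines an object of kind handler: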

apiVersion: "config.istio.io/v1alpha2"
kind: handler
metadata:
  name: prometheus

With that, we can look up its definition by running kubectl get -n istio-system handler prometheus -o yaml:

apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
...
  name: prometheus
  namespace: istio-system
...
spec:
  compiledAdapter: prometheus
  params:
    metrics:
    - instance_name: requestcount.metric.istio-system
...

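Sure enough, this handler object named prometheus is almost identical to the old one. So there are now two ways to define a Prometheus handler. The explanation from the developers is that not every adapter creates its own CRD, so defining handlers with the shared handler kind is the recommended approach.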
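The two reference styles seen so far can be summarized in a small sketch. It only illustrates the naming convention described in this post and is not Mixer's actual resolution logic: the function name parse_handler_ref is hypothetical, treating a bare name as an object of kind handler is an assumption drawn from the 1.1 examples above, and references that also carry a namespace are not handled.

```python
def parse_handler_ref(ref):
    """Split a rule's `handler:` value into (name, kind).

    Assumption for illustration only: "name.kind" is the older
    adapter-specific style, and a bare name refers to an object of the
    shared kind "handler".
    """
    name, sep, kind = ref.partition(".")
    if not sep:
        # New 1.1 style: only the object name is given.
        return name, "handler"
    return name, kind


for ref in ("prometheus", "handler.prometheus", "doublehandler.prometheus"):
    name, kind = parse_handler_ref(ref)
    # Print the kubectl command that would fetch the referenced object.
    print(f"{ref}: kubectl get {kind} {name} -o yaml")
```

Under these assumptions, handler: prometheus points at the handler object found in config.yaml, while handler.prometheus points at a prometheus object named handler.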

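The Reference gives a brief description of this object. Pay attention to the compiledAdapter: prometheus field, which specifies the adapter type. The two fields compiledAdapter and adapter describe in-process and out-of-process adapters respectively.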
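As a rough illustration of that distinction, the sketch below classifies a handler spec by which of the two fields it sets. The function name adapter_mode is hypothetical, the spec is assumed to be a dict already loaded from the YAML, and the out-of-process adapter name is made up for the example.

```python
def adapter_mode(handler_spec):
    """Classify a handler spec by the adapter field it uses."""
    if "compiledAdapter" in handler_spec:
        return "in-process"       # adapter compiled into Mixer
    if "adapter" in handler_spec:
        return "out-of-process"   # adapter running outside Mixer
    return "unknown"


# The prometheus handler shown above uses compiledAdapter.
print(adapter_mode({"compiledAdapter": "prometheus", "params": {}}))
# "my-grpc-adapter" is a hypothetical name, used only for illustration.
print(adapter_mode({"adapter": "my-grpc-adapter", "params": {}}))
```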

So in 1.1, Handler has truly become a handler. Below is a simple definition that shows how to write the new-style handler for a custom metric, which is given the name cxl_counter here:

apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  labels:
    app: mixer
    chart: mixer
    heritage: Tiller
    release: istio
  name: prometheus
spec:
  compiledAdapter: prometheus
  params:
    metrics:
      - instance_name: cxl.metric.default
        kind: COUNTER
        label_names:
          - source_app
          - source_workload
          - source_workload_namespace
          - source_version
          - destination_app
          - destination_workload
          - destination_workload_namespace
          - destination_version
          - destination_service
          - destination_service_name
          - destination_service_namespace
          - reporter
          - response_code
        name: cxl_counter
    metricsExpirationPolicy:
      metricsExpiryDuration: 10m

Here is the old way of defining the handler for the same metric, this time named double_counter:

apiVersion: config.istio.io/v1alpha2
kind: prometheus
metadata:
  name: handler
spec:
  metrics:
    - instance_name: cxl.metric.default
      kind: COUNTER
      label_names:
        - source_app
        - source_workload
        - source_workload_namespace
        - source_version
        - destination_app
        - destination_workload
        - destination_workload_namespace
        - destination_version
        - destination_service
        - destination_service_name
        - destination_service_namespace
        - reporter
        - response_code
      name: double_counter
  metricsExpirationPolicy:
    metricsExpiryDuration: 10m

A single Rule sends the same metric to both handlers:

apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: prom-http
spec:
  actions:
    - handler: prometheus
      instances:
        - cxl.metric
    - handler: handler.prometheus
      instances:
        - cxl.metric
  match: context.protocol == "http" || context.protocol == "grpc"

The metric definition itself is unchanged:

apiVersion: config.istio.io/v1alpha2
kind: metric
metadata:
  name: cxl
spec:
  dimensions:
    destination_app: destination.labels["app"] | "unknown"
    destination_service: destination.service.host | "unknown"
    destination_service_name: destination.service.name | "unknown"
    destination_service_namespace: destination.service.namespace | "unknown"
    destination_version: destination.labels["version"] | "unknown"
    destination_workload: destination.workload.name | "unknown"
    destination_workload_namespace: destination.workload.namespace | "unknown"
    source_app: source.labels["app"] | "unknown"
    source_version: source.labels["version"] | "unknown"
    source_workload: source.workload.name | "unknown"
    source_workload_namespace: source.workload.namespace | "unknown"
    reporter:
      conditional((context.reporter.kind | "inbound") == "outbound", "source",
      "destination")
    response_code: response.code | 200
  monitored_resource_type: '"UNSPECIFIED"'
  value: "2"

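After generating some requests, you will find that the old and new handlers both work, each writing the metric under its own name. The results can be viewed in Prometheus.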

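I really have to complain about one thing here: every label in the Metric definition has to be copied into the Handler definition, and when the mapping is wrong, the result is not a warning but a panic.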
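Since a mismatch is this costly, it may be worth comparing the two lists before applying the configuration. The following is a minimal sketch, assuming the metric spec and the handler's metric entry have already been loaded from YAML into plain dicts. The helper name label_mismatches is hypothetical, and the sample data is a trimmed-down version of the cxl definitions above with one label deliberately left out. The check reports differences in both directions; which of them actually triggers the panic is not something this post establishes.

```python
def label_mismatches(metric_spec, handler_metric):
    """Compare a metric's dimensions with a handler metric's label_names.

    Returns (missing, extra): dimensions not listed in the handler, and
    handler labels that the metric never defines.
    """
    dimensions = set(metric_spec.get("dimensions", {}))
    labels = set(handler_metric.get("label_names", []))
    return sorted(dimensions - labels), sorted(labels - dimensions)


# Trimmed-down copies of the cxl metric and its handler entry.
metric_spec = {
    "dimensions": {
        "source_app": 'source.labels["app"] | "unknown"',
        "reporter": 'conditional((context.reporter.kind | "inbound") '
                    '== "outbound", "source", "destination")',
        "response_code": "response.code | 200",
    },
    "value": "2",
}
handler_metric = {
    "instance_name": "cxl.metric.default",
    "kind": "COUNTER",
    "label_names": ["source_app", "response_code"],  # "reporter" left out
    "name": "cxl_counter",
}

missing, extra = label_mismatches(metric_spec, handler_metric)
print("missing from handler:", missing)
print("not defined by metric:", extra)
```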

The code involved has been added to the Chapter 8 content on the 1.1 branch of the repository.

Handlers in Istio 1.1


崔秀龙
