1. Introduction

Fabric 1.4 introduces the operations service: orderer and peer nodes can expose an HTTP endpoint that lets external tools fetch runtime metrics, manage log levels, and run health checks.

2. How to use the operations service

We use fabric-samples/first-network as the example. Bring the network up with ./byfn.sh up

2.1 Orderer operations service

Once the network is up, open a shell in the orderer container:

docker exec -it -e LINES=$(tput lines) -e COLUMNS=$(tput cols) orderer.example.com  bash

Look at the main configuration file, /etc/hyperledger/fabric/orderer.yaml (note that it is not core.yaml).

  • (1) Operations->ListenAddress listens on the loopback address by default. To serve clients outside the container, set it to 0.0.0.0:8443.

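  • (2) If port 8443 is mapped out of the container, anyone can access it by default. To restrict access, enable TLS (Enabled: true) and turn on client authentication (ClientAuthRequired: true), then configure the node's own TLS private key and certificate plus the CA certificates, so that only clients holding a certificate issued by one of those CAs can connect. All of these settings can be overridden with environment variables in the docker-compose file; see step (4).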

################################################################################
#
#   Operations Configuration
#
#   - This configures the operations server endpoint for the orderer
#
################################################################################
Operations:
    # host and port for the operations server
    ListenAddress: 127.0.0.1:8443

    # TLS configuration for the operations endpoint
    TLS:
        # TLS enabled
        Enabled: false

        # Certificate is the location of the PEM encoded TLS certificate
        Certificate:

        # PrivateKey points to the location of the PEM-encoded key
        PrivateKey:

        # Require client certificate authentication to access all resources
        ClientAuthRequired: false

        # Paths to PEM encoded ca certificates to trust for client authentication
        RootCAs: []


  • (3) Configure metrics. Two third-party monitoring systems are supported, statsd and prometheus. Provider defaults to disabled.

################################################################################
#
#   Metrics  Configuration
#
#   - This configures metrics collection for the orderer
#
################################################################################
Metrics:
    # The metrics provider is one of statsd, prometheus, or disabled
    Provider: disabled

    # The statsd configuration
    Statsd:
      # network type: tcp or udp
      Network: udp

      # the statsd server address
      Address: 127.0.0.1:8125

      # The interval at which locally cached counters and gauges are pushed
      # to statsd; timings are pushed immediately
      WriteInterval: 30s

      # The prefix is prepended to all emitted statsd metrics
      Prefix:

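With statsd as the provider, the orderer itself pushes metrics to the statsd server at the configured write interval. The configuration says nothing about how statsd authenticates senders; it is probably an IP whitelist, but check the statsd documentation for details. The pushed metrics are listed in the table below, taken from https://hyperledger-fabric.readthedocs.io/en/release-1.4/metrics_reference.html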

Bucket | Type | Description
blockcutter.block_fill_duration.%{channel} | histogram | The time from first transaction enqueing to the block being cut in seconds.
broadcast.enqueue_duration.%{channel}.%{type}.%{status} | histogram | The time to enqueue a transaction in seconds.
broadcast.processed_count.%{channel}.%{type}.%{status} | counter | The number of transactions processed.
broadcast.validate_duration.%{channel}.%{type}.%{status} | histogram | The time to validate a transaction in seconds.
chaincode.execute_timeouts.%{chaincode} | counter | The number of chaincode executions (Init or Invoke) that have timed out.
chaincode.launch_duration.%{chaincode}.%{success} | histogram | The time to launch a chaincode.
chaincode.launch_failures.%{chaincode} | counter | The number of chaincode launches that have failed.
chaincode.launch_timeouts.%{chaincode} | counter | The number of chaincode launches that have timed out.
chaincode.shim_request_duration.%{type}.%{channel}.%{chaincode}.%{success} | histogram | The time to complete chaincode shim requests.
chaincode.shim_requests_completed.%{type}.%{channel}.%{chaincode}.%{success} | counter | The number of chaincode shim requests completed.
chaincode.shim_requests_received.%{type}.%{channel}.%{chaincode} | counter | The number of chaincode shim requests received.
consensus.kafka.batch_size.%{topic} | gauge | The mean batch size in bytes sent to topics.
consensus.kafka.compression_ratio.%{topic} | gauge | The mean compression ratio (as percentage) for topics.
consensus.kafka.incoming_byte_rate.%{broker_id} | gauge | Bytes/second read off brokers.
consensus.kafka.outgoing_byte_rate.%{broker_id} | gauge | Bytes/second written to brokers.
consensus.kafka.record_send_rate.%{topic} | gauge | The number of records per second sent to topics.
consensus.kafka.records_per_request.%{topic} | gauge | The mean number of records sent per request to topics.
consensus.kafka.request_latency.%{broker_id} | gauge | The mean request latency in ms to brokers.
consensus.kafka.request_rate.%{broker_id} | gauge | Requests/second sent to brokers.
consensus.kafka.request_size.%{broker_id} | gauge | The mean request size in bytes to brokers.
consensus.kafka.response_rate.%{broker_id} | gauge | Requests/second sent to brokers.
consensus.kafka.response_size.%{broker_id} | gauge | The mean response size in bytes from brokers.
couchdb.processing_time.%{database}.%{function_name}.%{result} | histogram | Time taken in seconds for the function to complete request to CouchDB
deliver.blocks_sent.%{channel}.%{filtered} | counter | The number of blocks sent by the deliver service.
deliver.requests_completed.%{channel}.%{filtered}.%{success} | counter | The number of deliver requests that have been completed.
deliver.requests_received.%{channel}.%{filtered} | counter | The number of deliver requests that have been received.
deliver.streams_closed | counter | The number of GRPC streams that have been closed for the deliver service.
deliver.streams_opened | counter | The number of GRPC streams that have been opened for the deliver service.
dockercontroller.chaincode_container_build_duration.%{chaincode}.%{success} | histogram | The time to build a chaincode image in seconds.
endorser.chaincode_instantiation_failures.%{channel}.%{chaincode} | counter | The number of chaincode instantiations or upgrade that have failed.
endorser.duplicate_transaction_failures.%{channel}.%{chaincode} | counter | The number of failed proposals due to duplicate transaction ID.
endorser.endorsement_failures.%{channel}.%{chaincode}.%{chaincodeerror} | counter | The number of failed endorsements.
endorser.proposal_acl_failures.%{channel}.%{chaincode} | counter | The number of proposals that failed ACL checks.
endorser.proposal_validation_failures | counter | The number of proposals that have failed initial validation.
endorser.proposals_received | counter | The number of proposals received.
endorser.propsal_duration.%{channel}.%{chaincode}.%{success} | histogram | The time to complete a proposal.
endorser.successful_proposals | counter | The number of successful proposals.
fabric_version.%{version} | gauge | The active version of Fabric.
grpc.comm.conn_closed | counter | gRPC connections closed. Open minus closed is the active number of connections.
grpc.comm.conn_opened | counter | gRPC connections opened. Open minus closed is the active number of connections.
grpc.server.stream_messages_received.%{service}.%{method} | counter | The number of stream messages received.
grpc.server.stream_messages_sent.%{service}.%{method} | counter | The number of stream messages sent.
grpc.server.stream_request_duration.%{service}.%{method}.%{code} | histogram | The time to complete a stream request.
grpc.server.stream_requests_completed.%{service}.%{method}.%{code} | counter | The number of stream requests completed.
grpc.server.stream_requests_received.%{service}.%{method} | counter | The number of stream requests received.
grpc.server.unary_request_duration.%{service}.%{method}.%{code} | histogram | The time to complete a unary request.
grpc.server.unary_requests_completed.%{service}.%{method}.%{code} | counter | The number of unary requests completed.
grpc.server.unary_requests_received.%{service}.%{method} | counter | The number of unary requests received.
ledger.block_processing_time.%{channel} | histogram | Time taken in seconds for ledger block processing.
ledger.blockchain_height.%{channel} | gauge | Height of the chain in blocks.
ledger.blockstorage_commit_time.%{channel} | histogram | Time taken in seconds for committing the block and private data to storage.
ledger.statedb_commit_time.%{channel} | histogram | Time taken in seconds for committing block changes to state db.
ledger.transaction_count.%{channel}.%{transaction_type}.%{chaincode}.%{validation_code} | counter | Number of transactions processed.
logging.entries_checked.%{level} | counter | Number of log entries checked against the active logging level
logging.entries_written.%{level} | counter | Number of log entries that are written
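
The %{...} parts of a bucket name are placeholders that get filled with label values at runtime. The following is only a minimal Python sketch of that substitution, meant to make the table easier to read; the function name expand_bucket and the channel name "mychannel" are illustrative assumptions, not part of Fabric.

```python
import re


def expand_bucket(template: str, labels: dict) -> str:
    """Replace each %{name} placeholder with the matching label value."""
    return re.sub(r"%\{(\w+)\}", lambda m: labels[m.group(1)], template)


# "mychannel" is a hypothetical channel name used only for illustration.
name = expand_bucket("ledger.blockchain_height.%{channel}", {"channel": "mychannel"})
print(name)
```

A bucket with several placeholders, such as broadcast.processed_count.%{channel}.%{type}.%{status}, expands the same way, one label per placeholder.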

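With prometheus as the provider, the data is pulled by an external scraper instead; just change Provider to prometheus. The metrics are much the same as for statsd, although the names use underscores and labels rather than dotted buckets. We use prometheus in this walkthrough.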

  • (4) docker-compose-cli.yaml example:
  orderer.example.com:
    extends:
      file:   base/docker-compose-base.yaml
      service: orderer.example.com
    container_name: orderer.example.com
    environment:
      - ORDERER_OPERATIONS_LISTENADDRESS=0.0.0.0:8443
      - ORDERER_METRICS_PROVIDER=prometheus
    ports:
      - 8443:8443
    networks:
      - byfn
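
The two environment variables above follow a simple pattern: a prefix, then the YAML key path with dots replaced by underscores, all upper case. The Python sketch below derives names for the TLS keys in the Operations section under that assumption. The helper name env_name is made up for illustration, and the value format each variable expects (especially the list-valued RootCAs) is not covered here and should be checked against the Fabric documentation.

```python
def env_name(prefix: str, yaml_path: str) -> str:
    """Map a YAML key path such as 'Operations.TLS.Enabled' to an env var name.

    Assumes the convention visible in the compose file above:
    PREFIX + '_' + key path with '.' replaced by '_', upper-cased.
    """
    return f"{prefix}_{yaml_path.replace('.', '_')}".upper()


# Key paths taken from the Operations section of orderer.yaml shown earlier.
tls_keys = [
    "Operations.TLS.Enabled",
    "Operations.TLS.Certificate",
    "Operations.TLS.PrivateKey",
    "Operations.TLS.ClientAuthRequired",
    "Operations.TLS.RootCAs",
]

for key in tls_keys:
    print(env_name("ORDERER", key))
```

The same helper reproduces the names already used in the example, for instance env_name("ORDERER", "Metrics.Provider").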
  • (5) Test the endpoints. A GET on /logspec returns the current log level:

curl http://192.168.31.86:8443/logspec

Response:

{"spec":"info"}

Use PUT to set the log level:

curl -X PUT http://192.168.31.86:8443/logspec -d '{"spec":"debug"}'
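
For scripting, the same PUT can be built with Python's standard library. This is a minimal sketch, not a full client: logspec_request is a hypothetical helper name, the address is the one used in this walkthrough, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def logspec_request(base_url: str, spec: str) -> urllib.request.Request:
    """Build (but do not send) a PUT /logspec request carrying the new spec."""
    body = json.dumps({"spec": spec}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/logspec",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = logspec_request("http://192.168.31.86:8443", "debug")
print(req.get_method(), req.full_url, req.data.decode("utf-8"))

# To actually send it against a running node:
#     urllib.request.urlopen(req, timeout=5)
```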

Get the node's health status:

curl http://192.168.31.86:8443/healthz

Response:

{"status":"OK","time":"2019-03-01T07:06:33.805124616Z"}
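
A monitoring script would usually reduce this response to a yes/no answer. The Python sketch below does that for the payload shown above. The helper names are made up, and the shape of the unhealthy sample (a "failed_checks" list) is an assumption about what a failing node returns, so verify it against your Fabric version.

```python
import json


def is_healthy(body: str) -> bool:
    """Return True when the /healthz payload reports status OK."""
    return json.loads(body).get("status") == "OK"


def failed_components(body: str) -> list:
    """List component names from 'failed_checks', if that field is present."""
    checks = json.loads(body).get("failed_checks", [])
    return [check.get("component") for check in checks]


# Payload copied from the response above.
ok_body = '{"status":"OK","time":"2019-03-01T07:06:33.805124616Z"}'

# Assumed shape of an unhealthy response, for illustration only.
bad_body = (
    '{"status":"Service Unavailable","time":"2019-03-01T07:06:33Z",'
    '"failed_checks":[{"component":"docker","reason":"example reason"}]}'
)

print(is_healthy(ok_body), failed_components(ok_body))
print(is_healthy(bad_body), failed_components(bad_body))
```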

Fetch the metrics. The response is a long list of metric samples:

curl http://192.168.31.86:8443/metrics
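
With the prometheus provider, /metrics returns the Prometheus text format: comment lines starting with #, then one "name{labels} value" sample per line. To pick values out of that output without a Prometheus server, a rough Python sketch like the one below may be enough. It assumes samples carry no trailing timestamp, and the sample text is made up for illustration rather than captured from a node.

```python
def parse_metrics(text: str) -> dict:
    """Parse Prometheus text-format samples into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped.
    Assumes each sample line ends with its value (no timestamp).
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Illustrative input only; real output from a node is much longer.
sample_text = """\
# HELP logging_entries_written Number of log entries that are written
# TYPE logging_entries_written counter
logging_entries_written{level="info"} 42
"""

metrics = parse_metrics(sample_text)
print(metrics)
```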

2.2 Peer operations service

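This works the same way as for the orderer, except that the configuration file inside the container is /etc/hyperledger/fabric/core.yaml. Override it with environment variables; the corresponding section of docker-compose-cli.yaml is: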

  peer0.org1.example.com:
    container_name: peer0.org1.example.com
    extends:
      file:  base/docker-compose-base.yaml
      service: peer0.org1.example.com
    environment:
      - CORE_OPERATIONS_LISTENADDRESS=0.0.0.0:9443
      - CORE_METRICS_PROVIDER=prometheus
    ports:
      - 9443:9443
    networks:
      - byfn

3. Summary

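The operations interface still feels limited, and for now log management has to rely on docker's log driver, but the direction is right. The one-year maintenance window for 1.4 LTS is neither long nor short, and 2.0 and etcd-based Raft consensus are already on the way. Hope this helps.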