Fluent Operator: a Swiss Army knife for cloud native log management

Authors: Cheng Dehao, Fluent Member, KubeSphere Member

Introduction to Fluent Operator

With the rapid development and continuous iteration of cloud native technology, higher requirements are being placed on log collection, processing and forwarding. Log architectures under cloud native platforms differ greatly from designs based on physical machines or virtual machines. As a CNCF graduated project, Fluent Bit is undoubtedly one of the preferred solutions for logging in cloud environments. However, configuring and deploying Fluent Bit still carries a certain cost for Kubernetes users.

On January 21, 2019, the KubeSphere community developed Fluent Bit Operator to meet the need to manage Fluent Bit in a cloud native way, and released v0.1.0 on February 17, 2020. The project has iterated continuously since then, and Fluent Bit Operator was officially donated to the Fluent community on August 4, 2021.

Fluent Bit Operator lowers the threshold of using Fluent Bit and processes log information efficiently, but Fluent Bit itself is relatively weak at log processing, and the project had not yet integrated a dedicated log processing tool such as Fluentd, which has many more plugins available. Based on these requirements, Fluent Bit Operator integrated Fluentd as an optional log aggregation and forwarding layer and was renamed Fluent Operator (GitHub: https://github.com/fluent/fluent-operator). On March 25, 2022, Fluent Operator released v1.0.0, and the project will continue to iterate, with v1.1.0 expected in the second quarter of 2022 to add more features and highlights.

Using Fluent Operator, you can deploy, configure and uninstall Fluent Bit and Fluentd flexibly and easily. The community also provides a large number of plugins for both Fluentd and Fluent Bit, and users can customize the configuration according to their actual situation. The official documentation provides detailed examples that are easy to follow, which greatly lowers the threshold of using Fluent Bit and Fluentd.

Each stage of the log pipeline

Fluent Operator can deploy Fluent Bit or Fluentd separately and does not force you to use both. It also supports using Fluentd to receive the log stream forwarded by Fluent Bit for multi-tenant log isolation, which greatly increases the flexibility and diversity of deployment. To give a more comprehensive picture of Fluent Operator, the following takes a complete log pipeline as an example and divides it into three stages: collection and forwarding, filtering, and output.

Collection and forwarding

Both Fluent Bit and Fluentd can collect logs.

When deployed separately, log collection needs can be met through Fluent Bit's input plugins or Fluentd's forward and http plugins. When the two are combined, Fluentd can use the forward plugin to receive the log stream forwarded by Fluent Bit.

In terms of performance, Fluent Bit is lighter than Fluentd and consumes less memory (about 650KB), so Fluent Bit is mainly responsible for collecting and forwarding logs: installed as a DaemonSet, it collects and forwards logs on each node.

Filtering

The collected log data is often messy and redundant, which requires log processing middleware to provide the ability to filter and process it. Both Fluent Bit and Fluentd support filter plugins, and users can filter and customize log data according to their own needs.

Output

The Fluent Bit or Fluentd output plugins send the processed log information to multiple destinations, such as third-party components like Kafka and Elasticsearch.

Introduction to the CRDs

Fluent Operator defines two CRD groups for Fluent Bit and Fluentd: fluentbit.fluent.io and fluentd.fluent.io.

fluentbit.fluent.io

The following six CRDs are included under the fluentbit.fluent.io group:

  • The FluentBit CRD defines the attributes of the Fluent Bit instance, such as the image version, taint tolerations, affinity and other parameters.
  • The ClusterFluentBitConfig CRD defines the configuration file of Fluent Bit.
  • The ClusterInput CRD defines the input plugins of Fluent Bit, through which users can customize which logs to collect.
  • The ClusterFilter CRD defines the filter plugins of Fluent Bit, which are mainly responsible for filtering and processing the information collected by Fluent Bit.
  • The ClusterParser CRD defines the parser plugins of Fluent Bit, which are mainly responsible for parsing log information into other formats.
  • The ClusterOutput CRD defines the output plugins of Fluent Bit, which are mainly responsible for forwarding the processed log information to its destination.

fluentd.fluent.io

The following seven CRDs are included under the fluentd.fluent.io group:

  • The Fluentd CRD defines the attributes of the Fluentd instance, such as the image version, taint tolerations, affinity and other parameters.
  • The ClusterFluentdConfig CRD defines the cluster-wide configuration of Fluentd.
  • The FluentdConfig CRD defines the namespace-scoped configuration of Fluentd.
  • The ClusterFilter CRD defines the cluster-wide filter plugins of Fluentd, which are mainly responsible for filtering and processing the collected log information; if Fluent Bit is installed, the logs it forwards can be processed further.
  • The Filter CRD defines the namespace-scoped filter plugins of Fluentd, with the same responsibilities at namespace scope.
  • The ClusterOutput CRD defines the cluster-wide output plugins of Fluentd, which are mainly responsible for forwarding the processed log information to its destination.
  • The Output CRD defines the namespace-scoped output plugins of Fluentd, which are mainly responsible for forwarding the processed log information to its destination.

Orchestration principle (instance + mounted Secret + CRD abstraction)

Although Fluent Bit and Fluentd can both collect, process (parse and filter) and output logs, they have different strengths: Fluent Bit is lighter and more efficient than Fluentd, while Fluentd has more plugins.

To balance these strengths, Fluent Operator allows users to use Fluent Bit and Fluentd flexibly in several ways:

  • Fluent Bit only mode: if you only need to collect logs and send them to their final destination after simple processing, Fluent Bit alone is enough.
  • Fluentd only mode: if you need to receive logs over the network via HTTP or Syslog and then process and send them to their final destination, Fluentd alone is enough.
  • Fluent Bit + Fluentd mode: if you need to do some advanced processing on the collected logs or send them to more sinks, you can use Fluent Bit and Fluentd in combination.

Fluent Operator lets you configure the log processing pipeline in any of the three modes above as required. Fluentd and Fluent Bit both have rich plugins to meet various customization needs. Since the configuration and mounting mechanisms of Fluentd and Fluent Bit are similar, only the mounting of the Fluent Bit configuration file is briefly introduced here.

In the Fluent Bit CRDs, each ClusterInput, ClusterParser, ClusterFilter and ClusterOutput represents a section of the Fluent Bit configuration, selected by the label selectors of ClusterFluentBitConfig. Fluent Operator watches these objects, builds the final configuration, and creates a Secret that is mounted into the Fluent Bit DaemonSet.
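
As a quick sanity check, you can look at the rendered configuration directly. Below is a minimal sketch, assuming the Secret name and data key follow the defaults used in the examples later in this article (they may differ in your deployment):

# List the Secrets the operator created in the fluent namespace
kubectl -n fluent get secrets

# Decode the rendered main configuration (Secret name and key are assumptions)
kubectl -n fluent get secret fluent-bit-config \
  -o jsonpath='{.data.fluent-bit\.conf}' | base64 -d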

Because Fluent Bit itself does not offer a reload interface (see the known issue for details), a wrapper called fluent-bit-watcher was added so that, when a configuration change is detected, the Fluent Bit process is restarted immediately. In this way, the new configuration is picked up without restarting the Fluent Bit pod.

To make configuration more convenient, we abstract the application and configuration parameters based on the powerful abstraction ability of CRDs. Users configure Fluent Bit and Fluentd through the defined CRDs, and Fluent Operator watches these objects for changes to update the state and configuration of Fluent Bit and Fluentd. In particular, for plugin definitions, the field names basically stay consistent with the original Fluent Bit fields, making the transition smoother and lowering the threshold of use.

How to achieve multi-tenant log isolation

Fluent Bit can collect logs efficiently, but it is somewhat weak at complex processing of log information, while Fluentd can perform advanced processing with the help of its rich plugins. Fluent Operator abstracts the various Fluentd plugins so that log information can be processed to meet custom requirements.

As can be seen from the CRD definitions above, the Fluentd configuration and plugin CRDs are divided into cluster-level and namespace-level CRDs. By defining CRDs at these two scopes and borrowing Fluentd's label router plugin, multi-tenant isolation can be achieved.

We added the watchedNamespaces field to ClusterFluentdConfig. Users can choose which namespaces to watch according to their needs; if it is empty, all namespaces are watched. The namespace-level FluentdConfig only watches the CRs and global-level configuration in the namespace where it resides. Therefore, namespace-level logs can be output both to an Output in that namespace and to a cluster-level ClusterOutput, achieving multi-tenant isolation.

Fluent Operator vs logging-operator

Differences

  • Both can automatically deploy Fluent Bit and Fluentd. However, logging-operator needs to deploy Fluentd and Fluent Bit at the same time, while Fluent Operator supports pluggable deployment of Fluent Bit and Fluentd, which are not strongly coupled: users can choose to deploy only Fluent Bit or only Fluentd according to their needs, which is more flexible.
  • In logging-operator, logs collected by Fluent Bit must pass through Fluentd before they can reach the final destination, and if the data volume is large, Fluentd becomes a single point of failure. In Fluent Operator, Fluent Bit can send log information directly to the destination, avoiding this single point of failure.
  • logging-operator defines five CRDs (Logging, Output, Flow, ClusterOutput and ClusterFlow), while Fluent Operator defines 13. With more diverse CRD definitions, users can configure Fluentd and Fluent Bit more flexibly according to their needs. The CRD fields are also named to stay close to the original Fluentd and Fluent Bit configuration items, so the names map clearly onto the original component definitions.
  • Both borrow Fluentd's label router plugin to achieve multi-tenant log isolation.

Outlook:

  • Support HPA-based automatic scaling;
  • Improve the Helm chart, e.g. by collecting metrics information;
  • ...

Hands-on practice

With the help of Fluent Operator, we can perform complex processing on logs. Here we use the examples in fluent-operator-walkthrough of outputting logs to Elasticsearch and Kafka to introduce what Fluent Operator can actually do. To gain some hands-on experience with Fluent Operator, you need a Kind cluster, and you also need to set up a Kafka cluster and an Elasticsearch cluster in that Kind cluster.

# Create a kind cluster and name it fluent
./create-kind-cluster.sh

# Create a Kafka cluster under Kafka namespace
./deploy-kafka.sh

# Create an Elasticsearch cluster under the elastic namespace
./deploy-es.sh

Fluent Operator controls the lifecycle of Fluent Bit and Fluentd. You can start Fluent Operator in the fluent namespace using the following script:

./deploy-fluent-operator.sh

Both Fluent Bit and Fluentd have been defined as CRDs in Fluent Operator. You can create a Fluent Bit DaemonSet or a Fluentd StatefulSet by declaring a FluentBit or Fluentd CR.

Fluent Bit only mode

Fluent Bit only mode enables only the lightweight Fluent Bit to collect, process and forward logs.

Use Fluent Bit to collect kubelet's logs and output them to Elasticsearch

cat <<EOF | kubectl apply -f -
apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBit
metadata:
  name: fluent-bit
  namespace: fluent
  labels:
    app.kubernetes.io/name: fluent-bit
spec:
  image: kubesphere/fluent-bit:v1.8.11
  positionDB:
    hostPath:
      path: /var/lib/fluent-bit/
  resources:
    requests:
      cpu: 10m
      memory: 25Mi
    limits:
      cpu: 500m
      memory: 200Mi
  fluentBitConfigName: fluent-bit-only-config
  tolerations:
    - operator: Exists
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFluentBitConfig
metadata:
  name: fluent-bit-only-config
  labels:
    app.kubernetes.io/name: fluent-bit
spec:
  service:
    parsersFile: parsers.conf
  inputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
      fluentbit.fluent.io/mode: "fluentbit-only"
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
      fluentbit.fluent.io/mode: "fluentbit-only"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
      fluentbit.fluent.io/mode: "fluentbit-only"
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  name: kubelet
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
spec:
  systemd:
    tag: service.kubelet
    path: /var/log/journal
    db: /fluent-bit/tail/kubelet.db
    dbSync: Normal
    systemdFilter:
      - _SYSTEMD_UNIT=kubelet.service
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFilter
metadata:
  name: systemd
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
spec:
  match: service.*
  filters:
  - lua:
      script:
        key: systemd.lua
        name: fluent-bit-lua
      call: add_time
      timeAsTable: true
---
apiVersion: v1
data:
  systemd.lua: |
    function add_time(tag, timestamp, record)
      new_record = {}
      timeStr = os.date("!*t", timestamp["sec"])
      t = string.format("%4d-%02d-%02dT%02d:%02d:%02d.%sZ",
    		timeStr["year"], timeStr["month"], timeStr["day"],
    		timeStr["hour"], timeStr["min"], timeStr["sec"],
    		timestamp["nsec"])
      kubernetes = {}
      kubernetes["pod_name"] = record["_HOSTNAME"]
      kubernetes["container_name"] = record["SYSLOG_IDENTIFIER"]
      kubernetes["namespace_name"] = "kube-system"
      new_record["time"] = t
      new_record["log"] = record["MESSAGE"]
      new_record["kubernetes"] = kubernetes
      return 1, timestamp, new_record
    end
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/component: operator
    app.kubernetes.io/name: fluent-bit-lua
  name: fluent-bit-lua
  namespace: fluent
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  name: es
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
spec:
  matchRegex: (?:kube|service)\.(.*)
  es:
    host: elasticsearch-master.elastic.svc
    port: 9200
    generateID: true
    logstashPrefix: fluent-log-fb-only
    logstashFormat: true
    timeKey: "@timestamp"
EOF
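
After applying the resources above, you can check whether documents are being indexed. This is a minimal sketch, assuming Elasticsearch is reachable at elasticsearch-master.elastic.svc:9200 as configured in the ClusterOutput above:

# Query Elasticsearch for indices matching the logstashPrefix configured above
kubectl -n elastic run es-check --rm -it --restart=Never \
  --image=curlimages/curl --command -- \
  curl -s "http://elasticsearch-master.elastic.svc:9200/_cat/indices/fluent-log-fb-only-*?v"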

Use Fluent Bit to collect Kubernetes application logs and output them to Kafka

cat <<EOF | kubectl apply -f -
apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBit
metadata:
  name: fluent-bit
  namespace: fluent
  labels:
    app.kubernetes.io/name: fluent-bit
spec:
  image: kubesphere/fluent-bit:v1.8.11
  positionDB:
    hostPath:
      path: /var/lib/fluent-bit/
  resources:
    requests:
      cpu: 10m
      memory: 25Mi
    limits:
      cpu: 500m
      memory: 200Mi
  fluentBitConfigName: fluent-bit-config
  tolerations:
    - operator: Exists
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFluentBitConfig
metadata:
  name: fluent-bit-config
  labels:
    app.kubernetes.io/name: fluent-bit
spec:
  service:
    parsersFile: parsers.conf
  inputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
      fluentbit.fluent.io/mode: "k8s"
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
      fluentbit.fluent.io/mode: "k8s"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
      fluentbit.fluent.io/mode: "k8s"
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  name: tail
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "k8s"
spec:
  tail:
    tag: kube.*
    path: /var/log/containers/*.log
    parser: docker
    refreshIntervalSeconds: 10
    memBufLimit: 5MB
    skipLongLines: true
    db: /fluent-bit/tail/pos.db
    dbSync: Normal
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFilter
metadata:
  name: kubernetes
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "k8s"
spec:
  match: kube.*
  filters:
  - kubernetes:
      kubeURL: https://kubernetes.default.svc:443
      kubeCAFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kubeTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      labels: false
      annotations: false
  - nest:
      operation: lift
      nestedUnder: kubernetes
      addPrefix: kubernetes_
  - modify:
      rules:
      - remove: stream
      - remove: kubernetes_pod_id
      - remove: kubernetes_host
      - remove: kubernetes_container_hash
  - nest:
      operation: nest
      wildcard:
      - kubernetes_*
      nestUnder: kubernetes
      removePrefix: kubernetes_
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  name: kafka
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "k8s"
spec:
  matchRegex: (?:kube|service)\.(.*)
  kafka:
    brokers: my-cluster-kafka-bootstrap.kafka.svc:9091,my-cluster-kafka-bootstrap.kafka.svc:9092,my-cluster-kafka-bootstrap.kafka.svc:9093
    topics: fluent-log
EOF
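
To confirm that records reach Kafka, you can run a throwaway console consumer. This is a minimal sketch, assuming a Strimzi-style cluster named my-cluster in the kafka namespace (the image tag is an assumption):

# Consume a few records from the fluent-log topic defined above
kubectl -n kafka run kafka-consumer --rm -it --restart=Never \
  --image=quay.io/strimzi/kafka:0.28.0-kafka-3.1.0 --command -- \
  bin/kafka-console-consumer.sh \
  --bootstrap-server my-cluster-kafka-bootstrap.kafka.svc:9092 \
  --topic fluent-log --from-beginning --max-messages 5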

Fluent Bit + Fluentd mode

With its rich plugins, Fluentd can act as a log aggregation layer to perform more advanced log processing. You can easily forward logs from Fluent Bit to Fluentd using Fluent Operator.

Forward logs from Fluent Bit to Fluentd

To forward logs from Fluent Bit to Fluentd, you need to enable Fluent Bit's forward plugin, as shown below:

cat <<EOF | kubectl apply -f -
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  name: fluentd
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/component: logging
spec:
  matchRegex: (?:kube|service)\.(.*)
  forward:
    host: fluentd.fluent.svc
    port: 24224
EOF

Deploy Fluentd

Fluentd's forward input plugin is enabled by default when Fluentd is deployed, so you only need to apply the following YAML to deploy it:

apiVersion: fluentd.fluent.io/v1alpha1
kind: Fluentd
metadata:
  name: fluentd
  namespace: fluent
  labels:
    app.kubernetes.io/name: fluentd
spec:
  globalInputs:
  - forward:
      bind: 0.0.0.0
      port: 24224
  replicas: 1
  image: kubesphere/fluentd:v1.14.4
  fluentdCfgSelector:
    matchLabels:
      config.fluentd.fluent.io/enabled: "true"
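
Once the CR is applied, the operator reconciles it into workload objects. A quick check:

# The operator should create a Fluentd StatefulSet and Service in the fluent namespace
kubectl -n fluent get statefulset,svc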

ClusterFluentdConfig: Fluentd cluster-wide configuration

If you define a ClusterFluentdConfig, you can collect logs from any or all namespaces; the watchedNamespaces field selects which namespaces to collect from. The following configuration collects logs from the kube-system and default namespaces:

cat <<EOF | kubectl apply -f -
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFluentdConfig
metadata:
  name: cluster-fluentd-config
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  watchedNamespaces:
  - kube-system
  - default
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/scope: "cluster"
      output.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-es
  labels:
    output.fluentd.fluent.io/scope: "cluster"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
  - elasticsearch:
      host: elasticsearch-master.elastic.svc
      port: 9200
      logstashFormat: true
      logstashPrefix: fluent-log-cluster-fd
EOF

FluentdConfig: Fluentd namespace-wide configuration

If you define a FluentdConfig, only logs in the same namespace as the FluentdConfig can be sent to an Output. In this way, logs in different namespaces are isolated.

cat <<EOF | kubectl apply -f -
apiVersion: fluentd.fluent.io/v1alpha1
kind: FluentdConfig
metadata:
  name: namespace-fluentd-config
  namespace: fluent
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  outputSelector:
    matchLabels:
      output.fluentd.fluent.io/scope: "namespace"
      output.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: Output
metadata:
  name: namespace-fluentd-output-es
  namespace: fluent
  labels:
    output.fluentd.fluent.io/scope: "namespace"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
  - elasticsearch:
      host: elasticsearch-master.elastic.svc
      port: 9200
      logstashFormat: true
      logstashPrefix: fluent-log-namespace-fd
EOF

Route logs to different Kafka topics according to the namespace

Similarly, you can use a Fluentd filter plugin to distribute logs to different topics according to their namespaces. Here we use recordTransformer, a plugin built into the Fluentd core that can add, delete and modify event records.

cat <<EOF | kubectl apply -f -
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFluentdConfig
metadata:
  name: cluster-fluentd-config-kafka
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  watchedNamespaces:
  - kube-system
  - default
  clusterFilterSelector:
    matchLabels:
      filter.fluentd.fluent.io/type: "k8s"
      filter.fluentd.fluent.io/enabled: "true"
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/type: "kafka"
      output.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFilter
metadata:
  name: cluster-fluentd-filter-k8s
  labels:
    filter.fluentd.fluent.io/type: "k8s"
    filter.fluentd.fluent.io/enabled: "true"
spec:
  filters:
  - recordTransformer:
      enableRuby: true
      records:
      - key: kubernetes_ns
        value: ${record["kubernetes"]["namespace_name"]}
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-kafka
  labels:
    output.fluentd.fluent.io/type: "kafka"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
  - kafka:
      brokers: my-cluster-kafka-bootstrap.default.svc:9091,my-cluster-kafka-bootstrap.default.svc:9092,my-cluster-kafka-bootstrap.default.svc:9093
      useEventTime: true
      topicKey: kubernetes_ns
EOF
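
Because topicKey is set to kubernetes_ns, records land in topics named after their source namespaces. You can verify this with the same console-consumer command shown in the Fluent Bit Kafka example above, replacing --topic fluent-log with a namespace name such as --topic default and pointing --bootstrap-server at my-cluster-kafka-bootstrap.default.svc:9092 as configured in this ClusterOutput.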

Use both cluster-wide and namespace-wide FluentdConfig

Of course, you can use ClusterFluentdConfig and FluentdConfig at the same time, as shown below. The FluentdConfig sends logs from the fluent namespace to the ClusterOutput, and the ClusterFluentdConfig sends logs from the namespaces listed in its watchedNamespaces field (i.e. the kube-system and default namespaces) to the same ClusterOutput.

cat <<EOF | kubectl apply -f -
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFluentdConfig
metadata:
  name: cluster-fluentd-config-hybrid
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  watchedNamespaces:
  - kube-system
  - default
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/scope: "hybrid"
      output.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: FluentdConfig
metadata:
  name: namespace-fluentd-config-hybrid
  namespace: fluent
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/scope: "hybrid"
      output.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-es-hybrid
  labels:
    output.fluentd.fluent.io/scope: "hybrid"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
  - elasticsearch:
      host: elasticsearch-master.elastic.svc
      port: 9200
      logstashFormat: true
      logstashPrefix: fluent-log-hybrid-fd
EOF

Use cluster-wide and namespace-wide FluentdConfig together in a multi-tenant scenario

In a multi-tenant scenario, we can use cluster-wide and namespace-wide FluentdConfig together to achieve log isolation.

cat <<EOF | kubectl apply -f -
apiVersion: fluentd.fluent.io/v1alpha1
kind: FluentdConfig
metadata:
  name: namespace-fluentd-config-user1
  namespace: fluent
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  outputSelector:
    matchLabels:
      output.fluentd.fluent.io/enabled: "true"
      output.fluentd.fluent.io/user: "user1"
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/enabled: "true"
      output.fluentd.fluent.io/user: "user1"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFluentdConfig
metadata:
  name: cluster-fluentd-config-cluster-only
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  watchedNamespaces:
  - kube-system
  - kubesphere-system
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/enabled: "true"
      output.fluentd.fluent.io/scope: "cluster-only"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: Output
metadata:
  name: namespace-fluentd-output-user1
  namespace: fluent
  labels:
    output.fluentd.fluent.io/enabled: "true"
    output.fluentd.fluent.io/user: "user1"
spec:
  outputs:
  - elasticsearch:
      host: elasticsearch-master.elastic.svc
      port: 9200
      logstashFormat: true
      logstashPrefix: fluent-log-user1-fd
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-user1
  labels:
    output.fluentd.fluent.io/enabled: "true"
    output.fluentd.fluent.io/user: "user1"
spec:
  outputs:
  - elasticsearch:
      host: elasticsearch-master.elastic.svc
      port: 9200
      logstashFormat: true
      logstashPrefix: fluent-log-cluster-user1-fd
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-cluster-only
  labels:
    output.fluentd.fluent.io/enabled: "true"
    output.fluentd.fluent.io/scope: "cluster-only"
spec:
  outputs:
  - elasticsearch:
      host: elasticsearch-master.elastic.svc
      port: 9200
      logstashFormat: true
      logstashPrefix: fluent-log-cluster-only-fd
EOF

Use a buffer for Fluentd output

You can add a buffer to cache logs for an output plugin.

cat <<EOF | kubectl apply -f -
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFluentdConfig
metadata:
  name: cluster-fluentd-config-buffer
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  watchedNamespaces:
  - kube-system
  - default
  clusterFilterSelector:
    matchLabels:
      filter.fluentd.fluent.io/type: "buffer"
      filter.fluentd.fluent.io/enabled: "true"
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/type: "buffer"
      output.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFilter
metadata:
  name: cluster-fluentd-filter-buffer
  labels:
    filter.fluentd.fluent.io/type: "buffer"
    filter.fluentd.fluent.io/enabled: "true"
spec:
  filters:
  - recordTransformer:
      enableRuby: true
      records:
      - key: kubernetes_ns
        value: ${record["kubernetes"]["namespace_name"]}
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-buffer
  labels:
    output.fluentd.fluent.io/type: "buffer"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
  - stdout: {}
    buffer:
      type: file
      path: /buffers/stdout.log
  - elasticsearch:
      host: elasticsearch-master.elastic.svc
      port: 9200
      logstashFormat: true
      logstashPrefix: fluent-log-buffer-fd
    buffer:
      type: file
      path: /buffers/es.log
EOF
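
After applying the configuration above, the buffer files should appear inside the Fluentd container. This is a minimal sketch, assuming the StatefulSet's first pod is named fluentd-0 (the default naming for a StatefulSet called fluentd):

# Inspect the file buffer paths configured above
kubectl -n fluent exec -it fluentd-0 -- ls -l /buffers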

Fluentd only mode

You can also enable Fluentd only mode, which deploys only a Fluentd StatefulSet.

Use Fluentd to receive logs from HTTP and output them to standard output

If you deploy Fluentd on its own with the HTTP input plugin enabled, you can receive logs over HTTP.

cat <<EOF | kubectl apply -f -
apiVersion: fluentd.fluent.io/v1alpha1
kind: Fluentd
metadata:
  name: fluentd-http
  namespace: fluent
  labels:
    app.kubernetes.io/name: fluentd
spec:
  globalInputs:
    - http:
        bind: 0.0.0.0
        port: 9880
  replicas: 1
  image: kubesphere/fluentd:v1.14.4
  fluentdCfgSelector:
    matchLabels:
      config.fluentd.fluent.io/enabled: "true"

---
apiVersion: fluentd.fluent.io/v1alpha1
kind: FluentdConfig
metadata:
  name: fluentd-only-config
  namespace: fluent
  labels:
    config.fluentd.fluent.io/enabled: "true"
spec:
  filterSelector:
    matchLabels:
      filter.fluentd.fluent.io/mode: "fluentd-only"
      filter.fluentd.fluent.io/enabled: "true"
  outputSelector:
    matchLabels:
      output.fluentd.fluent.io/mode: "fluentd-only"
      output.fluentd.fluent.io/enabled: "true"

---
apiVersion: fluentd.fluent.io/v1alpha1
kind: Filter
metadata:
  name: fluentd-only-filter
  namespace: fluent
  labels:
    filter.fluentd.fluent.io/mode: "fluentd-only"
    filter.fluentd.fluent.io/enabled: "true"
spec:
  filters:
    - stdout: {}

---
apiVersion: fluentd.fluent.io/v1alpha1
kind: Output
metadata:
  name: fluentd-only-stdout
  namespace: fluent
  labels:
    output.fluentd.fluent.io/mode: "fluentd-only"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
    - stdout: {}
EOF
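
You can post a test event to verify the pipeline end to end. This is a minimal sketch, assuming the operator exposes the Fluentd instance through a Service and pod named after the fluentd-http CR (both names are assumptions):

# Forward the HTTP input port locally (Service name is an assumption)
kubectl -n fluent port-forward svc/fluentd-http 9880:9880 &

# Post a JSON event in the format Fluentd's http input expects
curl -s -X POST -d 'json={"message":"hello fluent-operator"}' \
  http://localhost:9880/test.tag

# The stdout filter and output should echo the event in the pod logs
kubectl -n fluent logs fluentd-http-0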
