Introducing a Gadget: Inspektor Gadget

Hello everyone, today is June 26th, have you eaten yet?

While going through the krew index, as I do routinely, I found a new plugin called gadget. Its history shows that it is the work of Kinvolk. The company is not very well known; as I recall, it was the first to publish a service mesh benchmark. The plugin's description is brief: "Collection of gadgets for Kubernetes developers". The usage text, however, is striking, and it feels like a case of "the fewer the words, the bigger the deal":

Available Commands:
  advise      Recommend system configurations based on collected information
  audit       Audit a subsystem
  completion  generate the autocompletion script for the specified shell
  deploy      Deploy Inspektor Gadget on the cluster
  help        Help about any command
  profile     Profile different subsystems
  snapshot    Take a snapshot of a subsystem and print it
  top         Gather, sort and periodically report events according to a given criteria
  trace       Trace and print system events
  traceloop   Get strace-like logs of a pod from the past
  undeploy    Undeploy Inspektor Gadget from cluster
  version     Show version

Setting aside the auxiliary commands, the main ones are:

  • advise: recommend system configurations based on collected information
  • audit: audit a subsystem
  • profile: profile different subsystems
  • snapshot: take a snapshot of a subsystem and print it
  • top: gather, sort, and periodically report events according to given criteria
  • trace: trace and print system events
  • traceloop: get strace-like logs of a Pod from the past

Which, in fact, tells you about as much as saying nothing, right? It is better to go through them one by one.


First install the plugin using krew:

$ kubectl krew install gadget
Updated the local copy of plugin index.
Installing plugin: gadget
 | Use this plugin:
 |      kubectl gadget
 |  | $ kubectl gadget deploy | kubectl apply -f -
WARNING: You installed plugin "gadget" from the krew-index plugin repository.

As the output above shows, before using the plugin you need to deploy it to the cluster by running kubectl gadget deploy | kubectl apply -f -. Besides the RBAC objects, the generated manifest contains two things: a DaemonSet and a CRD.

To track Pod behavior, Inspektor Gadget attaches BPF programs to kernel functions, so whenever those functions execute, the kernel also runs the attached programs. Each BPF program therefore has to check whether the system call that triggered the function comes from one of Inspektor Gadget's trace targets. To do this, the program looks up the current cgroup id in a BPF map that holds the list of Pods to track, and exits early if the id is not found. Otherwise, the BPF program collects the information to be traced, such as system call parameters, and puts it in a ring buffer or a BPF map. Inspektor Gadget's userspace tools listen on the ring buffer or read the BPF map to get new events. When the trace is over, the BPF program is removed.
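The filtering flow described above can be modeled in a few lines of ordinary Python. This is only a userspace sketch of the logic, not BPF code and not Inspektor Gadget's implementation: the names traced_cgroups, ring_buffer, on_syscall, and drain are hypothetical, the cgroup ids are made up, and a set and a deque stand in for the BPF map and the ring buffer.

```python
from collections import deque

# Hypothetical stand-ins: a set plays the BPF map of traced cgroup ids,
# a bounded deque plays the ring buffer read by the userspace tool.
traced_cgroups = {42}
ring_buffer = deque(maxlen=1024)


def on_syscall(cgroup_id, name, args):
    """Model of the hook that runs whenever the traced kernel function fires."""
    # Exit early when the caller is not one of the Pods being traced.
    if cgroup_id not in traced_cgroups:
        return False
    # Otherwise record the event for userspace to pick up.
    ring_buffer.append({"cgroup": cgroup_id, "syscall": name, "args": args})
    return True


def drain():
    """Model of the userspace side: read and remove all pending events."""
    events = list(ring_buffer)
    ring_buffer.clear()
    return events


on_syscall(42, "connect", {"port": 2021})  # traced cgroup: recorded
on_syscall(7, "connect", {"port": 443})    # unknown cgroup: dropped early
print(drain())
```

The point of the early exit is that the check runs on every call of the hooked function, so events from untraced Pods should cost as little as possible.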

Network Policy Advise

This feature has two parts. Monitor watches the network activity of workloads in a given namespace and writes trace records; Report generates network policies from those records. For example:

$ kubectl gadget advise network-policy monitor --output /tmp/result.txt
Node "gke-gcp-vlab-k8s-default-pool-d3fe3442-pw6v" ready.
Node "gke-gcp-vlab-k8s-default-pool-d3fe3442-9hsc" ready.
Node "gke-gcp-vlab-k8s-default-pool-d3fe3442-nj0k" ready.

$ more /tmp/result.txt
{"type":"connect","remote_kind":"pod","port":2021,"local_pod_namespace":"gadget","local_pod_name":"gadget-dzb7g","local_pod_labels":{"controller-revision-hash":"8f55cc94f","k8s-app":"gadget","pod-template-generation":"1"},"remote_pod_namespace":"kube-system","remote_pod_name":"pdcsi-node-lpqln","remote_pod_labels":{"controller-revision-hash":"69cdc7c487","k8s-app":"gcp-compute-persistent-disk-csi-driver","pod-template-generation":"1"},"debug":"4649087588182 cpu#1 connect 3293 otelsvc 4026531992\n"}

Let the command run for a while, then press Ctrl+C to stop it. The output file now contains a series of JSON records, which you can use to generate network policies:
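Before generating policies, it can help to see which connections were actually recorded. The following is a minimal sketch that assumes the file holds one JSON object per line, as the excerpt above suggests, and uses only field names visible in that record. The function name summarize is hypothetical, and .get is used because other record kinds may not carry every field.

```python
import json
from collections import Counter


def summarize(lines):
    """Count records per (type, local pod, remote pod, port)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        local = f'{rec.get("local_pod_namespace")}/{rec.get("local_pod_name")}'
        remote = f'{rec.get("remote_pod_namespace")}/{rec.get("remote_pod_name")}'
        counts[(rec.get("type"), local, remote, rec.get("port"))] += 1
    return counts


# A shortened copy of the record shown above (labels and debug omitted).
sample = [
    '{"type":"connect","remote_kind":"pod","port":2021,'
    '"local_pod_namespace":"gadget","local_pod_name":"gadget-dzb7g",'
    '"remote_pod_namespace":"kube-system","remote_pod_name":"pdcsi-node-lpqln"}'
]

for key, n in summarize(sample).items():
    print(n, *key)
```

To run it against the real trace, pass the open file instead of the sample, for example summarize(open("/tmp/result.txt")).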

$ kubectl gadget advise network-policy report --input=/tmp/result.txt
          k8s-app: konnectivity-agent
    - port: 10250
      protocol: TCP
      k8s-app: gadget
  - Ingress
  - Egress

You can see that the network policy has been generated.

Seccomp Profile Advise

This feature is provided by the advise seccomp-profile module, which has three subcommands: start, list, and stop. For example, to trace a Calico Pod:

$ kubectl gadget advise seccomp-profile start --podname=calico-node-t6hwg
$ kubectl gadget advise seccomp-profile list
NAMESPACE      NODE(S)                                                                                                                         POD                  CONTAINER    TRACEID
kube-system    gke-gcp-vlab-k8s-default-pool-d3fe3442-9hsc,gke-gcp-vlab-k8s-default-pool-d3fe3442-nj0k,gke-gcp-vlab-k8s-default-pool-d3fe3442-pw6v    calico-node-t6hwg                 HAmaTrPcxTLDNfSo

The HAmaTrPcxTLDNfSo shown in the TRACEID column is the trace ID created by the start command. After a while, call the stop command with that ID to end the trace, and the seccomp profile for the Pod is printed:

$ kubectl gadget advise seccomp-profile stop HAmaTrPcxTLDNfSo
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": [
  "syscalls": [
      "names": [
      "action": "SCMP_ACT_ALLOW"
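To make the shape of such a profile concrete, here is a small sketch of how it can be read: a syscall named in a rule gets that rule's action, and everything else falls back to defaultAction. The field names match the output above, but the architecture and syscall names in this sample profile are made up for illustration, the helper action_for is hypothetical, and argument-based rules are ignored.

```python
import json

# Hypothetical profile: field names as in the output above, values invented.
profile = json.loads("""
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {"names": ["read", "write", "nanosleep"], "action": "SCMP_ACT_ALLOW"}
  ]
}
""")


def action_for(profile, syscall):
    """Return the action of the first rule naming the syscall, else the default."""
    for rule in profile.get("syscalls", []):
        if syscall in rule.get("names", []):
            return rule["action"]
    return profile["defaultAction"]


print(action_for(profile, "read"))   # SCMP_ACT_ALLOW
print(action_for(profile, "mount"))  # SCMP_ACT_ERRNO
```

With SCMP_ACT_ERRNO as the default, the profile is an allowlist: any syscall the Pod did not make during the trace is rejected, so the trace should run long enough to cover the workload's normal behavior.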


Profile

The profile module includes the block-io and cpu subcommands. For example, to profile block I/O on a particular node:

$ kubectl gadget profile block-io --node=gke-gcp-vlab-k8s-default-pool-d3fe3442-9hsc
Tracing block device I/O... Hit Ctrl-C to end.^C

     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 1        |                                        |
       128 -> 255        : 1        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 2        |                                        |
      1024 -> 2047       : 54       |****************                        |
      2048 -> 4095       : 44       |*************                           |
      4096 -> 8191       : 49       |***************                         |
      8192 -> 16383      : 128      |****************************************|
     16384 -> 32767      : 118      |************************************    |
     32768 -> 65535      : 11       |***                                     |
     65536 -> 131071     : 5        |*                                       |
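The bucket boundaries in this output are powers of two. As a rough sketch of how such a histogram can be built from raw latencies (not the tool's own code), the functions below place each value in its power-of-two bucket and draw bars scaled to the largest count. The names bucket, histogram, and render are hypothetical, and the sample latencies are made up.

```python
def bucket(usecs):
    """Return the (low, high) power-of-two bucket holding a latency in microseconds."""
    if usecs < 2:
        return (0, 1)
    low = 1 << (usecs.bit_length() - 1)  # largest power of two <= usecs
    return (low, 2 * low - 1)


def histogram(samples):
    """Count samples per bucket, ordered by bucket lower bound."""
    counts = {}
    for s in samples:
        b = bucket(s)
        counts[b] = counts.get(b, 0) + 1
    return dict(sorted(counts.items()))


def render(counts, width=40):
    """Draw one line per bucket, with bars scaled to the largest count."""
    peak = max(counts.values())
    lines = []
    for (low, high), n in counts.items():
        bar = "*" * (n * width // peak)
        lines.append(f"{low:>10} -> {high:<10} : {n:<8} |{bar:<{width}}|")
    return "\n".join(lines)


# Made-up latencies in microseconds, for illustration only.
print(render(histogram([3, 900, 1500, 1600, 9000, 9500, 12000])))
```

Doubling the bucket width at each step keeps the table short even when latencies span several orders of magnitude, which is why a handful of rows covers everything from 1 to 131071 microseconds.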

The output is a histogram of block I/O latency, with buckets in microseconds. The cpu subcommand is used as follows; the -K switch restricts the profile to kernel-space stacks:

$ kubectl gadget profile cpu -p calico-node-t6hwg -K
Capturing stack traces... Hit Ctrl-C to end.^C

calico-node;entry_SYSCALL_64_after_hwframe;do_syscall_64;ksys_write;vfs_write;pipe_write;__wake_up_sync_key;_raw_spin_unlock_irqrestore;_raw_spin_unlock_irqrestore 1
calico-node;entry_SYSCALL_64_after_hwframe;do_syscall_64;ksys_read;vfs_read;pipe_read;anon_pipe_buf_release;anon_pipe_buf_release 1
ip 1
calico-node;entry_SYSCALL_64_after_hwframe;do_syscall_64;__se_sys_nanosleep;get_timespec64;_copy_from_user;copy_user_generic_unrolled;copy_user_generic_unrolled 1
calico-node 9
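Each line of this output appears to be a folded stack: frames separated by semicolons, followed by a sample count. Under that assumption, a short sketch like the following sums the samples by innermost frame to show where the time goes. The function names parse_folded and samples_by_leaf are hypothetical, and the sample lines are copied from the output above.

```python
from collections import Counter


def parse_folded(lines):
    """Yield (frames, count) from 'frame;frame;... count' lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        stack, count = line.rsplit(" ", 1)
        yield stack.split(";"), int(count)


def samples_by_leaf(lines):
    """Sum sample counts by the innermost (last) frame of each stack."""
    totals = Counter()
    for frames, count in parse_folded(lines):
        totals[frames[-1]] += count
    return totals


# Three lines taken from the output above.
sample = [
    "calico-node;entry_SYSCALL_64_after_hwframe;do_syscall_64;ksys_read;"
    "vfs_read;pipe_read;anon_pipe_buf_release;anon_pipe_buf_release 1",
    "ip 1",
    "calico-node 9",
]

for frame, n in samples_by_leaf(sample).most_common():
    print(n, frame)
```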


Snapshot

The snapshot module has two subcommands, process and socket, which capture the current processes and sockets respectively. (In v0.5.1, the process subcommand does not seem to work.)

$ kubectl gadget snapshot socket \
    --node=gke-gcp-vlab-k8s-default-pool-d3fe3442-pw6v \
    -o custom-columns=namespace,pod,protocol,status
kube-system     calico-node-zjpl5 TCP      ESTABLISHED
kube-system     calico-node-zjpl5 TCP      ESTABLISHED
kube-system     calico-node-zjpl5 TCP      ESTABLISHED
kube-system     calico-node-zjpl5 TCP      ESTABLISHED
kube-system     calico-node-zjpl5 TCP      ESTABLISHED
kube-system     calico-node-zjpl5 TCP      ESTABLISHED
kube-system     calico-node-zjpl5 TCP      ESTABLISHED


Top

The top module has three subcommands, block-io, tcp, and file, which work much like the Linux top command. For example, top file:

$ kubectl gadget top file \
    -o custom-columns=container,pid,comm,reads
CONTAINER        PID     COMM             READS
fluentbit        3737    flb-pipeline     1
fluentbit        3737    flb-pipeline     1
fluentbit        3737    flb-pipeline     2
gke-metrics-agent 56606   otelsvc          2
fluentbit        3737    flb-pipeline     1
fluentbit        3737    flb-pipeline     1
fluentbit        3737    flb-pipeline     2
gke-metrics-agent 56606   otelsvc          2
fluentbit        3737    flb-pipeline     1
fluentbit        3737    flb-pipeline     2


Trace

The trace module tracks system events. It currently supports:

  • bind: socket bindings
  • capabilities: capability checks
  • dns: DNS requests
  • exec: new processes
  • fsslower: open, read, write, and fsync operations that take longer than a threshold
  • mount: mount and umount operations
  • oomkill: OOM Killer invocations
  • open: open system calls
  • signal: signals received by the traced process
  • sni: SNI in TLS requests
  • tcp: TCP connect, accept, and close
  • tcpconnect: connect calls

Example trace for open:

$ kubectl gadget trace open -o custom-columns=container,path

fluentbit        /var/log/containers
fluentbit        /var/log/pods
fluentbit        /var/log/containers
fluentbit        /var/log/pods
fluentbit        /var/run/google-fluentbit/pos-files
csi-driver-registrar /usr/bin/runc
csi-driver-registrar /sys/kernel/mm/hugepages


Will it soon be hard to do Ops work without eBPF?

Posted by magicdanw on Wed, 23 Nov 2022 18:01:14 +1030