Provision a microservice's pipeline on Jenkins using a CustomResourceDefinition and Operator on Kubernetes


Overview

In this article I want to show you how to create a custom resource in Kubernetes, so that provisioning a CI/CD pipeline on Jenkins for a microservice becomes just a matter of creating another Kubernetes resource. To achieve that we will use the operator-sdk CLI and write Go code that integrates with Jenkins through its API.

Everything will happen in a local environment on Minikube – Jenkins will also be deployed there using the official Helm chart. Repositories will be created on Github and Dockerhub. Github repositories related to this article:

https://github.com/jakubbujny/article-jenkins-pipeline-crd

https://github.com/jakubbujny/article-microservice1

https://github.com/jakubbujny/article-microservice2


Jenkins deployment on Kubernetes

Jenkins will be deployed on Kubernetes using the official Helm chart, which already contains the Kubernetes plugin that spawns build agents as separate pods.

At first we should start Minikube with increased memory, as Jenkins and its agents are Java processes and consume quite a lot of it:

minikube start --memory=4096

After it starts we can deploy Jenkins using the following script:

#!/usr/bin/env bash

helm init --wait

helm install \
 --name jenkins stable/jenkins \
 --set master.csrf.defaultCrumbIssuer.enabled=false \
 --set master.tag=2.194 \
 --set master.serviceType=ClusterIP \
 --set master.installPlugins[0]="kubernetes:1.18.1" --set master.installPlugins[1]="workflow-aggregator:2.6" --set master.installPlugins[2]="credentials-binding:1.19" --set master.installPlugins[3]="git:3.11.0" --set master.installPlugins[4]="workflow-job:2.33" \
 --set master.installPlugins[5]="job-dsl:1.76"

The most important flags:

  • master.csrf.defaultCrumbIssuer.enabled=false – disabling CSRF makes this example simpler, as we don't have to deal with issuing and sending crumbs in API requests
  • master.serviceType=ClusterIP – we must change the Jenkins service type because by default it starts as LoadBalancer, which won't work on Minikube
  • master.installPlugins[0..4] – these are the default plugins of the Helm chart, provided in this strange form because of an issue in Helm – more info: https://stackoverflow.com/questions/48316330/how-to-set-multiple-values-with-helm
  • master.installPlugins[5] – additionally we need the Job DSL plugin, which will be described in the next section

After those operations Jenkins should start – to access it we can use the following command:

kubectl port-forward svc/jenkins 8080:8080

kubectl will create a proxy for us so we can see the Jenkins UI at localhost:8080. The login is admin and the password can be obtained with the following command:

printf $(kubectl get secret --namespace default jenkins -o jsonpath="{.data.jenkins-admin-password}" | base64 --decode);echo

Seed job

The main concept is based on a seed job – a Jenkins Job DSL script which will provision the pipelines for the microservices. The seed job will contain the following code:

projects.split(',').each { project ->
  pipelineJob(project) {
    definition {
      cpsScm {
        scm {
          git {
            remote {
              url("https://github.com/jakubbujny/article-${project}.git")
            }
            branch("*/master")
          }
        }
        triggers {
          scm("* * * * *")
        }
        lightweight()
        scriptPath('Jenkinsfile.groovy')
      }
    }
  }
}

The projects variable will be provided as a job parameter – that parameter will be modified by the CRD Operator on Kubernetes when a new resource is created.

For each project (microservice) the Job DSL script will create a pipeline job with the Github project as its source. That pipeline will use the Jenkinsfile located in the microservice's repository and will have an SCM polling trigger, so the repository is polled every minute but the pipeline is triggered only when changes are detected.

Jenkinsfile.groovy source

microserviceName = "microservice1"

pipeline {
    agent {
        kubernetes {
            //cloud 'kubernetes'
            label 'mypod'
            yaml """
apiVersion: v1
kind: Pod
spec:
  serviceAccountName: cicd
  containers:
  - name: docker
    image: docker:1.11
    command: ['cat']
    tty: true
    volumeMounts:
    - name: dockersock
      mountPath: /var/run/docker.sock
  - name: kubectl
    image: ubuntu:18.04
    command: ['cat']
    tty: true
  volumes:
  - name: dockersock
    hostPath:
      path: /var/run/docker.sock
"""
        }
    }
    stages {
        stage('Build Docker image') {
            steps {
                checkout scm
                container('docker') {
                    script {
                        def image = docker.build("digitalrasta/article-${microserviceName}:${BUILD_NUMBER}")
                        docker.withRegistry( '', "dockerhub") {
                            image.push()
                        }
                    }
                }
            }
        }
        stage('Deploy') {
            steps {
                container('kubectl') {
                    script {
                        sh "apt-get update && apt-get install -y curl"
                        sh "curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.15.0/bin/linux/amd64/kubectl"
                        sh "chmod +x ./kubectl"
                        sh "mv ./kubectl /usr/local/bin/kubectl"
                        def checkDeployment = sh(script: "kubectl get deployments | grep ${microserviceName}", returnStatus: true)
                        if(checkDeployment != 0) {
                            sh "kubectl apply -f deploy/deploy.yaml"
                        }
                        sh "kubectl set image deployment/${microserviceName} ${microserviceName}=digitalrasta/article-${microserviceName}:${BUILD_NUMBER}"
                    }
                }
            }
        }
    }
}

That pipeline spawns a pod on Kubernetes which contains 3 containers – we see the definition of only 2 of them, as the 3rd container is the JNLP Jenkins agent. The first container is used to build the docker image with our microservice, tag it with BUILD_NUMBER and push it to Dockerhub, and the second is Ubuntu, where kubectl is installed to make the deployment.

CD is done simply by updating the docker image in the corresponding deployment, so the new image will be automatically pulled by Kubernetes from Dockerhub.

Please also be aware of the ServiceAccount named "cicd" which must be installed on the cluster:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cicd

---

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cicd
rules:
  - apiGroups: ["extensions", "apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["jakubbujny.com"]
    resources: ["jenkinspipelines"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cicd
subjects:
  - kind: ServiceAccount
    name: cicd
roleRef:
  kind: Role
  name: cicd
  apiGroup: rbac.authorization.k8s.io

The cicd ServiceAccount has permissions to manipulate deployments and also to operate on the jakubbujny.com API group and its jenkinspipelines resource – that's our CRD, which will be described in the next section. That second permission is not really needed, as the "jenkinspipeline" resources should be installed by the cluster admin, but I left it in to make the example clearer.

CustomResourceDefinition and Operator

The final part is to build our own Operator with an API definition. To do that we need a Github repository and operator-sdk, so we can start with the following command:

operator-sdk new jenkins-pipeline-operator --repo github.com/jakubbujny/jenkins-pipeline-operator

That command will create the basic folder structure with boilerplate code.
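
The exact layout depends on the operator-sdk version, but for the 0.x series used in this article it looks roughly like this (a rough sketch, not a full listing):

jenkins-pipeline-operator/
  build/          – Dockerfile and scripts used to build the operator image
  cmd/manager/    – main.go, the operator entrypoint
  deploy/         – operator.yaml, RBAC manifests and CRDs
  pkg/apis/       – API type definitions (our JenkinsPipeline types will land here)
  pkg/controller/ – controllers with the reconcile loops
  version/        – operator version constant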

Next we want to add our own API and the Controller code which will react to changes in that API:

operator-sdk add api --api-version=jakubbujny.com/v1alpha1 --kind=JenkinsPipeline

operator-sdk add controller --api-version=jakubbujny.com/v1alpha1 --kind=JenkinsPipeline

We need a field in our API definition to hold the microservice name for which the pipeline should be created – let's modify the following file: pkg/apis/jakubbujny/v1alpha1/jenkinspipeline_types.go

type JenkinsPipelineSpec struct {
	Microservice string `json:"microservice"`
}

Now we must regenerate the API definitions so the yaml configuration in the deploy directory matches our code. To do that we should issue the command:

operator-sdk generate openapi
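
Among other things, the add api and generate commands give us the CRD manifest in the deploy/crds directory. Its exact filename and contents depend on the operator-sdk version, so treat the following as a rough sketch of what gets generated rather than the exact file:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: jenkinspipelines.jakubbujny.com
spec:
  group: jakubbujny.com
  names:
    kind: JenkinsPipeline
    listKind: JenkinsPipelineList
    plural: jenkinspipelines
    singular: jenkinspipeline
  scope: Namespaced
  version: v1alpha1
  # ...plus the openAPIV3Schema validation section generated from our Go types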

And the final and most important part is to write the Go code which will integrate with Jenkins – to make that integration work we need to generate an API token in Jenkins and pass it to the operator. I simply did that with environment variables in deploy/operator.yaml.
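
The relevant fragment of the operator Deployment in deploy/operator.yaml could look like the sketch below – JENKINS_URL and JENKINS_API_TOKEN are the variable names the controller code reads; the URL assumes Jenkins runs as the jenkins service in the same namespace, and in a real setup the token should rather come from a Secret:

      containers:
        - name: jenkins-pipeline-operator
          image: digitalrasta/jenkins-pipeline-operator
          env:
            - name: JENKINS_URL
              value: "http://jenkins:8080"
            - name: JENKINS_API_TOKEN
              value: "<API token generated in the Jenkins UI>"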

Let's go to pkg/controller/jenkinspipeline/jenkinspipeline_controller.go – I will describe only the most important parts.

func (r *ReconcileJenkinsPipeline) Reconcile(request reconcile.Request) (reconcile.Result, error) {
	reqLogger := log.WithValues("Request.Namespace", request.Namespace, "Request.Name", request.Name)
	reqLogger.Info("Reconciling JenkinsPipeline")

	// Fetch the JenkinsPipeline instance
	instance := &jakubbujnyv1alpha1.JenkinsPipeline{}
	err := r.client.Get(context.TODO(), request.NamespacedName, instance)
	if err != nil {
		if errors.IsNotFound(err) {
			// Request object not found, could have been deleted after reconcile request.
			// Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
			// Return and don't requeue
			return reconcile.Result{}, nil
		}
		// Error reading the object - requeue the request.
		return reconcile.Result{}, err
	}

	resp, err := getSeedJob()

	if err != nil {
		reqLogger.Error(err, "Failed to get seed config to check whether job exists")
		return reconcile.Result{}, err
	}

	if resp.StatusCode == 404 {
		reqLogger.Info("Seed job not found so must be created for microservice "+instance.Spec.Microservice)

		resp, err := createSeedJob()
		err = handleResponse(resp, err, reqLogger, "create seed job")
		if err != nil {
			return reconcile.Result{}, err
		}

		resp, err = updateSeedJob(instance.Spec.Microservice)
		err = handleResponse(resp, err, reqLogger, "update seed job")
		if err != nil {
			return reconcile.Result{}, err
		}
	} else if resp.StatusCode == 200 {
		reqLogger.Info("Seed job found so must be updated for microservice "+instance.Spec.Microservice)
		resp, err = updateSeedJob(instance.Spec.Microservice)
		err = handleResponse(resp, err, reqLogger, "update seed job")
		if err != nil {
			return reconcile.Result{}, err
		}
	} else {
		err = coreErrors.New(fmt.Sprintf("Received invalid response from Jenkins %s",resp.Status))
		reqLogger.Error(err, "Failed to get seed config to check whether job exists")
		return reconcile.Result{}, err
	}

	resp, err = triggerSeedJob()
	err = handleResponse(resp, err, reqLogger, "trigger seed job")
	if err != nil {
		return reconcile.Result{}, err
	}

	return reconcile.Result{}, nil
}
func handleResponse(resp *http.Response, err error, reqLogger logr.Logger, action string) error {
	if err != nil {
		reqLogger.Error(err, "Failed to "+action)
		return err
	}

	if resp == nil {
		return nil
	}

	if resp.StatusCode != 200 {
		err = coreErrors.New(fmt.Sprintf("Received invalid response from Jenkins %s",resp.Status))
		reqLogger.Error(err, "Failed to"+action)
		return err
	}
	return nil
}

func decorateRequestToJenkinsWithAuth(req *http.Request) {
	jenkinsApiToken := os.Getenv("JENKINS_API_TOKEN")
	req.Header.Add("Authorization", "Basic "+ b64.StdEncoding.EncodeToString([]byte("admin:"+jenkinsApiToken)))
}

func getSeedJob() (*http.Response, error) {
	req, err := http.NewRequest("GET", os.Getenv("JENKINS_URL")+"/job/seed/config.xml", nil)
	if err != nil {
		return nil, err
	}
	decorateRequestToJenkinsWithAuth(req)
	return (&http.Client{}).Do(req)
}

func createSeedJob() (*http.Response, error) {
	seedFileData, err := ioutil.ReadFile("/opt/seed.xml")
	if err != nil {
		return nil, err
	}

	req, err := http.NewRequest("POST", os.Getenv("JENKINS_URL")+"/createItem?name=seed", bytes.NewBuffer(seedFileData))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-type", "text/xml")
	decorateRequestToJenkinsWithAuth(req)
	return (&http.Client{}).Do(req)
}

func updateSeedJob(microservice string) (*http.Response, error) {
	resp, err := getSeedJob()
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	buf := new(bytes.Buffer)
	_, err = buf.ReadFrom(resp.Body)
	if err != nil {
		return nil, err
	}
	seedXml := buf.String()

	r := regexp.MustCompile(`<defaultValue>(.+)<\/defaultValue>`)
	foundMicroservices := r.FindStringSubmatch(seedXml)

	toReplace := ""
	if strings.Contains(foundMicroservices[1], microservice) {
		return nil,nil
	} else {
		if foundMicroservices[1] == "default" {
			toReplace = microservice
		} else {
			toReplace = foundMicroservices[1] + "," + microservice
		}
	}

	toUpdate := r.ReplaceAllString(seedXml, fmt.Sprintf("<defaultValue>%s</defaultValue>", toReplace))

	req, err := http.NewRequest("POST", os.Getenv("JENKINS_URL")+"/job/seed/config.xml", bytes.NewBuffer([]byte(toUpdate)))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-type", "text/xml")
	decorateRequestToJenkinsWithAuth(req)
	return (&http.Client{}).Do(req)
}

func triggerSeedJob() (*http.Response, error) {
	req, err := http.NewRequest("POST", os.Getenv("JENKINS_URL")+"/job/seed/buildWithParameters", nil)
	if err != nil {
		return nil, err
	}
	decorateRequestToJenkinsWithAuth(req)
	return (&http.Client{}).Do(req)
}

The Reconcile function is triggered when the state in Kubernetes must be synced, so usually when a new object is created. We start with getSeedJob(), which makes a request to Jenkins to check whether the seed job already exists – if it doesn't (404 code), it is created with the default config located in build/seed.xml, which is added to the Operator's docker image in build/Dockerfile.
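
Adding that default config to the image only requires an extra COPY instruction in the generated build/Dockerfile – a one-line sketch, the rest of the file stays as operator-sdk generated it:

COPY build/seed.xml /opt/seed.xml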

If the seed job already exists it must be updated to add the microservice name to the list of parameters, which is done by running a regular expression over the seed job's config.xml to change the default value of the parameter.
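
For that regular expression to work, build/seed.xml has to be a parameterized job config whose projects parameter starts with the default value "default" – the value the controller checks for. The relevant fragment of such a config could look roughly like this:

<hudson.model.ParametersDefinitionProperty>
  <parameterDefinitions>
    <hudson.model.StringParameterDefinition>
      <name>projects</name>
      <description>Comma-separated list of microservices</description>
      <defaultValue>default</defaultValue>
    </hudson.model.StringParameterDefinition>
  </parameterDefinitions>
</hudson.model.ParametersDefinitionProperty>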

After all of that the program triggers the seed job, so it will be executed and the new pipeline will be created. Now we should build and push the operator docker image and then deploy it:

operator-sdk build digitalrasta/jenkins-pipeline-operator

docker push digitalrasta/jenkins-pipeline-operator:latest

kubectl apply -f deploy

And now we can create a pipeline for a microservice by applying the following resource:

apiVersion: "jakubbujny.com/v1alpha1"
kind: "JenkinsPipeline"
metadata:
  name: "microservice1"
spec:
  microservice: "microservice1"

The one weak point is that we didn't create a Finalizer for the JenkinsPipeline resource – it means that after deleting the resource, the seed job's parameter won't be modified, so the pipeline for that microservice will still exist. But Finalizers are a topic for another article.

NetworkPolicy on Kubernetes – how to properly set up ingress and egress when you want to limit a pod's network access


Overview

I decided to write this post as I've seen many posts on Stackoverflow where people are confused about setting up NetworkPolicy in Kubernetes properly – especially how to set up egress so it doesn't block the traffic that is sent back to the client. I understand that might be confusing, especially for TCP, where the client port on which data will be sent back is chosen randomly.

So in this article I will show you how to set up Minikube to support network policies, as the out-of-the-box network driver in Minikube doesn't support them. After some playing with policies we will dig deeper to see how such a firewall is implemented in Calico.

As always, the source code for this article can be found at https://github.com/jakubbujny/article-kubernetes-network-policies

NetworkPolicy in theory

NetworkPolicy allows us to define network rules inside a Kubernetes cluster. It's based on a podSelector, which means we can attach a NetworkPolicy to pods by matching them e.g. using labels. Those policies can limit outgoing and incoming network access using source/destination ports, CIDRs, namespaces and other pods' labels.

A NetworkPolicy in Kubernetes is only an API definition which must then be implemented by the network CNI plugin. That means if we define a NetworkPolicy and our cluster doesn't have a CNI plugin which implements it, the NetworkPolicy won't have any effect and we will be left with a false feeling of security.

NetworkPolicy in practice – setup Calico on Minikube

Minikube out of the box doesn't support NetworkPolicies. To use them we must install an external CNI plugin.

One of the most popular and stable plugins at the moment is Calico – it uses things which have existed in IT for many years, like netfilter/iptables and the Border Gateway Protocol.

To enable CNI in Minikube we must start it with the following command:

 minikube start --network-plugin=cni 

And then install Calico by running

 curl https://docs.projectcalico.org/master/manifests/calico.yaml | kubectl apply -f - 

We should see new pods created in the kube-system namespace – we should wait until they are in the Running state before proceeding:

➜ ~ kubectl get pods -n kube-system
NAME                                      READY   STATUS     RESTARTS   AGE
calico-kube-controllers-7f68846cd-8f97d   0/1     Pending    0          6s
calico-node-mbdzl                         0/1     Init:0/3   0          6s

So wait until it looks like this:

➜ ~ kubectl get pods -n kube-system
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-7f68846cd-8f97d   1/1     Running   0          74s
calico-node-mbdzl                         1/1     Running   0          74s

Prepare pods

As our example application we will use:

  • a pod with ubuntu which will have ingress limited to port 80 and egress to the Internet only on port 443, to restrict that pod to TLS traffic – named ubuntu1
  • a pod with ubuntu without any limitations, used to test connectivity – named ubuntu2

Let's start with the first pod:

apiVersion: v1
kind: Pod
metadata:
  name: ubuntu1
  labels:
    app: ubuntu1
spec:
  containers:
    - name: ubuntu
      image: ubuntu:18.04
      command: ["bash", "-c"]
      args: ["apt-get update && apt-get install -y curl python3 && while true; do sleep 1; done"]
      ports:
        - containerPort: 80

That's a simple pod which, in its start command, installs curl for testing and python3 for starting a simple web server in the next steps. After that it goes into an infinite loop so we can exec into the container and run some commands.

apiVersion: v1
kind: Pod
metadata:
  name: ubuntu2
  labels:
    app: ubuntu2
spec:
  containers:
    - name: ubuntu
      image: ubuntu:18.04
      command: ["bash", "-c"]
      args: ["apt-get update && apt-get install -y curl && while true; do sleep 1; done"]

The second pod looks the same as the first one, but we don't install python and don't expose a port.

Network communication – sysdig

At the moment the network communication is not limited – let's verify that and see what the network traffic looks like.

To achieve that we will use sysdig – a tool which is preinstalled on the Minikube machine and is like strace combined with tcpdump, but much more powerful. Minikube under the hood spawns a virtual machine where Kubernetes is installed. We can SSH into that machine and make ourselves root:

minikube ssh

sudo su -

After that we want to observe the traffic on port 80. We can do it using the following command:

sysdig fd.port=80

Now the terminal should freeze, waiting for traffic to dump.

Let's open another terminal and find ubuntu1's IP address:

kubectl describe pod ubuntu1

….

Status: Running
IP: 192.168.120.77
Containers:

Exec into the ubuntu1 pod and start a web server:

kubectl exec -it ubuntu1 bash

python3 -m http.server 80

Serving HTTP on 0.0.0.0 port 80 (http://0.0.0.0:80/) …

Open another terminal, exec into the ubuntu2 pod and send an HTTP request to ubuntu1's webserver:

kubectl exec -it ubuntu2 bash
root@ubuntu2:/# curl 192.168.120.77

Now in the sysdig terminal we should see many new log lines – look at the first one:

80565 10:16:48.740561399 0 curl (11292) 192.168.120.77:80

As we can see, ubuntu2 opened a connection to ubuntu1:80 – the randomly chosen client port used to send data back is 37454.

Network policies – ingress and egress

Now it's time to limit the network for the ubuntu1 pod. There will be 3 policies:

  • UDP egress on port 53 – allows DNS traffic so we can resolve e.g. google.com to an IP address inside the pod
  • TCP ingress on port 80 – allows clients to connect to our webserver
  • TCP egress on port 443 – allows the pod to connect to Internet services on the TLS port

Configuration:

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dns
spec:
  podSelector:
    matchLabels:
      app: ubuntu1
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: UDP
          port: 53

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-webserver
spec:
  podSelector:
    matchLabels:
      app: ubuntu1
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - port: 80
          protocol: TCP

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-tls
spec:
  podSelector:
    matchLabels:
      app: ubuntu1
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - port: 443
          protocol: TCP

Now we can go back to ubuntu1's console and try to open a connection to google.com on port 80 – we should see a timeout, while an HTTPS connection should get a response:

root@ubuntu1:/# curl -m 5 -vv google.com
* Rebuilt URL to: google.com/
* Trying 216.58.215.78…
* TCP_NODELAY set
* Connection timed out after 5004 milliseconds
* stopped the pause stream!
* Closing connection 0
curl: (28) Connection timed out after 5004 milliseconds

root@ubuntu1:/# curl https://google.com

Now let's start the webserver again and go back to ubuntu2 to test the connection once more. We still see the answer from the webserver.

So how is that possible? We defined a limiting rule for egress traffic, so it seems traffic should only be possible on port 443, yet the data is somehow still sent back to the client on some other port.

Deep dive – Calico iptables

The firewall rules which we defined must then be implemented by the CNI. Calico uses netfilter/iptables for that – every time we define a NetworkPolicy, Calico automatically sets up the proper netfilter rules on every node in the cluster.

Let's go back onto the Minikube VM and, with some combination of iptables -S and grep, try to find the related rules.
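
For example (the chain names contain generated ids, so they will differ on your cluster):

iptables -S | grep cali-pi

iptables -S | grep cali22716d29a85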

-A cali-pi-_Pyx9r8CS7bPqC0nMrCi -p tcp -m comment --comment "cali:x8PiQWJp-yhKM8vP" -m multiport --dports 80 -j MARK --set-xmark 0x10000/0x10000

-A cali-tw-cali22716d29a85 -m comment --comment "cali:MfYo--qV2fDDHXqO" -m mark --mark 0x0/0x20000 -j cali-pi-_Pyx9r8CS7bPqC0nMrCi

-A cali-from-wl-dispatch -i cali22716d29a85 -m comment --comment "cali:8gmcTnib5j5lzG4A" -g cali-fw-cali22716d29a85

-A cali-fw-cali22716d29a85 -m comment --comment "cali:CcU6YKJiUYOoRnia" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

As we can see, Calico defines custom chains and targets – we don't want to reverse engineer the whole thing here, as it's really complex.

In simple words: in the first line we see our main iptables rule for port 80, which sets a MARK on the packets – that MARK is used by advanced routing rules defined by Calico.

The second line shows how Calico links those iptables chains together, so the id _Pyx9r8CS7bPqC0nMrCi is related to cali22716d29a85.

In the third line we see that cali22716d29a85 is actually a network interface defined on the node and that packet processing should continue in the cali-fw-cali22716d29a85 chain.

Finally, the most important fourth line contains the --ctstate RELATED,ESTABLISHED -j ACCEPT parameters. Netfilter is a stateful firewall – it understands how a TCP connection works and can track in memory which connections are new and which are related or established. So when netfilter sees that some client on port 37454 has already established a connection to port 80, it tracks that connection and won't drop the returning packets because of the egress rule limiting traffic to port 443. The same mechanism applies when the pod opens a connection to some Internet service, so the ingress rule won't drop the returning packets either.
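
If you want to see that connection tracking in action, you can peek at the kernel's conntrack table on the Minikube VM while ubuntu2 is talking to the webserver – a sketch, assuming the nf_conntrack proc interface is available (the conntrack CLI is an alternative if it happens to be installed):

cat /proc/net/nf_conntrack | grep 192.168.120.77

conntrack -L | grep 192.168.120.77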