Kubernetes
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
It orchestrates and monitors containers, handles scaling and uptime, and manages networking and storage.
You describe your desired state; controllers watch the system and drive it toward that state. Kubernetes resources are the primitives used to define the system.
Pods, controllers (ReplicaSets, Deployments), services (persistent access points into pod-based applications), storage
What Kubernetes provides for containers:
- Service discovery
- Load balancing
- Storage orchestration
- Automate rollouts and rollbacks
- Self-healing
- Secrets and config management
- Horizontal scaling
Resources
- Pods: Run one or more containers as an atomic unit… either all of it runs or none of it does
- Deployments
- ReplicaSets
- Services: provide stable networking into pods: fixed IP, DNS name, scaling, load balancing
- Storage
- PersistentVolumes
- …
- ConfigMaps
- Secrets
- Node (a worker machine, often a virtual machine)
- Kubelet: watches the API server for changes, starts and stops pods via the container runtime, and runs pod probes
- Container runtime: runtime environment for containers; pulls images from a registry and sets up the container environment. The Container Runtime Interface (CRI) lets you switch between runtimes; containerd is the default (before 1.20 the default was Docker)
- Kube proxy: provides networking from the node into pods (typically via iptables); implements Services, routes traffic, load balancing
- Master node (now called the Control Plane node)
- etcd store: Persists the state of the cluster as key-value pairs
- Controller manager
- Scheduler
- API server: RESTful API exposing controls to nodes and administrators
- Cluster
- Must have networking capabilities for Pods to communicate across all nodes
- Must have networking capabilities for node agents (kubelet, kube-proxy) to communicate with all Pods on that node
- Add-on pods
- DNS pods
- Ingress controllers
- Kubernetes dashboard
- Networking scenarios
- Within a pod
- Pod to pod on the same node
- Pods across nodes
- External services
Kubernetes as a Developer
- kubectl (CLI)
- Web UI dashboard
kubectl version
kubectl cluster-info
kubectl get all
kubectl run [pod-name] --image=[image]
kubectl port-forward [pod] [ports]
kubectl expose ...
kubectl create [resource]
kubectl apply
# Web UI
kubectl apply -f [dashboard-yaml-url]
kubectl describe secret -n kube-system
# Locate account-token
kubectl proxy
# Go to dashboard URL
Pods
- Creating a pod
kubectl run
- In older Kubernetes versions (before 1.18) this created a Deployment rather than a bare pod
kubectl create or kubectl apply with a YAML file
- Expose a pod port
kubectl port-forward [pod] [externalport]:[internalport]
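For example (the pod name and image here are just placeholders), creating an nginx pod and forwarding a local port into it could look like:
kubectl run my-nginx --image=nginx:alpine
kubectl port-forward my-nginx 8080:80
# http://localhost:8080 now reaches port 80 in the container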
kubectl and pods - YAML fundamentals
- Defining a pod
- Pod health
- Typically one process/service per container, and one container per pod
- Organize application components into pods
- Pods are given an IP address, memory, and volumes, which can be shared across the pod's containers
- Scaled horizontally with replicas
- Containers in a pod share the pod's IP and port space
- So container processes within a pod must listen on different ports
apiVersion: v1
kind: Pod
metadata:
  name: my-nginx
spec:
  containers:
  - name: my-nginx
    image: nginx:alpine
    livenessProbe:
      # exec:             # alternative handler; a probe uses only one of exec, httpGet, or tcpSocket
      #   command: ...
      httpGet:
        path: /index.html
        port: 80
      initialDelaySeconds: 15
      timeoutSeconds: 2
      periodSeconds: 5
      failureThreshold: 1
# dry run
kubectl create -f file.pod.yml --dry-run=client --validate=true
# create if not exists
kubectl create -f file.pod.yml
# create or update if exists (just use this)
kubectl apply -f file.pod.yml
kubectl delete -f file.pod.yml
kubectl describe pod [pod]
# Enter shell
kubectl exec -it [pod] -- sh
# Edit changes in place
kubectl edit pod [pod]
Labels are important to link up different resources
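For example (the label and pod name below are illustrative), you can list and modify resources by label with selectors:
# list pods carrying the label a Service or Deployment selects on
kubectl get pods -l app=nginx
# add another label to an existing pod
kubectl label pod my-nginx tier=frontend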
Get IP Address of Pod
kubectl get pod {name} -o yaml | grep podIP
ReplicaSet
- Ensures pod scaling and liveness
- Used to create pods indirectly
- Used by Deployments
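A minimal ReplicaSet manifest might look like the sketch below (names and image are placeholders); in practice you rarely create one directly and let a Deployment manage it instead:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine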
Deployments
- Manages ReplicaSets
- Provides rollbacks
Probes
Probes are diagnostics run by kubelets.
- Liveness probe: determines if a pod's container is healthy; on failure the container is restarted
- Readiness probe: determines if a pod is ready to receive traffic from the load balancer
- ExecAction
- TCPSocketAction
- HTTPGetAction
- success
- failure
- unknown
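As a sketch (the path and port are assumptions), a readiness probe using an HTTPGetAction could look like:
readinessProbe:
  httpGet:
    path: /index.html
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10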
Note: It’s important to put resource constraints on pod specs to ensure node health.
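For example (the values are illustrative), requests and limits go on each container spec:
resources:
  requests:
    cpu: "250m"
    memory: "64Mi"
  limits:
    cpu: "500m"
    memory: "128Mi"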
Note: run nodes on compute in a private subnet (VPN to connect) and expose services through a public load balancer.
Deployment strategies:
- rolling updates (the default when using Deployments)
- blue-green deployments
- canary deployments
- rollbacks
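A sketch of tuning the rolling update strategy on a Deployment spec and rolling back (the deployment name is a placeholder):
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # extra pods allowed above the desired count during an update
      maxUnavailable: 0  # pods that may be unavailable during an update
kubectl rollout status deployment/{name}
kubectl rollout undo deployment/{name}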
Services
- single point of entry to one or more pods
- since pods are ephemeral, pod-specific IP addresses cannot be relied on
- services establish a fixed IP to abstract pod IPs from consumers
- pods and services are linked by labels
- load balances between pods
- the worker node's kube-proxy creates a virtual IP for services
- load balance at the layer 4 level (TCP/UDP over IP)
- services are not ephemeral
Types of Services
- ClusterIP - exposes service on internal IP; only pods within the cluster can use the service
- NodePort - exposes the service on a static port on each node's IP; each worker node proxies that port to the internal service; useful for development
- LoadBalancer - external IP to act as a load balancer; useful when combined with a cloud provider’s load balancer; NodePort and ClusterIP services are created implicitly
- ExternalName - DNS name; acts as an alias for an external service
Port Forwarding
You can port forward into multiple resources. However, because pods are ephemeral, it's better to forward into deployments or services.
kubectl port-forward pod/{name} {extPort}:{port}
kubectl port-forward deployment/{name} {extPort}:{port}
kubectl port-forward service/{name} {extPort}
YAML
apiVersion: v1
kind: Service
metadata:
  # name, labels, etc...
  name: nginx # Gives a DNS entry within the cluster
  labels:
    app: nginx
spec:
  type: ClusterIP # or NodePort, LoadBalancer, ExternalName
  selector:
    # Pod template label(s)
    app: nginx
  ports:
  - name: http
    port: 80
    targetPort: 80
apiVersion: v1
kind: Service
metadata:
  # name, labels, etc...
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 31000 # Optional for NodePort
apiVersion: v1
kind: Service
metadata:
  # name, labels, etc...
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
apiVersion: v1
kind: Service
metadata:
  name: external-service
spec:
  type: ExternalName
  externalName: api.extern.com
  ports:
  - port: 18000
Get Service IP
This is not necessary within a cluster, as the service name is a local DNS name.
kubectl get services
Test Connection Between Pods and Services
kubectl exec {pod} -- curl -s http://{service|podIP}
kubectl exec -it {pod} -- sh
> apk add curl
> curl -s http://{service|podIP}
Storage
- Volumes
- PersistentVolumes
- PersistentVolumeClaims
- StorageClasses
State/data can be stored and shared between a pod's containers with Volumes (and across pods with PersistentVolumes). The pod file system is ephemeral. Pods can have multiple volumes, and containers use a mount path to access a volume.
Volumes
A volume can be tied to the pod's lifetime, and containers access it through a mount path.
Types:
- emptyDir; shared among multiple containers on a pod, tied to pod’s lifetime
- hostPath; mounts to node filesystem; can mount to docker socket on node host; different types are possible
- DirectoryOrCreate
- Directory
- FileOrCreate
- File
- Socket
- CharDevice
- BlockDevice
- nfs
- configMap/secret: special volumes
- persistentVolumeClaim
- cloud (Azure Disk/File, AWS EBS, GCE persistent disk)
- …tons of other built-in and custom volume types
containers:
- ...
  volumeMounts:
  - name: {name}
    mountPath: /usr/share/...
    readOnly: true
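The matching pod-level volumes section must use the same name; a minimal sketch with an emptyDir volume:
volumes:
- name: {name}
  emptyDir: {}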
# Look for Volumes
kubectl describe pod {pod}
# See volume mounts
kubectl get pod {pod} -o yaml
Poking Around Host Docker
- Mount /var/run/docker.sock as a volume
- Run docker ps -a inside the container: you can see all the containers that k8s runs
PersistentVolumes
Cluster-wide storage resource that relies on network-attached storage, works with cloud, NFS, etc. Does not have a lifetime limited by a pod.
The administrator sets up the PersistentVolume resource and a PersistentVolumeClaim resource, then that claim is used on the pod template along with a mount path.
Key fields: accessModes, capacity, resource requests.
Node affinity chooses which nodes the volume can live on.
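A minimal sketch of a PersistentVolume plus a matching claim (the hostPath, size, and names are illustrative; real clusters usually use network or cloud storage):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /data/pv-example
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-example
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
The pod then mounts the claim via a volume:
volumes:
- name: data
  persistentVolumeClaim:
    claimName: pvc-example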
StorageClasses
Dynamically provision storage.
A PVC can reference a StorageClass, which provisions the backing PV on demand.
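As a sketch (the class name is a placeholder and the provisioner depends on your cluster or cloud provider):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/no-provisioner # placeholder; cloud clusters use their provider's provisioner
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-dynamic
spec:
  storageClassName: fast
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi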
StatefulSet
Provides ordering and uniqueness guarantees; good for databases; keeps pod naming predictable (see the sketch below).
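A minimal sketch (names and image are placeholders; it assumes a headless Service named db already exists for serviceName):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db # headless Service (assumed to exist)
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: db
        image: postgres:15 # placeholder image
This yields pods with stable, predictable names: db-0, db-1, db-2.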
ConfigMaps and Secrets
ConfigMaps
Key-value pairs exposed as environment variables or accessed via a ConfigMap volume.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-settings
  labels:
    app: app-settings
data:
  enemies: aliens
  lives: "3"
  enemies.cheat: "true" # data values must be strings, so quote non-string literals
  enemies.cheat.level: noGoodRotten
# config file
enemies=aliens
lives=3
enemies.cheat=true
enemies.cheat.level=noGoodRotten
# create configmap
kubectl create configmap {name} --from-file={path}
# from an env file: each KEY=value line becomes its own key (unlike --from-file, which stores the whole file under one key)
kubectl create configmap {name} --from-env-file={path}
Using ConfigMaps
# get contents
kubectl get configmap {name} -o yaml # cm is the short name for configmap
spec:
  template:
    ...
    spec:
      containers:
      - ...
        env:
        - name: ENEMIES
          valueFrom:
            configMapKeyRef:
              name: app-settings
              key: enemies
Load the entire ConfigMap into a container spec:
spec:
  template:
    ...
    spec:
      containers:
      - ...
        envFrom:
        - configMapRef:
            name: app-settings
…or create a volume where config variables are files:
Note: The advantage of this is that the files are changed in place without requiring a pod restart.
spec:
  template:
    ...
    spec:
      volumes:
      - name: app-config-vol
        configMap:
          name: app-settings
      containers:
      - ...
        volumeMounts:
        - name: app-config-vol # must match the volume name above
          mountPath: /etc/config
Secrets
Sensitive data that can be provided securely to containers. Just like ConfigMaps, they can be mounted as files or set as environment variables. They are stored in tmpfs on the worker nodes.
kubectl create secret generic {name} --from-literal={key}={value}
kubectl create secret generic {name} --from-file={key}={path_to_file}
kubectl create secret tls {name} --cert={path_to_cert} --key={path_to_key}
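A sketch of consuming a secret in a pod spec (the secret and key names are placeholders), either as an environment variable or mounted as a volume:
env:
- name: DB_PASSWORD
  valueFrom:
    secretKeyRef:
      name: db-secret
      key: password
volumes:
- name: secret-vol
  secret:
    secretName: db-secret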
Warning: Secrets in manifest files are only base64 encoded, and not secure.
Best Practices
- Enable encryption at rest
- Limit access to etcd to only admin users
- Use SSL/TLS for etcd peer-to-peer communication
- Enforce role-based access for creating new pods as they can request secrets
How does SOPS fit into this?
Where to run Kubernetes?
- On the cloud
  - IaaS: on top of VMs, managed by you; requires setting up K8s, patching the OS, etc.
  - PaaS: managed services
- On premises
  - Bare metal
  - Virtual machines
- The choice usually depends on an organization's existing infrastructure
- Considerations (especially for VMs)
  - Cluster networking
    - No overlapping IP ranges
  - Scalability
  - High availability
    - Control plane can be made redundant
  - Disaster recovery
- For installing: Docker for Mac, kubeadm, cloud scenarios
- Requirements
  - Linux
  - 2 CPUs
  - 2+ GB RAM
  - Swap disabled
- Cluster network ports
  - API server: 6443 (default)
  - etcd: 2379 and 2380 (default)
  - Scheduler: 10251
  - Controller manager: 10252
  - Kubelet (control plane and worker nodes): 10250
  - NodePort services: 30000-32767
Installing on Ubuntu example
- containerd, kubectl, kubeadm, and kubelet running on all nodes
- swap disabled for kubelets
swapoff -a
vi /etc/fstab
- hostnames set on each node and configured in each node's /etc/hosts file
- modprobe adds kernel modules; list them in /etc/modules-load.d/*.conf so they load on startup
- sudo apt-mark hold {package} locks the package version
Control plane has its own core cluster pods within the cluster itself. The control plane node is tainted by default so other pods are not run on it. (Eating your own dog food?)
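You can see that taint on the node (the taint key varies by version: newer clusters use node-role.kubernetes.io/control-plane, older ones node-role.kubernetes.io/master):
kubectl describe node {control-plane-node} | grep Taints
# Taints: node-role.kubernetes.io/control-plane:NoSchedule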
Use kubectl explain {resource} for quick documentation. You can drill down, e.g. kubectl explain pod.spec.containers, things like that…
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - image: gcr.io/google-samples/hello-app:1.0
        name: hello-app
Generate resource manifests quickly with the --dry-run flag.
kubectl create deployment hello-world \
  --image=gcr.io/google-samples/hello-app:1.0 \
  --dry-run=client -o yaml > deployment.yaml
kubectl expose deployment hello-world \
--port=80 --target-port=8080 \
--dry-run=client -o yaml > service.yaml
When you run kubectl apply to apply a new or updated manifest, you are submitting the changes to the API server, which parses them and stores them in etcd. The controller manager watches for new resources, and the scheduler watches for unscheduled pods, writing each pod's node assignment back into etcd.
The kubelet on the assigned node uses the API server to check for updates, pulls the container images, runs them, and connects networking with kube-proxy.