I’ve done a fair amount of work learning VMware PKS and NSX-T, but I wanted to
drop down a level and get more familiar with the inner workings of Kubernetes,
as well as explore some of the newer features exposed by the NSX Container
Plugin that are not yet in the PKS integrations.
The NSX-T docs are…not great. I certainly don’t think you can work out the steps
required from the official NCP installation guide without a healthy dollop of
background knowledge and familiarity with Kubernetes and CNI. Anthony Burke
published this guide, which is great, and I am lucky enough to be able to pick
his brains on our corporate Slack.
So, what follows is a step-by-step guide that is the product of my learning
process. I can’t guarantee it’s perfect, but I’ve tried to write these steps in
a logical order, validate them, and add some explanation as I go. If you spot a
mistake or an error, please feel free to get in touch (for my learning, and so I
can correct it!)
Three Ubuntu 16.04 virtual machines deployed with 2 vCPU and 4GB RAM each. The
VMs have been deployed as default installations, and this guide assumes the
same.
DNS records have been created for each VM.
| Name | Role | FQDN | IP |
|---|---|---|---|
| k8s-m01 | Master | k8s-m01.definit.local | 10.0.0.10 |
| k8s-w01 | Worker | k8s-w01.definit.local | 10.0.0.11 |
| k8s-w02 | Worker | k8s-w02.definit.local | 10.0.0.12 |
Network adapters
Each node is configured with three network adapters.
| Network | NIC | IP Addressing | Notes |
|---|---|---|---|
| Management | ens160 | 192.168.10.0/24 | VLAN-backed network used for management access from the Kubernetes nodes to the NSX Manager |
| Kubernetes Access | ens192 | 10.0.0.0/24 | NSX-T segment (logical switch) that’s connected to the tier 1 router (t1-kubernetes) and provides access to the Kubernetes nodes. Default route. |
| Kubernetes Transport | ens224 | None | NSX-T segment (logical switch) that provides a transport network for overlay traffic. |
Each node’s /etc/network/interfaces is configured to match; the configuration below is from the master node (k8s-m01, 10.0.0.10).

auto lo
iface lo inet loopback

# management
auto ens160
iface ens160 inet static
  address 192.168.10.90
  netmask 255.255.255.0

# k8s-access
auto ens192
iface ens192 inet static
  address 10.0.0.10
  netmask 255.255.255.0
  gateway 10.0.0.1
  dns-nameservers 192.168.10.17
  dns-search definit.local

# k8s-transport
auto ens224
iface ens224 inet manual
  up ip link set ens224 up
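After saving the file, apply the new configuration. A quick sketch (restarting networking may briefly interrupt an SSH session on the management interface, so a reboot is a safe alternative):

# Apply the new interface configuration (or simply reboot the node)
sudo systemctl restart networking
# Confirm the transport interface is up and has no IP address assigned
ip addr show ens224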
Install Docker and Kubernetes
Disable swap (required for kubelet to run), install Docker, and install Kubernetes 1.13.
Master and Worker Nodes
# Disable swap
sudo swapoff -a
# Update and upgrade
sudo apt-get update && sudo apt-get upgrade -y
# Install docker
sudo apt-get install -y docker.io apt-transport-https curl
# Start and enable docker
sudo systemctl start docker.service
sudo systemctl enable docker.service
# Add the Kubernetes repository and key
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
# Update package lists
sudo apt-get update
# Install kubernetes 1.13 and hold the packages
sudo apt-get install -y kubelet=1.13.5-00 kubeadm=1.13.5-00 kubectl=1.13.5-00
sudo apt-mark hold kubelet kubeadm kubectl
You should also edit /etc/fstab and comment out the swap line to ensure swap stays disabled across reboots.
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
/dev/mapper/ubuntu--1604--k8s--vg-root / ext4 errors=remount-ro 0 1
# /boot was on /dev/sda1 during installation
UUID=0caaf4eb-734a-43f0-b4ef-b7b0a012abfc /boot ext2 defaults 0 2
#/dev/mapper/ubuntu--1604--k8s--vg-swap_1 none swap sw 0 0
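If you’d rather not edit the file by hand, a one-liner along these lines does the same thing (a sketch; double-check the result before rebooting):

# Comment out any swap entries in /etc/fstab, keeping a backup copy
sudo sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab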
Install the CNI Plugin
Download the NSX Container zip package to the home folder
Master and Worker Nodes
# Validate AppArmor is enabled (expected output is "Y")
sudo cat /sys/module/apparmor/parameters/enabled
# Extract the NSX Container package
unzip nsx-container-2.4.1.13515827.zip
# Install the CNI Plugin
sudo dpkg -i nsx-container-2.4.1.13515827/Kubernetes/ubuntu_amd64/nsx-cni_2.4.1.13515827_amd64.deb
Install Open vSwitch
You must install the Open vSwitch package bundled with the NSX Container Plugin; a sketch of the install commands follows. Open vSwitch is then configured to use the k8s-transport interface (ens224) for overlay traffic.
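The OVS .deb packages sit alongside the CNI plugin inside the extracted nsx-container folder. The directory and package names below are assumptions based on the 2.4.1 bundle layout, so adjust them to match what you actually find in the zip:

# Install the bundled Open vSwitch packages
# (directory and package names are illustrative - check your extracted nsx-container folder)
cd nsx-container-2.4.1.13515827/OpenvSwitch/xenial_amd64
sudo dpkg -i openvswitch-datapath-dkms_*.deb openvswitch-common_*.deb openvswitch-switch_*.deb
# Resolve any missing dependencies reported by dpkg
sudo apt-get install -f -y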
Master and Worker Nodes
# Create the OVS Bridge
sudo ovs-vsctl add-br br-int
# Assign the k8s-transport interface for overlay traffic
sudo ovs-vsctl add-port br-int ens224 -- set Interface ens224 ofport_request=1
# Bring the bridge and overlay interface up
sudo ip link set br-int up
sudo ip link set ens224 up
Configure NSX-T Resources
NSX-T 2.4 introduces a new intent-based policy API, which is reflected in the
changes in the user interface. Unfortunately the NCP does not use the new policy
API, so the objects created for consumption by NCP must be created using the
“Advanced Networking and Security” tab.
For all components, collect the object ID for later use.
Overlay Transport Zone
Create, or use an existing overlay transport zone
Routers
Tier 0
Deploy and configure a tier 0 router; this will be the point of ingress/egress
for the NSX-T networks. The router must be created in active/standby mode, since
stateful services (NAT) are required.
Tier 1
Deploy and configure a tier 1 router; this router will provide access to the
Kubernetes cluster via the k8s-access segment created below. The tier 1 router
should be connected to the tier 0 router, with a router port configured for the
k8s-access segment.
Segments
k8s-transport
The switch ports for the Kubernetes nodes on the transport segment need to be
tagged so that the NCP can identify them.
| Tag | Scope |
|---|---|
| <node name> | ncp/node_name |
| <kubernetes cluster name> | ncp/cluster |
k8s-access
The k8s-access segment will provide API access to the Kubernetes cluster. It
should be created and connected to a router port on the tier 1 router.
IP Blocks
| Name | CIDR | Notes |
|---|---|---|
| k8s-pod-network | 172.16.0.0/16 | |
IP Pool for SNAT
Create an IP pool in NSX Manager that is used to allocate IP addresses for
translating pod IPs and Ingress controllers using SNAT/DNAT rules. These IP
addresses are also referred to as external IPs.
| Name | Range | CIDR |
|---|---|---|
| k8s-snat | 10.0.1.1-10.0.1.254 | 10.0.1.0/24 |
Firewall
The NCP creates firewall sections for pod isolation and access policy between a
“top” and a “bottom” marker section. By default, if these marker sections are
not created, all isolation rules will be created at the bottom of the rule list,
and all access policy sections will be created at the top. If you want to
control where the rules are positioned (why wouldn’t you?!) you need to create
the markers.
Collect the component IDs to use later in the NCP configuration
| Component | Name | ID |
|---|---|---|
| Tier 0 Router | t0-kubernetes | b92fbb43-b664-46e9-bd4a-d8d46dac82f8 |
| Overlay Transport Zone | tz-overlay | f99ba556-eafe-4c23-beaf-8df1723a354e |
| Pod IP Block | k8s-pod-network | ab81892f-eb9a-4248-8606-b26f2293fa83 |
| External IP Pool | k8s-snat-ip-pool | dde463fb-15d9-456f-aa3d-030e24f1b1db |
| Top Firewall Marker | k8s-top | f887ba4f-d634-44ff-8338-05c4a214cc93 |
| Bottom Firewall Marker | k8s-bottom | 820da106-a35c-43c0-bcf0-edaf94004d9e |
Create the Kubernetes Cluster
Initialise the Cluster
Master Node
# Initialise the cluster using the master node's k8s-access IP address
sudo kubeadm init --apiserver-advertise-address=10.0.0.10
# Copy the Kubernetes config to the non-root user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Join the Cluster from the worker nodes
Grab the join command from the output of the master node’s kubeadm init and run it on each worker node; it will look something like the sketch below.
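The token and CA cert hash are unique to your cluster, so the values here are placeholders only:

# Join each worker to the cluster (token and hash come from the kubeadm init output)
sudo kubeadm join 10.0.0.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>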
Import the NCP docker image to the local repository on each Kubernetes node. The
image will be loaded with a long and ugly name
“registry.local/2.4.1.13515827/nsx-ncp-ubuntu”, so to keep the config files
simple and to help with upgrades later, tag the image to create an alias.
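Something along these lines works on each node; the .tar filename inside the Kubernetes folder is an assumption, so check what your extracted bundle actually contains, and confirm the loaded image name/tag with docker images before tagging:

# Load the NCP image from the NSX Container Plugin bundle
# (the .tar filename is illustrative - check the extracted Kubernetes folder)
sudo docker load -i nsx-container-2.4.1.13515827/Kubernetes/nsx-ncp-ubuntu-2.4.1.13515827.tar
# Check the name/tag the image was loaded with, then alias it
sudo docker images
sudo docker tag registry.local/2.4.1.13515827/nsx-ncp-ubuntu nsx-ncp-ubuntu:2.4.1.13515827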
Now you can refer to the image as nsx-ncp-ubuntu:2.4.1.13515827 in the config
files.
Create a namespace, roles, service accounts
The NCP pod can run in the default or kube-system namespaces; however,
creating a dedicated namespace for the NSX pods to run in is considered a best
practice and allows more granular control over access and permissions.
# Create Namespace for NSX resources
kind: Namespace
apiVersion: v1
metadata:
  name: nsx-system
The first ServiceAccount we create is called ncp-svc, along with two
ClusterRoles, ncp-cluster-role and ncp-patch-role. The two roles are
assigned to the new ServiceAccount using two ClusterRoleBindings. As the
name suggests, the account is used to run processes in the NCP pod.
Figuring out these YAML files can be daunting, especially when you’re just
getting started like me. I found the official documentation pretty useful to
understand how Kubernetes service accounts and roles interact - see Managing
Service
Accounts.
# Create a ServiceAccount for NCP namespace
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ncp-svc
  namespace: nsx-system
---
# Create ClusterRole for NCP
kind: ClusterRole
# Set the apiVersion to rbac.authorization.k8s.io/v1beta1 when k8s < v1.8
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ncp-cluster-role
rules:
- apiGroups:
  - ""
  - extensions
  - networking.k8s.io
  resources:
  - deployments
  - endpoints
  - pods
  - pods/log
  - networkpolicies
  # Move 'nodes' to ncp-patch-role when hyperbus is disabled.
  - nodes
  - replicationcontrollers
  # Remove 'secrets' if not using Native Load Balancer.
  - secrets
  verbs:
  - get
  - watch
  - list
---
# Create ClusterRole for NCP to edit resources
kind: ClusterRole
# Set the apiVersion to rbac.authorization.k8s.io/v1beta1 when k8s < v1.8
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ncp-patch-role
rules:
- apiGroups:
  - ""
  - extensions
  resources:
  # NCP needs to annotate the SNAT errors on namespaces
  - namespaces
  - ingresses
  - services
  verbs:
  - get
  - watch
  - list
  - update
  - patch
# NCP needs permission to CRUD custom resource nsxerrors
- apiGroups:
  # The api group is specified in custom resource definition for nsxerrors
  - nsx.vmware.com
  resources:
  - nsxerrors
  - nsxnetworkinterfaces
  - nsxlocks
  verbs:
  - create
  - get
  - list
  - patch
  - delete
  - watch
  - update
- apiGroups:
  - ""
  - extensions
  - nsx.vmware.com
  resources:
  - ingresses/status
  - services/status
  - nsxnetworkinterfaces/status
  verbs:
  - replace
  - update
  - patch
---
# Bind ServiceAccount created for NCP to its ClusterRole
kind: ClusterRoleBinding
# Set the apiVersion to rbac.authorization.k8s.io/v1beta1 when k8s < v1.8
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ncp-cluster-role-binding
roleRef:
  # Comment out the apiGroup while using OpenShift
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ncp-cluster-role
subjects:
- kind: ServiceAccount
  name: ncp-svc
  namespace: nsx-system
---
# Bind ServiceAccount created for NCP to the patch ClusterRole
kind: ClusterRoleBinding
# Set the apiVersion to rbac.authorization.k8s.io/v1beta1 when k8s < v1.8
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ncp-patch-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ncp-patch-role
subjects:
- kind: ServiceAccount
  name: ncp-svc
  namespace: nsx-system
The second ServiceAccount we need to create is called nsx-agent-svc, which
is used to run the NSX Node Agent Pod. We create a ClusterRole to assign the
required permissions, and then another ClusterRoleBinding to assign the
account to the role.
# Create a ServiceAccount for nsx-node-agent
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nsx-agent-svc
  namespace: nsx-system
---
# Create ClusterRole for nsx-node-agent
kind: ClusterRole
# Set the apiVersion to rbac.authorization.k8s.io/v1beta1 when k8s < v1.8
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nsx-agent-cluster-role
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  # Uncomment the following resources when hyperbus is disabled
  # - nodes
  # - pods
  verbs:
  - get
  - watch
  - list
---
# Bind ServiceAccount created for nsx-node-agent to its ClusterRole
kind: ClusterRoleBinding
# Set the apiVersion to rbac.authorization.k8s.io/v1beta1 when k8s < v1.8
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nsx-agent-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nsx-agent-cluster-role
subjects:
- kind: ServiceAccount
  name: nsx-agent-svc
  namespace: nsx-system
Save the namespace, NCP and NSX Agent config as a single YAML file called
nsx-ncp-rbac.yml, and then deploy the new resources using the following command:
Master Node or Admin Workstation
kubectl apply -f nsx-ncp-rbac.yml
Prepare the NCP config file
In order to deploy the NCP we use a YAML config file that’s included in the NSX
Container Plugin zip package. This consists of a
ConfigMap
and a
Deployment.
These specify the configuration data required to run the application (in this
case NCP) and the pod deployment details.
# Copy the ncp-deployment template from the NSX Container Plugin folder
cp nsx-container-2.4.1.13515827/Kubernetes/ncp-deployment.yml .
Next we need to edit the settings to match our environment - the file below has
all the comments removed for clarity but I suggest reading through them all.
- Be sure to add the namespace (nsx-system) to the resources in the file
- cluster must match the value created in the NSX ncp/cluster tags
- I’m using password authentication for NSX in my lab…don’t do this in production! Configure TLS certificates instead
- All the component IDs below are the ones collected when we created the NSX components
- Be sure to update the serviceAccountName value with the ServiceAccount created in the RBAC file (ncp-svc)
- Make sure the image value is updated with the tagged name in the local Docker registry (nsx-ncp-ubuntu:2.4.1.13515827)
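For reference, these are the kinds of values I ended up setting in the ncp.ini section of the ConfigMap. This is a hand-written sketch rather than a copy of the template, so treat the option names and the placeholder NSX Manager address as assumptions and verify them against the comments in your own ncp-deployment.yml:

[coe]
# Must match the ncp/cluster tag applied to the node logical ports
cluster = <kubernetes cluster name>

[k8s]
# Kubernetes API server, reachable over the k8s-access network
apiserver_host_ip = 10.0.0.10
apiserver_host_port = 6443

[nsx_v3]
# NSX Manager address and credentials (lab only - use certificate auth in production)
nsx_api_managers = <nsx manager ip>
nsx_api_user = admin
nsx_api_password = <password>
insecure = True
# Object IDs collected earlier
tier0_router = b92fbb43-b664-46e9-bd4a-d8d46dac82f8
overlay_tz = f99ba556-eafe-4c23-beaf-8df1723a354e
container_ip_blocks = ab81892f-eb9a-4248-8606-b26f2293fa83
external_ip_pools = dde463fb-15d9-456f-aa3d-030e24f1b1db
top_firewall_section_marker = f887ba4f-d634-44ff-8338-05c4a214cc93
bottom_firewall_section_marker = 820da106-a35c-43c0-bcf0-edaf94004d9e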
All being well, we should now have NCP configured and ready to go. Switch to the
nsx-system namespace (check out kubens for
this!) and view the resources using kubectl get <resource> and kubectl describe <resource> <resource name>.
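Assuming the edited ncp-deployment.yml has been applied in the same way as the RBAC file, a quick check looks something like this (the pod name is a placeholder):

# Deploy NCP using the edited template
kubectl apply -f ncp-deployment.yml
# Switch context to the nsx-system namespace and check the NCP pod
kubens nsx-system
kubectl get pods
kubectl describe pod <ncp pod name>
# Tail the NCP logs if the pod isn't healthy
kubectl logs <ncp pod name>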
NCP will create components for each Kubernetes namespace; these can be viewed
through the Advanced Networking and Security tab.
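A simple way to see this in action is to create a throwaway namespace and a pod, then watch the corresponding logical switch, tier 1 router and firewall sections appear in NSX Manager (the names below are just examples):

# Create a test namespace and run a pod in it
kubectl create namespace ncp-test
kubectl run nginx --image=nginx --restart=Never -n ncp-test
# The pod should get an address from the 172.16.0.0/16 pod IP block
kubectl get pod nginx -n ncp-test -o wide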