[NSX ALB][vSphere with Tanzu] Using markers to choose a VIPnetwork from the IPAM

When we deploy vSphere with Tanzu with the NSX Advanced Load Balancer (NSX ALB) as the load balancer, we can choose our Workload networks but not our Frontend networks: these are assigned directly from the IPAM Profile we have configured in our ALB.

But what if we are using ALB for more than one purpose and we have an IPAM with several networks? For example, we have one network for Tanzu and another for balancing conventional VM services:

In AKO standalone deployments we can use the vipNetworkList field to choose which network we want as our frontend network, but this option is not available when we deploy vSphere with Tanzu using NSX ALB as the load balancer. So when we deploy API servers and publish services from our cluster, IPs are handed out in round-robin fashion across all the networks we have configured in our IPAM.

Recently, I came across a case in which a customer had exactly this problem, and I was not able to solve it right away. I got down to work, but after reviewing the VMware and AVI Networks documentation I could not find an answer, so I decided to take the AKO source code (GitHub) and try to find the answer myself:

Research

When deploying vSphere with Tanzu with NSX ALB, a “special” instance of AKO is installed in the supervisor cluster, under the namespace “vmware-system-ako”.

How to connect to the supervisor cluster (VMware)

Here we find a first message that gives us a little hint:

No Marker configurable usable networks found. 

This tells us that in this instance of AKO, which is deployed as “advancedL4” (I won’t go into detail here, but if you have any questions feel free to comment or contact me on LinkedIn), the network is chosen using “Markers”.

Markers are a way of tagging objects within the NSX ALB, something similar to vSphere tags or k8s labels. Now, what do we have to tag and where? In the NSX ALB GUI I can’t find any option to tag the network:

However, if we access via the AVI shell (SSH to the controller and then shell command)

There is a markers option, configurable with a key and values!

Great, we already know where to add the markers. Now, what do we add? To find out, we download the AKO source code from GitHub and search inside it for “Markers”.

# grep -rnw . -e "Marker"

In the file /internal/cache/controller_obj_cache.go, at line 3065, we find the message we saw in the AKO logs, so let’s take a look at the surrounding code. We find the following:

                if len(network.Markers) == 1 &&
                        *network.Markers[0].Key == lib.ClusterNameLabelKey &&
                        len(network.Markers[0].Values) == 1 &&
                        network.Markers[0].Values[0] == clusterName {
                        utils.AviLog.Infof("Marker configuration found in usable network. Using %s as vipNetworkList.", *network.Name)
                        return nil, *network.Name

From this code we deduce that the “key” we have to add as a marker corresponds to a constant called “ClusterNameLabelKey”, and the value corresponds to a variable called “clusterName”. Note that the network must carry exactly one marker with exactly one value for the check to pass.

For the first one, a little research on Google leads us to pkg.go.dev, where a document of constants includes it:
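To make the condition easier to read, here is a minimal, self-contained sketch of the same check. The Marker and Network types and the pickMarkedNetwork function are hypothetical stand-ins written for this post, not the real Avi SDK models or AKO functions; the key value “clustername” is the one we confirm a few lines further down.

```go
package main

import "fmt"

// Marker and Network are simplified, hypothetical stand-ins for the
// controller objects; only the fields used by the check are kept.
type Marker struct {
	Key    *string
	Values []string
}

type Network struct {
	Name    *string
	Markers []*Marker
}

// clusterNameLabelKey mirrors the value of lib.ClusterNameLabelKey
// ("clustername") that the post identifies.
const clusterNameLabelKey = "clustername"

// pickMarkedNetwork returns the name of the first network that carries
// exactly one marker, with key "clustername" and exactly one value equal
// to clusterName. It reports false when no network qualifies.
func pickMarkedNetwork(networks []Network, clusterName string) (string, bool) {
	for _, n := range networks {
		if n.Name == nil || len(n.Markers) != 1 {
			continue
		}
		m := n.Markers[0]
		if m != nil && m.Key != nil && *m.Key == clusterNameLabelKey &&
			len(m.Values) == 1 && m.Values[0] == clusterName {
			return *n.Name, true
		}
	}
	return "", false
}

func strPtr(s string) *string { return &s }

func main() {
	// Example data only: network names and cluster ID are made up.
	networks := []Network{
		{Name: strPtr("vm-services")},
		{Name: strPtr("tanzu-frontend"), Markers: []*Marker{
			{Key: strPtr("clustername"), Values: []string{"my-cluster-id"}},
		}},
	}
	if name, ok := pickMarkedNetwork(networks, "my-cluster-id"); ok {
		fmt.Println("selected VIP network:", name)
	} else {
		fmt.Println("no marked network found")
	}
}
```

The sketch makes one thing visible that is easy to miss in the excerpt: a network with two markers, or a marker with two values, is ignored even if one of them matches.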

So we have already found the value of “key” that we should apply to the network, it is “clustername”.

For the second one, if we check the code of the “lib.go” file, we find the following:

Therefore, the variable clusterName takes its value from the function “GetClusterID”, which in turn reads an environment variable of the AKO pod called “CLUSTER_ID”. If we start a shell in the AKO pod inside the supervisor cluster and launch the “env” command:

We now have our CLUSTER_ID, so we know which markers to add to our network!

In my case:

  • Key = clustername
  • Value = domain-c37:ae737fff-26ce-4e93-9fdf-9d3cebc3e5ae

Solution

1. Accessing the Supervisor Cluster

In order to select a VIPnetwork in our vSphere with Tanzu environment, we must first log in as root in the supervisor cluster.

How to connect to the Supervisor Cluster (VMware)

2. Getting the CLUSTER_ID

Once inside, we will list the pods in the namespace “vmware-system-ako” and take the name of the pod:

Next, we will open a shell in the pod with the following command:

# kubectl exec -it vmware-system-ako-ako-controller-manager-POD-ID -n vmware-system-ako -- /bin/bash

Once inside we obtain our CLUSTER_ID by launching the command env | grep CLUSTER_ID, and we write it down:

We exit the Pod but do not close the SSH session, as we will need to come back to restart AKO later.

3. Adding the marker to the network

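We connect via SSH to our NSX ALB Controller and, once inside, we launch the “shell” command and log in again:

Then, we launch the following commands, replacing networkname with the name of our network and CLUSTER_ID with the value we wrote down in step 2: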

# configure network networkname
# markers
# key clustername values CLUSTER_ID
# save
# save
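Since the CLUSTER_ID is long and easy to mistype, one option is to generate the command sequence instead of typing it by hand. The sketch below only builds the text of the five commands shown above so they can be pasted into the AVI shell; it does not talk to the controller. The function markerCommands and the sample network name are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// markerKey is the marker key identified earlier in the post.
const markerKey = "clustername"

// markerCommands returns the AVI shell command sequence from this step,
// filled in with the given network name and cluster ID.
func markerCommands(networkName, clusterID string) ([]string, error) {
	if strings.TrimSpace(networkName) == "" || strings.TrimSpace(clusterID) == "" {
		return nil, fmt.Errorf("network name and cluster ID must not be empty")
	}
	return []string{
		"configure network " + networkName,
		"markers",
		"key " + markerKey + " values " + clusterID,
		"save", // leaves the markers sub-mode
		"save", // saves the network object
	}, nil
}

func main() {
	// Sample values: the network name is a placeholder, the cluster ID is
	// the one from the example earlier in the post.
	cmds, err := markerCommands("networkname", "domain-c37:ae737fff-26ce-4e93-9fdf-9d3cebc3e5ae")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Join(cmds, "\n"))
}
```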

Finally, we verify that our marker has been added correctly by launching the command:

# show network networkname

4. Restarting the AKO Pod and verifying operation

Let's go back to the SSH session we left open against our Supervisor Cluster and restart AKO (this may have an impact on the service, so it is recommended to do this step in a maintenance window).

# kubectl delete pod vmware-system-ako-ako-controller-manager-POD-ID -n vmware-system-ako

Let's wait a few seconds for the Pod to be redeployed, then check the logs:
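What we are looking for in the logs is the message from the source code we reviewed earlier: “Marker configuration found in usable network. Using ... as vipNetworkList.” If you prefer to check this automatically, here is a small sketch that scans log lines for that message and extracts the network name. The function selectedNetwork and the sample log lines are hypothetical; real AKO log lines carry extra prefixes such as timestamps, which the pattern tolerates because it is not anchored to the start of the line.

```go
package main

import (
	"fmt"
	"regexp"
)

// markerLine matches the Infof message quoted from the AKO source and
// captures the network name.
var markerLine = regexp.MustCompile(
	`Marker configuration found in usable network\. Using (.+) as vipNetworkList\.`)

// selectedNetwork returns the network name from the first log line that
// contains the marker message, or false when no line matches.
func selectedNetwork(logLines []string) (string, bool) {
	for _, line := range logLines {
		if m := markerLine.FindStringSubmatch(line); m != nil {
			return m[1], true
		}
	}
	return "", false
}

func main() {
	// Made-up sample lines, for illustration only.
	lines := []string{
		"INFO some unrelated startup message",
		"INFO Marker configuration found in usable network. Using tanzu-frontend as vipNetworkList.",
	}
	if name, ok := selectedNetwork(lines); ok {
		fmt.Println("AKO selected network:", name)
	} else {
		fmt.Println("marker message not found")
	}
}
```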

Eureka! From now on, our vSphere with Tanzu environment will only choose this network for VIP publishing, skipping any other network we use for another purpose and have in the IPAM.

As always, I hope you liked it, see you soon!
