NVIDIA License System (NLS) – General Availability

On the 19th of August, NVIDIA made the new License System (NLS) generally available. The announcement email also mentioned the End-of-Life (EOL) of the current vGPU software license server: as of the 23rd of July 2023, the on-premises NVIDIA License Server is no longer supported. This EOL date is the same as that of vGPU software release branch 11, which means that from that date on, you can only use the new NVIDIA License System. As an engineer, you need to migrate your customers and their license server(s) to NLS and use NLS for new deployments. The requirement for using NLS is that the environment runs NVIDIA vGPU software release 13.0 or later.

With the new NVIDIA License System, you can choose between two types of license service:

Cloud License Service (CLS)

NVIDIA manages the CLS license server, and you don’t need any on-premises components. That means your vGPU clients connect directly to the NVIDIA cloud to get their licenses.

The downside is that you depend on NVIDIA’s uptime and your internet connection, and there is currently no redundancy. The upside is that you don’t have to manage a VM, which frees up time and hardware for other needs and reduces costs.
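
For a vGPU client, switching to NLS (whether CLS or DLS) roughly comes down to downloading a client configuration token from the NVIDIA Licensing Portal and enabling the licensed feature on the client. As a sketch for a Linux client; the exact paths, FeatureType value, and service name depend on your vGPU release, so check the NLS client documentation:

    # copy the token generated on the NVIDIA Licensing Portal
    cp client_configuration_token_*.tok /etc/nvidia/ClientConfigToken/
    # set the licensed feature in /etc/nvidia/gridd.conf, e.g. FeatureType=1 for vGPU
    # restart the licensing daemon so it picks up the token
    service nvidia-gridd restart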

Delegated License Service (DLS)

The DLS is a virtual appliance. You need to download the appliance from the NVIDIA Licensing Portal and, just like with the current license server, assign licenses to it. The main difference is that this VM doesn’t require a Windows license.
The new DLS virtual appliance supports the following hypervisors:

  • Citrix Hypervisor 8.2
  • Linux Kernel-based Virtual Machine (KVM) hypervisors with QEMU 2.12.0 (qemu-kvm-2.12.0-64.el8.2.27782638)
  • Microsoft Windows Server with Hyper-V 2019 Datacenter Edition
  • Red Hat Virtualization 4.3
  • VMware vSphere Hypervisor (ESXi) 6.7, 7.0, and 7.0.2

Conclusion

The new License Server has some advantages over the old vGPU License Server.

  1. You don’t need to have a Windows VM.
  2. It doesn’t need Java, which would otherwise require patching and securing.
  3. You don’t need to back up the CLS.

Overall, it saves you time and resources that you can spend on other activities, and it gives you a more secure environment. For more information, have a look at the latest NLS documentation.

The need for a GPU in VDI/SBC environments

There are already many blogs about the pros and cons of using GPUs in VDI and SBC environments. I would also like to point out why I advise using GPUs in these types of environments.

CPU vs. GPU

First, let me explain the difference between a CPU and a GPU. A CPU is designed to handle a few software threads at a time using its few cores and a lot of cache memory. A GPU is designed to handle thousands of threads at a time using its hundreds of cores.

In a production situation, this difference means that when an application wants to use a GPU but there isn’t one available, it will use the CPU instead. The downside of using the CPU is that it is slower than a GPU at this kind of work and more expensive. Users notice this when their programs become sluggish and don’t perform as expected. You can read more about the difference between a CPU and a GPU here.
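
Before concluding that an application is slow because of the CPU, it helps to check whether the session actually has a (v)GPU available. On a VM with the NVIDIA driver installed, a quick look with nvidia-smi is usually enough (a rough check; the output varies per driver and profile):

    nvidia-smi
    # lists the (v)GPU, the driver version, and the processes currently using the GPU;
    # if the command is missing or no GPU is listed, applications fall back to the CPU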

VDI/SBC without GPU

In the early days, we had Windows 7 or Windows Server 2008 R2 without a GPU, and that worked perfectly. Users could do their work as if they were working on a local device. The architecture of the OS and of most of the applications in use didn’t depend on a GPU. If an application needed a graphics card and the server had none, Citrix offered a tool called OpenGL Software Accelerator, and the VMware alternative was Soft 3D. These software renderers could fool some lightweight applications that required a graphics card.

In the last couple of years, we have seen increased use of GPU-accelerated programs like Chrome, Edge, Firefox, Dynamics, and Teams. Windows 10 is also more graphics-intensive than Windows 7 was. A lot of applications already recommend or even require a GPU. They will function without one, but to get the best user experience (UX), you should definitely implement GPUs.

Maintaining the very best UX is exactly why I always advise using a GPU. I have done several VDI projects where users were creating 3D drawings. In such cases, implementing a GPU is a no-brainer, because the 3D acceleration completely depends on it. You can read about one of those projects here. Those environments, however, are not the ones I want to focus on.

Looking back at projects starting with Windows Server 2012, users began visiting more and more graphics-intensive websites. They consumed more video content such as YouTube and online learning platforms, and more and more websites started to expect a GPU. Both Citrix and VMware developed solutions to keep giving the user the best UX without having to install graphics cards. The downside of these software solutions is that everything is rendered by the CPU. That means you needed a faster (read: more expensive) CPU, and when someone was using a graphics-intensive application, other users could start complaining that their system was slow, all because the CPU couldn’t handle all the requests fast enough.

VDI/SBC with GPU

And then came NVIDIA, offering virtual GPUs (vGPU), which let us assign multiple VMs to one physical graphics card, with each VM receiving a portion of the card defined by a profile. At first only Citrix supported this, on Citrix Hypervisor (f.k.a. XenServer); later VMware added support for this type of virtual GPU as well. Nowadays we can run more VMs on the same physical server because we can assign more VMs to one graphics card.

This all has a positive effect on the return on investment (ROI) of a VDI or SBC environment with GPU cards. Because of the profiles, we need fewer physical servers and CPUs, resulting in less rack space, less maintenance, and a smaller investment. We can now use vGPU more often and help users get the best UX with the new, more graphics-intensive applications.
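
To make the profile idea concrete: a Tesla T4 has 16 GB of framebuffer, so with a 2 GB profile (for example T4-2Q) one physical card can serve up to eight VMs, and with 1 GB profiles up to sixteen. On a host with the vGPU host driver installed, you should be able to list the supported profiles with something like the following (a sketch; the available nvidia-smi options can vary per driver release):

    nvidia-smi vgpu -s
    # lists the vGPU types (profiles) supported by each physical GPU in this host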

Almost always a GPU

Nowadays we almost always advise customers to buy GPUs for their VDI/SBC environment. There are only a couple of situations in which we don’t advise GPUs:

  1. When they have a small number of VMs (≤ 4) that are added to an already existing backend VMware environment. To use vGPU profiles, VMware requires vSphere Enterprise Plus. When customers don’t already have an Enterprise Plus license, it’s usually too expensive to add GPUs;
  2. When it’s just a temporary environment, because they are migrating to local machines or to the cloud;
  3. Usage: some customers use the EUC environment only for a few users or only when people have to work from remote offices. When this isn’t a daily situation, we can decide to use only the CPU.

Conclusion

Based on the above pros and cons, you definitely need to consider GPUs when upgrading or migrating your VDI/SBC environment.

Upgrading NVIDIA firmware

During the last couple of days I needed to update the firmware of the NVIDIA Tesla T4 cards in our servers. While following the installation steps provided by HPE, I ran into some issues, so I decided to create a step-by-step guide on how to update the firmware.

  1. Download the latest firmware from your vendor
  2. Upload the RPM file to /usr/local/bin using WinSCP or your favorite tool
  3. Connect using SSH to the host
    1. Browse to the directory: cd /usr/local/bin
    2. Unpack the RPM file using the following command: rpm -ivh ./Tesla_T4_90.04.96.00.01-1-0.x86_64.rpm
      The RPM file name may differ when upgrading to a newer version or another NVIDIA card.
    3. Go to the folder where the RPM file was extracted; in this case that is the Tesla_T4_90.04.96.00.01 folder: cd /usr/local/bin/Tesla_T4_90.04.96.00.01/
    4. Change the permissions of the file
      chmod +x Tesla_T4_90.04.96.00.01.scexe
    5. Make sure all nvidia kernel modules are removed
      init 3
      rmmod nvidia
    6. When you get the following error:
      ERROR: Module nvidia is in use
      run the following command:
      service xcp-rrdd-gpumon stop
      and then run:
      rmmod nvidia
    7. Now we can upgrade the firmware using the following command:
      ./Tesla_T4_90.04.96.00.01.scexe -f
      The SCEXE file name may differ when upgrading to a newer version or another NVIDIA card.
      Use -i instead if you want to control the upgrade for each card in the host individually.
  4. When all the cards have been upgraded, reboot the host and continue with the next host.
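
If you have to repeat this on several hosts, the steps above can be strung together in a small script. This is only a sketch based on the steps in this post; the file names will differ for other cards or firmware versions, so test it on one host first:

    #!/bin/sh
    # firmware update for an NVIDIA Tesla T4, following the steps above
    cd /usr/local/bin
    rpm -ivh ./Tesla_T4_90.04.96.00.01-1-0.x86_64.rpm
    cd /usr/local/bin/Tesla_T4_90.04.96.00.01/
    chmod +x Tesla_T4_90.04.96.00.01.scexe
    # drop to runlevel 3 and unload the NVIDIA kernel module
    init 3
    service xcp-rrdd-gpumon stop   # only needed when rmmod reports "Module nvidia is in use"
    rmmod nvidia
    # flash all cards unattended; use -i instead of -f to confirm each card
    ./Tesla_T4_90.04.96.00.01.scexe -f
    # reboot the host afterwards
    reboot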

Good luck with the upgrade; as you can see, it’s easy.