Upgrading Nvidia firmware

Over the last couple of days I needed to update the firmware of the Nvidia Tesla T4 cards in our servers. When following the installation steps provided by HPE I ran into some issues, so I decided to write a step-by-step guide on how to update the firmware.

  1. Download the latest firmware from your vendor
  2. Upload the RPM file to /usr/local/bin using WinSCP or your favorite tool
  3. Connect using SSH to the host
    1. Change to the upload directory: cd /usr/local/bin
    2. Unpack the RPM file using the following command: rpm -ivh ./Tesla_T4_90.04.96.00.01-1-0.x86_64.rpm
      The RPM file name can be different when upgrading to a newer version or another Nvidia card.
    3. Go to the folder where the RPM file was extracted, in this case the Tesla_T4_90.04.96.00.01 folder: cd /usr/local/bin/Tesla_T4_90.04.96.00.01/
    4. Make the SCEXE file executable:
      chmod +x Tesla_T4_90.04.96.00.01.scexe
    5. Make sure all Nvidia kernel modules are unloaded:
      init 3
      rmmod nvidia
    6. If you get the following error:
      ERROR: Module nvidia is in use
      run the following command:
      service xcp-rrdd-gpumon stop
      and then run:
      rmmod nvidia
    7. Now we can upgrade the firmware using the following command:
      ./Tesla_T4_90.04.96.00.01.scexe -f
      The SCEXE file name can be different when upgrading to a newer version or another Nvidia card.
      Choose -i if you want to control the upgrade for each card in the host.
  4. When all the cards are upgraded, reboot the host and continue with the next host.
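
The steps above can be sketched as a single script. This is a minimal sketch, not something HPE or Nvidia ships: the FW_NAME, RPM_FILE and DRY_RUN variables and the run, unload_nvidia and upgrade_firmware helpers are names I made up for illustration, and the file names are the ones from my T4 upgrade. By default it only prints the commands it would run; set DRY_RUN=0 on the host to execute them.

```shell
#!/bin/sh
# Sketch of the manual steps above. Hypothetical helper names; the firmware
# file names are taken from the Tesla T4 example and will differ for other
# versions or cards.

FW_NAME="${FW_NAME:-Tesla_T4_90.04.96.00.01}"
RPM_FILE="${RPM_FILE:-${FW_NAME}-1-0.x86_64.rpm}"
DRY_RUN="${DRY_RUN:-1}"   # assumption: 1 = only print, 0 = really execute

run() {
    # Print the command, and execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

unload_nvidia() {
    run init 3
    if ! run rmmod nvidia; then
        # "Module nvidia is in use": stop the GPU monitor service, then retry.
        run service xcp-rrdd-gpumon stop
        run rmmod nvidia
    fi
}

upgrade_firmware() {
    run cd /usr/local/bin &&
    run rpm -ivh "./${RPM_FILE}" &&
    run cd "/usr/local/bin/${FW_NAME}" &&
    run chmod +x "${FW_NAME}.scexe" &&
    unload_nvidia &&
    run "./${FW_NAME}.scexe" -f
}

upgrade_firmware
```

In the default dry-run mode every command is printed with a leading "+ " so you can compare it against the steps above before running anything for real. After the reboot, something like nvidia-smi --query-gpu=name,vbios_version --format=csv should show the VBIOS version the card reports, assuming the Nvidia driver and nvidia-smi are installed on the host.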

Good luck with upgrading. As you can see, it’s easy.

Comments

  1. TD says:

    It seems there is no /usr/local/bin any longer in ESXi 7U3.
    And rpm is also no longer available.
    Could you confirm?
