Post fan/GPU upgrade, and some additional fan RPM tuning via IPMI: VM server is running a lot cooler, for the most part.

    azonenberg@ioc.exchange wrote (#1):

    Post fan/GPU upgrade, and some additional fan RPM tuning via IPMI: VM server is running a lot cooler, for the most part. CPU VRM temperatures during a big compile job are less than the *idle* temps previously.

    But I'm now seeing NIC temperature and it's concerningly hot. I'm not sure why it wasn't showing up before so I have no idea how toasty it was.

    I'm also seeing what appears to be poor / unstable network performance.

    The ConnectX-6 is passively air cooled and sits just to the right of the new 80mm fans (as seen from the rear panel). I suspect the negative pressure from the new fans is drawing the front-to-back airflow slightly to the left and reducing airflow over the NIC's heatsink. Thermal engineering is hard.

    I have another PCIe slot exhaust fan on order coming tomorrow so hopefully things are tolerable between now and then.
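
For anyone wanting to keep an eye on a sensor like this between fan changes, here is a minimal Python sketch that flags hot temperature readings in pipe-separated text laid out like `ipmitool sensor` output. The sensor names, the readings in `SAMPLE`, and the 80 C limit are made up for illustration, and whether a NIC temperature is exposed over IPMI at all depends on the board and BMC, so treat this as an assumption rather than a description of the setup in this thread.

```python
def parse_sensor_line(line):
    """Parse one pipe-separated row (name | reading | unit | status | ...).

    Returns a dict, or None if the row is malformed or has no numeric reading.
    """
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 4:
        return None
    name, reading, unit, status = fields[:4]
    try:
        value = float(reading)
    except ValueError:
        return None  # sensors without a reading typically show "na"
    return {"name": name, "value": value, "unit": unit, "status": status}


def hot_sensors(text, limit_c):
    """Return (name, value) for every temperature sensor at or above limit_c."""
    hot = []
    for line in text.splitlines():
        sensor = parse_sensor_line(line)
        if sensor and sensor["unit"] == "degrees C" and sensor["value"] >= limit_c:
            hot.append((sensor["name"], sensor["value"]))
    return hot


# Hypothetical sample rows; names and numbers are invented for this example.
SAMPLE = """\
CPU Temp | 45.000   | degrees C | ok | na      | na      | na      | 95.000 | 100.000 | 100.000
NIC Temp | 92.000   | degrees C | ok | na      | na      | na      | na     | na      | na
FAN1     | 4200.000 | RPM       | ok | 300.000 | 500.000 | 700.000 | na     | na      | na
SSD Temp | na       | degrees C | na | na      | na      | na      | na     | na      | na
"""

if __name__ == "__main__":
    for name, value in hot_sensors(SAMPLE, 80.0):
        print(f"{name}: {value:.1f} C")
```

In practice the text would come from the BMC rather than a string constant, and the limit would be chosen from the card's documented operating range.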

      karppinen@mastodon.online wrote (#2):

      @azonenberg I recently upgraded the firmware on twelve ConnectX-6 Dx cards. It took something like 5 minutes per card (including one reboot), and that 5 minutes was enough to make the heatsink too hot to touch. This was on a box with decent airflow, too.

        azonenberg@ioc.exchange wrote (#3):

        Historical network traffic before and after the recent reconfiguration.

        I wonder why all my virtual desktops are so slow?

          azonenberg@ioc.exchange wrote (#4):

          The SSD is slightly cooler, other than a short spike right after boot that might be from before the fans had fully spun up.

            azonenberg@ioc.exchange wrote (#5):

            We can also see the bad network performance in the CPU usage charts, showing up as increased dom0 iowait time due to CephFS operations lagging
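
As a rough illustration of where an iowait figure like this comes from on Linux, the sketch below computes the iowait share of CPU time between two samples of the aggregate `cpu` line from `/proc/stat`. The helper names and the sample counter values are hypothetical, and this is not necessarily how the charts in this thread were produced.

```python
def parse_cpu_line(line):
    """Return the integer time counters from the aggregate 'cpu' line."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def iowait_fraction(before, after):
    """Fraction of CPU time spent in iowait between two 'cpu' line samples.

    Field order is user, nice, system, idle, iowait, irq, softirq, steal,
    guest, guest_nice. Guest time is already included in user and nice,
    so only the first eight fields are summed for the total.
    """
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return deltas[4] / total


# Hypothetical samples taken some seconds apart; values are invented.
BEFORE = "cpu 100 0 50 800 50 0 0 0 0 0"
AFTER = "cpu 200 0 100 1400 300 0 0 0 0 0"

if __name__ == "__main__":
    print(f"iowait share: {iowait_fraction(BEFORE, AFTER):.0%}")
```

On a live system the two strings would be the first line of `/proc/stat` read at two points in time. A sustained rise in this ratio while the workload is unchanged is consistent with storage or network-backed I/O (such as CephFS) stalling.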
