<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Tanuj Ravi Rao - Garden</title><description>Digital garden notes and shorter content pieces.</description><link>https://tansanrao.com/</link><atom:link href="https://tansanrao.com/garden.xml" rel="self" type="application/rss+xml"/><item><title>🌿 Setup Fedora 43 on NVIDIA Nodes</title><link>https://tansanrao.com/garden/useful-snippets/fedora-43-server-setup-nvidia/</link><guid isPermaLink="true">https://tansanrao.com/garden/useful-snippets/fedora-43-server-setup-nvidia/</guid><description>Quick reference for Fedora 43 Server installs on lab nodes (NVIDIA GPU + Mellanox NIC).</description><pubDate>Thu, 05 Feb 2026 05:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;0) Baseline OS prep&lt;/h2&gt;
&lt;p&gt;On a fresh Fedora 43 install:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Update packages.&lt;/li&gt;
&lt;li&gt;Create users + add SSH keys.&lt;/li&gt;
&lt;li&gt;Install kernel headers/devel for DKMS.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;sudo dnf -y upgrade
sudo dnf -y install kernel-devel-matched kernel-headers
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;1) NVIDIA drivers (Fedora 43 uses the Fedora 42 CUDA repo)&lt;/h2&gt;
&lt;p&gt;NVIDIA doesn’t publish an official Fedora 43 repo yet; the Fedora 42 repo works.&lt;/p&gt;
&lt;p&gt;Add the repo:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo dnf config-manager addrepo --from-repofile=https://developer.download.nvidia.com/compute/cuda/repos/fedora42/x86_64/cuda-fedora42.repo
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install the &lt;strong&gt;open kernel modules&lt;/strong&gt; (all our lab hardware supports them):&lt;/p&gt;
&lt;h3&gt;Compute-only nodes (servers)&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;sudo dnf -y install nvidia-driver-cuda kmod-nvidia-open-dkms
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Workstations (GUI)&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;sudo dnf -y install nvidia-open
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;2) Mellanox NICs via DOCA (OFED/RoCE only)&lt;/h2&gt;
&lt;p&gt;Use the &lt;strong&gt;RHEL/Rocky 10&lt;/strong&gt; DOCA repo. On Fedora 43, &lt;code&gt;doca-all&lt;/code&gt; / &lt;code&gt;doca-networking&lt;/code&gt; depend on Python
3.12 and won’t work; &lt;code&gt;doca-ofed&lt;/code&gt; and &lt;code&gt;doca-roce&lt;/code&gt; work fine (and are all we need).&lt;/p&gt;
&lt;p&gt;Create the repo file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo tee /etc/yum.repos.d/doca.repo &amp;gt;/dev/null &amp;lt;&amp;lt;&apos;EOF&apos;
[doca]
name=DOCA Online Repo
baseurl=https://linux.mellanox.com/public/repo/doca/3.2.1/rhel10/x86_64/
enabled=1
gpgcheck=0
EOF
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo dnf clean all
sudo dnf -y install doca-ofed
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reboot and verify:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo reboot
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After reboot:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;nvidia-smi&lt;/code&gt; should see the GPU&lt;/li&gt;
&lt;li&gt;Mellanox tooling should see the NIC&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;nvidia-smi

sudo mst start
sudo mst status
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;3) NVIDIA Container Toolkit (Podman + Docker)&lt;/h2&gt;
&lt;p&gt;Some tools require Docker, so install it:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo dnf -y install docker docker-compose
sudo systemctl enable --now docker
sudo usermod -aG docker "$USER" # group change takes effect at next login
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add NVIDIA Container Toolkit repo:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install toolkit:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo dnf -y install nvidia-container-toolkit
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Podman: CDI config + test&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;podman run --rm \
  --device nvidia.com/gpu=all \
  --security-opt=label=disable \
  docker.io/nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Docker: configure runtime + test&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;4) Experiment network (100GbE + 100Gb IB between two nodes)&lt;/h2&gt;
&lt;h3&gt;4.1 Identify Mellanox device + set link types&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;sudo mst start
sudo mst status
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Set &lt;strong&gt;port 1 = Ethernet&lt;/strong&gt; and &lt;strong&gt;port 2 = InfiniBand&lt;/strong&gt; (use the correct device from &lt;code&gt;mst status&lt;/code&gt;):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo mlxconfig -d /dev/mst/mt4123_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=1
&lt;/code&gt;&lt;/pre&gt;
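&lt;p&gt;The numeric values are easy to swap: &lt;code&gt;mlxconfig&lt;/code&gt; uses &lt;code&gt;1&lt;/code&gt; for InfiniBand and &lt;code&gt;2&lt;/code&gt; for Ethernet. A minimal sketch of a decoder for scripts and notes; &lt;code&gt;link_type_name&lt;/code&gt; is a hypothetical helper, not part of the Mellanox tooling:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical helper: map a LINK_TYPE_P* value to a readable name.
link_type_name() {
  case "$1" in
    1) echo "InfiniBand" ;;
    2) echo "Ethernet" ;;
    *) echo "unknown" ;;
  esac
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For example, &lt;code&gt;link_type_name 2&lt;/code&gt; prints &lt;code&gt;Ethernet&lt;/code&gt;. To see the values stored on the card before rebooting, &lt;code&gt;sudo mlxconfig -d /dev/mst/mt4123_pciconf0 query | grep LINK_TYPE&lt;/code&gt; should list both ports.&lt;/p&gt;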
&lt;p&gt;Reboot to apply:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo reboot
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;4.2 Bring up InfiniBand (Subnet Manager)&lt;/h3&gt;
&lt;p&gt;Pick &lt;strong&gt;one&lt;/strong&gt; node to run the subnet manager:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo systemctl enable --now opensmd
sudo systemctl status opensmd --no-pager
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Verify IB link status:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ibstat
sudo iblinkinfo
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Expected:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;State: Active&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Physical state: LinkUp&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;On ConnectX-6 dual 100G: typically negotiates &lt;code&gt;4x 25Gbps&lt;/code&gt; (4 lanes at 25 Gb/s each, 100 Gb/s total)&lt;/li&gt;
&lt;/ul&gt;
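&lt;p&gt;The expected states above can also be checked from a script. This is a minimal sketch, not a definitive check: &lt;code&gt;ib_link_ok&lt;/code&gt; is a hypothetical helper that reads &lt;code&gt;ibstat&lt;/code&gt; text on stdin and assumes the usual layout, where each &lt;code&gt;Port N:&lt;/code&gt; block lists &lt;code&gt;State&lt;/code&gt; and &lt;code&gt;Physical state&lt;/code&gt; before its &lt;code&gt;Link layer&lt;/code&gt; line. It ignores the Ethernet port, so only an InfiniBand port that is up counts.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical helper: exit 0 if any InfiniBand port in the ibstat text
# on stdin is both Active and LinkUp, otherwise exit 1.
ib_link_ok() {
  awk '
    /Port [0-9]+:/           { active = 0; up = 0 }   # new port block: reset flags
    /State: Active/          { active = 1 }
    /Physical state: LinkUp/ { up = 1 }
    /Link layer: InfiniBand/ { if (active == 1) if (up == 1) found = 1 }
    END                      { exit (found ? 0 : 1) }
  '
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Usage: &lt;code&gt;ibstat | ib_link_ok&lt;/code&gt;, then inspect &lt;code&gt;$?&lt;/code&gt; (0 means an IB port is up). If the subnet manager is not running yet, the port usually stays out of &lt;code&gt;Active&lt;/code&gt; and the check fails.&lt;/p&gt;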
&lt;h3&gt;4.3 (Optional) IPoIB sanity test&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;# Host 1 (replace ibs93f1 with the IPoIB interface name from `ip -br link`)
sudo ip link set ibs93f1 up
sudo ip addr add 10.10.10.1/24 dev ibs93f1

# Host 2
sudo ip link set ibs93f1 up
sudo ip addr add 10.10.10.2/24 dev ibs93f1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Ping:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ping -c 3 10.10.10.2 # from host 1
ping -c 3 10.10.10.1 # from host 2
&lt;/code&gt;&lt;/pre&gt;
</content:encoded></item><item><title>🌿 Linux GCC Clangd Config</title><link>https://tansanrao.com/garden/useful-snippets/linux-gcc-clangd-config/</link><guid isPermaLink="true">https://tansanrao.com/garden/useful-snippets/linux-gcc-clangd-config/</guid><description>.clangd snippet to configure LSPs in IDEs when the Linux kernel is built using GCC.</description><pubDate>Thu, 05 Feb 2026 05:00:00 GMT</pubDate><content:encoded>&lt;pre&gt;&lt;code&gt;CompileFlags:
  Add: [-Wno-unknown-warning-option, -Wno-unused-command-line-argument]
  Remove: [-mpreferred-stack-boundary=*,
    -mindirect-branch=*,
    -mindirect-branch-register,
    -fno-allow-store-data-races,
    -fconserve-stack,
    -mrecord-mcount,
    -mabi=*,
    -march=*,
    -fsanitize=bounds-strict]

Index:
  Background: Build
&lt;/code&gt;&lt;/pre&gt;
</content:encoded></item></channel></rss>