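NVIDIA published NVIDIA/cuda-tile on 2025.12.26 (Boxing Day, the day after Christmas), claiming that it runs on anything that supports CUDA 13.1. I glanced at the support list, saw that the GTX 1650 is still within CUDA 13.1's supported range, and, well, I had to give it a try 😂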


Updating the driver & enabling SSH

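According to the Minor Version Compatibility table on the CUDA website, CUDA 13.x needs a driver from the 580 branch or newer, while the driver running on my machine was 560.81, installed back in 2023.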

| CUDA Toolkit | Minimum Driver Version | Upper Range for Minor Version Compatibility |
| --- | --- | --- |
| CUDA 13.x | >= 580 | N/A (backward compatibility applies for newer drivers) |
| CUDA 12.x | >= 525 | < 580 (newer drivers still supported through backward compatibility) |
| CUDA 11.x | >= 450 | < 525 (newer drivers still supported through backward compatibility) |
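The table boils down to a comparison on the driver's branch number. Here is a minimal sketch in Python, with the minimums copied from the table; the helper name `driver_meets_minimum` is my own invention and not part of any NVIDIA tool, and it deliberately ignores the minor part of the version.

```python
# Minimum driver branch per CUDA major version, copied from the table above.
MIN_DRIVER = {11: 450, 12: 525, 13: 580}


def driver_meets_minimum(cuda_major: int, driver_version: str) -> bool:
    """Return True if the driver's branch satisfies the table's minimum.

    Only the branch number (the part before the first dot) is compared,
    which matches how the table states the requirement (e.g. ">= 580").
    """
    branch = int(driver_version.split(".")[0])
    return branch >= MIN_DRIVER[cuda_major]


if __name__ == "__main__":
    # The two driver versions mentioned in this post.
    for version in ("560.81", "591.44"):
        ok = driver_meets_minimum(13, version)
        print(f"driver {version} for CUDA 13.x: {'ok' if ok else 'too old'}")
```

With the two versions from this post, 560.81 fails the CUDA 13.x check and 591.44 passes it, which is why the driver update had to come first.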

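Lenovo has ended support for the Legion R7000 2020, and the GPU driver on Lenovo's driver site is stuck at 472.19, so I had to get the latest driver from NVIDIA's website. I ended up installing the 591.44 Studio driver: NVIDIA Studio Driver | 591.44 | Windows 11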

After about five minutes of installation and a reboot, I opened WSL2 (Ubuntu) and installed CUDA 13.1 following the official instructions:

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/13.1.0/local_installers/cuda-repo-wsl-ubuntu-13-1-local_13.1.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-13-1-local_13.1.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-13-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-13-1

Then add the environment variables inside WSL2 (for example in ~/.bashrc) so the toolkit can be found:

# NVIDIA CUDA Toolkit
export PATH=/usr/local/cuda-13.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-13.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
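To confirm that the exports took effect in a new shell, a quick check like the one below can help. It is only a sketch: `cuda_bin_on_path` is a hypothetical helper of mine, and the path is simply the one used in the export lines above.

```python
import os
import shutil

CUDA_BIN = "/usr/local/cuda-13.1/bin"  # same path as in the export lines above


def cuda_bin_on_path(path_value: str, cuda_bin: str = CUDA_BIN) -> bool:
    """Return True if cuda_bin is one of the entries of a PATH-style string."""
    # WSL2 is Linux, so entries are separated by ":"; empty entries are skipped.
    entries = [entry.rstrip("/") for entry in path_value.split(":") if entry]
    return cuda_bin.rstrip("/") in entries


if __name__ == "__main__":
    print("CUDA bin on PATH:", cuda_bin_on_path(os.environ.get("PATH", "")))
    # shutil.which returns None when nvcc cannot be found on PATH.
    print("nvcc resolved to:", shutil.which("nvcc"))
```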

While I was at it, I also updated WSL itself (this one runs on the Windows side, e.g. in PowerShell):

wsl --update

The old laptop's keyboard is failing and my new work happens almost entirely on a Mac, so enabling SSH for remote access was a must.

On Windows 11 you can simply turn on mirrored networking in %UserProfile%\.wslconfig:

[wsl2]
networkingMode=mirrored

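My system is still on Windows 10, though, so I had to set up port forwarding myself (there are plenty of tutorials online, and most of them work).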
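For reference, the usual Windows 10 recipe is a `netsh` port proxy plus a firewall rule, both run from an elevated prompt on the Windows side. The sketch below only builds the command strings; the listen port 2222, the rule name, and the example IP are assumptions of mine, and the real WSL2 address (from `hostname -I` inside WSL) typically changes after a restart.

```python
def portproxy_commands(wsl_ip: str, listen_port: int = 2222, ssh_port: int = 22) -> list:
    """Build the Windows commands that forward listen_port to sshd inside WSL2."""
    return [
        # Forward connections arriving on the Windows host to the WSL2 VM.
        "netsh interface portproxy add v4tov4 "
        f"listenaddress=0.0.0.0 listenport={listen_port} "
        f"connectaddress={wsl_ip} connectport={ssh_port}",
        # Let the forwarded port through Windows Firewall.
        'netsh advfirewall firewall add rule name="WSL2 SSH" '
        f"dir=in action=allow protocol=TCP localport={listen_port}",
    ]


if __name__ == "__main__":
    # 172.20.0.2 is a placeholder; substitute the address WSL2 reports.
    for command in portproxy_commands("172.20.0.2"):
        print(command)
```

Printing the commands instead of running them keeps the script harmless; paste the output into an administrator PowerShell window once sshd is running inside WSL.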

Building and configuring cuda-tile

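First, clone the repository. The project manages its LLVM dependency by pinning a commit; the current one is cfbb4cc31215d615f605466aef0bcfb42aa9faa5, and the stated requirement is "Must be compatible" (why don't these big companies like pinning a major release instead? Wouldn't installing prebuilt binaries via llvm.sh be nicer? 😂)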

git clone https://github.com/NVIDIA/cuda-tile.git
cd cuda-tile
cmake -G Ninja -S . -B build -DCMAKE_BUILD_TYPE=Release -DCUDA_TILE_ENABLE_TESTING=ON
cmake --build build --target check-cuda-tile

And the tests actually failed! 😥

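The output of tileiras shows that it currently targets only Blackwell-generation GPUs (sm_100 and up; the RTX 50 series is sm_120):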

mocus@DESKTOP-7KDHBIR:~/cuda-tile/build/bin$ tileiras --help-hidden
OVERVIEW: tileiras: NVIDIA (R) Cuda Tile IR optimizing assembler
USAGE: tileiras [options] <tile bytecode file>
OPTIONS:
Generic Options:
-h - Alias for --help
--help - Display available options (--help-hidden for more)
--help-hidden - Display all available options
--help-list - Display list of available options (--help-list-hidden for more)
--help-list-hidden - Display list of all available options
--print-all-options - Print all option values after command line parsing
--print-options - Print non-default options after command line parsing
--version - Display the version of this program
TileIR Assembler Options:
-O - Alias for --opt-level
--arch - Alias for --gpu-name
--device-debug - Generate debug information (if present in the input bytecode)
-g - Alias for --device-debug
--gpu-name=<value> - Specify name of NVIDIA GPU to generate code for.
=sm_100 - SM 100
=sm_103 - SM 103
=sm_110 - SM 110
=sm_120 - SM 120
=sm_121 - SM 121
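Rather than reading the help text by eye, the accepted `--gpu-name` values can be pulled out with a small regular expression. This is a minimal sketch that works on an excerpt pasted from the output above; `supported_archs` is a hypothetical helper, and the parsing assumes each choice sits on its own line starting with `=sm_`.

```python
import re

# Excerpt of the `tileiras --help-hidden` output shown above.
HELP_EXCERPT = """\
--gpu-name=<value> - Specify name of NVIDIA GPU to generate code for.
=sm_100 - SM 100
=sm_103 - SM 103
=sm_110 - SM 110
=sm_120 - SM 120
=sm_121 - SM 121
"""


def supported_archs(help_text: str) -> list:
    """Collect every `=sm_NNN` choice listed in the help text, in order."""
    return re.findall(r"^[ \t]*=(sm_\d+)\b", help_text, flags=re.MULTILINE)


if __name__ == "__main__":
    archs = supported_archs(HELP_EXCERPT)
    print("accepted targets:", ", ".join(archs))
    print("sm_75 accepted:", "sm_75" in archs)
```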

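So I also tried NVIDIA/cutile-python, and there the error says outright that sm_75 (Turing, i.e. the RTX 20 series and the GTX 16 series that my GTX 1650 belongs to) is not supported: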

File "/home/mocus/code/test-nv-tile/.venv/lib/python3.12/site-packages/cuda/tile/_compile.py", line 225, in __call__
lib = compile_tile(self.pyfunc, pyfunc_args, self.compiler_options, tile_context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/mocus/code/test-nv-tile/.venv/lib/python3.12/site-packages/cuda/tile/_compile.py", line 70, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/mocus/code/test-nv-tile/.venv/lib/python3.12/site-packages/cuda/tile/_compile.py", line 213, in compile_tile
raise e
File "/home/mocus/code/test-nv-tile/.venv/lib/python3.12/site-packages/cuda/tile/_compile.py", line 206, in compile_tile
cubin_file = compile_cubin(f.name, compiler_options, sm_arch,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/mocus/code/test-nv-tile/.venv/lib/python3.12/site-packages/cuda/tile/_compile.py", line 367, in compile_cubin
raise TileCompilerExecutionError(e.returncode, e.stderr.decode(), ' '.join(flags),
cuda.tile._exception.TileCompilerExecutionError: Return code 1
tileiras: for the --gpu-name option: Cannot find option named 'sm_75'!
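In hindsight, a pre-flight check would have saved the whole build. The sketch below asks `nvidia-smi` for the compute capability (the `compute_cap` query field) and compares it with the targets from the tileiras help output above. The function names and the hard-coded target set are my own assumptions, not part of cuTile, and the script returns None rather than failing when no GPU can be detected.

```python
import shutil
import subprocess
from typing import Optional

# Targets accepted by the tileiras build shown earlier in this post.
TILEIRAS_ARCHS = {"sm_100", "sm_103", "sm_110", "sm_120", "sm_121"}


def to_sm_arch(compute_cap: str) -> str:
    """Turn a compute capability such as "7.5" into an arch name such as "sm_75"."""
    major, minor = compute_cap.strip().split(".")
    return f"sm_{int(major)}{int(minor)}"


def local_sm_arch() -> Optional[str]:
    """Ask nvidia-smi for the first GPU's compute capability; None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=False,
        )
        lines = result.stdout.strip().splitlines()
        if result.returncode != 0 or not lines:
            return None
        return to_sm_arch(lines[0])
    except (OSError, ValueError):
        # nvidia-smi could not be executed, or printed something unexpected.
        return None


if __name__ == "__main__":
    arch = local_sm_arch()
    if arch is None:
        print("could not detect a GPU via nvidia-smi")
    else:
        verdict = "accepted" if arch in TILEIRAS_ARCHS else "not accepted"
        print(f"{arch}: {verdict} by this tileiras build")
```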

Future support?

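Only after digging through the CUDA Toolkit release notes did I find out that "CUDA 13.1 supports cuTile" is a promise on futures: through the end of 2025, only Blackwell GPUs such as the 50 series are supported 😅 The publicity was everywhere (Phoronix), and the README gave me the impression that anything supporting CUDA 13.1 could run CUDA Tile. So I guess it's my fault for not reading the docs.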
