Package Details: ollama-rocm-git 0.5.7.git+42cf4db6-1
Git Clone URL: https://aur.archlinux.org/ollama-rocm-git.git (read-only)
Package Base: ollama-rocm-git
Description: Create, run and share large language models (LLMs) with ROCm
Upstream URL: https://github.com/ollama/ollama
Licenses: MIT
Conflicts: ollama
Provides: ollama
Submitter: sr.team
Maintainer: wgottwalt
Last Packager: wgottwalt
Votes: 5
Popularity: 0.71
First Submitted: 2024-02-28 00:40 (UTC)
Last Updated: 2025-01-16 08:36 (UTC)
Dependencies (26)
- comgr (opencl-amdAUR)
- gcc-libs (gcc-libs-gitAUR, gccrs-libs-gitAUR, gcc11-libsAUR, gcc-libs-snapshotAUR)
- hip-runtime-amd (opencl-amdAUR)
- hipblas (opencl-amd-devAUR)
- hsa-rocr (opencl-amdAUR)
- libdrm (libdrm-gitAUR)
- libelf (elfutils-gitAUR)
- numactl (numactl-gitAUR)
- rocblas (opencl-amd-devAUR)
- rocsolver (opencl-amd-devAUR)
- rocsparse (rocsparse-gfx1010AUR, opencl-amd-devAUR)
- gcc-libs (gcc-libs-gitAUR, gccrs-libs-gitAUR, gcc11-libsAUR, gcc-libs-snapshotAUR) (make)
- git (git-gitAUR, git-glAUR) (make)
- go (go-gitAUR, gcc-go-gitAUR, gcc-go-snapshotAUR, gcc-go) (make)
- hip-runtime-amd (opencl-amdAUR) (make)
- hipblas (opencl-amd-devAUR) (make)
- hsa-rocr (opencl-amdAUR) (make)
- libdrm (libdrm-gitAUR) (make)
- libelf (elfutils-gitAUR) (make)
- numactl (numactl-gitAUR) (make)
- …6 more dependencies not shown
Required by (30)
- ai-writer (requires ollama)
- alpaca-ai (requires ollama)
- alpaca-git (requires ollama) (optional)
- alpaka-git (requires ollama)
- anythingllm-desktop-bin (requires ollama)
- calt-git (requires ollama)
- chatd (requires ollama)
- chatd-bin (requires ollama)
- codename-goose-bin (requires ollama) (optional)
- gollama (requires ollama) (optional)
- gollama-git (requires ollama) (optional)
- hoarder (requires ollama) (optional)
- hollama-bin (requires ollama)
- litellm (requires ollama) (optional)
- litellm-ollama (requires ollama)
- llocal-bin (requires ollama)
- lobe-chat (requires ollama) (optional)
- lumen (requires ollama) (optional)
- maestro (requires ollama) (optional)
- maestro-git (requires ollama) (optional)
- …10 more not shown
Sources (5)
wgottwalt commented on 2025-01-13 11:05 (UTC)
@chb I'm not into building ROCm myself, so I don't know much about it. But the ROCm docs have always stated that the GCN 5.0 arch is the minimum requirement. The RX 580 is Polaris, and that is GCN 4. The crash looks like a floating point exception to me, and it is very likely that GCN 4 has an incomplete floating point model. It is sufficient for rasterization, but may not be enough for GPU-compute workloads. Though, these are just my assumptions based on my experience and may mean nothing.
chb commented on 2025-01-13 10:41 (UTC) (edited on 2025-01-13 10:42 (UTC) by chb)
@wgottwalt it seems like Tensile and rocblas are the main issues. I'm trying to rebuild rocblas with rm -f "$srcdir/$dirname"/library/src/blas3/Tensile/Logic/asm_full/r9nano*.yaml (see https://github.com/xuhuisheng/rocm-build/blob/master/gfx803/README.md).
xuhuisheng commented on Oct 23, 2020:
What is the expected behavior?
Don't crash and return the correct loss on gfx803.
What actually happens?
Invalid argument: indices[5,284] = 997212422 is not in [0, 5001) (text classification)
Low accuracy with loss NaN (mnist)
How to reproduce?
ROCm-3.7+ on gfx803, run the TensorFlow text classification sample. The official TensorFlow sample reproduces this issue almost 90% of the time: https://www.tensorflow.org/tutorials/keras/text_classification
Many people get this error, please refer here:
ROCm-3.7+ broken on gfx803 (ROCm#1265). Workaround 1: I rebuilt rocBLAS with BUILD_WITH_TENSILE_HOST=false and the problem disappeared. Maybe the gfx803 r9nano_*.yml logic is out of date? This approach causes a compile failure on ROCm-3.9. Workaround 2: keep BUILD_WITH_TENSILE_HOST=true, delete library/src/blas3/Tensile/Logic/asm_full/r9nano_Cijk_Ailk_Bljk_SB.yaml, and the issue is resolved. If I keep just one solution in this file, the issue reproduces.
https://github.com/ROCm/rocBLAS/issues/1172
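For anyone trying the same thing, a rough sketch of where that removal could go in a rocblas PKGBUILD's prepare() step (illustrative only: $srcdir/$dirname follow chb's command above, and the actual PKGBUILD will differ):
prepare() {
  # workaround 2 from the rocBLAS issue: drop the stale gfx803 (r9nano)
  # Tensile logic files before building
  # $dirname is assumed to be set to the extracted rocBLAS source directory
  rm -f "$srcdir/$dirname"/library/src/blas3/Tensile/Logic/asm_full/r9nano*.yaml
}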
xuhuisheng has a Docker setup with a working ROCm:
OS: Ubuntu-20.04.5, linux: 5.15, Python: 3.8.10, ROCm: 5.4.1, GPU: RX580
https://github.com/xuhuisheng/rocm-gfx803
But I think my issue back then was actually with ctranslate2 for whisperx.
wgottwalt commented on 2025-01-13 09:33 (UTC)
@chb I see, though I can imagine the performance won't be that good. The support for the interesting types like 8-bit ints and 16-bit floats is quite limited on that old hardware. Combined with the small local memory (max 8 GiB), you may be better off with a modern CPU. Hmm, could you build ROCm for aarch64, too? I test my ollama cpu-only packages on my Ampere Altra Max systems, which can easily deal with 405B models. It would be nice if I could spread the load over GFX cards, too.
chb commented on 2025-01-13 04:01 (UTC)
I'm currently trying to get ROCm to compile for gfx803 (RX580); options exist for gfx900 and other archs: https://github.com/lamikr/rocm_sdk_builder/issues/173
This project may be of assistance to people with 'unsupported' cards. If I'm able to complete this, I will discuss with the author whether it can be hosted.
wgottwalt commented on 2024-12-21 17:18 (UTC)
No, I will not change that. The ROCm documentation is very clear about the gfx900 target: "Unsupported - The current ROCm release does not support this hardware. The HIP runtime might continue to run applications for an unsupported GPU, but prebuilt ROCm libraries are not officially supported and will cause runtime errors."
In short: the target has been deprecated for a while now and is in the process of being removed.
pbordron commented on 2024-12-19 17:57 (UTC)
Crash on my Vega 56 when querying a model: "invalid device function current device: 0, in function ggml_cuda_compute_forward ...."
I need to enable the gfx900 target and remove the sed on Makefile.rocm in the PKGBUILD in order to solve the problem.
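As a quick sanity check, rocminfo (shipped with ROCm) reports the gfx ISA the runtime sees for the card; a Vega 56 should show up as gfx900:
# list agents and their ISA names as seen by the ROCm runtime
rocminfo | grep -i gfx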
rakatan commented on 2024-11-19 14:25 (UTC)
@wgottwalt
> You are aware what the function pkgver() in the PKGBUILD does, right? I'm really waiting for the day someone uses the outdated flag...

Not really. Having seen no other place referencing a commit, I assumed this would be the place - how else is the commit pinned if it's only referenced here? My assumption got sort of confirmed by it just working with ROCm (as I saw there is a commit adding compiler flags wrt fp16 in between these two). What outdated flag do you mean?

In short, I'm rather new to Arch, so I'm still learning the structure and the community expectations.

> And you also recognized the ollama.service file and that it includes the line Environment='LD_LIBRARY_PATH=/usr/lib/ollama' right?

Likewise, I was not aware - is that the only expected way to use ollama? ollama serve is a command provided by a binary on the PATH - should I avoid it and only use ollama via systemd?

> I'm aware that my answer is a bit harsh, but damn it, why?!?

Harsh and informative is good - at times. Why what? ;)
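For context, a -git PKGBUILD's pkgver() derives the version from the checked-out source at build time, so the pkgver= line does not pin a commit at all. A minimal sketch of such a function (the real one in this PKGBUILD may differ):
pkgver() {
  # source directory name is illustrative
  cd "$srcdir/ollama"
  # produces something like 0.5.7.git+42cf4db6: nearest tag plus short hash of HEAD
  printf '%s.git+%s' "$(git describe --tags --abbrev=0 | sed 's/^v//')" \
                     "$(git rev-parse --short=8 HEAD)"
}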
wgottwalt commented on 2024-11-19 08:33 (UTC)
You are aware what the function pkgver() in the PKGBUILD does, right? I'm really waiting for the day someone uses the outdated flag...
And you also recognized the ollama.service file and that it includes the line Environment='LD_LIBRARY_PATH=/usr/lib/ollama' right?
I'm aware that my answer is a bit harsh, but damn it, why?!?
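For reference, a minimal sketch of the relevant part of the shipped ollama.service (the Environment line is the one quoted above; the installed unit may contain additional settings, so check the file the package actually ships):
[Service]
# the unit exports the bundled library directory, so ollama finds the
# ROCm/llama shared libraries without a manual LD_LIBRARY_PATH
Environment='LD_LIBRARY_PATH=/usr/lib/ollama'
ExecStart=/usr/bin/ollama serve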
rakatan commented on 2024-11-18 22:38 (UTC)
updating helps with ROCm:
diff --git a/PKGBUILD b/PKGBUILD
index b8242f6..72cebd3 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -1,7 +1,7 @@
# Maintainer: Wilken Gottwalt <wilken dot gottwalt at posteo dot net>
pkgname=ollama-rocm-git
-pkgver=0.4.1.git+c2e8cbaa
+pkgver=0.4.2.git+4759d879
pkgrel=1
pkgdesc='Create, run and share large language models (LLMs) with ROCm'
arch=(x86_64)
interestingly though, it doesn't pick up the shared libraries and I have to run it like this:
env LD_LIBRARY_PATH=/usr/lib/ollama/ ollama serve
any ideas?
risyasin commented on 2024-11-13 10:48 (UTC)
For anyone who is late to read the comment from wgottwalt here and ended up with broken rocm libraries: you can downgrade to 6.0.2 with the following snippet.
sudo pacman -U file:///var/cache/pacman/pkg/rocm-opencl-sdk-6.0.2-1-any.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocalution-6.0.2-2-x86_64.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocsparse-6.0.2-2-x86_64.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocm-hip-sdk-6.0.2-1-any.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocsolver-6.0.2-3-x86_64.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocthrust-6.0.2-1-x86_64.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocm-ml-sdk-6.0.2-1-any.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocblas-6.0.2-1-x86_64.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocrand-6.0.2-1-x86_64.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocfft-6.0.2-1-x86_64.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rccl-6.0.2-1-x86_64.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocm-hip-libraries-6.0.2-1-any.pkg.tar.zst
sudo pacman -U file:///var/cache/pacman/pkg/rocm-ml-libraries-6.0.2-1-any.pkg.tar.zst
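The same downgrade as a single command (a sketch that assumes the 6.0.2 packages are still in the pacman cache and the release suffixes match the file names above):
sudo pacman -U /var/cache/pacman/pkg/{rocm-opencl-sdk,rocm-hip-sdk,rocm-ml-sdk,rocm-hip-libraries,rocm-ml-libraries,rocalution,rocsparse,rocsolver,rocthrust,rocblas,rocrand,rocfft,rccl}-6.0.2-*.pkg.tar.zst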
Pinned Comments
wgottwalt commented on 2024-11-09 10:46 (UTC) (edited on 2024-11-26 15:23 (UTC) by wgottwalt)
Looks like the ROCm 6.2.2-1 SDK has a malfunctioning compiler. It produces a broken ollama binary (fp16 issues). You may need to stay with ROCm 6.0.2 for now. I don't know if this got fixed in a newer build release. But the initial SDK version "-1" is broken.
ROCm 6.2.4 fixes this issue completely.
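To check which ROCm build is currently installed before rebuilding ollama, something like the following works (package names taken from the dependency and downgrade lists above):
pacman -Q rocm-hip-sdk rocblas hipblas hip-runtime-amd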