Arch Linux User Repository

Search Criteria

Enter search criteria

Search by

Keywords

Out of Date

Sort by

Sort order

Per page

Package Details: llama.cpp-cuda b5061-1

Package Actions

Git Clone URL:	https://aur.archlinux.org/llama.cpp-cuda.git (read-only, click to copy)
Package Base:	llama.cpp-cuda
Description:	Port of Facebook's LLaMA model in C/C++ (with NVIDIA CUDA optimizations)
Upstream URL:	https://github.com/ggerganov/llama.cpp
Licenses:	MIT
Conflicts:	libggml, llama.cpp
Provides:	llama.cpp
Submitter:	txtsd
Maintainer:	txtsd
Last Packager:	txtsd
Votes:	6
Popularity:	0.64
First Submitted:	2024-10-26 20:17 (UTC)
Last Updated:	2025-04-06 14:52 (UTC)

Dependencies (13)

blas-openblas
blas64-openblas
cuda (cuda11.1^AUR, cuda-12.2^AUR, cuda12.0^AUR, cuda11.4^AUR, cuda11.4-versioned^AUR, cuda12.0-versioned^AUR)
curl (curl-git^AUR, curl-c-ares^AUR)
gcc-libs (gcc-libs-git^AUR, gccrs-libs-git^AUR, gcc11-libs^AUR, gcc-libs-snapshot^AUR)
glibc (glibc-git^AUR, glibc-linux4^AUR, glibc-eac^AUR)
openmp
python (python37^AUR, python311^AUR, python310^AUR)
python-numpy (python-numpy-git^AUR, python-numpy1^AUR, python-numpy-mkl-bin^AUR, python-numpy-mkl-tbb^AUR, python-numpy-mkl^AUR)
python-sentencepiece^AUR (python-sentencepiece-git^AUR)
cmake (cmake-git^AUR, cmake3^AUR) (make)
git (git-git^AUR, git-gl^AUR) (make)
python-pytorch (python-pytorch-cxx11abi^AUR, python-pytorch-cxx11abi-opt^AUR, python-pytorch-cxx11abi-cuda^AUR, python-pytorch-cxx11abi-opt-cuda^AUR, python-pytorch-cxx11abi-rocm^AUR, python-pytorch-cxx11abi-opt-rocm^AUR, python-pytorch-cuda, python-pytorch-opt, python-pytorch-opt-cuda, python-pytorch-opt-rocm, python-pytorch-rocm) (optional)

Required by (0)

Sources (4)

Pinned Comments

txtsd commented on 2024-10-26 20:17 (UTC) (edited on 2024-12-06 14:15 (UTC) by txtsd)

Alternate versions

llama.cpp
llama.cpp-vulkan
llama.cpp-sycl-fp16
llama.cpp-sycl-fp32
llama.cpp-cuda
llama.cpp-cuda-f16
llama.cpp-hip

Latest Comments

« First ‹ Previous 1 2 3 Next › Last »

txtsd commented on 2024-12-02 02:25 (UTC)

I'll give it a look later today and see if a newer package is warranted in that case. Thanks for your input!

v1993 commented on 2024-12-01 14:53 (UTC)

To be honest, I'm not 100% sure (it's a pretty old option and tacking down its origins is kinda tricky), but I'd expect at least a performance degradation on older GPUs (Nvidia used to be really bad at fp16 on older architectures).

txtsd commented on 2024-12-01 14:38 (UTC)

@v1993 Does that have to be a separate package, or will making the change in this package suffice without breaking things for users of older GPUs?

v1993 commented on 2024-12-01 14:29 (UTC)

Would it be possible to have a package version with GGML_CUDA_F16 enabled? It's a nice performance boost on newer GPUs. Thank you for your work on this package!

Poscat commented on 2024-11-28 09:46 (UTC)

@txtsd thank you

txtsd commented on 2024-11-25 07:05 (UTC)

Builds are not static anymore, and the service file has been fixed.

txtsd commented on 2024-11-24 03:16 (UTC)

@Poscat Thank you for your input! The service file was inherited from a previous version and maintainer of the package. I admit that the service was not tested.

The static builds were created to allow for side-by-side installation with whisper.cpp, since they both install libggml files.

Poscat commented on 2024-11-24 03:12 (UTC)

diff --git a/llama.cpp.service b/llama.cpp.service
index 4678d85..be89f9b 100644
--- a/llama.cpp.service
+++ b/llama.cpp.service
@@ -7,7 +7,7 @@ Type=simple
 EnvironmentFile=/etc/conf.d/llama.cpp
 ExecStart=/usr/bin/llama-server $LLAMA_ARGS
 ExecReload=/bin/kill -s HUP $MAINPID
-Restart=never
+Restart=no

 [Install]
 WantedBy=multi-user.target

Also your sysetmd service file is wrong. Did you even test your package?

Poscat commented on 2024-11-24 03:10 (UTC) (edited on 2024-11-24 03:10 (UTC) by Poscat)

diff --git a/PKGBUILD b/PKGBUILD
index ad448a7..3fdc20f 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -50,7 +50,8 @@ build() {
   local _cmake_options=(
     -B build
     -S "${_pkgname}"
-    -DCMAKE_BUILD_TYPE=None
+    -DCMAKE_BUILD_TYPE=MinSizeRel
+    -DCMAKE_CUDA_ARCHITECTURES=native
     -DCMAKE_INSTALL_PREFIX='/usr'
     -DGGML_NATIVE=OFF
     -DGGML_AVX2=OFF
@@ -59,8 +60,8 @@ build() {
     -DGGML_FMA=OFF
     -DGGML_ALL_WARNINGS=OFF
     -DGGML_ALL_WARNINGS_3RD_PARTY=OFF
-    -DBUILD_SHARED_LIBS=OFF
-    -DGGML_STATIC=ON
+    -DBUILD_SHARED_LIBS=ON
+    -DGGML_STATIC=OFF
     -DGGML_LTO=ON
     -DGGML_RPC=ON
     -DLLAMA_CURL=ON
@@ -75,7 +76,6 @@ build() {
 package() {
   DESTDIR="${pkgdir}" cmake --install build
   rm "${pkgdir}/usr/include/"ggml*
-  rm "${pkgdir}/usr/lib/"lib*.a

   install -Dm644 "${_pkgname}/LICENSE" "${pkgdir}/usr/share/licenses/${pkgname}/LICENSE"

this patch reduces the package size from 37G to 82M

Poscat commented on 2024-11-24 01:55 (UTC)

Maybe don't enable static linking? IDK

« First ‹ Previous 1 2 3 Next › Last »