❯ ls -lh /usr/bin/llama*
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-batched
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-batched-bench
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-bench
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-cli
-rwxr-xr-x 1 root root 814M Nov 23 00:02 /usr/bin/llama-convert-llama2c-to-ggml
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-cvector-generator
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-embedding
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-eval-callback
-rwxr-xr-x 1 root root 814M Nov 23 00:02 /usr/bin/llama-export-lora
-rwxr-xr-x 1 root root 814M Nov 23 00:02 /usr/bin/llama-gbnf-validator
-rwxr-xr-x 1 root root 111K Nov 23 00:02 /usr/bin/llama-gguf
-rwxr-xr-x 1 root root 131K Nov 23 00:02 /usr/bin/llama-gguf-hash
-rwxr-xr-x 1 root root 814M Nov 23 00:02 /usr/bin/llama-gguf-split
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-gritlm
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-imatrix
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-infill
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-llava-cli
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-lookahead
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-lookup
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-lookup-create
-rwxr-xr-x 1 root root 31K Nov 23 00:02 /usr/bin/llama-lookup-merge
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-lookup-stats
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-minicpmv-cli
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-parallel
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-passkey
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-perplexity
-rwxr-xr-x 1 root root 814M Nov 23 00:02 /usr/bin/llama-quantize
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-quantize-stats
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-retrieval
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-save-load-state
-rwxr-xr-x 1 root root 817M Nov 23 00:02 /usr/bin/llama-server
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-simple
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-simple-chat
-rwxr-xr-x 1 root root 816M Nov 23 00:02 /usr/bin/llama-speculative
-rwxr-xr-x 1 root root 815M Nov 23 00:02 /usr/bin/llama-tokenize
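Every CUDA-enabled binary above weighs in around 815 MB, most likely because each one statically links the same ggml CUDA backend with device code compiled for a wide range of GPU architectures; the few tiny outliers (llama-gguf, llama-gguf-hash, llama-lookup-merge) are presumably the tools that never link the GPU backend at all. A quick way to sanity-check the cumulative cost and the linking (a sketch; exact figures vary by build):

❯ du -ch /usr/bin/llama* | tail -n 1
❯ ldd /usr/bin/llama-cli | grep -iE 'cuda|cublas'

If the second command prints nothing CUDA-related, the CUDA runtime and kernels are baked into the binary statically rather than loaded as shared libraries.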
Package Details: llama.cpp-cuda b5195-1
Git Clone URL: https://aur.archlinux.org/llama.cpp-cuda.git (read-only)
Package Base: llama.cpp-cuda
Description: Port of Facebook's LLaMA model in C/C++ (with NVIDIA CUDA optimizations)
Upstream URL: https://github.com/ggerganov/llama.cpp
Licenses: MIT
Conflicts: libggml, llama.cpp
Provides: llama.cpp
Submitter: txtsd
Maintainer: txtsd
Last Packager: txtsd
Votes: 6
Popularity: 0.42
First Submitted: 2024-10-26 20:17 (UTC)
Last Updated: 2025-04-26 22:42 (UTC)
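Building from the AUR follows the standard workflow against the clone URL above:

❯ git clone https://aur.archlinux.org/llama.cpp-cuda.git
❯ cd llama.cpp-cuda
❯ makepkg -si

makepkg -si resolves the dependencies listed below and installs the resulting package; because of the Conflicts and Provides fields, it replaces a plain llama.cpp install and satisfies anything that depends on llama.cpp.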
Dependencies (13)
- blas-openblas
- blas64-openblas
- cuda (cuda11.1 [AUR], cuda-12.2 [AUR], cuda12.0 [AUR], cuda11.4 [AUR], cuda11.4-versioned [AUR], cuda12.0-versioned [AUR])
- curl (curl-git [AUR], curl-c-ares [AUR])
- gcc-libs (gcc-libs-git [AUR], gccrs-libs-git [AUR], gcc11-libs [AUR], gcc-libs-snapshot [AUR])
- glibc (glibc-git [AUR], glibc-linux4 [AUR], glibc-eac [AUR])
- openmp
- python (python37 [AUR], python311 [AUR], python310 [AUR])
- python-numpy (python-numpy-git [AUR], python-numpy1 [AUR], python-numpy-mkl-bin [AUR], python-numpy-mkl-tbb [AUR], python-numpy-mkl [AUR])
- python-sentencepiece [AUR] (python-sentencepiece-git [AUR])
- cmake (cmake-git [AUR], cmake3 [AUR]) (make)
- git (git-git [AUR], git-gl [AUR]) (make)
- python-pytorch (python-pytorch-cxx11abi [AUR], python-pytorch-cxx11abi-opt [AUR], python-pytorch-cxx11abi-cuda [AUR], python-pytorch-cxx11abi-opt-cuda [AUR], python-pytorch-cxx11abi-rocm [AUR], python-pytorch-cxx11abi-opt-rocm [AUR], python-pytorch-cuda, python-pytorch-opt, python-pytorch-opt-cuda, python-pytorch-opt-rocm, python-pytorch-rocm) (optional)
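After installation, pactree (from pacman-contrib) can show which of the alternatives above were actually pulled in to satisfy each dependency on a given system:

❯ pactree -d 1 llama.cpp-cuda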
Required by (0)
Sources (4)
Latest Comments
txtsd commented on 2024-11-15 10:19 (UTC)
Wait, can you show me the individual file sizes? That's not right.
txtsd commented on 2024-11-15 10:18 (UTC)
@brauliobo Such is life with CUDA :( I can't even build this on my CI because of the size requirements.
brauliobo commented on 2024-11-15 10:17 (UTC)
The build is taking 45 GB! Just the build/bin folder takes 38 GB:
braulio @ whitebeast ➜ bin git:(master) pwd
/home/braulio/.cache/yay/llama.cpp-cuda/src/build/bin
braulio @ whitebeast ➜ bin git:(master) du -h --max-depth=1
38G .
braulio @ whitebeast ➜ bin git:(master) ls
llama-batched llama-gbnf-validator llama-lookup llama-quantize llama-vdot test-grammar-integration test-sampling
llama-batched-bench llama-gguf llama-lookup-create llama-quantize-stats rpc-server test-grammar-parser test-tokenizer-0
llama-bench llama-gguf-hash llama-lookup-merge llama-retrieval test-arg-parser test-json-schema-to-grammar test-tokenizer-1-bpe
llama-cli llama-gguf-split llama-lookup-stats llama-save-load-state test-autorelease test-llama-grammar test-tokenizer-1-spm
llama-convert-llama2c-to-ggml llama-gritlm llama-minicpmv-cli llama-server test-backend-ops test-log
llama-cvector-generator llama-imatrix llama-parallel llama-simple test-barrier test-model-load-cancel
llama-embedding llama-infill llama-passkey llama-simple-chat test-c test-quantize-fns
llama-eval-callback llama-llava-cli llama-perplexity llama-speculative test-chat-template test-quantize-perf
llama-export-lora llama-lookahead llama-q8dot llama-tokenize test-grad0 test-rope
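The 38 GB build tree follows from the same duplication seen in the installed binaries: every example tool and test statically links the CUDA backend. A common way to shrink a local build is to compile upstream llama.cpp directly and restrict device code to one's own GPU. A sketch, with the flag names taken from upstream's CMake options at the time; the architecture value 86 is an example for an RTX 30-series card, not something this PKGBUILD sets:

❯ git clone https://github.com/ggerganov/llama.cpp
❯ cd llama.cpp
❯ cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=86
❯ cmake --build build --config Release -j$(nproc)

Targeting a single architecture avoids embedding a fat binary of kernels for every supported GPU generation, which is where most of the per-binary bulk comes from.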
Pinned Comments
txtsd commented on 2024-10-26 20:17 (UTC) (edited on 2024-12-06 14:15 (UTC) by txtsd)
Alternate versions
llama.cpp
llama.cpp-vulkan
llama.cpp-sycl-fp16
llama.cpp-sycl-fp32
llama.cpp-cuda
llama.cpp-cuda-f16
llama.cpp-hip