mirror of
https://github.com/fauxpilot/fauxpilot.git
synced 2025-03-12 04:36:10 -07:00
GPU Support Matrix
Stonley edited this page 2024-11-21 22:33:15 -08:00
This page documents GPU configurations that are known to work (or not work) with FauxPilot, and which models each one has been tested with. If you have used FauxPilot with a GPU and model configuration that's not listed in the table below, please add it!
GPU Model | VRAM | Number of GPUs | Working? | Models | Notes | Checker |
---|---|---|---|---|---|---|
NVIDIA Tesla A100 | 80GB | 1 | Yes | 350M, 2B, 6B, 16B | | @leemgs |
NVIDIA Tesla A100 | 40GB | 1 | Yes | 16B | | @Ghost-Assassin |
NVIDIA Tesla T4 | 16GB | 1, 2, 4 | Yes | 350M, 2B, 6B | | @Jaeker0512 |
NVIDIA Tesla V100 | 16GB | 1 | Yes | 350M, 6B | Ran out of memory when trying 16B-multi-2gpu on 2 such GPUs | @askoldilvento |
NVIDIA RTX A6000 | 48GB | 1, 2, 4 | Yes | 350M, 2B, 6B, 16B | | @moyix |
NVIDIA RTX A4000 | 16GB | 1 | Yes | 6B | | @grantharris33 |
NVIDIA RTX 4090 | 24GB | 1 | Yes | 6B | | @TK009 |
NVIDIA RTX 4070 Ti | 12GB | 1 | Yes | 2B | | @Stonley890 |
NVIDIA RTX 3090 | 24GB | 1 | Yes | 2B | | @152334H |
NVIDIA RTX 3090 | 24GB | 1 | Yes | 6B | Docker-in-WSL2 & fauxpilot-windows | @Frederisk |
NVIDIA RTX 3090 | 24GB | 1 | Yes | 6B | podman in Linux [tiny tweaks needed] | @mormegil-cz |
NVIDIA RTX 3080Ti | 12GB | 1 | Yes | 350M, 2B | | @??? |
NVIDIA RTX 3070Ti | 8GB | 1 | Yes | 350M, 2B | Tested in Docker-in-WSL2 | @m5kro |
NVIDIA RTX 3060Ti | 8GB | 1 | Yes | 350M, 2B | Tested in Docker-in-WSL | @dewacandra4 |
NVIDIA RTX 2080Ti | 12GB | 1 | Yes | 350M, 2B, 6B | | @leemgs |
NVIDIA RTX 2080 | 8GB | 1 | Yes | 350M | Docker-in-WSL, Windows 10, 16GB, slow | @enoris75 |
NVIDIA RTX 2070 SUPER | 8GB | 1 | Yes | 350M, 2B | Tested in Docker-in-WSL | @SoulRaven80 |
NVIDIA RTX 2060 SUPER | 8GB | 1 | Yes | 2B | | @xjtu-blacksmith |
NVIDIA RTX 2060 XC | 12GB | 1 | Yes | 2B | Tested in Docker-in-WSL | @azeemba |
NVIDIA GTX 1080Ti | 11GB | 1 | Yes | 350M, 2B | | @??? |
NVIDIA GTX 1060 | 6GB | 1 | Yes | 350M | Docker-in-WSL2 & fauxpilot-windows | @Frederisk |
NVIDIA GTX 1060 | 6GB | 1 | Yes | 350M | Linux as is | @billyblackburn |
NVIDIA Titan Xp | 12GB | 1 | Yes | 350M, 2B, 6B | | @leemgs |
AMD RX6800XT | 16GB | 1 | Yes | 2B | Python backend only. Used the hack described at https://github.com/fauxpilot/fauxpilot/discussions/81#discussioncomment-5785300. The Triton transformer backend will crash. | @Ghost-Assassin |
NVIDIA Quadro T2000 | 36GB (4GB dedicated, 32GB shared) | 1 | Yes | 350M, 2B, 6B | ~8s for 10 solutions on the smallest model. All other models use shared memory and are extremely slow; 16B loads but doesn't work | @MikeS159 |
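
If you want a quick feel for how much VRAM each model size needs, the rows above can be summarised with a few lines of Python. This is only a rough sketch: `REPORTS` and `smallest_reported_vram` are hypothetical names (not part of FauxPilot), the rows are copied by hand from a subset of the table, and the Quadro T2000 entry is left out because it relies on shared memory. A working report is not a guarantee for your setup.

```python
# Hypothetical helper, not part of FauxPilot.
# Each tuple: (GPU model, dedicated VRAM in GB, models reported working),
# copied by hand from a subset of the table above.
REPORTS = [
    ("NVIDIA Tesla A100", 40, ["16B"]),
    ("NVIDIA RTX A6000", 48, ["350M", "2B", "6B", "16B"]),
    ("NVIDIA RTX 3090", 24, ["6B"]),
    ("NVIDIA RTX A4000", 16, ["6B"]),
    ("NVIDIA Titan Xp", 12, ["350M", "2B", "6B"]),
    ("NVIDIA RTX 2060 SUPER", 8, ["2B"]),
    ("NVIDIA GTX 1060", 6, ["350M"]),
]


def smallest_reported_vram(reports):
    """Map each model size to the smallest VRAM (GB) that has a working report."""
    best = {}
    for _gpu, vram_gb, models in reports:
        for model in models:
            # Keep the lowest VRAM figure seen so far for this model size.
            if model not in best or vram_gb < best[model]:
                best[model] = vram_gb
    return best


if __name__ == "__main__":
    for model, vram_gb in smallest_reported_vram(REPORTS).items():
        print(f"{model}: reported working with {vram_gb}GB")
```

Treat the result as a lower bound taken from user reports, not a requirement: several entries were tested in a single environment only (for example Docker-in-WSL), and some are noted as slow.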