fauxpilot

mirror of https://github.com/fauxpilot/fauxpilot.git synced 2025-03-12 04:36:10 -07:00

This page lists GPU configurations that are known to work (or not work) with FauxPilot, and the models each one has been tested with. If you have used FauxPilot with a GPU and model configuration that's not listed in the table below, please add it!

GPU Model	VRAM	Number of GPUs	Working?	Models	Notes	Checker
NVIDIA Tesla A100	80GB	1	Yes	350M, 2B, 6B, 16B		@leemgs
NVIDIA Tesla A100	40GB	1	Yes	16B		@Ghost-Assassin
NVIDIA Tesla T4	16GB	1, 2, 4	Yes	350M, 2B, 6B		@Jaeker0512
NVIDIA Tesla V100	16GB	1	Yes	350M, 6B	Out of memory when trying 16B-multi-2gpu on 2 of these GPUs	@askoldilvento
NVIDIA RTX A6000	48GB	1, 2, 4	Yes	350M, 2B, 6B, 16B		@moyix
NVIDIA RTX A4000	16GB	1	Yes	6B		@grantharris33
NVIDIA RTX 4090	24GB	1	Yes	6B		@TK009
NVIDIA RTX 4070 Ti	12GB	1	Yes	2B		@Stonley890
NVIDIA RTX 3090	24GB	1	Yes	2B		@152334H
NVIDIA RTX 3090	24GB	1	Yes	6B	Docker-in-WSL2 & fauxpilot-windows	@Frederisk
NVIDIA RTX 3090	24GB	1	Yes	6B	Podman on Linux (minor tweaks needed)	@mormegil-cz
NVIDIA RTX 3080Ti	12GB	1	Yes	350M, 2B		@???
NVIDIA RTX 3070Ti	8GB	1	Yes	350M, 2B	Tested in Docker-in-WSL2	@m5kro
NVIDIA RTX 3060Ti	8GB	1	Yes	350M, 2B	Tested in Docker-in-WSL	@dewacandra4
NVIDIA RTX 2080Ti	11GB	1	Yes	350M, 2B, 6B		@leemgs
NVIDIA RTX 2080	8GB	1	Yes	350M	Docker-in-WSL, Windows 10, 16GB, slow	@enoris75
NVIDIA RTX 2070 SUPER	8GB	1	Yes	350M, 2B	Tested in Docker-in-WSL	@SoulRaven80
NVIDIA RTX 2060 SUPER	8GB	1	Yes	2B		@xjtu-blacksmith
NVIDIA RTX 2060 XC	12GB	1	Yes	2B	Tested in Docker-in-WSL	@azeemba
NVIDIA GTX 1080Ti	11GB	1	Yes	350M, 2B		@???
NVIDIA GTX 1060	6GB	1	Yes	350M	Docker-in-WSL2 & fauxpilot-windows	@Frederisk
NVIDIA GTX 1060	6GB	1	Yes	350M	Linux as is	@billyblackburn
NVIDIA Titan Xp	12GB	1	Yes	350M, 2B, 6B		@leemgs
AMD RX6800XT	16GB	1	Yes	2B	Python backend only. Used the workaround described in https://github.com/fauxpilot/fauxpilot/discussions/81#discussioncomment-5785300. The Triton transformer backend crashes.	@Ghost-Assassin
NVIDIA Quadro T2000	36GB (4GB dedicated, 32GB shared)	1	Yes	350M, 2B, 6B	~8s for 10 solutions on the smallest model. All other models use shared memory and are extremely slow; 16B loads but doesn't work	@MikeS159
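
If you want to check the table for a particular GPU from a script, a minimal sketch along these lines may help. It assumes you have copied the rows as tab-separated text; the names `TABLE`, `load_rows`, and `models_reported_for` are hypothetical and not part of FauxPilot, and only a handful of rows from the table above are included here.

```python
import csv
import io

# A few rows copied from the table above, tab-separated.
# Paste in the remaining rows as needed.
TABLE = (
    "GPU Model\tVRAM\tNumber of GPUs\tWorking?\tModels\tNotes\tChecker\n"
    "NVIDIA Tesla T4\t16GB\t1, 2, 4\tYes\t350M, 2B, 6B\t\t@Jaeker0512\n"
    "NVIDIA RTX A6000\t48GB\t1, 2, 4\tYes\t350M, 2B, 6B, 16B\t\t@moyix\n"
    "NVIDIA RTX 4090\t24GB\t1\tYes\t6B\t\t@TK009\n"
    "NVIDIA GTX 1060\t6GB\t1\tYes\t350M\tDocker-in-WSL2 & fauxpilot-windows\t@Frederisk\n"
)


def load_rows(text):
    """Parse the tab-separated table into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def models_reported_for(rows, gpu_substring):
    """Return the set of models reported as working on GPUs whose name contains gpu_substring."""
    found = set()
    for row in rows:
        name_matches = gpu_substring.lower() in row["GPU Model"].lower()
        if name_matches and row["Working?"] == "Yes":
            found.update(m.strip() for m in row["Models"].split(","))
    return found


rows = load_rows(TABLE)
print(sorted(models_reported_for(rows, "T4")))
```

An empty result only means nobody has reported that GPU yet, not that it is unsupported.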