The main change is that the model config format has changed. To
handle this, the converter now includes a script that upgrades an
existing model to the new format.
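As a rough sketch of what such an upgrade can look like (the real
script lives in the converter; the section and key names below are
placeholders, not the actual config format):

    #!/usr/bin/env python3
    # Illustrative sketch only: read an existing config.ini, keep the
    # old keys, and fill in newly required keys with defaults. The key
    # names here are placeholders, not the real format.
    import configparser
    import sys

    def upgrade_config(path: str) -> None:
        cfg = configparser.ConfigParser()
        cfg.read(path)
        for section in cfg.sections():
            # Newer versions may expect extra keys; add them only if absent.
            cfg[section].setdefault("model_variant", "gptj")
            cfg[section].setdefault("weight_data_type", "fp32")
        with open(path, "w") as f:
            cfg.write(f)

    if __name__ == "__main__":
        upgrade_config(sys.argv[1])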
We also no longer need to maintain our own fork of Triton, since
upstream has fixed the bug with GPT-J models. This should make it
much easier to stay in sync with upstream (although we still have
to build our own container, since NVIDIA doesn't appear to publish
a prebuilt Triton+FT container).
Newer Triton should let us use some nice features:
- Support for more models, like GPT-NeoX
- Streaming token support (this still needs to be implemented in
the proxy though)
- Dynamic batching (see the config sketch below)
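For reference, enabling dynamic batching is a small addition to a
model's config.pbtxt; the batch sizes and queue delay below are only
example values, not what we ship:

    dynamic_batching {
      preferred_batch_size: [ 4, 8 ]
      max_queue_delay_microseconds: 100
    }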
Still TODO:
- Proxy support for streaming tokens
- Update setup.sh and launch.sh to detect whether a model upgrade
  is needed and run it automatically.
- Add a validation rule to ensure the backend is set to fastertransformer or python-backend (a rough sketch of such a check is below)
- Add a warning if the model is unavailable, which likely means the user has not configured it correctly
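A minimal sketch of what that backend check could look like, assuming
the value is read from an environment variable (the name MODEL_BACKEND
and the error text are placeholders, not the actual setup.sh/proxy
interface):

    # Hypothetical validation sketch; variable names are placeholders.
    import os

    ALLOWED_BACKENDS = {"fastertransformer", "python-backend"}

    def validate_backend() -> str:
        backend = os.environ.get("MODEL_BACKEND", "")
        if backend not in ALLOWED_BACKENDS:
            raise ValueError(
                f"backend must be one of {sorted(ALLOWED_BACKENDS)}, got {backend!r}"
            )
        return backend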
Signed-off-by: Parth Thakkar <thakkarparth007@gmail.com>
- Modify the Dockerfile to include bitsandbytes, transformers, and the latest version of PyTorch
- Minor modifications in utils/codegen.py so that the same client works with both FT and the Python backend
- Minor modifications in launch.sh (no need to name models by GPU)
- Add an installation script for adding a new Python model (with a very simple config_template; a sketch is below)
- Modify setup.sh so that it works with both FT and Python backend models
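To give a sense of how small that config_template can be, a Triton
python-backend model config needs little more than the following (the
model name, tensor names, and types here are placeholders, not the
actual template):

    name: "example-py-model"
    backend: "python"
    max_batch_size: 4
    input [
      {
        name: "input"
        data_type: TYPE_STRING
        dims: [ -1 ]
      }
    ]
    output [
      {
        name: "output"
        data_type: TYPE_STRING
        dims: [ -1 ]
      }
    ]
    instance_group [
      {
        kind: KIND_GPU
      }
    ]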
Signed-off-by: Parth Thakkar <thakkarparth007@gmail.com>