mirror of https://github.com/fauxpilot/fauxpilot.git synced 2025-03-12 04:36:10 -07:00

14 Commits

Author SHA1 Message Date
Brendan Dolan-Gavitt
02f7887f17 Support newer upstream Triton
The main change is that the model config format has changed. To
deal with this we have a new script in the converter that will
upgrade a model to the new version.

Aside from that, we also no longer need to maintain our own fork
of Triton since they have fixed the bug with GPT-J models. This
should make it a lot easier to stay synced with upstream (although
we still have to build our own container since there doesn't seem
to be a prebuilt Triton+FT container hosted by NVIDIA).

Newer Triton should let us use some nice features:

- Support for more models, like GPT-NeoX
- Streaming token support (this still needs to be implemented in
  the proxy though)
- Dynamic batching

Still TODO:

- Proxy support for streaming tokens
- Add stuff to setup.sh and launch.sh to detect if a model upgrade
  is needed and do it automatically.
2023-02-13 16:30:28 -05:00
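The "detect if a model upgrade is needed" TODO above could be sketched roughly as below. This is a hypothetical illustration, not the repo's actual mechanism: the `.config_version` sidecar file and `CURRENT_FORMAT` value are invented names, and the real flow would shell out to the converter's upgrade script.

```python
from pathlib import Path

# Illustrative only: track which config format a converted model uses
# so setup.sh/launch.sh could upgrade it automatically.
CURRENT_FORMAT = 2

def needs_upgrade(model_dir: Path) -> bool:
    version_file = model_dir / ".config_version"
    if not version_file.exists():
        return True  # models converted before versioning use the old format
    return int(version_file.read_text().strip()) < CURRENT_FORMAT

def upgrade(model_dir: Path) -> None:
    # The real flow would invoke the converter's upgrade script here;
    # this sketch only records that the model is now current.
    (model_dir / ".config_version").write_text(str(CURRENT_FORMAT))
```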
Parth Thakkar
4bf40cdb6c Some minor ergonomic changes for python backend
- Add a validation rule to ensure  is set to fastertransformer or python-backend
- Add a warning if the model is unavailable, which likely means the user has not set  correctly

Signed-off-by: Parth Thakkar <thakkarparth007@gmail.com>
2023-01-02 18:54:51 +05:30
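The validation rule and warning from this commit might look something like the sketch below. The function name and signature are assumptions for illustration; only the two accepted backend values come from the commit message.

```python
VALID_BACKENDS = {"fastertransformer", "python-backend"}

def validate_backend(backend: str, model: str, available_models: set) -> None:
    # Only the two backend values named in the commit are accepted.
    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"backend must be one of {sorted(VALID_BACKENDS)}, got {backend!r}"
        )
    # An unavailable model is a warning, not an error: it usually means
    # the backend setting does not match the deployed model.
    if model not in available_models:
        print(f"WARNING: model {model!r} is unavailable; check the backend setting")
```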
Parth Thakkar
e90f1c5dc2 Discard extra newlines
They don't seem necessary; Copilot works without these newlines.

Signed-off-by: Parth Thakkar <thakkarparth007@gmail.com>
2022-12-24 02:58:31 +05:30
Parth Thakkar
566cf7a675 Fix json formatting issue with streaming
See this and the following comment for context:

https://github.com/fauxpilot/fauxpilot/issues/1#issuecomment-1357174072

Signed-off-by: Parth Thakkar <thakkarparth007@gmail.com>
2022-12-24 02:50:06 +05:30
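For context on the streaming JSON formatting this commit fixes: OpenAI-compatible streaming responses are sent as server-sent events, where each chunk is one compact JSON object on a single `data:` line followed by a blank line. A minimal sketch of that framing (helper names are illustrative, not from the repo):

```python
import json

def sse_chunk(payload: dict) -> str:
    # One SSE event per JSON chunk: a single compact "data:" line
    # terminated by a blank line, so clients can split on "\n\n".
    return f"data: {json.dumps(payload, separators=(',', ':'))}\n\n"

def sse_done() -> str:
    # OpenAI-style streams end with a literal [DONE] sentinel.
    return "data: [DONE]\n\n"
```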
Parth Thakkar
f0a12b5e8e Merge branch 'main' into python_backend
Signed-off-by: Parth Thakkar <thakkarparth007@gmail.com>
2022-11-09 12:51:16 -06:00
Brendan Dolan-Gavitt
b7b85461af Reformat the error to match OpenAI's 2022-11-01 12:46:55 -04:00
Fred de Gier
2a91018792 Resolve merge conflicts and fix issues with setup.sh 2022-10-20 16:09:12 +02:00
Fred de Gier
e2486698e0 feat: Return a 400 if prompt exceeds max tokens 2022-10-20 14:56:39 +02:00
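The 400-on-oversized-prompt check, combined with the OpenAI-style error shape from the commit above it, could be sketched as below. The limit, function name, and exact message wording are assumptions; only the behavior (reject requests exceeding max tokens with a 400) comes from the commit.

```python
MAX_CONTEXT = 2048  # illustrative; the real limit depends on the model

def prompt_length_error(prompt_tokens: int, max_new_tokens: int):
    # Returns an OpenAI-style error body (to accompany an HTTP 400)
    # when the request would exceed the context window, else None.
    requested = prompt_tokens + max_new_tokens
    if requested <= MAX_CONTEXT:
        return None
    return {
        "error": {
            "message": (
                f"This model's maximum context length is {MAX_CONTEXT} tokens, "
                f"however you requested {requested} tokens."
            ),
            "type": "invalid_request_error",
            "param": None,
            "code": None,
        }
    }
```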
Parth Thakkar
01f1cbb629 Add python backend support
- Modify Dockerfile to include bitsandbytes, transformers, and the latest version of PyTorch
- Minor modifications in utils/codegen.py so that the same client works with FT and the Python backend
- Minor modifications in launch.sh (no need to name models by GPU)
- Add an installation script for adding a new Python model (with a super simple config_template)
- Modify setup.sh so that it works with both FT and Python backend models

Signed-off-by: Parth Thakkar <thakkarparth007@gmail.com>
2022-10-16 22:05:09 -05:00
Fred de Gier
de71bb6ff5 Resolve merge conflicts 2022-10-03 14:27:32 +02:00
Fred de Gier
87f4f53e27 Simplify config and port handling 2022-10-03 14:13:10 +02:00
Fred de Gier
6f49915d2a Error handling 2022-10-03 11:47:54 +02:00
Rowe Wilson Frederisk Holme
2914ca278d
Fix error to_word_list_format is undefined
Another bug from .
2022-09-22 02:10:13 +08:00
Fred de Gier
8895b74238 Rewrite API to FastAPI, separate API from CodeGen, remove dev settings 2022-09-12 12:59:37 +02:00