Brendan Dolan-Gavitt 02f7887f17 Support newer upstream Triton
The main change is that the model config format has changed. To
deal with this, the converter now includes a script that upgrades
a model to the new format.
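
A minimal sketch of the kind of rewrite such an upgrade performs,
assuming the change amounts to renamed fields in config.pbtxt (the
field names and the model path below are placeholders, not the actual
upstream changes):

    # Hypothetical sketch: rewrite renamed fields in an existing Triton
    # config.pbtxt in place. "old_name"/"new_name" are placeholders, not
    # the fields that actually changed upstream.
    import re
    from pathlib import Path

    RENAMES = {"old_name": "new_name"}

    def upgrade_config(path: str) -> None:
        text = Path(path).read_text()
        for old, new in RENAMES.items():
            text = re.sub(rf"\b{re.escape(old)}\b", new, text)
        Path(path).write_text(text)

    upgrade_config("models/codegen-6B-multi-1gpu/fastertransformer/config.pbtxt")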

Aside from that, we also no longer need to maintain our own fork
of Triton since they have fixed the bug with GPT-J models. This
should make it a lot easier to stay synced with upstream (although
we still have to build our own container since there doesn't seem
to be a prebuilt Triton+FT container hosted by NVIDIA).

Newer Triton should let us use some nice features:

- Support for more models, like GPT-NeoX
- Streaming token support (this still needs to be implemented in
  the proxy; a client-side sketch follows this list)
- Dynamic batching
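
As a sketch of what proxy-side streaming consumption could look like,
using the tritonclient gRPC streaming API. The model name, tensor name,
and dtype below are assumptions based on the FasterTransformer backend,
and a real request needs more inputs than shown:

    import queue

    import numpy as np
    import tritonclient.grpc as grpcclient

    # Responses arrive on a background thread; hand them off via a queue.
    tokens = queue.Queue()

    def on_response(result, error):
        tokens.put(error if error is not None else result)

    client = grpcclient.InferenceServerClient("localhost:8001")

    # "input_ids"/UINT32 follow the FasterTransformer backend's convention
    # (an assumption here); a real request also needs e.g. input_lengths
    # and request_output_len.
    prompt = np.array([[1, 2, 3]], dtype=np.uint32)
    inp = grpcclient.InferInput("input_ids", prompt.shape, "UINT32")
    inp.set_data_from_numpy(prompt)

    client.start_stream(callback=on_response)
    client.async_stream_infer(model_name="fastertransformer", inputs=[inp])
    client.stop_stream()  # blocks until in-flight responses are delivered

    while not tokens.empty():
        print(tokens.get())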

Still TODO:

- Proxy support for streaming tokens
- Add logic to setup.sh and launch.sh to detect whether a model
  upgrade is needed and run it automatically (a detection sketch
  follows this list).
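
A minimal sketch of the detection half, assuming the new config format
can be recognized by a marker field. The field used here,
model_transaction_policy, is a guess; substitute whatever actually
distinguishes the two formats:

    from pathlib import Path

    # Guessed marker: newer configs declare a transaction policy for
    # decoupled (streaming) mode; older ones do not. This is an assumption.
    MARKER = "model_transaction_policy"

    def needs_upgrade(model_dir: str) -> bool:
        config = Path(model_dir) / "fastertransformer" / "config.pbtxt"
        return MARKER not in config.read_text()

    if needs_upgrade("models/codegen-6B-multi-1gpu"):
        print("old-format config detected; run the upgrade script first")
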
2023-02-13 16:30:28 -05:00

This section describes the scripts and supporting files used to convert deep learning models for serving with Triton:

  • Dockerfile: A Dockerfile that builds an Ubuntu 20.04-based image with the Transformers library installed.
    • download_and_convert_model.sh: A shell script that downloads a model (e.g., codegen-6B-multi) and converts it for the provided number of GPUs.
      • codegen_gptj_convert.py: A Python script that converts Salesforce CodeGen models (e.g., Salesforce/codegen-350M-multi) to the GPT-J format.
      • huggingface_gptj_convert.py: A Python script that converts a Hugging Face GPT-J model (GPTJForCausalLM) to the FasterTransformer format.
  • triton_config_gen.py: A Python script that generates the config file for running a converted CodeGen model with Triton.
    • config_template.pbtxt: A template that defines the format of the generated Triton config file.
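
Taken together these form a pipeline: CodeGen checkpoint → GPT-J checkpoint → FasterTransformer weights → Triton config. Below is a sketch of that flow driven from Python; every flag and output path is an assumption for illustration, so check each script's --help for the real interface:

    # Hypothetical end-to-end conversion flow. All flags and paths below
    # are illustrative assumptions, not the scripts' verified interfaces.
    import subprocess

    def run(*cmd: str) -> None:
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Convert a Salesforce CodeGen checkpoint to a GPT-J checkpoint.
    run("python3", "codegen_gptj_convert.py",
        "--code_model", "Salesforce/codegen-350M-multi", "cg350M-gptj")

    # 2. Convert the GPT-J checkpoint to FasterTransformer weights for 1 GPU.
    run("python3", "huggingface_gptj_convert.py",
        "-in_file", "cg350M-gptj", "-saved_dir", "cg350M-ft",
        "-infer_gpu_num", "1")

    # 3. Generate the Triton config from the template.
    run("python3", "triton_config_gen.py",
        "--template", "config_template.pbtxt", "--model_store", "models")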