LLaMA-Factory/examples/lora_single_gpu/README.md

2024-03-04 19:16:35 +00:00
Usage (run the scripts in each chain from left to right):
- `pretrain.sh`
- `sft.sh` -> `reward.sh` -> `ppo.sh`
- `sft.sh` -> `dpo.sh` -> `predict.sh`
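
The chains above can be run by hand, one script at a time. As a minimal sketch, assuming the scripts are invoked with `bash` from this directory and that a stage should only start if the previous one succeeded, a small wrapper could look like this. `run_chain` is a hypothetical helper for illustration and is not part of LLaMA-Factory.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run the given stage scripts in order and
# stop at the first one that exits with a non-zero status.
run_chain() {
    local script
    for script in "$@"; do
        echo "running ${script}"
        bash "${script}" || { echo "failed: ${script}" >&2; return 1; }
    done
}

# Example invocations (assumed to be run from examples/lora_single_gpu):
# run_chain sft.sh reward.sh ppo.sh
# run_chain sft.sh dpo.sh predict.sh
```

Stopping at the first failure avoids launching a later stage against output that an earlier stage never produced.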