hiyouga
17bf8a2c3a
support ORPO
2024-03-31 18:29:50 +08:00
li.yunhao
9c2ef9cdf4
fix pile dataset hf hub url
2024-03-30 16:06:10 +08:00
hiyouga
3271af2afc
add orca_dpo_pairs dataset
2024-03-20 20:09:06 +08:00
SirlyDreamer
e165965341
Follow HF_ENDPOINT environment variable
2024-03-20 08:31:30 +00:00
hiyouga
be99799413
update parser
2024-03-10 13:35:20 +08:00
hiyouga
894d183214
update readme, add starcoder2, cosmopedia
2024-03-03 01:01:46 +08:00
hiyouga
32884523c5
update data
2024-03-02 19:37:18 +08:00
hiyouga
1630a4cb8f
fix #2533
2024-02-21 22:47:48 +08:00
hiyouga
22acab8aff
fix #2481
2024-02-15 19:07:47 +08:00
hiyouga
a754f6e9ec
update data/readme
2024-02-10 21:04:29 +08:00
hiyouga
7d2dc83c5e
improve aligner
2024-02-10 16:39:19 +08:00
Mark Mueller
1d3598afa1
Slim Orca data parsing
2024-02-08 19:32:20 +01:00
Johann-Peter Hartmann
49c69ea4b9
WS fix
2024-02-06 20:13:04 +01:00
Johann-Peter Hartmann
1126563505
add ranking to dpo dataset
2024-02-06 20:12:36 +01:00
Johann-Peter Hartmann
870182c3a9
remove comma
2024-02-03 08:48:39 +01:00
Johann-Peter Hartmann
4e27950acb
Merge branch 'hiyouga:main' into main
2024-01-31 14:05:52 +01:00
hiyouga
521ad76552
fix autoset attn impl, update data readme
2024-01-31 11:58:07 +08:00
Johann-Peter Hartmann
d9a8301ed4
Add support for german datasets
2024-01-30 10:18:01 +01:00
hiyouga
dbaaa4546e
Update dataset_info.json
2024-01-23 00:10:32 +08:00
hiyouga
b2fb0eca56
fix #2282 and update tool prompt
2024-01-22 22:27:30 +08:00
hiyouga
486cc8d360
add array param format
2024-01-21 22:17:48 +08:00
hiyouga
487dee066f
fix dataset
2024-01-18 12:59:30 +08:00
hiyouga
f1067d2b58
enable cutoff len
2024-01-18 12:25:42 +08:00
hiyouga
d9f1cae351
support function calling
2024-01-18 09:54:23 +08:00
hiyouga
5b93d545e2
tiny update
2023-12-25 18:29:34 +08:00
hiyouga
709ac8870a
add models
2023-12-18 19:09:31 +08:00
hiyouga
71389be37c
support autogptq in llama board #246
2023-12-16 16:31:30 +08:00
hiyouga
0a9c6e0146
support system column #1765
2023-12-12 19:45:59 +08:00
hiyouga
d5b2c57a35
fix modelscope data hub
2023-12-12 18:33:06 +08:00
hoshi-hiyouga
6382efec52
Merge branch 'main' into feat/support_ms
2023-12-12 17:55:32 +08:00
xingjun.wang
e80a989d49
modify guanaco
2023-12-12 15:00:37 +08:00
xingjun.wang
73b50a26b9
update dataset info
2023-12-12 14:53:59 +08:00
xingjun.wang
09533e95ed
update args for MsDataset.load
2023-12-12 13:02:54 +08:00
xingjun.wang
fe4acc66b0
add new datasets
2023-12-12 12:44:15 +08:00
xingjun.wang
0ce18a3782
add open orca
2023-12-12 12:34:04 +08:00
hiyouga
28d5de7e78
fix #1784
2023-12-09 20:53:18 +08:00
yuze.zyz
e4cf2a75ca
fix typo
2023-12-08 18:13:26 +08:00
yuze.zyz
9c2247d700
support ms dataset
2023-12-08 18:00:57 +08:00
hiyouga
bf6f6aeefe
fix #1696
2023-12-01 15:34:50 +08:00
Marco
9468ee9012
Update dataset_info.json
Added the Nectar dataset, already preprocessed and divided into sft and rl, with a preprompt added to each instruction, since this has been seen to increase instruction following
2023-11-30 16:21:34 +01:00
hiyouga
7b1aa6f63c
update dataset
2023-11-17 23:19:12 +08:00
hiyouga
ce78303600
support full-parameter PPO
2023-11-16 02:08:04 +08:00
hiyouga
386f590209
add template, modify datasets
2023-11-09 15:53:23 +08:00
hiyouga
2b5e33c338
update data readme
2023-11-03 00:15:23 +08:00
hiyouga
cc8ffa10d8
update data readme (zh)
2023-11-02 23:42:49 +08:00
hiyouga
a837172413
support sharegpt format, add datasets
2023-11-02 23:10:04 +08:00
hiyouga
026af87e7f
add MathInstruct dataset
2023-09-13 22:30:14 +08:00
hiyouga
a9d1fb72f7
refactor dataset_attr, add eos in pt, fix #757
2023-09-01 19:00:45 +08:00
codemayq
604f85487b
add ad gen dataset
2023-08-27 20:35:32 +08:00
codemayq
cece66d48a
add readme for dataset
2023-08-23 19:55:45 +08:00