DeepSeek-R1-Distill
Paper
DeepSeek-R1
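Model Architecture
This project covers three base model families: Llama 3.1, Llama 3.3, and Qwen2.5. All three are decoder-only architectures.
Algorithm Principle
The DeepSeek-R1-Distill models start from strong open-source base models and are obtained by supervised fine-tuning (SFT) on high-quality data generated by DeepSeek-R1.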
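To make the SFT step concrete, the sketch below shows a common way to build labels for supervised fine-tuning: prompt tokens are masked so the loss covers only the response text. The `build_sft_labels` helper and the token ids are hypothetical, and whether this project masks the prompt in exactly this way is an assumption. The value `-100` is the index that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def build_sft_labels(prompt_ids, response_ids):
    """Concatenate prompt and response; mask the prompt so only the response is trained on."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


# Hypothetical token ids, for illustration only.
input_ids, labels = build_sft_labels([101, 102, 103], [201, 202])
print(input_ids)
print(labels)
```

The two lists have the same length, and only the positions that belong to the response carry a real label.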
Environment Setup
Docker (Method 1)
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-ubuntu22.04-dtk24.04.3-py3.10
docker run --shm-size 500g --network=host --name=dpskv3 --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash
pip install https://download.sourcefind.cn:65024/directlink/4/lmslim/DAS1.3/lmslim-0.1.2+das.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl
pip install https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.3/vllm-0.6.2+das.opt1.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl
Dockerfile (Method 2)
docker build -t <IMAGE_NAME>:<TAG> .
docker run --shm-size 500g --network=host --name=dpskv3 --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash
pip install https://download.sourcefind.cn:65024/directlink/4/lmslim/DAS1.3/lmslim-0.1.2+das.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl
pip install https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.3/vllm-0.6.2+das.opt1.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl
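After either install method, a quick check inside the container can confirm that both wheels are in place. This is a minimal sketch that assumes the distribution names are `lmslim` and `vllm` and that the versions begin with `0.1.2` and `0.6.2`, as the wheel filenames suggest. The helper names are hypothetical.

```python
from importlib import metadata

# Version prefixes taken from the wheel filenames in the install commands.
EXPECTED = {"lmslim": "0.1.2", "vllm": "0.6.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package to (installed_version, matches_expected_prefix)."""
    report = {}
    for package, prefix in expected.items():
        version = installed_version(package)
        report[package] = (version, version is not None and version.startswith(prefix))
    return report


if __name__ == "__main__":
    for package, (version, ok) in check_environment().items():
        print(f"{package}: {version or 'not installed'} ({'ok' if ok else 'check'})")
```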
Dataset
The training data must be generated with DeepSeek-R1. This project provides a sample dataset for testing; see examples/toy.json.
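The layout of examples/toy.json is not shown here, so the sketch below is only an assumption about what such data could look like. It uses LLaMA-Factory's alpaca-style fields (`instruction`, `input`, `output`) and wraps the reasoning in `<think>` tags. The sample text is a hand-written placeholder rather than real DeepSeek-R1 output, and `my_distill_data.json` is a hypothetical file name.

```python
import json


def make_record(question, reasoning, answer):
    """Build one alpaca-style SFT sample whose output holds the reasoning and the final answer."""
    return {
        "instruction": question,
        "input": "",
        "output": f"<think>\n{reasoning}\n</think>\n\n{answer}",
    }


# Hand-written placeholder sample, not real DeepSeek-R1 output.
records = [
    make_record(
        "Two classes have 98 students in total, and class A has 6 more than class B. "
        "How many students are in each class?",
        "Let class B have x students. Then x + (x + 6) = 98, so x = 46.",
        "Class A has 52 students and class B has 46.",
    )
]

# Entry to merge into LLaMA-Factory's data/dataset_info.json; the file name is a placeholder.
dataset_info = {"deepseek-r1_distill": {"file_name": "my_distill_data.json"}}

print(json.dumps(records, ensure_ascii=False, indent=2))
print(json.dumps(dataset_info, indent=2))
```

LLaMA-Factory looks datasets up by name in `data/dataset_info.json`, so whichever name the training config passes as `dataset` needs a matching entry there.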
Training
LLaMA-Factory can be used for training. Install it as follows:
git clone http://developer.sourcefind.cn/codes/OpenDAS/llama-factory.git
cd llama-factory && pip install -e ".[torch,metrics]"
SFT
deepseek_r1_distill.yaml
# Single-node, N-GPU training config (adjust as needed)
model_name_or_path: /path/to/your/model
stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json
dataset: deepseek-r1_distill
template: qwen
cutoff_len: 2048
max_samples: 5000
overwrite_cache: true
preprocessing_num_workers: 16
output_dir: /path/to/save/checkpoints
logging_steps: 10
save_steps: 500
overwrite_output_dir: true
per_device_train_batch_size: 1
gradient_accumulation_steps: 4
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 1800
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
cd llama-factory
llamafactory-cli train /path/to/deepseek_r1_distill.yaml
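The batch-size and schedule settings in the config interact, and a short calculation makes the result easier to see. The sketch below estimates the global batch size and the number of optimizer steps. It assumes 8 accelerators and that the dataset reaches the `max_samples` cap of 5000; both numbers are illustrative, the `training_schedule` helper is hypothetical, and the trainer's exact rounding may differ slightly.

```python
import math


def training_schedule(num_gpus, num_samples, per_device_batch=1, grad_accum=4,
                      epochs=3.0, warmup_ratio=0.1, val_size=0.1):
    """Estimate global batch size and optimizer step counts from the config values."""
    train_samples = num_samples - int(num_samples * val_size)  # hold out the eval split
    global_batch = per_device_batch * grad_accum * num_gpus
    steps_per_epoch = math.ceil(train_samples / global_batch)
    total_steps = math.ceil(steps_per_epoch * epochs)
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "train_samples": train_samples,
        "global_batch_size": global_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Assumed: 8 accelerators and the full max_samples cap of 5000 samples.
print(training_schedule(num_gpus=8, num_samples=5000))
```

Under these assumed numbers the run lasts roughly 423 optimizer steps, which is below the `save_steps` and `eval_steps` values of 500. In that case no intermediate checkpoint or evaluation would trigger before training ends, so lowering those values may be worth considering.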
Inference
vLLM service
vllm serve /path/to/distill_model --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager
curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "model_id",
"prompt": "your prompt",
"max_tokens": 512,
"temperature": 0
}'
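The same request can be sent from Python using only the standard library. This is a minimal sketch of a client for the `/v1/completions` endpoint used in the curl call. The function names are hypothetical, and the default URL assumes the server listens on `localhost:8000` as in the command above.

```python
import json
import urllib.request


def build_completion_request(model, prompt, max_tokens=512, temperature=0,
                             base_url="http://localhost:8000"):
    """Build (without sending) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(model, prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(model, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Requires a running server, so the call is left commented out:
# print(complete("/path/to/distill_model", "your prompt"))
```

Splitting request construction from sending keeps the payload easy to inspect without a live server.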
Result
curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "/home/modelzoo/DeepSeek-R1-Distill-Qwen-14B/",
"prompt": "甲乙两班共有学生98人,甲班比乙班多6人,求两班各有多少人?",
"max_tokens": 300,
"temperature": 0
}'
{"id":"cmpl-5473237b46054a98ba27906a4b099e33","object":"text_completion","created":1737515343,"model":"/home/modelzoo/DeepSeek-R1-Distill-Qwen-14B/","choices":[{"index":0,"text":"(用方程解)\n\n首先,设乙班有x人,那么甲班就有x + 6人。\n\n根据总人数,可以列出方程:x + (x + 6) = 98。\n\n解这个方程,得到x = 41。\n\n因此,乙班有41人,甲班有47人。\n</think>\n\n**解答:**\n\n设乙班有 \\( x \\) 人,则甲班有 \\( x + 6 \\) 人。\n\n根据题意,两班共有学生98人,可以列出方程:\n\n\\[\nx + (x + 6) = 98\n\\]\n\n解方程:\n\n\\[\n2x + 6 = 98\n\\]\n\n\\[\n2x = 98 - 6\n\\]\n\n\\[\n2x = 92\n\\]\n\n\\[\nx = 46\n\\]\n\n因此,乙班有46人,甲班有:\n\n\\[\nx + 6 = 46 + 6 = 52\n\\]\n\n**答案:**\n\n甲班有 \\(\\boxed{52}\\) 人,乙班有 \\(\\boxed{46}\\) 人。","logprobs":null,"finish_reason":"stop","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":27,"total_tokens":285,"completion_tokens":258}}
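The returned text contains the model's reasoning followed by a `</think>` marker and then the final answer. A small helper can separate the two parts. The sketch below is one possible approach rather than part of the project: `split_reasoning` is a hypothetical name, and the sample string is an abbreviated stand-in for a real completion.

```python
def split_reasoning(text, marker="</think>"):
    """Split completion text into (reasoning, answer) at the first closing think tag."""
    reasoning, found, answer = text.partition(marker)
    if not found:
        # No marker present: treat the whole text as the answer.
        return "", text.strip()
    # Drop an opening tag if the completion happens to include one.
    reasoning = reasoning.replace("<think>", "", 1)
    return reasoning.strip(), answer.strip()


# Abbreviated, hypothetical completion text for illustration.
sample = "Let class B have x students...\n</think>\n\n**Answer:** 52 and 46."
reasoning, answer = split_reasoning(sample)
print(answer)
```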
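Accuracy
Consistent with results on NVIDIA GPUs.
Application Scenarios
Algorithm category
Conversational question answering
Key application industries
E-commerce, education, broadcast media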
Pre-trained Weights
Model | Download links |
---|---|
DeepSeek-R1-Distill-Qwen-1.5B | huggingface / SCNet high-speed download channel |
DeepSeek-R1-Distill-Qwen-7B | huggingface / SCNet high-speed download channel |
DeepSeek-R1-Distill-Llama-8B | huggingface / SCNet high-speed download channel |
DeepSeek-R1-Distill-Qwen-14B | huggingface / SCNet high-speed download channel |
DeepSeek-R1-Distill-Qwen-32B | huggingface / SCNet high-speed download channel |
DeepSeek-R1-Distill-Llama-70B | huggingface / SCNet high-speed download channel |
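The download links are not preserved in this copy, so the sketch below shows one possible way to fetch a model from Hugging Face programmatically. It assumes the weights live under the `deepseek-ai` organization with the same names as in the table, and that the `huggingface_hub` package is installed. The helper names are hypothetical.

```python
MODEL_NAMES = [
    "DeepSeek-R1-Distill-Qwen-1.5B",
    "DeepSeek-R1-Distill-Qwen-7B",
    "DeepSeek-R1-Distill-Llama-8B",
    "DeepSeek-R1-Distill-Qwen-14B",
    "DeepSeek-R1-Distill-Qwen-32B",
    "DeepSeek-R1-Distill-Llama-70B",
]


def hf_repo_id(model_name, org="deepseek-ai"):
    """Return the assumed Hugging Face repo id for a model name from the table."""
    if model_name not in MODEL_NAMES:
        raise ValueError(f"unknown model: {model_name}")
    return f"{org}/{model_name}"


def download(model_name, local_dir):
    """Download a full model snapshot; needs huggingface_hub and network access."""
    # Imported lazily so hf_repo_id stays usable without the package installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=hf_repo_id(model_name), local_dir=local_dir)


print(hf_repo_id("DeepSeek-R1-Distill-Qwen-14B"))
```

The directory returned by `download` can then be passed to `vllm serve` or used as `model_name_or_path` in the training config.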