Environment and Code Preparation

1. Image Preparation

ChatLearn supports vLLM and SGLang as backend frameworks for Rollout generation. Depending on the chosen Rollout backend, you can select the appropriate Docker image for your experiments.

vLLM

You can build the image yourself using Dockerfile.torch2.6.0.vllm085 as a reference, or pull the following prebuilt image directly:

dsw-registry.cn-shanghai.cr.aliyuncs.com/pai-training-algorithm/chatlearn:torch2.6.0-vllm0.8.5-ubuntu24.04-cuda12.6-py312

SGLang

You can build the image yourself using Dockerfile.torch2.8.0.sglang052 as a reference, or pull the following prebuilt image directly:

dsw-registry.cn-shanghai.cr.aliyuncs.com/pai-training-algorithm/chatlearn:torch2.8.0-sglang0.5.3-ubuntu24.04-cuda12.6-py312
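As a minimal sketch of choosing an image by Rollout backend, the helper below maps a backend name to the two image tags listed above. The function name `chatlearn_image` and the `REGISTRY` variable are hypothetical and not part of ChatLearn; only the image tags come from this document.

```shell
#!/usr/bin/env bash
# Repository prefix shared by the images listed in this document.
REGISTRY="dsw-registry.cn-shanghai.cr.aliyuncs.com/pai-training-algorithm/chatlearn"

# chatlearn_image is a hypothetical helper (not shipped with ChatLearn).
# It prints the image for the given Rollout backend, or fails for unknown names.
chatlearn_image() {
  case "$1" in
    vllm)   echo "${REGISTRY}:torch2.6.0-vllm0.8.5-ubuntu24.04-cuda12.6-py312" ;;
    sglang) echo "${REGISTRY}:torch2.8.0-sglang0.5.3-ubuntu24.04-cuda12.6-py312" ;;
    *)      echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Usage (requires Docker and network access to the registry):
#   docker pull "$(chatlearn_image vllm)"
```

Keeping the mapping in one place makes it harder to accidentally pair a backend with the image built for the other one.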

Image History

| Image URL | Pkg Version | Model List |
| --- | --- | --- |
| dsw-registry.cn-shanghai.cr.aliyuncs.com/pai-training-algorithm/chatlearn:torch2.8.0-sglang0.5.3-ubuntu24.04-cuda12.6-py312 | sglang 0.5.3.post1, transformers 4.57.0 | Qwen3-VL, Qwen2.5-VL, Qwen3, Qwen2.5 |
| dsw-registry.cn-shanghai.cr.aliyuncs.com/pai-training-algorithm/chatlearn:torch2.8.0-sglang0.5.3-ubuntu24.04-cuda12.6-py312 | sglang 0.5.2, transformers 4.56.1 | Qwen2.5-VL, Qwen3, Qwen2.5 |
| dsw-registry.cn-shanghai.cr.aliyuncs.com/pai-training-algorithm/chatlearn:torch2.6.0-vllm0.8.5-te2.7-ubuntu24.04-cuda12.6-py312 | vllm 0.8.5, transformer_engine 2.7 | Moonlight, Deepseek-r1 |
| dsw-registry.cn-shanghai.cr.aliyuncs.com/pai-training-algorithm/chatlearn:torch2.6.0-vllm0.8.5-ubuntu24.04-cuda12.6-py312 | vllm 0.8.5, transformers 4.51.3 | Qwen2.5-VL, Qwen3, Qwen2.5 |

2. Code Preparation

# Download ChatLearn code
git clone https://github.com/alibaba/ChatLearn.git

If you choose Megatron as the training framework, you need to download Pai-Megatron-Patch.

# Download Pai-Megatron-Patch together with its submodules
git clone --recurse-submodules https://github.com/alibaba/Pai-Megatron-Patch.git

If you encounter network connectivity issues with GitHub, you can instead download our pre-packaged Pai-Megatron-Patch archive:

# Download and extract the Pai-Megatron-Patch archive
wget https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/csrc/Pai-Megatron-Patch.tar && tar -xvf Pai-Megatron-Patch.tar
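The two download paths above can be combined into one step that tries GitHub first and falls back to the archive. This is only a sketch: `fetch_megatron_patch` is a hypothetical helper, and it assumes the archive unpacks into a directory named `Pai-Megatron-Patch`, which this document does not state explicitly.

```shell
#!/usr/bin/env bash
PATCH_REPO="https://github.com/alibaba/Pai-Megatron-Patch.git"
PATCH_TAR="https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/csrc/Pai-Megatron-Patch.tar"

# fetch_megatron_patch is a hypothetical helper (not part of ChatLearn).
# It works in the current directory and skips the download if the code is already there.
fetch_megatron_patch() {
  if [ -d Pai-Megatron-Patch ]; then
    echo "Pai-Megatron-Patch already present, skipping download"
    return 0
  fi
  # Preferred path: clone from GitHub, including submodules.
  if git clone --recurse-submodules "$PATCH_REPO"; then
    return 0
  fi
  # Fallback path: remove any partial clone, then fetch and unpack the archive.
  echo "git clone failed, falling back to the archive" >&2
  rm -rf Pai-Megatron-Patch
  wget "$PATCH_TAR" && tar -xf Pai-Megatron-Patch.tar
}

# Usage:
#   fetch_megatron_patch
```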

3. Running Reinforcement Learning Experiments

ChatLearn supports FSDP and Megatron as training backends. Please refer to the following tutorials for detailed instructions: