Profile
ChatLearn provides two ways to profile performance:
Torch profiler
nsys
Note: For large models, the profiling result can be very large. It is recommended to reduce the model size when profiling.
Torch Profiler
To enable the Torch profiler, set the rlhf option profiler_dir in the main configuration file of the system:
profiler_dir: path_to_profile_dir
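Once a run has finished, the traces can be inspected offline. The following is a minimal sketch, assuming the profiler writes Chrome-trace JSON files (the format the PyTorch profiler exports, optionally gzipped) somewhere under profiler_dir. The directory layout, the file pattern, and the helper names load_trace_events and top_ops are illustrative assumptions, not part of ChatLearn.

```python
import gzip
import json
from collections import defaultdict
from pathlib import Path


def load_trace_events(path):
    """Load the event list from a Chrome-trace JSON file (optionally gzipped)."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as f:
        data = json.load(f)
    # Chrome traces are either {"traceEvents": [...]} or a bare list of events.
    return data["traceEvents"] if isinstance(data, dict) else data


def top_ops(events, n=10):
    """Return the n event names with the largest total duration (microseconds)."""
    totals = defaultdict(float)
    for ev in events:
        # "X" marks a complete event, which carries a "dur" field.
        if ev.get("ph") == "X" and "dur" in ev:
            totals[ev["name"]] += ev["dur"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    # Assumption: every *.json / *.json.gz file under the directory is a trace.
    profile_dir = Path("path_to_profile_dir")
    for trace in sorted(profile_dir.rglob("*.json*")):
        print(trace)
        for name, dur_us in top_ops(load_trace_events(trace)):
            print(f"  {dur_us / 1000:10.2f} ms  {name}")
```

Summing durations by event name gives a quick ranking of the most expensive operators before opening the full trace in a viewer.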
nsys
To enable the nsys profiler, set nsys: True under runtime in the main configuration file of the system:
runtime:
    nsys: True
When launching the program, add the nsys startup parameters before the execution command, as in the following example:
nsys profile -w true -t cuda,nvtx,osrt,cudnn,cublas -s none --capture-range=cudaProfilerApi --capture-range-end=stop-shutdown --cudabacktrace=true -x true --force-overwrite true -o my_profile \
python train_rlhf.py XXX
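The launch command above can also be assembled programmatically, which makes it easier to see what each option does. This is a minimal sketch: the helper name build_nsys_command is hypothetical, and the comments paraphrase the nsys command-line help. Because of --capture-range=cudaProfilerApi, nsys only collects data between calls to the CUDA profiler start and stop API; presumably the nsys: True setting is what makes the program issue those calls, but that is an assumption rather than something the example confirms.

```python
import shlex


def build_nsys_command(train_cmd, output="my_profile"):
    """Prefix a training command with the nsys options from the example above."""
    nsys_args = [
        "nsys", "profile",
        "-w", "true",                          # --show-output: show the program's output
        "-t", "cuda,nvtx,osrt,cudnn,cublas",   # --trace: APIs to trace
        "-s", "none",                          # --sample: disable CPU sampling
        "--capture-range=cudaProfilerApi",     # collect only once cudaProfilerStart is called
        "--capture-range-end=stop-shutdown",   # stop and shut down when the range ends
        "--cudabacktrace=true",                # collect backtraces for CUDA API calls
        "-x", "true",                          # --stop-on-exit: stop when the program exits
        "--force-overwrite", "true",           # overwrite an existing report file
        "-o", output,                          # --output: report file name
    ]
    return nsys_args + list(train_cmd)


# "XXX" stands for the training script's own arguments, as in the example above.
cmd = build_nsys_command(["python", "train_rlhf.py", "XXX"])
print(shlex.join(cmd))
```

Keeping the options in a list lets the trace set or the output name be changed in one place before passing the command to a launcher such as subprocess.run.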