Model Parallelism

DeepSpeed-MII supports model parallelism through tensor parallelism, which splits a model across multiple GPUs.
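As a rough illustration of the idea behind tensor parallelism, the sketch below splits a weight matrix column-wise into shards, computes each shard's partial output separately, and concatenates the results. It is a conceptual example in pure Python, not DeepSpeed-MII's implementation: the function names (`matmul`, `split_columns`, `tensor_parallel_matmul`) are hypothetical, and each "shard" here only simulates what a separate GPU would hold.

```python
def matmul(x, w):
    """Multiply a vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def split_columns(w, num_shards):
    """Split w column-wise into equal shards (hypothetical helper; one shard per simulated GPU)."""
    cols = len(w[0])
    assert cols % num_shards == 0, "columns must divide evenly across shards"
    step = cols // num_shards
    return [[row[s * step:(s + 1) * step] for row in w] for s in range(num_shards)]


def tensor_parallel_matmul(x, w, num_shards):
    """Compute x @ w shard by shard, then concatenate the partial outputs."""
    partial_outputs = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    return [value for part in partial_outputs for value in part]


x = [1.0, 2.0]
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]

full = matmul(x, w)                                   # unsharded reference result
sharded = tensor_parallel_matmul(x, w, num_shards=2)  # same result, computed per shard
```

In a real deployment each shard would live on its own GPU and the concatenation step would be a collective communication operation; the point here is only that the sharded and unsharded computations agree.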

For model parallelism with Non-Persistent Pipelines, please see Pipeline Model Parallelism.

For model parallelism with Persistent Deployments, please see Persistent Deployment Model Parallelism.