Model Parallelism
DeepSpeed-MII supports model parallelism through tensor parallelism, which splits a model's weight tensors across multiple GPUs so that each GPU holds and computes only a shard of every layer.
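To make the idea of splitting a layer across devices concrete, here is a minimal conceptual sketch of a column-parallel matrix multiply. It is not DeepSpeed-MII's implementation: plain Python lists stand in for GPU tensors, and the helper names (`matmul`, `split_columns`, `column_parallel_matmul`) are hypothetical, chosen only for illustration.

```python
# Conceptual sketch only: Python lists stand in for GPU tensors, and the
# function names below are hypothetical (not part of DeepSpeed-MII).

def matmul(x, w):
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, num_shards):
    """Split the columns of w into num_shards equal, contiguous shards."""
    cols = len(w[0])
    if cols % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    width = cols // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]


def column_parallel_matmul(x, w, num_shards):
    """Compute x @ w with each simulated device holding one column shard of w."""
    shards = split_columns(w, num_shards)
    # Each simulated device multiplies the full input by its own shard.
    partials = [matmul(x, shard) for shard in shards]
    # Gather step: concatenate the partial outputs along the column dimension.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]

full = matmul(x, w)
sharded = column_parallel_matmul(x, w, num_shards=2)
assert sharded == full  # sharding changes where the work runs, not the result
```

In a real deployment the shards live on separate GPUs and the gather step is a collective communication operation; the sketch only shows why the sharded result matches the unsharded one.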
For model parallelism with Non-Persistent Pipelines, please see Pipeline Model Parallelism.
For model parallelism with Persistent Deployments, please see Persistent Deployment Model Parallelism.