Model Replicas

DeepSpeed-MII can create multiple replicas of a model when it is served as a Persistent Deployment. For details, see Persistent Deployment Model Replicas.
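
As a rough illustration, the sketch below shows how a replicated persistent deployment might be started. It assumes that `mii.serve` accepts the `replica_num` and `tensor_parallel` keyword arguments and that each replica occupies `tensor_parallel` GPUs; the helper names `serve_kwargs`, `gpus_required`, and `deploy` are hypothetical and not part of MII, and the model name in the usage note is only a placeholder.

```python
def serve_kwargs(replica_num: int, tensor_parallel: int = 1) -> dict:
    """Build keyword arguments for a replicated deployment (hypothetical helper)."""
    if replica_num < 1 or tensor_parallel < 1:
        raise ValueError("replica_num and tensor_parallel must both be >= 1")
    return {"replica_num": replica_num, "tensor_parallel": tensor_parallel}


def gpus_required(replica_num: int, tensor_parallel: int = 1) -> int:
    """Estimate the GPU count, assuming each replica uses tensor_parallel GPUs."""
    return replica_num * tensor_parallel


def deploy(model_name: str, replica_num: int, tensor_parallel: int = 1):
    """Start a persistent deployment with several replicas and return its client."""
    # Imported lazily so the helpers above can be used without MII installed.
    import mii

    # Assumption: mii.serve takes replica_num and tensor_parallel as keywords.
    return mii.serve(model_name, **serve_kwargs(replica_num, tensor_parallel))
```

Under these assumptions, `deploy("mistralai/Mistral-7B-v0.1", replica_num=2)` would start two replicas of the model and return a client for sending requests, and `gpus_required(2, 2)` estimates that two replicas with two-way tensor parallelism need four GPUs.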