With small changes to the workflows you already use to host models at REST endpoints through Domino Model APIs, you can host the same Model APIs at AWS SageMaker inference endpoints.
Domino provides easy-to-use APIs that abstract away the complexities of packaging a container image. With a single invocation of the API, Domino bundles the model, prediction code, dependent files and packages, and the base environment into a container image that is AWS SageMaker-compliant.
This API builds a Docker image for a given version of a Model API in an AWS SageMaker-compliant format and exports it to AWS ECR or any third-party container registry outside Domino. AWS SageMaker then deploys this model container image to serve requests.
As part of the API request, you provide credentials for the registry. Domino does not save these credentials, and they can optionally be short-lived credentials that expire after a time-to-live (TTL) duration.
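For ECR, one common source of short-lived credentials is an ECR authorization token, which is a base64-encoded 'username:password' string with the username AWS that expires after 12 hours. The sketch below shows one way to split such a token into the two credential values. It assumes the export API expects the decoded values; `split_ecr_token` is a hypothetical helper name, and the sample token is built locally for illustration rather than fetched from AWS.

```python
import base64
from typing import Tuple


def split_ecr_token(authorization_token: str) -> Tuple[str, str]:
    """Decode a base64 'username:password' registry token into its two parts."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, _, password = decoded.partition(":")
    if not username or not password:
        raise ValueError("token does not decode to 'username:password'")
    return username, password


# Stand-in token built locally for illustration; a real one comes from ECR.
sample_token = base64.b64encode(b"AWS:xxtokenxx").decode("ascii")
username, password = split_ecr_token(sample_token)
credentials = {"username": username, "password": password}
```

Because the token expires on its own, nothing needs to be revoked after the export finishes.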
The API changes your Model API image to comply with SageMaker inference requirements:
- An 'ENTRYPOINT instruction' is added to the Docker image exported to SageMaker. To execute this in AWS, run it with 'docker run image serve'.
- An /invocations endpoint is added to the container; this endpoint serves the model predictions.
- A /ping endpoint is added to the container. It allows the hosted model to respond to periodic GET requests that track the state of the endpoint.
- The container’s port is bound to 8080, as required by SageMaker for serving predictions.
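To make the container contract listed above concrete, here is a minimal sketch of a server that answers GET /ping and POST /invocations using only the Python standard library. It is not Domino's implementation; `predict`, `InferenceHandler`, and `make_server` are hypothetical names, and the prediction function simply echoes its input.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(payload):
    """Hypothetical stand-in for the model's prediction function."""
    return {"echo": payload}


class InferenceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Health check: SageMaker polls GET /ping and expects HTTP 200.
        if self.path == "/ping":
            self._send(200, {})
        else:
            self._send(404, {"error": "not found"})

    def do_POST(self):
        # Prediction requests arrive as POST /invocations.
        if self.path != "/invocations":
            self._send(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self._send(400, {"error": "invalid JSON"})
            return
        self._send(200, predict(payload))

    def _send(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=8080):
    """Bind to port 8080 by default; pass 0 to pick a free port in tests."""
    return HTTPServer(("0.0.0.0", port), InferenceHandler)
```

Calling `make_server().serve_forever()` starts the sketch on port 8080, which mirrors what the exported image does when it is run with the serve argument.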
To export an image, send the following request. Fields marked (Optional) can be omitted.
POST /:modelId/:modelVersionId/exportImageForSagemaker
{
"registryUrl": "ECR_URL_in_your_AWS_account", // ex. 1234567890.dkr.ecr.us-east-2.amazonaws.com
"repository": "my_repo", // name of the repository in ECR
"tag": "tag1", // (Optional)
"username": "AWS", // (Optional)
"password": "xxtokenxx" // (Optional)
}
The response identifies the export and reports its status:
{
"modelId": "5555dbbd017416e4515555",
"modelVersionId": "3333dbbd017416e4513333",
"exportId": "3333dbbd017416e4513333",
"status": "preparing"
}
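The request above can be assembled programmatically. The following sketch builds the request without sending it. The base URL prefix and any authentication headers are not given here, so `base_url` and `extra_headers` are assumptions to be filled in for your deployment, and `build_export_request` is a hypothetical helper name. The `//` comments in the sample body are annotations only and are not sent.

```python
import json
import urllib.request
from typing import Optional


def build_export_request(
    base_url: str,
    model_id: str,
    model_version_id: str,
    registry_url: str,
    repository: str,
    tag: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
    extra_headers: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) the SageMaker export request."""
    body = {"registryUrl": registry_url, "repository": repository}
    # Optional fields are left out entirely when not supplied.
    for key, value in (("tag", tag), ("username", username), ("password", password)):
        if value is not None:
            body[key] = value
    url = (
        f"{base_url.rstrip('/')}/{model_id}/{model_version_id}"
        "/exportImageForSagemaker"
    )
    headers = {"Content-Type": "application/json", **(extra_headers or {})}
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


request = build_export_request(
    base_url="https://domino.example.com/api-prefix",  # hypothetical placeholder
    model_id="5555dbbd017416e4515555",
    model_version_id="3333dbbd017416e4513333",
    registry_url="1234567890.dkr.ecr.us-east-2.amazonaws.com",
    repository="my_repo",
    tag="tag1",
)
```

Passing `request` to `urllib.request.urlopen` would send it; the JSON reply can then be read for the `exportId` and `status` fields.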
When you export a Domino Model to AWS SageMaker and create an endpoint from the exported Model, the endpoint might fail with the error "The primary container for production variant variant-name-1 did not pass the ping health check."
Use these steps to work around the issue:
- Add USER root to the environment's Dockerfile instructions.
- Republish the Model API so that it is built from the updated instructions.
- Export the image again for AWS SageMaker and create an endpoint.
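Before creating the endpoint, it can help to confirm that the exported image answers the health check locally, for example by running it with the serve argument and port 8080 published, then requesting /ping. The sketch below is one way to do that check; `passes_ping` is a hypothetical helper, and the base URL passed to it depends on where you run the container.

```python
import urllib.error
import urllib.request


def passes_ping(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/ping answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an error status all count as a failed check.
        return False
```

With a container listening locally, `passes_ping("http://localhost:8080")` returning False points to the same problem the SageMaker error message describes.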