Version: v0.35.0
Published: May 11, 2021
License: Apache-2.0
README
Website • Slack • Docs
Scale compute-intensive serverless workloads
Cortex is a Kubernetes-based serverless platform built for AWS.
Deploy realtime, batch, and async workloads
- Realtime - realtime APIs respond to requests in real-time and autoscale based on in-flight request volumes.
- Batch - batch APIs run distributed and fault-tolerant batch processing jobs on-demand.
- Async - async APIs process requests asynchronously and autoscale based on request queue length.
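For a concrete picture of what deploying one of these workload kinds looks like, here is a minimal sketch using Cortex's Python client. The environment name, the spec fields, the container image, and the `client()`/`deploy()` call names are assumptions for illustration and may differ in this version; the Cortex docs have the authoritative schema.

```python
# Minimal sketch, assuming a Cortex cluster is already running and a CLI
# environment named "aws" is configured. Spec fields and client method names
# are assumptions for illustration, not the verified v0.35.0 schema.
import cortex

api_spec = {
    "name": "text-classifier",
    "kind": "RealtimeAPI",  # "BatchAPI" and "AsyncAPI" are the other workload kinds
    "pod": {
        "containers": [
            {
                "name": "api",
                "image": "quay.io/example/text-classifier:latest",  # hypothetical image
            }
        ]
    },
}

cx = cortex.client("aws")  # client() signature assumed
cx.deploy(api_spec)        # deploy() name assumed; some releases use create_api()
```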
Scale across hundreds of CPU and GPU instances
- No resource limits - allocate as much CPU, GPU, and memory as each workload requires.
- No cold starts - keep a minimum number of API replicas running to ensure that requests are handled in real-time.
- No timeouts - run workloads for as long as you want.
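To make these scaling knobs concrete, the fragment below shows how per-workload compute and replica settings might be expressed in an API spec; the `compute` and `autoscaling` field names and their nesting are assumptions and may not match this version exactly.

```python
# Illustrative fragment that would slot into an API spec like the one above;
# field names are assumptions, not the verified schema for this version.
compute_and_scaling = {
    "compute": {"cpu": 4, "gpu": 1, "mem": "8Gi"},  # request whatever the workload needs
    "autoscaling": {
        "min_replicas": 2,   # keep replicas warm so requests never hit a cold start
        "max_replicas": 20,  # scale out with load; no execution timeout is imposed
    },
}
```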
Control your AWS spend
- Spot instance management - Cortex automatically runs workloads on spot instances and falls back to on-demand instances to ensure reliability.
- Multi-instance type clusters - choose the ideal EC2 instance type for your workloads or mix and match several instance types in the same cluster.
- Customizable autoscaling - optimize the autoscaling behavior for each workload to ensure efficient resource utilization.
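Spot usage, instance-type mix, and cluster-level sizing are set in the cluster configuration file passed to `cortex cluster up` (normally written in YAML). The sketch below expresses such a configuration as a Python dict purely for illustration; the node-group field names are assumptions and should be checked against the docs for this version.

```python
# Hypothetical sketch of a cluster configuration (normally a YAML file passed
# to `cortex cluster up`); field names are assumptions, not the verified schema.
cluster_config = {
    "cluster_name": "cortex",
    "region": "us-east-1",
    "node_groups": [
        {
            "name": "gpu-spot",
            "instance_type": "g4dn.xlarge",  # GPU instances for compute-heavy APIs
            "min_instances": 0,
            "max_instances": 10,
            "spot": True,  # prefer spot capacity; Cortex falls back to on-demand
        },
        {
            "name": "cpu-on-demand",
            "instance_type": "m5.large",  # cheaper CPU instances for lighter APIs
            "min_instances": 1,
            "max_instances": 5,
        },
    ],
}
```

Mixing node groups like this lets each API land on the instance type that fits it, while each API's own autoscaling settings keep resource utilization efficient.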