Proposal - Operational Intent for Docker Deployment #11187
Comments
+1 We also see the need for the app developer's and the IT admin's intent to be passed all the way through to the execution point. I think this is a great start. There are more attributes we would need on the storage side, and I'll articulate them after we get some consensus on what they could be. In our prototype product right now, we pass the intent around in a separate channel (we don't like this since it's yet another thing someone has to maintain). FWIW, I am sharing the intent parameters we currently use (by no means is this complete or even the right set, but it's what we use in the prototype):
{
  "apiName": "composition",
  "apiVersion": "v1beta1",
  "applicationName": "helloWorld",
  "registryURL": "https://registry.hub.docker.com/v1/search",
  "containers": [
    {
      "name": "redis",
      "image": "dockerfile/redis",
      "exposedPorts": [
        6379, 6378
      ],
      "memoryLimit": "8GB",
      "cpuNiceLevel": "-20",
      "highlyAvailable": "yes",
      "storage": {
        "filesystem": {
          "type": "ext4",
          "initialFormat": "yes",
          "blockSize": "4k"
        },
        "erasureCoded": "yes",
        "flashTiering": "yes",
        "dedupe": "yes",
        "minVolumeSize": "5TB",
        "thinVolumeSize": "20TB",
        "snapshots": {
          "enabled": "yes",
          "interval": "1H",
          "queue": 24
        }
      }
    },
    {
      "name": "nginx",
      "image": "dockerfile/nginx",
      "exposedPorts": [
        80, 443
      ],
      "scale": 5,
      "storage": {
        "filesystem": {
          "type": "xfs",
          "initialFormat": "yes",
          "hint": "ephemeral",
          "blockSize": "8k"
        },
        "erasureCoded": "no",
        "flashTiering": "no",
        "dedupe": "no",
        "minVolumeSize": "1MB",
        "thinVolumeSize": "20TB",
        "snapshots": {
          "enabled": "no"
        }
      }
    }
  ],
  "applicationProperties": {
    "admin": {
      "allowedUsers": [
        "admin1", "admin2"
      ]
    }
  }
}
ping @vieux @aluzzardi

@gourao - good to see that it fits how you see things coming out. Standardizing attributes would not only validate the needed use cases, but also create interoperable implementations for schedulers, drivers, policy-instantiation logic, etc. A few comments regarding the format:
Both of the above can be expressed in a generic way that can be delivered to drivers in an opaque way, but meaningful to the level needed for schedulers, etc.

I've started implementing labels on compose.

Should the intent specify a specific technology? For example, instead of saying "use VXLAN overlay", can it just say "isolate application traffic"? VXLAN implies switching, but there could be a device that achieves isolation without an overlay network, simply based on firewall rules. Amazon puts multiple tenants on the same subnet, but isolates tenants by firewall rules. Similarly, instead of saying "must use SSD", can it say "the I/O latency should be below x units"? SSD is today's technology, and tomorrow there might be a better technology available. If we could avoid any technology references and let the enforcer translate the intent into appropriate technologies, it would be nice, in my opinion.

@joeswaminathan - valid points... intent needs to be abstract. Let us put a first implementation version out in order to get some feedback and improve upon this. I am sure this is going to be an iterative process.

Hi @jainvipin, any further thoughts about this? What is the status? /cc @tgraf

@jeremyeder : Working on various implementation pieces around this idea based on labels and the new libnetwork API. Will be able to share a first preview of the implementation pretty soon.

@jainvipin It has been detected that this issue has not received any activity in over 1 year. Can you please let us know if it is still relevant:
Thank you!

Proposal - Operational Intent for Docker Deployment
Authors: @jainvipin, @tgraf, @dvorkinista
Introduction
Problem Statement
Containers consume and interact with a variety of software and hardware resources. Consumption of these resources requires an operational-intent to allow the desired instantiation of policies. For example:
- Workloads marked `production` require better prioritization of resource acquisition. This can apply to compute, network and storage resources and can be further extended to grouping behaviors across collections of hosts. The operational intent specifies what constitutes high priority, which can vary from one deployment to another.
- In a tiered application `web -> app -> db`, the intention is to disallow the `web` container from communicating directly with the `db` tier, or to control whether another container may communicate with the `app` or `db` tier.

Operational-intent is a high level specification that abstracts out the specifics needed to implement the intent. Ultimately, the intent realization depends on the infrastructure capabilities such as operating system capabilities, networking topology, and hardware capabilities of infrastructure elements like compute, storage, and network.
Operational-Intent and Docker-Compose
Docker Compose captures the orchestration-intent, whereas operational-intent describes the constraints, governance, rules and policies that must be observed in order to achieve the orchestration-intent.
Operational-Intent and Docker-Swarm
Docker Swarm schedules containers in a multi-host cluster, with pluggable backends. The operational-intent can be leveraged by the scheduling backend to achieve better utilization of not just compute, but also network and storage resources.
The Goal - Highly abstracted intent with separation of concern
The goal is to provide a highly abstracted, intent-based model alongside existing application composition tools to capture operational constraints and governance rules, decoupled from any underlying infrastructure or operating system specifics, and to correlate it to application developer intent. No authority involved should require intimate knowledge outside of its own context of concern. For example, an operator must be able to impose rules affecting application developer intent without understanding the details of the application itself. Likewise, while the operational constraint `use an encrypted VXLAN tunnel` affects the communication between containers, the application developer does not need to be aware of its existence; the operator defines how this intent is achieved given the infrastructure present.

Separating this concern and decoupling operational constraints makes it possible to develop and test an application in a small-scale development environment with all application developer intent captured, and then move the application containers to an integration and production environment without any change to the application intent. All operational and governance constraints required to run in the production environment should be imposed independently alongside the existing application intent. It also makes it possible to move an application from one production environment to another while preserving the original intent; the intent can be translated to whatever infrastructure configuration is required in the new environment.
Operational-Intent Realization
The following diagram describes the functional elements desired in realizing the operational intent.
Intent Specification
Examples: An application composition describing itself as a `production` launch, an application requiring `block-io` access privileges, an application running as a `backup` service.

Example: Application `cache` must use SSD storage, application `accounting` must use storage with backup, network packets to `web` must be load balanced, all network overlays must use VXLAN encapsulation on port 8472, any application labeled `network latency sensitive` should do QoS as follows, an application labeled `net-encrypt` should be spawned on a host with hardware encryption capabilities, ...

Examples: Encryption must use AES or stronger, a production tier labeled `production` must get higher compute/network/storage precedence than a development environment with the label `devel`, applications from multiple tenants must be isolated from each other with encryption applied, mirror all packets of a specific tenant for monitoring/analysis, ...

Examples: host2 is over capacity and should not spawn any additional containers, host3 is capable of providing SR-IOV acceleration, the network infrastructure between host1 and host2 is capable of providing a multi-destination network, the ability to do native hardware encryption is available on host8, the infrastructure is able to allow multiple classes/priorities of applications, ...
Scope of Intent
The scope of intent varies and can be described at the following levels:
Examples: `high-priority` applications are given higher priority over other applications.
Conflict Resolution
In the event that multiple authorities specify conflicting intent, a set of rules exists to resolve the conflicts:
Examples: The infrastructure owner specifying that encryption is required overrides any operational constraint which does not specify encryption; the infrastructure owner may make an exception to an operational constraint for debugging purposes.
Examples: The operator disallowing application "accounting" to talk to "database" will always override the application developer request; the operational constraint "maximum 1GB memory for proxy" overrides the application intent of allocating 2GB worth of memory.
Examples: "Application with label `priority` may consume 1GB of memory" overrides the generic intent "All applications may use up to 500MB of memory".

It should be noted that certain conflicts cannot be resolved automatically and require additional clarification from the intent authorities. In such an event, the conflict is reported to monitoring facilities. Automatically resolved conflicts are also reported, to provide a feedback loop and enable validation of correct intent enforcement.
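
The examples in this section suggest two orderings: by authority (infrastructure owner over operator over application developer) and by specificity (a label-scoped intent over a generic one). The following is a minimal Python sketch under those assumptions. The `Intent` type, the `AUTHORITY_RANK` table, and the `resolve` function are hypothetical names chosen for illustration; they are not part of the proposal or of any Docker API, and comparing authority before specificity is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed precedence, inferred from the examples: a higher rank wins.
AUTHORITY_RANK = {"developer": 0, "operator": 1, "infrastructure": 2}


@dataclass(frozen=True)
class Intent:
    authority: str                # who specified the intent
    attribute: str                # e.g. "memory"
    value: str                    # attribute values are plain strings
    label: Optional[str] = None   # None means the intent is generic


def precedence(intent):
    # Authority is compared first, then specificity (labeled beats generic).
    return (AUTHORITY_RANK[intent.authority], intent.label is not None)


def resolve(intents, app_labels):
    """Return (winners, overridden, unresolved) for one application.

    winners maps attribute -> winning Intent, overridden lists the intents
    that lost an automatic resolution, and unresolved lists attributes whose
    top candidates have equal precedence but different values.
    """
    by_attribute = {}
    for intent in intents:
        # Label-scoped intents only apply to applications carrying the label.
        if intent.label is None or intent.label in app_labels:
            by_attribute.setdefault(intent.attribute, []).append(intent)

    winners, overridden, unresolved = {}, [], []
    for attribute, candidates in by_attribute.items():
        ranked = sorted(candidates, key=precedence, reverse=True)
        top = ranked[0]
        tied = [c for c in ranked[1:]
                if precedence(c) == precedence(top) and c.value != top.value]
        if tied:
            # Needs clarification from the intent authorities.
            unresolved.append(attribute)
            continue
        winners[attribute] = top
        overridden.extend(c for c in ranked[1:] if c.value != top.value)
    return winners, overridden, unresolved
```

Both `overridden` and `unresolved` are returned rather than dropped, so that a caller can forward them to monitoring as the text describes.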
Implementation
The following are the functional elements required to implement the operational-intent.
Intent Specification
The intent specification is a set of logical, intuitive YAML or JSON specifications that describe arbitrary attributes, grouped into generic categories. Implementing intent attributes requires translating the intent to concrete data, as discussed further below. For example, a high priority job may mean a different set of resources in different environments, yet the logical intent remains the same.
The intent specification is consumed by a tool, similar to Docker Compose, that is responsible for validating the intent specification and its format, and for communicating the intent to the intent distribution logic.
The intent specification is extensible and customizable.
Intent Distribution
The scope of the operational-intent is expected to be at the cluster level. The distribution logic, implemented as a logically centralized entity, acts upon elements that have cluster-wide significance and is responsible for translating the logical intent to a set of instructions to the backend-drivers within the infrastructure to achieve the desired result. The intent distribution logic produces metadata to be consumed by the drivers, in an interoperable and extensible way. It utilizes a state distribution mechanism, like libpack, or equivalent in a multi-host cluster environment to distribute the state.
Intent Realization
The extension drivers or scheduler backends (for example a swarm backend, network extensions, or storage extensions) are expected to implement the logic and become the enforcers of the network rules/policy/governance.
The drivers consume the metadata produced by the intent distribution logic and output operational state back into the distributed store. While the desired state, as requested by the intent distribution logic, may differ from the operational state, the goal is always to reach the desired state indicated in the intent. The operational state distributed by the drivers is expected to be used for improving future scheduling, monitoring, etc.
The drivers also publish their capabilities so that they can be consumed by the intent distribution logic. The specification of the driver and scheduler backend APIs is documented in a separate proposal (#11188).
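
The relationship between desired and operational state can be sketched as a simple reconciliation step. This is only an illustration in Python: the `reconcile` function, the plain dictionaries standing in for the distributed store, and the `apply_setting` callback are all hypothetical and do not reflect the driver API proposed in #11188.

```python
def reconcile(desired, operational, apply_setting):
    """Move operational state toward desired state, one attribute at a time.

    desired and operational are dicts of attribute -> string value.
    apply_setting is a callable (key, value) -> bool that returns True when
    the driver managed to apply the value on this host.
    Returns the new operational state and the attributes still out of sync.
    """
    new_state = dict(operational)
    pending = []
    for key, wanted in desired.items():
        if new_state.get(key) == wanted:
            continue  # already in the desired state
        if apply_setting(key, wanted):
            new_state[key] = wanted
        else:
            pending.append(key)  # retried on a later pass
    return new_state, pending
```

A driver would write `new_state` back to the store as its operational state; the `pending` list is the part of the intent that has not been realized yet.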
Scheduler Integration
As documented in the driver APIs (#11188), the capabilities are captured generically and are acted upon by schedulers largely as opaque information. Thus, the scheduler implementation is not expected to change as new capabilities are discovered/described by the drivers and consequently specified in operational-intent.
The scheduler can use the information for placement of the jobs, or in some cases choose not to schedule a job: for example, when a capability 'network-encryption' is required by an application, and the drivers exposing this capability are all on hosts that are unavailable to take an additional job.
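
Treating capabilities as opaque information means the scheduler only needs set membership, never the meaning of a capability. A minimal Python sketch, assuming a hypothetical host record with `name`, `capabilities` and `available` fields (none of these names come from the proposal or from Swarm):

```python
def place(required, hosts):
    """Return the name of the first host that can take the job, else None.

    required is a set of capability strings, compared as opaque tokens.
    hosts is a list of dicts with "name", "capabilities" (a set of strings)
    and "available" (a bool).
    """
    for host in hosts:
        # The host must be free and expose every required capability.
        if host["available"] and required <= host["capabilities"]:
            return host["name"]
    return None  # the scheduler chooses not to schedule the job
```

Because the comparison is a plain subset test, a driver can start publishing a new capability string without any change to this function.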
An end-to-end example
Please note: this is just an example of an infrastructure resource utilization policy and does not represent the only typical use case.
1. Specify the application intent (in docker-compose)
group.yml specifies the application composition, as mentioned in docker-compose
Alternatively a label can be specified in the YAML composition
2. Define an operational policy (JSON format here, but could be YAML):
The policy described here is an example policy, and the attribute names are not expected to be known a priori to the scheduler or to modules other than the drivers and the operational-intent distribution logic. A typical policy definition consists of a discrete set of network policies, storage policies and compute policies. Ultimately, a combination of these policies is used in the global policies, which serve as attach points to applications or application compositions.
A few things to observe in the intent specification above:
3. Instantiate the operational policy
Extending and customizing attributes
All attribute values in the proposal are expressed as strings, so extending or customizing the specification is easily done by setting the attribute value to "mydomain.com/attribute_value_xyz", which is then interpreted by the specific backends that implement such attributes. While the top-level attributes are always consistent and uniform, additional value-adds are possible for differentiation.
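
One way a backend might recognize such customized values is to split on the first "/" and treat the prefix as a namespace. The Python sketch below assumes exactly that convention; the `split_attribute_value` function is a hypothetical helper, and the proposal itself does not define how the value is parsed.

```python
def split_attribute_value(value):
    """Split "mydomain.com/some_value" into ("mydomain.com", "some_value").

    A value without a "/" is treated as a standard attribute value and
    returned with a namespace of None.
    """
    namespace, separator, name = value.partition("/")
    if not separator:
        return None, value
    return namespace, name
```

A backend that does not own the namespace can then pass the value along untouched, which keeps it opaque to everything except the backend that implements it.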
Proposed Attributes in an operational policy
Subject to community discussion and conclusion, the proposal here speaks to a simple top-level hierarchy representing network, storage and compute attributes, and a way to combine these policies in an arbitrary way to create global-level policies. The following list is recommended as a starting point for the proposed standard attributes:
Open Items