marshmallow or pydantic models from json-schema
Apologies if this has already been discussed somewhere; I am new to this project. To get started, I used the cfn2py script and tested some round-trip serializations back to JSON and YAML using deepdiff and cfn-lint checks.
One concern is that boolean values are serialized as strings rather than JSON booleans. Why do t.to_dict() and t.to_json() return strings instead of JSON booleans? It seems like encode_to_dict(obj) could be replaced with just json.loads(json.dumps(obj)), letting the json library take care of all the necessary Python/JSON compatibility and encoding.
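A minimal sketch of what the json round trip does and does not do, using only the standard library. The property names are borrowed from the S3 schema purely for illustration, and `Opaque` is a hypothetical stand-in for any object the json module does not know how to encode. Real Python bools come out as JSON booleans, but a value that is already the string "true" stays a string, so the round trip alone would not change the output unless the values are real bools to begin with.

```python
import json

# One real Python bool and one string, as in the report above.
properties = {
    "BlockPublicAcls": True,      # real bool
    "BlockPublicPolicy": "true",  # already a string before encoding
}

rendered = json.dumps(properties, sort_keys=True)
round_tripped = json.loads(rendered)

# The bool survives as a bool; the string is left untouched.
print(rendered)


class Opaque:
    """Hypothetical object with no JSON encoding of its own."""


# json.dumps raises TypeError for objects it cannot encode unless a
# `default` hook is supplied, so a custom encoder is still needed.
try:
    json.dumps({"Resource": Opaque()})
    encodable = True
except TypeError:
    encodable = False
```

So the round trip is only a drop-in replacement if every value is already a plain dict, list, str, number, bool, or None.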
Alternatively, using marshmallow or pydantic models in general should take care of all the schema mappings and serializations. It might also be easier to use botocore service descriptions or other AWS JSON payloads to auto-generate json-schema and models. It's not quite the same thing as CFN templates, but botocore has service API descriptions in e.g. lib/python3.7/site-packages/botocore/data/cloudformation/2010-05-15/service-2.json. See also:
- https://pydantic-docs.helpmanual.io/datamodel_code_generator/
- https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-resource-specification.html
- https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/resource-type-schemas.html
The resource-type-schemas might be amenable to auto-generation of service models using marshmallow or pydantic schema parsers and code generators. If something like this were to work well, it might eliminate most, if not all, of the issues about supporting new features in CFN services.
In this example, cfn_schemas is a directory containing the unzipped data from a regional schema .zip download.
pip install datamodel-code-generator[http]
wget https://schema.cloudformation.us-east-1.amazonaws.com/CloudformationSchema.zip
mkdir cfn_schemas
mv CloudformationSchema.zip cfn_schemas/
cd cfn_schemas/
unzip CloudformationSchema.zip
cd ..
datamodel-codegen --input cfn_schemas/aws-s3-bucket.json --input-file-type jsonschema --output aws_s3_bucket.py
cat aws_s3_bucket.py
The pydantic models have built-in serialization.
The resulting aws_s3_bucket.py contains:
# generated by datamodel-codegen:
#   filename:  aws-s3-bucket.json
#   timestamp: 2021-03-11T02:07:54+00:00

from __future__ import annotations

from typing import List, Optional

from pydantic import BaseModel

class DefaultRetention(BaseModel):
    Years: Optional[int] = None
    Days: Optional[int] = None
    Mode: Optional[str] = None

class ReplicationTimeValue(BaseModel):
    Minutes: int

class FilterRule(BaseModel):
    Value: str
    Name: str

class AccelerateConfiguration(BaseModel):
    AccelerationStatus: str

class Metrics(BaseModel):
    Status: str
    EventThreshold: Optional[ReplicationTimeValue] = None

class RoutingRuleCondition(BaseModel):
    KeyPrefixEquals: Optional[str] = None
    HttpErrorCodeReturnedEquals: Optional[str] = None

class DeleteMarkerReplication(BaseModel):
    Status: Optional[str] = None

class OwnershipControlsRule(BaseModel):
    ObjectOwnership: Optional[str] = None

class CorsRule(BaseModel):
    ExposedHeaders: Optional[List[str]] = None
    AllowedMethods: List[str]
    AllowedOrigins: List[str]
    AllowedHeaders: Optional[List[str]] = None
    MaxAge: Optional[int] = None
    Id: Optional[str] = None

class AccessControlTranslation(BaseModel):
    Owner: str

class ObjectLockRule(BaseModel):
    DefaultRetention: Optional[DefaultRetention] = None

class S3KeyFilter(BaseModel):
    Rules: List[FilterRule]

class Destination(BaseModel):
    BucketArn: str
    BucketAccountId: Optional[str] = None
    Format: str
    Prefix: Optional[str] = None

class RedirectAllRequestsTo(BaseModel):
    Protocol: Optional[str] = None
    HostName: str

class TagFilter(BaseModel):
    Value: str
    Key: str

class PublicAccessBlockConfiguration(BaseModel):
    RestrictPublicBuckets: Optional[bool] = None
    IgnorePublicAcls: Optional[bool] = None
    BlockPublicPolicy: Optional[bool] = None
    BlockPublicAcls: Optional[bool] = None

class NoncurrentVersionTransition(BaseModel):
    StorageClass: str
    TransitionInDays: int

class ServerSideEncryptionByDefault(BaseModel):
    SSEAlgorithm: str
    KMSMasterKeyID: Optional[str] = None

class MetricsConfiguration(BaseModel):
    TagFilters: Optional[List[TagFilter]] = None
    Id: str
    Prefix: Optional[str] = None

class ObjectLockConfiguration(BaseModel):
    ObjectLockEnabled: Optional[str] = None
    Rule: Optional[ObjectLockRule] = None

class LoggingConfiguration(BaseModel):
    DestinationBucketName: Optional[str] = None
    LogFilePrefix: Optional[str] = None

class Tiering(BaseModel):
    AccessTier: str
    Days: int

class DataExport(BaseModel):
    Destination: Destination
    OutputSchemaVersion: str

class ReplicationTime(BaseModel):
    Status: str
    Time: ReplicationTimeValue

class RedirectRule(BaseModel):
    ReplaceKeyWith: Optional[str] = None
    HttpRedirectCode: Optional[str] = None
    Protocol: Optional[str] = None
    HostName: Optional[str] = None
    ReplaceKeyPrefixWith: Optional[str] = None

class EncryptionConfiguration(BaseModel):
    ReplicaKmsKeyID: str

class InventoryConfiguration(BaseModel):
    Destination: Destination
    OptionalFields: Optional[List[str]] = None
    IncludedObjectVersions: str
    Enabled: bool
    Id: str
    Prefix: Optional[str] = None
    ScheduleFrequency: str

class ReplicationRuleAndOperator(BaseModel):
    TagFilters: Optional[List[TagFilter]] = None
    Prefix: Optional[str] = None

class VersioningConfiguration(BaseModel):
    Status: str

class CorsConfiguration(BaseModel):
    CorsRules: List[CorsRule]

class ReplicaModifications(BaseModel):
    Status: str

class Transition(BaseModel):
    TransitionDate: Optional[str] = None
    TransitionInDays: Optional[int] = None
    StorageClass: str

class SseKmsEncryptedObjects(BaseModel):
    Status: str

class Tag(BaseModel):
    Value: str
    Key: str

class AbortIncompleteMultipartUpload(BaseModel):
    DaysAfterInitiation: int

class SourceSelectionCriteria(BaseModel):
    ReplicaModifications: Optional[ReplicaModifications] = None
    SseKmsEncryptedObjects: Optional[SseKmsEncryptedObjects] = None

class OwnershipControls(BaseModel):
    Rules: List[OwnershipControlsRule]

class RoutingRule(BaseModel):
    RedirectRule: RedirectRule
    RoutingRuleCondition: Optional[RoutingRuleCondition] = None

class NotificationFilter(BaseModel):
    S3Key: S3KeyFilter

class ServerSideEncryptionRule(BaseModel):
    BucketKeyEnabled: Optional[bool] = None
    ServerSideEncryptionByDefault: Optional[ServerSideEncryptionByDefault] = None

class ReplicationDestination(BaseModel):
    AccessControlTranslation: Optional[AccessControlTranslation] = None
    Account: Optional[str] = None
    Metrics: Optional[Metrics] = None
    Bucket: str
    EncryptionConfiguration: Optional[EncryptionConfiguration] = None
    StorageClass: Optional[str] = None
    ReplicationTime: Optional[ReplicationTime] = None

class Rule(BaseModel):
    Status: str
    NoncurrentVersionExpirationInDays: Optional[int] = None
    Transitions: Optional[List[Transition]] = None
    TagFilters: Optional[List[TagFilter]] = None
    NoncurrentVersionTransitions: Optional[List[NoncurrentVersionTransition]] = None
    Prefix: Optional[str] = None
    NoncurrentVersionTransition: Optional[NoncurrentVersionTransition] = None
    ExpirationDate: Optional[str] = None
    ExpirationInDays: Optional[int] = None
    Transition: Optional[Transition] = None
    Id: Optional[str] = None
    AbortIncompleteMultipartUpload: Optional[AbortIncompleteMultipartUpload] = None

class WebsiteConfiguration(BaseModel):
    RoutingRules: Optional[List[RoutingRule]] = None
    IndexDocument: Optional[str] = None
    RedirectAllRequestsTo: Optional[RedirectAllRequestsTo] = None
    ErrorDocument: Optional[str] = None

class TopicConfiguration(BaseModel):
    Event: str
    Topic: str
    Filter: Optional[NotificationFilter] = None

class IntelligentTieringConfiguration(BaseModel):
    Status: str
    TagFilters: Optional[List[TagFilter]] = None
    Tierings: List[Tiering]
    Id: str
    Prefix: Optional[str] = None

class StorageClassAnalysis(BaseModel):
    DataExport: Optional[DataExport] = None

class LambdaConfiguration(BaseModel):
    Function: str
    Event: str
    Filter: Optional[NotificationFilter] = None

class ReplicationRuleFilter(BaseModel):
    Prefix: Optional[str] = None
    And: Optional[ReplicationRuleAndOperator] = None
    TagFilter: Optional[TagFilter] = None

class BucketEncryption(BaseModel):
    ServerSideEncryptionConfiguration: List[ServerSideEncryptionRule]

class LifecycleConfiguration(BaseModel):
    Rules: List[Rule]

class QueueConfiguration(BaseModel):
    Event: str
    Filter: Optional[NotificationFilter] = None
    Queue: str

class ReplicationRule(BaseModel):
    Status: str
    Destination: ReplicationDestination
    Filter: Optional[ReplicationRuleFilter] = None
    Priority: Optional[int] = None
    SourceSelectionCriteria: Optional[SourceSelectionCriteria] = None
    Id: Optional[str] = None
    Prefix: Optional[str] = None
    DeleteMarkerReplication: Optional[DeleteMarkerReplication] = None

class ReplicationConfiguration(BaseModel):
    Role: str
    Rules: List[ReplicationRule]

class AnalyticsConfiguration(BaseModel):
    TagFilters: Optional[List[TagFilter]] = None
    StorageClassAnalysis: StorageClassAnalysis
    Id: str
    Prefix: Optional[str] = None

class NotificationConfiguration(BaseModel):
    QueueConfigurations: Optional[List[QueueConfiguration]] = None
    LambdaConfigurations: Optional[List[LambdaConfiguration]] = None
    TopicConfigurations: Optional[List[TopicConfiguration]] = None

class Model(BaseModel):
    InventoryConfigurations: Optional[List[InventoryConfiguration]] = None
    WebsiteConfiguration: Optional[WebsiteConfiguration] = None
    DualStackDomainName: Optional[str] = None
    AccessControl: Optional[str] = None
    AnalyticsConfigurations: Optional[List[AnalyticsConfiguration]] = None
    AccelerateConfiguration: Optional[AccelerateConfiguration] = None
    PublicAccessBlockConfiguration: Optional[PublicAccessBlockConfiguration] = None
    BucketName: Optional[str] = None
    RegionalDomainName: Optional[str] = None
    OwnershipControls: Optional[OwnershipControls] = None
    ObjectLockConfiguration: Optional[ObjectLockConfiguration] = None
    ObjectLockEnabled: Optional[bool] = None
    LoggingConfiguration: Optional[LoggingConfiguration] = None
    ReplicationConfiguration: Optional[ReplicationConfiguration] = None
    Tags: Optional[List[Tag]] = None
    DomainName: Optional[str] = None
    BucketEncryption: Optional[BucketEncryption] = None
    WebsiteURL: Optional[str] = None
    NotificationConfiguration: Optional[NotificationConfiguration] = None
    LifecycleConfiguration: Optional[LifecycleConfiguration] = None
    VersioningConfiguration: Optional[VersioningConfiguration] = None
    MetricsConfigurations: Optional[List[MetricsConfiguration]] = None
    IntelligentTieringConfigurations: Optional[
        List[IntelligentTieringConfiguration]
    ] = None
    CorsConfiguration: Optional[CorsConfiguration] = None
    Id: Optional[str] = None
    Arn: Optional[str] = None
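To illustrate the built-in serialization, here is a small sketch that copies just one of the generated classes (PublicAccessBlockConfiguration) so it runs on its own. The `dump` helper is my own assumption, not part of the generated file: it picks `model_dump` when pydantic v2 is installed and falls back to `dict` on v1, since the generated code above predates v2.

```python
import json
from typing import Optional

from pydantic import BaseModel


class PublicAccessBlockConfiguration(BaseModel):
    # Copied from the generated aws_s3_bucket.py above.
    RestrictPublicBuckets: Optional[bool] = None
    IgnorePublicAcls: Optional[bool] = None
    BlockPublicPolicy: Optional[bool] = None
    BlockPublicAcls: Optional[bool] = None


def dump(model: BaseModel) -> dict:
    """Hypothetical helper: export a model on either pydantic v1 or v2."""
    exporter = getattr(model, "model_dump", None) or model.dict
    return exporter(exclude_none=True)


# The string "true" is coerced to a real bool by pydantic's validation.
config = PublicAccessBlockConfiguration(BlockPublicAcls=True, BlockPublicPolicy="true")
properties = dump(config)
rendered = json.dumps(properties, sort_keys=True)
print(rendered)
```

Unset optional fields are dropped by `exclude_none=True`, and the bool fields are emitted as JSON booleans rather than strings.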
I'll have to come back to read your additional comments. But when running your tests, did you set the TROPO_REAL_BOOL environment variable? The mapping is done here. This was added for backwards compatibility and will be the default in the next major revision.
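For anyone following along, this is roughly the kind of switch being described. It is a hypothetical sketch, not the actual troposphere code: the function below, its accepted values, and the rule that any non-empty TROPO_REAL_BOOL enables real booleans are all assumptions made for illustration.

```python
import os

TRUE_VALUES = (True, 1, "1", "true", "True")
FALSE_VALUES = (False, 0, "0", "false", "False")


def boolean(value, environ=os.environ):
    """Hypothetical validator: return real bools only when the flag is set.

    Assumption: any non-empty TROPO_REAL_BOOL value turns the flag on.
    """
    real_bool = bool(environ.get("TROPO_REAL_BOOL"))
    if value in TRUE_VALUES:
        return True if real_bool else "true"
    if value in FALSE_VALUES:
        return False if real_bool else "false"
    raise ValueError(f"not a boolean value: {value!r}")


# Without the flag, booleans are emitted as strings (the legacy behavior).
legacy = boolean(True, environ={})
# With the flag, real bools pass through and serialize as JSON booleans.
modern = boolean("true", environ={"TROPO_REAL_BOOL": "1"})
```

Passing `environ` explicitly just keeps the sketch easy to try without touching the real environment.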
The TROPO_REAL_BOOL was not set.
I would like to second that using Pydantic is really sweet. Typehints, serialization, Literals, etc.; it has been so agile to use. But not sure how big an overhaul it would be for this repo.
Looking at this PR for example: https://github.com/cloudtools/troposphere/pull/1858/files Seems like all of the definitions could be Pydantic BaseModels. But there is likely lots of machinery that relies on the current form 🤷
@dazza-codes @lautjy I found a Python library, https://github.com/MacHu-GWU/cottonformation-project#welcome-to-cottonformation-documentation, that seems to do exactly what you described: type hints, parameter suggestions, and validation.
It looks like the author uses the CloudFormation schema JSON file from AWS and jinja2 to generate all of that code automatically. I think we could borrow the approach here.
- generate code from schema json file: https://github.com/MacHu-GWU/cottonformation-project/blob/main/cottonformation/code/spec.py#L686
- the generated code: https://github.com/MacHu-GWU/cottonformation-project/tree/main/cottonformation/res
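The idea of generating model code from a schema can be sketched in a few lines. This is not cottonformation's generator: it swaps jinja2 for the standard library's string.Template so it runs without extra dependencies, it only handles flat properties with a single scalar type, and `render_model` and `JSON_TO_PY` are hypothetical names.

```python
from string import Template

# Assumed mapping from JSON-schema scalar types to Python annotations.
JSON_TO_PY = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}

CLASS_TEMPLATE = Template("class $name(BaseModel):\n$fields\n")


def render_model(name, schema):
    """Render pydantic-style class source from a flat JSON-schema object."""
    required = set(schema.get("required", []))
    lines = []
    for prop, spec in schema.get("properties", {}).items():
        json_type = spec.get("type")
        # Anything that is not a known scalar type falls back to Any.
        py_type = JSON_TO_PY.get(json_type, "Any") if isinstance(json_type, str) else "Any"
        if prop in required:
            lines.append(f"    {prop}: {py_type}")
        else:
            lines.append(f"    {prop}: Optional[{py_type}] = None")
    return CLASS_TEMPLATE.substitute(name=name, fields="\n".join(lines) or "    pass")


# Hand-written schema fragment shaped like the Tiering definition above.
tiering_schema = {
    "type": "object",
    "properties": {"AccessTier": {"type": "string"}, "Days": {"type": "integer"}},
    "required": ["AccessTier", "Days"],
}
tiering_source = render_model("Tiering", tiering_schema)
print(tiering_source)
```

A real generator would also need to resolve `$ref` definitions, nested objects, arrays, and enums, which is what datamodel-codegen already does for the schemas above.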