What you need to know before

GitHub Integration with Tracepaper

Tracepaper seamlessly integrates with GitHub to provide you with complete control and visibility over your application models and the generated code. Here’s what you need to know about how this integration works:

Required GitHub Repositories: For each project modeled in Tracepaper, you must provide three GitHub repositories:
- Model Repository: This repository will store the XML files for your Tracepaper models.
- Backend Code Repository: This repository will contain the generated Python code and AWS CloudFormation infrastructure definitions.
- GUI Assist Framework Repository (Beta): This repository is used for the GUI Assist Framework, currently in beta, which aids in generating front-end components.

You will need to grant Draftsman.io write access to these repositories to allow Tracepaper to store and update the necessary files.

Automatic Storage: Once the repositories are set up, every model you create in Tracepaper is automatically saved in the designated model repository. This ensures that all versions of your models are securely stored and easily accessible for future reference.
Code Generation: When you model your domain in Tracepaper, the platform generates a Python project, which includes infrastructure definitions using AWS CloudFormation. This generated code is automatically pushed to the backend code repository, ensuring that your codebase is always in sync with your models.
Version Control: With GitHub’s robust version control system, you can track changes made to your models and code over time. This is especially useful for maintaining a clear history of your project’s evolution, allowing you to revert to previous versions if necessary.
Ownership and Access Control: Since the GitHub repositories are under your control, you have full ownership of the models and code. You can manage access to the repositories, control who can view or edit your models, and enforce security policies as needed.

In summary, Tracepaper’s GitHub integration requires you to set up and manage three GitHub repositories, giving you the best of both worlds: a powerful low-code modeling tool combined with the full capabilities of GitHub’s version control features. This integration allows you to maintain full ownership and control over your projects while ensuring that your work is secure, well-documented, and up-to-date.

1. Domain-Driven Design (DDD)

Tracepaper is influenced by Domain-Driven Design, and in the inspirations section, you will find a mapping between DDD concepts and Tracepaper concepts where applicable. If you start researching DDD, the results can be quite extensive and intimidating. Here are just the essentials that you need to know beforehand, along with best practices for implementing these concepts in Tracepaper.

Aggregates

Definition: An aggregate is a cohesive set of data definitions and behaviors that function together as a single unit. It ensures consistency and integrity within its boundary.

Best Practice: Keep aggregates small and focused. This practice helps maintain clear separation of concerns, simplifies business logic, and enhances the maintainability and agility of your system. Small aggregates are easier to change and evolve, while large aggregates can become complex and harder to manage over time.
Behavior and State: Aggregates encapsulate both behavior (methods) and state (attributes). The behavior ensures that any operations on the aggregate respect the business rules and constraints. Domain Events play a crucial role in state management, as they are used in event-sourcing to capture and apply changes to the aggregate's state over time.
Consistency Boundary: The aggregate defines a consistency boundary within which all rules and constraints are enforced. Changes to any part of the aggregate are managed in a way that ensures the aggregate’s consistency.

Additional Best Practices for Working with Aggregates in Tracepaper:

Leverage ARNs (Application Resource Names) for Sub-Collections: When working with nested objects within an aggregate, such as tasks within an objective, consider assigning ARNs to these objects. This practice optimizes searching and querying, especially when these nested objects are mirrored as autonomous entities in the view model.
Use the "Is Create Flow" Checkbox Appropriately: Ensure that the "Is Create Flow" checkbox is checked for flows that are responsible for creating new aggregate instances. This prevents the flow from executing if the aggregate instance already exists in the event store, safeguarding against unintended data overwrites.
Handle Idempotency with Care: In distributed systems, idempotency is crucial to ensure that repeated commands do not result in multiple unintended changes. Consider implementing both technical and functional idempotency where necessary, especially for flows that might be triggered multiple times by the client or due to network retries.

By adhering to these best practices, you can effectively leverage the principles of Domain-Driven Design within Tracepaper, ensuring that your aggregates are both powerful and manageable, while maintaining the integrity and performance of your system.
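
The create-flow guard and the technical idempotency described above can be illustrated with a small in-memory sketch. This is not Tracepaper's generated code: the class `InMemoryEventStore`, the `request_id` parameter, and the exception name are hypothetical, chosen only to show the two checks side by side.

```python
class AggregateAlreadyExists(Exception):
    """Raised when a create flow targets a key that already has events."""


class InMemoryEventStore:
    """Toy stand-in for an event store (hypothetical, for illustration)."""

    def __init__(self):
        self.events = {}            # aggregate key -> list of event dicts
        self.seen_requests = set()  # request ids that were already handled

    def handle(self, key, request_id, event, is_create_flow=False):
        # Technical idempotency: a retried request with the same id is ignored.
        if request_id in self.seen_requests:
            return False
        # Create-flow guard: refuse to create an instance that already exists.
        if is_create_flow and key in self.events:
            raise AggregateAlreadyExists(key)
        self.events.setdefault(key, []).append(event)
        self.seen_requests.add(request_id)
        return True
```

A network retry that reuses the request id is silently dropped, while a genuinely new create request for an existing key is rejected. Functional idempotency (for example, ignoring a rename to the current name) would be an extra business rule on top of this.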

Domain Events

Domain events represent significant occurrences within the domain that change the state.

2. Event Sourcing

Event sourcing is a pattern where state changes are logged as a sequence of events. This approach provides benefits such as auditability, replayability, and the ability to derive the current state from event history. The main reason we use it at Tracepaper is that it lets you evolve the internal view model, and subsequently the external query model, without data migrations. Because only the events (facts) are stored, the perspective on reality needed at runtime (the internal or external view model) can be recalculated as needed. This perspective and the calculation method can evolve over time without rewriting the historical events.

Tracepaper forces you to use event sourcing because it is the only way to change state. It relies on two modeling concepts: a domain event schema (part of the aggregate) and a mapping to the aggregate document (in-memory view model).

Event schemas in Tracepaper are modeled as simple data models. Each domain event has a unique name within the application. For example, if you have an aggregate called "User" with a behavior flow "Create," you would add a domain event called "UserCreated." Besides the name, the event has a collection of attributes, each with a name and a type (e.g., String or Boolean), and optionally an explicit default value. The default value ensures backward compatibility: if an event is extended with additional attributes during the application's lifecycle, the mapping may fail for past events that lack them, and a default value fills in the missing historical values. Domain events may also include nested collections of objects, to which the same rules apply.
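
To illustrate how a default value keeps old events readable, here is a minimal sketch. The schema layout (`USER_CREATED_SCHEMA`), the `REQUIRED` marker, and the `load_event` helper are assumptions made for this example and do not reflect how Tracepaper stores schemas internally.

```python
# Marker for attributes that have no default value (hypothetical).
REQUIRED = object()

# Hypothetical schema for "UserCreated": attribute -> (type, default).
USER_CREATED_SCHEMA = {
    "username": (str, REQUIRED),
    "newsletter": (bool, False),  # added later, so older events lack it
}


def load_event(payload, schema):
    """Return a complete event dict, filling defaults for missing attributes."""
    event = {}
    for name, (attr_type, default) in schema.items():
        if name in payload:
            value = payload[name]
        elif default is not REQUIRED:
            value = default
        else:
            raise KeyError(f"missing attribute without default: {name}")
        if not isinstance(value, attr_type):
            raise TypeError(f"{name} must be {attr_type.__name__}")
        event[name] = value
    return event
```

An event stored before `newsletter` existed still loads, with `newsletter` set to `False`, whereas a missing `username` raises an error because no default was declared.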

Mappings are modeled as relations between the aggregate document and the domain events. Each domain event has its own mapping. For the root attributes in the document, it involves simple set/add/subtract operations. For nested collections, one of the nested object attributes is configured to serve as a business key. Nested objects are stored as dictionaries rather than lists to facilitate easy retrieval. The contents of nested objects are also mapped with set/add/subtract operations.
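
The mapping idea can be sketched as a replay loop that applies set/add/subtract operations per event. Everything named here (`MAPPINGS`, `apply_ops`, `replay`, and the "ObjectiveCreated" and "TaskAdded" events) is hypothetical; the sketch only mirrors the description above, including nested objects stored in a dictionary under a business key.

```python
def apply_ops(target, source, ops):
    """Apply (operation, document_field, event_field) triples to a dict."""
    for op, doc_field, event_field in ops:
        value = source[event_field]
        if op == "set":
            target[doc_field] = value
        elif op == "add":
            target[doc_field] = target.get(doc_field, 0) + value
        elif op == "subtract":
            target[doc_field] = target.get(doc_field, 0) - value
        else:
            raise ValueError(f"unknown operation: {op}")


# Hypothetical mappings: one entry per domain event.
MAPPINGS = {
    "ObjectiveCreated": {"root": [("set", "name", "name")], "nested": {}},
    "TaskAdded": {
        "root": [("add", "task_count", "count")],
        "nested": {"tasks": {"key": "task_id", "ops": [("set", "title", "title")]}},
    },
}


def replay(events):
    """Rebuild the aggregate document from a list of (name, payload) events."""
    document = {}
    for event_name, payload in events:
        mapping = MAPPINGS[event_name]
        apply_ops(document, payload, mapping["root"])
        for collection, rule in mapping["nested"].items():
            for item in payload.get(collection, []):
                # Nested objects live in a dict keyed by the business key.
                entries = document.setdefault(collection, {})
                entry = entries.setdefault(item[rule["key"]], {})
                apply_ops(entry, item, rule["ops"])
    return document
```

Because the document is derived purely from the events, changing a mapping and replaying yields a new view without touching the stored history.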

The EventStore acts as the single source of truth within the application. We selected a database that is redundant by default (DynamoDB) to act as our event store, so your events are replicated across all availability zones within a region and you get point-in-time recovery as well. Everything is abstracted away behind configuration, such as the time to live for events (by default they are kept indefinitely) or key management with the aid of the keychain, so you will probably never see or interact with this event store directly. The important thing to remember about the event store is: don't manually intervene in this database. If you are absolutely certain that you must, double-check that you have a backup that you can recover from before you proceed.

3. CQRS (Command Query Responsibility Segregation)

CQRS stands for Command Query Responsibility Segregation, a pattern that separates the operations for reading data (queries) from those that update data (commands). The main benefits include optimized performance, enhanced scalability, and improved security.

In Tracepaper, commands are modeled as GraphQL API mutations, including a message schema, API path, and authorization method. These commands are converted into asynchronous events that trigger aggregate behavior flows or automations. Commands may lead to a domain event that updates the internal state.

The aggregate document acts as an in-memory view model, a state representation that aids the command model by supporting the execution of business rules to determine if a domain event (state change) should be issued or not. This internal view model has a secondary purpose, serving as a source for the external query model.

The in-memory view model is mapped to an external format, and stored in a query-optimized database. This data is exposed via GraphQL queries with defined schemas, paths, and authorization methods.
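
The split between the command side and the query side can be sketched in a few functions. The names `handle_rename`, `apply_event`, and `project`, and the "UserRenamed" event, are hypothetical and stand in for what Tracepaper models and generates for you.

```python
def handle_rename(document, command):
    """Command side: return a domain event, or None if nothing changes."""
    if command["name"] == document.get("name"):
        return None  # business rule: no state change, so no event is issued
    return {"type": "UserRenamed", "name": command["name"]}


def apply_event(document, event):
    """Update the in-memory view model from a domain event."""
    if event["type"] == "UserRenamed":
        document["name"] = event["name"]
    return document


def project(document):
    """Query side: map the internal document to an external read format."""
    return {"displayName": document["name"].title()}
```

The command handler only reads the document to decide whether an event is needed; the projection is a separate step, so the external format can change without affecting the business rules.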

In-depth knowledge is not needed to use Tracepaper, but if you are curious, take a look at the following links:

4. Python Programming

Python is a high-level, interpreted programming language known for its readability and simplicity. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python's extensive standard library and active community make it a popular choice for a wide range of applications, from web development to data analysis and machine learning.

Why We Selected Python

We selected Python for Tracepaper because of its ease of use, readability, and the speed at which developers can write and maintain code. Python's syntax is clear and concise, making it accessible for developers with varying levels of experience. Additionally, Python has a rich ecosystem of libraries and tools that support rapid development and integration with other technologies.

Using Python Within Tracepaper

In Tracepaper, you are programming in a constrained environment where the primary focus is on modeling domain logic. Python is used to inject custom code into the generated code, allowing for specific behaviors and customizations. The code you write will mainly be procedural, as it is designed to fit within the pre-generated structure provided by Tracepaper.

To gain a deeper understanding of Python and how to effectively use it within Tracepaper, consider the following resources:

By understanding these fundamentals, you'll be well-equipped to leverage Python within Tracepaper to customize and enhance your domain models effectively.

5. Serverless Architecture

Serverless architecture is a cloud computing model where the cloud provider automatically manages the infrastructure. The key benefits of serverless architecture are scalability, cost-efficiency, and simplified operations. Tracepaper leverages these benefits by converting its models to business logic (Python) and infrastructure definitions (CloudFormation). This process is abstracted away from the user, meaning you only need a basic understanding for troubleshooting, without needing to build a serverless architecture yourself.

In Tracepaper, AWS Lambda functions are used wherever compute power is needed, eliminating the need for VMs or containers. This approach ensures that compute resources are managed efficiently and seamlessly. State management in Tracepaper is handled through DynamoDB, using a single-table design per store rather than one table per entity. Multiple tables are still used: for example, there is one event-store table and one view-store table. All entities are stored within these tables, using logical separation to optimize for cost and operations.
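
To give a feel for logical separation inside one table, here is an in-memory sketch. The key layout (`<entity type>#<entity id>` as partition key plus a sort key) is an assumption for illustration and is not taken from Tracepaper's actual table definitions; the `SingleTable` class is hypothetical as well.

```python
from collections import defaultdict


class SingleTable:
    """In-memory stand-in for one table that holds several entity types."""

    def __init__(self):
        self._items = defaultdict(dict)  # partition key -> {sort key: item}

    def put(self, entity_type, entity_id, sort_key, item):
        # Logical separation: the entity type is part of the partition key.
        self._items[f"{entity_type}#{entity_id}"][sort_key] = item

    def query(self, entity_type, entity_id):
        """Return the items of one entity, ordered by sort key."""
        partition = self._items.get(f"{entity_type}#{entity_id}", {})
        return [partition[key] for key in sorted(partition)]
```

Users and orders share the same table here, yet a query for one entity never sees items of another, because the partition keys differ.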

One of the major scalability benefits of using serverless architecture in Tracepaper is that serverless services like AWS Lambda automatically scale with the application's needs. This offers high availability and reliability across multiple availability zones, ensuring that the application can handle varying workloads without manual intervention.

6. GraphQL

GraphQL is a query language for APIs and a runtime for executing those queries by leveraging your existing data. It provides a more efficient, powerful, and flexible alternative to traditional REST APIs. With GraphQL, clients can request exactly the data they need, significantly reducing the amount of data transferred over the network and eliminating over-fetching and under-fetching issues commonly encountered with REST APIs.

In the context of a system with a rich domain, such as Tracepaper, GraphQL excels by allowing clients to specify their exact data requirements, making it easier to work with complex and interconnected data models. This is particularly advantageous when dealing with a rich view model with a multitude of entities interconnected by relations such as one-to-one, one-to-many, many-to-one, and many-to-many, as clients can retrieve nested data structures in a single query.

In Tracepaper, commands are modeled as requests for changes and exposed as GraphQL mutations. When a command, such as “UpdateUser,” is sent, the GraphQL “server” (using AWS AppSync) converts this command into an asynchronous event (e.g., “UpdateUserRequested”). This event-driven approach ensures that aggregates react only to events, maintaining a clear separation of concerns. Once the event is published, the API returns a trace ID to the client. This trace ID can be used to query or subscribe to trace events, allowing the client to monitor the processing of the request. This integration of GraphQL with an event-driven architecture provides the benefits of flexible and efficient data retrieval while supporting the robust processing model of CQRS and event sourcing.
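
The command-to-event flow with a trace ID can be simulated in plain Python. This is only a sketch: `EVENT_BUS`, `TRACE_LOG`, `submit_command`, and `process_next_event` are hypothetical stand-ins for the AppSync API and the asynchronous processing behind it.

```python
import uuid
from collections import deque

EVENT_BUS = deque()  # stand-in for an asynchronous event bus
TRACE_LOG = {}       # trace id -> list of status messages


def submit_command(command_name, payload):
    """Convert a command into a '<Command>Requested' event; return a trace id."""
    trace_id = str(uuid.uuid4())
    EVENT_BUS.append(
        {"type": f"{command_name}Requested", "trace_id": trace_id, "payload": payload}
    )
    TRACE_LOG[trace_id] = ["accepted"]
    return trace_id


def process_next_event():
    """Later, a behavior flow picks up the event and records its progress."""
    event = EVENT_BUS.popleft()
    TRACE_LOG[event["trace_id"]].append(f"{event['type']} processed")
    return event
```

The caller gets the trace ID back immediately, before any processing has happened, and can look up `TRACE_LOG[trace_id]` afterwards to follow the request.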

7. Regex Patterns for Input Validation

Regular expressions, or regex, are sequences of characters that form search patterns. They are commonly used for string matching and manipulation, such as validating input formats, searching text, and replacing substrings. Regex provides a powerful way to ensure that the input data adheres to specific formats, which is essential for maintaining data integrity and preventing errors.

In Tracepaper, regex patterns are used for input validation in commands. By defining regex patterns, you can ensure that user input matches the expected format before it is processed by the system. This helps prevent invalid data from triggering events or causing unexpected behavior in your application.

Example Regex Patterns:

Email Validation: ^[\w.-]+@([\w-]+\.)+[\w-]{2,4}$
Phone Number Validation: ^\+?[1-9]\d{1,14}$
Date Validation (YYYY-MM-DD): ^\d{4}-\d{2}-\d{2}$
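
The example patterns can be tried out with Python's `re` module. In this sketch the first character class of the email pattern is written as `[\w.-]`, because Python's `re` rejects a hyphen placed between `\w` and `\.` as a bad character range. The `PATTERNS` dictionary and the `validate` helper are hypothetical names for this example.

```python
import re

PATTERNS = {
    "email": re.compile(r"^[\w.-]+@([\w-]+\.)+[\w-]{2,4}$"),
    "phone": re.compile(r"^\+?[1-9]\d{1,14}$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}


def validate(kind, value):
    """Return True when the value matches the pattern registered for kind."""
    return PATTERNS[kind].match(value) is not None
```

Note that these patterns only check the format: `validate("date", "2024-02-30")` passes even though that date does not exist, so semantic checks still belong in your business logic.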

Online Resources for Learning Regex

To get started with regex and practice using it for input validation, consider the following resources:

Regex101: An online regex tester and debugger with a comprehensive explanation of regex patterns.
Regular Expressions 101 Video: Regular Expressions 101 by Corey Schafer - A beginner-friendly video tutorial on regex.
Interactive Regex Tutorial: RegexOne - An interactive tutorial that allows you to practice regex patterns in your browser.

Canonical Keys and ARNs in Tracepaper

Understanding Canonical Keys

In distributed systems, particularly those utilizing event-driven architectures and Domain-Driven Design (DDD), managing and accessing entities efficiently and consistently is critical. Canonical keys serve as a structured, standardized way to uniquely identify entities within the system, ensuring consistency in referencing, accessing, and manipulating entities across various services and contexts.

The Role of ARNs

In Tracepaper, the concept of canonical keys is often implemented through Application Resource Names (ARNs). An ARN is a structured identifier that uniquely represents an entity within the system. This identifier typically encapsulates the entity's hierarchical position, making it straightforward to reference and retrieve the entity across different parts of the system.

For example, an ARN might look something like this:

anove:company:tracepaper:objective:myObjective

This structure provides a clear path from the top-level domain (Anove) through the company and the specific application (Tracepaper) down to the individual objective, making it easy to identify and retrieve the entity.
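
Building such a key and walking up its hierarchy takes only a few lines. The helpers `build_arn` and `parent_of` are hypothetical; the colon separator is taken from the example above, and no meaning is assigned to individual segments.

```python
SEPARATOR = ":"


def build_arn(*segments):
    """Join segments into an ARN, rejecting empty or separator-bearing parts."""
    for segment in segments:
        if not segment or SEPARATOR in segment:
            raise ValueError(f"invalid segment: {segment!r}")
    return SEPARATOR.join(segments)


def parent_of(arn):
    """Return the ARN one level up, or None for a top-level key."""
    head, _, _ = arn.rpartition(SEPARATOR)
    return head or None
```

Rejecting segments that contain the separator keeps the hierarchy unambiguous when the key is split again later.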

Flexibility and Best Practices in Using ARNs

While using ARNs is a recommended approach for implementing canonical keys in Tracepaper, it is important to note that this is a functional construct. The construction of ARNs can be automated, but it may also be provided as input via an API. In such cases, it is advisable, though not mandatory, to validate the ARN against a regular expression (regex) to ensure it follows the expected structure. This validation helps maintain consistency and prevent errors but can be bypassed if necessary.
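
A regex check on an incoming ARN might look like the sketch below. The pattern is an assumption modeled on the example key for objectives; your own key structure determines the real expression, and `OBJECTIVE_ARN` and `is_objective_arn` are hypothetical names.

```python
import re

# Hypothetical pattern: five colon-separated segments, fourth is "objective".
OBJECTIVE_ARN = re.compile(r"anove:[\w-]+:[\w-]+:objective:[\w-]+")


def is_objective_arn(value):
    """Return True when the whole value has the expected ARN structure."""
    return OBJECTIVE_ARN.fullmatch(value) is not None
```

Using `fullmatch` means the pattern needs no explicit anchors and trailing characters are rejected.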

It’s also worth mentioning that calling this key an "ARN" is a suggestion, not a requirement. Depending on your specific use case or domain, you may choose to use a different name for this key. The underlying principle remains the same: to use a standardized, hierarchical identifier that ensures entities are uniquely and consistently referenced throughout the system.

The Role of the Keychain

Tracepaper uses a keychain to manage the transformation and storage of keys in the system. The keychain ensures that any functional identifiers, such as ARNs, are properly handled and transformed into the appropriate technical identifiers when needed. This means that even if your ARNs are provided as API inputs or generated through custom logic, the keychain will manage their integrity and consistency across the system. As a result, using a functional identifier like an ARN does not pose a concern, as the keychain abstracts away the complexities of key management and ensures seamless operation.

Best Practices for Using Canonical Keys, ARNs, and the Keychain

Consistent Structure: Whether you choose to call it an ARN or another name, ensure that the key follows a consistent structure across the system. This consistency is essential for effective querying, access control, and data management.
Hierarchy and Granularity: Design your keys to reflect the appropriate level of hierarchy and granularity for your use case. For example, the key should include sufficient detail to uniquely identify an entity without overcomplicating the structure.
Regex Validation: If you allow ARNs or similar keys to be provided via an API, consider implementing regex validation to ensure the key adheres to the expected format. This validation can catch potential errors early and maintain consistency across the system.
Use in Queries: Leverage the hierarchical nature of these keys in your queries, especially when filtering data. For example, using key_begins_with in queries allows you to efficiently retrieve all entities related to a particular domain or context.
Authorization and Access Control: Use the structure of your keys to enforce authorization and access control. By embedding certain hierarchical elements in the key, you can control access to entities based on the user’s permissions within that hierarchy.
Leverage the Keychain: Trust the keychain to handle the complexities of key management. Whether your ARNs are automatically generated or manually provided, the keychain will ensure they are correctly transformed and stored, maintaining consistency and integrity across the system.
Documentation and Naming Conventions: Clearly document the structure and intended use of your keys, whether they are called ARNs or something else. Consistent naming conventions and well-documented patterns help ensure that all team members and system components interact with these keys correctly.
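
The query and access-control practices above both rely on prefix matching. The sketch below shows the idea on an in-memory list; the function `key_begins_with` here is a hypothetical Python stand-in for the query filter of the same name, and `may_access` with its `granted_prefixes` parameter is likewise an assumption.

```python
def key_begins_with(items, prefix):
    """Return the items whose ARN starts with the given prefix."""
    return [item for item in items if item["arn"].startswith(prefix)]


def may_access(arn, granted_prefixes):
    """Allow access when the ARN equals or sits below a granted prefix."""
    return any(
        arn == prefix or arn.startswith(prefix + ":")
        for prefix in granted_prefixes
    )
```

The access check appends the separator before comparing, so a grant on `anove:company:tracepaper` does not accidentally cover a sibling whose name merely starts with the same characters.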

Conclusion

Canonical keys, such as ARNs, are vital for the effective management of entities in Tracepaper. While the construction and validation of these keys can be flexible, adhering to best practices ensures that your system remains consistent, scalable, and maintainable. The Tracepaper keychain further enhances this process by abstracting the complexities of key management, allowing you to focus on designing robust and functional identifiers.

For a deeper dive into the importance of key management and the use of canonical keys in Tracepaper, refer to this detailed explanation. By following these guidelines, you can ensure that your use of canonical keys or ARNs aligns with the best practices for robust and scalable system design.