Cloud SaaS Factory on AWS

For businesses seeking accessibility, functionality, and adaptability in a competitive environment, software-as-a-service (SaaS) is an increasingly attractive option. SaaS is a software distribution model that offers flexibility and cost-effectiveness, which makes it a dependable choice across many business models and industries. Its ease of use, user accessibility, security, and widespread connectivity have made it popular among many businesses. More ISVs are making their products SaaS-ready, a trend reinforced by the acceptance of HyperCloud, which is paving the way for affordable connectivity across the globe.

If you are an ISV looking for a cloud solution, then this blog will walk you through the various considerations and support features offered with a SaaS Factory approach for Architecture and Design.

Key considerations based on the SaaS Factory approach for Architecture and Design

  • Single-tenant vs Multi-tenant (Dedicated vs Shared resources)
  • How do we ensure isolation (policies, partitioning, or routing) and security within the SaaS solution for SaaS customers?
  • How do we ensure the SaaS solution meets its reliability targets (SLOs)?
  • Should we build it cloud native / serverless? Will this lock us into the cloud platform? Should we consider managed vs self-managed components?
  • What is the right minimum resource sizing/capacity that does not impact user experience?
  • How is billing/cost attributed to each customer, and how do we keep optimizing cost continuously?
  • How do we track, instrument, and report metrics across tenants?
  • How do we design a scalable architecture for growth without a linear increase in operations overhead?
  • What are the technical considerations for the tiers of the SaaS offering?

Let us now take a deeper look at each of these considerations.

Single-tenant vs Multi-tenant

When only one SaaS customer’s users are using a component of the infrastructure, it is considered to be dedicated to that customer under the single-tenant model of deployment. This might be a single SaaS customer-specific database, VM, or Lambda.

A multi-tenant model is a deployment model in which users from different SaaS customers share the same environment or component and are considered to be using shared resources.

Approach – Workshops can be used to identify which components need to be dedicated, based on the aforementioned factors, and which can be shared.

Isolation

Given the multi-tenant (shared resources) nature of SaaS offerings, isolation is a crucial component. In AWS, several levels of data plane isolation can be attained, including:

  1. Account level
  2. Network level (VPC or Subnet)
  3. Compute level (EC2, ECS, or EKS)
  4. Service level (DocumentDB – Collection level)
  5. Cluster level (In Kubernetes Cluster: Namespace, Service Mesh)
  6. Storage level (S3 Buckets, EBS volumes)

Network-level isolation based on subnets is chosen in this architecture, because dedicated components in a multi-account strategy would add the overhead of an EKS control plane for each customer.

Approach – Isolation should be taken into account in the cloud architecture and design, and implemented and handled at a layer other than the application code written by the developer.
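
One way to keep isolation out of application code is to generate a tenant-scoped access policy when the tenant is onboarded. The sketch below is only an illustration under assumptions: the bucket name, the `tenants/<tenant_id>/` prefix layout, and the function name `tenant_s3_policy` are hypothetical and not part of the architecture described here. It builds an IAM policy document that restricts a tenant's role to its own S3 prefix.

```python
def tenant_s3_policy(bucket: str, tenant_id: str) -> dict:
    """Build an IAM policy document limited to one tenant's S3 prefix.

    Assumption: tenant data lives under tenants/<tenant_id>/ in a shared bucket.
    """
    # Reject ids that could escape the prefix (for example "../other").
    if not tenant_id.isalnum():
        raise ValueError("tenant_id must be alphanumeric")
    prefix = f"tenants/{tenant_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object reads and writes only inside the tenant prefix.
                "Sid": "TenantObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # Listing is allowed only for keys under the tenant prefix.
                "Sid": "TenantListing",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }
```

Calling `tenant_s3_policy("example-saas-data", "acme01")` returns a dictionary that can be serialized with `json.dumps` and attached to a tenant-specific role, so the boundary is enforced by the platform rather than by each developer.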

Reliability

Every stateful component of the SaaS solution needs to be assessed against the reliability requirements, both in terms of data loss and business continuity. Once the key stateful components have been identified as critical based on RTO and RPO expectations, they must be architected in AWS leveraging the following aspects:

  1. Cluster compute spanning multiple Availability Zones, ensuring services/pods are distributed across AZs
  2. Managed / Containerized Amazon services providing Multi-AZ failover
  3. Backup, Retention & Restore methods
  4. Cross-Region snapshot copy or data sync

Approach – In addition to high availability across AZs, components with shorter RTO and RPO will require a hot standby DR in another region. To provide a reliable client experience, onboarding components and control plane components, such as metrics reporters and cost consumption reporters, should be highly available.
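
The assessment above can be captured as a small classification rule. The following is a minimal sketch, not a prescribed method: the 60-minute cut-off, the `Component` fields, and the strategy labels are assumptions chosen for illustration, and real thresholds would come from the business RTO and RPO expectations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    stateful: bool
    rto_minutes: int  # maximum tolerable downtime
    rpo_minutes: int  # maximum tolerable data-loss window


# Hypothetical cut-off: anything at or below this gets a hot standby.
HOT_STANDBY_THRESHOLD_MINUTES = 60


def dr_strategy(component: Component) -> str:
    """Pick a reliability strategy from the component's RTO and RPO."""
    if not component.stateful:
        # Stateless components only need to be spread across AZs.
        return "multi-az"
    tightest = min(component.rto_minutes, component.rpo_minutes)
    if tightest <= HOT_STANDBY_THRESHOLD_MINUTES:
        return "multi-az + cross-region hot standby"
    return "multi-az + cross-region snapshot copy"
```

For example, a database with a 15-minute RTO would be classified for a cross-region hot standby, while an archive store with a 24-hour RTO and RPO would fall back to cross-region snapshot copies.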

Managed vs Self-Managed

Many of the technology platforms used by SaaS solutions are available as Managed Services or as Boxed/containerized services. Examples of Managed Service offerings are DocumentDB (MongoDB-compatible), ElastiCache (Redis), and EKS on EC2 (Kubernetes). Examples of Boxed/Containerized Service offerings are S3 (file/object store), Confluent Kafka, and EKS on Fargate (Kubernetes).

The decision of using Managed/Boxed services vs self-managed on IaaS is typically based on the following:

  1. Built-in High availability and auto-recovery
  2. Data durability, Backup & Cross-region copy/sync
  3. Readily available tooling for Data Sync / Data Capture
  4. API-based provisioning and scaling
  5. Freeing human resources from non-core activities, weighed against the fine-grained control needed to fine-tune these platforms

Approach – Managed/boxed services take care of the provisioning, scaling, and administration of these platforms. Their cost is weighed against historical challenges, future demand, and related risk concerns.

Sizing/Capacity

The base capacity and unit scaling capacity in terms of computation, storage, and throughput for each individual component deployed must be determined and understood because they will influence the following:

  1. User Experience
  2. Reliability/SLO
  3. Scalability
  4. Cost

Application design should focus on standardizing component performance for a specific size, validated through performance benchmarking exercises. Equal attention should go to building a throttling or bursting mechanism into the operating model, as a buffer for needs beyond the entitlement, so that the user experience stays smooth.

Approach – Component sizing should initially be based on previous sizing experience for that component, then validated through benchmarking exercises, with sufficient buffers to handle burst needs.
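
A token bucket is one common way to combine throttling with a burst buffer. The sketch below is an assumption about how such a mechanism could look, not the mechanism used in this solution; the class name, the parameters `rate_per_sec` and `burst`, and the injectable `clock` are all illustrative.

```python
import time


class TokenBucket:
    """Allow a steady request rate plus a bounded burst above it."""

    def __init__(self, rate_per_sec: float, burst: float, clock=time.monotonic):
        self.rate = rate_per_sec   # sustained entitlement, in requests per second
        self.capacity = burst      # extra headroom for short bursts
        self.tokens = burst        # start with a full bucket
        self.clock = clock         # injectable so the bucket can be tested
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True if the request fits the entitlement, else False."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A tenant entitled to 2 requests per second with a burst of 3 could be represented as `TokenBucket(2.0, 3.0)`. Requests for which `allow()` returns False would be throttled or queued, depending on the operating model.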

Cost Attribution

To keep track of costs, it is crucial to assign costs to both dedicated and shared components. Shared component costs may first be allocated across the maximum number of customers anticipated for the base capacity, and then across actual customers. Cost attribution can be accomplished via:

  1. Tagging the deployed assets and tracking via the AWS Cost and Usage Report or Cost Explorer
  2. Tracking usage with open solutions like Kubecost for specific customers at the namespace, pod, or service level, for Dedicated components deployed on Pooled resources such as an EKS Pooled Cluster

Approach – Customer-level cost attribution will be deployed for Dedicated components running on Pooled resources, alongside the allocated cost of the Shared resources.
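
The arithmetic behind such an attribution can be sketched as follows. This is an illustration under assumptions: splitting the shared cost in proportion to a usage measure is one possible allocation rule among several, and the function name, inputs, and figures are hypothetical.

```python
def attribute_costs(shared_cost: float,
                    usage_by_tenant: dict[str, float],
                    dedicated_by_tenant: dict[str, float]) -> dict[str, float]:
    """Split a shared cost by usage and add each tenant's dedicated cost.

    Assumption: usage is any comparable measure, such as CPU-hours per
    namespace; dedicated costs come from tagged resources.
    """
    total_usage = sum(usage_by_tenant.values())
    result = {}
    for tenant, usage in usage_by_tenant.items():
        if total_usage > 0:
            share = shared_cost * usage / total_usage
        else:
            # No recorded usage: fall back to an even split.
            share = shared_cost / len(usage_by_tenant)
        result[tenant] = round(share + dedicated_by_tenant.get(tenant, 0.0), 2)
    return result
```

With a shared cost of 1000, usage of 300 and 100 units for two tenants, and a dedicated cost of 50 for the first tenant, the tenants are attributed 800 and 250 respectively.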

Cost Optimization

To maintain costs within budget, cost optimization for both dedicated and shared components is crucial. Cost Optimization is possible through:

  1. Using Spot Instances for stateless components
  2. Using a shared account, shared compute, shared data, and shared channels wherever possible
  3. Scaling down instances and sizing wherever possible

Approach – Arrive at a tier-based cost optimization approach by reducing the cost of non-critical components wherever applicable.

Scaling for Growth

Each component must be able to scale with business expansion, even though it is initially sized for base capacity to start small. To scale within the service constraints of the single-account method, both vertical and horizontal growth are required at the cluster level, EC2 level, and pod level.

  1. Autoscaling for EKS on EC2 with Horizontal Pod scaling
  2. Additional EKS cluster for Pooled resources with App Mesh (Service Mesh) for cross-cluster service discovery and routing
  3. Scalable container instances leveraging Fargate nodes
  4. Node additions to the DocumentDB and Kafka clusters for scaling
  5. Highly scalable and durable S3 as File Store for inbound files
  6. Other scalable boxed/containerized services, including ALB

Approach – Every component is architected to be horizontally scalable considering the service limits of a single account.
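
For the pod-level part of this, the Kubernetes Horizontal Pod Autoscaler documents its core rule as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below reproduces that rule for illustration only; the default bounds and the 10% tolerance are assumptions set here, and the real controller applies further behavior (stabilization windows, readiness handling) that is not modeled.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Compute a replica count from the documented HPA scaling ratio."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For instance, 4 pods running at 90% CPU against a 60% target would scale to 6 pods, while the same pods at 62% would stay at 4 because the ratio is within the tolerance.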

Tiers

SaaS offerings typically come in several subscription tiers. Each tier offers a different set of entitlements and pricing, so customers can subscribe to the one that matches their needs. This helps set expectations of value versus cost for each tier.

From an architectural standpoint, it’s crucial to make sure that the scaling requirements and cost projections for each tier required to deliver the value and user experience are well specified. Finding the variables that affect the entitlements is necessary for this.

  1. Storage, in terms of the maximum storage for the tier
  2. Compute, in terms of maximum throughput, such as transactions processed per second
  3. A scalability, buffer zone, and throttling strategy, driven by business requirements, to manage customer expectations
  4. Standardization of features, which is important for mapping sizing to performance expectations

It is necessary to provide both internal management and SaaS customer admin users with sufficient metric visibility of consumption from the perspective of Tier entitlement and subscription usage. Application management and metrics layers should implement alerts for Tier upgrades.

Approach – For any component to be included in the Tier feature entitlements, it is necessary to identify the key technological and business aspects that affect pricing, performance, and value.
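
A tier-upgrade alert of the kind described above can be sketched as an entitlement check. Everything concrete here is hypothetical: the tier names, the storage and throughput limits, and the 80% alert ratio are placeholders, not values from any actual offering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    max_storage_gb: float
    max_tps: float  # maximum transactions per second


# Hypothetical tiers, ordered from smallest to largest entitlement.
TIERS = [
    Tier("basic", 50, 10),
    Tier("standard", 500, 100),
    Tier("premium", 5000, 1000),
]
UPGRADE_ALERT_RATIO = 0.8  # alert once 80% of any entitlement is used


def utilization(tier: Tier, storage_gb: float, peak_tps: float) -> float:
    """Return the highest usage-to-entitlement ratio across dimensions."""
    return max(storage_gb / tier.max_storage_gb, peak_tps / tier.max_tps)


def recommend_tier(current: Tier, storage_gb: float, peak_tps: float) -> Optional[Tier]:
    """Return the tier to suggest, or None if no alert is needed."""
    if utilization(current, storage_gb, peak_tps) < UPGRADE_ALERT_RATIO:
        return None
    # Pick the smallest tier that leaves headroom below the alert ratio.
    for tier in TIERS:
        if utilization(tier, storage_gb, peak_tps) < UPGRADE_ALERT_RATIO:
            return tier
    return TIERS[-1]
```

A customer on the basic tier using 45 GB of storage would trigger a recommendation for the standard tier, while one using 10 GB would trigger nothing.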

Metrics Tracking

Both the Operations team and the Customer Administrator need the right instrumentation to understand the overall platform’s health and the health of its components, specifically their consumption relative to their entitlement for that tier.

This presents an opportunity to cross-sell additional features and upsell higher tiers. These metrics must be as close to real-time as possible and should make it possible to predict when an upgrade may be necessary. From a cloud standpoint, there are various alternatives:

  1. Cloud platform-provided metrics tracking with CloudWatch Insights
  2. Prometheus, Grafana, and X-Ray for Kubernetes pod- and service-level metrics
  3. Elastic or PostgreSQL / MySQL for business metrics like transactions handled per second, storage utilized, campaign effectiveness, etc.

Approach – Cloud platform monitoring tools must be used for components at the infrastructure level. In addition to the metrics offered by the platform, open-source instrumentation must be used for Kubernetes or NoSQL services. Application logic will implement business-level metrics.
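
Predicting when a tenant will outgrow its entitlement can start with something as simple as a linear trend over daily consumption. The sketch below is an assumed approach for illustration, not a feature of any of the tools listed; the function name and inputs are hypothetical, and it fits an ordinary least-squares slope to evenly spaced daily readings.

```python
from typing import Optional


def days_until_limit(daily_usage: list[float], limit: float) -> Optional[float]:
    """Estimate days until usage reaches the limit from a linear trend.

    Returns None when there are too few points or usage is not growing.
    """
    n = len(daily_usage)
    if n < 2:
        return None
    # Least-squares slope with day indices 0..n-1 as the x values.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None
    # Project forward from the latest observed value.
    remaining = limit - daily_usage[-1]
    return max(0.0, remaining / slope)
```

Given daily storage readings of 10, 20, 30, and 40 GB against a 100 GB entitlement, the estimate is 6 days, which could feed an upsell alert for the Customer Administrator.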

Through this SaaS Factory methodology, 1CloudHub enables various B2C digital products on the cloud with a safe, scalable, and economical architecture and design. SaaS has no established blueprint. It is a business strategy rather than a technical implementation; however, the underlying technology should support competitiveness, growth, and innovation. Reach out to us for all things cloud and let us help you become SaaS-ready today.

If you have any questions or suggestions, please reach out to us at contactus@1cloudhub.com

Written by:  Umashankar N
