Disaster Recovery


Overview

To increase the resilience of Automation Service and maintain service continuity during an outage or disaster, Bizagi offers a disaster recovery service. The service provides two options: replicating only the database, or replicating the complete set of resources. It is charged in addition to the Automation Service fees.

Automation Service is provisioned at an isolated primary site whose geographic location is chosen according to your requirements (for example, to meet local regulations or performance preferences).

When you purchase the disaster recovery service, Bizagi provides a secondary site (called a recovery site) to be used if the primary site becomes inoperative due to a disaster. Bizagi provisions the recovery site in a paired region that is at least 300 miles away from the primary site.

 

Note: This service is available for the Production environment only.

 

Option 1: Full replica

Both the primary site and the recovery site have a full deployment. This deployment includes the Automation Service and a synchronized database. The recovery site mirrors the network, security, storage, and other resources and configurations of the primary site.

The primary site actively handles end-user requests. The recovery site becomes active only when the primary site experiences a service disruption and a disaster has been declared. In that case, all new user requests are routed to the recovery site.

 

Note: If you have a VPN configured, you need to configure a second, additional VPN. This is done during onboarding.

 

This approach provides a lower RTO than the database-only option: failover occurs faster because the application and services are already deployed. This service has the following characteristics:

 

Database Recovery Point Objective (RPO)

The RPO is the point in time to which the database data can be recovered, that is, the maximum window of recent data that may be lost. In the Bizagi full replica disaster recovery option, the database RPO is 5 minutes.

 

File Storage Recovery Point Objective (RPO)

The RPO is the point in time to which the file storage data can be recovered. In the Bizagi full replica disaster recovery option, the RPO for files is less than 15 minutes. It differs from the database RPO because files are stored in a Storage Account.

 

Note: It is important to understand the difference between the database and file storage. The database contains all the information in relational tables, whereas files are stored in a Storage Account to increase performance. See the Automation Service Architecture.

 

Complete Environment Recovery Time Objective (RTO)

The RTO is the time within which your entire Automation Service environment is restored after an outage. The RTO is 3 hours.
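To make the three objectives concrete, the following minimal sketch computes worst-case recovery boundaries for a hypothetical outage. The function and constant names are illustrative only and are not part of any Bizagi API. The sketch also assumes that the RTO is counted from the start of the outage, which this page does not state explicitly.

```python
from datetime import datetime, timedelta

# Objectives quoted in this section for the full replica option.
DATABASE_RPO = timedelta(minutes=5)
FILE_STORAGE_RPO = timedelta(minutes=15)  # documented as "less than 15 minutes"
ENVIRONMENT_RTO = timedelta(hours=3)


def recovery_windows(outage_at: datetime) -> dict:
    """Return worst-case recovery boundaries for an outage starting at outage_at.

    Assumption: the RTO clock starts at the outage itself.
    """
    return {
        # Database changes committed before this instant are expected to survive.
        "database_recovered_up_to": outage_at - DATABASE_RPO,
        # Files stored before this instant are expected to survive.
        "files_recovered_up_to": outage_at - FILE_STORAGE_RPO,
        # Target time by which the environment should be running again.
        "service_restored_by": outage_at + ENVIRONMENT_RTO,
    }


# Hypothetical outage time, chosen only for illustration.
windows = recovery_windows(datetime(2024, 11, 29, 8, 30))
for name, moment in windows.items():
    print(f"{name}: {moment:%Y-%m-%d %H:%M}")
```

In this illustration, database changes from the last 5 minutes and files from the last 15 minutes before the outage are the data at risk, and the service is targeted to be back within 3 hours.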

 

[Figure: full replica disaster recovery diagram (DisasterRecoveryPlan2)]

 

In case of a major disruption, Bizagi declares a disaster and activates the disaster recovery plan. The recovery site then temporarily becomes active and takes over the operation in a secure, isolated, and reliable way, working against the replicated database. After the disaster event has been resolved, Bizagi decides when to initiate the failback plan to run the service in the original primary site again. During failover, users may experience a slight, temporary impact on performance; the recovery site runs at the same Performance Level as the primary site.

 

 

Option 2: Database-only

The environment is fully deployed in the primary region only, and the database is kept synchronized between the primary and secondary sites. In case of a disaster, Automation Service is activated in the secondary site and connected to the standby database. All requests are then routed to the new site, which uses the replicated database. With this approach, you avoid the overhead and time that a database restore operation requires, because the database is already up and running. The database-only disaster recovery option has the following characteristics:

 

Database Recovery Point Objective (RPO)

The RPO is the point in time to which the database data can be recovered, that is, the maximum window of recent data that may be lost. In the Bizagi database-only disaster recovery option, the database RPO is 5 minutes.

 

File Storage Recovery Point Objective (RPO)

The RPO is the point in time to which the file storage data (files attached to tasks or cases) can be recovered. In the Bizagi database-only disaster recovery option, the RPO for files is less than 15 minutes. It differs from the database RPO because files are stored in a Storage Account.

 

Note: It is important to understand the difference between the database and file storage. The database contains all the information in relational tables, whereas files are stored in a Storage Account to increase performance. See the Automation Service Architecture.

 

Complete Environment Recovery Time Objective (RTO)

The RTO is the time within which your entire Automation Service environment is restored after an outage. The RTO is 18 hours.
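The two options share the same RPO values and differ in RTO (3 hours for the full replica, 18 hours for database-only). As a rough sketch, the comparison can be expressed as data and checked against a downtime tolerance. The class, list, and function names below are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrOption:
    name: str
    database_rpo_minutes: int
    file_rpo_minutes: int  # upper bound; documented as "less than 15 minutes"
    rto_hours: int


# Values taken from the Option 1 and Option 2 sections of this page.
OPTIONS = [
    DrOption("Full replica", 5, 15, 3),
    DrOption("Database-only", 5, 15, 18),
]


def options_meeting_rto(max_downtime_hours: float) -> list:
    """Return the names of the options whose RTO fits the given tolerance."""
    return [o.name for o in OPTIONS if o.rto_hours <= max_downtime_hours]


print(options_meeting_rto(4))   # ['Full replica']
print(options_meeting_rto(24))  # ['Full replica', 'Database-only']
print(options_meeting_rto(1))   # []
```

A business that can tolerate up to 4 hours of downtime would need the full replica option, while a tolerance of a full day is met by either option.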

 

[Figure: database-only disaster recovery diagram (DisasterRecoveryPlan1)]

 

 

Disaster Response Scenarios

Scenario 1: Customers with Disaster Recovery (DR) Purchased

For customers who have purchased Bizagi's Disaster Recovery (DR) service, we provide an enhanced level of protection and rapid recovery in the event of a disaster.

1. Immediate Response:

o Incident Detection: Upon detecting a service disruption, our monitoring systems alert our DR operations team immediately.

o Assessment: The team quickly assesses the situation to confirm the disaster and determine the impact on services.

 

2. Customer Notification:

o Declaration of Disaster: We declare a disaster within 30 minutes to 1 hour after confirmation.

o Communication: Customers are notified via email within the following 4 hours, with updates and expected timeframes for resolution.

 

3. Activation of DR Plan:

o Failover to Secondary Region: Services are switched to a secondary recovery region, ensuring minimal downtime.

o Service Continuity: We aim to meet the specific Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) as outlined in the DR agreement, ensuring minimal data loss and quick restoration of services.

 

4. Ongoing Updates:

o Status Reports: Regular updates are provided to customers via email regarding the recovery progress and expected restoration times.

 

5. Post-Restoration:

o Confirmation: Customers are notified as soon as services are fully restored.

 

Scenario 2: Customers Without Disaster Recovery (Relying on High Availability)

For customers who have not opted for our Disaster Recovery (DR) service, we offer a standard Service Level Agreement (SLA) with High Availability (HA) of 99.95%. You can raise this percentage: if you consume more than 500 BPUs per month, you have a 99.99% SLA. If you consume less, you can acquire an enhanced availability service, at an additional cost, to increase your SLA to 99.99% in the Production environment. High Availability ensures business continuity within a single Azure region. In the unlikely event of a disaster, our procedures are designed to restore services as swiftly as possible once Azure has resolved the issue.
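To show what the two SLA levels mean in practice, the sketch below applies the tier rule described above and converts each percentage into a downtime allowance. The function names are hypothetical. The 30-day measurement period is an assumption, because this page does not state how the SLA period or downtime is measured.

```python
from decimal import Decimal

BPU_THRESHOLD = 500  # monthly BPUs above which the 99.99% SLA applies


def sla_percent(monthly_bpus: int, enhanced_availability: bool = False) -> Decimal:
    """Return the SLA percentage according to the rule described above."""
    if monthly_bpus > BPU_THRESHOLD or enhanced_availability:
        return Decimal("99.99")
    return Decimal("99.95")


def allowed_downtime_minutes(sla: Decimal, days: int = 30) -> Decimal:
    """Downtime allowance for a period of `days` days (assumed period)."""
    minutes_in_period = Decimal(days * 24 * 60)
    return (Decimal(100) - sla) / Decimal(100) * minutes_in_period


standard = allowed_downtime_minutes(sla_percent(200))
enhanced = allowed_downtime_minutes(sla_percent(800))
print(f"99.95% SLA: {standard:.2f} minutes per 30 days")  # 21.60
print(f"99.99% SLA: {enhanced:.2f} minutes per 30 days")  # 4.32
```

Under these assumptions, moving from 99.95% to 99.99% reduces the allowance from about 21.6 minutes to about 4.3 minutes per 30-day period.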

1. Initial Response:

o Incident Detection: Our monitoring systems alert our operations team to any service disruption.

o Assessment: The team assesses the situation to confirm whether it is a disaster affecting the Azure region or datacenter.

 

2. Customer Notification:

o Declaration of Disaster: We declare a disaster within 30 minutes to 1 hour after confirmation.

o Communication: Customers are notified via email within the following 6 hours, with updates and expected timeframes for resolution.

 

3. Coordination with Azure:

o Monitoring: We work closely with Azure to monitor their restoration efforts.

o Service Restoration: Once Azure restores services in the affected region, our team makes every effort to restore our services as quickly as possible.

o Restoration Time: As there is no defined Recovery Time Objective (RTO) or Recovery Point Objective (RPO) for customers without DR, the restoration time depends on Azure's resolution time and the subsequent Bizagi recovery efforts.

o Data Restoration: Bizagi manages backups to minimize interruptions to normal operations, reduce the overall impact of unexpected service interruptions, and minimize data loss in case of a disaster. Backups are geo-replicated. In case of any failure, Bizagi can restore the database to the nearest restore point available in the database backups.

 

4. Post-Restoration:

o Confirmation of Service Restoration: Customers are notified as soon as services are fully restored.
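The communication windows in the two scenarios can be sketched as a small timeline calculation. The names below are hypothetical. The sketch assumes that the email notification window (4 hours with DR, 6 hours without) starts at the disaster declaration, and it uses the latest allowed declaration time (1 hour after confirmation) to obtain a worst-case deadline.

```python
from datetime import datetime, timedelta

# A disaster is declared within 30 minutes to 1 hour after confirmation.
LATEST_DECLARATION = timedelta(hours=1)

# Email notification windows after the declaration (assumed starting point).
NOTIFICATION_WITH_DR = timedelta(hours=4)
NOTIFICATION_WITHOUT_DR = timedelta(hours=6)


def communication_deadlines(confirmed_at: datetime, has_dr: bool) -> dict:
    """Return worst-case declaration and notification deadlines."""
    declared_by = confirmed_at + LATEST_DECLARATION
    window = NOTIFICATION_WITH_DR if has_dr else NOTIFICATION_WITHOUT_DR
    return {
        "declared_by": declared_by,
        "notified_by": declared_by + window,
    }


# Hypothetical confirmation time, chosen only for illustration.
confirmed = datetime(2024, 11, 29, 8, 0)
with_dr = communication_deadlines(confirmed, has_dr=True)
without_dr = communication_deadlines(confirmed, has_dr=False)
print(f"With DR, notified by:    {with_dr['notified_by']:%H:%M}")     # 13:00
print(f"Without DR, notified by: {without_dr['notified_by']:%H:%M}")  # 15:00
```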

 

Summary

With DR Purchased:

Enhanced protection with a secondary recovery region.

Minimal downtime with specific RTO and RPO.

Prompt failover and continuous updates.

 

Without DR (High Availability):

Dependent on Azure's restoration.

No specific RTO or RPO.

Efforts to restore services as quickly as possible post-Azure recovery.


Last Updated 11/29/2024 8:28:53 AM