# Crunchy Operator (pgBackRest)

The PGO Crunchy Operator uses pgBackRest to manage backups.

[pgBackRest](https://pgbackrest.org/) is a backup and restore solution for PostgreSQL databases. Its features include parallel backup and restore, compression, full, differential, and incremental backups, backup rotation and archive expiration, and backup integrity checks. It supports multiple repositories, which can be local, remote over TLS/SSH, or hosted in cloud storage such as S3, GCS, or Azure.

### Backup configuration

Backup configuration is done through the [`spec.backups.pgbackrest`](https://access.crunchydata.com/documentation/postgres-operator/5.3.1/references/crd/#postgresclusterspecbackupspgbackrest) parameter. See the example below.

{% code title="aidboxdb.yaml" %}
```yaml
spec:
  backups:
    pgbackrest:
      repos:
        - name: repo1 # repo
          schedules:
            full: "0 1 * * 0" # Full backup once a week on Sunday at 1 AM
            incremental: "0 1 * * 1-6" # incremental - from Monday to Saturday at 1 AM
          gcs:
            bucket: "" # GCS bucket name
      configuration:
        - secret:
            name: pgo-gcs-creds # GCS credentials
        - configMap:
            name: pgbackrest-config # pgbackrest config
      global:
        repo1-path: /backup/aidboxdb # Backup path in bucket
        repo1-retention-full-type: time # Retention policy
        repo1-retention-full: "30" # Delete backups after 30 days
      manual:
        repoName: repo1
        options: # Manual backup configuration
          - '--type=full'
          - '--compress-level=6'
          - '--start-fast=y'
          - '--process-max=20'
          - '--log-level-console=info'
```
{% endcode %}

Then create the additional Secret and ConfigMap:

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: pgo-gcs-creds
  namespace: aidboxdb-db
stringData:
  gcs.conf: |-
    [global]
    repo1-gcs-key=/etc/pgbackrest/conf.d/gcs-key.json
  gcs-key.json: |-
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: pgbackrest-config
  namespace: aidboxdb-db
data:
  db.conf: |-
    [global]
    compress-level=6
    start-fast=y
    process-max=20
```

#### Repositories

`repos:` - Defines a pgBackRest repository.
This lets you configure where and how your backups and WAL archives are stored. You can keep backups in up to four (4) repositories at the same time. Four storage types are supported (see the full [Backup Configuration](https://access.crunchydata.com/documentation/postgres-operator/5.3.1/tutorial/backups/) instructions):

* `azure` - For use with Azure Blob Storage.
* `gcs` - For use with Google Cloud Storage (GCS).
* `s3` - For use with Amazon S3 or any S3-compatible storage system such as MinIO.
* `volume` - For use with a Kubernetes Persistent Volume.

GCS configuration example:

1. Specify the GCS bucket and the secret with credentials

   ```yaml
   spec:
     backups:
       pgbackrest:
         repos:
           - name: repo1
             gcs:
               bucket: ""
         configuration:
           - secret:
               name: pgo-gcs-creds
   ```
2. Create a `secret` with the GCS connection credentials

   ```yaml
   apiVersion: v1
   kind: Secret
   metadata:
     name: pgo-gcs-creds
     namespace: aidboxdb-db
   stringData:
     gcs.conf: |-
       [global]
       repo1-gcs-key=/etc/pgbackrest/conf.d/gcs-key.json
     gcs-key.json: |-
   ```

#### Schedule

This spec schedules an incremental backup from Monday to Saturday and one full backup every Sunday, all at 1 AM:

```yaml
spec:
  backups:
    pgbackrest:
      repos:
        - name: repo1
          schedules:
            full: "0 1 * * 0" # Full backup once a week on Sunday at 1 AM
            incremental: "0 1 * * 1-6" # incremental - from Monday to Saturday at 1 AM
```

#### Backup retention

Define the backup retention policy. In this spec, backups are kept for 30 days and deleted after that period:

```yaml
spec:
  backups:
    pgbackrest:
      global:
        repo1-path: /backup/aidboxdb # Backup path in bucket
        repo1-retention-full-type: time # Retention policy
        repo1-retention-full: "30" # Delete backups after 30 days
```

### Create backup

Sometimes you need to take a one-off backup, for example before making significant changes or updating an application.
To do so, first configure the `spec.backups.pgbackrest.manual` section, which sets the type of backup and any additional pgBackRest options:

```yaml
spec:
  backups:
    pgbackrest:
      manual:
        repoName: repo1
        options: # Manual backup configuration
          - '--type=full' # Take a full backup
          - '--compress-level=6' # gzip compression level
          - '--start-fast=y' # Do not wait for the next regular checkpoint
          - '--process-max=20' # Max processes to use for compression and transfer
```

To trigger a manual backup, annotate the `postgrescluster` resource with the `postgres-operator.crunchydata.com/pgbackrest-backup` annotation:

```bash
$ kubectl annotate -n aidboxdb-db postgrescluster aidboxdb --overwrite \
    postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
```

### Recovery

Sometimes you need to recover your database or clone your production database to the stage environment. There are two types of recovery: cloning an existing cluster into another environment, and point-in-time recovery (PITR), which restores the database to its state at a specific moment.

#### Clone

To create a clone of an existing PG cluster, specify the `dataSource` parameter for the new cluster. The sample below creates a `stage` cluster as a copy of the `aidboxdb` cluster in the `aidboxdb-db` namespace.

```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: stage
  namespace: stage
spec:
  dataSource:
    postgresCluster:
      clusterName: aidboxdb
      repoName: repo1
      clusterNamespace: aidboxdb-db
  image: healthsamurai/aidboxdb:15.2.0-crunchy
  postgresVersion: 15
  instances:
    - dataVolumeClaimSpec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
  backups:
    pgbackrest:
      repos:
        - name: repo1
          volume:
            volumeClaimSpec:
              accessModes:
                - "ReadWriteOnce"
              resources:
                requests:
                  storage: 1Gi
```

#### PITR

To recover to a specific point in time, add recovery options to the new cluster configuration.
```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: stage-pitr
  namespace: stage-pitr
spec:
  dataSource:
    postgresCluster:
      clusterName: aidboxdb
      repoName: repo1
      clusterNamespace: aidboxdb-db
      options:
        - --type=time
        - --target="2023-04-09 10:00:00-04"
  image: healthsamurai/aidboxdb:15.2.0-crunchy
  postgresVersion: 15
  instances:
    - dataVolumeClaimSpec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
  backups:
    pgbackrest:
      repos:
        - name: repo1
          volume:
            volumeClaimSpec:
              accessModes:
                - "ReadWriteOnce"
              resources:
                requests:
                  storage: 1Gi
```

Note the `dataSource` section: this is where you specify the recovery type and target.

```yaml
spec:
  dataSource:
    postgresCluster:
      clusterName: aidboxdb
      clusterNamespace: aidboxdb-db
      repoName: repo1
      options:
        - --type=time
        - --target="2023-04-09 10:00:00-04"
```

### Inspect backup

You can list backups by running the `pgbackrest info` command directly in the database pod:

```bash
$ export NS=aidboxdb-db
$ kubectl exec -n $NS \
    $(kubectl get pod -n $NS -l "postgres-operator.crunchydata.com/data=postgres" -o jsonpath='{.items[0].metadata.name}') \
    -- bash -c 'pgbackrest info'
```

To verify existing backups, run the `pgbackrest verify` command:

```bash
$ export NS=aidboxdb-db
$ kubectl exec -n $NS \
    $(kubectl get pod -n $NS -l "postgres-operator.crunchydata.com/data=postgres" -o jsonpath='{.items[0].metadata.name}') \
    -- bash -c 'pgbackrest --stanza=db --log-level-console=info verify'
```
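Both inspection commands repeat the same pod lookup, so it can be convenient to wrap it in a small shell function. The following is a minimal sketch, not part of PGO or pgBackRest: the function name `pgbackrest_exec` and the `SELECTOR` variable are hypothetical, while the namespace and label selector are taken from the examples above. It assumes `kubectl` is already configured for the target cluster.

```bash
#!/usr/bin/env bash
# Hypothetical helper (illustrative name, not provided by PGO or pgBackRest):
# runs a pgBackRest command inside the first pod that matches the data label.

# Namespace and label selector from the examples above; override NS if needed.
NS="${NS:-aidboxdb-db}"
SELECTOR="postgres-operator.crunchydata.com/data=postgres"

pgbackrest_exec() {
  local pod
  # Resolve the name of the first pod that carries the Postgres data label.
  pod="$(kubectl get pod -n "$NS" -l "$SELECTOR" \
    -o jsonpath='{.items[0].metadata.name}')" || return 1
  if [ -z "$pod" ]; then
    echo "no pod matches $SELECTOR in namespace $NS" >&2
    return 1
  fi
  # Pass all arguments through to pgbackrest unchanged.
  kubectl exec -n "$NS" "$pod" -- pgbackrest "$@"
}
```

After sourcing the function, `pgbackrest_exec info` lists backups and `pgbackrest_exec --stanza=db --log-level-console=info verify` verifies them, equivalent to the two commands above.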