Sunday, July 8, 2018

Deploying Data as Code (Delphix + Terraform + Amazon RDS)

Architecture diagram from Delphix.

Last year, Delphix blogged about how the Dynamic Data Platform can be leveraged with Amazon RDS (link here). Subsequently, they released a knowledge base article outlining how to implement the solution (link here).

I thought I would take the work I have been doing on a Terraform plugin and create a set of blueprints that could easily deploy a working example of the scenario. I also took that a step further and created some Docker containers that package up all of the requirements to make this as simple as possible.

This demonstration requires the Delphix Dynamic Data Platform and Oracle 11g. You will need to be licensed to use both.

TL;DR

  1. Build the Packer template delphix-centos7-rds.json via the instructions found here:  https://github.com/delphix/packer-templates
  2. Via Terraform, build the blueprints found here:  https://github.com/delphix/delphix-terraform-blueprints-rds
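
In shell terms, those two steps boil down to roughly the following (a sketch only; the exact commands, variables, and prerequisites are documented in the linked repos):

    # 1. Build the AMI from the Packer template
    git clone https://github.com/delphix/packer-templates.git
    cd packer-templates
    packer build delphix-centos7-rds.json

    # 2. Apply the Terraform blueprints
    git clone https://github.com/delphix/delphix-terraform-blueprints-rds.git
    cd delphix-terraform-blueprints-rds
    terraform init && terraform apply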

Walk Through

This example automates the deployment of the solution described in KBA1671.

This example requires that you possess the proper privileges in an AWS account, access to Oracle 11g software, and access to version 5.2 of the Delphix Dynamic Data Platform in AWS.

Consult https://github.com/delphix/delphix-terraform-blueprints-rds for details on the prerequisites.

Automation products used:
  • Ansible
  • Packer
  • Terraform
  • Docker
  • Delphix Dynamic Data Platform

Building the AMI

In this example, we will use a simple configuration with Oracle 11g as the backend.
We will first create an Amazon Machine Image (AMI) that is configured with Oracle 11g and ready to use with Delphix.
We will build the image using a Docker container running Packer and Ansible.

We will follow the instructions here to build the delphix-centos7-rds.json template.

We will be using the cloudsurgeon/packer-ansible Docker container to build our AMI.
See the full description on https://hub.docker.com/r/cloudsurgeon/packer-ansible for usage details.


  1. First we clone the repo, then navigate into the directory.
  2. Next, copy the .example.docker file to .environment.env and edit the values to reflect our environment.
  3. Now we run the docker container against the delphix-centos7-rds.json template to create our AMI (the assembled command is shown after this list).
    Details of the command:
    • docker run – invoking docker to run a specified container
    • --env-file .environment.env – passing in a file that will be instantiated as environment variables inside the container
    • -v $(pwd):/build – mount the current working directory to /build inside the container
    • -i – run the container in interactive mode
    • -t – allocate a pseudo-TTY
    • cloudsurgeon/packer-ansible:latest – use the latest version of this image
  4. When the container starts, it will download the Ansible roles required to build the image.
  5. After downloading the Ansible roles, the container executes Packer to start provisioning the infrastructure in AWS to prepare and create the machine image. This process can take around 20 minutes to complete. 
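
Putting it all together, the build step looks something like this (a sketch; the keys in .environment.env and the trailing template argument are assumptions, so consult the repo README and the image's Docker Hub page for the exact invocation):

    # .environment.env (hypothetical keys and values)
    AWS_ACCESS_KEY_ID=AKIA...
    AWS_SECRET_ACCESS_KEY=...
    AWS_DEFAULT_REGION=us-east-1

    # Build the AMI; passing the template filename as the final argument is an assumption
    docker run --env-file .environment.env -v $(pwd):/build -i -t cloudsurgeon/packer-ansible:latest delphix-centos7-rds.json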

Build the Demo Environment with Terraform


Now that we have a compatible image, we can build the demo environment.

  1. First we clone the repo, then navigate into the directory.
  2. Next, copy the .example.docker file to .environment.env and edit the values to reflect our environment.
  3. See the Configuring section of the README for details on the variables.
  4. We will be using the cloudsurgeon/rds_demo docker container to deploy our demo environment. See the full description on https://hub.docker.com/r/cloudsurgeon/rds_demo for usage details.
  5. Run the rds_demo container to initialize the directory.
    docker run --env-file .environment.env -i -t -v $(pwd):/app/ -w /app/ cloudsurgeon/rds_demo init
  6. Run the rds_demo container to build out the environment (the assembled command is shown after this list).
    Details of the command:
    • docker run – invoking docker to run a specified container
    • --env-file .environment.env – passing in a file that will be instantiated as environment variables inside the container
    • -v $(pwd):/app – mount the current working directory to /app inside the container
    • -w /app/ – use /app as the working directory inside the container
    • -i – run the container in interactive mode
    • -t – allocate a pseudo-TTY
    • cloudsurgeon/rds_demo:latest – use the latest version of this image
    • apply --auto-approve – pass the apply command along to Terraform and automatically approve the changes (avoids typing yes a few times)
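
Assembled from the flags above, the build command is the same as the init command from step 5, with the Terraform arguments swapped in:

    docker run --env-file .environment.env -i -t -v $(pwd):/app/ -w /app/ cloudsurgeon/rds_demo:latest apply --auto-approve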

This repo is actually a set of three Terraform blueprints that build sequentially on top of each other, due to dependencies.
The sequence of automation is as follows:
Phase 1 - Build the networking, security rules, servers, and RDS instance. This phase will take around 15 minutes to complete, due to the time it takes AWS to create a new RDS instance.

Phase 2 - Configure DMS and Delphix, and start the DMS replication task.
Phase 3 - Create the Virtual Database copy of the RDS data source.
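
Each phase can also be targeted individually by passing its name ahead of the Terraform arguments, a pattern we will use with phase_3 later in this walkthrough. For example (hypothetical; extrapolated from the phase_3 usage shown below):

    # Re-apply only the first blueprint
    docker run --env-file .environment.env -i -t -v $(pwd):/app/ -w /app/ cloudsurgeon/rds_demo phase_1 apply --auto-approve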

Using the Demo

Once phase_3 is complete, the screen will present two links. One is to the Delphix Dynamic Data Platform; the other is to the application portal you just provisioned.

  1. Click the “Launch RDS Source Instance” button. The RDS Source Instance will open in a new browser tab.
  2. Add someone, like yourself, as a new employee to the application.
  3. Once your new record is added, go back to the application portal and launch the RDS Replica Instance.
  4. You are now viewing a read-only replica of our application data. The replica is a data pod running on the Delphix Dynamic Data platform. The data is being sync’d automatically from our source instance in RDS via Amazon DMS.
  5. Go back to the application portal and launch the Dev Instance.
    The backend for the Dev Instance is also a data pod running on the Delphix Dynamic Data Platform.
    It is a copy of the RDS replica data pod.
    Notice we don’t see our new record.
    That is because we provisioned this copy before we entered our new data.
    If we want to bring in the new data, we simply need to refresh our Dev data pod.
    While we could easily do that using the Dynamic Data Platform web interface, let’s do it via Terraform instead.
  6. In the terminal, we will run our same docker command again, but with a slight difference at the end.
    This time, instead of apply --auto-approve, we will pass phase_3 destroy --auto-approve (the full command is shown after this list).
    Details of the new parts of the command:
    • phase_3 – apply these actions only to phase_3
    • destroy – destroy the assets
    • --auto-approve – assume ‘yes’

    Remember, phase_3 was just the creation of our virtual copy of the replica. By destroying phase_3, Terraform is instructing the DDP to destroy the virtual copy.
  7. If you log in to the DDP (username delphix_admin; the password is in your .environment.env file), you will see the dataset being deleted in the Actions pane.
  8. If you close and relaunch the Dev Instance from the application portal again, you will see that the backend database is no longer present.

  9. Now we run our Docker container again with the apply command, and it rebuilds phase_3.

  10. If you close and relaunch the Dev Instance from the application portal again, you will see that the backend database is present again and this time includes the latest data from our environment.
  11. When you are finished playing with your demo, you can destroy all of the assets you created with the following docker command:
    docker run --env-file .environment.env -i -t -v $(pwd):/app/ -w /app/ cloudsurgeon/rds_demo destroy --auto-approve
  12. It will take about 15-20 minutes to completely destroy everything.
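
For reference, the destroy-and-rebuild cycle from steps 6 and 9 assembles into the following pair of commands (identical to the earlier invocations except for the trailing Terraform arguments):

    # Step 6: tear down only the virtual copy (phase_3)
    docker run --env-file .environment.env -i -t -v $(pwd):/app/ -w /app/ cloudsurgeon/rds_demo phase_3 destroy --auto-approve

    # Step 9: re-apply the blueprints, which rebuilds phase_3
    docker run --env-file .environment.env -i -t -v $(pwd):/app/ -w /app/ cloudsurgeon/rds_demo apply --auto-approve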

Thursday, June 21, 2018

Creating a Test Data Catalog with Delphix

Test environment data is all over the place, slowing down your projects and injecting quality issues. It doesn’t have to be this way.



According to the TDM Strategy survey done by Infosys in 2015, up to 60% of application development and testing time is devoted to data-related tasks. That statistic is consistent with my personal experience with the app dev lifecycle, as well as my experience with the world’s largest financial institutions.

A huge contributor to the testing bottleneck is data friction. Incorporating people, process, and technology into DataOps practices is the only way to reduce data friction across organizations and to enable the rapid, automated, and secure management of data at scale.

For example, by leveraging the Delphix Dynamic Data Platform as a Test Data Catalog, I have seen several of my customers nearly double their test frequency while reducing data-related defects. The Test Data Catalog is a way of leveraging Delphix to transform manual, event-driven testing organizations into automated testing factories, where everyone in testing and dev, including the test data engineers, can leverage self-service to get the data they need and to securely share the data they produce.

Below you will find two videos I recorded to help illustrate and explain this concept. The first is an introduction that goes a little deeper into the problem space. In the second video, I demonstrate how to use Delphix as a Test Data Catalog.





Reach out to me on Twitter or LinkedIn with your questions or if you have suggestions for future videos.

Wednesday, June 20, 2018

Solving CI/CD (Continuously Interrupted/Continuously Disappointed)

Continuous — (adj.) forming an unbroken whole; without interruption.

Continuous Integration and Continuous Deployment are two popular practices that have yielded huge benefits for many companies across the globe. Yet, it’s all a lie.

Although the benefits are real, the idea behind CI/CD is largely aspirational for most companies and would more properly be titled “The Quest for CI/CD: A Not-So-Merry Tale.”

Because, let’s face it, there is still a lot of waiting in most CI/CD. To avoid false-advertising claims, perhaps we should just start adding quiet disclaimers with asterisks, like so: CI/CD**.

The waiting still comes from multiple parts of the process, but most frequently, teams are still waiting on data. Waiting for data provisioning. Waiting for data obfuscation. Waiting for access requests. Waiting for data backup. Waiting for data restore. Waiting for new data. Waiting for data subsets. Waiting for data availability windows. Waiting for Bob to get back from lunch. Even when devs just generate their own data on the fly, QA and Testing get stuck with the bill. (I am talking to three F100 companies right now where this last issue is the source of some extreme pain.)

I wish I could say that any one technology could solve all data issues (I have seven kids, and that fact alone would pay for their entire college fund). But I can say that Delphix solves some very real and very big data issues for some of the world’s biggest and best-known brands, through the power of DataOps. It allows organizations to leverage the best of people, process, and technology to eliminate data friction across all spectrums.

Here I share a video of how I tie Jenkins together with Delphix to provision, back up, restore, and share data in an automated, fast, and secure manner. This video explains how I demonstrated some of the functionality in my Delphix SDLC Toolchain demo.


**excluding those things that we obviously have to wait for.