Wednesday, July 16, 2014

It's Time to Ditch Your Pony Express


In the "Wild West" days of the early United States, there wasn't a quick, nor reliable, way to transport data from one place to another. You would have to send and receive data, via written messages. You would then have to send that message with people heading in that general direction, in hopes that it would get there months later. If that letter contained sensitive information, they would either have to write using coded wording, or hope that the numerous "carriers" would not let curiosity get the best of them. If some nefarious soul were determined to use forceful means to acquire your message, you could bet that the letter you sent with the Smith family bound for Tascaloosa was sure to be given up quickly. To solve these problems, a few entrepreneurial fellows created the Pony Express ("PE"). The PE employed mail carriers to specialize in the expedited delivery of messages (data) and to ensure secure delivery.



The formation of the PE was revolutionary, and the concept continued to be improved upon. The PE was replaced by the telegraph and the United States Post Office, which, along with private enterprise, gradually improved both the speed and the security of data delivery. Fast forward to today, when most of your data delivery is done electronically. Yet despite hundreds of years of innovation, the Pony Express has found its way back into your application delivery projects, data center consolidations, and disaster recovery solutions, slowing your projects, injecting defects, and sending costs skyrocketing.

Tell-tale signs that the Pony Express is alive and well in your Government Agency or Company:

  • When a new database copy or application environment needs to be made, you submit a request/ticket that passes through numerous hands, and you wait days, weeks, or months before it is delivered.
  • When the data in a database or application environment needs to be refreshed, you submit a request/ticket that passes through numerous hands, and you wait days, weeks, or months before it is delivered.
  • You rely on a person or group to manually ensure production data is secured via masking/obfuscation before it is delivered to non-production environments.
  • You subset data to accomplish any of the above in a reasonable amount of time or at a reasonable cost.
  • You chuckled and thought "I wish I could do that" or "I wish I could get it that fast" to any of the above.

If none of the above apply, then I wholeheartedly thank you for being a Delphix customer. If any of the above apply, then you are a victim of the 1800s.



Whether your data lives in a Microsoft SQL Server, Oracle, PostgreSQL, or Sybase database, or in files on Linux, Unix, or Windows operating systems, waiting on multiple people to deliver your data (when they get around to it) and trusting them to secure it is a centuries-old methodology. The Delphix Agile Data Platform can deliver full read/write autonomous copies of your data within minutes, in as little as 10% of the space. And not just gigabytes of data, but terabytes. (Is 5TB in 5 minutes fast enough for you?) If you choose, Delphix Agile Masking will also replace all of the sensitive information in each of those copies with realistic pseudo-data, so your copies help your projects move faster at reduced risk.
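To make the masking idea concrete, here is a minimal sketch in Python of the general technique (deterministic, format-preserving substitution). The column names and sample rows are hypothetical, and this is not Delphix's implementation; it only illustrates how sensitive values can be swapped for realistic pseudo-data, with the same input always masking to the same output so copies stay internally consistent.

    import hashlib

    # Hypothetical sample rows -- illustrative only, not real customer data.
    rows = [
        {"customer_id": 101, "name": "Jane Smith", "ssn": "123-45-6789"},
        {"customer_id": 102, "name": "John Doe",   "ssn": "987-65-4321"},
    ]

    FIRST_NAMES = ["Alex", "Casey", "Jordan", "Morgan", "Riley", "Taylor"]
    LAST_NAMES  = ["Baker", "Carter", "Ellis", "Hayes", "Monroe", "Quinn"]

    def pick(value, choices):
        # Deterministically map a real value to a fake one, so the same input
        # always masks to the same output (keeps lookups consistent across copies).
        digest = int(hashlib.sha256(value.encode()).hexdigest(), 16)
        return choices[digest % len(choices)]

    def mask_row(row):
        masked = dict(row)
        masked["name"] = "{} {}".format(
            pick(row["name"] + ":first", FIRST_NAMES),
            pick(row["name"] + ":last", LAST_NAMES),
        )
        # Preserve the SSN's format (###-##-####) but replace the digits.
        digits = "".join(
            c for c in hashlib.sha256(row["ssn"].encode()).hexdigest() if c.isdigit()
        ).ljust(9, "0")[:9]
        masked["ssn"] = "{}-{}-{}".format(digits[0:3], digits[3:5], digits[5:9])
        return masked

    for row in rows:
        print(mask_row(row))

Real masking products handle many more data types and enforce masking consistently across every copy, but the principle is the same: the masked copies look and behave like production without exposing production values.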


The Pony Express was an early step on the data agility journey, a journey that has recently taken a huge leap forward after roughly a decade of stunted innovation and "business as usual." It's time to put those ponies out to pasture.
For more information, check out the following links: 


Wednesday, July 9, 2014

Taming the Rapids Feeding Your Data Lake

First, here is a good little page that explains Hadoop Data Lakes at a high level.


To put that into an analogy...

Just as a real lake is fed by rivers and streams, a data lake is fed by data rivers and data streams (binaries, flat files, Sybase, Oracle, MSSQL, etc.). Is your data stream currently fast and large enough to handle your company or government organization's data flows?



With real rivers, when heavy rains fall or a waterway becomes choked (think "beaver dam"), the river can quickly overflow its banks and wreak considerable mayhem and damage on the surrounding ecosystem. The same thing happens with data. When data comes in faster than you can read, process, and analyze it, the surrounding environment can quickly become encumbered or disrupted (e.g., storage exhaustion, BI misinformation, application development delays, production outages). The same effects occur when constraints restrict your data flow (ticketing-system handoff delays between departments, inability to quickly refresh full data sets, cumbersome data rewind processes, etc.). And for every production data river, you will have 4 to 10 non-production tributaries (Dev(s), Test, Int, QA, Stage, UAT, Break/Fix, Training, BI, etc.). Time to build an ark.


The ebbs and flows of data are going to come, and they are often influenced by external factors beyond our control. The best you can do is be prepared and agile enough to adapt to the weather and adverse conditions. By virtualizing your non-production data sources with Delphix, you have "widened your banks" by 90%. You have enabled your engineers to develop "flood control systems" twice as fast, so your systems can quickly adapt to fast-evolving needs. And because they test with full virtual environments, not simulations or sample data, your engineers know exactly how your systems will behave when called upon in live applications. No more simply hoping the dam will hold. Hope is not a strategy.




And those data rivers aren't just there to look pretty and enhance your view. They are there because you are harnessing their power to benefit your company or government organization (e.g., leveraging CRM data), to irrigate other areas (e.g., feeding data warehouses), and to provide agility and mobility (e.g., leveraging business intelligence to react to market conditions).



Delphix supports the Hadoop ecosystem by enabling you to efficiently and effectively handle the various data sources that feed the Hadoop Data Lake and all of their necessary downstream copies (staging, identification, curation, etc.), and by accelerating the application projects that use those data sources (masked and refined, if needed). Delphix delivers the right data, in the right manner (masked/unmasked), to the right team, at the right time.

Find out more about how to lift your applications and data out of the floodplain here:

http://www.delphix.com/2014/06/25/delphix-product-overview/

Find out how Delphix helped Bobby Durrett of US Foods quickly restore production and save countless hours of overtime, in his unsolicited testimonial on his blog:

http://www.bobbydurrettdba.com/2014/07/08/used-delphix-to-quickly-recover-ten-production-tables/