Saturday 19 November 2011

Continuous availability – no longer a dream?

Zero downtime is a goal that many companies are striving for. It sounds so straightforward, and yet it’s not that simple to achieve – especially when it involves the continuous availability of large, high-volume databases. One of the inherent problems is that data replication for high availability is full of nuances that need to be addressed for a successful deployment, including maintaining sub-second latency, active/active considerations, scalability options, conflict detection/resolution, recovery, exception processing, and verifying that the source and target are properly synchronized.
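To make just one of those nuances concrete, here’s a minimal sketch – Python, with invented record and site names – of timestamp-based conflict detection with last-writer-wins resolution, the sort of decision an active/active replication engine has to make whenever the same row is updated at two sites. Real products offer richer policies (site priority, user exits, exception queues); this is only the simplest common strategy.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Change:
    """One captured update to a row, as a replication engine might see it."""
    key: str                 # primary key of the row
    column_values: dict      # new column values
    commit_time: datetime    # commit timestamp at the originating site
    site: str                # which active site produced the change

def resolve_conflict(local: Change, remote: Change) -> Change:
    """Last-writer-wins resolution for two changes to the same row."""
    if local.key != remote.key:
        raise ValueError("not a conflict: changes affect different rows")
    if local.commit_time != remote.commit_time:
        return max(local, remote, key=lambda c: c.commit_time)
    # Tie-break deterministically so both sites pick the same winner.
    return min(local, remote, key=lambda c: c.site)

# Example: the same account row updated at two active sites.
a = Change("ACCT-42", {"balance": 100},
           datetime(2011, 11, 19, 12, 0, 1, tzinfo=timezone.utc), "DALLAS")
b = Change("ACCT-42", {"balance": 250},
           datetime(2011, 11, 19, 12, 0, 3, tzinfo=timezone.utc), "LONDON")
print(resolve_conflict(a, b).site)   # LONDON wins: later commit time
```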

One of the problems that organizations face is the need to address lots of different business issues using what often turns out to be multiple software packages. Integrating these different pieces of software – perhaps even from different vendors – can add an extra level of complexity to the job in hand. What those organizations really need is a single piece of software that’s flexible enough to provide a comprehensive solution for changed data capture, replication, enhancing existing ETL (Extract, Transform, and Load) processes, and data migrations/conversions. Quite a big ask.

Wouldn’t you be interested in software that offers industrial-strength, near-real-time data integration solutions that include high-performance Changed Data Capture (CDC), data replication, data synchronization, enhanced ETL and business event publishing? And what if it was equally simple to experience the high-speed delivery of mainframe data (IMS, DB2, VSAM, etc) into data warehouses and downstream applications? Too good to be true?
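If you haven’t met CDC before, the caricature version is: read committed changes from the source’s log or journal, publish each one downstream, and checkpoint your position so you can restart cleanly. A toy polling loop (Python; the function names are mine, and a real engine reads the DBMS log directly rather than faking records like this) looks something like:

```python
import json
import time

def read_log_records(position):
    """Stand-in for reading a database log/journal from a given position.
    A real CDC engine reads the DBMS's own log (e.g. the DB2 or IMS log)."""
    # Hypothetical: pretend the log yielded one committed update.
    return [{"lsn": position + 1, "op": "UPDATE", "table": "ACCOUNTS",
             "key": "ACCT-42", "after": {"BALANCE": 250}}], position + 1

def publish(event):
    """Stand-in for delivering the change event, e.g. putting it on an MQ queue."""
    print(json.dumps(event))

def capture_loop(start_position, cycles=3):
    """Poll the log, publish each committed change, and track the position
    so capture can restart without losing or duplicating changes."""
    position = start_position
    for _ in range(cycles):
        records, position = read_log_records(position)
        for record in records:
            publish(record)
        # a real engine would persist a checkpoint of `position` here
        time.sleep(0.1)   # near-real-time: sub-second polling interval

capture_loop(start_position=0)
```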

If you’re like me, you carry around a list of capabilities in your head, and tick them off – or more often don’t tick them off – when you give software the once-over. So here are the kinds of things I’d have on my list for an integration engine. In general I’d expect:
  • Concurrent operation across multiple operating system platforms
  • Multi-step processes within a single script (UNION)
  • Simultaneous multi-record type file handling
  • Multi-level array handling (repeating groups) of source data store records/rows
  • Data filtering and cleansing
  • Dynamic look-up table processing
  • Support for data transfer and communication using TCP/IP and MQSeries
  • Preservation of referential integrity (RI) rules on target updates
  • Joins/Merges of heterogeneous databases/files (I’ve sketched this one below).
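As a taste of that last item, here’s a minimal sketch (Python, invented field names) of joining rows from two heterogeneous sources – say a VSAM-style customer file and a relational order table – once both have been read into a common record form:

```python
# Records from two unlike sources, already read into a common dict form:
# one flat-file style customer extract and one relational order extract.
customers = [
    {"CUST_NO": "0001", "NAME": "ACME CORP"},
    {"CUST_NO": "0002", "NAME": "GLOBEX"},
]
orders = [
    {"ORDER_ID": 77, "CUST_NO": "0001", "AMOUNT": 125.00},
    {"ORDER_ID": 78, "CUST_NO": "0001", "AMOUNT": 19.95},
    {"ORDER_ID": 79, "CUST_NO": "0002", "AMOUNT": 400.00},
]

# Build a look-up table on the join key, then merge each order with
# its owning customer: an inner join done in application code.
by_cust = {c["CUST_NO"]: c for c in customers}
joined = [{**by_cust[o["CUST_NO"]], **o}
          for o in orders if o["CUST_NO"] in by_cust]

for row in joined:
    print(row["NAME"], row["ORDER_ID"], row["AMOUNT"])
```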
In terms of data transformation I’d like to see (a few of these are sketched after the list):
  • Case (If/Else) logic
  • Extensive date cleansing and formatting
  • Arithmetic functions (add, subtract, multiply, etc)
  • Aggregation functions (sum, min, max, avg, etc)
  • Data type conversions
  • String functions
  • Data filtering
  • XML data formatting
  • Delimited data formatting.
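Here are a handful of those transformations in toy form – Python, with invented column names; any real engine would let you specify this declaratively rather than in hand-written code:

```python
from datetime import datetime

rows = [
    {"CUST": "0001", "BIRTH_DT": "19870219", "AMT": "125.00", "STATUS": "A"},
    {"CUST": "0002", "BIRTH_DT": "2 Jan 1975", "AMT": "400.00", "STATUS": "X"},
]

def clean_date(raw):
    """Date cleansing/formatting: accept a couple of source formats,
    emit ISO yyyy-mm-dd."""
    for fmt in ("%Y%m%d", "%d %b %Y"):
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            pass
    return None   # unparseable dates would go to exception processing

out = []
for r in rows:
    if r["STATUS"] != "A":      # data filtering
        continue
    out.append({
        "CUST": r["CUST"],
        "BIRTH_DT": clean_date(r["BIRTH_DT"]),          # date formatting
        "AMT_CENTS": int(float(r["AMT"]) * 100),        # type conversion + arithmetic
        "TIER": "GOLD" if float(r["AMT"]) > 100 else "STD",  # case (if/else) logic
    })

total = sum(r["AMT_CENTS"] for r in out)                # aggregation (sum)
print(out, total)
```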
When it comes to datastore processing I’d want (the insert/update/delete items are sketched below):
  • High performance bulk data transfer
  • Concurrent processing of multiple data store types
  • Creation of target data stores from source data store format
  • Insert/append to existing target data stores
  • Update/replace existing target data stores
  • Delete from existing target data stores
  • New column/field creation.
And for Data Movement, my list includes MQSeries, TCP/IP, and FTP.
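Those insert/update/delete bullets are the “apply” side of replication. Here’s a bare-bones illustration (Python, with the target “table” faked as an in-memory dictionary – a real engine would be issuing SQL or native calls against DB2, Oracle, and so on):

```python
# Play insert/update/delete change records into a target table,
# held here as a dict keyed on the row key.
target = {"ACCT-42": {"BALANCE": 100}}

changes = [
    {"op": "INSERT", "key": "ACCT-77", "row": {"BALANCE": 10}},
    {"op": "UPDATE", "key": "ACCT-42", "row": {"BALANCE": 250}},
    {"op": "DELETE", "key": "ACCT-77", "row": None},
]

for ch in changes:
    if ch["op"] == "INSERT":
        target[ch["key"]] = ch["row"]          # insert/append
    elif ch["op"] == "UPDATE":
        target[ch["key"]].update(ch["row"])    # update/replace
    elif ch["op"] == "DELETE":
        target.pop(ch["key"], None)            # delete
    else:
        raise ValueError(f"unknown operation: {ch['op']}")  # exception processing

print(target)   # {'ACCT-42': {'BALANCE': 250}}
```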

If there was also some kind of Integration Center with an easy-to-use Graphical User Interface (GUI), enabling users to quickly develop data integration interfaces from a single control point – that would be good. Additionally, some way to develop, deploy, and maintain data interfaces; create relational DDL (Data Definition Language), XML (Extensible Markup Language), and C/C++ structures from COBOL copybooks; and monitor the status of integration engines – all backed by an integrated metadata repository – that would be a real plus.
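To illustrate the copybook idea, here’s a deliberately tiny sketch (Python; it recognises only simple PIC X and PIC 9 fields, nothing like the full COBOL picture syntax) of turning a copybook fragment into a C structure:

```python
import re

COPYBOOK = """
01 CUSTOMER-REC.
   05 CUST-NO    PIC 9(6).
   05 CUST-NAME  PIC X(30).
"""

# Only two picture clauses are handled here: PIC X(n) -> char[n], and
# PIC 9(n) -> zoned-decimal digits, which are also char[n] at the byte level.
FIELD = re.compile(r"05\s+([\w-]+)\s+PIC\s+[X9]\((\d+)\)\.")

def copybook_to_c(text):
    lines = ["struct customer_rec {"]
    for name, length in FIELD.findall(text):
        c_name = name.lower().replace("-", "_")   # COBOL hyphens -> C underscores
        lines.append(f"    char {c_name}[{length}];")
    lines.append("};")
    return "\n".join(lines)

print(copybook_to_c(COPYBOOK))
# struct customer_rec {
#     char cust_no[6];
#     char cust_name[30];
# };
```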

I’d definitely want to find out more about a single piece of software that provided high-performance Changed Data Capture (CDC) and Apply, data replication, event publishing, Extract, Transformation, and Load (ETL), and data conversions/migrations.

So, if you’re like me and want to know more, there’s a webinar from SQData’s Scott Quillicy on 1 December at 2pm GMT (8am CST). To join the webinar from your PC, you need to register before the event at https://www1.gotomeeting.com/register/844029904. I’ll see you there.
