Thursday, February 3, 2011

The Chief Executive Officer of Queplix, Mark Cashman, shares his views and insights on the state of Queplix and the trends within technology and industry that are shaping it.

A Market Point of View

Written by Mark Cashman Tuesday, February 01, 2011 11:11 PM
Some things should make you stand up and take notice; data virtualization is one of those things.  Certainly virtualization as a technology or category is no longer new.  But data virtualization is one of the emerging segments in virtualization that has now reached critical mass.  All of the same drivers make it compelling - lower cost, lower risk and higher return on investment.
Data virtualization is the process of identifying, packaging, moving and ultimately storing your data without making additional physical copies along the way.  It creates a layer between applications and their data so that you can use the data to your benefit.  This virtual layer unlocks your data from proprietary applications to give you the utility you need, separating applications from data so the data can be used in new ways.
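To make the idea of a virtual layer concrete, here is a minimal sketch in Python. It is purely illustrative, not Queplix's implementation: the source names, fields and adapter functions are all invented. The point is that records are pulled lazily from each source at query time, so no additional copy of the data is ever made.

```python
# Illustrative sketch only: a minimal "virtual layer" exposing a unified
# view over two hypothetical sources without copying their data.

class VirtualEntity:
    """Presents records from several underlying sources as one logical entity."""

    def __init__(self, *sources):
        self.sources = sources  # adapters are registered; the data stays in place

    def query(self, predicate):
        # Records stream lazily from each source at query time;
        # nothing is extracted, transformed, or stored in a new copy.
        for source in self.sources:
            for record in source():
                if predicate(record):
                    yield record

# Two stand-in "applications" holding customer data in their own silos.
def crm_source():
    yield {"name": "Acme Corp", "region": "EMEA"}

def billing_source():
    yield {"name": "Globex", "region": "AMER"}

customers = VirtualEntity(crm_source, billing_source)
emea = [r["name"] for r in customers.query(lambda r: r["region"] == "EMEA")]
print(emea)  # ['Acme Corp']
```

The applications keep owning their data; the virtual entity only mediates access, which is the separation the paragraph above describes.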
Data virtualization has been cast by some as a replacement for ETL, EAI or EII.  It is far more substantial - data virtualization enables you to do things you could not do before without a tremendous amount of work, and with greater return on investment and lower risk.
As a proof point, I would suggest doing a little research on which companies are now marketing themselves as data virtualization vendors.  Here you will find some household names from the technology company roster.  These vendors understand the challenges enterprises face and want to cash in on the action.  But, of course, it is often a case of finding a wolf in sheep’s clothing.  Their primary business model is wrapped around legacy technology.  For some, the complexity of their approach requires so much customization that there is more revenue in the services than in the software product.  Surely the model is upside down - we can help you fix it.
So, do your research and consider taking on change.
Stand up and take notice of the challenges you have faced to date and ask yourself: does the existing way of managing my data really work? Can I achieve my goals and objectives using existing tools and technology, and how long will it take? The comfortable or easy bet didn’t get you to where you are today. Reach out, explore other opportunities, and realize another evolution surrounds you. Don’t become extinct like the dinosaurs!  The nimble market leaders across all industries know how to use new and innovative technologies that offer competitive advantage.  Faster development cycles, lower risk and greater return on investment are more important than ever.  Data virtualization can bring you all of this across applications for data discovery, business intelligence, data integration, data management, governance, quality, compliance and much more.
In my next blog post I will explore how this new approach streamlines many of the tasks you perform today, making them simpler and more cost effective.

Queplix Data Virtualization and Hadoop - a marriage made in heaven

Recently, there has been a lot of news and development covering advances in parallel processing frameworks such as Hadoop. Some innovative data warehouse software vendors are starting to research the new development strategies that parallel processing offers. So far, the majority of these efforts have targeted improving the performance and query optimization of traditional physical data warehouse architectures. For example, traditional data warehouse vendors like Teradata are joining the Hadoop movement and applying parallel processing to their physical DW infrastructures. Companies like Yahoo and Amazon are also spearheading the adoption of Hadoop map/reduce for large-scale data analytics.
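For readers new to the model, the map/reduce pattern that Hadoop parallelizes can be sketched in a few lines of plain Python. This is a single-process toy, not Hadoop itself: a map phase emits key/value pairs, the pairs are grouped by key (the "shuffle"), and a reduce phase aggregates each group - Hadoop runs the same two phases across many machines over data in HDFS.

```python
from collections import defaultdict

# Toy single-process map/reduce word count, for illustration only.

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)          # emit key/value pairs

def reduce_phase(pairs):
    groups = defaultdict(list)               # the "shuffle": group values by key
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}

lines = ["data virtualization meets Hadoop", "Hadoop scales data processing"]
counts = reduce_phase(map_phase(lines))
print(counts["hadoop"])  # 2
print(counts["data"])    # 2
```

Because each map call and each per-key reduction is independent, the framework can distribute them freely, which is what makes the model attractive for large-scale analytics.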

I have been monitoring advances on the Hadoop front in particular, as I believe it will provide a convergence ground for our products and a new development direction for Queplix Data Virtualization. Data virtualization and Hadoop are born of the same premise - providing scalable data storage and easy information access and sharing - and I see how the two technologies complement each other perfectly.

Hadoop’s data warehouse infrastructure, Hive, is what we are researching now for integration with Queplix Data Virtualization products. Hive is a data warehouse infrastructure built on top of Hadoop that provides tools for easy data summarization, ad hoc querying and analysis of large datasets stored in Hadoop files. It provides a mechanism to put structure on this data, along with a simple query language called Hive QL.

Queplix Data Virtualization will soon combine the flexibility of its object-oriented data modeling with the massive power of Hadoop parallel processing to build virtual data warehouse solutions. Imagine the analytical performance of such a virtual data warehouse, built on the Virtual Metadata Catalog and Virtual Entities as its organizational and hierarchical units (instead of traditional tables, columns and SQL-driven access).  Such a “virtual” data warehouse would be a perfect fit for large-scale operational and analytical processing, data quality and data governance projects, with the full power of Queplix heuristic and semantic data analysis.

Today, data virtualization solutions are deployed by many larger enterprises to gain visibility into disparate application data silos without disrupting the original sources and applications. In the near future, data virtualization and Hadoop-based virtual data warehouse solutions will be deployed in tandem to implement full-spectrum enterprise data management solutions, ranging from large-scale data integration projects (e.g., massive application data store mergers resulting from M&A between large companies) all the way to Virtual Master Data Management, pioneered by Queplix. Such solutions will not only provide better abstraction and continuity of business for enterprise applications but will also harness the full power of parallel processing, bringing immense scalability to Queplix semantic data analytics and data alignment products.
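As a rough analogue of the kind of summarization Hive QL expresses declaratively, the short Python sketch below performs the grouped aggregation that a one-line HiveQL GROUP BY would run over files in Hadoop. The table name, columns and data are invented for illustration; in Hive the records would live as files in HDFS with a table structure declared over them.

```python
from collections import defaultdict

# Hypothetical order records; the schema and values are invented.
orders = [
    {"region": "EMEA", "amount": 120.0},
    {"region": "EMEA", "amount": 80.0},
    {"region": "AMER", "amount": 200.0},
]

# Rough Python analogue of the HiveQL query:
#   SELECT region, SUM(amount) FROM orders GROUP BY region;
totals = defaultdict(float)
for row in orders:
    totals[row["region"]] += row["amount"]

print(dict(totals))  # {'EMEA': 200.0, 'AMER': 200.0}
```

The appeal of Hive is that this grouping logic is expressed once, declaratively, and the framework parallelizes it across the cluster rather than looping over rows in a single process.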

Here are some of the new and exciting ideas Queplix is working on now: utilizing Hadoop for Virtual CEP (Complex Event Processing) within the Queplix Virtual Metadata Catalog; generating real-time “data steward” alerts using predictive data lineage analysis, before data quality problems start to affect your enterprise applications; implementing Hadoop-based virtual data warehouse solutions to provide high availability for large application stores that require massive analytics and semantic data processing; large-scale Virtual Master Data Management initiatives involving enterprise-wide Customer or Product Catalog building; and large-scale Business Intelligence projects based on the Queplix Virtual Metadata Catalog.
Watch this blog for new developments and advances in Queplix technology to integrate Hadoop and Data Virtualization as we make announcements throughout this year!