
On Monday, March 21, Informatica, a vendor of information management software, announced Big Data Management version 10.1. My colleague Mark Smith covered the introduction of v. 10.0 late last year, along with Informatica’s expansion from data integration to broader data management. The 10.1 release offers new capabilities, including self-service data preparation for Hadoop, a hot topic that Informatica addresses under the name Intelligent Data Lake. The term “data lake” describes large collections of detailed data from across an organization, often stored in Hadoop. With this release Informatica seeks to add more enterprise capabilities to data lake implementations.

This is the latest step in Informatica’s big data efforts. The company has been investing in Hadoop for five years, and I covered some of its early efforts. The Hadoop market has been evolving over that time, growing in popularity and maturing in terms of information management and data governance requirements. Our big data benchmark research has shown increases of more than 50 percent in the use of Hadoop, and our big data analytics research shows 37 percent of participants using it in production. Building on decades of experience in providing technology to integrate and manage data in data marts and data warehouses, Informatica has been extending these capabilities to the big data market and Hadoop specifically.

The Intelligent Data Lake capabilities are the most significant features of version 10.1. They include self-service data preparation, automation of some data integration tasks, and collaboration features to share information among those working with the data. The concept of self-service data preparation has become popular of late. Our big data analytics research shows that preparing data for analysis and reviewing it for quality and consistency are the two most time-consuming tasks, so making data preparation easier and faster would benefit most organizations. Recognizing this market opportunity, several vendors are competing in this space; Informatica’s offering is called REV. With version 10.1 the Big Data Management product will have similar capabilities, including a familiar spreadsheet-style interface for working with and blending data as it is loaded into the target system. However, the REV capabilities available as part of Informatica’s cloud offering are separate from those in Big Data Management 10.1. They require separate licenses, and there is no upgrade path or option; as a result, sharing work between the two environments is limited. Informatica faces two challenges with self-service: how users view its self-service capabilities and user interface compared with those of its competitors, and whether analysts and data scientists will be inclined to use Informatica’s products, since they are mostly targeted at the data preparation process rather than the analytic process.

The collaborative capabilities of 10.1 should help organizations with their information management processes. Our most recent findings on collaboration come from our data and analytics in the cloud research, which shows that only 30 percent of participants are satisfied with their collaborative capabilities. The new release enables those who are working with the data to tag it with comments about what they found valuable or not, note issues with data quality and point others toward useful transformations they have performed. This type of information sharing can help reduce some of the time spent on data preparation. Ideally these collaboration capabilities could be surfaced all the way through the business intelligence and analytic process, but Informatica would have to do that through its technology partners since it does not offer products in those markets.

Version 10.1 includes other enhancements. The company has made additional investments in its use of Apache Spark both for performance purposes and for its machine-learning capabilities. I recently wrote about Spark and its rise in adoption. More transformations are implemented in Spark than in Hadoop’s MapReduce, which Informatica claims speeds up the processing by up to 500 percent. It also uses Spark to speed up the matching and linking processes in its master data management functions.

I should note that although Informatica is adopting these open source technologies, its product is not open source. Much of big data development is driven by the open source community, and that presents an obstacle to Informatica. Our next-generation predictive analytics research shows that Apache Hadoop is the most popular distribution, with 41 percent of organizations using or planning to use this distribution. Informatica itself does not provide a distribution of Hadoop but partners with vendors that do. Whether Informatica can win over a significant portion of the open source community remains a question. Whether it has to is another. In positioning release 10.1 the company describes the big data use cases as arising alongside conventional data warehouse and business intelligence use cases.

This release includes a “live data map” that monitors data landing in Hadoop (or other targets). The live data map infers the data format (such as social security numbers, dates and schemas) and creates a searchable index on the type of data it has catalogued; this enables organizations to easily identify, for instance, all the places where personally identifiable information (PII) is stored. They can use this information to ensure that the appropriate governance policies are applied to this data. Informatica has also enhanced its security capabilities in Big Data. Its Secure@Source product, which won an Innovation Award from Ventana Research last year, provides enterprise visibility and advanced analytics on sensitive data threats. The latest version adds support for Apache Hive tables and Salesforce data. Thus for applications that require these capabilities a more secure environment is available.
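To make the idea of inferring data formats and indexing them concrete, here is a minimal Python sketch of the general technique. It is not Informatica’s implementation; the patterns, the 90 percent threshold, the function names and the sample datasets are all assumptions made up for illustration. It classifies each column by matching its values against simple patterns, then builds an index from inferred type to the places where that type appears.

```python
import re
from collections import defaultdict

# Hypothetical patterns for illustration; a real catalog would use far richer rules.
PATTERNS = {
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def infer_type(values, threshold=0.9):
    """Return the first pattern name that matches at least `threshold`
    of the non-empty values, or "unknown" if none does."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return "unknown"
    for name, pattern in PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.fullmatch(v))
        if hits / len(non_empty) >= threshold:
            return name
    return "unknown"


def build_index(datasets):
    """Map each inferred type to the (dataset, column) pairs where it appears."""
    index = defaultdict(list)
    for dataset_name, columns in datasets.items():
        for column_name, values in columns.items():
            index[infer_type(values)].append((dataset_name, column_name))
    return dict(index)


# Made-up sample data standing in for files landing in a data lake.
SAMPLE = {
    "hr_employees": {
        "tax_id": ["123-45-6789", "987-65-4321"],
        "hired": ["2015-03-01", "2016-01-15"],
    },
    "web_leads": {
        "notes": ["called back", "no answer"],
        "ssn_field": ["111-22-3333", ""],
    },
}

INDEX = build_index(SAMPLE)
# Every location whose values look like social security numbers.
PII_LOCATIONS = INDEX.get("ssn", [])
```

Looking up `INDEX["ssn"]` answers the governance question in the paragraph above: which datasets and columns hold values that look like social security numbers. A production catalog would also need to sample large files, handle many more formats and track changes over time.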

The product announcement was timed to coincide with the Strata Hadoop conference, a well-attended industry event that many vendors use to gain maximum visibility for such announcements. However, availability of the product release is planned for the second quarter of 2016. As an organization matures in its use of Hadoop, it will need to apply proper data management and governance practices. With version 10.1 Informatica is one of the vendors to consider in meeting those needs.


David Menninger

SVP & Research Director

My colleague Mark Smith and I recently attended data integration vendor Informatica’s annual industry analyst event. The company offered some impressive numbers regarding growth and profitability over the years, with 30 consecutive quarters of growth even during the recent recession. Through acquisition and its own research and development activities Informatica now has a broad portfolio of products. It includes data integration (with support for migration, replication and synchronization needs), master data management, complex event processing and other elements of the information management spectrum. As at last year’s event, the company retains a sharp focus on its data integration-related portfolio, and its product roadmap addresses four key themes impacting that market: big data, cloud computing, social media and mobile technology. We also see these themes as significant technology trends, and our approach is outlined in our 2012 research agendas for information management and in the larger business technology innovation agenda. Thus it was interesting to hear Informatica’s take on them.

There is little question that the impact of data on a large scale is growing in organizations of all types. Our benchmark research on both big data and Hadoop shows the role these technologies play in processing large amounts of data and reveals the challenges organizations face in coping with huge volumes of data. Last fall Informatica introduced HParser to help organizations use Hadoop, and we expect more Hadoop capabilities in the upcoming release of Informatica 9.5, which is scheduled to be launched at Informatica World in May. Users of data integration products wonder whether Hadoop is replacing such tools. For its part, Informatica claims its license sales are expanding as a result of Hadoop and offered the example of JPMorgan Chase, which was shared last fall at Hadoop World. In his presentation, Larry Feinsmith of JPMorgan Chase explained that use of Informatica had actually increased in conjunction with its use of Hadoop. We do not find this result surprising, as our research shows that the majority of organizations (63%) are not replacing existing technologies with Hadoop but rather adding Hadoop to their data processing capabilities. Informatica is developing new capabilities as part of its 9.5 release to capitalize on this opportunity.

With respect to cloud computing, Informatica was early to market, establishing a cloud presence with a separate division that delivers a subset of its product capabilities via the cloud. Our business data in the cloud benchmark research shows that organizations are expanding their use of cloud-based applications and services and that within two years cloud-based deployments of information management applications may rival on-premises deployments in numbers. In this context, Informatica stated its intent to offer its entire product line and all its capabilities via the cloud. Granted, delivering that will take some time, but company officials said they expect to be able to complete this transition over the next couple of years. In reality, from Informatica’s perspective the transition to the cloud will produce a “hybrid IT” environment where some processing is done on premises and some via the cloud. Our research suggests that this will be the case in general: Half of organizations said they will move data from the cloud to on-premises systems in the future, and 42 percent expect to move data from their premises to cloud-based applications.

For depth on Informatica’s social media efforts, look to Mark Smith’s post. But I will note that the rise of social media as a new and important data source gives Informatica an opportunity it can capitalize on as part of its expanding set of data integration capabilities.

In the business landscape we tend to view mobile technology most often in the context of business intelligence, since it can be a valuable channel through which to deliver those capabilities. It is less likely that organizations would use mobile devices to author data integration routines or perform other information management tasks. However, the spread of mobile technologies is having an impact on information management. Organizations are rearchitecting their IT infrastructures to support mobile devices and also need to collect and process more location-based data for employees on the go. Mobile technology also creates new challenges for securing and governing access to data. In this area Informatica had little to say. While acknowledging these issues, it gave few specifics as to how its products will address them.

Besides trends in big data, the cloud, social media and mobile technology, our research agenda also recognizes the influences of analytics and collaboration on business intelligence and information management. Informatica is at a disadvantage relative to some of the other major information management vendors with respect to analytics. IBM, Oracle, SAS and SAP all offer both information management and analytics products. In our research on business analytics more than two-thirds (69%) of participants said they spend more time preparing data than analyzing it. Vendors whose portfolios include capabilities to deal with both tasks are in a better position to integrate them and reduce the amount of time necessary to derive business value from the data. While Informatica makes its capabilities available as a service, for instance to tap into lineage information while using a business intelligence product, I’d like to see it produce tighter integration with third-party tools to tackle this important issue.

Similarly, I’d like to see more specifics on Informatica’s collaboration strategy. The company demonstrated a prototype of collaboration around its Business Glossary product and discussed some collaboration capabilities grounded in the master data management process. As indicated in my previous post, we think many business intelligence and information management processes are moving to a more collaborative approach, so vendors need a plan here. Specifically, I think it is important to support market-leading collaboration technologies in order to attract a critical mass of participants and create meaningful dialogue that would add value to the underlying processes. I expect we’ll hear more from Informatica on its plans for collaboration; these demonstrations were clearly early work in the field.

Given the breadth of Informatica’s product offerings and its successful execution of sales, I recommend you consider the company when evaluating your information management needs. Although this event was only for the analyst community, you can expect to hear more details as Informatica brings version 9.5 to market in the coming months.


David Menninger – VP & Research Director
