
Predictive analytics is a rewarding yet challenging subject. In our benchmark research on next-generation predictive analytics, at least half the participants reported that predictive analytics allows them to achieve competitive advantage (57%) and create new revenue opportunities (50%). Yet even more participants said that users of predictive analytics don’t have enough skills training to produce their own analyses (79%) and don’t understand the mathematics involved (66%). (In the term “predictive analytics” I include all types of data science, not just one particular type of analysis.)

Various software vendors are taking steps to simplify the use of this technology. RapidMiner is one of them. The company focuses on making its open source predictive analytics faster and easier to use. Its database-independent predictive analytics platform has more than 1,400 customers and averages 20,000 downloads per month. The product, also called RapidMiner, has been deployed more than 100,000 times and has a community of some 250,000 users. The latest version of the platform, Version 7.1, was released in the spring. RapidMiner has been around for almost 10 years, and in that time the predictive analytics market has grown and changed dramatically in parallel with the big data market. Neither big data nor cloud computing was part of the company’s original focus, but over time RapidMiner has incorporated capabilities in both areas.

The company also has a distinctive personality embodied by its founder and president, Ingo Mierswa. It is evident in his YouTube video series, “5 Minutes with Ingo”, in which he explains various aspects of predictive analytics. This approach to training potential users makes sense. According to our research, adequate training in predictive analytics concepts and the application of predictive analytics to business problems correlate more highly with satisfaction in using it (93% each) than does product training (85%). These satisfaction rates compare favorably with just 66 percent on average. The RapidMiner training videos are not only entertaining, they can potentially help an organization be more successful in understanding and using predictive analytics.

The RapidMiner product set itself provides several approaches to predictive analytics. RapidMiner Studio is a desktop tool for creating predictive analytic models. It is available for download from the RapidMiner website. Like many other predictive analytics tools, it includes connectors to a variety of data sources and supports data preparation tasks that are often needed before predictive models can be developed. Using drag-and-drop visual design, users create data flows or pipelines of activity moving data from sources, through any necessary transformations and into modeling processes.

RapidMiner Studio has several unique features to guide the user through these processes. In designing the overall pipeline of activity, a feature called Wisdom of Crowds examines what other users have done in similar situations and recommends what the next step (or “operator”) ought to be. Behind the scenes, RapidMiner is using its own technology to help predict the most likely next step. Wisdom of Crowds also provides parameter recommendations to help choose among the myriad of options and parameter settings. As further techniques to assist users, RapidMiner Studio has components to compare multiple models and to select models automatically.
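To make the idea of recommending a next step concrete, here is a minimal sketch in Python. It is not RapidMiner's method, which is not described here; it simply counts which operator most often follows another in a set of past pipelines. The operator names and the functions build_transitions and recommend_next are hypothetical.

```python
from collections import Counter, defaultdict

def build_transitions(pipelines):
    """Count how often each operator is directly followed by each other operator."""
    transitions = defaultdict(Counter)
    for steps in pipelines:
        for current, following in zip(steps, steps[1:]):
            transitions[current][following] += 1
    return transitions

def recommend_next(transitions, current, top_n=1):
    """Return up to top_n operators that most often followed `current`."""
    followers = transitions.get(current, Counter())
    return [op for op, _ in followers.most_common(top_n)]

# Hypothetical pipelines built by other users (operator names are made up).
PAST_PIPELINES = [
    ["read", "clean", "train"],
    ["read", "clean", "split", "train"],
    ["read", "filter", "train"],
]
```

Calling recommend_next(build_transitions(PAST_PIPELINES), "read") suggests the operator that most often came after "read" in those pipelines. A real recommender would presumably weigh far more context than the single preceding step.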

While users can perform the entire predictive analytics process using RapidMiner Studio alone, they also can connect it to RapidMiner Server to support larger data sets and collaboration among multiple users. The Server product has a shared repository for processes, data and connections to other data sources and includes a framework to provide security and version control for the various items in the repository. As an alternative to an on-premises server, RapidMiner Cloud provides the same capabilities as the server product in a hosted environment.

For big data analytics, RapidMiner Radoop leverages Hadoop implementations by pushing down the predictive analytics pipelines created in RapidMiner Studio. These pipelines execute in the appropriate Hadoop component, including MapReduce, Spark, Pig, Hive and Mahout. This gives access to the full data set and takes advantage of the cluster resources for parallel execution of the workloads, without the need to code in any of these tools. Spark has become a popular framework for analytics on Hadoop, as evidenced by the Spark Summits, which I wrote about recently. It provides faster execution of analytic processes and a more flexible, expressive framework than MapReduce. Users familiar with Spark (R or MLlib), PySpark, Pig or Hive can write scripts in these packages that can be executed with Radoop. For security and authentication, Radoop integrates with Kerberos, Apache Sentry and Apache Ranger.

RapidMiner recognizes the value of visualization in the analytics process and has established technical partnerships and integration with two providers, Qlik and Tableau. RapidMiner Studio can create both Qlik and Tableau data exchange files for visualization of the output of predictive analytics models. Other connections, integrations and extensions are available through the RapidMiner marketplace, including Cassandra, MongoDB, Solr and Splunk.

To gain maximum value from predictive analytics, organizations must not only create the models to predict behaviors, they must deploy those models in an operational context to impact business outcomes in real time. According to our research more than one-third (37%) of organizations are applying their models at least on a daily basis. RapidMiner can convert any of its pipeline processes into a Web service so they can be embedded in other business processes and invoked in real time. RapidMiner also supports PMML, which is an industry standard for expressing models and allows embedding of models into databases for real-time scoring of new data records as they are entered into the database.

While RapidMiner has invested in making predictive analytics easier to use and accessible to a wider group of analysts, it is a daunting challenge to make these types of analyses truly self-service. Knowing when to use a particular algorithm and how to set all the various parameters requires deep knowledge of the discipline of predictive analytics. For example, in creating a k-nearest neighbors model, how many people would know what value of “k” to use for the number of nearest neighbors to model? And this is just one relatively simple parameter on one type of algorithm. The Wisdom of Crowds parameter recommendations help, but it’s still not an automated process, and users should realize they will need at least some knowledge of the various algorithms to maximize the effectiveness of their modeling efforts.
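One common way to pick “k” is to try several values and keep the one that predicts held-out data best. The sketch below shows that idea in plain Python, using leave-one-out validation on a tiny made-up data set. It is an illustration of the general technique, not of how RapidMiner or Wisdom of Crowds chooses parameters; the function names and sample points are assumptions.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Predict the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: hold out each point in turn and predict it from the rest."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)

def choose_k(data, candidates):
    """Return the candidate k with the best leave-one-out accuracy (ties go to the smaller k)."""
    return max(sorted(candidates), key=lambda k: loo_accuracy(data, k))

# Made-up sample: two small, well-separated groups of points.
SAMPLE = [
    ((1.0, 1.0), "a"), ((1.2, 0.9), "a"), ((0.8, 1.1), "a"), ((1.1, 1.3), "a"),
    ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.8, 3.1), "b"), ((3.1, 3.3), "b"),
]
```

On this sample a very large k performs badly, because once a point is held out its own group is outvoted by the other one. That is the kind of trade-off a user has to understand even when the search over candidate values is automated.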

I’d also like to see RapidMiner invest more in the model management process. Once a model is created, it immediately starts to become stale for various reasons. Market conditions change. New data is generated. The competitive environment changes. The key questions are how far out of date the model has become and when it should be replaced with a better model. Models should constantly be re-evaluated. In our predictive analytics research 63 percent of organizations that update their models at least daily reported a significant improvement in their activities and processes, compared with 31 percent of those that update their models less frequently. Any vendor that automates this process could help organizations boost their effectiveness.
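As a rough illustration of what automated re-evaluation could look like, the sketch below tracks whether recent predictions were correct and flags a model for refresh when its recent accuracy drops well below the accuracy measured when it was built. The window size, tolerance and function names are assumptions chosen for the example, not features of any RapidMiner product.

```python
def rolling_accuracy(outcomes, window):
    """Fraction of correct predictions among the most recent `window` outcomes."""
    recent = outcomes[-window:]
    return sum(recent) / len(recent)

def needs_refresh(outcomes, baseline_accuracy, window=50, tolerance=0.05):
    """Flag the model as stale when recent accuracy falls more than
    `tolerance` below the accuracy it had when it was deployed."""
    if len(outcomes) < window:
        return False  # not enough scored records yet to judge
    return rolling_accuracy(outcomes, window) < baseline_accuracy - tolerance
```

Here outcomes is a list of 1s and 0s (correct or incorrect) in the order the predictions were checked against actual results. A production process would also need to decide what to retrain on and how to compare the candidate model with the current one.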

Overall RapidMiner has made predictive analytics more accessible to a wider audience via its products and its educational efforts. The company has done this in an entertaining way, which is important to retain the attention of those who are being educated. Predictive analytics is a critical aspect of maximizing the value of data in an organization. Those that are not taking advantage of these types of analytics should be. RapidMiner makes it easier to tackle some of these challenges and may help get any organization over the hump of learning how to build and deploy predictive analytic models.


David Menninger

SVP & Research Director

Follow Me on Twitter @dmenningerVR and Connect with me on LinkedIn.


Qlik helped pioneer the visual discovery market with its QlikView product. In some respects, Qlik and its competitors also spawned the self-service trend rippling through the analytics market today. Their aim was to enable business users to perform analytics for themselves rather than building a product with the perfect set of features for IT. After establishing success with end users, the company began to address more of the concerns of IT, eventually creating a robust enterprise-grade analytics platform. This approach has worked for Qlik, driving growth that led to an initial public offering in 2010. The company now generates more than half a billion dollars in revenue annually, making it one of the largest independent analytics vendors. Based on its company and products, Qlik was rated a Hot Vendor in our 2015 Value Index on Analytics and Business Intelligence and was one of the highest ranked in usability.

However, as Qlik was experiencing that dramatic growth, the analytics market was changing from a Windows-based, desktop platform to a mobile, cloud-based one. As a result of these market shifts, a couple years ago the company introduced the Qlik Sense product line to offer a modern, cloud-based platform for its analytics. Thus the company embraced a two-product strategy consisting of QlikView and Qlik Sense, which my colleague Mark Smith wrote about earlier this year. When Qlik introduced this split in product lines, some customers had questions about whether it would continue to invest in QlikView. Any questions I had about both parts of its product strategy were answered a few weeks ago at Qonnections, its annual user conference – both by company executives and in my conversations with customers.

Qlik has continued its support of and investment in the QlikView product line and will provide annual updates to the product, which is now on version 12. Customers who are happy with their QlikView implementations – and I spoke with several at the conference – can continue to use the product and can expect enhancements, albeit less frequently than updates for the Qlik Sense product line. However, since QlikView and Qlik Sense share the same QIX analytics engine, customers can begin to make the transition to Qlik Sense without giving up their QlikView applications.

The company also introduced Qlik Sense 3.0, which is now generally available. It includes new features for self-service data preparation, enhanced search capabilities and an expanded set of application programming interfaces (APIs). The new data preparation features follow an industry trend toward providing more self-service capabilities for end users. Data preparation remains a challenge for many organizations. Our benchmark research on data and analytics in the cloud shows that this activity is where the majority (55%) of organizations spend the most time in their analytics process. Qlik has done a nice job here. Its user interface is intuitive, using a “connected bubbles” metaphor. Data sets show up as bubbles and can be joined graphically to other data sets or bubbles. The software automatically detects the join field based on profiling of the data involved. Other products have used drag-and-drop techniques with an automatic suggestion of join fields, but Qlik has made the visuals more appealing and easier to work with. Date fields and geographic fields are also detected during the profiling process, automating more of the steps involved in working with these types of fields. The new version also includes a graphical interface for defining derived or calculated fields.
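For readers curious how a join field can be suggested from data profiling, here is a minimal sketch in Python. It is an assumption about one plausible approach, not a description of Qlik's implementation: it compares the distinct values of every pair of columns and proposes the pair with the greatest overlap. The table layout, column names and the function suggest_join are hypothetical.

```python
def suggest_join(left, right):
    """Return ((left_column, right_column), score) for the best-overlapping pair.

    Each table is a dict mapping a column name to its list of values.
    Overlap is the Jaccard similarity of the two columns' distinct values.
    """
    best_pair, best_score = None, 0.0
    for lname, lvals in left.items():
        for rname, rvals in right.items():
            lset, rset = set(lvals), set(rvals)
            union = lset | rset
            score = len(lset & rset) / len(union) if union else 0.0
            if score > best_score:
                best_pair, best_score = (lname, rname), score
    return best_pair, best_score

# Made-up tables for illustration.
ORDERS = {"order_id": [1, 2, 3, 4], "cust": ["c1", "c2", "c2", "c3"]}
CUSTOMERS = {"customer_id": ["c1", "c2", "c3", "c4"],
             "region": ["east", "west", "east", "north"]}
```

With these tables, suggest_join(ORDERS, CUSTOMERS) proposes joining "cust" to "customer_id" even though the column names differ, which is the benefit of profiling values rather than matching names.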

The search capabilities, historically a strength for Qlik, have been extended to include metadata and charts. Users can search for a particular measure such as profit by region and see thumbnails of the charts and graphs that reference this measure. Qlik refers to this feature as “visual search.” Seeing the thumbnails provides more context and should make it easier to find the appropriate measure or visualization quickly.

Qlik Sense 3 has bidirectional language support as well as more international versions. With this release the company has officially added support for Korean, Polish, traditional Chinese and Turkish in addition to 11 other languages already supported.

Outside of the Qlik Sense product improvements, the company also supports more connectors to additional data sources as a result of its acquisition of Industrial CodeBox announced at Qonnections. Users now have direct connectivity to Twitter, Facebook, Google, Microsoft Dynamics CRM and Sugar CRM data. In addition to connectors, Qlik DataMarket provides access to a variety of free and subscription-based external data sources that can be used as part of an organization’s analytics. The new data sources include a financial services package with data from 35 major stock exchanges and indices including quote data and financial statement data from publicly traded companies.

The company also continues to invest in cloud-based analytics. Our research shows that two-thirds (67%) of organizations use cloud-based analytics today or expect to within 12 months. Later this year Qlik will extend its cloud offerings to include Qlik Sense Cloud Business. Previously the company had introduced Qlik Sense Cloud Basic, a free version for individual usage, and Qlik Sense Cloud Plus, which allows sharing of analyses with up to five individuals. The Business version will provide departmental and small business support with sharing of analyses among selected groups or individuals within an organization.

On an entirely different front, in early June the company announced that it has agreed to be acquired by private equity firm Thoma Bravo. This is the latest in a spate of public technology companies being acquired by private equity firms. Tibco, Informatica and EMC are at various stages of going down a similar route. The transition to cloud-based products may be part of what is driving Qlik to go private. Cloud products are generally delivered on a subscription basis, which produces less revenue recognition up front, and it is difficult for a public company to meet the market’s revenue and profitability expectations as it transitions from large enterprise license deals with lots of upfront revenue.
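The revenue timing effect is easy to see with a little arithmetic. The sketch below uses a deliberately simplified model and made-up figures: a license deal is recognized entirely in the first year, while a subscription of the same total value is spread evenly over its term. It ignores maintenance fees and the actual accounting rules, and the function name is hypothetical.

```python
def recognized_by_year(contract_value, term_years, upfront):
    """Revenue recognized in each year of a contract under a simplified model.

    upfront=True  -> the whole amount lands in year 1 (license-style deal)
    upfront=False -> the amount is spread evenly over the term (subscription)
    """
    if upfront:
        return [contract_value] + [0] * (term_years - 1)
    return [contract_value / term_years] * term_years

# Hypothetical three-year deal worth 300,000 in total.
license_revenue = recognized_by_year(300_000, 3, upfront=True)
subscription_revenue = recognized_by_year(300_000, 3, upfront=False)
```

Both lists sum to the same total, but the subscription shows only a third of it in the first year. A company shifting its mix toward subscriptions can therefore report lower revenue in the near term even if it is signing the same amount of business.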

Due to standard regulatory restrictions, the companies can’t say much about the acquisition and subsequent plans other than that the deal is expected to close in the third quarter of 2016. These restrictions contrast with Qlik’s public disclosure of its product roadmap, which not many software companies do. It is helpful for customers to understand how the products might evolve over the next 18 to 24 months.

In terms of future developments, users could benefit from more investment by Qlik and its new owners in collaboration and mobile capabilities. A few years ago I noted that Qlik experimented with supporting collaboration capabilities like chat streams and sharing analytic displays, but these features have fallen by the wayside. On the mobile front, Qlik is in the middle of transitioning from QlikView Mobile delivered as a native app on mobile devices to Qlik Sense mobile capabilities delivered via HTML5. As a result, there are some gaps, at least temporarily, between the two sets of products.

Overall, Qlik has continued to demonstrate an ability to design and deliver products that are visually appealing and excel in ease of use. Qlik Sense 3.0 includes additional capabilities that will help users understand and analyze their data in a pure browser-based product accessible from the cloud and mobile devices. If you haven’t considered Qlik in the past, perhaps the new release is a good reason to consider it now.


David Menninger

SVP & Research Director

Follow Me on Twitter @dmenningerVR and Connect with me on LinkedIn.
