As a technology, predictive analytics has existed for years, but adoption has not been widespread among businesses. In our recent benchmark research on business analytics among more than 2,600 organizations, predictive analytics ranked only 10th among technologies they use to generate analytics, and only one in eight of those companies use it. Predictive analytics has been costly to acquire, and while enterprises in a few vertical industries and specific lines of business have been willing to invest large sums in it, they constitute only a fraction of the organizations that could benefit from it. Ventana Research has just completed a benchmark research project to learn how the organizations that have adopted predictive analytics are using it and to acquire real-world information about their levels of maturity, trends and best practices. In this post I want to share some of the key findings from our research.

As I have noted, varieties of predictive analytics are on the rise. The huge volumes of data that organizations accumulate are driving some of this interest. Our Hadoop research highlights the intersection of this big data and predictive analytics: More than two-thirds (69%) of Hadoop users perform advanced analytics such as data mining. Regardless of the reasons for the rise, our new research confirms the importance of predictive analytics. Participants overwhelmingly reported that these capabilities are important or very important to their organization (86%) and that they plan to deploy more predictive analytics (94%). One reason for the importance assigned to predictive analytics is that most organizations apply it to core functions that produce revenue. Marketing and sales are the most common of those. The top five sources of data tapped for predictive analytics also relate directly to revenue: customer, marketing, product, sales and financial.

Although participants are using predictive analytics for important purposes and are generally positive about the experience, they do not minimize its complexities. While now usable by more types of people, this technology still requires special skills to design and deploy, and in half of organizations its users lack those skills. Having worked for two different vendors in the predictive analytics space, I personally can testify that the mathematics of it requires special training. Our research bears this out. For example, 58 percent don’t understand the mathematics required. Although not a math major, I had always been analytically oriented, but to get involved in predictive analytics I had to learn new concepts or new ways to apply concepts I knew.

Organizations can overcome these issues with training and support. Unfortunately, most are not doing an adequate job in these areas. Fewer than half (44%) said their training in predictive analytics concepts and techniques is adequate, and fewer than one-fourth (24%) provide adequate help desk resources. These are important places to invest because organizations that do an adequate job in these two areas have the highest levels of satisfaction with their use of predictive analytics; 89% of them are satisfied vs. 66% overall. But we note that product training is not the most important type. Product training also correlated with higher levels of satisfaction, but training in concepts and the application of those concepts to business problems showed a stronger correlation.

Timeliness of results also has an impact on satisfaction. Organizations that use real-time scoring of records occasionally or regularly are more satisfied than those that use real-time scoring infrequently or not at all. Our research also shows that organizations need to update their models more frequently. Almost four in 10 update their models quarterly or less frequently, and they are less satisfied with their predictive analytics projects than those who update more frequently. In some ways model updates represent the “last mile” of the predictive analytics process. To be fully effective, organizations need to build predictive analytics into ongoing business processes so the results can be used in real time. Using models that aren’t up to date undermines the whole effort.

Thanks to our sponsors, IBM and Alpine Data Labs, for helping to make this research available. And thanks to our media sponsors, Information Management, KD Nuggets and TechTarget, for helping to recruit participants, promote the research and educate the market. I encourage you to explore these results in more detail to help ensure your organization maximizes the value of its predictive analytics efforts.

Regards,

David Menninger – VP & Research Director

I want to share my observations from the recent annual SAS analyst briefing. SAS is a huge software company with a unique culture and a history of success. Although SAS, being privately held, is not required to make the same financial disclosures as publicly held organizations, it released enough information to suggest another successful year, with more than $2.7 billion in revenue and 10 percent growth in its core analytics and data management businesses. Smaller segments showed even higher growth rates. With only selective information disclosed, it’s hard to dissect the numbers to spot specific areas of weakness, but the top-line figures suggest SAS is in good health.

One of the impressive figures SAS chooses to disclose is its investment in research and development, which at 24 percent of total revenue is a significant amount. Based on presentations at the analyst event, it appears a large amount of this investment is being directed toward big data and cloud computing. At last year’s event SAS unveiled plans for big data, and much of the focus at this year’s event was on the company’s current capabilities, which consist of high-performance computing and analytics.

SAS has three ways of processing and managing large amounts of data. Its high-performance computing (HPC) capabilities are effectively a massively parallel processing (MPP) database, albeit with rich analytic functionality. The main benefit of HPC is scalability; it allows processing of large data sets.

A variation on the HPC configuration includes pushing down operations into third-party databases for “in-database” analytics. Currently, in-database capabilities are available from Teradata and EMC Greenplum, plus SAS has announced plans to integrate with other database technologies. The main benefit of in-database processing is that it minimizes the need to move data out of the database and into the SAS application, which saves time and effort.

More recently, SAS introduced a third alternative that it calls High Performance Analytics (HPA), which provides in-memory processing and can be used with either configuration. The main benefit of in-memory processing is enhanced performance.

These different configurations each have advantages and disadvantages, but having multiple alternatives can create confusion about which one to use. As a general rule of thumb, if your analysis involves a relatively small amount of data, perhaps as much as a couple of gigabytes, you can run HPA on a single node. If your system involves larger amounts of data coming from a database on multiple nodes, you will want to install HPA on each node to be able to process the information more quickly and handle internode transfers of data.
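For readers who think in code, the rule of thumb above can be written as a small decision function. This is only a sketch of my reading of it, not anything SAS provides: the names `Workload` and `recommend_hpa_deployment` are hypothetical, and the 2 GB cutoff is an assumed stand-in for "a couple of gigabytes."

```python
from dataclasses import dataclass

# Assumed cutoff standing in for "a couple of gigabytes"; not a SAS-published limit.
SINGLE_NODE_LIMIT_GB = 2.0


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an analysis job."""
    data_size_gb: float   # amount of data the analysis touches
    database_nodes: int   # nodes in the source database (1 = not distributed)


def recommend_hpa_deployment(workload: Workload) -> str:
    """Encode the rule of thumb: small data runs on one node,
    larger data from a multi-node database gets HPA on every node."""
    if workload.data_size_gb < 0 or workload.database_nodes < 1:
        raise ValueError("data size must be >= 0 and node count >= 1")
    if workload.data_size_gb <= SINGLE_NODE_LIMIT_GB:
        return "single-node"
    if workload.database_nodes > 1:
        return "hpa-on-each-node"
    # Large data on a single-node database is not covered by the rule of thumb.
    return "not-covered"


if __name__ == "__main__":
    print(recommend_hpa_deployment(Workload(data_size_gb=1.5, database_nodes=1)))
    print(recommend_hpa_deployment(Workload(data_size_gb=500.0, database_nodes=8)))
```

The third branch is deliberate: the rule of thumb says nothing about large data sets that do not come from a multi-node database, so the sketch reports that case rather than guessing.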

SAS also has the ability to work with data in Hadoop. Users can access information in Hadoop via Hive to interact with tables as if they were native SAS data sets. The analytic processing is done in SAS, but this approach eliminates the need to extract the data from Hadoop and put it into SAS. Users can also invoke MapReduce jobs in Hadoop from the SAS environment. To be clear, SAS does not automatically generate these jobs or assist users in creating them, but this does offer a way for SAS users to create a single process that mixes SAS and Hadoop processing.

I’d like to see SAS push down more processing into the Hadoop environment and make more of the Hadoop processing automatic. SAS plans to introduce a new capability later in the year called LASR Analytic Server that is supposed to deliver better integration with Hadoop as well as better integration with the other distributed databases SAS supports, such as EMC Greenplum and Teradata.

There were some other items to note at the event. One is a new product for end-user interactive data visualization called Visual Analytics Explorer, which is scheduled to be introduced during the first quarter of this year. For years SAS was known for having powerful analytics but lackluster user interfaces, so this came as a bit of a surprise, but the initial impression shared by many in attendance was that SAS has done a good job on the design of the user interface for this product.

In the analytics software market, many vendors have introduced products recently that provide interactive visualization. Companies such as QlikView, Tableau and Tibco Spotfire have built their businesses around interactive visualization and data exploration. Within the last year, IBM introduced Cognos Insight, MicroStrategy introduced Visual Insight, and Oracle introduced visualization capabilities in its Exalytics appliance. SAS customers will soon have the option of using an integrated product rather than a third-party product for these visualization capabilities.

Based on SAS CTO Keith Collins’s presentation I expect to see SAS making a big investment in SaaS (software as a service, pun intended) and other cloud offerings, including platform as a service and infrastructure as a service. Collins outlined the company’s OnCloud initiative, which begins with offering some applications on demand and will roll out additional capabilities over the next two years. SAS plans full cloud support for its products, including a self-service subscription portal, developer capabilities and a marketplace for SAS and third-party cloud-based applications. It also plans to support public cloud, private cloud and hybrid configurations. Since SAS already offers its products on a subscription basis, the transition to a SaaS offering should be relatively easy from a financial perspective. This move is consistent with the market trends identified in our Business Data in the Cloud benchmark research. We also see other business intelligence vendors such as MicroStrategy and information management vendors such as Informatica adopting similarly broad commitments to cloud-based versions of their products.

Overall, SAS continues to execute well. Its customers should welcome these new developments, particularly the interactive visualization. The big-data strategy is still too SAS-centric, focused primarily on extracting information from Hadoop and other databases. I expect that the upcoming LASR Analytic Server will leverage these underlying MPP environments better. The cloud offerings will make it easier for new customers to evaluate the SAS products. I recommend you keep an eye on these developments as they come to market.

Regards,

David Menninger – VP & Research Director
