Thursday, March 21, 2013

R PMML Support: BetteR than EveR

Once represented as a PMML file, a predictive solution (data transformations + model) can be readily moved into the operational environment where it can be put to work immediately. That's the promise of PMML.

R is living up to that promise through its strong PMML export capabilities. The latest addition to the list of supported model types is Naive Bayes classifiers. More specifically, the R PMML package can now export Naive Bayes models built with the naiveBayes function of the e1071 package.

For more details and for a complete list of supported model types (as well as data pre-processing), click HERE.

Thursday, March 7, 2013

Making the case for PMML and ADAPA

If you are not familiar with PMML, the Predictive Model Markup Language, you may be wondering what all the fuss is about ...

PMML is the de facto standard to represent data mining and predictive analytic solutions. With PMML, one can easily share a predictive solution among PMML-compliant applications and systems. For example, you can build your model in R, export it in PMML, and use ADAPA, the Zementis Scoring Engine, to deploy it in production.
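To make the "build here, score there" idea concrete, here is a minimal sketch in Python. It assumes a tiny, hand-written PMML 4.1 regression model (the field names, coefficients, and the helper functions load_regression and score are all made up for illustration) and reads it with nothing but the standard library. It is not how ADAPA works internally and it covers only a sliver of the standard; it merely shows that the model lives in the file, not in the tool that built it.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative PMML 4.1 document (hypothetical model, not
# exported by any real tool). Field names and coefficients are made up.
PMML_DOC = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="Toy linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="toy_spend" functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="age" coefficient="0.5"/>
      <NumericPredictor name="income" coefficient="0.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# PMML elements live in a versioned namespace, so lookups must be qualified.
NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def load_regression(pmml_text):
    """Return (intercept, {field: coefficient}) from a simple RegressionModel."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(pmml_text, record):
    """Apply the linear model to one record (a dict of field name -> value)."""
    intercept, coefficients = load_regression(pmml_text)
    return intercept + sum(c * record[name] for name, c in coefficients.items())


print(score(PMML_DOC, {"age": 40.0, "income": 100.0}))  # 10 + 0.5*40 + 0.25*100 = 55.0
```

A production scoring engine also has to handle data transformations, missing values, and many other model types, which is exactly the work you want a PMML-compliant tool to do for you.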

Many data mining models are a one-time affair. You use historical data to build the model and use it to analyze ... historical data. Wait! That sounds more like descriptive analytics, not predictive analytics. Well, that is sort of true. To be truly predictive, a data mining model needs to be applied to new data. These are the models that need to be operationally deployed and, from my point of view, these are the solutions that are truly revolutionizing the way we do business and live in the Big Data world.

If you then want to use your data mining model to make predictions on new data, it needs to be a dynamic asset. It cannot be static. You need to be able to build it and instantly put it to use. And that's where PMML and ADAPA come in handy.

Unfortunately, some data mining tools try to lock you in. You happily build the model using tool A, only to realize that you need the same tool to execute it. In this case, you are missing out. Here are some of the benefits of moving your predictive model to ADAPA:
  • Overcome speed/memory limitations
  • Dramatically lower your infrastructure cost
  • Tap into all the advantages of cloud computing with ADAPA on the Cloud (IBM SmartCloud or Amazon EC2)
  • Produce scores in real time (using Web Services or Java API), on demand, or in batch mode
  • Execute your models directly from Excel, by using the ADAPA Add-in for Excel
  • Benefit from using a set of PMML-compliant model development tools (best of breed)
  • Deploy your models in minutes
  • Manage models via Web Services or a Web console
  • Upload one or many models into ADAPA at once
  • Benefit from the seamless integration of business rules and predictive models (yes, for those who need it, ADAPA comes with a business rules engine)
PMML and ADAPA allow you to use best-of-breed tools (not the same old tool) for the job at hand. You can also leverage the expertise of a diverse group of data scientists. That means not all of your data scientists need to be experts in a single tool. They can use different tools that share one thing in common: the PMML standard. And, once represented in PMML, models can be easily understood by all team members. PMML allows for transparency and, in doing so, fosters best practices.

Why not benefit from: 1) an open standard to represent data mining models; and 2) a proven scoring engine that consumes any version of PMML and makes your models available for execution right away, in real time?

Also keep in mind that ADAPA's sister product, the Universal PMML Plug-in (UPPI), allows you to move the same PMML file into a database or Hadoop. UPPI is currently available for EMC Greenplum, SAP Sybase IQ, IBM Netezza, and Teradata/Aster. With UPPI for in-database scoring, there is no need to move your data outside the database. Data and models both reside inside it, so there is minimal data movement and maximum scoring speed. UPPI is also available for Datameer and will soon be available for Hadoop/Hive.

Making a model operational in minutes has never been easier! And, it is all because of PMML and scoring tools such as ADAPA and UPPI.

Welcome to the World of Predictive Analytics!

© Predictive Analytics by Zementis, Inc. - All Rights Reserved.