Model deployment used to be a big task. Predictive models, once built, needed to be re-coded into production to be able to score new data. This process was prone to errors and could easily take up to six months. Re-coding of predictive models has no place in the big data era we live in. Since data is changing rapidly, model deployment needs to be instantaneous and error-free.
PMML, the Predictive Model Markup Language, is the standard to represent predictive models. Given that PMML can be produced by all the top commercial and open-source data mining tools (e.g., FICO Model Builder, SAS EM, IBM SPSS, R, KNIME, ...), a predictive model can be easily moved into the production environment once it is represented as a PMML file.
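To make the idea concrete, here is a minimal sketch in Python of what consuming a PMML file can look like. The PMML document is hand-written for illustration (a one-variable linear regression with hypothetical fields x and y); it is not the output of any particular tool, and the tiny scorer handles only this simple case. It is in no way a substitute for a full PMML consumer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML document describing y = 1.5 + 2.0 * x.
PMML_DOC = """
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()

# Namespace used by PMML 4.1 documents.
NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def score(pmml_text, record):
    """Score one record against the first RegressionTable in the document.

    Only the intercept and NumericPredictor terms are handled here
    (exponents, categorical predictors and targets are ignored).
    """
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    result = float(table.get("intercept"))
    for predictor in table.findall("pmml:NumericPredictor", NS):
        name = predictor.get("name")
        result += float(predictor.get("coefficient")) * record[name]
    return result


prediction = score(PMML_DOC, {"x": 3.0})
print(prediction)  # 1.5 + 2.0 * 3.0 = 7.5
```

Because the model lives entirely in the XML, the scoring side never needs to know which tool built it.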
Zementis offers ADAPA for real-time scoring and UPPI for big data scoring which make the entire model deployment process a no-brainer. Given that ADAPA and UPPI are universal PMML consumers (accept any version of PMML produced by any PMML-compliant tool), they can make predictive models instantly available for execution inside the production environment.
Check out the Zementis website for details.
Tuesday, September 10, 2013
Predictive Models with PMML - Upcoming workshop at UCSD Extension - Oct 24-25
October 24-25, 2013
San Diego Supercomputer Center (SDSC), UC San Diego Campus
TO REGISTER, FOLLOW THE LINK BELOW:
The Predictive Model Markup Language (PMML) is the de facto standard to represent data mining and predictive analytic models. With PMML, one can easily share a predictive solution among PMML-compliant applications and systems.
Developed in partnership with the San Diego Supercomputer Center’s (SDSC) Predictive Analytics Center of Excellence (PACE), this 2-day, hands-on workshop will explore how PMML allows models to be deployed in minutes. You will get to know its business value and the data mining tools and companies supporting PMML. You will also begin to understand the language elements and capabilities and learn how to get the most out of your PMML code.
- Practice PMML on SDSC’s Gordon with the guidance of world class instructors from industry and academia.
- Learn how to represent an entire data mining solution using open standards
- Understand how to use PMML effectively as a vehicle for model logging, versioning and deployment
- Identify and correct issues with PMML code as well as add missing computations to auto-generated PMML code
- PLUS…Receive a comprehensive tour of SDSC to discover its inner workings, extensive capabilities and current projects.
Instructors
- Alex Guazzelli, Ph.D., Vice President of Analytics, Zementis, Inc.
- Natasha Balac, Ph.D., Director of PACE, SDSC, UC San Diego
- Paul Rodriguez, Ph.D., Research Programmer Analyst, SDSC, UC San Diego
Scholarships Available!
Thanks to the generous underwriting of Zementis, three (3) half-tuition scholarships are available. Learn more and apply
Note: Students should have a fundamental knowledge of data mining methods and basic experience with a computer programming language. Students must bring a laptop (Mac or PC) each day to fully participate in the hands-on portion of the workshop.
Course Number: CSE-41184 Credit: 2 units
This course is part of the following Certificate Program(s):
Wednesday, August 21, 2013
R and PMML Support
A PMML package for R that exports all kinds of predictive models is available directly from CRAN.
Traditionally, the pmml package offered support for the following data mining algorithms:
- ksvm (kernlab): Support Vector Machines
- nnet: Neural Networks
- rpart: C&RT Decision Trees
- lm & glm (stats): Linear and Binary Logistic Regression Models
- arules: Association Rules
- kmeans and hclust: Clustering Models
Recently, it has been expanded to support:
- multinom (nnet): Multinomial Logistic Regression Models
- glm (stats): Generalized Linear Models for classification and regression with a wide variety of link functions
- randomForest: Random Forest Models for classification and regression
- coxph (survival): Cox Regression Models to calculate survival and stratified cumulative hazards
- naiveBayes (e1071): Naive Bayes Classifiers
- glmnet: Linear ElasticNet Regression Models
The pmml package can also export data transformations built with the pmmlTransformations package (see below). It can also be used to merge two distinct PMML files into one. For example, if the transformations and the model were saved into separate PMML files, it can combine both files into one, as described in Chapter 5 of the PMML book - PMML in Action.
How does it work?
It is simple: once you have built your model using any of the supported model types, pass the model object as an input parameter to the pmml function, as shown in the figure below:
Example - sequence of R commands used to build a linear regression model using lm and the Iris dataset:
Documentation
For more on the pmml package, please take a look at the paper we published in The R Journal. For that, just follow the link below:
1) Paper: PMML: An Open Standard for Sharing Models
Also, make sure to check out the package's documentation from CRAN:
2) CRAN: pmml Package
R PMML Transformations Package
This is a brand new R package called pmmlTransformations. It transforms data and, when used in conjunction with the pmml package, allows data transformations to be exported together with the predictive model in a single PMML file. Transformations currently supported are:
- Min-max normalization
- Z-score normalization
- Dummy-fication of categorical variables
- Value Mapping
- Discretization (binning)
- Variable renaming
If you would like to contribute code to the pmmlTransformations package, please feel free to contact us.
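For readers less familiar with these transformations, the sketch below shows what the first five compute, written in plain Python for illustration. The function names are hypothetical and are not the pmmlTransformations API; the z-score uses the sample standard deviation, which is an assumption about the intended definition.

```python
from bisect import bisect_right
from statistics import mean, stdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so they span [new_min, new_max]."""
    lo, hi = min(values), max(values)
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def z_score(values):
    """Center on the mean and divide by the sample standard deviation."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]


def dummify(values):
    """Turn a categorical variable into one 0/1 indicator column per level."""
    levels = sorted(set(values))
    return {level: [1 if v == level else 0 for v in values] for level in levels}


def map_values(values, mapping, default=None):
    """Replace each value by its mapped counterpart (value mapping)."""
    return [mapping.get(v, default) for v in values]


def discretize(values, cut_points, labels):
    """Bin numeric values; needs len(labels) == len(cut_points) + 1.

    A value equal to a cut point falls into the bin to its right.
    """
    return [labels[bisect_right(cut_points, v)] for v in values]


lengths = [1.0, 2.0, 3.0, 4.0, 5.0]
species = ["setosa", "virginica", "setosa"]

print(min_max(lengths))
print(z_score(lengths))
print(dummify(species))
print(map_values(species, {"setosa": 0, "virginica": 1}))
print(discretize(lengths, [2.0, 5.0], ["low", "mid", "high"]))
```

Variable renaming is omitted because it only changes a field's name, not its values.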
How does it work?
The pmmlTransformations package works in tandem with the pmml package so that data pre-processing can be represented together with the model in the resulting PMML code.
In R, as shown in the figure below, this process includes three steps:
- With the use of the pmmlTransformations package, transform the raw input data as appropriate
- Use transformed and raw data as inputs to the modeling function/package (hclust, nnet, glm, ...)
- Output the entire solution (data pre-processing + model) in PMML using the pmml package
Example - sequence of R commands used to build a linear regression model using lm with transformed data
Documentation
For more on the pmmlTransformations package, please take a look at the paper we wrote for the KDD 2013 PMML Workshop. For that, just follow the link below:
1) KDD Paper: The R pmmlTransformations Package
Also, make sure to check out the package's documentation from CRAN:
2) CRAN: pmmlTransformations Package
Wednesday, August 7, 2013
Data Transformations - from R to PMML - The pmmlTransformations Package
We are very excited to announce the availability of the R pmmlTransformations package. This package allows you to export data transformations together with your model from R into a PMML file, which can then be deployed in the Zementis ADAPA or UPPI scoring engines. Real-time or big data scoring made easy with R, PMML, and Zementis.
The pmmlTransformations package provides R users with functions that greatly enhance the available data mining capabilities and PMML support by allowing transformations to be performed on the data before it is used for modeling. The pmmlTransformations package works in tandem with the pmml package so that data pre-processing can be represented together with the model in the resulting PMML code.
In R, this process includes three steps:
- With the use of the pmmlTransformations package, transform the raw input data as appropriate
- Use transformed and raw data as inputs to the modeling function/package (hclust, nnet, glm, ...)
- Output the entire solution (data pre-processing + model) in PMML using the pmml package
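The same three steps can be sketched outside R to show what "pre-processing plus model in one PMML file" means. The Python example below is an illustration under stated assumptions, not the code path of the pmml or pmmlTransformations packages: the data, the field names (x, y, x_z) and the helper build_pmml are hypothetical, and the z-score is encoded with nested Apply elements, which is only one of several ways PMML can express it.

```python
import xml.etree.ElementTree as ET
from statistics import mean, stdev

# Step 1: transform the raw input (z-score) and keep the parameters.
raw_x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up data
y = [2.1, 3.9, 6.2, 7.8, 10.0]
mu, sigma = mean(raw_x), stdev(raw_x)
z = [(v - mu) / sigma for v in raw_x]

# Step 2: fit y = intercept + slope * z by ordinary least squares.
z_bar, y_bar = mean(z), mean(y)
sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
sxx = sum((zi - z_bar) ** 2 for zi in z)
slope = sxy / sxx
intercept = y_bar - slope * z_bar


# Step 3: write the transformation and the model into one PMML document.
def build_pmml(mu, sigma, intercept, slope):
    pmml = ET.Element("PMML", version="4.1",
                      xmlns="http://www.dmg.org/PMML-4_1")
    ET.SubElement(pmml, "Header")
    dictionary = ET.SubElement(pmml, "DataDictionary", numberOfFields="2")
    for name in ("x", "y"):
        ET.SubElement(dictionary, "DataField", name=name,
                      optype="continuous", dataType="double")

    model = ET.SubElement(pmml, "RegressionModel", functionName="regression")
    schema = ET.SubElement(model, "MiningSchema")
    ET.SubElement(schema, "MiningField", name="x")
    ET.SubElement(schema, "MiningField", name="y", usageType="predicted")

    # Pre-processing: x_z = (x - mu) / sigma, stored next to the model.
    local = ET.SubElement(model, "LocalTransformations")
    derived = ET.SubElement(local, "DerivedField", name="x_z",
                            optype="continuous", dataType="double")
    divide = ET.SubElement(derived, "Apply", function="/")
    minus = ET.SubElement(divide, "Apply", function="-")
    ET.SubElement(minus, "FieldRef", field="x")
    ET.SubElement(minus, "Constant").text = repr(mu)
    ET.SubElement(divide, "Constant").text = repr(sigma)

    # Model: the regression refers to the derived field, not the raw one.
    table = ET.SubElement(model, "RegressionTable", intercept=repr(intercept))
    ET.SubElement(table, "NumericPredictor", name="x_z",
                  coefficient=repr(slope))
    return ET.tostring(pmml, encoding="unicode")


pmml_text = build_pmml(mu, sigma, intercept, slope)
print(pmml_text)
```

Because the DerivedField travels with the RegressionTable, a scoring engine receives raw x values and applies the same normalization that was used during training.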

Both the pmmlTransformations package and the pmml package are available for download from CRAN. Give them a try!
Want to learn more? Check out the paper we published about the pmmlTransformations package at KDD 2013.
Labels:
Data Transformations,
PMML,
Predictive Analytics,
Predictive Models,
R
Wednesday, July 10, 2013
PMML Workshop at KDD 2013 and UCSD Extension PMML Class
KDD 2013 PMML Workshop
Join us for the KDD PMML Workshop to be held in Chicago on August 11. Organized by the Data Mining Group (DMG), this workshop will feature invited talks and presentations of selected papers.
Zementis will be presenting two papers about PMML support in R: coding and representing data transformations and models through the pmmlTransformations and pmml packages.
UCSD PMML Class (Coming this Fall)
UCSD Extension has teamed up with the San Diego Supercomputer Center Predictive Analytics Center of Excellence (PACE) and Zementis to offer a PMML class to the data mining community on October 24 and 25.
For more information about this great opportunity to learn the standard that is revolutionizing how predictive solutions are documented and deployed, refer to the UCSD Extension catalog.
Tuesday, May 7, 2013
The Zementis Partnership with FICO
Stuart Wells, FICO CTO, announced the strategic partnership between Zementis and FICO at FICO World on May 2, 2013. FICO clients will now benefit from the outstanding Zementis scoring technology.
How? The Zementis ADAPA scoring engine provides a highly scalable framework to deploy, integrate, and execute complex data mining and predictive models based on the PMML standard. Models built in most commercial and open source data mining tools, such as FICO Model Builder or R, can now instantly be deployed in the FICO Analytic Cloud.
Customers, application developers and FICO partners will be able to extract value and insight from their predictive models and data immediately, using ADAPA and PMML. This will shorten the time to innovation and value for their analytic applications.
Read the press release!
Predictive Analytics Deployment
Zementis offers software solutions that enable scalable, real-time execution of predictive analytics across a variety of platforms based on the PMML standard. These include:
ADAPA Scoring Engine: Our solution for real-time scoring. ADAPA is available for on-site deployment as a traditional license or as a service in the Amazon Elastic Compute Cloud (EC2) and IBM SmartCloud Enterprise. And now, with our FICO partnership, ADAPA will also be available in the FICO Analytic Cloud.
UPPI, the Universal PMML Plug-in: The leading solution for Big Data, UPPI provides scoring in-database and for Hadoop. It is available for EMC Greenplum, IBM Netezza, SAP Sybase IQ, Teradata/Aster as well as Hadoop/Hive and Datameer.
Labels:
ADAPA,
FICO,
Model Deployment,
PMML,
Predictive Model Markup Language,
Zementis