Showing posts with label Weka. Show all posts

Friday, November 6, 2009

PMML and Open Source Data Mining

Open source tools provide a cost-effective yet powerful option for data mining. The following contenders adhere to the PMML standard, which facilitates model exchange among open source and commercial tools and provides a clear route to production deployment of predictive models.
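To make "model exchange" concrete, here is a minimal Python sketch of what a consuming tool sees when it opens a PMML file. The PMML document below is hand-written for illustration and was not produced by any of the tools discussed here; the field names, the model name and the `summarize` helper are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative PMML 4.0 document (hypothetical fields and model).
PMML_DOC = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="spend_model" functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="age" coefficient="0.5"/>
      <NumericPredictor name="income" coefficient="0.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"pmml": "http://www.dmg.org/PMML-4_0"}

# Top-level PMML children that are not models; everything else is treated
# as a model element in this sketch.
NON_MODEL_TAGS = {"Header", "MiningBuildTask", "DataDictionary",
                  "TransformationDictionary", "Extension"}


def summarize(pmml_text):
    """Return the PMML version, the declared data fields and the model types."""
    root = ET.fromstring(pmml_text)
    fields = [f.get("name")
              for f in root.findall("pmml:DataDictionary/pmml:DataField", NS)]
    # Strip the "{namespace}" prefix ElementTree puts in front of tag names.
    local_names = [child.tag.split("}")[-1] for child in root]
    models = [name for name in local_names if name not in NON_MODEL_TAGS]
    return {"version": root.get("version"), "fields": fields, "models": models}


if __name__ == "__main__":
    print(summarize(PMML_DOC))
```

Because the data dictionary and the model travel together in one XML document, any tool that understands the standard can read the same file, whichever tool wrote it.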
The R Project
The R Project for Statistical Computing is among the most widely used and respected statistical packages among advocates of open-source and community computing projects. Much like the iPhone App Store, CRAN (the Comprehensive R Archive Network) has a package for almost anything you need, at least anything statistical (no navigation system for R yet). CRAN is also where you will find the R PMML package, which lets R users export PMML for a variety of models, including decision trees and neural networks, among many others. We recently co-authored an article about the package with Graham Williams, its original author and maintainer; the article can be downloaded directly from The R Journal website. If you are interested in contributing code to the package, please contact us.
KNIME
Developed by the University of Konstanz, KNIME is an open-source platform that enables users to visually create and execute data flows. Since KNIME 2.0 (available as of December 2008), users can import and export PMML models into and out of KNIME. Given that users can use R within KNIME, the R PMML package can also be used to export and convert R models to PMML within KNIME. New versions of KNIME will most certainly expand its support for PMML even further.
Weka
Developed by the University of Waikato, Weka provides a large collection of machine learning algorithms for solving data mining problems. Although Weka currently has no PMML export functionality, Mark Hall is working on implementing PMML import. Weka can already import models such as regression, decision trees and neural networks, and its PMML support keeps expanding with the addition of transformations and built-in functions.
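As a rough illustration of what importing a PMML regression model involves, the Python sketch below reads the regression table from a PMML document and applies it to one record. It is not Weka's implementation (Weka is written in Java), and the PMML fragment, field names and helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"pmml": "http://www.dmg.org/PMML-4_0"}

# Hand-written, illustrative PMML document (hypothetical fields and values).
REGRESSION_PMML = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="spend_model" functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="age" coefficient="0.5"/>
      <NumericPredictor name="income" coefficient="0.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


def load_regression(pmml_text):
    """Extract the intercept and the numeric terms of the first regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    if table is None:
        raise ValueError("no RegressionModel/RegressionTable found")
    intercept = float(table.get("intercept"))
    # Each term is (field name, coefficient, exponent); exponent defaults to 1.
    terms = [(p.get("name"), float(p.get("coefficient")),
              int(p.get("exponent", "1")))
             for p in table.findall("pmml:NumericPredictor", NS)]
    return intercept, terms


def score(model, record):
    """Compute intercept + sum(coefficient * value ** exponent) for one record."""
    intercept, terms = model
    return intercept + sum(coef * record[name] ** exp
                           for name, coef, exp in terms)


if __name__ == "__main__":
    model = load_regression(REGRESSION_PMML)
    print(score(model, {"age": 40.0, "income": 80.0}))
```

The sketch handles only numeric predictors in a single regression table. A real importer also has to deal with categorical predictors, missing values, and the transformations and built-in functions mentioned above.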

RapidMiner
Most recently, Rapid-I announced that it will extend the latest version of its RapidMiner software to include support for PMML. RapidMiner, formerly known as YALE, is an open-source platform that offers operators for all aspects of data mining. Like KNIME, Rapid-I is one of the latest organizations to join the ranks of the Data Mining Group (DMG), alongside companies like IBM, MicroStrategy, SPSS, SAS and Zementis. The DMG is already busy refining PMML and adding yet more capabilities and power to it.

PMML Discussion Forums
For ongoing discussion and the latest PMML news, we invite you to join the PMML group on LinkedIn or the discussion forum of the PMML group on Analytic Bridge, a social network community for analytics professionals. For PMML resources, examples, and useful links, please take a look at the PMML page on the Zementis website.

Thursday, October 15, 2009

Latest issue of ACM SIGKDD Explorations focuses on open source analytics, PMML and cloud computing.


The latest issue of ACM SIGKDD Explorations is out! This issue is relevant in many ways, since it not only gives special attention to open source analytics (including articles on Weka and KNIME), but it also discusses PMML and cloud computing.

PMML, in particular, gets special treatment. It is described in a full article written by Rick Pechter from MicroStrategy. As Rick puts it, "the Predictive Model Markup Language data mining standard has arguably become one of the most widely adopted data mining standards in use today."

PMML is also discussed in most of the other articles, including the one by Zementis, titled "Efficient Deployment of Predictive Analytics through Open Standards and Cloud Computing". In this article, we use the ADAPA scoring engine to illustrate how PMML and cloud computing can be combined into a platform that makes the deployment of statistical models efficient.

So, don't miss out on this issue of SIGKDD Explorations. We invite you to explore all the peer-reviewed articles in detail.

Welcome to the World of Predictive Analytics!

© Predictive Analytics by Zementis, Inc. - All Rights Reserved.




