Classification problem
Hi experts,
I have a problem working with a query-based taxonomy. I will describe step by step what I did, so it may be easier for you to diagnose the problem (i.e. whether I missed a step).
I have configured a web repository (called test_repository) and an index that uses the repository as a data source. The web repository contains two websites, which are actually the English and French versions of one site (i.e. they share the same HTTP system, but their system paths differ: one is /en and the other is /fr).
I then created a query-based taxonomy (called test_taxonomy), and created the categories 'en' and 'fr'. Within these categories I created other subcategories. I then used the Taxonomy Query Builder to define the content of the categories and the subcategories. Since I am categorizing a web repository, my queries are all URL character strings.
For example:
For test_taxonomy-->en, I set the query to
property= 'Folder'
value = /test_repository/website_en/*
and for the subcategories within 'en', I set the query to
property= 'Folder'
value = /test_repository/website_en/cat<n>/*
(n = 1...10)
And I used similar setup for the category 'fr' and its subcategories.
There are around 900 documents in the repository. After I saved and clicked 'Update', only very few documents (about 5%) appeared in the appropriate subcategories; the rest remained in the first-level categories 'en' or 'fr' even though their URLs indicate that they belong in one of the subcategories (e.g. a document whose URL starts with www.mywebsite.com/en/cat1). Still other documents failed to get classified at all (even though their URLs start with www.mywebsite.com/en, for example), and they remain in the taxonomy's 'To be Classified' folder (and yes, I have configured auto-classification in the index).
What could be wrong? I think I have done everything correctly... The yield of the queries shouldn't be that low, should it?
Thanks for your patience. Urgent problem, points will be generously awarded
Charles
hmm.. I can't seem to classify a web repository data source with the 'Folder Id' query; the system informs me that the resources do not exist (presumably because it is web-based and the repository is not hierarchical?).
Also, I read in the documentation:
http://help.sap.com/saphelp_nw04/helpdata/en/6b/36527995b3cc43bf47d7451608b0be/frameset.htm
that 'In the case of Web repositories, the path has to contain an asterisk (*) as a placeholder. The asterisk prevents the system from checking the existence of a folder in the repository.'
So in the end I used *, but following a suggestion from another thread,
I used it in the pattern *ABC instead. In this case, I get all resources whose URLs begin with http://www.example.com/XYZ/ABC/...
However, the yield is still unsatisfactory: a considerable number of the resources still did not get auto-classified.
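To illustrate why a query anchored to the internal repository path matches almost nothing while a leading-asterisk pattern does match: here is a toy wildcard matcher in plain Java. This is purely illustrative; the method and its matching semantics are my assumptions, not the actual TREX/KM implementation.

```java
import java.util.regex.Pattern;

// Toy wildcard matcher: '*' stands for any run of characters, everything
// else is matched literally. Illustrative only - not the real KM matcher.
public class TaxonomyQuerySketch {
    static boolean matches(String pattern, String url) {
        // Quote the literal parts, turn each '*' into '.*', and allow a
        // trailing remainder so the pattern behaves like a folder prefix.
        String regex = "\\Q" + pattern.replace("*", "\\E.*\\Q") + "\\E.*";
        return Pattern.matches(regex, url);
    }

    public static void main(String[] args) {
        String url = "http://www.mywebsite.com/en/cat1/page.html";
        // Anchored to the internal repository path: never matches a crawled URL.
        System.out.println(matches("/test_repository/website_en/cat1/*", url)); // false
        // Leading asterisk: matches the path fragment anywhere in the URL.
        System.out.println(matches("*/en/cat1/*", url)); // true
    }
}
```

The point is only that the crawled documents carry full URLs, so a pattern without a leading asterisk can never line up with them.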
Thanks
Similar Messages
-
Dear All
While creating a classification in IPMD for a permit, the system gives a dump.
I showed it to an ABAP programmer, but still no solution has been found.
Is there any problem in assigning the values to the class or characteristics for the IPMD classification?
Regards
Praveen Dhankar
Hi all,
Please find the dump in here as follows:
Transaction Code: IPMD
Select a line item.
When I go into change mode and click on the detail, it displays a screen.
Now when I click on Classification, it gives me a dump.
Runtime Errors ASSIGN_LENGTH_0
Date and Time 06.05.2009 10:02:52
Short text
Program error: ASSIGN with length 0 in program "SAPLCLCV".
What happened?
Error in the ABAP Application Program
The current ABAP program "SAPLCLCV" had to be terminated because it has come across a statement that unfortunately cannot be executed. The program tried to create a field with length 0; this is not possible.
Error analysis
In an ASSIGN statement in the program "SAPLCLCV", a field symbol with length 0 was to be created. However, fields of length 0 are not possible.
Trigger Location of Runtime Error
Program SAPLCLCV
Include LCLCVU03
Row 49
Module type (FUNCTION)
Module Name CLCV_CONV_EXIT
Source Code Extract
Line SourceCde
19 SMASK(7) TYPE C VALUE '==$',
20 CFELD LIKE RMCLF-OBJEK,
21 *------- the object that has to to be classified will be build up in
22 *------- field XFELD appropriate definition of TCLO
23 XFELD(100) TYPE C,
24 L_STRING_LENGTH TYPE I,
25 L_FIELD_LENGTH TYPE I,
26 L_KEYFIELD TYPE TCLO-KEYF0,
27 L_RETCODE TYPE SY-SUBRC,
28
29 BEGIN OF FIELD_ATTR.
30 INCLUDE STRUCTURE DFIES. DATA:
31 END OF FIELD_ATTR.
32
33 FIELD-SYMBOLS:
34 <OBJEKT>,
35 <FELD>.
36
37 CLEAR OFFSET.
38
39 * read TCLO to work area and set CONVTAB v 1235023
40 * corresponding to processed object type
41 perform SET_CONVTAB using TABLE. "^ 1235023
42
43 *------- assign given string to work field
44 ASSIGN EX_OBJECT TO <OBJEKT>.
45
46 LOOP AT CONVTAB.
47 *------- reduce given string to its componets via given information
48 *------- of contab
>>>>> ASSIGN <OBJEKT>(CONVTAB-LEN) TO <FELD>.
50
51 IF CONVTAB-CON IS INITIAL.
52 WRITE <FELD> TO CFELD.
53 ELSE.
54 *------- a specific conversion rule is involved
55 REPLACE '$' WITH CONVTAB-CON INTO MASK.
56 condense mask no-gaps. "UP_H578730
57 *------- prepare field with given conversion rule
58 WRITE <FELD> TO CFELD USING EDIT MASK MASK.
59 MASK = SMASK.
60 ENDIF.
61
62 WRITE CFELD TO XFELD+OFFSET(CONVTAB-OUTL).
63 OFFSET = OFFSET + CONVTAB-OUTL.
64 WRITE SPACE TO XFELD+OFFSET(1).
65 OFFSET = OFFSET + 1.
66 * shift_value = convtab-outl.
67 shift_value = convtab-len.
68 help_field = <objekt>.
-
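The dump originates in the ASSIGN at row 49: `ASSIGN <OBJEKT>(CONVTAB-LEN)` fails when the conversion table delivers a component length of 0, because ABAP cannot create a field symbol of length 0. As a rough analogy (plain Java with hypothetical names, not SAP code), think of splitting a concatenated object key by a table of field lengths; a zero-length entry makes the split impossible:

```java
import java.util.ArrayList;
import java.util.List;

// Analogy for CLCV_CONV_EXIT: reduce a concatenated object key into its
// components using a table of field lengths (like the CONVTAB-LEN entries).
public class AssignLengthSketch {
    static List<String> splitKey(String objectKey, int[] lengths) {
        List<String> parts = new ArrayList<>();
        int offset = 0;
        for (int len : lengths) {
            if (len <= 0) // the ABAP equivalent dumps with ASSIGN_LENGTH_0 here
                throw new IllegalArgumentException(
                    "component length 0 - key-field lengths not resolved");
            parts.add(objectKey.substring(offset, Math.min(offset + len, objectKey.length())));
            offset += len;
        }
        return parts;
    }
}
```

In the dump, the zero length suggests that the key-field length information read from TCLO for the object type came back empty, which points at classification customizing or an SAP correction note rather than at your data.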
Dear Gurus,
I have a stock of 100 MT of a material with batch number A. I am using batch characteristic values b and c in the classification of batch A. I do PGI for 10 MT. In MB51, if I look at the material document, I see the despatched quantity 10 with batch number A and characteristic values b and c. No problem so far.
As per my scenario, I now run MSC2N for batch A and change the characteristic values to b1 and c1.
Now if I check the material document for the earlier movement (601), in the classification I see b1 and c1. But as per my requirement, we should see the batch characteristic values b and c for the 601 I posted previously, because at the time of despatch they were b and c.
What do I have to do from my end? Please help me with where I have to do the configuration.
Guna
Hi Guna,
I think that in the material document the system records only the batch number, e.g. A in your case.
When you display the characteristics, they are taken from the batch record; so if you changed the values of the characteristics after the movement, you'll see the new values (as you do). The system does not capture and freeze the classification of the batch in the material document.
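In other words, the document points at the batch, and the characteristic values are resolved from the batch master at display time. A minimal sketch of that data model (hypothetical names, plain Java; not how the SAP tables actually look):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a material document line stores only the batch NUMBER; the
// characteristic values are looked up in the batch master when displayed.
public class BatchDisplaySketch {
    static Map<String, String> batchMaster = new HashMap<>(); // batch -> characteristics

    static class MaterialDocLine {
        final String material; final String batch; final int qty;
        MaterialDocLine(String material, String batch, int qty) {
            this.material = material; this.batch = batch; this.qty = qty;
        }
    }

    static String displayedClassification(MaterialDocLine line) {
        return batchMaster.get(line.batch); // resolved at read time, not frozen
    }

    public static void main(String[] args) {
        batchMaster.put("A", "b,c");
        MaterialDocLine pgi601 = new MaterialDocLine("MAT1", "A", 10); // the 601 posting
        System.out.println(displayedClassification(pgi601)); // b,c
        batchMaster.put("A", "b1,c1"); // MSC2N changes the batch master...
        System.out.println(displayedClassification(pgi601)); // ...so the old document now shows b1,c1
    }
}
```

To keep the values as of despatch, you would need to capture them separately (for example via the change documents written on the batch), since the standard material document does not freeze them.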
Regards,
Mario -
Commit Classification problems
Hi everyone,
Does someone know a function module which I could use to save classification data to tables AUSP and KSSK?
Regards,
Gabriel
Try this BAPI FM: BAPI_CHARACT_CHANGE
Also check for other FMs in transaction BAPI:
Hierarchical (tab) --> Cross-Application Components --> Classification System
Perhaps you will find one of the required BAPIs there.
Also check this FM: CLAF_CLASSIFICATION_OF_OBJECTS,
bapiclass,
Finally, look at this one: CLAP_DDB_UPDATE_CLASSIFICATION
Regards,
Subramanian V. -
EIM/WIM multiple classifications problem - alarm node
Hello,
I need some advice on an alarm workflow. I would like to select multiple classifications with one alarm node, but the boolean "OR" statement is not working properly.
For example: I have category A and category B. I would like to specify Filter-Activity condition based on Relationships. I select :
<object type>-<attribute>-<operator>-<value>-<boolean>
Classification-Classification name-=-A-"OR"
Classification-Classification name-=-B-"AND"
But after the "OK" button is pressed and I reopen the alarm node, the "OR" statement has changed to "AND", so I am unable to filter based on multiple classifications.
The classifications do not contain a common string, so I cannot, for example, use another operator to achieve the selection.
Any advice how to do this? (Copying alarm workflows is such a strange idea. :)
Seems to be a defect; you should be able to evaluate condition 1 and condition 2 = result 1,
result 1 or condition 3, result 1 and condition 3, etc.
in your case if you have,
classification classification name contains test OR
classification classification name contains test2 OR ----- Result 1
classification classification name contains test3 OR------ Result 2
classification classification name contains test4 AND-----Result 3
and this is not saving, then it is a bug and should be filed as such. -
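The left-to-right evaluation described above (each connector combines the accumulated result with the next condition) can be sketched like this. Illustrative Java only, not the workflow engine's code:

```java
// Fold conditions left to right; ops[i] is the connector that follows
// condition i ("OR"/"AND"), combining the running result with cond i+1.
public class ConditionChainSketch {
    static boolean evaluate(boolean[] conds, String[] ops) {
        boolean result = conds[0];
        for (int i = 1; i < conds.length; i++) {
            result = "AND".equals(ops[i - 1]) ? result && conds[i]
                                              : result || conds[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // A document classified only under B: A-OR-B should match it...
        System.out.println(evaluate(new boolean[]{false, true}, new String[]{"OR"}));  // true
        // ...but after the UI silently rewrites OR to AND, it no longer does.
        System.out.println(evaluate(new boolean[]{false, true}, new String[]{"AND"})); // false
    }
}
```

This is exactly why the silent OR-to-AND rewrite makes multi-classification filtering impossible: a document rarely carries both classifications at once.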
iTunes 7 Album Classification Problem!!
OK, so today I got iTunes 7. It's great and all, but it seems it's classifying my albums by year instead of by name. If anyone knows how I can put it back to being sorted by name, please tell me how to do so. It would be greatly appreciated. Thank you!!
Click on the top bar where it reads "NAME" or "ARTIST" instead
of year, sir. Simple problem, simple solution. -
Dear Friends ,
Through T-code OVK1, I maintained the tax category MWST for country codes DE and ZA.
While creating a customer under Germany (DE), at the sales area > billing tab, why is the system showing MWST twice, for country codes DE and ZA, as below? I do not understand this issue. Please help me.
DE MWST
ZA MWST
Regards
Satya
Problem is solved,
Thanks a lot
Bye
Satya -
RV016 - System Log Classification Problem
I have an RV016 V3 with the latest firmware 4.1.
I have enabled ProtectLink Web.
When I check the system log, I see it classifies the URLs the wrong way.
For example, if I go to www.facebook.com, the router blocks it OK, but in the log it appears as a Dialer URL, when it should be classified as Social Networking.
This happens with all the URLs.
What can I do to solve this??
Thanks in advance.
Regards
Juan Pablo
Good afternoon,
Thanks for using our forum
My name is Haider Ali Malhi and I am part of the Small Business Support Community. I tried to replicate your network and discovered that the issue you mentioned is present in the old firmware. Beta versions cannot really be relied upon, as they are not the final product; I would also suggest waiting for the latest firmware to be released, because that should solve any issues you are currently facing.
Thank you for your time and patience.
I hope you find this information useful, also please do let me know how this works out.
Greetings,
Haider Ali Malhi
Small Business Content Developer -
Taxonomy? Classification? Categorisation?
Not sure if there is a way around this classification problem.
I have a supplier who produces products that can be classified under two different categories, i.e. uniforms and site cleaning services. However, the invoices are under the one vendor ID (same company).
How can I force the user to select the right category?
As it stands, the user selects the invoice based on the supplier and is then presented with two categories (uniforms, site services).
Is there a taxonomy that can be used that groups such suppliers?
I.e. use a special field to flag the vendor ID so that, regardless of what service they are invoicing for, it falls under the one category.
What I am looking for is an established industry standard for dealing with such suppliers for categorisation purposes, to increase granular spend visibility.
Hi Fred,
You need to create a purchasing group.
Refer to:
http://help.sap.com/saphelp_erp60_sp/helpdata/EN/48/6dd238d7554c49e10000000a11402f/frameset.htm
Purchasing group
Re: How to create Purchase group
thanks
ng -
Hi,
I got an error in the following Java program, dmtreedemo.java. The code and the error are given below.
I have installed Oracle 10g R2 and am using JDK 1.4.2. I have set the classpath for jdm.jar and ojdm_api.jar from the Oracle 10g R2 installation. It compiled successfully, but at execution I got this error:
F:\Mallari\DATA MINING demos\java\samples>java dmtreedemo localhost:1521:orcl scott tiger
--- Build Model - using cost matrix ---
javax.datamining.JDMException: Generic Error.
at oracle.dmt.jdm.resource.OraExceptionHandler.createException(OraExceptionHandler.java:142)
at oracle.dmt.jdm.resource.OraExceptionHandler.createException(OraExceptionHandler.java:91)
at oracle.dmt.jdm.OraDMObject.createException(OraDMObject.java:111)
at oracle.dmt.jdm.base.OraTask.saveObjectInDatabase(OraTask.java:204)
at oracle.dmt.jdm.OraMiningObject.saveObjectInDatabase(OraMiningObject.java:164)
at oracle.dmt.jdm.resource.OraPersistanceManagerImpl.saveObject(OraPersistanceManagerImpl.java:245)
at oracle.dmt.jdm.resource.OraConnection.saveObject(OraConnection.java:383)
at dmtreedemo.executeTask(dmtreedemo.java:622)
at dmtreedemo.buildModel(dmtreedemo.java:304)
at dmtreedemo.main(dmtreedemo.java:199)
Caused by: java.sql.SQLException: Unsupported feature
at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:134)
at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:179)
at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:269)
at oracle.jdbc.dbaccess.DBError.throwUnsupportedFeatureSqlException(DBError.java:690)
at oracle.jdbc.driver.OracleCallableStatement.setString(OracleCallableStatement.java:1337)
at oracle.dmt.jdm.utils.OraSQLUtils.createCallableStatement(OraSQLUtils.java:126)
at oracle.dmt.jdm.utils.OraSQLUtils.executeCallableStatement(OraSQLUtils.java:532)
at oracle.dmt.jdm.scheduler.OraProgramJob.createJob(OraProgramJob.java:77)
at oracle.dmt.jdm.scheduler.OraJob.saveJob(OraJob.java:107)
at oracle.dmt.jdm.scheduler.OraProgramJob.saveJob(OraProgramJob.java:85)
at oracle.dmt.jdm.scheduler.OraProgramJob.saveJob(OraProgramJob.java:290)
at oracle.dmt.jdm.base.OraTask.saveObjectInDatabase(OraTask.java:199)
... 6 more
Please help me out with this; I will be very thankful.
===========================================================
the sample code is
// Copyright (c) 2004, 2005, Oracle. All rights reserved.
// File: dmtreedemo.java
* This demo program describes how to use the Oracle Data Mining (ODM) Java API
* to solve a classification problem using Decision Tree (DT) algorithm.
* PROBLEM DEFINITION
* How to predict whether a customer responds or not to the new affinity card
* program using a classifier based on DT algorithm?
* DATA DESCRIPTION
* Data for this demo is composed from base tables in the Sales History (SH)
* schema. The SH schema is an Oracle Database Sample Schema that has the customer
* demographics, purchasing, and response details for the previous affinity card
* programs. Data exploration and preparing the data is a common step before
* doing data mining. Here in this demo, the following views are created in the user
* schema using CUSTOMERS, COUNTRIES, and SUPPLIMENTARY_DEMOGRAPHICS tables.
* MINING_DATA_BUILD_V:
* This view collects the previous customers' demographics, purchasing, and affinity
* card response details for building the model.
* MINING_DATA_TEST_V:
* This view collects the previous customers' demographics, purchasing, and affinity
* card response details for testing the model.
* MINING_DATA_APPLY_V:
* This view collects the prospective customers' demographics and purchasing
* details for predicting response for the new affinity card program.
* DATA MINING PROCESS
* Prepare Data:
* 1. Missing Value treatment for predictors
* See dmsvcdemo.java for a definition of missing values, and the steps to be
* taken for missing value imputation. SVM interprets all NULL values for a
* given attribute as "sparse". Sparse data is not suitable for decision
* trees, but it will accept sparse data nevertheless. Decision Tree
* implementation in ODM handles missing predictor values (by penalizing
* predictors which have missing values) and missing target values (by simple
* discarding records with missing target values). We skip missing values
* treatment in this demo.
* 2. Outlier/Clipping treatment for predictors
* See dmsvcdemo.java for a discussion on outlier treatment. For decision
* trees, outlier treatment is not really necessary. We skip outlier treatment
* in this demo.
* 3. Binning high cardinality data
* No data preparation for the types we accept is necessary - even for high
* cardinality predictors. Preprocessing to reduce the cardinality
* (e.g., binning) can improve the performance of the build, but it could
* penalize the accuracy of the resulting model.
* The PrepareData() method in this demo program illustrates the preparation of the
* build, test, and apply data. We skip PrepareData() since the decision tree
* algorithm is very capable of handling data which has not been specially
* prepared. For this demo, no data preparation will be performed.
* Build Model:
* Mining Model is the prime object in data mining. The buildModel() method
* illustrates how to build a classification model using DT algorithm.
* Test Model:
* Classification model performance can be evaluated by computing test
* metrics like accuracy, confusion matrix, lift and ROC. The testModel() or
* computeTestMetrics() method illustrates how to perform a test operation to
* compute various metrics.
* Apply Model:
* Predicting the target attribute values is the prime function of
* classification models. The applyModel() method illustrates how to
* predict the customer response for affinity card program.
* EXECUTING DEMO PROGRAM
* Refer to Oracle Data Mining Administrator's Guide
* for guidelines for executing this demo program.
// Generic Java api imports
import java.math.BigDecimal;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.text.DecimalFormat;
import java.text.MessageFormat;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Stack;
// Java Data Mining (JDM) standard api imports
import javax.datamining.ExecutionHandle;
import javax.datamining.ExecutionState;
import javax.datamining.ExecutionStatus;
import javax.datamining.JDMException;
import javax.datamining.MiningAlgorithm;
import javax.datamining.MiningFunction;
import javax.datamining.NamedObject;
import javax.datamining.SizeUnit;
import javax.datamining.algorithm.tree.TreeHomogeneityMetric;
import javax.datamining.algorithm.tree.TreeSettings;
import javax.datamining.algorithm.tree.TreeSettingsFactory;
import javax.datamining.base.AlgorithmSettings;
import javax.datamining.base.Model;
import javax.datamining.base.Task;
import javax.datamining.data.AttributeDataType;
import javax.datamining.data.CategoryProperty;
import javax.datamining.data.CategorySet;
import javax.datamining.data.CategorySetFactory;
import javax.datamining.data.ModelSignature;
import javax.datamining.data.PhysicalAttribute;
import javax.datamining.data.PhysicalAttributeFactory;
import javax.datamining.data.PhysicalAttributeRole;
import javax.datamining.data.PhysicalDataSet;
import javax.datamining.data.PhysicalDataSetFactory;
import javax.datamining.data.SignatureAttribute;
import javax.datamining.modeldetail.tree.TreeModelDetail;
import javax.datamining.modeldetail.tree.TreeNode;
import javax.datamining.resource.Connection;
import javax.datamining.resource.ConnectionFactory;
import javax.datamining.resource.ConnectionSpec;
import javax.datamining.rule.Predicate;
import javax.datamining.rule.Rule;
import javax.datamining.supervised.classification.ClassificationApplySettings;
import javax.datamining.supervised.classification.ClassificationApplySettingsFactory;
import javax.datamining.supervised.classification.ClassificationModel;
import javax.datamining.supervised.classification.ClassificationSettings;
import javax.datamining.supervised.classification.ClassificationSettingsFactory;
import javax.datamining.supervised.classification.ClassificationTestMetricOption;
import javax.datamining.supervised.classification.ClassificationTestMetrics;
import javax.datamining.supervised.classification.ClassificationTestMetricsTask;
import javax.datamining.supervised.classification.ClassificationTestMetricsTaskFactory;
import javax.datamining.supervised.classification.ClassificationTestTaskFactory;
import javax.datamining.supervised.classification.ConfusionMatrix;
import javax.datamining.supervised.classification.CostMatrix;
import javax.datamining.supervised.classification.CostMatrixFactory;
import javax.datamining.supervised.classification.Lift;
import javax.datamining.supervised.classification.ReceiverOperatingCharacterics;
import javax.datamining.task.BuildTask;
import javax.datamining.task.BuildTaskFactory;
import javax.datamining.task.apply.DataSetApplyTask;
import javax.datamining.task.apply.DataSetApplyTaskFactory;
// Oracle Java Data Mining (JDM) implemented api imports
import oracle.dmt.jdm.algorithm.tree.OraTreeSettings;
import oracle.dmt.jdm.resource.OraConnection;
import oracle.dmt.jdm.resource.OraConnectionFactory;
import oracle.dmt.jdm.modeldetail.tree.OraTreeModelDetail;
public class dmtreedemo
//Connection related data members
private static Connection m_dmeConn;
private static ConnectionFactory m_dmeConnFactory;
//Object factories used in this demo program
private static PhysicalDataSetFactory m_pdsFactory;
private static PhysicalAttributeFactory m_paFactory;
private static ClassificationSettingsFactory m_clasFactory;
private static TreeSettingsFactory m_treeFactory;
private static BuildTaskFactory m_buildFactory;
private static DataSetApplyTaskFactory m_dsApplyFactory;
private static ClassificationTestTaskFactory m_testFactory;
private static ClassificationApplySettingsFactory m_applySettingsFactory;
private static CostMatrixFactory m_costMatrixFactory;
private static CategorySetFactory m_catSetFactory;
private static ClassificationTestMetricsTaskFactory m_testMetricsTaskFactory;
// Global constants
private static DecimalFormat m_df = new DecimalFormat("##.####");
private static String m_costMatrixName = null;
public static void main( String args[] )
try
if ( args.length != 3 ) {
System.out.println("Usage: java dmsvrdemo <Host name>:<Port>:<SID> <User Name> <Password>");
return;
String uri = args[0];
String name = args[1];
String password = args[2];
// 1. Login to the Data Mining Engine
m_dmeConnFactory = new OraConnectionFactory();
ConnectionSpec connSpec = m_dmeConnFactory.getConnectionSpec();
connSpec.setURI("jdbc:oracle:thin:@"+uri);
connSpec.setName(name);
connSpec.setPassword(password);
m_dmeConn = m_dmeConnFactory.getConnection(connSpec);
// 2. Clean up all previously created demo objects
clean();
// 3. Initialize factories for mining objects
initFactories();
m_costMatrixName = createCostMatrix();
// 4. Build model with supplied cost matrix
buildModel();
// 5. Test model - To compute accuracy and confusion matrix, lift result
// and ROC for the model from apply output data.
// Please see dnnbdemo.java to see how to test the model
// with a test input data and cost matrix.
// Test the model with cost matrix
computeTestMetrics("DT_TEST_APPLY_OUTPUT_COST_JDM",
"dtTestMetricsWithCost_jdm", m_costMatrixName);
// Test the model without cost matrix
computeTestMetrics("DT_TEST_APPLY_OUTPUT_JDM",
"dtTestMetrics_jdm", null);
// 6. Apply the model
applyModel();
} catch(Exception anyExp) {
anyExp.printStackTrace(System.out);
} finally {
try {
//6. Logout from the Data Mining Engine
m_dmeConn.close();
} catch(Exception anyExp1) { }//Ignore
* Initialize all object factories used in the demo program.
* @exception JDMException if factory initalization failed
public static void initFactories() throws JDMException
m_pdsFactory = (PhysicalDataSetFactory)m_dmeConn.getFactory(
"javax.datamining.data.PhysicalDataSet");
m_paFactory = (PhysicalAttributeFactory)m_dmeConn.getFactory(
"javax.datamining.data.PhysicalAttribute");
m_clasFactory = (ClassificationSettingsFactory)m_dmeConn.getFactory(
"javax.datamining.supervised.classification.ClassificationSettings");
m_treeFactory = (TreeSettingsFactory) m_dmeConn.getFactory(
"javax.datamining.algorithm.tree.TreeSettings");
m_buildFactory = (BuildTaskFactory)m_dmeConn.getFactory(
"javax.datamining.task.BuildTask");
m_dsApplyFactory = (DataSetApplyTaskFactory)m_dmeConn.getFactory(
"javax.datamining.task.apply.DataSetApplyTask");
m_testFactory = (ClassificationTestTaskFactory)m_dmeConn.getFactory(
"javax.datamining.supervised.classification.ClassificationTestTask");
m_applySettingsFactory = (ClassificationApplySettingsFactory)m_dmeConn.getFactory(
"javax.datamining.supervised.classification.ClassificationApplySettings");
m_costMatrixFactory = (CostMatrixFactory)m_dmeConn.getFactory(
"javax.datamining.supervised.classification.CostMatrix");
m_catSetFactory = (CategorySetFactory)m_dmeConn.getFactory(
"javax.datamining.data.CategorySet" );
m_testMetricsTaskFactory = (ClassificationTestMetricsTaskFactory)m_dmeConn.getFactory(
"javax.datamining.supervised.classification.ClassificationTestMetricsTask");
* This method illustrates how to build a mining model using the
* MINING_DATA_BUILD_V dataset and classification settings with
* DT algorithm.
* @exception JDMException if model build failed
public static void buildModel() throws JDMException
System.out.println("---------------------------------------------------");
System.out.println("--- Build Model - using cost matrix ---");
System.out.println("---------------------------------------------------");
// 1. Create & save PhysicalDataSpecification
PhysicalDataSet buildData =
m_pdsFactory.create("MINING_DATA_BUILD_V", false);
PhysicalAttribute pa = m_paFactory.create("CUST_ID",
AttributeDataType.integerType, PhysicalAttributeRole.caseId );
buildData.addAttribute(pa);
m_dmeConn.saveObject("treeBuildData_jdm", buildData, true);
//2. Create & save Mining Function Settings
// Create tree algorithm settings
TreeSettings treeAlgo = m_treeFactory.create();
// By default, tree algorithm will have the following settings:
// treeAlgo.setBuildHomogeneityMetric(TreeHomogeneityMetric.gini);
// treeAlgo.setMaxDepth(7);
// ((OraTreeSettings)treeAlgo).setMinDecreaseInImpurity(0.1, SizeUnit.percentage);
// treeAlgo.setMinNodeSize( 0.05, SizeUnit.percentage );
// treeAlgo.setMinNodeSize( 10, SizeUnit.count );
// ((OraTreeSettings)treeAlgo).setMinDecreaseInImpurity(20, SizeUnit.count);
// Set cost matrix. A cost matrix is used to influence the weighting of
// misclassification during model creation (and scoring).
// See Oracle Data Mining Concepts Guide for more details.
String costMatrixName = m_costMatrixName;
// Create ClassificationSettings
ClassificationSettings buildSettings = m_clasFactory.create();
buildSettings.setAlgorithmSettings(treeAlgo);
buildSettings.setCostMatrixName(costMatrixName);
buildSettings.setTargetAttributeName("AFFINITY_CARD");
m_dmeConn.saveObject("treeBuildSettings_jdm", buildSettings, true);
// 3. Create, save & execute Build Task
BuildTask buildTask = m_buildFactory.create(
"treeBuildData_jdm", // Build data specification
"treeBuildSettings_jdm", // Mining function settings name
"treeModel_jdm" // Mining model name
buildTask.setDescription("treeBuildTask_jdm");
executeTask(buildTask, "treeBuildTask_jdm");
//4. Restore the model from the DME and explore the details of the model
ClassificationModel model =
(ClassificationModel)m_dmeConn.retrieveObject(
"treeModel_jdm", NamedObject.model);
// Display model build settings
ClassificationSettings retrievedBuildSettings =
(ClassificationSettings)model.getBuildSettings();
if(buildSettings == null)
System.out.println("Failure to restore build settings.");
else
displayBuildSettings(retrievedBuildSettings, "treeBuildSettings_jdm");
// Display model signature
displayModelSignature((Model)model);
// Display model detail
TreeModelDetail treeModelDetails = (TreeModelDetail) model.getModelDetail();
displayTreeModelDetailsExtensions(treeModelDetails);
* Create and save cost matrix.
* Consider an example where it costs $10 to mail a promotion to a
* prospective customer and if the prospect becomes a customer, the
* typical sale including the promotion, is worth $100. Then the cost
* of missing a customer (i.e. missing a $100 sale) is 10x that of
* incorrectly indicating that a person is good prospect (spending
* $10 for the promo). In this case, all prediction errors made by
* the model are NOT equal. To act on what the model determines to
* be the most likely (probable) outcome may be a poor choice.
* Suppose that the probability of a BUY response is 10% for a given
* prospect. Then the expected revenue from the prospect is:
* .10 * $100 - .90 * $10 = $1.
* The optimal action, given the cost matrix, is to simply mail the
* promotion to the customer, because the action is profitable ($1).
* In contrast, without the cost matrix, all that can be said is
* that the most likely response is NO BUY, so don't send the
* promotion. This shows that cost matrices can be very important.
* The caveat in all this is that the model predicted probabilities
* may NOT be accurate. For binary targets, a systematic approach to
* this issue exists. It is ROC, illustrated below.
* With ROC computed on a test set, the user can see how various model
* predicted probability thresholds affect the action of mailing a promotion.
* Suppose I promote when the probability to BUY exceeds 5, 10, 15%, etc.
* what return can I expect? Note that the answer to this question does
* not rely on the predicted probabilities being accurate, only that
* they are in approximately the correct rank order.
* Assuming that the predicted probabilities are accurate, provide the
* cost matrix table name as input to the RANK_APPLY procedure to get
* appropriate costed scoring results to determine the most appropriate
* action.
* In this demo, we will create the following cost matrix
* ActualTarget PredictedTarget Cost
* 0 0 0
* 0 1 1
* 1 0 8
* 1 1 0
private static String createCostMatrix() throws JDMException
String costMatrixName = "treeCostMatrix";
// Create categorySet
CategorySet catSet = m_catSetFactory.create(AttributeDataType.integerType);
// Add category values
catSet.addCategory(new Integer(0), CategoryProperty.valid);
catSet.addCategory(new Integer(1), CategoryProperty.valid);
// Create cost matrix
CostMatrix costMatrix = m_costMatrixFactory.create(catSet);
// ActualTarget PredictedTarget Cost
costMatrix.setValue(new Integer(0), new Integer(0), 0);
costMatrix.setValue(new Integer(0), new Integer(1), 1);
costMatrix.setValue(new Integer(1), new Integer(0), 8);
costMatrix.setValue(new Integer(1), new Integer(1), 0);
//save cost matrix
m_dmeConn.saveObject(costMatrixName, costMatrix, true);
return costMatrixName;
* This method illustrates how to compute test metrics using
* an apply output table that has actual and predicted target values. Here the
* apply operation is done on the MINING_DATA_TEST_V dataset. It creates
* an apply output table with actual and predicted target values. Using
* ClassificationTestMetricsTask test metrics are computed. This produces
* the same test metrics results as ClassificationTestTask.
* @param applyOutputName apply output table name
* @param testResultName test result name
* @param costMatrixName table name of the supplied cost matrix
* @exception JDMException if model test failed
public static void computeTestMetrics(String applyOutputName,
String testResultName, String costMatrixName) throws JDMException
if (costMatrixName != null) {
System.out.println("---------------------------------------------------");
System.out.println("--- Test Model - using apply output table ---");
System.out.println("--- - using cost matrix table ---");
System.out.println("---------------------------------------------------");
else {
System.out.println("---------------------------------------------------");
System.out.println("--- Test Model - using apply output table ---");
System.out.println("--- - using no cost matrix table ---");
System.out.println("---------------------------------------------------");
// 1. Do the apply on test data to create an apply output table
// Create & save PhysicalDataSpecification
PhysicalDataSet applyData =
m_pdsFactory.create( "MINING_DATA_TEST_V", false );
PhysicalAttribute pa = m_paFactory.create("CUST_ID",
AttributeDataType.integerType, PhysicalAttributeRole.caseId );
applyData.addAttribute( pa );
m_dmeConn.saveObject( "treeTestApplyData_jdm", applyData, true );
// 2 Create & save ClassificationApplySettings
ClassificationApplySettings clasAS = m_applySettingsFactory.create();
HashMap sourceAttrMap = new HashMap();
sourceAttrMap.put( "AFFINITY_CARD", "AFFINITY_CARD" );
clasAS.setSourceDestinationMap( sourceAttrMap );
m_dmeConn.saveObject( "treeTestApplySettings_jdm", clasAS, true);
// 3 Create, store & execute apply Task
DataSetApplyTask applyTask = m_dsApplyFactory.create(
"treeTestApplyData_jdm",
"treeModel_jdm",
"treeTestApplySettings_jdm",
applyOutputName);
if(executeTask(applyTask, "treeTestApplyTask_jdm"))
// Compute test metrics on new created apply output table
// 4. Create & save PhysicalDataSpecification
PhysicalDataSet applyOutputData = m_pdsFactory.create(
applyOutputName, false );
applyOutputData.addAttribute( pa );
m_dmeConn.saveObject( "treeTestApplyOutput_jdm", applyOutputData, true );
// 5. Create a ClassificationTestMetricsTask
ClassificationTestMetricsTask testMetricsTask =
m_testMetricsTaskFactory.create( "treeTestApplyOutput_jdm", // apply output data used as input
"AFFINITY_CARD", // actual target column
"PREDICTION", // predicted target column
testResultName ); // test metrics result name
testMetricsTask.computeMetric( // enable confusion matrix computation
ClassificationTestMetricOption.confusionMatrix, true );
testMetricsTask.computeMetric( // enable lift computation
ClassificationTestMetricOption.lift, true );
testMetricsTask.computeMetric( // enable ROC computation
ClassificationTestMetricOption.receiverOperatingCharacteristics, true );
testMetricsTask.setPositiveTargetValue( new Integer(1) );
testMetricsTask.setNumberOfLiftQuantiles( 10 );
testMetricsTask.setPredictionRankingAttrName( "PROBABILITY" );
if (costMatrixName != null) {
testMetricsTask.setCostMatrixName(costMatrixName);
displayTable(costMatrixName, "", "order by ACTUAL_TARGET_VALUE, PREDICTED_TARGET_VALUE");
}
// Store & execute the task
boolean isTaskSuccess = executeTask(testMetricsTask, "treeTestMetricsTask_jdm");
if( isTaskSuccess ) {
// Restore & display test metrics
ClassificationTestMetrics testMetrics = (ClassificationTestMetrics)
m_dmeConn.retrieveObject( testResultName, NamedObject.testMetrics );
// Display classification test metrics
displayTestMetricDetails(testMetrics);
/**
 * This method illustrates how to apply the mining model on the
 * MINING_DATA_APPLY_V dataset to predict customer
 * response. After completion of the task, an apply output table with the
 * predicted results is created at the user-specified location.
 * @exception JDMException if model apply failed
 */
public static void applyModel() throws JDMException
System.out.println("---------------------------------------------------");
System.out.println("--- Apply Model ---");
System.out.println("---------------------------------------------------");
System.out.println("---------------------------------------------------");
System.out.println("--- Business case 1 ---");
System.out.println("--- Find the 10 customers who live in Italy ---");
System.out.println("--- that are least expensive to be convinced to ---");
System.out.println("--- use an affinity card. ---");
System.out.println("---------------------------------------------------");
// 1. Create & save PhysicalDataSpecification
PhysicalDataSet applyData =
m_pdsFactory.create( "MINING_DATA_APPLY_V", false );
PhysicalAttribute pa = m_paFactory.create("CUST_ID",
AttributeDataType.integerType, PhysicalAttributeRole.caseId );
applyData.addAttribute( pa );
m_dmeConn.saveObject( "treeApplyData_jdm", applyData, true );
// 2. Create & save ClassificationApplySettings
ClassificationApplySettings clasAS = m_applySettingsFactory.create();
// Add source attributes
HashMap sourceAttrMap = new HashMap();
sourceAttrMap.put( "COUNTRY_NAME", "COUNTRY_NAME" );
clasAS.setSourceDestinationMap( sourceAttrMap );
// Add cost matrix
clasAS.setCostMatrixName( m_costMatrixName );
m_dmeConn.saveObject( "treeApplySettings_jdm", clasAS, true);
// 3. Create, store & execute apply Task
DataSetApplyTask applyTask = m_dsApplyFactory.create(
"treeApplyData_jdm", "treeModel_jdm",
"treeApplySettings_jdm", "TREE_APPLY_OUTPUT1_JDM");
executeTask(applyTask, "treeApplyTask_jdm");
// 4. Display apply result -- Note that APPLY results do not need to be
// reverse transformed, as done in the case of model details. This is
// because class values of a classification target were not (required to
// be) binned or normalized.
// Find the 10 customers who live in Italy that are least expensive to be
// convinced to use an affinity card.
displayTable("TREE_APPLY_OUTPUT1_JDM",
"where COUNTRY_NAME='Italy' and ROWNUM < 11 ",
"order by COST");
System.out.println("---------------------------------------------------");
System.out.println("--- Business case 2 ---");
System.out.println("--- List ten customers (ordered by their id) ---");
System.out.println("--- along with likelihood and cost to use or ---");
System.out.println("--- reject the affinity card. ---");
System.out.println("---------------------------------------------------");
// 1. Create & save PhysicalDataSpecification
applyData =
m_pdsFactory.create( "MINING_DATA_APPLY_V", false );
pa = m_paFactory.create("CUST_ID",
AttributeDataType.integerType, PhysicalAttributeRole.caseId );
applyData.addAttribute( pa );
m_dmeConn.saveObject( "treeApplyData_jdm", applyData, true );
// 2. Create & save ClassificationApplySettings
clasAS = m_applySettingsFactory.create();
// Add cost matrix
clasAS.setCostMatrixName( m_costMatrixName );
m_dmeConn.saveObject( "treeApplySettings_jdm", clasAS, true);
// 3. Create, store & execute apply Task
applyTask = m_dsApplyFactory.create(
"treeApplyData_jdm", "treeModel_jdm",
"treeApplySettings_jdm", "TREE_APPLY_OUTPUT2_JDM");
executeTask(applyTask, "treeApplyTask_jdm");
// 4. Display apply result -- Note that APPLY results do not need to be
// reverse transformed, as done in the case of model details. This is
// because class values of a classification target were not (required to
// be) binned or normalized.
// List ten customers (ordered by their id) along with likelihood and cost
// to use or reject the affinity card (Note: while this example has a
// binary target, such a query is useful in multi-class classification -
// Low, Med, High for example).
displayTable("TREE_APPLY_OUTPUT2_JDM",
"where ROWNUM < 21",
"order by CUST_ID, PREDICTION");
System.out.println("---------------------------------------------------");
System.out.println("--- Business case 3 ---");
System.out.println("--- Find the customers who work in Tech support ---");
System.out.println("--- and are under 25 who are going to respond ---");
System.out.println("--- to the new affinity card program. ---");
System.out.println("---------------------------------------------------");
// 1. Create & save PhysicalDataSpecification
applyData =
m_pdsFactory.create( "MINING_DATA_APPLY_V", false );
pa = m_paFactory.create("CUST_ID",
AttributeDataType.integerType, PhysicalAttributeRole.caseId );
applyData.addAttribute( pa );
m_dmeConn.saveObject( "treeApplyData_jdm", applyData, true );
// 2. Create & save ClassificationApplySettings
clasAS = m_applySettingsFactory.create();
// Add source attributes
sourceAttrMap = new HashMap();
sourceAttrMap.put( "AGE", "AGE" );
sourceAttrMap.put( "OCCUPATION", "OCCUPATION" );
clasAS.setSourceDestinationMap( sourceAttrMap );
m_dmeConn.saveObject( "treeApplySettings_jdm", clasAS, true);
// 3. Create, store & execute apply Task
applyTask = m_dsApplyFactory.create(
"treeApplyData_jdm", "treeModel_jdm",
"treeApplySettings_jdm", "TREE_APPLY_OUTPUT3_JDM");
executeTask(applyTask, "treeApplyTask_jdm");
// 4. Display apply result -- Note that APPLY results do not need to be
// reverse transformed, as done in the case of model details. This is
// because class values of a classification target were not (required to
// be) binned or normalized.
// Find the customers who work in Tech support and are under 25 who are
// going to respond to the new affinity card program.
displayTable("TREE_APPLY_OUTPUT3_JDM",
"where OCCUPATION = 'TechSup' " +
"and AGE < 25 " +
"and PREDICTION = 1 ",
"order by CUST_ID");
/**
 * This method stores the given task with the specified name in the DMS
 * and submits the task for asynchronous execution in the DMS. If the
 * task completes successfully it returns true; if the task fails, it
 * prints the error description and returns false.
 * @param taskObj task object
 * @param taskName name of the task
 * @return boolean returns true when the task is successful
 * @exception JDMException if task execution failed
 */
public static boolean executeTask(Task taskObj, String taskName)
throws JDMException
boolean isTaskSuccess = false;
m_dmeConn.saveObject(taskName, taskObj, true);
ExecutionHandle execHandle = m_dmeConn.execute(taskName);
System.out.print(taskName + " is started, please wait. ");
//Wait for completion of the task
ExecutionStatus status = execHandle.waitForCompletion(Integer.MAX_VALUE);
//Check the status of the task after completion
isTaskSuccess = status.getState().equals(ExecutionState.success);
if( isTaskSuccess ) //Task completed successfully
System.out.println(taskName + " is successful.");
else //Task failed
System.out.println(taskName + " failed.\nFailure Description: " +
status.getDescription() );
return isTaskSuccess;
private static void displayBuildSettings(
ClassificationSettings clasSettings, String buildSettingsName)
System.out.println("BuildSettings Details from the "
+ buildSettingsName + " table:");
displayTable(buildSettingsName, "", "order by SETTING_NAME");
System.out.println("BuildSettings Details from the "
+ buildSettingsName + " model build settings object:");
String objName = clasSettings.getName();
if(objName != null)
System.out.println("Name = " + objName);
String objDescription = clasSettings.getDescription();
if(objDescription != null)
System.out.println("Description = " + objDescription);
java.util.Date creationDate = clasSettings.getCreationDate();
String creator = clasSettings.getCreatorInfo();
String targetAttrName = clasSettings.getTargetAttributeName();
System.out.println("Target attribute name = " + targetAttrName);
AlgorithmSettings algoSettings = clasSettings.getAlgorithmSettings();
if(algoSettings == null)
System.out.println("Failure: clasSettings.getAlgorithmSettings() returns null");
MiningAlgorithm algo = algoSettings.getMiningAlgorithm();
if(algo == null) System.out.println("Failure: algoSettings.getMiningAlgorithm() returns null");
System.out.println("Algorithm Name: " + algo.name());
MiningFunction function = clasSettings.getMiningFunction();
if(function == null) System.out.println("Failure: clasSettings.getMiningFunction() returns null");
System.out.println("Function Name: " + function.name());
try {
String costMatrixName = clasSettings.getCostMatrixName();
if(costMatrixName != null) {
System.out.println("Cost Matrix Details from the " + costMatrixName
+ " table:");
displayTable(costMatrixName, "", "order by ACTUAL_TARGET_VALUE, PREDICTED_TARGET_VALUE");
} catch(Exception jdmExp) {
System.out.println("Failure: clasSettings.getCostMatrixName() throws exception");
jdmExp.printStackTrace();
}
// List of DT algorithm settings
// treeAlgo.setBuildHomogeneityMetric(TreeHomogeneityMetric.gini);
// treeAlgo.setMaxDepth(7);
// ((OraTreeSettings)treeAlgo).setMinDecreaseInImpurity(0.1, SizeUnit.percentage);
// treeAlgo.setMinNodeSize( 0.05, SizeUnit.percentage );
// treeAlgo.setMinNodeSize( 10, SizeUnit.count );
// ((OraTreeSettings)treeAlgo).setMinDecreaseInImpurity(20, SizeUnit.count);
TreeHomogeneityMetric homogeneityMetric = ((OraTreeSettings)algoSettings).getBuildHomogeneityMetric();
System.out.println("Homogeneity Metric: " + homogeneityMetric.name());
int intValue = ((OraTreeSettings)algoSettings).getMaxDepth();
System.out.println("Max Depth: " + intValue);
double doubleValue = ((OraTreeSettings)algoSettings).getMinNodeSizeForSplit(SizeUnit.percentage);
System.out.println("MinNodeSizeForSplit (percentage): " + m_df.format(doubleValue));
doubleValue = ((OraTreeSettings)algoSettings).getMinNodeSizeForSplit(SizeUnit.count);
System.out.println("MinNodeSizeForSplit (count): " + m_df.format(doubleValue));
doubleValue = ((OraTreeSettings)algoSettings).getMinNodeSize();
SizeUnit unit = ((OraTreeSettings)algoSettings).getMinNodeSizeUnit();
System.out.println("Min Node Size (" + unit.name() +"): " + m_df.format(doubleValue));
doubleValue = ((OraTreeSettings)algoSettings).getMinNodeSize( SizeUnit.count );
System.out.println("Min Node Size (" + SizeUnit.count.name() +"): " + m_df.format(doubleValue));
doubleValue = ((OraTreeSettings)algoSettings).getMinNodeSize( SizeUnit.percentage );
System.out.println("Min Node Size (" + SizeUnit.percentage.name() +"): " + m_df.format(doubleValue));
/**
 * This method displays the DT model signature.
 * @param model model object
 * @exception JDMException if failed to retrieve model signature
 */
public static void displayModelSignature(Model model) throws JDMException
String modelName = model.getName();
System.out.println("Model Name: " + modelName);
ModelSignature modelSignature = model.getSignature();
System.out.println("ModelSignature Details: ( Attribute Name, Attribute Type )");
MessageFormat mfSign = new MessageFormat(" ( {0}, {1} )");
String[] vals = new String[3];
Collection sortedSet = modelSignature.getAttributes();
Iterator attrIterator = sortedSet.iterator();
while(attrIterator.hasNext())
SignatureAttribute attr = (SignatureAttribute)attrIterator.next();
vals[0] = attr.getName();
vals[1] = attr.getDataType().name();
System.out.println( mfSign.format(vals) );
/**
 * This method displays the DT model details.
 * @param treeModelDetails tree model details object
 * @exception JDMException if failed to retrieve model details
 */
public static void displayTreeModelDetailsExtensions(TreeModelDetail treeModelDetails)
throws JDMException
System.out.println( "\nTreeModelDetail: Model name=" + "treeModel_jdm" );
TreeNode root = treeModelDetails.getRootNode();
System.out.println( "\nRoot node: " + root.getIdentifier() );
// get the info for the tree model
int treeDepth = ((OraTreeModelDetail) treeModelDetails).getTreeDepth();
System.out.println( "Tree depth: " + treeDepth );
int totalNodes = ((OraTreeModelDetail) treeModelDetails).getNumberOfNodes();
System.out.println( "Total number of nodes: " + totalNodes );
int totalLeaves = ((OraTreeModelDetail) treeModelDetails).getNumberOfLeafNodes();
System.out.println( "Total number of leaf nodes: " + totalLeaves );
Stack nodeStack = new Stack();
nodeStack.push( root);
while( !nodeStack.empty() )
TreeNode node = (TreeNode) nodeStack.pop();
// display this node
int nodeId = node.getIdentifier();
long caseCount = node.getCaseCount();
Object prediction = node.getPrediction();
int level = node.getLevel();
int children = node.getNumberOfChildren();
TreeNode parent = node.getParent();
System.out.println( "\nNode id=" + nodeId + " at level " + level );
if( parent != null )
System.out.println( "parent: " + parent.getIdentifier() +
", children=" + children );
System.out.println( "Case count: " + caseCount + ", prediction: " + prediction );
Predicate predicate = node.getPredicate();
System.out.println( "Predicate: " + predicate.toString() );
Predicate[] surrogates = node.getSurrogates();
if( surrogates != null )
for( int i=0; i<surrogates.length; i++ )
System.out.println( "Surrogate[" + i + "]: " + surrogates[i] );
// add child nodes in the stack
if( children > 0 )
TreeNode[] childNodes = node.getChildren();
for( int i=0; i<childNodes.length; i++ )
nodeStack.push( childNodes[i] );
TreeNode[] allNodes = treeModelDetails.getNodes();
System.out.print( "\nNode identifiers by getNodes():" );
for( int i=0; i<allNodes.length; i++ )
System.out.print( " " + allNodes[i].getIdentifier() );
System.out.println();
// display the node identifiers
int[] nodeIds = treeModelDetails.getNodeIdentifiers();
System.out.print( "Node identifiers by getNodeIdentifiers():" );
for( int i=0; i<nodeIds.length; i++ )
System.out.print( " " + nodeIds[i] );
System.out.println();
TreeNode node = treeModelDetails.getNode(nodeIds.length-1);
System.out.println( "Node identifier by getNode(" + (nodeIds.length-1) +
"): " + node.getIdentifier() );
Rule rule2 = treeModelDetails.getRule(nodeIds.length-1);
System.out.println( "Rule identifier by getRule(" + (nodeIds.length-1) +
"): " + rule2.getRuleIdentifier() );
// get the rules and display them
Collection ruleColl = treeModelDetails.getRules();
Iterator ruleIterator = ruleColl.iterator();
while( ruleIterator.hasNext() )
Rule rule = (Rule) ruleIterator.next();
int ruleId = rule.getRuleIdentifier();
Predicate antecedent = (Predicate) rule.getAntecedent();
Predicate consequent = (Predicate) rule.getConsequent();
System.out.println( "\nRULE " + ruleId + ": support=" +
rule.getSupport() + " (abs=" + rule.getAbsoluteSupport() +
"), confidence=" + rule.getConfidence() );
System.out.println( antecedent );
System.out.println( "=======>" );
System.out.println( consequent );
/**
 * Display a classification test metrics object
 * @param testMetrics classification test metrics object
 * @exception JDMException if failed to retrieve test metric details
 */
public static void displayTestMetricDetails(
ClassificationTestMetrics testMetrics) throws JDMException
// Retrieve Oracle model test metrics details extensions
// Test Metrics Name
System.out.println("Test Metrics Name = " + testMetrics.getName());
// Model Name
System.out.println("Model Name = " + testMetrics.getModelName());
// Test Data Name
System.out.println("Test Data Name = " + testMetrics.getTestDataName());
// Accuracy
System.out.println("Accuracy = " + m_df.format(testMetrics.getAccuracy().doubleValue()));
// Confusion Matrix
ConfusionMatrix confusionMatrix = testMetrics.getConfusionMatrix();
Collection categories = confusionMatrix.getCategories();
Iterator xIterator = categories.iterator();
System.out.println("Confusion Matrix: Accuracy = " + m_df.format(confusionMatrix.getAccuracy()));
System.out.println("Confusion Matrix: Error = " + m_df.format(confusionMatrix.getError()));
System.out.println("Confusion Matrix: ( Actual, Predicted, Value )");
MessageFormat mf = new MessageFormat(" ( {0}, {1}, {2} )");
String[] vals = new String[3];
while(xIterator.hasNext())
Object actual = xIterator.next();
vals[0] = actual.toString();
Iterator yIterator = categories.iterator();
while(yIterator.hasNext())
Object predicted = yIterator.next();
vals[1] = predicted.toString();
long number = confusionMatrix.getNumberOfPredictions(actual, predicted);
vals[2] = Long.toString(number);
System.out.println(mf.format(vals));
// Lift
Lift lift = testMetrics.getLift();
System.out.println("Lift Details:");
System.out.println("Lift: Target Attribute Name = " + lift.getTargetAttributeName());
System.out.println("Lift: Positive Target Value = " + lift.getPositiveTargetValue());
System.out.println("Lift: Total Cases = " + lift.getTotalCases());
System.out.println("Lift: Total Positive Cases = " + lift.getTotalPositiveCases());
int numberOfQuantiles = lift.getNumberOfQuantiles();
System.out.println("Lift: Number Of Quantiles = " + numberOfQuantiles);
System.out.println("Lift: ( QUANTILE_NUMBER, QUANTILE_TOTAL_COUNT, QUANTILE_TARGET_COUNT, PERCENTAGE_RECORDS_CUMULATIVE,CUMULATIVE_LIFT,CUMULATIVE_TARGET_DENSITY,TARGETS_CUMULATIVE, NON_TARGETS_CUMULATIVE, LIFT_QUANTILE, TARGET_DENSITY )");
MessageFormat mfLift = new MessageFormat(" ( {0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9} )");
String[] liftVals = new String[10];
for(int iQuantile=1; iQuantile<= numberOfQuantiles; iQuantile++)
liftVals[0] = Integer.toString(iQuantile); //QUANTILE_NUMBER
liftVals[1] = Long.toString(lift.getCases((iQuantile-1), iQuantile));//QUANTILE_TOTAL_COUNT
liftVals[2] = Long.toString(lift.getNumberOfPositiveCases((iQuantile-1), iQuantile));//QUANTILE_TARGET_COUNT
liftVals[3] = m_df.format(lift.getCumulativePercentageSize(iQuantile).doubleValue());//PERCENTAGE_RECORDS_CUMULATIVE
liftVals[4] = m_df.format(lift.getCumulativeLift(iQuantile).doubleValue());//CUMULATIVE_LIFT
liftVals[5] = m_df.format(lift.getCumulativeTargetDensity(iQuantile).doubleValue());//CUMULATIVE_TARGET_DENSITY
liftVals[6] = Long.toString(lift.getCumulativePositiveCases(iQuantile));//TARGETS_CUMULATIVE
liftVals[7] = Long.toString(lift.getCumulativeNegativeCases(iQuantile));//NON_TARGETS_CUMULATIVE
liftVals[8] = m_df.format(lift.getLift(iQuantile, iQuantile).doubleValue());//LIFT_QUANTILE
liftVals[9] = m_df.format(lift.getTargetDensity(iQuantile, iQuantile).doubleValue());//TARGET_DENSITY
System.out.println(mfLift.format(liftVals));
// ROC
ReceiverOperatingCharacterics roc = testMetrics.getROC();
System.out.println("ROC Details:");
System.out.println("ROC: Area Under Curve = " + m_df.format(roc.getAreaUnderCurve()));
int nROCThresh = roc.getNumberOfThresholdCandidates();
System.out.println("ROC: Number Of Threshold Candidates = " + nROCThresh);
System.out.println("ROC: ( INDEX, PROBABILITY, TRUE_POSITIVES, FALSE_NEGATIVES, FALSE_POSITIVES, TRUE_NEGATIVES, TRUE_POSITIVE_FRACTION, FALSE_POSITIVE_FRACTION )");
MessageFormat mfROC = new MessageFormat(" ( {0}, {1}, {2}, {3}, {4}, {5}, {6}, {7} )");
String[] rocVals = new String[8];
for(int iROC=1; iROC <= nROCThresh; iROC++)
rocVals[0] = Integer.toString(iROC); //INDEX
rocVals[1] = m_df.format(roc.getProbabilityThreshold(iROC));//PROBABILITY
rocVals[2] = Long.toString(roc.getPositives(iROC, true));//TRUE_POSITIVES
rocVals[3] = Long.toString(roc.getNegatives(iROC, false));//FALSE_NEGATIVES
rocVals[4] = Long.toString(roc.getPositives(iROC, false));//FALSE_POSITIVES
rocVals[5] = Long.toString(roc.getNegatives(iROC, true));//TRUE_NEGATIVES
rocVals[6] = m_df.format(roc.getHitRate(iROC));//TRUE_POSITIVE_FRACTION
rocVals[7] = m_df.format(roc.getFalseAlarmRate(iROC));//FALSE_POSITIVE_FRACTION
System.out.println(mfROC.format(rocVals));
private static void displayTable(String tableName, String whereCause, String orderByColumn)
StringBuffer emptyCol = new StringBuffer(" ");
java.sql.Connection dbConn =
((OraConnection)m_dmeConn).getDatabaseConnection();
PreparedStatement pStmt = null;
ResultSet rs = null;
try
pStmt = dbConn.prepareStatement("SELECT * FROM " + tableName + " " + whereCause + " " + orderByColumn);
rs = pStmt.executeQuery();
ResultSetMetaData rsMeta = rs.getMetaData();
int colCount = rsMeta.getColumnCount();
StringBuffer header = new StringBuffer();
System.out.println("Table : " + tableName);
//Build table header
for(int iCol=1; iCol<=colCount; iCol++)
String colName = rsMeta.getColumnName(iCol);
header.append(emptyCol.replace(0, colName.length(), colName));
emptyCol = new StringBuffer(" ");
System.out.println(header.toString());
//Write table data
while(rs.next())
StringBuffer rowContent = new StringBuffer();
for(int iCol=1; iCol<=colCount; iCol++)
int sqlType = rsMeta.getColumnType(iCol);
Object obj = rs.getObject(iCol);
String colContent = null;
if(obj instanceof java.lang.Number)
try
BigDecimal bd = (BigDecimal)obj;
if(bd.scale() > 5)
colContent = m_df.format(obj);
} else
colContent = bd.toString();
} catch(Exception anyExp) {
colContent = m_df.format(obj);
} else
if(obj == null)
colContent = "NULL";
else
colContent = obj.toString();
rowContent.append(" "+emptyCol.replace(0, colContent.length(), colContent));
emptyCol = new StringBuffer(" ");
System.out.println(rowContent.toString());
} catch(Exception anySqlExp) {
anySqlExp.printStackTrace();
}//Ignore
private static void createTableForTestMetrics(String applyOutputTableName,
String testDataName,
String testMetricsInputTableName)
//0. need to execute the following in the schema
String sqlCreate =
"create table " + testMetricsInputTableName + " as " +
"select a.id as id, prediction, probability, affinity_card " +
"from " + testDataName + " a, " + applyOutputTableName + " b " +
"where a.id = b.id";
java.sql.Connection dbConn = ((OraConnection) m_dmeConn).getDatabaseConnection();
Statement stmt = null;
try
stmt = dbConn.createStatement();
stmt.executeUpdate( sqlCreate );
catch( Exception anySqlExp )
System.out.println( anySqlExp.getMessage() );
anySqlExp.printStackTrace();
finally
try
stmt.close();
catch( SQLException sqlExp ) {}
private static void clean()
java.sql.Connection dbConn =
((OraConnection) m_dmeConn).getDatabaseConnection();
Statement stmt = null;
// Drop apply output table
try
stmt = dbConn.createStatement();
stmt.executeUpdate("DROP TABLE TREE_APPLY_OUTPUT1_JDM");
} catch(Exception anySqlExp) {}//Ignore
finally
try
stmt.close();
catch( SQLException sqlExp ) {}
try
stmt = dbConn.createStatement();
stmt.executeUpdate("DROP TABLE TREE_APPLY_OUTPUT2_JDM");
} catch(Exception anySqlExp) {}//Ignore
finally
try
stmt.close();
catch( SQLException sqlExp ) {}
try
stmt = dbConn.createStatement();
stmt.executeUpdate("DROP TABLE TREE_APPLY_OUTPUT3_JDM");
} catch(Exception anySqlExp) {}//Ignore
finally
try
stmt.close();
catch( SQLException sqlExp ) {}
// Drop apply output table created for test metrics task
try
stmt = dbConn.createStatement();
stmt.executeUpdate("DROP TABLE DT_TEST_APPLY_OUTPUT_COST_JDM");
} catch(Exception anySqlExp) {}//Ignore
finally
try
stmt.close();
catch( SQLException sqlExp ) {}
try
stmt = dbConn.createStatement();
stmt.executeUpdate("DROP TABLE DT_TEST_APPLY_OUTPUT_JDM");
} catch(Exception anySqlExp) {}//Ignore
finally
try
stmt.close();
catch( SQLException sqlExp ) {}
//Drop the model
try {
m_dmeConn.removeObject( "treeModel_jdm", NamedObject.model );
} catch(Exception jdmExp) {}
// drop test metrics result: created by TestMetricsTask
try {
m_dmeConn.removeObject( "dtTestMetricsWithCost_jdm", NamedObject.testMetrics );
} catch(Exception jdmExp) {}
try {
m_dmeConn.removeObject( "dtTestMetrics_jdm", NamedObject.testMetrics );
} catch(Exception jdmExp) {}
Hi
I am not sure whether this will help, but someone else was getting an error with a java.sql.SQLException: Unsupported feature. Here is a link to the fix: http://saloon.javaranch.com/cgi-bin/ubb/ultimatebb.cgi?ubb=get_topic&f=3&t=007947
Best wishes
Michael -
Predictive Algorithm for Churn analysis
Hi,
Can anybody suggest an algorithm I can use for churn analysis?
Thanks,
Atul
Hi Atul,
For churn analysis, or what is usually referred to as a binary classification problem (the customer either stays or leaves, i.e. churns), I would suggest one of the following algorithms:
CNR Decision Tree - which also provides a decision tree to explain which feature split influences the target (churn) the most.
You could also choose one of the R-based Neural Network algorithms; however, the resulting predictive model and results are usually hard to explain.
If need be, you can enhance the number of available algorithms by adding your own R functions - there are a lot of examples in this community.
If you have SAP HANA you could also choose:
Decision Trees: C4.5, CHAID or CART (new in SAP HANA SP08).
Other supervised learning algorithms for binary classification: Naive Bayes or SVM (Support Vector Machine).
There are a lot more but this should get you started.
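To give a feel for the Naive Bayes option mentioned above, here is a toy sketch in plain Java (illustrative only, not SAP HANA PAL or Predictive Analysis code; the class and all names are made up). Each binary feature contributes an independent smoothed log-likelihood to each class, and the higher-scoring class (churn = 1, stay = 0) wins:

```java
// Minimal Bernoulli Naive Bayes for a binary churn target.
// Toy sketch only; production implementations handle mixed feature
// types, different smoothing strategies, and missing values.
public class ChurnNaiveBayes {
    private final double[] logPrior = new double[2];  // log P(class)
    private final double[][] logFeatOn;               // log P(feature=1 | class)
    private final double[][] logFeatOff;              // log P(feature=0 | class)

    // X: rows of 0/1 features; y: 0 = stay, 1 = churn.
    public ChurnNaiveBayes(int[][] X, int[] y) {
        int nFeat = X[0].length;
        logFeatOn = new double[2][nFeat];
        logFeatOff = new double[2][nFeat];
        long[] classCount = new long[2];
        long[][] onCount = new long[2][nFeat];
        for (int i = 0; i < X.length; i++) {
            classCount[y[i]]++;
            for (int j = 0; j < nFeat; j++)
                if (X[i][j] == 1) onCount[y[i]][j]++;
        }
        for (int c = 0; c < 2; c++) {
            logPrior[c] = Math.log((double) classCount[c] / X.length);
            for (int j = 0; j < nFeat; j++) {
                // Laplace smoothing avoids zero probabilities.
                double pOn = (onCount[c][j] + 1.0) / (classCount[c] + 2.0);
                logFeatOn[c][j] = Math.log(pOn);
                logFeatOff[c][j] = Math.log(1.0 - pOn);
            }
        }
    }

    // Returns the most likely class: 1 (churn) or 0 (stay).
    public int predict(int[] x) {
        double best = Double.NEGATIVE_INFINITY;
        int bestClass = 0;
        for (int c = 0; c < 2; c++) {
            double score = logPrior[c];
            for (int j = 0; j < x.length; j++)
                score += (x[j] == 1) ? logFeatOn[c][j] : logFeatOff[c][j];
            if (score > best) { best = score; bestClass = c; }
        }
        return bestClass;
    }
}
```

Unlike a decision tree, this model does not yield human-readable split rules, which is exactly the explainability trade-off noted above.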
Best regards,
Kurt Holst -
Is there anyone who can explain some things about the ROC chart for me?
How is what is shown in the ROC chart related to the confusion matrix displayed next to it in Oracle Data Miner?
How is this ROC chart constructed? How can it represent the decision tree model I made?
I hope somebody can help me.
Hi,
This explanation comes from one of our algorithm engineers:
"The ROC analysis applies to binary classification problems. One of the classes is selected as a "positive" one. The ROC chart plots the true positive rate as a function of the false positive rate. It is parametrized by the probability threshold values. The true positive rate represents the fraction of positive cases that were correctly classified by the model. The false positive rate represents the fraction of negative cases that were incorrectly classified as positive. Each point on the ROC plot represents a true_positive_rate/false_positive_rate pair corresponding to a particular probability threshold. Each point has a corresponding confusion matrix. The user can analyze the confusion matrices produced at different threshold levels and select a probability threshold to be used for scoring. The probability threshold choice is usually based on application requirements (i.e., acceptable level of false positives).
The ROC does not represent a model. Instead it quantifies its discriminatory ability and assists the user in selecting an appropriate operating point for scoring."
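To make the quoted explanation concrete, here is a small illustrative sketch in plain Java (unrelated to the JDM sample code above; all names are hypothetical). Each probability threshold produces one confusion matrix, and that matrix's true/false positive rates give one point on the ROC curve:

```java
// Illustrative only: derive the confusion matrix and the ROC point
// (true positive rate, false positive rate) for one probability threshold.
public class RocPoint {
    // Returns { TP, FN, FP, TN } for the given threshold.
    public static long[] confusionAt(double[] probs, int[] actuals, double threshold) {
        long tp = 0, fn = 0, fp = 0, tn = 0;
        for (int i = 0; i < probs.length; i++) {
            boolean predictedPositive = probs[i] >= threshold;
            if (actuals[i] == 1) {
                if (predictedPositive) tp++; else fn++;
            } else {
                if (predictedPositive) fp++; else tn++;
            }
        }
        return new long[] { tp, fn, fp, tn };
    }

    // Fraction of positive cases correctly classified.
    public static double truePositiveRate(long[] m) {
        return (double) m[0] / (m[0] + m[1]);
    }

    // Fraction of negative cases incorrectly classified as positive.
    public static double falsePositiveRate(long[] m) {
        return (double) m[2] / (m[2] + m[3]);
    }

    public static void main(String[] args) {
        double[] probs = { 0.9, 0.8, 0.6, 0.4, 0.3, 0.1 };
        int[] actuals = { 1, 1, 0, 1, 0, 0 };
        // Sweeping the threshold traces out the ROC curve point by point.
        for (double t : new double[] { 0.2, 0.5, 0.7 }) {
            long[] m = confusionAt(probs, actuals, t);
            System.out.println("t=" + t + " TPR=" + truePositiveRate(m)
                + " FPR=" + falsePositiveRate(m));
        }
    }
}
```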
I would add to this that you can select a threshold point in the build activity to bias the apply process. Currently we generate a cost matrix based on the selected threshold point rather than using the threshold point directly.
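A sketch of that threshold-to-cost-matrix equivalence (an illustrative derivation, not the actual Data Miner implementation): with false-positive cost c_fp and false-negative cost c_fn, minimizing expected cost means predicting positive whenever p > c_fp / (c_fp + c_fn), so a probability threshold and a cost-matrix pair are interchangeable:

```java
// Illustrative: convert between a probability threshold and an
// equivalent (false positive cost, false negative cost) pair.
// Predicting positive costs (1 - p) * fpCost in expectation; predicting
// negative costs p * fnCost. Positive wins iff p > fpCost / (fpCost + fnCost).
public class ThresholdCost {
    // Threshold implied by a given cost matrix.
    public static double impliedThreshold(double fpCost, double fnCost) {
        return fpCost / (fpCost + fnCost);
    }

    // One cost pair (of infinitely many) realizing a given threshold,
    // fixing the false-positive cost at 1. Returns { fpCost, fnCost }.
    public static double[] costsForThreshold(double threshold) {
        return new double[] { 1.0, (1.0 - threshold) / threshold };
    }

    public static void main(String[] args) {
        // Example: FP cost 1 and FN cost 8 correspond to predicting
        // positive whenever p > 1/9, i.e. a threshold of roughly 0.111.
        System.out.println(impliedThreshold(1.0, 8.0));
    }
}
```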
Thanks, Mark -
XML form frustration, please help! Points to all who help me solve this.
Hi all, I am really getting frustrated with Portals at the moment for a variety of reasons, all of which, I hope, are because I have little experience. I will try to explain what I am trying to do, which might help you understand (hopefully you will have come across this before!)
Situation:
- A user wants to publish a News story to a few groups of users, not all.
- We have a good Active Directory structure in place and want to use that as our security groups in the portal.
Solution 1:
Using Taxonomies, form custom properties and form classification
Problem 1: We currently have 110 groups, which contain users organised by roles, branches, departments and floors. Setting this up would take too long, and administration would be a nightmare!
Solution 2:
Using permissions on KM folders; News gets published into different folders
Problem 2:
As above: 110 folders and 110 XML forms. A serious administration headache!
The ideal would be to publish the XML News form into one folder, e.g. /documents/news/, and have on the form the ability to search and select which groups would have read permissions on the XML News item. I have seen a similar form in Collaboration which allows you to select multiple members for a room; however, I can't get access to this.
I am sure this can be done; however, I am getting nowhere on my own, so I am asking you all for a little help. If there is anything else you need to know, I will do my best to explain.
Thanks in advance
Phil
Message was edited by: Phil Wade
Does any body know how I can submit this to SAP for their help?Hi all, I am really getting frustrated with Portals at the moment for a variety of reasons, all of which I hope, are because I have little experience. I will try and explain what I am trying to do and that might help you understand (hopefully you will have come across this before!)
Situation:
- A user wants to publish a News story to a few groups of users not all.
- We have a good Active Directory structure in place and want to user that as our security groups in the portal.
Solution 1:
Using Taxonomies, form custom properties and form classification
Problem 1: We have 110 groups at the moment which contains Users organised by roles, branches, departments and floors. To set this up would take too long and administration would be a nightmare!
Solution 2:
Using Permissions on KM folders News gets published into different folders
Problem 2:
As above 110 folders and 110 XML forms Serious administration headache!
The ideal would be to publish the XML News form into one folder e.g. /documents/news/ and have on the form, the ability to search and select which groups would have read permissions on the XML News Item. I have seen a similar form in Collaboration which allows you to select multiple members to a room, however I cant get access to this.
I am sure this can be done however I am getting nowhere on my own so I am asking you all for a little help. If there is anything else you need to know then I will do my best to explain.
Thanks in advance
Phil
Message was edited by: Phil Wade
Does anybody know how I can submit this to SAP for their help? -
Hi:
I'm having some trouble with classification. I want to review the server logs and the TREX logs, but I need to know which TREX log files I should look at.
Thanks in advance,
Felipe
Hi Felipe,
One more question: do you know what crawler processes are for?
SAP describes it like that:
"The crawler service allows crawlers to collect resources located in internal or external repositories, for example, for indexing purposes. A crawler returns the resources and the hierarchical or net-like structures of the respective repositories.
Services and applications that need repositories to be crawled (for example, the index management service) request a crawler from the crawler service."
Link to the description
And how do they work with TREX?
Another explanation from SAP:
"Crawlers are used in Knowledge Management to collect resources that are stored in internal or external repositories. The resources found and the hierarchical or net-like structures are forwarded to various services and applications for further processing."
Link to it
So the crawler service checks internal and external repositories for new or updated documents and hands them to other services, such as index management, for further processing.
Hope it helps.
Regards,
Norman Schröder -
Experiment at ~24-Hour Run Time - To Be Expected?
I've put together a pretty straightforward experiment using an SVM for a classification problem. All steps have run successfully, but the experiment has now been in the "Evaluating" phase for ~21 hours.
As a cloud service, I assumed there was some scalability behind it, so I shouldn't be seeing this. The model is only working on a moderately sized dataset (~300k rows by ~20 columns).
Is this situation "normal"? To be expected? I've run this same project on a smaller subset on my local PC in R, and the result was that no classification predictions could be made. Is it likely that AzureML is having the same problem, where a prediction can't be made reliably, so it just cycles forever?
Thanks for your input.
Experiment Link
However, I have no option to view the Output Log, either from the Properties pane of the studio or by right-clicking the Evaluate Model module (View Output Log is greyed out and not selectable).
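For context on whether ~21 hours is plausible: kernel SVM training typically scales somewhere between quadratically and cubically in the number of rows, so 300k rows can be genuinely slow even on a scalable service. A rough way to sanity-check before launching a long run is to time a small subsample and extrapolate; the exponent here is an assumption about the solver, not a measured property of AzureML:

```python
def extrapolate_runtime(base_rows, base_seconds, target_rows, exponent=2.0):
    """Estimate runtime at target_rows, assuming time ~ rows ** exponent."""
    return base_seconds * (target_rows / base_rows) ** exponent

# If a 10,000-row subsample trains in 60 s, a roughly quadratic solver on
# 300,000 rows would need about 30**2 * 60 s = 54,000 s:
est = extrapolate_runtime(10_000, 60, 300_000, exponent=2.0)
print(round(est / 3600, 1), "hours")  # 15.0 hours
```

If the extrapolated figure is already in the tens of hours, a linear SVM variant or a subsample is usually the practical choice; an endless "Evaluating" phase could also indicate a genuine convergence problem rather than slowness alone.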