Tuesday, 6 November 2007

Banking on Rules Part II

Creating a RuleFlow


In this example, we are going to create a simple RuleFlow and add it to our Banking example from Part I.

Paul Browne has already posted an excellent blog on this topic, which has a slightly more complicated example.

Caveat


This example is so simple that it would actually be easier to do with an agenda-group, but the purpose of this post is to demonstrate a very simple RuleFlow. I will post a more complicated one later on.

Pre-Requisites


In order to run this example you will need Eclipse with the latest Drools plug-in (Eclipse Workbench). Eclipse can be downloaded from http://www.eclipse.org/ and the Eclipse Workbench can be downloaded from http://labs.jboss.com/drools/downloads.html.

Step 1 - Creating the Rules


For those who have already downloaded the examples in Drools Part I, I am going to use that code as the starting point for this.

Our first task is to create a rule and allocate it to a ruleflow-group so that we can use the RuleFlow to include and exclude rules. To do this, copy rule07.drl to rule08.drl and rename the rules to be prefixed with "Rule 08" instead of "Rule 07". Now, add the ruleflow-group entries for "Rule 08 - Credit" and "Rule 08 - Debit":

rule08.drl

package simple;

import net.tplusplus.drools.bankingondrools1.*;

rule "Rule 08 - Credit"
    ruleflow-group "CreditGroup"
    salience 100
    when
        AccountingPeriod( $start : start, $end : end )
        $cashflow : AllocatedCashflow( $account : account, $date : date <= $end, $amount : amount, type==TypedCashflow.CREDIT )
        not AccountingPeriod( start < $start)
    then
        System.out.println("Credit: "+$date+" :: "+$amount);
        
        $account.setBalance($account.getBalance()+$amount);
        System.out.println("Account: "+$account.getAccountNo()+" - new balance: "+$account.getBalance());
            
        retract($cashflow);
end

rule "Rule 08 - Debit"
    ruleflow-group "DebitGroup"
    salience 100
    when
        AccountingPeriod( $start : start, $end : end )
        $cashflow : AllocatedCashflow( $account : account, $date : date <= $end, $amount : amount, type==TypedCashflow.DEBIT )
        not AccountingPeriod( start < $start)
    then
        System.out.println("Debit: "+$date+" :: "+$amount);
        
        $account.setBalance($account.getBalance()-$amount);
        System.out.println("Account: "+$account.getAccountNo()+" - new balance: "+$account.getBalance());
            
        retract($cashflow);
end

rule "Rule 08 - Retract Accounting Period"
    salience 50
    when
        $accountingPeriod : AccountingPeriod( $start : start, $end : end )
        not AccountingPeriod( start < $start)
    then
        System.out.println("Retracting Accounting Period: "+$start+" - "+$end);
        retract($accountingPeriod);
end

Open the class net.tplusplus.drools.bankingondrools1.RuleRunner and add the following method:

RuleRunner.java

    public static void simple8()
    throws Exception
    {
        Account acc1 = new Account(1);
        Account acc2 = new Account(2);
        
        Object[] facts =
        {
            new AllocatedCashflow(acc1,new SimpleDate("01/01/2007"), TypedCashflow.CREDIT, 300.00),
            new AllocatedCashflow(acc1,new SimpleDate("05/02/2007"), TypedCashflow.CREDIT, 100.00),
            new AllocatedCashflow(acc2,new SimpleDate("11/03/2007"), TypedCashflow.CREDIT, 500.00),
            new AllocatedCashflow(acc1,new SimpleDate("07/02/2007"), TypedCashflow.DEBIT, 800.00),
            new AllocatedCashflow(acc2,new SimpleDate("02/03/2007"), TypedCashflow.DEBIT, 400.00),
            new AllocatedCashflow(acc1,new SimpleDate("01/04/2007"), TypedCashflow.CREDIT, 200.00),
            new AllocatedCashflow(acc1,new SimpleDate("05/04/2007"), TypedCashflow.CREDIT, 300.00),
            new AllocatedCashflow(acc2,new SimpleDate("11/05/2007"), TypedCashflow.CREDIT, 700.00),
            new AllocatedCashflow(acc1,new SimpleDate("07/05/2007"), TypedCashflow.DEBIT, 900.00),
            new AllocatedCashflow(acc2,new SimpleDate("02/05/2007"), TypedCashflow.DEBIT, 100.00),
            
            new AccountingPeriod(new SimpleDate("01/01/2007"),new SimpleDate("31/03/2007")),
            new AccountingPeriod(new SimpleDate("01/04/2007"),new SimpleDate("30/06/2007"))
            
        };
        
        new RuleRunner().runRules(new String[]{"/simple/rule08.drl"},facts);        
    }


This gives us the ability to use a RuleFlow to include rule "Rule 08 - Credit" and not rule "Rule 08 - Debit" by specifying the ruleflow-group "CreditGroup", and vice versa by specifying the ruleflow-group "DebitGroup".

Now, modify the main method in RuleRunner to comment out calls to all the methods except simple8(). This will just tidy up the output, and exclude output from Banking On Drools - Part I:

RuleRunner.java

    public static void main(String[] args)
    throws Exception
    {
        //simple1();
        //simple2();
        //simple3();
        //simple4();
        //simple5();
        //simple6();
        //simple7();
        simple8();
    }


So far, all we have done is allocate two rules to RuleFlow groups. What happens if we run the code as it stands without introducing a RuleFlow?

Run the class RuleRunner, and we get the output:

output:

Loading file: /simple/rule08.drl
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
date=Mon Jan 01 00:00:00 GMT 2007,type=Credit,amount=300.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
date=Mon Feb 05 00:00:00 GMT 2007,type=Credit,amount=100.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=2,balance=0.0],
date=Sun Mar 11 00:00:00 GMT 2007,type=Credit,amount=500.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
date=Wed Feb 07 00:00:00 GMT 2007,type=Debit,amount=800.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=2,balance=0.0],
date=Fri Mar 02 00:00:00 GMT 2007,type=Debit,amount=400.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
date=Sun Apr 01 00:00:00 BST 2007,type=Credit,amount=200.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
date=Thu Apr 05 00:00:00 BST 2007,type=Credit,amount=300.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=2,balance=0.0],
date=Fri May 11 00:00:00 BST 2007,type=Credit,amount=700.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
date=Mon May 07 00:00:00 BST 2007,type=Debit,amount=900.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=2,balance=0.0],
date=Wed May 02 00:00:00 BST 2007,type=Debit,amount=100.0]
Inserting fact: AccountingPeriod[start=Mon Jan 01 00:00:00 GMT 2007,
end=Sat Mar 31 00:00:00 BST 2007]
Inserting fact: AccountingPeriod[start=Sun Apr 01 00:00:00 BST 2007,
end=Sat Jun 30 00:00:00 BST 2007]
Retracting Accounting Period: Mon Jan 01 00:00:00 GMT 2007
- Sat Mar 31 00:00:00 BST 2007
Retracting Accounting Period: Sun Apr 01 00:00:00 BST 2007
- Sat Jun 30 00:00:00 BST 2007

From the output, we can see that neither the Credit Rule nor the Debit Rule ran.

So, if we allocate a Rule to a RuleFlow Group, then we need to find some way of specifying the RuleFlow Group to run if we are to include these rules.

Step 2 - Modifying the RuleRunner


Our next step is to modify the RuleRunner runRules() method to allow us to specify all the RuleFlows for the rule engine, and the RuleFlow process to start.

To do this we need to overload the runRules() method in RuleRunner to accept the RuleFlows and the id of the process to start, as follows:

RuleRunner.java

    public void runRules(String process, String[] ruleflows, String[] rules, Object... facts)
    throws Exception
    {        
        RuleBase ruleBase = RuleBaseFactory.newRuleBase();
        PackageBuilder builder = new PackageBuilder();        
        
        for(String ruleflow : ruleflows)
        {
            builder.addRuleFlow( new InputStreamReader(
                    RuleRunner.class.getResourceAsStream( ruleflow ) ) );
        }
        
        for(String ruleFile : rules)
        {
            System.out.println("Loading file: "+ruleFile);
            builder.addPackageFromDrl(new InputStreamReader(this.getClass().getResourceAsStream(ruleFile)));
        }
        
        Package pkg = builder.getPackage();        
        ruleBase.addPackage(pkg);
        
        WorkingMemory workingMemory = ruleBase.newStatefulSession();    

        for(Object fact : facts)
        {
            System.out.println("Inserting fact: "+fact);
            workingMemory.insert(fact);
        }

        workingMemory.startProcess(process);
        workingMemory.fireAllRules();
    }

Step 3 - Creating the RuleFlow


By now we have allocated two of our rules to RuleFlow Groups and have provided a runRules() method that allows us to specify the RuleFlows and the process to start. Now, we need to create the RuleFlow.

To do this we are going to use the Eclipse Workbench (for Drools).

Open the Drools perspective and first let's create a source folder for our ruleflows:

Create a new source folder called ruleflows:



In Package Explorer, right-click on the new ruleflows folder and select New -> Other. From the Wizard select RuleFlow File.



Click on Next, and enter the filename creditFlows



Click on Finish and Eclipse will open the creditFlows.rf document in the graphical editor as follows:



Our next step is to create a simple ruleFlow for processing credits only.

The editor pane already contains a Start element. Next, we need to place a RuleFlowGroup on the editor, by clicking on RuleFlowGroup element in the left pane and then clicking on the edit pane below the Start element.

Repeat the process with the End element and place them in the editor as shown below:



Now we need to create the connections between the elements. Click on the ConnectionCreation element in the left pane and click on the Start element in the editor pane. Move the cursor to the RuleSet element and click again. A connection will be created between the two elements.

Repeat the process to connect the RuleSet element to the End element. Your edit pane should now look like:



OK. So far we have a rule flow diagram with a Start, RuleFlowGroup, and End element connected. This is all pretty much meaningless, however, until we set the properties in the ruleflow so that it is associated with a ruleflow-group.

We want this ruleflow to execute the Credit rule that we created earlier in rule08.drl

rule08.drl

rule "Rule 08 - Credit"
    ruleflow-group "CreditGroup"
    salience 100
    when
        AccountingPeriod( $start : start, $end : end )
        $cashflow : AllocatedCashflow( $account : account, $date : date <= $end, $amount : amount, type==TypedCashflow.CREDIT )
        not AccountingPeriod( start < $start)
    then
        System.out.println("Credit: "+$date+" :: "+$amount);
        
        $account.setBalance($account.getBalance()+$amount);
        System.out.println("Account: "+$account.getAccountNo()+" - new balance: "+$account.getBalance());
            
        retract($cashflow);
end

This means we need to set up our ruleflow to call the ruleflow-group "CreditGroup".

So, here we go:

Click on the RuleSet element in the editor pane and then click on the properties tab.

Set the Name to CreditFlow, and set the RuleFlowGroup to CreditGroup (same as the ruleflow-group in the rule above).

The screenshot is shown below:



The name simply gives our element a display name that should be meaningful to us as we review the ruleflow diagram. The important property is the RuleFlowGroup which should be the same as the RuleFlowGroup of the rules that we want to include in this RuleFlow.

Finally, we need to set the properties for the creditFlows ruleflow itself. Click anywhere in the design screen to display the properties for the ruleflow.

Set the properties as follows:

Connection Layout = Shortest Path
Id = credits
Name = creditFlows
Package = simple
Version = 1




The Id is the id by which we will identify this ruleflow when we want to invoke it. To start this process we will call workingMemory.startProcess("credits"). The name is simply the name by which we want to identify this ruleflow. The package is the same package as that to which our rules belong (see rule08.drl), and the version is our latest version.

Specifying the RuleFlow to Run


Now let’s modify our simple8() method in RuleRunner to call our ruleflow.

    public static void simple8()
    throws Exception
    {
        Account acc1 = new Account(1);
        Account acc2 = new Account(2);
        
        Object[] facts =
        {
            new AllocatedCashflow(acc1,new SimpleDate("01/01/2007"), TypedCashflow.CREDIT, 300.00),
            new AllocatedCashflow(acc1,new SimpleDate("05/02/2007"), TypedCashflow.CREDIT, 100.00),
            new AllocatedCashflow(acc2,new SimpleDate("11/03/2007"), TypedCashflow.CREDIT, 500.00),
            new AllocatedCashflow(acc1,new SimpleDate("07/02/2007"), TypedCashflow.DEBIT, 800.00),
            new AllocatedCashflow(acc2,new SimpleDate("02/03/2007"), TypedCashflow.DEBIT, 400.00),
            new AllocatedCashflow(acc1,new SimpleDate("01/04/2007"), TypedCashflow.CREDIT, 200.00),
            new AllocatedCashflow(acc1,new SimpleDate("05/04/2007"), TypedCashflow.CREDIT, 300.00),
            new AllocatedCashflow(acc2,new SimpleDate("11/05/2007"), TypedCashflow.CREDIT, 700.00),
            new AllocatedCashflow(acc1,new SimpleDate("07/05/2007"), TypedCashflow.DEBIT, 900.00),
            new AllocatedCashflow(acc2,new SimpleDate("02/05/2007"), TypedCashflow.DEBIT, 100.00),
            
            new AccountingPeriod(new SimpleDate("01/01/2007"),new SimpleDate("31/03/2007")),
            new AccountingPeriod(new SimpleDate("01/04/2007"),new SimpleDate("30/06/2007"))
            
        };
        
        String[] ruleFiles = { "/simple/rule08.drl" };
        String[] ruleFlows = { "/creditFlows.rfm" };
        
        new RuleRunner().runRules("credits",ruleFlows,ruleFiles,facts);        
    }


Finally, let's compile the project and run it.

output:

Loading file: /simple/rule08.drl
...
Credit: Sun Mar 11 00:00:00 GMT 2007 :: 500.0
Account: 2 - new balance: 500.0
Credit: Mon Feb 05 00:00:00 GMT 2007 :: 100.0
Account: 1 - new balance: 100.0
Credit: Mon Jan 01 00:00:00 GMT 2007 :: 300.0
Account: 1 - new balance: 400.0
Retracting Accounting Period: Mon Jan 01 00:00:00 GMT 2007
- Sat Mar 31 00:00:00 BST 2007
Retracting Accounting Period: Sun Apr 01 00:00:00 BST 2007
- Sat Jun 30 00:00:00 BST 2007


Notice that the Credits are now being applied, but no Debits!
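
The balances printed above can be cross-checked by hand. The following is only a minimal sketch in plain Java, not part of the RuleRunner project: the class CreditCheck and its Credit type are hypothetical stand-ins for the AllocatedCashflow facts, and java.time.LocalDate replaces SimpleDate. It sums the credits from simple8() that are dated on or before the end of the first accounting period (31/03/2007), which are the three credits shown in the output.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;

final class CreditCheck {

    /** Hypothetical stand-in for a CREDIT AllocatedCashflow fact. */
    static final class Credit {
        final int accountNo;
        final LocalDate date;
        final double amount;

        Credit(int accountNo, LocalDate date, double amount) {
            this.accountNo = accountNo;
            this.date = date;
            this.amount = amount;
        }
    }

    /** The six CREDIT cashflows inserted by simple8(), in insertion order. */
    static Credit[] creditsFromSimple8() {
        return new Credit[] {
            new Credit(1, LocalDate.of(2007, 1, 1), 300.00),
            new Credit(1, LocalDate.of(2007, 2, 5), 100.00),
            new Credit(2, LocalDate.of(2007, 3, 11), 500.00),
            new Credit(1, LocalDate.of(2007, 4, 1), 200.00),
            new Credit(1, LocalDate.of(2007, 4, 5), 300.00),
            new Credit(2, LocalDate.of(2007, 5, 11), 700.00)
        };
    }

    /** Sums, per account, the credits dated on or before periodEnd. */
    static Map<Integer, Double> creditBalances(Credit[] credits, LocalDate periodEnd) {
        Map<Integer, Double> balances = new LinkedHashMap<>();
        for (Credit credit : credits) {
            if (!credit.date.isAfter(periodEnd)) {
                balances.merge(credit.accountNo, credit.amount, Double::sum);
            }
        }
        return balances;
    }

    public static void main(String[] args) {
        Map<Integer, Double> balances =
                creditBalances(creditsFromSimple8(), LocalDate.of(2007, 3, 31));
        // Prints:
        // Account: 1 - balance: 400.0
        // Account: 2 - balance: 500.0
        balances.forEach((accountNo, balance) ->
                System.out.println("Account: " + accountNo + " - balance: " + balance));
    }
}
```

The totals of 400.0 for account 1 and 500.0 for account 2 agree with the final "new balance" lines in the rule output.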


Saturday, 3 November 2007

Benchmarking Drools

In my earlier post I said that I hadn't had a chance to run a speed-test on Drools to compare using Drools to sort values with using Java. Last week I did just that and have learned a lot about tuning Drools along the way.

I have included all the source code in an appendix at the end of this post, along with a link to all the zipped source.

Objective


The objective is to determine how to best use the Drools rule engine depending on the number of Facts, type of rules, and requirements of the system including performance, scalability and maintainability.

The Approach


A simple problem will be passed to the Drools rule engine for solution. The different approaches used in the solution will include the use of stateful and stateless sessions, using the rule engine to sort and order facts, passing the rule engine pre-ordered and sorted facts, and modifying the rules themselves to measure changes in performance.

The Problem


The problem is basically to aggregate credit and debit cashflows into accounting periods and determine the account balance at the end of each period after all the cashflows have been applied.

Caveats


This problem is quite a simple one and using the Drools rule engine for such a task is very much taking a sledgehammer to crack a nut - unless we were expecting to introduce more rules and facts into the equation later.

Expectations


  • I would expect a stateful session to be faster if the RuleBase is cached

  • I would expect a stateless session (which can be applied to this problem) to be faster than a stateful session

  • I would expect better performance if the cashflows are aggregated into accounting periods and passed to the rule engine for each accounting period in turn.

The Tests


The tests are broken down into nine test sets (numbered 1 to 9 below). Each test set provides a different solution to the problem and all test sets provide exactly the same output.

Each test set is run several times with a different number of facts each time. For each set of facts supplied, the test set is run through several iterations and the fastest, slowest, and average time to complete processing is measured in milliseconds.
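
As a rough illustration of this kind of measurement, here is a minimal sketch of a timing helper. It is not the TimedResults class from the benchmark source: the class name TimingStats and all of its methods are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Random;

final class TimingStats {
    private long fastest = Long.MAX_VALUE;
    private long slowest = Long.MIN_VALUE;
    private long total;
    private int count;

    /** Records the elapsed time of one iteration, in milliseconds. */
    void record(long millis) {
        fastest = Math.min(fastest, millis);
        slowest = Math.max(slowest, millis);
        total += millis;
        count++;
    }

    long fastest() { requireSamples(); return fastest; }

    long slowest() { requireSamples(); return slowest; }

    double average() { requireSamples(); return (double) total / count; }

    private void requireSamples() {
        if (count == 0) {
            throw new IllegalStateException("No iterations recorded");
        }
    }

    /** Runs the task the given number of times and records each run. */
    static TimingStats time(Runnable task, int iterations) {
        TimingStats stats = new TimingStats();
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            stats.record((System.nanoTime() - start) / 1_000_000);
        }
        return stats;
    }

    public static void main(String[] args) {
        // Placeholder workload: sort 100,000 random numbers, 50 times over.
        Random random = new Random(42);
        TimingStats stats = time(() -> {
            double[] values = random.doubles(100_000).toArray();
            Arrays.sort(values);
        }, 50);
        System.out.println("fastest=" + stats.fastest() + "ms slowest="
                + stats.slowest() + "ms average=" + stats.average() + "ms");
    }
}
```

In a real benchmark the Runnable would wrap one complete run of a test set, so that each test set yields one TimingStats per fact count.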

The tests themselves evolved with findings and feedback and so the final set of nine tests is listed below:

  1. Stateful Session

    A new RuleBase is created for each iteration and then all the facts are inserted together into a stateful session.

    The rule engine orders the accounting periods and aggregates the facts into each accounting period in turn before calculating the end of period balance by applying the credits and debits to the relevant bank account.

  2. Stateful Session with cached RuleBase

    A new RuleBase is created for each different set of Facts and is then cached and reacquired for each iteration. All the facts are inserted together into a stateful session.

    The rule engine orders the accounting periods and aggregates the facts into each accounting period in turn before calculating the end of period balance by applying the credits and debits to the relevant bank account.

  3. Stateful, Cached, Condition elements grouped within rules

    A new RuleBase is created for each different set of Facts and is then cached and reacquired for each iteration. All the facts are inserted together into a stateful session.

    The rule engine orders the accounting periods and aggregates the facts into each accounting period in turn before calculating the end of period balance by applying the credits and debits to the relevant bank account.

    Within the rules themselves, the Condition elements are grouped together as per the example below:

    This condition set shows two AccountingPeriod conditions separated with a Cashflow condition:

    AccountingPeriod( $start : start, $end : end )
    $cashflow : Cashflow( $account : account, $date : date <= $end
    && date >= $start, $amount : amount, type==Cashflow.DEBIT )
    not AccountingPeriod( start < $start)

    In the condition set below, we have grouped the AccountingPeriod conditions:

    AccountingPeriod( $start : start, $end : end )
    not AccountingPeriod( start < $start)
    Cashflow( $account : account, $date : date <= $end
    && date >= $start, $amount : amount, type==Cashflow.CREDIT )


  4. Stateful, Cached, Group, long for sorting

    A new RuleBase is created for each different set of Facts and is then cached and reacquired for each iteration. All the facts are inserted together into a stateful session.

    The rule engine orders the accounting periods and aggregates the facts into each accounting period in turn before calculating the end of period balance by applying the credits and debits to the relevant bank account.

    Within the rules themselves, the Condition elements are grouped together.

    When ordering the AccountingPeriods and aggregating the Cashflows, this is done using a long primitive representation of the period start and end dates and the cashflow date instead of a Date object.

  5. Stateful, Cached, Group, Long for sorting

    A new RuleBase is created for each different set of Facts and is then cached and reacquired for each iteration. All the facts are inserted together into a stateful session.

    The rule engine orders the accounting periods and aggregates the facts into each accounting period in turn before calculating the end of period balance by applying the credits and debits to the relevant bank account.

    Within the rules themselves, the Condition elements are grouped together.

    When ordering the AccountingPeriods and aggregating the Cashflows, this is done using a Long object representation of the period start and end dates and the cashflow date instead of a Date object.

  6. Stateful, Cached, Grouped, Facts inserted by Accounting Period

    A new RuleBase is created for each different set of Facts and is then cached and reacquired for each iteration.

    The accounting periods are ordered, and the cashflows are aggregated for each accounting period in turn.

    Each accounting period is then processed in turn and all the facts for each accounting period are inserted into a stateful session and the results for that period are obtained.

    Within the rules themselves, the Condition elements are grouped together.

  7. Stateless, Cached, Grouped, Facts inserted by Accounting Period

    A new RuleBase is created for each different set of Facts and is then cached and reacquired for each iteration.

    The accounting periods are ordered, and the cashflows are aggregated for each accounting period in turn.

    Each accounting period is then processed in turn and all the facts for each accounting period are inserted into a stateless session and the results for that period are obtained.

    Within the rules themselves, the Condition elements are grouped together.

  8. Stateless, Cached, Grouped, Facts inserted by Accounting Period, using the accumulate method

    A new RuleBase is created for each different set of Facts and is then cached and reacquired for each iteration.

    The accounting periods are ordered, and the cashflows are aggregated for each accounting period in turn.

    Each accounting period is then processed in turn and all the facts for each accounting period are inserted into a stateless session and the results for that period are obtained.

    Within the rules themselves, the Condition elements are grouped together, and the accumulate method is used to total the credits and debits respectively.

  9. Plain Java method

    All the sorting, aggregating and balance calculations are done in Java.
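
To give a feel for what such a plain Java solution involves, here is a minimal sketch under simplifying assumptions: a single account, java.time.LocalDate instead of SimpleDate, and made-up sample data. The class PlainJavaBalances and its nested types are hypothetical and are not the code used in Test 9.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PlainJavaBalances {

    enum Type { CREDIT, DEBIT }

    static final class Cashflow {
        final LocalDate date;
        final Type type;
        final double amount;

        Cashflow(LocalDate date, Type type, double amount) {
            this.date = date;
            this.type = type;
            this.amount = amount;
        }
    }

    static final class Period {
        final LocalDate start;
        final LocalDate end;

        Period(LocalDate start, LocalDate end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns the closing balance of each period, earliest period first. */
    static double[] closingBalances(List<Period> periods, List<Cashflow> cashflows) {
        // Sort a copy of the periods by start date.
        List<Period> ordered = new ArrayList<>(periods);
        ordered.sort(Comparator.comparing((Period p) -> p.start));

        double[] balances = new double[ordered.size()];
        double running = 0.0;
        for (int i = 0; i < ordered.size(); i++) {
            Period period = ordered.get(i);
            // Apply every cashflow that falls inside this period.
            for (Cashflow c : cashflows) {
                boolean inPeriod = !c.date.isBefore(period.start)
                        && !c.date.isAfter(period.end);
                if (inPeriod) {
                    running += (c.type == Type.CREDIT) ? c.amount : -c.amount;
                }
            }
            balances[i] = running;
        }
        return balances;
    }

    /** Made-up sample data; the periods are deliberately passed out of order. */
    static double[] demo() {
        List<Period> periods = Arrays.asList(
                new Period(LocalDate.of(2007, 4, 1), LocalDate.of(2007, 6, 30)),
                new Period(LocalDate.of(2007, 1, 1), LocalDate.of(2007, 3, 31)));
        List<Cashflow> cashflows = Arrays.asList(
                new Cashflow(LocalDate.of(2007, 1, 10), Type.CREDIT, 300.0),
                new Cashflow(LocalDate.of(2007, 2, 7), Type.DEBIT, 100.0),
                new Cashflow(LocalDate.of(2007, 4, 5), Type.CREDIT, 50.0));
        return closingBalances(periods, cashflows);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [200.0, 250.0]
    }
}
```

The nested loop keeps the sketch short; with many facts it would be cheaper to sort the cashflows by date once and walk both lists together.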

Test Results


I have quoted below the results for 588 facts being inserted into the rule engine, with the rules being repeated through 50 iterations to get the average time in milliseconds to process the cashflows and obtain the account balance for each accounting period.

1 - 79 ms - Stateful
2 - 27 ms - Stateful, Cached
3 - 9 ms - Stateful, Cached, group conditions
4 - 21 ms - Stateful, Cached, group conditions, long primitive
5 - 21 ms - Stateful, Cached, group conditions, Long
6 - 6 ms - Stateful, Cached, group conditions, aggregate cashflows
7 - 6 ms - Stateless
8 - 4 ms - Stateless, Accumulate
9 - 1 ms - Java

All the results for each set of tests, grouped by number of Facts and number of iterations, are detailed in Appendix A: Results.

Discussion of the Results


  • Caching v Non-Caching RuleBase

    Test1 and Test2 both use the rule file test01.drl. All the facts are asserted in one go and sorted within the rules. The Facts are then retracted once they have been used. This makes writing the rules easier, as you don’t have to worry about Facts hanging around that could influence other rules or even re-fire completed rules.

    Caching the RuleBase does show a marked improvement, but that improvement becomes less marked as the number of Facts increases.

  • Grouping Condition Elements

    Test3 groups the conditions in the rule files. For this it uses test02.drl.

    In test01.drl we have:

    AccountingPeriod( $start : start, $end : end )
    $cashflow : Cashflow( $account : account, $date : date <= $end
    && date >= $start, $amount : amount, type==Cashflow.DEBIT )
    not AccountingPeriod( start < $start)

    in test02.drl we have:

    AccountingPeriod( $start : start, $end : end )
    not AccountingPeriod( start < $start)
    Cashflow( $account : account, $date : date <= $end
    && date >= $start, $amount : amount, type==Cashflow.CREDIT )

    Simply changing the order from AccountingPeriod, Cashflow, not AccountingPeriod to AccountingPeriod, not AccountingPeriod, Cashflow has reduced the average time from 27ms to 9ms!!
    Note that this performance improvement continues throughout all the test sets, regardless of the number of Facts.

  • Using long and Long for Sorting

    Test4 and Test5 use test03.drl and test04.drl respectively, and so use a long primitive and a Long object instead of a Date for sorting. This does not show an improvement over sorting with Dates.

    Note, however, that the compareTo() and equals() methods in the Cashflow object use the Date object for comparison, so these results may have been rendered invalid, depending on any use of the Collections API within the rule engine. I may change those methods later and try again.

  • Aggregating Cashflows before Inserting into Drools

    Test6 uses test05.drl. It collates all the Cashflows for each period and then inserts them into the rule engine for each accounting period in turn. This again shows a marked improvement, from 21ms to 6ms, and this scale of improvement is consistent across all the test sets.

  • Using a Stateless Session

    Test7 uses test05.drl with a stateless session. Not surprisingly, this test has consistently similar results to Test6 across all the test sets.

  • Using the accumulate() method

    Test8 uses test06.drl, which again uses a stateless session but uses the accumulate() method within the rules and does not retract the cashflows. This shows an improvement over Test7 which becomes more marked as the number of Facts increases.

  • Plain Java method

    Test9 is a simple Java method that processes all the cashflows. Not surprisingly, this is the fastest approach. As the number of rules increases, however, we may see a difference. Certainly, the complexity of the Java code would grow much faster as rules were added, and Drools does offer other facilities over the Java solution, including the BPM.

Summary


It is important to decide when to use Stateful sessions and when to use Stateless sessions. For this problem, Stateless was sufficient. If facts are modified during the rule process or new rules are generated or inserted, then a Stateful session is required.

Collating and sorting Facts before inserting them into the rule engine can show a marked improvement. This will have to be thought through beforehand as there are times when this is not practical.

Grouping condition elements within the rules themselves can markedly improve performance, and using new features such as the accumulate method shows additional marked improvements.

Appendix A: Results


Average time in ms to process 16 Facts,
JVM arguments: -Xms512M -Xmx512m



Average time in ms to process 588 Facts,
JVM arguments: -Xms512M -Xmx512m



Average time in ms to process 4020 Facts,
JVM arguments: -Xms512M -Xmx512m



Average time in ms to process 96112 Facts,
JVM arguments: -Xms512M -Xmx512m



Average time in ms to process 1921936 Facts,
JVM arguments: -Xms1512M -Xmx1512m



The Source


The source code for this benchmark test is split into 3 projects:

  • The Lib Project

    This project contains common classes used by all projects. SimpleDate extends the Date class to allow instantiation from Strings, and TimedResults is used for timing the rule engine, providing methods to retrieve fastest, slowest and average times.

  • The RuleRunner Project

    This project contains the classes that interface with the Drools engine. You will need to download the latest version of Drools bin and include the jars in this project.

    This project contains 4 classes so far. The Fact interface must be implemented by any facts that are to be inserted into the rule engine, and Globals must be inserted as global objects. To use a stateful session within Drools you instantiate a StatefulRunner, and to use a stateless session you instantiate a StatelessRunner.

  • The SpeedTest Project
    This project contains all the code and rules for the benchmark tests. The main class is net.tplusplus.test.drools.speedtest.SpeedTest1.
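
For readers without the download to hand, the idea behind SimpleDate can be sketched as below. This is a guess at the approach, not the class from the Lib project: the name SimpleDateSketch is hypothetical, and the dd/MM/yyyy pattern is an assumption based on date strings such as "31/03/2007" used elsewhere on this blog.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

class SimpleDateSketch extends Date {
    private static final long serialVersionUID = 1L;

    // Assumed pattern: day/month/year, as in "31/03/2007".
    private static final String PATTERN = "dd/MM/yyyy";

    SimpleDateSketch(String text) {
        super(parseMillis(text));
    }

    private static long parseMillis(String text) {
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setLenient(false); // reject impossible dates such as 32/01/2007
        try {
            return format.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                    "Expected a date in the form " + PATTERN + " but got: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date date = new SimpleDateSketch("31/03/2007");
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(date);
        // Calendar months are zero-based, hence the + 1.
        System.out.println(calendar.get(Calendar.DAY_OF_MONTH) + "/"
                + (calendar.get(Calendar.MONTH) + 1) + "/"
                + calendar.get(Calendar.YEAR)); // prints 31/3/2007
    }
}
```

Because the sketch extends java.util.Date, an instance can be passed anywhere a Date is expected, which matches how SimpleDate is described above.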

For more discussion of the source, refer to the next post - Banking on Drools Part II.

Download the Source


The source code for these tests can be downloaded here


Saturday, 13 October 2007

Banking on Drools - Part I

For the last few months I've been working at the European Bank for Reconstruction and Development on a project using DROOLS. We started with DROOLS 3 and have more recently moved to DROOLS 4.

I am not allowed to use actual examples from work so this blog will instead document by example the process of developing a complete personal banking application that will handle credits, debits, currencies and that will use a set of design patterns that I have created for the process.

In order to make the examples documented here clear and modular, I will try and steer away from re-visiting existing code to add new functionality, and will instead extend and inject where appropriate.

From my own personal experience, DROOLS is an excellent choice for banking and financial rules and through use of DROOLS I have adopted my own standards.

- Ideally the Facts that are asserted (I know that in 4 they are inserted, but I prefer the term asserted) into the DROOLS engine should be decoupled from the banking data and I prefer to inject helper objects into the facts being asserted rather than use helper classes or globals for calculations or storage of intermediate values - I refer to this as the Calculator Pattern later in the blog.

- I use helper classes for logging rule activity; those helper classes can print to a file or use log4j as you wish.

- For the creation of additional Facts at rule-time, I use a RuleFactory object that is supplied as a global - I think in more complicated rulesets, it would be better injecting it into a Fact or Facts somewhere, but so far I have not needed to go that far.

The examples here are simplistic and are presented as demonstration and discussion documents so any feedback is welcome.

Finally, this is my first attempt at blogging so bear with me.

Step 1 - A Simple Example


I am using the latest version of JBOSS Rules, with Eclipse Europa and the JBOSS Rules plug-in for Eclipse.  I haven't got the links to hand at the moment, so I will add them later.

My first task is to write a simple class to interface with the Rule Engine, and a simple rule just to see it all hanging together.  Here we go:

RuleRunner.java

package com.javarepository.rules;
import java.io.InputStreamReader;
import org.drools.RuleBase;
import org.drools.RuleBaseFactory;
import org.drools.WorkingMemory;
import org.drools.compiler.PackageBuilder;
import org.drools.rule.Package;

public class RuleRunner
{
    public RuleRunner(){}
  
    public void runRules(String[] rules, Object... facts)
    throws Exception
    {      
        RuleBase ruleBase = RuleBaseFactory.newRuleBase();
        PackageBuilder builder = new PackageBuilder();      
      
        for(String ruleFile : rules)
        {
            System.out.println("Loading file: "+ruleFile);
            builder.addPackageFromDrl(
                    new InputStreamReader(
                        this.getClass().getResourceAsStream(ruleFile)));
        }
      
        Package pkg = builder.getPackage();      
        ruleBase.addPackage(pkg);
      
        WorkingMemory workingMemory
            = ruleBase.newStatefulSession();  

        for(Object fact : facts)
        {
            System.out.println("Inserting fact: "+fact);
            workingMemory.insert(fact);
        }

        workingMemory.fireAllRules();
    }

    public static void simple1()
    throws Exception
    {
        new RuleRunner().runRules(
                new String[]{"/simple/rule01.drl"});      
    }      

    public static void main(String[] args)
    throws Exception
    {
        simple1();
    }
}

And a simple rule to run:

rule01.drl

package simple
rule "Rule 01"
    when
        eval (1==1)
    then
        System.out.println("Rule 01 Works");      
end

I'm not going to go into too much detail about what is happening here.  Suffice it to say that eval (1==1) will always return true and so we should get an output of:

output:

Loading file: /simple/rule01.drl
Rule 01 Works


Step 2 - Introducing Facts


My next step is to assert some simple facts and print them out.

Let's add the method simple2() to the RuleRunner class (not forgetting to call it in the main method).

simple2() method from RuleRunner.java

public static void simple2()
throws Exception
{
    Number n1=3, n2=1, n3=4, n4=1, n5=5;
    new RuleRunner().runRules(
            new String[]{"/simple/rule02.drl"},
            n1,n2,n3,n4,n5);      
}

This doesn't use any specific facts but instead asserts a set of java.lang.Number objects.

Now we will create a simple rule to print out these facts.

rule02.drl

package simple
rule "Rule 02 - Number Printer"
    when
        Number( $intValue : intValue )
    then
        System.out.println("Number found with value: "+$intValue);      
end

Once again, this rule does nothing special.  It identifies any facts that are Numbers and prints out the values.  So what would we expect and what do we get?

From the inputs, we might expect:

output:

Loading file: /simple/rule02.drl
Inserting fact: 3
Inserting fact: 1
Inserting fact: 4
Inserting fact: 1
Inserting fact: 5
Number found with value: 5
Number found with value: 1
Number found with value: 4
Number found with value: 1
Number found with value: 3

but what we actually get is:

output:

Loading file: /simple/rule02.drl
Inserting fact: 3
Inserting fact: 1
Inserting fact: 4
Inserting fact: 1
Inserting fact: 5
Number found with value: 5
Number found with value: 4
Number found with value: 1
Number found with value: 3

My first instinct was to think that this is a feature of Drools.  It is instead a feature of autoboxing.  Reading the DROOLS documentation (and an email from Mark Proctor) reveals that by default the DROOLS WorkingMemory uses an IdentityHashMap to store all the asserted Objects.  Autoboxing goes through Integer.valueOf(), which caches small values (at least -128 to 127), so two Integers autoboxed from the constant 1 are actually the same physical object and will not be duplicated in the IdentityHashMap keyset.  The following simple test shows the same effect: n2==n4 is true and the map holds only four keys.

TestIdentityHashMap.java

package com.javarepository.test;
import java.util.IdentityHashMap;

public class TestIdentityHashMap
{
    public static void main(String[] args)
    {
        IdentityHashMap<Object,Object> map = new IdentityHashMap<Object,Object>();
        Number n1=3, n2=1, n3=4, n4=1, n5=5;
        map.put(n1,n1);
        map.put(n2,n2);
        map.put(n3,n3);
        map.put(n4,n4);
        map.put(n5,n5);

        System.out.println(n2==n4);

        for(Object key : map.keySet())
        {
            System.out.println(key);
        }
    }
}

output:

true
4
1
3
5

Further documentation on how the jvm stores primitives and objects can be found at

http://java.sun.com/docs/books/jvms/second_edition/html/Overview.doc.html
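
To see the effect in isolation, here is a minimal sketch that is separate from the Drools examples. It counts how many distinct keys an IdentityHashMap holds for the five autoboxed values, and then repeats the count with a wrapper class. NumberFact is a hypothetical class that I have made up for illustration; it is not part of the original code. Because each NumberFact is created with new, both facts holding the value 1 are kept.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityDemo
{
    // Hypothetical wrapper, not part of the original example:
    // every instance is a distinct object, even for equal values.
    static class NumberFact
    {
        final int value;

        NumberFact(int value)
        {
            this.value = value;
        }
    }

    // Counts the distinct object identities among autoboxed Integers.
    static int countBoxed(int... values)
    {
        Map<Object,Object> map = new IdentityHashMap<Object,Object>();
        for(int value : values)
        {
            Integer boxed = value;  // small values share one cached Integer
            map.put(boxed, boxed);
        }
        return map.size();
    }

    // Counts the distinct object identities among wrapped values.
    static int countWrapped(int... values)
    {
        Map<Object,Object> map = new IdentityHashMap<Object,Object>();
        for(int value : values)
        {
            NumberFact fact = new NumberFact(value);
            map.put(fact, fact);
        }
        return map.size();
    }

    public static void main(String[] args)
    {
        System.out.println("Boxed:   "+countBoxed(3,1,4,1,5));
        System.out.println("Wrapped: "+countWrapped(3,1,4,1,5));
    }
}
```

The boxed count is 4 and the wrapped count is 5, so if duplicate values matter in your rules, wrapping them in your own fact class is one way to keep them apart.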

Step 3 - Sorting Numbers


There are probably a hundred and one better ways to sort numbers, but we will need to apply some cashflows in date order when we start looking at banking rules, so let's look at a simple rule-based example.

simple3() method from RuleRunner.java

public static void simple3()
throws Exception
{
    Number n1=3, n2=1, n3=4, n4=1, n5=5;
    new RuleRunner().runRules(
        new String[]{"/simple/rule03.drl"},
        n1,n2,n3,n4,n5);      
}

Actually, this method is exactly the same as simple2() with the exception that it supplies rule03.drl to the Rule Engine.

Now let's look at the rule that will sort our numbers:

rule03.drl

package simple
rule "Rule 03"
    when
        $number : Number( $intValue : intValue )
        not Number( intValue < $intValue)
    then
        System.out.println("Number found with value: "
            +$intValue);  
        retract($number);
end

The first line of the rules identifies a Number and extracts the value.  The second line ensures that there does not exist a smaller number than the one found.  By executing this rule, we might expect to find only one number - the smallest in the set.  However, the retraction of the number after it has been printed, means that the smallest number has been removed, revealing the next smallest number, and so on.

So, the output we generate is:

output:

Loading file: /simple/rule03.drl
Inserting fact: 3
Inserting fact: 1
Inserting fact: 4
Inserting fact: 1
Inserting fact: 5
Number found with value: 1
Number found with value: 3
Number found with value: 4
Number found with value: 5

I've not tried any timings with this approach but would be interested to compare this with a sorting algorithm.
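
As a starting point for such a comparison, here is a rough plain-Java sketch of what Rule 03 does conceptually: find a number for which no smaller number exists, print it, remove it, and repeat. The class and method names are my own and are only for illustration. It says nothing about how the rule engine evaluates the conditions internally, so it should not be read as a model of the engine's performance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RetractSortDemo
{
    // Mimics Rule 03 in plain Java: find a number for which no smaller
    // number exists, print it, then remove ("retract") it and repeat.
    static List<Integer> sortByRetraction(List<Integer> facts)
    {
        List<Integer> remaining = new ArrayList<Integer>(facts);
        List<Integer> printed = new ArrayList<Integer>();

        while(!remaining.isEmpty())
        {
            Integer smallest = remaining.get(0);
            for(Integer candidate : remaining)
            {
                if(candidate < smallest)
                {
                    smallest = candidate;
                }
            }
            System.out.println("Number found with value: "+smallest);
            printed.add(smallest);
            remaining.remove(smallest);  // remove(Object), not remove(int)
        }
        return printed;
    }

    public static void main(String[] args)
    {
        // The four distinct facts left in working memory in Step 2.
        sortByRetraction(Arrays.asList(3, 1, 4, 5));
    }
}
```

Each pass scans everything that is left, so this sketch behaves like a selection sort.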

Step 4 - Sorting Cashflows



Now we want to start moving towards our personal accounting rules.  The first step is to create a Cashflow POJO.

Cashflow.java

package com.javarepository.rules;
import java.util.Date;
public class Cashflow
{
    private Date date;
    private double amount;

    public Cashflow(){}

    public Cashflow(Date date, double amount)
    {
        this.date = date;
        this.amount = amount;
    }

    public Date getDate()
    {
        return date;
    }

    public void setDate(Date date)
    {
        this.date = date;
    }

    public double getAmount()
    {
        return amount;
    }

    public void setAmount(double amount)
    {
        this.amount = amount;
    }

    public String toString()
    {
        return "Cashflow[date="+date+",amount="+amount+"]";
    }
}

The Cashflow has two simple attributes, a date and an amount.  I have added a toString method to print it and overloaded the constructor to set the values.

Now, let's add the method simple4() to RuleRunner.

simple4() method from RuleRunner.java

public static void simple4()
throws Exception
{
    Object[] cashflows =
    {
        new Cashflow(new SimpleDate("01/01/2007"), 300.00),
        new Cashflow(new SimpleDate("05/01/2007"), 100.00),
        new Cashflow(new SimpleDate("11/01/2007"), 500.00),
        new Cashflow(new SimpleDate("07/01/2007"), 800.00),
        new Cashflow(new SimpleDate("02/01/2007"), 400.00),
    };

    new RuleRunner().runRules(
        new String[]{"/simple/rule04.drl"},
        cashflows);
}

Here, we simply create a set of Cashflows and supply them and rule04.drl to the RuleEngine.

SimpleDate is a simple class that extends Date and takes a String as input. The code is listed below:

SimpleDate.java

package com.javarepository.rules;
import java.text.SimpleDateFormat;
import java.util.Date;

public class SimpleDate extends Date
{
    public SimpleDate(String datestr)
    throws Exception
    {
        SimpleDateFormat format = new SimpleDateFormat("dd/MM/yyyy");
        this.setTime(format.parse(datestr).getTime());
    }
}

Now, let's look at rule04.drl to see how we print the sorted Cashflows:

rule04.drl

package simple
import com.javarepository.rules.*;
rule "Rule 04"
    when
        $cashflow : Cashflow( $date : date, $amount : amount )
        not Cashflow( date < $date)
    then
        System.out.println("Cashflow: "+$date+" :: "+$amount);  
        retract($cashflow);
end

Here, we identify a Cashflow and extract the date and the amount.  In the second line of the rules we ensure that there is not a Cashflow with an earlier date than the one found.  In the consequences, we print the Cashflow that satisfies the rules and then retract it, making way for the next earliest Cashflow.

So, the output we generate is:

output:

Loading file: /simple/rule04.drl
Inserting fact: Cashflow[date=Mon Jan 01 00:00:00 GMT 2007,amount=300.0]
Inserting fact: Cashflow[date=Fri Jan 05 00:00:00 GMT 2007,amount=100.0]
Inserting fact: Cashflow[date=Thu Jan 11 00:00:00 GMT 2007,amount=500.0]
Inserting fact: Cashflow[date=Sun Jan 07 00:00:00 GMT 2007,amount=800.0]
Inserting fact: Cashflow[date=Tue Jan 02 00:00:00 GMT 2007,amount=400.0]
Cashflow: Mon Jan 01 00:00:00 GMT 2007 :: 300.0
Cashflow: Tue Jan 02 00:00:00 GMT 2007 :: 400.0
Cashflow: Fri Jan 05 00:00:00 GMT 2007 :: 100.0
Cashflow: Sun Jan 07 00:00:00 GMT 2007 :: 800.0
Cashflow: Thu Jan 11 00:00:00 GMT 2007 :: 500.0
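
If you want to check the order above without the rule engine, a plain-Java sort on the date gives the same sequence of amounts. This is only a sketch for cross-checking: Flow is a minimal stand-in that I have made up for the Cashflow class, and the class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.List;

class CashflowSortCheck
{
    // Minimal stand-in for the Cashflow class (hypothetical).
    static class Flow
    {
        final Date date;
        final double amount;

        Flow(String date, double amount)
        {
            this.date = parse(date);
            this.amount = amount;
        }
    }

    // Parses dd/MM/yyyy, as SimpleDate does.
    static Date parse(String datestr)
    {
        try
        {
            return new SimpleDateFormat("dd/MM/yyyy").parse(datestr);
        }
        catch(ParseException e)
        {
            throw new IllegalArgumentException("Bad date: "+datestr, e);
        }
    }

    // Returns the amounts in ascending date order.
    static List<Double> amountsInDateOrder(List<Flow> flows)
    {
        List<Flow> sorted = new ArrayList<Flow>(flows);
        Collections.sort(sorted, new Comparator<Flow>()
        {
            public int compare(Flow a, Flow b)
            {
                return a.date.compareTo(b.date);
            }
        });

        List<Double> amounts = new ArrayList<Double>();
        for(Flow flow : sorted)
        {
            amounts.add(flow.amount);
        }
        return amounts;
    }

    // The same five cashflows as simple4().
    static List<Double> demo()
    {
        List<Flow> flows = new ArrayList<Flow>();
        flows.add(new Flow("01/01/2007", 300.00));
        flows.add(new Flow("05/01/2007", 100.00));
        flows.add(new Flow("11/01/2007", 500.00));
        flows.add(new Flow("07/01/2007", 800.00));
        flows.add(new Flow("02/01/2007", 400.00));
        return amountsInDateOrder(flows);
    }

    public static void main(String[] args)
    {
        System.out.println(demo());
    }
}
```

This prints [300.0, 400.0, 100.0, 800.0, 500.0], which matches the order in which Rule 04 prints the cashflows.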

Step 5 - Processing Credits



Here we extend our Cashflow to give a TypedCashflow which can be CREDIT or DEBIT.  Ideally, we would just add this to the Cashflow type, but so that we can keep all the examples simple, we will go with the extensions.

TypedCashflow.java

package com.javarepository.rules;
import java.util.Date;

public class TypedCashflow extends Cashflow
{
    public static final int CREDIT = 0;
    public static final int DEBIT = 1;

    private int type;

    public TypedCashflow(){}

    public TypedCashflow(Date date, int type, double amount)
    {
        super(date, amount);
        this.type = type;
    }

    public int getType()
    {
        return type;
    }

    public void setType(int type)
    {
        this.type = type;
    }

    public String toString()
    {
        return "TypedCashflow[date="+getDate()
            +",type="
            +(type==CREDIT?"Credit":"Debit")
            +",amount="+getAmount()+"]";
    }
}

There are lots of ways to improve this code, but for the sake of the example this will do.

Now, let's add the method simple5() to RuleRunner.

simple5() method from RuleRunner.java

public static void simple5()
throws Exception
{
    Object[] cashflows =
    {
        new TypedCashflow(new SimpleDate("01/01/2007"),    
            TypedCashflow.CREDIT, 300.00),
        new TypedCashflow(new SimpleDate("05/01/2007"),
            TypedCashflow.CREDIT, 100.00),
        new TypedCashflow(new SimpleDate("11/01/2007"),
            TypedCashflow.CREDIT, 500.00),
        new TypedCashflow(new SimpleDate("07/01/2007"),
            TypedCashflow.DEBIT, 800.00),
        new TypedCashflow(new SimpleDate("02/01/2007"),
            TypedCashflow.DEBIT, 400.00),
    };

    new RuleRunner().runRules(
        new String[]{"/simple/rule05.drl"},
        cashflows);      
}

Here, we simply create a set of Cashflows which are either CREDIT or DEBIT Cashflows and supply them and rule05.drl to the RuleEngine.

Now, let's look at rule05.drl to see how we print the sorted Cashflows:

rule05.drl

package simple
import com.javarepository.rules.*;

rule "Rule 05"
    when
        $cashflow : TypedCashflow( $date : date, $amount : amount,
                type==TypedCashflow.CREDIT )

        not TypedCashflow( date < $date, type==TypedCashflow.CREDIT )
    then
        System.out.println("Credit: "+$date+" :: "+$amount);  
        retract($cashflow);
end

Here, we identify a Cashflow with a type of CREDIT and extract the date and the amount.  In the second line of the rules we ensure that there is not a Cashflow of type CREDIT with an earlier date than the one found.  In the consequences, we print the Cashflow that satisfies the rules and then retract it, making way for the next earliest Cashflow of type CREDIT.

So, the output we generate is:

output:

Loading file: /simple/rule05.drl
Inserting fact: TypedCashflow[date=Mon Jan 01 00:00:00 GMT 2007,
    type=Credit,amount=300.0]
Inserting fact: TypedCashflow[date=Fri Jan 05 00:00:00 GMT 2007,
    type=Credit,amount=100.0]
Inserting fact: TypedCashflow[date=Thu Jan 11 00:00:00 GMT 2007,
    type=Credit,amount=500.0]
Inserting fact: TypedCashflow[date=Sun Jan 07 00:00:00 GMT 2007,
    type=Debit,amount=800.0]
Inserting fact: TypedCashflow[date=Tue Jan 02 00:00:00 GMT 2007,
    type=Debit,amount=400.0]
Credit: Mon Jan 01 00:00:00 GMT 2007 :: 300.0
Credit: Fri Jan 05 00:00:00 GMT 2007 :: 100.0
Credit: Thu Jan 11 00:00:00 GMT 2007 :: 500.0

Step 6 - Processing Credits and Debits


Here we are going to process both CREDITs and DEBITs on two bank accounts to calculate the account balance.  In order to do this, I am going to create two separate Account objects and inject them into the Cashflows before passing them to the Rule Engine.  The reason for this is to provide easy access to the correct Bank Accounts without having to resort to Helper classes.

Let's take a look at the Account class first.  This is a simple POJO with an account number and balance:

Account.java

package com.javarepository.rules;

public class Account
{
    private long accountNo;
    private double balance=0;

    public Account(){}

    public Account(long accountNo)
    {
        this.accountNo = accountNo;
    }

    public long getAccountNo()
    {
        return accountNo;
    }

    public void setAccountNo(long accountNo)
    {
        this.accountNo = accountNo;
    }

    public double getBalance()
    {
        return balance;
    }

    public void setBalance(double balance)
    {
        this.balance = balance;
    }

    public String toString()
    {
        return "Account["
            +"accountNo="+accountNo
            +",balance="+balance
            +"]";
    }
}

Now let's extend our TypedCashflow to give AllocatedCashflow (allocated to an account).

AllocatedCashflow.java

package com.javarepository.rules;
import java.util.Date;

public class AllocatedCashflow extends TypedCashflow
{
    private Account account;
    public AllocatedCashflow(){}

    public AllocatedCashflow(Account account, Date date, int type, double amount)
    {
        super(date, type, amount);
        this.account = account;
    }

    public Account getAccount()
    {
        return account;
    }

    public void setAccount(Account account)
    {
        this.account = account;
    }

    public String toString()
    {
        return "AllocatedCashflow["
            +"account="+account
            +",date="+getDate()
            +",type="
            +(getType()==CREDIT?"Credit":"Debit")
            +",amount="+getAmount()+"]";
    }
}

Now, let's add the method simple6() to RuleRunner.  Here we create two Account objects and inject one into each cashflow as appropriate.  For simplicity I have simply included them in the constructor.

simple6() method from RuleRunner.java

public static void simple6()
throws Exception
{
    Account acc1 = new Account(1);
    Account acc2 = new Account(2);

    Object[] cashflows =
    {
        new AllocatedCashflow(acc1,new SimpleDate("01/01/2007"),
            TypedCashflow.CREDIT, 300.00),
        new AllocatedCashflow(acc1,new SimpleDate("05/02/2007"),
            TypedCashflow.CREDIT, 100.00),
        new AllocatedCashflow(acc2,new SimpleDate("11/03/2007"),
            TypedCashflow.CREDIT, 500.00),
        new AllocatedCashflow(acc1,new SimpleDate("07/02/2007"),
            TypedCashflow.DEBIT,  800.00),
        new AllocatedCashflow(acc2,new SimpleDate("02/03/2007"),
            TypedCashflow.DEBIT,  400.00),
        new AllocatedCashflow(acc1,new SimpleDate("01/04/2007"),    
            TypedCashflow.CREDIT, 200.00),
        new AllocatedCashflow(acc1,new SimpleDate("05/04/2007"),
            TypedCashflow.CREDIT, 300.00),
        new AllocatedCashflow(acc2,new SimpleDate("11/05/2007"),
            TypedCashflow.CREDIT, 700.00),
        new AllocatedCashflow(acc1,new SimpleDate("07/05/2007"),
            TypedCashflow.DEBIT,  900.00),
        new AllocatedCashflow(acc2,new SimpleDate("02/05/2007"),
            TypedCashflow.DEBIT,  100.00)          
    };

    new RuleRunner().runRules(
        new String[]{"/simple/rule06.drl"},
        cashflows);      
}

Now, let's look at rule06.drl to see how we apply each cashflow in date order and calculate and print the balance.

rule06.drl

package simple;
import com.javarepository.rules.*;
rule "Rule 06 - Credit"
    when
        $cashflow : AllocatedCashflow( $account : account,
            $date : date, $amount : amount,
            type==TypedCashflow.CREDIT )

        not AllocatedCashflow( account == $account, date < $date)
    then
        System.out.println("Credit: "+$date+" :: "+$amount);
        $account.setBalance($account.getBalance()+$amount);
        System.out.println("Account: "+$account.getAccountNo()
            +" - new balance: "+$account.getBalance());

        retract($cashflow);
end

rule "Rule 06 - Debit"
    when
        $cashflow : AllocatedCashflow( $account : account,
            $date : date, $amount : amount,
            type==TypedCashflow.DEBIT )

        not AllocatedCashflow( account == $account, date < $date)
    then
        System.out.println("Debit: "+$date+" :: "+$amount);
        $account.setBalance($account.getBalance()-$amount);
        System.out.println("Account: "+$account.getAccountNo()
            +" - new balance: "+$account.getBalance());

        retract($cashflow);
end

Here, we have separate rules for CREDITs and DEBITs; however, we do not specify a type when checking for earlier cashflows.  This is so that all cashflows are applied in date order regardless of their type.  In the conditions of each rule we identify the correct account to work with, and in the consequences we update it with the cashflow amount.

output:

Loading file: /simple/rule06.drl
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
    date=Mon Jan 01 00:00:00 GMT 2007,type=Credit,amount=300.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
    date=Mon Feb 05 00:00:00 GMT 2007,type=Credit,amount=100.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=2,balance=0.0],
    date=Sun Mar 11 00:00:00 GMT 2007,type=Credit,amount=500.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
    date=Wed Feb 07 00:00:00 GMT 2007,type=Debit,amount=800.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=2,balance=0.0],
    date=Fri Mar 02 00:00:00 GMT 2007,type=Debit,amount=400.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
    date=Sun Apr 01 00:00:00 BST 2007,type=Credit,amount=200.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
    date=Thu Apr 05 00:00:00 BST 2007,type=Credit,amount=300.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=2,balance=0.0],
    date=Fri May 11 00:00:00 BST 2007,type=Credit,amount=700.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=1,balance=0.0],
    date=Mon May 07 00:00:00 BST 2007,type=Debit,amount=900.0]
Inserting fact: AllocatedCashflow[account=Account[accountNo=2,balance=0.0],
    date=Wed May 02 00:00:00 BST 2007,type=Debit,amount=100.0]

Debit: Fri Mar 02 00:00:00 GMT 2007 :: 400.0
Account: 2 - new balance: -400.0
Credit: Sun Mar 11 00:00:00 GMT 2007 :: 500.0
Account: 2 - new balance: 100.0
Debit: Wed May 02 00:00:00 BST 2007 :: 100.0
Account: 2 - new balance: 0.0
Credit: Fri May 11 00:00:00 BST 2007 :: 700.0
Account: 2 - new balance: 700.0
Credit: Mon Jan 01 00:00:00 GMT 2007 :: 300.0
Account: 1 - new balance: 300.0
Credit: Mon Feb 05 00:00:00 GMT 2007 :: 100.0
Account: 1 - new balance: 400.0
Debit: Wed Feb 07 00:00:00 GMT 2007 :: 800.0
Account: 1 - new balance: -400.0
Credit: Sun Apr 01 00:00:00 BST 2007 :: 200.0
Account: 1 - new balance: -200.0
Credit: Thu Apr 05 00:00:00 BST 2007 :: 300.0
Account: 1 - new balance: 100.0
Debit: Mon May 07 00:00:00 BST 2007 :: 900.0
Account: 1 - new balance: -800.0
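
The final balances can be cross-checked with a few lines of plain Java. This is only a sketch with made-up class and method names; it leaves out the dates because, unlike the running balance printed above, the final balance of an account does not depend on the order in which the cashflows are applied.

```java
import java.util.Map;
import java.util.TreeMap;

class BalanceCheck
{
    static final int CREDIT = 0;
    static final int DEBIT = 1;

    // Adds a CREDIT to, or subtracts a DEBIT from, the account's balance.
    static void apply(Map<Long,Double> balances, long accountNo,
            int type, double amount)
    {
        Double current = balances.get(accountNo);
        double balance = (current == null) ? 0.0 : current;
        balance += (type == CREDIT) ? amount : -amount;
        balances.put(accountNo, balance);
    }

    // Applies the ten cashflows from simple6(), in the order listed there.
    static Map<Long,Double> finalBalances()
    {
        Map<Long,Double> balances = new TreeMap<Long,Double>();
        apply(balances, 1, CREDIT, 300.00);
        apply(balances, 1, CREDIT, 100.00);
        apply(balances, 2, CREDIT, 500.00);
        apply(balances, 1, DEBIT,  800.00);
        apply(balances, 2, DEBIT,  400.00);
        apply(balances, 1, CREDIT, 200.00);
        apply(balances, 1, CREDIT, 300.00);
        apply(balances, 2, CREDIT, 700.00);
        apply(balances, 1, DEBIT,  900.00);
        apply(balances, 2, DEBIT,  100.00);
        return balances;
    }

    public static void main(String[] args)
    {
        for(Map.Entry<Long,Double> entry : finalBalances().entrySet())
        {
            System.out.println("Account: "+entry.getKey()
                +" - final balance: "+entry.getValue());
        }
    }
}
```

This gives -800.0 for account 1 and 700.0 for account 2, which agrees with the last balance printed for each account by the rules.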

The Source


The source for this blog entry can be downloaded from here.