Transaction In Doubt

My notes on Software Engineering

Tag: javascript

Polyglot

From its beginning the objective of the durable_rules project has been to provide a library, which can be leveraged in the most popular languages for building internet applications. Even though durable_rules borrows a healthy number of concepts from the Production Business Rules and Expert Systems domains, I consider it critical to tap into the infrastructure and knowledge base, which has been built over the last few years with languages such as JavaScript, Python and Ruby.

The initial durable_rules prototypes were built using JavaScript and the node.js environment. As I considered targeting a larger audience, the implementation evolved into a core engine written in C, using JSON as the lingua-franca for defining both rulesets and the data to be processed. A set of thin adapters would allow using the engine in JavaScript, Python and Ruby. I borrowed the MongoDB JSON query syntax as the linguistic abstraction for defining rulesets. For optimal performance the data in Redis would be stored using the message pack format (not JSON).

arch

Because scripting languages have great support for the JSON data model (hash sets, arrays and primitives) I was optimistic (perhaps naive) this approach would be a one-size fits all solution. However, as the project progressed, I found a couple of problems with this idea. 

First problem

In durable_rules the logical (and, or) and sequence operator (when_all, when_any) precedence matters. In MongoDB an expression like: ‘qty > 100 and price < 9.95’ is represented as ‘{{ qty: { $gt: 100 } }, { price: { $lt: 9.95 } }}’. This is nice and simple, but the JSON data model doesn’t imply hash-set ordering. What if I don’t want the engine to evaluate price when qty < 100? …And then came other features such as correlations, statecharts, flowcharts, parallel, salience, tumbling windows, count windows… I departed from MongoDB queries. The example below shows how a typical rule is defined in the latest durable_rules implementation. This rule waits for three events: The first of type ‘purchase’. The second with the same ‘ip’ address as the first but different ‘cc’ number. And the third with same ‘ip’ address as the first, but ‘cc’ number different than that of the second and the first.

JSON.stringify({
    suspect: {
        all: [
            {first: {t: 'purchase'}},
            {second: { 
                $and: [
                    {ip: {first: 'ip'}}, 
                    {$neq: {cc: {first: 'cc'}}} 
                ]
            }},
            {third: { 
                $and: [
                    {ip: {first: 'ip'}}, 
                    {$neq: {cc: {first: 'cc'}}}, 
                    {ip: {second: 'ip' }}, 
                    {$neq: {cc: {second: 'cc'}}}
                ]
            }}
        ],
    }
})

Second problem

The rule above expresses precisely what I want. But I just cannot imagine manually defining tens, hundreds or thousands of rules using the format above. It is not sustainable. Ruby, with its great DSL support, gave me the creative inspiration for making significant improvements over the user model. By leveraging language extensibility I was able to define and follow the set of principles for the three libraries I have implemented so far JavaScript, Python, Ruby:

  1. Leverage the language
    • Use existing expression language
    • Use existing scoping constructs
    • Use blocks, lambdas and decorators
  2. Enable compact and maintainable code
    • Avoid nesting\scoping when possible
    • Encourage the user to define short phrases, avoid cascading
    • Facilitate troubleshooting

To the extent possible I tried driving uniformity across library implementations, however I did not sacrifice usability for uniformity. Below is a summary of the features I used for implementing the libraries. 

arch

Notes on Ruby

Ruby has the most comprehensive language extensibility: Operator overloading for logical and arithmetic expressions. Property accessor interception to leverage ‘.’ notation. Block support to seamlessly integrate actions as part of the rule definition. Contexts applied to block execution to avoid unnecessary variable qualification in expression rvalues and action implementations. All these features lead to a compact and fairly well integrated ruleset definitions.  

Durable.ruleset :suspect do
  when_all c.first = m.t == "purchase",
           c.second = (m.ip == first.ip) & (m.cc != first.cc),
           c.third = (m.ip == second.ip) & (m.cc != first.cc) & (m.cc != second.cc) do
    puts "detected " + first.cc + " " + second.cc + " " + third.cc
  end
end

‘durable_rules’ exposes a REST API to post events and assert facts. I used the Sinatra framework to implement such an API. It was very easy to integrate, except for the threading model: for each request a synchronous response is expected and multiple concurrent requests are dispatched in different threads in a single process.

One caveat, which really bothers me: In Ruby it is not possible to overload logical ‘and’ and ‘or’. I had to use bitwise operators instead. The bitwise operator precedence is different from that of their logical counterparts. So, it is required to enclose logical expressions with parens ‘(m.ip == first.ip) & (m.cc != first.cc)’ as opposed to ‘m.ip == first.ip && m.cc != first.cc’.

Notes on Python

Python has great language extensibility. Similar to Ruby: operator overloading for logical and arithmetic expressions and property accessor interception allows using ‘.’ notation for tuple names. Function decorators for rule action integration. Contexts for leveraging scoping constructs when defining rulesets, statecharts and flowcharts. This results in compact and well integrated ruleset definitions.

with ruleset(’suspect'):
    @when_all(c.first << m.t == 'purchase',
              c.second << (m.ip == first.ip) & (m.cc != first.cc),
              c.third << (m.ip == second.ip) & (m.cc != first.cc) & (m.cc != second.cc))
    def detected(c):
        print ('detected {0} {1} {2}'.format(c.first.cc, c.second.cc, c.third.cc))

I used the Werkzeug framework to implement the REST API. I considered using Flask, but I really didn’t need all its features. Similar to Sinatra and Ruby: for each request a synchronous response is expected and multiple concurrent requests are dispatched in different threads in a single process.

Python has the same caveat as Ruby regarding logical ‘and’ and ‘or’ overloading. Lambda support in Python is fairly limited, so I resorted to function decorators, still with a very reasonable outcome. In python ‘=‘ is a statement, so I had to use the ‘<<‘ operator for naming events and facts in correlated sequences.  

Notes on JavaScript

JavaScript does not have operator overloading, so a set of functions for describing logical and arithmetic expressions needs to be provided. Property accessor overloading is not readily available, I played a nifty trick to overcome this obstacle (see note below). Lambda support enables rule action integration. Contexts allow leveraging scoping constructs when defining rulesets, statecharts and flowcharts. While the result seems acceptable, the question still remains: has this crossed the usability tipping point?

with (d.ruleset('suspect')) {
    whenAll(c.first = m.amount.gt(100),
            c.second = m.ip.eq(c.first.ip).and(m.cc.neq(c.first.cc)), 
            c.third = m.ip.eq(c.second.ip).and(m.cc.neq(c.first.cc), m.cc.neq(c.second.cc)),
        function(c) {
            console.log('detected ' + c.first.cc + ' ' + c.second.cc + ' ' + c.third.cc);
        }
    );
}

Node.js provides great support for implementing the REST API. In addition the single threaded continuation based model is a beautiful invention (in my opinion), it helps with JavaScript ‘global’ variable management, enables non-blocking IO calls and provides nice concurrency control using multiple processes.

‘m.amount.gt(100)’: It is easy to take it for granted. In order to enable such an expression without requiring the user to define object hierarchies, I leveraged V8 extensibility in C++ (given that the rulesets are run in node.js). In V8 it is possible to provide a proxy object that intercepts Property Set\Get and call back a JavaScript lambda:

C++ class definition:

class Proxy {
public:

  explicit Proxy(Handle<Function> gvalue, Handle<Function> svalue) { 
    Isolate* isolate = Isolate::GetCurrent();
    _gfunc.Reset(isolate, gvalue); 
    _sfunc.Reset(isolate, svalue); 
  }
  Handle<Object> Wrap();

private:

  static Proxy* Unwrap(Local<Object> obj);
  static void Get(Local<String> name, const PropertyCallbackInfo<Value>& info);
  static void Set(Local<String> name, Local<Value> value, const PropertyCallbackInfo<Value>& info);
 
  Persistent<Function> _gfunc;
  Persistent<Function> _sfunc;
};

From JavaScript

var s = r.createProxy(
    function(name) {
      ….
    },
    function(name, value) {
    }
 );

Please find the full implementation in durable_rules code base.

A note on multi-platform support

Unfortunately a Polyglot framework does not imply multi-platform support. In the Unix ecosystem, the ‘durable_rules’ libraries (including the C engine implementations) have a high degree of portability. That is to say: having implemented the framework in the Mac OS X platform, very few changes were needed to make it work in Linux. Integrating the Windows platform has been a lot more difficult: VisualStudio, which is the premier toolset for Windows, doesn’t fully support C99 (variable length arrays, asprintf, snprintf…), luckily all these limitations can be overcome without needing fundamental changes nor creating a different code fork. Hiredis is not officially supported by Redis for Windows, the MS Open Tech team has done a nice job porting and maintaining the Redis code base for Windows. I extracted the ‘Hiredis’ portion for the ‘durable_rules’ C engine as it heavily depends on it. 

 

Rete Meets Redis

In my previous post I briefly touched on the subject of Rules and the work I published in NPM as ‘durable’, which source is available under http://www.github.com/jruizgit/rules. In this post I explore in detail how to use the Rete algorithm and the Redis server as a powerful combination for processing real time events. A little bit of background first: The Rete algorithm was created a few decades ago and has been used in Inference based Expert Systems for efficient execution of business rules. Redis is a data-structure server, currently  used in a large variety of scalable applications.

Ruleset Definition

A ruleset is a set of conditions followed by actions. In ‘durable’ I use rulesets to express conditions about events, that is, things that happen. Let’s illustrate this with a very simple and downright simplistic example. Imagine we are rating the effectiveness of a web page, and we establish the following rules:

  1. As soon as a visitor hits our web page, we will start a timer of 5 minutes (assuming the visitor is now reading).
  2. When our visitor clicks on a link within the 5 minute timeout, we are going to increase the page score (yes the page worked!)
  3. When the timeout expires and our visitor did not click on any link, we are going to decrease the page score (the page was abandoned)

I now translate the rules above to the simple JSON abstraction used in ‘durable’ for expressing rules. Don’t worry if you don’t understand all the details as long as you see how it relates to the description above, you should be able to follow. If you are curious about the rules definition abstraction please visit https://github.com/jruizgit/rules/blob/master/concepts.md.

rating: {
    r1: {
        when: { event: ’start' },
        run: function(s) {
            s.state = ‘reading’;
            s.startTimer(’timeout’, 300);
        }
    },
    r2: {
        whenAll: {
            $m: { event: ’click’  },
            $s: { state: ‘reading’ }
        },
        run: function(s) { increaseScore(); }
    },
    r3: {
        whenAll: {
            $m: { $t: ’timeout’ },
            $s: { state: ‘reading’ }
        },
        run: function(s) { decreaseScore(); }
    }
}

The Rete Tree

When running the ruleset above, the first thing the ‘durable’ runtime is going to do is create a Rete tree as shown below:

Rating Ruleset Rete Tree

In a Rete tree there are three different kinds of nodes:

  1. The solid line circles in the diagram above are called alpha nodes. They have one input and can have several outputs. Their main task is routing data into beta or action nodes. They are associated with an assertion about the input. When the assertion is met, the data is pushed into all children nodes.
  2. The dashed line circles are known as beta nodes. They have two or more inputs and one output. They are in charge of doing join operations. That is, pushing data into action nodes when all input criteria has been met.
  3. The squares represent actions. Actions do real work (such as sending email or triggering other events), they can change the state of the tree, negating previous state and asserting new state.

The two top level alpha nodes represent the type of assertions that can be made. In the case of durable there are always two top level nodes: one for event messages and another for user state changes.

Forward Chaining

Now it is time to consider how an event sequence works. Please pay special attention to step 2.iv. Remembering facts about rules antecedent state is the key for high performance rule evaluation. When the event in step 3 happens, the system doesn’t need to reevaluate the entire ruleset and it does not need to deserialize the user state.

1. Let’s start by posting the following message:

{
    id: 1,
    event: ‘start'
}
i. The message comes through the REST handler and is posted to the ruleset’s Rete tree.
ii. The message is pushed to the alpha node `type = ‘$m’`. The condition is matched, so it is now pushed to the next alpha node.
iii. The message meets the condition `$m.event = ‘start’`. It is then pushed to the action node.
iv. An action item is added to the action queue of the ruleset. The REST handler response is sent to back to the user at this point.

2. The action is picked up by the background dispatch loop:

i. The action is executed, the state is changed to `s.state = ‘reading’`.
ii. The new state is pushed to the alpha node `type = ‘$s’`. The condition is matched. The state is pushed to the next alpha node.
iii. The state meets the condition `s.state = ‘reading’`, so it is pushed to the beta node.
iv. The ‘all’ join in the beta node is not complete yet. This input is remembered (this fact is stored in memory).

3. Now we assume the user clicked on a link in the page and as a result the following message is posted:

{
    id: 2,
    event: ‘click'
}
i. The request comes through the REST handler and is posted to the ruleset’s Rete tree.
ii. The message is pushed to the alpha node `type = ‘$m’`, which pushes the message to the next alpha node.
iii. The message meets the condition `$m.subject = ‘approved’` and is pushed to the beta node.
iv. The ‘all’ join is now satisfied, so the message is pushed to the action node.
v. Now an action is added to the action queue of the ruleset. The REST handler response is sent to back to the user at this point.

4. The action is picked up by the background dispatch loop.

i. The all join state is deleted.
ii. The action is executed.

Tree State in Redis

In the previous section you might have noticed I glossed over the Rete tree memory management details (points 2.iv and 3.iv). In order to scale out, that is, to be able to use several cores to evaluate the ruleset, it is necessary to store the tree state in a cache external to the processes. Managing the beta nodes state needs careful consideration. Let’s look at a first approach to handling step 2.iv and 3.iv above:

Get the beta node state from the cache
If the ‘all' join is not complete
    Store the fact that one beta node side is complete
Else
    Delete the beta node state from the cache
    Add an item to the action queue

The pseudo code above has two fatal flaws: 1. A race condition when checking and storing the beta node state can lead to losing event messages 2. Failure after deleting the beta node state can lead to dropping the action. To fix both problems we need to use a lock and the code needs to be idempotent, so it can be retried in case of failure.

Acquire a lock with a timeout (in case of failure)
Get the beta node state from the cache
If the ‘all' join is not complete
    Store the fact that one beta node side is complete
Else
    Add an item to the action queue if it has not been added
    Delete the beta node state from the cache
Release the lock

Now the code above requires up to 5 cache accesses. In addition, recovering from a failure will delay waiting for the lock to timeout. In order to achieve real-time event processing performance only one memory cache access can be afforded for each relevant event (I came to this conclusion after studying the performance profiles from my benchmarks). There has to be a better way of doing this.

Redis offers features such as sorted sets, single threaded multi command execution and Lua scripting, which can be used to efficiently address the problem above. The Forward Chaining algorithm has to be modified for efficient memory management and join evaluation. Specifically: 1. Push the beta node state all the way down to the action nodes 2. The action nodes now are in charge of storing the information in the Redis, evaluating upstream joins and enqueuing actions in a single multi execute sequence.To illustrate this point, for the example above, the following sets are created to store the join information:

  • rating.r1.$m
  • rating.r2.$s.$s
  • rating.r2.$m.$m
  • rating.r3.$s.$s
  • rating.r3.$m.$m

The join evaluation works as follows:

Multi execute sequence for step 2.iv.
    Add an entry identifying the user state to rating.r2.$s.$s.
    Join by testing if rating.r2.$s.$s and rating.r2.$m.$m have an entry.
Multi execute sequence for step 3.iv.
    Add an entry identifying the event message to rating.r2.$m.$m.
    Join by testing if rating.r2.$s.$s and rating.r2.$m.$m have an entry.
    Add action entry to the action queue.
    Remove the entries in rating.r2.$m.$m and rating.r2.$s.$s.

Please note multi execute is run as one single threaded set of instructions in the Redis server. The use of sets ensure the idempotency of operations (it can be retried in case of failure).