Transactions play a very important role in business systems; the most common example is the transaction system running in a bank. Basically, transactions raise the following issues:
- Multiple operations, at least one of them a write, run in a specific order.
- Rollback and commit: roll back everything when any operation fails.
All the examples above are single operations, which means you don't need to wait for other operations to finish within a larger business flow. So what is the problem with multiple operations? The second issue above is common to every programming language: we call a function like "commit()" once we are sure that all operations have finished correctly, or "rollback()" when we catch an exception. You will be familiar with this if you have any experience with JDBC programming. The Node.js driver for HANA provides a similar interface.
How about the first issue? In Java we just execute statements one by one, but this doesn't work in Node.js. Why? Remember that Node.js is "non-blocking". Therefore you cannot write code like this:
statement.exec(sql1, function(){});
statement.exec(sql2, function(){});
statement.exec(sql3, function(){});
All of these statements return immediately without waiting for the execution to finish, so they may not execute in the order you expect. For example, "insert then update" logic may be executed as "update then insert".
What is the Node way to do "one by one" logic? Callbacks, as mentioned above. A callback function is called only after the main operation has finished, which is exactly what we want. So you should write the code like this:
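The effect is easy to reproduce without a database. The sketch below is only a simulation: `fakeExec` is a made-up stand-in for `statement.exec`, and the delays are invented so that the second statement happens to finish first.

```javascript
// Hypothetical stand-in for statement.exec(): finishes after `delayMs`.
var completed = [];

function fakeExec(label, delayMs, callback) {
  setTimeout(function () {
    completed.push(label);
    callback(null);
  }, delayMs);
}

// Fired back to back, like the statement.exec() calls above.
fakeExec('insert', 30, function () {});
fakeExec('update', 10, function () {});

// Both calls have already returned, but neither statement has finished yet.
console.log('finished so far:', completed.length);

setTimeout(function () {
  // The order of completion follows the delays, not the order of the calls.
  console.log('completion order:', completed.join(', '));
}, 100);
```

Nothing in this code ties the second call to the completion of the first one, and that missing link is exactly what a transaction needs.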
client.exec(querySql, function (err, rows) {
  if (err) {
    // rollback
  } else {
    client.exec(insertSql, function (err, affectedRows) {
      if (err) {
        // rollback
      } else {
        client.exec(updateSql, function (err, affectedRows) {
          // ....
        });
      }
    });
  }
});
Well, it looks like... yes, a pyramid. This is known as the "callback pyramid" or "callback hell", and it is the point on which Node.js is criticized again and again. The issue is not limited to database operations; it exists widely in Node.js code, and you can imagine how bad it gets with five or more nested callbacks!
This is certainly a shock to programmers who are used to writing "normal" code in languages like Java. Fortunately, there are some solutions, and one of them is a component called "async" (caolan/async · GitHub). I have to say that "async" is one of the most important components in Node.js. It covers flow control and collection operations, so when you need to run a for-each or do-while loop over asynchronous operations, you had better use this component rather than write the code yourself. We will not introduce all of its functions one by one; we will just look at how to use it for transaction operations.
The most frequently used methods in "async" are:
- series: run tasks one by one
- waterfall: run tasks one by one, and pass each result to the next task
- parallel: run tasks at the same time
This may be a little hard to understand, so here is the simplest possible example:
- Check record count
- First write operation: add the new customer's name to the customer table
- Second write operation: add the new customer's nationality
- Commit
It is a little complicated, so we will split the whole program into pieces and explain the details.
First you need to download the component with npm and import it at the beginning:
var async = require('async');
Let's consider the requirement: every customer should have a unique, auto-incrementing ID, and we obtain the new ID with "max+1" (this is not good practice, so don't do it in a real system; it is here only to show how to deal with a dependency between operations). According to the "async" documentation, we need the waterfall flow when there are dependencies between operations, because it lets you pass parameters to the next operation. You can call
async.waterfall([connect, query, insertName, insertNation], done);
to run the asynchronous operations. The first parameter is an array of the operation functions; for order-sensitive flows such as waterfall or series, they run in array order. The second parameter, "done", is a callback function that is called when all of the operations have finished, or as soon as any operation reports an error.
The corresponding SQL statements are:
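To make the mechanics of that call less magical, here is a hand-written sketch of what a waterfall does. `miniWaterfall` is a made-up name and a deliberate simplification, not the library's implementation, and the tasks call next() synchronously just to keep the example short.

```javascript
// Hand-written stand-in for a waterfall flow, for illustration only.
function miniWaterfall(tasks, done) {
  function runTask(index, args) {
    if (index === tasks.length) {
      // Every task succeeded: hand the last results to done().
      return done.apply(null, [null].concat(args));
    }
    var next = function (err) {
      if (err) { return done(err); }  // an error skips all remaining tasks
      var results = Array.prototype.slice.call(arguments, 1);
      runTask(index + 1, results);
    };
    // Each task receives the previous results, followed by its own next().
    tasks[index].apply(null, args.concat(next));
  }
  runTask(0, []);
}

var log = [];

// Success path: the value passed to next() arrives in the following task.
miniWaterfall([
  function readMax(next) { log.push('readMax'); next(null, 41); },
  function insert(maxId, next) { log.push('insert ' + (maxId + 1)); next(null, maxId + 1); }
], function done(err, newId) {
  log.push(err ? 'failed: ' + err.message : 'done ' + newId);
});

// Error path: the second task never runs, done() gets the error directly.
miniWaterfall([
  function failing(next) { next(new Error('boom')); },
  function neverRuns(next) { log.push('should not happen'); next(); }
], function done(err) {
  log.push('failed: ' + err.message);
});

console.log(log);
```

The two runs show both exits: values travel from task to task on success, and any error jumps straight to done().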
var querySql = "SELECT MAX(ID) FROM NODETEST.CUSTOMERS";
var insertNameSql = "INSERT INTO NODETEST.CUSTOMERS VALUES(?,'Ruby')";
var insertNationSql = "INSERT INTO NODETEST.NATION VALUES(?,'JP')";
Next let's check the operations:
function connect(next) {
  client.setAutoCommit(false);
  client.connect(next);
}
First comes connect(). Something here may look familiar: we set auto-commit to false, which enables transactions. The function has a parameter called "next", which is itself a function; you will find that all the following functions have this parameter too. It is very important in async: you MUST call it when you want to "return" from the current operation function, and the behavior of "next" depends on the parameters you pass to it:
- If an error occurred and you want to stop the async control flow, pass something (the error information) as the first parameter. The "done()" function (from the "async.waterfall()" statement) is then called, and the following operations do not run.
- If you want to move on to the next operation, set the first parameter to null; you can add further optional parameters if you need to pass something along. In the waterfall flow, this optional data is passed to the next operation; in series, you find it in "done()". Check the async reference for more detail.
- Calling it with no parameters at all is also acceptable, which means moving on without any optional data.
Forgetting to call "next()" is the most common mistake when coding with async. If your program hangs in one of the operations without any error, check whether you forgot to call it.
Let's go back to the connect() function. We pass next() as the callback of client.connect(), so the flow moves on once client.connect() has a result: to query() on success, or to done() on failure. Here we suppose that the connection succeeded and go on to query():
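One way to make a forgotten call easier to spot while debugging is to wrap each operation in a small watchdog. This is only a sketch: `withNextTimeout` is a hypothetical helper written for this article, not part of async or of the HANA driver, and it assumes that next is always the last argument of an operation.

```javascript
// Hypothetical helper: wraps a task so a missing next() surfaces as an error.
function withNextTimeout(task, timeoutMs) {
  return function () {
    var args = Array.prototype.slice.call(arguments);
    var next = args.pop();          // assumption: next is the last argument
    var finished = false;

    var timer = setTimeout(function () {
      if (finished) { return; }
      finished = true;
      next(new Error((task.name || 'task') +
        ' did not call next() within ' + timeoutMs + ' ms'));
    }, timeoutMs);

    task.apply(null, args.concat(function () {
      if (finished) { return; }     // ignore a late call after the timeout fired
      finished = true;
      clearTimeout(timer);
      next.apply(null, arguments);
    }));
  };
}

// An operation with the classic bug: the success branch never calls next().
function forgetful(next) {
  var err = null;
  if (err) { next(err); }
  // next() is missing here
}

var reported = null;
withNextTimeout(forgetful, 20)(function (err) {
  reported = err;
  console.error(err.message);
});
```

Wrapped this way, a silent hang turns into an error that names the operation, and that error would travel to done() like any other.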
function query(next) {
  client.exec(querySql, { rowsAsArray: true }, function (err, rows) {
    if (err) {
      next(err);
    } else {
      console.log('Max:' + rows[0][0]);
      next(null, rows[0][0]);
    }
  });
}
This is a simple example of an async operation element: it receives a next() function as a parameter and runs the query. In the query's callback we can see the two ways of calling next(); in the no-error branch, we pass the max ID as the second parameter, which is the optional data for the next operation.
// Insert new records
function insertName(maxId, next) {
  client.prepare(insertNameSql, function (err, statement) {
    if (err) {
      next(err);
    } else {
      var newId = maxId + 1;
      statement.exec([newId], function (err, rows) {
        if (err) {
          next(err);
        } else {
          next(null, newId);
        }
      });
    }
  });
}

function insertNation(newId, next) {
  client.prepare(insertNationSql, function (err, statement) {
    if (err) {
      next(err);
    } else {
      statement.exec([newId], function (err, rows) {
        if (err) {
          next(err);
        } else {
          next();
        }
      });
    }
  });
}
The insert operations are shown above. Since query() sends the max ID to the next operation, insertName() has one additional parameter, "maxId", compared with query(). In insertName() we compute the new ID and also send it on to insertNation(). Note that we call next() with the error information in every error-handling branch.
OK, here we reach the final function, done():
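function done(err, rows) {
  // Restore auto-commit and close the connection only after the
  // commit or rollback has actually finished.
  function finish() {
    client.setAutoCommit(true);
    client.end();
  }

  if (err) {
    console.error(err);
    console.log("Rollback!");
    client.rollback(function (rollbackErr) {
      if (rollbackErr) {
        console.error('Error:', rollbackErr);
      }
      finish();
    });
  } else {
    client.commit(function (commitErr) {
      if (commitErr) {
        console.error('Error:', commitErr);
      } else {
        console.log('Commit!');
      }
      finish();
    });
  }
}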
As mentioned above, done() is called if any next() receives a non-null first value. Some of you may be confused by the relationship between next() and done(); yes, the relation is rather ambiguous, because next() does some work behind the scenes. You don't really need to worry about it, just remember: "you pop out at done() if there is any error, otherwise at the next operation, just like a gopher". In this example, if done() takes the error branch, the database rolls back and discards all write results, no matter which operation failed; the results are committed only if we get here without any error. We also set auto-commit back on manually, which is unnecessary in this example because we close the connection right afterwards; however, if you keep a long-lived connection and no longer need manual commits, remember to switch auto-commit back on in time.
OK, that is the whole flow of a waterfall transaction. Let's run it and watch the output:
It seems correct, but how about the rollback? Let's construct an error scenario with a one-line change in insertNation:
function insertNation(newId, next) {
  client.prepare(insertNationSql, function (err, statement) {
    if (err) {
      next(err);
    } else {
      var errorId = newId - 1; // Error
      statement.exec([errorId], function (err, rows) {
        if (err) {
          next(err);
        } else {
          next();
        }
      });
    }
  });
}
Since the nation table has a unique column, USER_ID, inserting a duplicate value raises an error.
You can then check the tables to confirm that no new record was inserted.
That concludes this introduction to implementing transactions in Node.js. As you may have noticed, the code gets more and more complicated and requires some coding techniques that differ from those of other languages.
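If you have no database at hand, the all-or-nothing behavior can also be tried against a toy model. Everything in this sketch is hypothetical: `createFakeClient` and `addCustomer` are made-up names, the "tables" are plain arrays of IDs, and the fake client only imitates the idea of buffering writes until commit; it says nothing about how HANA works internally.

```javascript
// Hypothetical in-memory stand-in for a database client; not the HANA driver.
function createFakeClient(initial) {
  var tables = initial;   // committed IDs per table
  var pending = [];       // uncommitted writes: { table, id }

  return {
    tables: tables,
    insert: function (table, id, callback) {
      var taken = tables[table].indexOf(id) !== -1 ||
        pending.some(function (w) { return w.table === table && w.id === id; });
      if (taken) {
        return callback(new Error('duplicate id ' + id + ' in ' + table));
      }
      pending.push({ table: table, id: id });
      callback(null);
    },
    commit: function (callback) {
      pending.forEach(function (w) { tables[w.table].push(w.id); });
      pending = [];
      callback(null);
    },
    rollback: function (callback) {
      pending = [];       // discard every uncommitted write
      callback(null);
    }
  };
}

// Two dependent writes: commit only if both succeed, otherwise roll back.
function addCustomer(client, customerId, nationId, done) {
  client.insert('customers', customerId, function (err) {
    if (err) { return client.rollback(function () { done(err); }); }
    client.insert('nation', nationId, function (err) {
      if (err) { return client.rollback(function () { done(err); }); }
      client.commit(done);
    });
  });
}

var client = createFakeClient({ customers: [1], nation: [1] });
var outcomes = [];

// Like the error scenario: the nation insert reuses an existing ID and fails.
addCustomer(client, 2, 1, function (err) {
  outcomes.push(err ? 'rollback' : 'commit');
});

// With a fresh ID in both tables, the whole transaction is committed.
addCustomer(client, 2, 2, function (err) {
  outcomes.push(err ? 'rollback' : 'commit');
});

console.log(outcomes, client.tables);
```

The first attempt leaves both tables untouched even though its customer insert succeeded, which is the same thing you confirm by checking the real tables after the failed run.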