One of the main design requirements is to baseline the expected volume of data flowing through the middleware layer. Recently, one of my clients asked me to review the design of some interfaces that were failing load testing and could not pass the peak volume tests. They were seeing server crashes and JVM out-of-memory errors. On analysis, the peak volumes were far too large for these interfaces to handle: the payload file size touched a peak of 25 MB. As per the design, the process received the XML payload as a string, and converting it to a variable and transforming it clogged the heap, eventually making the JVM run out of heap space.
Multiple options came to the table. I am listing a few of them:
1. Split the payload into smaller chunks before feeding it to the BPEL process and process them one by one (a sketch of the splitting step follows this option's points).
+ve --> The BPEL process is left untouched, as the flow stays the same and the payload size per chunk is manageable.
-ve --> Additional components are needed to split the payload into smaller chunks, and every extra component adds another point of exception handling.
-ve --> The overhead of maintaining the atomicity of the transaction across chunks is too high and would complicate the entire system.
-ve --> If the process uses control tables, it will involve multiple calls to the control tables that track the end-to-end processing of the document.
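For illustration, here is a minimal Java (StAX) sketch of the splitting component this option calls for. It assumes a hypothetical payload of the form <orders><order>...</order></orders>; the element name, chunk size and output file naming are placeholders, not the client's actual interface.

// Minimal sketch (illustrative only): split a large <orders> payload into
// smaller chunk files with StAX streaming, so the full document is never
// held in memory. Element names, chunk size and file names are assumptions.
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;

public class PayloadSplitter {

    public static void split(File bigXml, int recordsPerChunk) throws Exception {
        XMLInputFactory inFactory = XMLInputFactory.newInstance();
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        XMLEventFactory eventFactory = XMLEventFactory.newInstance();

        XMLEventReader reader = inFactory.createXMLEventReader(new FileInputStream(bigXml));
        XMLEventWriter writer = null;
        int records = 0, chunk = 0, depth = 0;
        boolean inRecord = false;

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();

            if (!inRecord && event.isStartElement()
                    && "order".equals(event.asStartElement().getName().getLocalPart())) {
                // Open a new chunk file every recordsPerChunk records
                if (records % recordsPerChunk == 0) {
                    if (writer != null) {
                        writer.add(eventFactory.createEndElement("", "", "orders"));
                        writer.close();
                    }
                    writer = outFactory.createXMLEventWriter(
                            new FileOutputStream("chunk-" + (chunk++) + ".xml"), "UTF-8");
                    writer.add(eventFactory.createStartDocument());
                    writer.add(eventFactory.createStartElement("", "", "orders"));
                }
                inRecord = true;
                records++;
            }

            if (inRecord) {
                writer.add(event);                              // copy the record as-is
                if (event.isStartElement()) depth++;
                if (event.isEndElement() && --depth == 0) inRecord = false;
            }
        }

        if (writer != null) {
            writer.add(eventFactory.createEndElement("", "", "orders"));
            writer.close();
        }
        reader.close();
    }
}

Each chunk file can then be fed to the unchanged BPEL process as a normal, small payload.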
2. Go for a complete Java alternative that parses the XML file, converts it into Java POJOs, and inserts them directly into the database or writes them to a file (a sketch follows this option's points).
+ve --> Stable, since XML parsers in Java are a tried and tested option.
-ve --> It moves away from the common architecture of middleware components such as BPEL and ESB to a more conservative approach.
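A minimal sketch of this Java-only route, again on the assumed <order> records: stream-parse with StAX into a throwaway POJO and batch-insert over plain JDBC. The POJO fields, table and column names are placeholders.

// Minimal sketch (illustrative only): stream the payload with StAX, map each
// <order> to a small POJO and batch-insert it over JDBC. Field, table and
// column names are assumptions for the example.
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.FileInputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class XmlToDatabase {

    static class Order {          // throwaway POJO for one record
        String id;
        String amount;
    }

    public static void load(String xmlFile, String jdbcUrl) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new FileInputStream(xmlFile));

        try (Connection con = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO orders (id, amount) VALUES (?, ?)")) {

            Order current = null;
            String field = null;
            int batched = 0;

            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if ("order".equals(reader.getLocalName())) {
                        current = new Order();
                    } else {
                        field = reader.getLocalName();   // which child element we are in
                    }
                } else if (event == XMLStreamConstants.CHARACTERS
                        && current != null && field != null) {
                    String text = reader.getText().trim();
                    if ("id".equals(field)) current.id = text;
                    if ("amount".equals(field)) current.amount = text;
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if ("order".equals(reader.getLocalName()) && current != null) {
                        ps.setString(1, current.id);
                        ps.setString(2, current.amount);
                        ps.addBatch();
                        current = null;
                        // flush in batches so memory stays flat for any payload size
                        if (++batched % 500 == 0) ps.executeBatch();
                    }
                    field = null;
                }
            }
            ps.executeBatch();                           // flush the final partial batch
        } finally {
            reader.close();
        }
    }
}

Because the reader streams and the inserts are flushed in batches, memory stays flat no matter how large the payload grows.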
3. Transform the XML document in smaller chunks in the XSL transformation, so that we directly address the clogging of heap space caused by transforming the whole document at once. The option was to process smaller chunks of the document within a while loop in the BPEL process (a sketch follows the conclusion below).
+ve --> The process has better end-to-end control, as the while loop lives inside it and can use all the transaction management capabilities of BPEL to maintain the atomicity of the transaction.
-ve --> If the process uses control tables, it will involve multiple calls to the control tables that track the end-to-end processing of the document.
The third option is the best workaround to handle the huge payload transformation and the design issues. The huge payload was processed successfully, and the interfaces passed the negative and load testing with flying colours.
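The real fix lives in the BPEL while loop and the XSLT, which I cannot reproduce here, but the idea translates into a short Java sketch: keep invoking the same transformation with a moving lowerIndex/upperIndex window so that each pass produces only one slice of the output. The parameter names, and the assumption that the stylesheet filters records on position(), are mine for illustration, not the client's actual artifacts.

// Minimal sketch (illustrative only): drive a chunked XSL transformation by
// passing a lower/upper index window on every pass. The stylesheet is assumed
// to declare xsl:param name="lowerIndex" / "upperIndex" and to select only
// /orders/order[position() >= $lowerIndex and position() <= $upperIndex].
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.File;

public class ChunkedTransform {

    public static void transform(File payload, File stylesheet,
                                 int totalRecords, int chunkSize) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();

        int lower = 1;
        int chunk = 0;
        // Equivalent of the BPEL while loop: one index window per iteration
        while (lower <= totalRecords) {
            int upper = Math.min(lower + chunkSize - 1, totalRecords);

            Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));
            transformer.setParameter("lowerIndex", lower);
            transformer.setParameter("upperIndex", upper);

            // Only the records inside [lower, upper] end up in this chunk's output
            transformer.transform(new StreamSource(payload),
                    new StreamResult(new File("transformed-" + (chunk++) + ".xml")));

            lower = upper + 1;
        }
    }
}

Since each pass materialises only one window of records, the result of any single transformation stays small, which is what stopped the heap from clogging.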
4 comments:
If performance/memory usage with large XML is an issue, you might want to check out vtd-xml, which is the latest and greatest XML processing model available... Oracle may be able to offer this solution to help you solve the issues, go ask them.
Can you explain the third option a bit more? I am facing the same issue.
There are options in XSL where you can pass index variables, a lower index and a higher index, which are used to control the number of elements that get passed while invoking the new child process / get transformed at a time, so as to reduce the memory clogging.
WSO2 ESB also provides lightweight solutions to process huge payloads. It is available at http://wso2.com/products/enterprise-service-bus/
There are mediators available to process XML data in the payload very easily.