Warning: this page describes an advanced variant of XML and JSON import that can lead to data inconsistency if used inappropriately.
By default, every import runs as a single database transaction: either everything is saved or nothing is.
This is usually the desired behavior and there is rarely a reason to change it. When transactional behavior is not required, however, the atomic attribute on the root winstrom element can be used to switch it off.
XML
<?xml version="1.0"?>
<winstrom version="1.0" atomic="false">
  <faktura-vydana>
    <id>code:123</id> ...
    <polozkyFaktury>
      <faktura-vydana-polozka>...</faktura-vydana-polozka>
      <faktura-vydana-polozka>...</faktura-vydana-polozka>
      <faktura-vydana-polozka>...</faktura-vydana-polozka>
    </polozkyFaktury>
  </faktura-vydana>
  <faktura-vydana>
    <id>code:456</id> ...
    <polozkyFaktury>
      <faktura-vydana-polozka>...</faktura-vydana-polozka>
      <faktura-vydana-polozka>...</faktura-vydana-polozka>
      <faktura-vydana-polozka>...</faktura-vydana-polozka>
    </polozkyFaktury>
  </faktura-vydana>
</winstrom>
JSON
{
  "winstrom": {
    "@version": "1.0",
    "@atomic": "false",
    "faktura-vydana": [
      {
        "id": "code:123",
        "polozkyFaktury": {
          "faktura-vydana-polozka": [
            "...",
            "...",
            "..."
          ]
        }
      },
      {
        "id": "code:456",
        "polozkyFaktury": {
          "faktura-vydana-polozka": [
            "...",
            "...",
            "..."
          ]
        }
      }
    ]
  }
}
If you set the atomic attribute to false, each record is imported in its own transaction. In the example above, two database transactions take place: one for invoice 123 and one for invoice 456. Line items are part of the invoice and are therefore imported in the same transaction as the invoice itself.
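When the import document is generated by a script, the attribute can be set while building the envelope. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name build_import_xml and its parameters are hypothetical, and the line items are left out because their fields are elided in the example above.

```python
import xml.etree.ElementTree as ET


def build_import_xml(invoice_codes, atomic=True):
    """Build a winstrom envelope with one faktura-vydana element per code.

    Hypothetical helper for illustration; only the id of each invoice is
    written, because the remaining fields are elided in the documentation.
    """
    root = ET.Element("winstrom", {"version": "1.0"})
    if not atomic:
        # Only the non-default mode is written; the default (one transaction
        # for the whole import) needs no attribute at all.
        root.set("atomic", "false")
    for code in invoice_codes:
        invoice = ET.SubElement(root, "faktura-vydana")
        ET.SubElement(invoice, "id").text = f"code:{code}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_import_xml(["123", "456"], atomic=False))
```

Leaving the attribute out in the default case keeps the sketch from assuming which other values the attribute accepts.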
What is the benefit? When importing large XML files with many records, a single transaction takes a long time and a lot of information must be kept in memory. Both factors hurt performance. If each record is independent and it does not matter when saving one of them fails (for example, because you repeat the import regularly or can intervene manually when problems occur), setting atomic to false can significantly reduce the memory footprint of the import.
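The difference in failure behavior can be illustrated with a small analogy. The sketch below does not show how the import is implemented on the server; it uses an in-memory SQLite database from Python's standard library, and the table invoice and the functions make_db and import_records are invented for the illustration. One record is deliberately invalid so that both modes can be compared.

```python
import sqlite3

INSERT = "INSERT INTO invoice (code, total) VALUES (?, ?)"


def make_db():
    """Create an in-memory database with a hypothetical invoice table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoice (code TEXT PRIMARY KEY, total REAL NOT NULL)")
    return conn


def import_records(conn, records, atomic=True):
    """Save (code, total) pairs and return the codes that were not saved."""
    failed = []
    if atomic:
        try:
            with conn:  # one transaction for the whole batch
                for code, total in records:
                    conn.execute(INSERT, (code, total))
        except sqlite3.IntegrityError:
            # The whole batch was rolled back, so nothing was saved.
            failed = [code for code, _ in records]
    else:
        for code, total in records:
            try:
                with conn:  # one transaction per record
                    conn.execute(INSERT, (code, total))
            except sqlite3.IntegrityError:
                # Only this record is lost; earlier ones stay committed.
                failed.append(code)
    return failed


if __name__ == "__main__":
    # The second record violates NOT NULL and cannot be saved.
    records = [("123", 100.0), ("456", None), ("789", 50.0)]
    print(import_records(make_db(), records, atomic=True))
    print(import_records(make_db(), records, atomic=False))
```

In the per-record mode the caller has to deal with the returned failures, for example by repeating the import or fixing the affected records manually.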
For truly large imports, the memory needed to hold not-yet-saved data becomes so significant that the garbage collector starts consuming a substantial share of CPU time (this can be monitored with, for example, the jconsole tool, which ships with the JDK). Because atomic="false" mode keeps less unsaved data in memory, it can also drastically reduce processing time.
