Slow deletes with the bulk delete tool

Hi guys! Hope you all are doing well.

I’ve been experimenting with the bulk-delete tool to delete exams and it works fine. The caveat however is that the deletes are quite slow for big studies (namely CT/MR with 500++ instances). I’ve seen times ranging from 10~40 minutes for some studies. I run the deletion as a daily routine to free space in the server, so it runs in parallel with the new arrival of exams.

I was wondering if there is anything I can do to speed up the process, since the big deletes puts heavy stress in the database, causing incoming exams to crash with MySQL errors (lock errors). I use Orthanc with MySQL, storing the exams in the filesystem. I’m also using Docker. Here are some parts of my configuration (the ones I think are the most relevant for the matter):

“StoreDicom”: true,
“DefaultEncoding”: “Utf8”,
“StorageDirectory”: “/root/orthanc-storage”,
“IndexDirectory”: “/root/orthanc-index”,
“TemporaryDirectory”: “/root/orthanc-cache”,
“MallocArenaMax” : 5,

“MySQL”: {
“EnableSsl” : false,
“EnableIndex”: true,
“EnableStorage”: false,
“Host”: “###MACHINE_LOCALHOST###”,
“Port”: 3306,
“UnixSocket”: “/var/lib/mysql/mysql.sock”,
“Database”: “###ORTHANC_DATABASE###”,
“Username”: “###ORTHANC_DATABASE_USER###”,
“Password”: “###ORTHANC_DATABASE_USER_PASSWORD###”,
“Lock”: false,
“MaximumConnectionRetries” : 10,
“ConnectionRetryInterval” : 5,
“IndexConnectionsCount”: 5
},

“DatabaseServerIdentifier” : “OrthancWriter”,
“ConcurrentJobs” : 10,

“SaveJobs”: true,
“JobsHistorySize”: 50,
“LimitJobs”: 20,
“MediaArchiveSize”: 20

As always, thanks for the help!

Hello,

There is a FAQ about this topic in the Orthanc Book:
https://book.orthanc-server.com/faq/scalability.html#slow-deletions

HTH,
Sébastien-

Hi Sebastien, thanks for the info! Indeed, deleting big files from the filesystem while having to update references in the database can be quite IO/CPU intensive. However, I suspect the slowness I am facing is linked MySQL locks, and I think Orthanc could avoid it.

By default, the isolation level of MySQL 5.7 is REPEATABLE READ. I changed it to READ COMMITTED (it is the default in Postgres) and I noticed a decrease in locks. Now, depending on how the bulk-delete creates its deletes queries, it could produce table locks that could block the arrival of new resources.

By running this MySQL query
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND != ‘Sleep’;

I see that both bulk-delete and the old DELETE endpoint create this DELETE (notice the other locked statements):

If the DELETE was splitted and executed in chunks inside a loop, I think the issue could be resolved. SQL Server for example has a set of conditions to lock an entire table because of a delete, with one of the conditions being 5k++ rows affected. I don’t have a number for MySQL, but something like this inside a loop should avoid table locks:

DELETE FROM Resources WHERE internalId IN (SELECT internalId FROM DeletedResources LIMIT 1000)

A domingo, 3 de abril de 2022 à(s) 22:58:49 UTC+1, Diego Victor de Jesus escreveu:

Dear Diego,

Thanks for your investigation, but as indicated in my previous link, the problem is not in the database (deleting rows from any SQL server is quite fast), but in the deletion of the files from the filesystem:
https://book.orthanc-server.com/faq/scalability.html#slow-deletions

It is out of the question to migrate from “SERIALIZABLE” (the semantics used in Orthanc database plugins) to “READ COMMITTED”, because Orthanc stores medical information, and as such, it must provide a consistent view over the database at any time:

https://en.wikipedia.org/wiki/Isolation_(database_systems)

In other words, Orthanc cannot trade consistency for performance. Note that in MySQL and PostgreSQL, the transactions are tagged as “READ ONLY” wherever possible, which allows concurrent accesses to the database according to the “single writer / multiple readers” model in order to improve performance:
https://book.orthanc-server.com/faq/scalability.html#concurrent-accesses-to-the-db-in-orthanc-1-9-2

The actual solution to speeding up deletions is the one indicated in the Orthanc Book: “It is possible to create an storage area plugin that delays the actual deletion from the filesystem. The plugin would maintain a queue of files to be removed. The actual deletion from the filesystem would be done asynchronously in a separate thread.”

I have already implemented a proof-of-concept and it works great. However, I would need hours/days to clean it and make it usable by other people, which is something I cannot afford doing freely. As Osimis doesn’t allow me to receive money for Orthanc, please get in touch with Alain Mazy:
https://book.orthanc-server.com/users/support.html#finding-professional-assistance
https://opencollective.com/orthanc

Regards,

Sébastien-

Hello Sebastien! I see, so Orthanc needs to use serializable isolation. I was digging into the Orthanc code to understand how the deletion works and how it can be improved by running the filesystem/SQL deletions in a dedicated thread. If I make any progress in this matter I’ll post it here. Thanks for the clarifications!

A segunda-feira, 4 de abril de 2022 à(s) 08:47:25 UTC+1, s.jo...@gmail.com escreveu:

To solve this issue, I decided to start the implementation of a multi-writer architecture. By doing that I could simply switch one writer instance by another, then let the switched writer instance delete without receiving new studies.

Since I use Docker Compose, the implementation was not hard at all, and it is amazing how flexible Orthanc can become into a Docker setup. Anyway, to anyone interested, I am sharing the Python scripts for the router instance and the writer instances.

A segunda-feira, 4 de abril de 2022 à(s) 19:42:32 UTC+1, Diego Victor de Jesus escreveu:

router-script.py (10.6 KB)

writer-script.py (9.1 KB)

By the way, I discovered a very important information about our dedicated server having the issue: it uses HDs, not SSDs! Our second server uses only SSDs and has blazing fast delete speeds with Orthanc. In fact, I generated two tables with deletion results from the two servers (HD vs SSD) to emphasize why we should always use SSDs. The difference is brutal!

The attached results have the image_count, size, deletion_begin_date and deletion_finish_date columns, and the data ir ordered by image_count desc.

A quinta-feira, 28 de abril de 2022 à(s) 13:12:41 UTC+1, Diego Victor de Jesus escreveu:

HD DELETION TIMES.txt (151 KB)

SSD DELETION TIMES.txt (324 KB)

Hello,