I’m hoping to load about 18 TB of images (~22k studies) into Orthanc. I was wondering if anyone has advice on fast, or at least faster, ways to do this.
I did some measuring. With a very small test study (just one series in fact):
dcmsend or storescu - 30 seconds
curl (POST, with the Expect header set) - 25 seconds
curl, using GNU parallel, -j8 - 16 seconds
dicomweb from OsiriX MD - 22 seconds
Parallel curl is clearly a winner here (except that it makes checking for and recovering from failures rather harder).
Is there anything I can do to MASSIVELY increase the speed?
Note that all these methods use either HTTP or DICOM, whose performance suffers from a fresh network handshake for every transfer. You could try HTTP clients that use keep-alive connections (Java, or "requests" in Python).
Such a Python plugin would use Python's "threading" module to create a pool of threads that read DICOM files from a folder, then call orthanc.RestApiPost('/instances', dicom) for each of those DICOM files: https://book.orthanc-server.com/plugins/python.html#listening-to-changes
For outside DICOM directories, like referral CDs and outside studies downloaded from other sources, I have used Tomovision's free utility Dicomanager. I've been able to send to both DCM4CHEE and Orthanc.
I can't say I've timed it, but it seems to transfer films pretty darned fast, with their DICOM tags intact. Might be worth a try… it is built for mass transfer with no coding, as long as the source is a DICOM directory or just a bunch of DICOM files in a known location.
I would put an Orthanc instance using the PostgreSQL backend on the same physical machine as the source files for speed, and see what happens.
Just did a test with the Tomovision utility from a slow machine, across my internal network to a DCM4CHEE virtual machine: less than 3 seconds per film in a 9-film series. I'll try sending to my Orthanc instance if I have time (I'm just running the default SQLite).
OK, just confirmed… source on a 15-year-old Pentium dual-core computer, to Orthanc running on a Raspberry Pi 3B+ with a USB hard drive. Talk about a torture test… 3 seconds per film using Dicomanage.
These are high-quality direct digital X-rays from a veterinary DR machine by Sedecal.
So if you can arrange your source in DICOM directories (or it already is)… you'd probably have to divvy up the batches. Dicomanage will recursively search a root directory for you, but the whole thing would likely throw memory errors before it could load the entire batch. At least in this small test it managed 3 seconds per film on some pretty limited hardware.
I just tried this method - I had it ingest a directory full of files (serially, no threading).
import datetime
from glob import glob

# "orthanc" is the module exposed inside the Orthanc Python plugin
starttime = datetime.datetime.now()
for fname in glob('/incoming/*dcm'):
    with open(fname, 'rb') as f:
        orthanc.RestApiPost('/instances', f.read())
    print(fname)
print('finished in', datetime.datetime.now() - starttime)
Interestingly, it was almost precisely the same speed as dcmsend from another host, so I suspect the simple approach above (parallel curl) is just about as fast as it's reasonably going to get.
Fortunately, that works out to only about 3-4 weeks ingestion time, which is totally reasonable for a project of this size.
Relatedly, before I start on this project – is it even reasonable to be using Orthanc for this volume of images? When I search this group I see people talking about 1 TB, 5 TB instances – a few as big as 10, but not really any bigger.
I’m planning on starting with about 20 TB and expect that to grow by ~ 3-4 TB a year. Is that a problem for Orthanc?
The backend storage will be an iXsystems TrueNAS mounted over NFS on 40 GbE. The PostgreSQL database will be stored locally on SSD (RAID 1). How large should I expect that index to get?