Stone of Orthanc data fetching might be suboptimal

salimkanoun · May 8, 2022, 10:00pm

Hi there,

I wanted to share some thoughts about possibly sub-optimal data loading in Stone of Orthanc.

Here is a video of the issue: https://www.youtube.com/watch?v=urOuR5WtmaQ

My observations are the following:

When Stone is initialized at the patient level, the metadata of each series is fetched sequentially, one by one.
The series /metadata API is slow to process because it needs to access each instance in the storage endpoint of Orthanc. In my scenario, the /metadata API at the series level requires 6 to 10 seconds.
One weird thing: if you ask to display a series before Stone has received the metadata, there is an infinite waiting spinner, and the images are not displayed when the response is received. Once the metadata is received (shown by the slice count appearing in the thumbnail), you have to drag and drop the series again to be able to see the images.
When querying at the patient level with a lot of studies, the preloading mechanism calls a lot of time-consuming APIs, while the physician will certainly not analyze all of them. Also, in terms of back-end workload, it seems sub-optimal: the /metadata API requires an access to each instance, and each /instance call to load the slices triggers another access to the same files in the storage.

So I may open several questions / issues. I don’t expect an immediate solution, but maybe some ideas to consider for a roadmap of Stone enhancements:

→ Is it possible, when a series is requested, to force the /metadata API call and then display the images once the response is received (to avoid this infinite spinner)?

→ Is there a way to reduce the calls to the /metadata API?
The preloading mechanism is interesting, but maybe it should be more customizable so as not to put too much pressure on the backend when multiple studies are available (maybe starting with an option to choose whether to preload or not).

→ What is the reason the /metadata of all instances is called before the /instances APIs? OHIF made the same choice, so I bet there is a reason, but I can’t figure out why. Why is this method preferred rather than progressively loading the metadata with the /instances calls?
If this dual scenario (/metadata, then /instances) is needed, is there a way to avoid the dual access to the storage backend? Maybe by caching the /metadata answer somewhere?
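
To make the first two questions more concrete, here is a minimal Python sketch of what I have in mind. It is not how Stone is implemented: the class `MetadataQueue` and its methods are hypothetical names, and the actual /metadata request is left out. It only shows the ordering logic: a series requested by the user jumps ahead of the preloaded ones, and preloading can be switched off.

```python
import heapq
import itertools


class MetadataQueue:
    """Orders pending series metadata fetches (hypothetical sketch)."""

    USER, PRELOAD = 0, 1  # lower value = served first

    def __init__(self, preload_enabled=True):
        self._heap = []
        self._counter = itertools.count()  # keeps FIFO order within a priority
        self._preload_enabled = preload_enabled
        self._served = set()

    def preload(self, series_id):
        # Background preloading is ignored when the option is disabled.
        if self._preload_enabled:
            self._push(self.PRELOAD, series_id)

    def request(self, series_id):
        # A series the user asked to display goes ahead of preloads.
        self._push(self.USER, series_id)

    def _push(self, priority, series_id):
        heapq.heappush(self._heap, (priority, next(self._counter), series_id))

    def next_series(self):
        # Returns the next series to fetch, skipping ones already served.
        while self._heap:
            _, _, series_id = heapq.heappop(self._heap)
            if series_id not in self._served:
                self._served.add(series_id)
                return series_id
        return None
```

With such a queue, the viewer would call `request()` on drag and drop and display the images when that fetch completes, instead of waiting for its turn in the preload order.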

Best regards,

Salim

alainmazy · May 10, 2022, 8:12am

Hi Salim,

Note that, with the latest Orthanc 1.11.0 and DICOMWEB 1.8 releases, you should be able to optimize the calls to the /metadata routes: https://hg.orthanc-server.com/orthanc-dicomweb/file/tip/NEWS but that might not solve the problem completely.

I’ll let Sébastien answer for the other parts…

Best regards,

Alain.

salimkanoun · May 10, 2022, 1:01pm

Dear Alain,

Thanks for your answer and for your improvement of DICOMweb and of the database storage of metadata, which is a really nice improvement of Orthanc.

Just a question: I think the improvement you made could accelerate the query API, for instance to get the studies of a patient or the series of a study. As not all tags are requested in the answer, you can improve performance by storing this information in the database and returning the values from the database rather than from the storage endpoint.

But I don’t see how it could benefit the series/metadata API, as this API returns the full metadata of all instances and so can’t rely on the database (unless the whole JSON payload of every instance were stored in the database), so you still need to access the storage engine and pay the I/O price for all these instances.

Am I correct, or is there another benefit I don’t see in this API?

For series/metadata, one improvement could be to multithread the access to instances in the case of an object storage backend, but it might be overkill. Probably the best way would be to enable caching of some DICOMweb APIs, inside or outside Orthanc, to speed up the response of some predefined APIs for a certain time.
It might give you an idea for a new plugin?

The 6 to 7 seconds needed to get this API answer are in fact not that dramatic. It is acceptable if it comes at the time you want to query the images of a series: if the metadata call happens when you are about to download 500 MB of images to load them in the viewer, the user impact is not so high, as loading the images will take time anyway. That’s why I think the challenge is more in the viewer loading workflow than in the core API generation (I’m talking about the series/metadata API; the improvement for querying by storing tags is clearly a huge improvement).
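
To illustrate the caching idea, here is a minimal Python sketch of a time-limited cache, assuming the cached value is the body of a series/metadata response. Nothing here is an Orthanc or plugin API: `TtlCache` is a hypothetical name, and `fetch` stands for whatever function would really perform the request.

```python
import time


class TtlCache:
    """Keeps each fetched answer for ttl_seconds (hypothetical sketch)."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # callable: key -> value (e.g. does the HTTP GET)
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the expiry can be tested
        self._entries = {}       # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: no access to the storage
        value = self._fetch(key)  # missing or expired: pay the I/O once
        self._entries[key] = (now, value)
        return value
```

The key could be the series URL, so that a second viewer opening the same series within the time window would not trigger a new read of all instances.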

Best regards,

Salim

alainmazy · May 10, 2022, 2:01pm

Hi Salim,

My bad, I answered too fast! What I had in mind to speed up the /metadata route is actually to
replace the calls to /studies/…/series/…/metadata with calls to /instances?StudyInstanceUID=…&SeriesInstanceUID=…&includefield=…&includefield=…
in order to benefit from the ExtraMainDicomTags stored in the DB and avoid reading from disk.

The /metadata route returns all tags while, with QIDO-RS, you can select only the tags that are relevant for the viewer, and you can therefore decide to store those tags in the DB.
I have added a TODO in Stone for that: https://hg.orthanc-server.com/orthanc-stone/rev/0fc84835289f
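
As a rough illustration of such a QIDO-RS call, here is a small Python sketch that builds the URL with one `includefield` parameter per tag. The function name is hypothetical, and the base URL, UIDs and tag list in the usage example are placeholders, not the tags Stone actually needs.

```python
from urllib.parse import urlencode


def build_qido_instances_url(base_url, study_uid, series_uid, include_fields):
    """Builds a QIDO-RS instance query restricted to one series."""
    params = [
        ("StudyInstanceUID", study_uid),
        ("SeriesInstanceUID", series_uid),
    ]
    # includefield is repeated once per requested tag.
    params += [("includefield", field) for field in include_fields]
    return f"{base_url.rstrip('/')}/instances?{urlencode(params)}"


# Placeholder values, for illustration only.
url = build_qido_instances_url(
    "http://localhost:8042/dicom-web/",
    "1.2.3",
    "4.5.6",
    ["00200013", "00200032"],  # InstanceNumber, ImagePositionPatient
)
```

The requested tags would only be served from the DB if they are also listed in the ExtraMainDicomTags configuration.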

And, the workflow issue is already in the TODO: https://hg.orthanc-server.com/orthanc-stone/rev/efed41b76dc7

Best regards,

Alain.

alainmazy · March 30, 2023, 2:06pm

FYI, the workflow issue has just been fixed here: https://hg.orthanc-server.com/orthanc-stone/rev/b2738d7a388d

salimkanoun · March 30, 2023, 2:28pm

Great ! Thanks !