DICOMweb /studies/{uid}/metadata and /series/{uid}/metadata

Hi there,

I wanted to bring back an old issue that I thought was solved, but which is now coming back through the window.
The issue concerns viewer/DICOMweb communication, and especially the /studies/{uid}/metadata and /series/{uid}/metadata routes.

The root of the issue is a fundamental design difference between DICOM archives such as Dcm4chee and Orthanc: Dcm4chee stores all metadata in its relational database, while Orthanc extracts only a subset into its database and reads the remaining tags from the filesystem.

In the DICOMweb standard, two routes are affected by this design: the /metadata routes at the study and series levels, which list the metadata of all child instances. Since Orthanc needs to read every instance from storage to generate these responses, it is significantly slower than Dcm4chee here.

However, several patches have been attempted to mitigate this issue:

These two improvements have really sped up several DICOMweb APIs, in particular the QIDO routes that list the child series of a study, which no longer require reading from storage.

However, the studies/metadata and series/metadata routes, which list all metadata in a study/series, cannot rely on the database (unless the full DICOM dictionary were added to the extra tags stored in the database, which would be a killer).

Three years ago I also asked for an improvement on the OHIF side. Initially OHIF was calling /studies/{uid}/metadata, which was a performance killer since a DICOM study may have 2,000 to 8,000 instances, putting a huge stress on the storage backend.
At that time OHIF responded and changed its data fetching method: it first requests the child series of a study using a QIDO request, and then calls the /series/{uid}/metadata route to get each series' metadata when the user asks to see the images (/series/metadata performance is acceptable since there are only a few hundred instances at this level). (Reminder: Change approach for loading study metadata from WADO to QIDO+WADO · Issue #836 · OHIF/Viewers · GitHub)

The combination of these two improvements in OHIF and Orthanc made the OHIF viewer much faster, and I thought we were out of this issue (until today).

We are back to the problem because of a change in OHIF's design: the implementation of hanging protocols.

OHIF built a very nice hanging protocol mechanism by which a user can choose which studies/series should be loaded and displayed.
It relies on constraint definitions; most of them are based on DICOM tag values (e.g. open the series with Modality == 'PT') and could be evaluated by simply loading one instance of each series.
But some of them are computed, especially the isReconstructable rule, which determines whether a series can be volume-reconstructed in the viewer and opened in a volume viewport (as opposed to a non-reconstructable series, such as a conventional X-ray, which can only be opened in a stack viewport).

Big problem: the reconstructable status cannot be read from a static DICOM tag; it requires knowing the position of each instance in order to figure out whether the slice spacing is constant, and hence whether a volume can be reconstructed (and the logic is even more complicated for some multiframe scintigraphy studies).
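
To make the problem concrete, here is a minimal sketch of that kind of check (a simplified heuristic, not OHIF's actual implementation; it assumes each instance is represented as a dict exposing ImagePositionPatient and ImageOrientationPatient as lists of floats):

def is_reconstructable(instances, tolerance=1e-4):
    if len(instances) < 2:
        return False
    # Slice normal = cross product of the row and column direction cosines.
    r = instances[0]['ImageOrientationPatient'][:3]
    c = instances[0]['ImageOrientationPatient'][3:]
    normal = (r[1] * c[2] - r[2] * c[1],
              r[2] * c[0] - r[0] * c[2],
              r[0] * c[1] - r[1] * c[0])
    # Project each slice position onto the normal, then sort along it.
    positions = sorted(sum(n * p for n, p in zip(normal, i['ImagePositionPatient']))
                       for i in instances)
    spacings = [b - a for a, b in zip(positions, positions[1:])]
    # Constant spacing (within tolerance) means a volume can be reconstructed.
    return all(abs(s - spacings[0]) < tolerance for s in spacings)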

So, to evaluate the requested hanging protocols, OHIF is back to a situation where it requests all metadata of each series: it no longer uses /studies/{uid}/metadata, but concurrently calls the /series/{uid}/metadata route for every series.

So we are back to the situation where OHIF needs the metadata of all instances of the opened study before displaying anything (due to hanging protocol evaluation), putting a huge stress on the backend by asking for the metadata of thousands of instances.

Now I think we won't be able to get rid of this problem at the viewer level: the /studies/{uid}/metadata route exists in the DICOMweb standard, and it will be hard to ask viewers not to use it (and now OHIF has a good reason to ask for all metadata, since it is needed for extensive hanging protocol evaluation).

So we are back to the 2020 issue: how to improve the performance of the /studies/metadata and /series/metadata DICOMweb routes in Orthanc.

The hypotheses I have could be:

  • Ask OHIF to request the specific SliceLocation tag using a QIDO request, and store SliceLocation as extra metadata in Orthanc.
    => Problem: this logic only works for single-frame modalities like CT and PT; for multiframe images the logic is much more complicated and relies on position sequence tags.
    => Second problem: it would only cover the isReconstructable status, which is today's issue, but not future needs that might require yet another tag, and every viewer editor would need to take this performance issue into account (it doesn't fix the underlying performance of the /metadata routes)…

  • Maybe implement some sort of caching in Orthanc for these routes? The nice thing about DICOM is that the data are immutable, which is favorable for caching.
    When a series is ingested in Orthanc, I guess a JSON cache of the series/{uid}/metadata response could be generated and stored statically somewhere on the filesystem.
    This could rely on the "StableSeries" event to generate the cache, which would be invalidated when a new instance arrives.
    Every DICOMweb call to /series/{uid}/metadata would then check whether the cache is available, and simply return it instead of regenerating it from DICOM storage (a single file read instead of opening every instance).
    The /studies/{uid}/metadata route would concatenate the series caches of the requested study (one file to read per series). A rough sketch of this idea is shown just below.
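
To illustrate, a minimal sketch of the cache generation/invalidation side as an Orthanc Python plugin (assuming the DICOMweb plugin is loaded; the cache directory is hypothetical, and serving the cached files plus the study-level concatenation are left out for brevity):

import json
import os
import orthanc

# Hypothetical cache directory; one JSON file per Orthanc series ID.
CACHE_DIR = '/var/lib/orthanc/metadata-cache'

def OnChange(changeType, level, resourceId):
    if changeType == orthanc.ChangeType.STABLE_SERIES:
        # The series is stable: render its DICOMweb metadata once and store it.
        series = json.loads(orthanc.RestApiGet('/series/%s' % resourceId))
        study = json.loads(orthanc.RestApiGet('/studies/%s' % series['ParentStudy']))
        metadata = orthanc.RestApiGetAfterPlugins(
            '/dicom-web/studies/%s/series/%s/metadata'
            % (study['MainDicomTags']['StudyInstanceUID'],
               series['MainDicomTags']['SeriesInstanceUID']))
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(os.path.join(CACHE_DIR, resourceId + '.json'), 'wb') as f:
            f.write(metadata)

    elif changeType == orthanc.ChangeType.NEW_INSTANCE:
        # A new instance invalidates the cache of its parent series.
        instance = json.loads(orthanc.RestApiGet('/instances/%s' % resourceId))
        cached = os.path.join(CACHE_DIR, instance['ParentSeries'] + '.json')
        if os.path.exists(cached):
            os.remove(cached)

orthanc.RegisterOnChangeCallback(OnChange)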

I'm posting this message on the Orthanc forum because I think this issue can no longer be solved at the viewer level, but more likely on the server side (though I will show the message to OHIF anyway).

Of course this is wide open for discussion; all of this is just my understanding of a complicated problem (and I might be wrong in several places).

Best regards,

Salim


Hi Salim,

Thanks for the detailed message !

We are still working on these issues, and we had already planned a meeting next week with Sébastien to discuss them. We plan to discuss caching, as well as another option: parallelizing the requests that collect the tags (plus reading only the beginning of the DICOM file, instead of the whole file as is done now).

So I’ll keep you posted about this.

Best regards,

Alain.


Nice! Thanks!
Parallelization is an option (but only for storage backends that can be parallelized).
Reading only the DICOM header would probably not help that much: you still pay the file access time.

In any case, I would be happy to hear which implementation you choose, and to do some testing.

Best regards,

Salim

Alain,

Just curious, because I am only a little familiar with the DICOMweb standard, and I am also somewhat disappointed with the viewer speed for large studies, particularly on moderate bandwidth.

I also want to start using OHIF for development, particularly with the apparent release of a plug-in for it.

I am not sure how much of that is just due to the study size and the bandwidth, or to the way in which the Stone Viewer retrieves data, although other viewers do seem faster (the old Osimis one and MedDream).

When you make a DICOMweb request using QIDO-RS, is there a standard set of tags that is returned if you do not specify additional specific ones using includefield, fuzzymatching, etc. in the query?

I’ve recently had a need to do that, and it seems that it returns some of the ones that I request but not others, e.g.

$queryArray["includefield"]= "00189362,0040030E,ProtocolName,AnatomicalOrientationType,StudyDescription,ImageType,Manufacturer,SOPClassUID,ImagePositionPatient,ImageOrientationPatient,BodyPartExamined,ContrastBolusTotalDose,SliceThickness,ConvolutionKernel,PatientPosition,CTReconstructionSequence,Laterality";
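
For comparison, the equivalent QIDO-RS query might look like this in Python (a sketch using the requests library; the host, study UID, and tag subset are illustrative):

import requests

# QIDO-RS instance-level query; includefield accepts DICOM keywords
# or hexadecimal tags, separated by commas.
params = {
    'StudyInstanceUID': '1.2.276.0.7230010.3.1.2.0',  # hypothetical UID
    'includefield': 'SliceThickness,ImagePositionPatient,ProtocolName',
}
r = requests.get('http://localhost:8042/dicom-web/instances',
                 params=params,
                 headers={'Accept': 'application/dicom+json'})
for instance in r.json():
    print(instance.get('00180050', {}).get('Value'))  # SliceThickness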

I’ve had to resort to fetching the instances themselves and parsing the entire header in some cases.

Is that functionally equivalent to using the "RequestedTags" option in a C-Find request in Orthanc, or does one use an internal method ("RequestedTags") while the other goes through DICOMweb?
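
For reference, the "RequestedTags" option I have in mind is the one accepted by Orthanc's /tools/find route; a sketch of it in Python (assuming a reasonably recent Orthanc, 1.11+; the host and query values are illustrative):

import requests

# Orthanc /tools/find query; RequestedTags asks Orthanc to return extra
# tags with each match, reading from storage when they are not in the DB.
query = {
    'Level': 'Instance',
    'Query': {'Modality': 'CT'},
    'RequestedTags': ['SliceThickness', 'ImagePositionPatient'],
    'Expand': True,
}
r = requests.post('http://localhost:8042/tools/find', json=query)
for match in r.json():
    print(match['RequestedTags'])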

/sds

Hi Salim,

I kindly invite you to give the new OHIF plugin for Orthanc a try: https://discourse.orthanc-server.org/t/new-plugin-ohif/

This plugin should provide suitable performance.

Kind Regards,
Sébastien-

Wow, yes it does!
This DICOM-JSON integration is very interesting.

Yes, I think it could be a good way to solve the problem!

I think that is also what the MedDream viewer does with the Python plugin that integrates their viewer with Orthanc:

# Native MedDream Script

import json
import orthanc

def GetStudyInfo(output, uri, **request):
    studyId = request['groups'][0]
    info = []

    # Fetch all instances and all series of the study in two REST calls.
    instances = json.loads(orthanc.RestApiGet('/studies/%s/instances?expand' % studyId))
    series = json.loads(orthanc.RestApiGet('/studies/%s/series' % studyId))

    for serie in series:
        tags = serie['MainDicomTags']
        tags['OrthancSeriesID'] = serie['ID']
        tags['ParentStudy'] = serie['ParentStudy']
        seriesInstances = []

        # Note: this nested loop scans every instance of the study once per
        # series, and issues one extra REST call per instance for its metadata.
        for instance in instances:
            if serie['ID'] == instance['ParentSeries']:
                instanceTags = instance['MainDicomTags']
                instanceTags['OrthancInstanceID'] = instance['ID']
                metadata = orthanc.RestApiGet('/instances/%s/metadata?expand' % instance['ID'])

                for (key, value) in json.loads(metadata).items():
                    instanceTags[key] = value

                seriesInstances.append(instanceTags)

        tags['Instances'] = seriesInstances
        info.append(tags)

    output.AnswerBuffer(json.dumps(info), 'application/json')

orthanc.RegisterRestCallback('/studies/(.*)/info', GetStudyInfo)

Salim,

Did you already try it ??

/sds

Yes,

It works well.

The problem I see is that this integration does not rely on the DICOMweb standard, and thus requires modifying the URL routing for those who serve Orthanc / OHIF through a custom application.

@alainmazy, when you have more input, could you tell us what the roadmap is for DICOMweb? Is caching for DICOMweb still on the roadmap, or are you relying on Sébastien's DICOM-JSON implementation?

There might be an "intermediate" option: if the DICOM-JSON API had a setting to return the DICOMweb location of each instance instead of the Orthanc URI, a third-party app would only have one additional DICOM-JSON route to manage, and all subsequent calls would go through the conventional DICOMweb API.
In this case the DICOM-JSON route would be used only as a proxy replacing the /metadata call, and OHIF would otherwise rely mainly on DICOMweb.


Hi @salimkanoun

We finally implemented multithreading (4 threads) to retrieve the info from the DICOM files, and we are now reading only the DICOM headers. On my dev PC, the "Full" mode performance is now very close to the "MainDicomTags" performance (but I'm on a full SSD, so this might not be representative).

Can you try on your side and report on the performance ?

Thx

Alain

Hi Alain, thanks for all of this!

Yes, I will try, but it will take some time: I have a demo server that I can update very soon, but the production server will more likely be updated in August.
I will keep you updated.

I saw a commit saying that DICOMweb now consumes metadata from the database; do you still read from the storage backend?

Anyway, yes, the multithreading and partial reading could be enough.

Best regards

Salim

Yes, the “Full” mode is still reading from the storage.

Dear Alain,

On my demo server I didn't notice a significant improvement (DICOM storage on a local SSD): /metadata on a series of 500 instances still takes around 5 to 6 seconds (the same as the previous version).

In August I will test with an object storage backend; it might benefit more from the multithreading than the local SSD does.

Best regards,

Salim

Hi Salim,

Actually, a lot of time is spent parsing the DICOM file and generating the DICOMweb JSON format, so even with an SSD you should benefit from multithreading, since 3 threads can work on parsing/serialization while the 4th one is reading from the disk.
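
(To illustrate the idea in Python rather than in Orthanc's actual C++ code: pydicom's stop_before_pixels plays the role of header-only reading, and a small thread pool parallelizes the per-file work; the file paths are hypothetical.)

from concurrent.futures import ThreadPoolExecutor
import pydicom

def parse_header(path):
    # stop_before_pixels skips the bulky pixel data: only the header is read.
    return pydicom.dcmread(path, stop_before_pixels=True)

paths = ['/data/instance-%04d.dcm' % i for i in range(600)]  # hypothetical
with ThreadPoolExecutor(max_workers=4) as pool:
    datasets = list(pool.map(parse_header, paths))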

On my side, here is the difference for a 600-instance series (Docker 23.6.1 on top, 23.7.0 below):

$ time curl http://localhost:8052/dicom-web/studies/1.2.276.0.7230010.3.1.2.1215942821.4756.1664826045.3529/series/1.2.276.0.7230010.3.1.3.1215942821.4756.1664831321.7863/metadata > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3543k  100 3543k    0     0  4353k      0 --:--:-- --:--:-- --:--:-- 4348k

real    0m0.822s
user    0m0.012s
sys     0m0.000s
$ time curl http://localhost:8052/dicom-web/studies/1.2.276.0.7230010.3.1.2.1215942821.4756.1664826045.3529/series/1.2.276.0.7230010.3.1.3.1215942821.4756.1664831321.7863/metadata > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3543k  100 3543k    0     0  17.2M      0 --:--:-- --:--:-- --:--:-- 17.2M

real    0m0.209s
user    0m0.012s
sys     0m0.000s

For some (still unknown) reason, the difference is smaller in the browser when opening the Stone Viewer, but for the 6,000-instance study, the total time to load all metadata for all series still goes down from 26s to 16s.

Alain

Ok nice I will do additional investigations,

Thanks,

Salim

Dear @alainmazy ,

On my production server I see worse performance than with the previous version, but I think I have a CPU bottleneck.

I'm running Orthanc on an Azure Container Instance, and I already had doubts about the CPU performance of ACIs.

I saw the CPU load reach 100% when calling the series/metadata API.
The API timing was 17s before and is now 45-60s.

I tried to disable the multithreading using the env variable ORTHANC__DICOM_WEB__METADATA_WORKER_THREADS_COUNT=1, but without improvement. Is this env variable available in Docker?

Best regards,

Salim

Alain,

The viewer takes quite a bit of time to load large studies, and sometimes forever, even on fast bandwidth. I am curious as to why, because compared to other viewers (Stone, Osimis) it's quite slow. I tried working with the previous version of OHIF and it's reasonably fast. I don't think the problem is on the OHIF side; I guess it's more with the integration with Orthanc. I might be wrong, just saying.

Yes, this is the correct spelling, I just tried it. If you set it to 1, it goes back to the previous code, except that only the beginning of the file is read instead of the whole file. I will give it a try with the Azure/S3 plugins too.

Alain

Hi @salimkanoun

Just made a test with a “ridiculously slow” setup:

  • Orthanc running on my dev PC
  • Using an Azure container to store DICOM files (with a 70 Mbps download speed)

With 23.6.1, it takes 128s to get the metadata of a 600-instance series.
With 23.7.0: 38s.

Best regards,

Alain

Wow, what's happening?
I'm going to investigate. Also, on my production server (compared to my dev server) I use a DB connection count of 5 to PostgreSQL; can that impact the performance of DICOMweb retrieval? (It would mean the metadata retrieval is parallelized while the DB queries are also parallelized.)

I'm going to check my parameters and see if I find something.

Best regards,

Salim