General possibility to avoid storing the JSON files without generally disrupting the Orthanc workflow

Knut · May 11, 2020, 9:44am

Hi,

I just wonder whether the following scenario is possible in general:
1/ refine incoming C-FIND requests via Lua, allowing only tags that Orthanc stores in its database (see below)
2/ via Storage Area Plugin just throw away the JSON files, not saving them at all

Background:
If you use Orthanc for a large number of images, you practically cannot use any tag for C-FIND querying other than those listed below.

At some point, looking through every single JSON file in order to perform the query is simply not an option anymore.

So if you cannot use the JSON files for querying, you might want to get rid of them altogether, since they consume additional storage space (close to 10% of the original size of the DICOM files in my experience) without - well, that’s my theory - added value.

I don’t know whether Orthanc uses the JSON files for any purpose other than querying and showing them in the Explorer (which would require storing them anyway)?

From what I see Orthanc takes the following DICOM tags directly out of the database without accessing the JSON files at all, which would be - in my view - a quite comprehensive and sufficient set to handle routine C-FIND / QR requests:

0008,0012 InstanceCreationDate
0008,0013 InstanceCreationTime
0008,0018 SOPInstanceUID
0008,0020 StudyDate
0008,0030 StudyTime
0008,0050 AccessionNumber
0008,0060 Modality
0008,0080 InstitutionName
0008,0090 ReferringPhysicianName
0008,1030 StudyDescription
0010,0010 PatientName
0010,0020 PatientID
0010,0030 PatientBirthDate
0010,0040 PatientSex
0018,0015 BodyPartExamined
0020,0010 StudyID
0020,0011 SeriesNumber
0020,0012 AcquisitionNumber
0020,0013 InstanceNumber
0020,0032 ImagePositionPatient
0020,0037 ImageOrientationPatient
0008,0021 SeriesDate
0008,0031 SeriesTime
0008,0070 Manufacturer
0008,103E SeriesDescription
0020,000E SeriesInstanceUID
0020,0037 ImageOrientationPatient
0040,0254 PerformedProcedureStepDescription
0020,000D StudyInstanceUID
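
The filtering idea in step 1/ boils down to a whitelist lookup. Below is a minimal sketch in C, assuming the tag list above is complete (it is copied from this post, not verified against the Orthanc source). The names `DicomTag`, `IsDatabaseTag` and `CountUnsupportedTags` are hypothetical; in a real setup the check would presumably live in a Lua callback rather than in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A DICOM tag as (group, element). Hypothetical helper type. */
typedef struct { uint16_t group; uint16_t element; } DicomTag;

/* Tags listed in the post above (duplicate 0020,0037 kept once). */
static const DicomTag kDbTags[] = {
  {0x0008, 0x0012}, {0x0008, 0x0013}, {0x0008, 0x0018}, {0x0008, 0x0020},
  {0x0008, 0x0030}, {0x0008, 0x0050}, {0x0008, 0x0060}, {0x0008, 0x0080},
  {0x0008, 0x0090}, {0x0008, 0x1030}, {0x0010, 0x0010}, {0x0010, 0x0020},
  {0x0010, 0x0030}, {0x0010, 0x0040}, {0x0018, 0x0015}, {0x0020, 0x0010},
  {0x0020, 0x0011}, {0x0020, 0x0012}, {0x0020, 0x0013}, {0x0020, 0x0032},
  {0x0020, 0x0037}, {0x0008, 0x0021}, {0x0008, 0x0031}, {0x0008, 0x0070},
  {0x0008, 0x103E}, {0x0020, 0x000E}, {0x0040, 0x0254}, {0x0020, 0x000D}
};

/* Returns 1 if the tag is in the list above, 0 otherwise. */
int IsDatabaseTag(uint16_t group, uint16_t element)
{
  size_t n = sizeof(kDbTags) / sizeof(kDbTags[0]);
  for (size_t i = 0; i < n; i++) {
    if (kDbTags[i].group == group && kDbTags[i].element == element) {
      return 1;
    }
  }
  return 0;
}

/* Counts the tags of a query that would require reading the JSON files. */
size_t CountUnsupportedTags(const DicomTag* query, size_t count)
{
  assert(query != NULL || count == 0);
  size_t unsupported = 0;
  for (size_t i = 0; i < count; i++) {
    if (!IsDatabaseTag(query[i].group, query[i].element)) {
      unsupported++;
    }
  }
  return unsupported;
}
```

A filter could then drop (or reject) every query tag for which `IsDatabaseTag` returns 0 before the C-FIND is executed.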

Alain_Mazy3 · May 15, 2020, 2:15pm

Hi Knut,

Your analysis is very good and makes a lot of sense!
The JSON files are indeed used for C-FIND, but you can already disable this thanks to the following option:

  // Performance setting to specify how Orthanc accesses the storage
  // area during C-FIND. Three modes are available: (1) "Always"
  // allows Orthanc to read the storage area as soon as it needs an
  // information that is not present in its database (slowest mode),
  // (2) "Never" prevents Orthanc from accessing the storage area, and
  // makes it uses exclusively its database (fastest mode), and (3)
  // "Answers" allows Orthanc to read the storage area to generate its
  // answers, but not to filter the DICOM resources (balance between
  // the two modes). By default, the mode is "Always", which
  // corresponds to the behavior of Orthanc <= 1.5.0.
  "StorageAccessOnFind" : "Always",

JSON files are also used in the instances/{…}/tags route, which is indeed used by the Orthanc Explorer. Therefore, I cannot really predict the behaviour of this route once the JSON file is not there.

So I would say that, if you use Orthanc as a DICOM server only, it looks safe to throw the JSON files away. If you use the REST API, I would give it a try but expect the worst.

Note that, we’ve already thought of removing the JSON files as part of a larger DB refactoring but that was only an idea up to now. That would be a bit cleaner than just avoiding storing the JSON file on disk…

Best regards,

Alain

Knut · May 16, 2020, 11:27pm

Hi Alain,

thank you for your feedback. I somehow missed that configuration option - it’s literally a piece of gold

I have now done some tests, simply adding the following two lines to each of the three callback functions of the StorageArea sample plugin:
  if (type == OrthancPluginContentType_DicomAsJson)
    return OrthancPluginErrorCode_Success;
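
For readers who want to see the shape of this change, here is a minimal, self-contained sketch. It is not the sample plugin: `ErrorCode` and `ContentType` are stand-ins for `OrthancPluginErrorCode` and `OrthancPluginContentType` so that the code compiles without the Orthanc SDK, and the one-slot in-memory store is a hypothetical replacement for the real backend. Setting the output of the read callback to an empty buffer is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the SDK types (assumption: a real plugin uses the SDK header). */
typedef enum { kSuccess = 0, kUnknownResource = 1, kNotEnoughMemory = 2 } ErrorCode;
typedef enum { kDicom = 1, kDicomAsJson = 2 } ContentType;

/* Hypothetical one-slot in-memory backend. */
char    g_uuid[64];
void*   g_data = NULL;
int64_t g_size = 0;

ErrorCode StorageCreate(const char* uuid, const void* content,
                        int64_t size, ContentType type)
{
  assert(uuid != NULL);
  if (type == kDicomAsJson) return kSuccess;   /* pretend the JSON was stored */

  void* copy = malloc(size > 0 ? (size_t) size : 1);
  if (copy == NULL) return kNotEnoughMemory;
  if (size > 0) memcpy(copy, content, (size_t) size);
  free(g_data);
  g_data = copy;
  g_size = size;
  strncpy(g_uuid, uuid, sizeof(g_uuid) - 1);
  g_uuid[sizeof(g_uuid) - 1] = '\0';
  return kSuccess;
}

ErrorCode StorageRead(void** content, int64_t* size,
                      const char* uuid, ContentType type)
{
  if (type == kDicomAsJson) {                  /* nothing was stored: empty answer */
    *content = NULL;
    *size = 0;
    return kSuccess;
  }
  if (g_data == NULL || strcmp(uuid, g_uuid) != 0) return kUnknownResource;

  void* copy = malloc(g_size > 0 ? (size_t) g_size : 1);
  if (copy == NULL) return kNotEnoughMemory;
  if (g_size > 0) memcpy(copy, g_data, (size_t) g_size);
  *content = copy;                             /* caller frees the buffer */
  *size = g_size;
  return kSuccess;
}

ErrorCode StorageRemove(const char* uuid, ContentType type)
{
  if (type == kDicomAsJson) return kSuccess;   /* nothing to remove */
  if (g_data == NULL || strcmp(uuid, g_uuid) != 0) return kUnknownResource;
  free(g_data);
  g_data = NULL;
  g_size = 0;
  return kSuccess;
}
```

The early return is the only Orthanc-specific idea here; everything after it is ordinary storage code that never sees a JSON attachment.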

The Explorer works just as usual (just the Tags view stays empty), without error messages in trace mode.

Only if you use the /simplified-tags path do you receive a status 500 (Orthanc seemingly checks the JSON against a stored MD5 hash).
(The same happens if you include tags other than those listed in my first post in a /tools/find search.)

I tried very hard but couldn’t run Orthanc into any crash or error - so far better than your expectations

If there was a way to access the db-stored tags and switch off the MD5 check, one could even infuse the db-stored tags in the StorageRead callback.

Cheers
Knut

Alain_Mazy3 · May 19, 2020, 4:11pm

Hi Knut,

Good news that it stays alive !

> If there was a way to access the db-stored tags and switch off the MD5 check, one could even infuse the db-stored tags in the StorageRead callback.

It might be possible to implement an option “StoreDicomAsJson” that would prevent even trying to read/write those files (this needs to be done in the Core itself and cannot be done in a plugin).
This could speed up the whole Orthanc and reduce the size of the storage usage.

We could even have 3-4 options:

“Always”: always save/read DicomAsJson files like current Orthanc
“Never”: never save/never try to read → if you try to access a route like /instances/…/tags, you get a 404
“RebuildOnDemand”: never save the file but, if you try to access data that is supposed to be stored in the DicomAsJson, rebuild the content from the DICOM file.
That would be much slower than the current /instances/…/tags but it’s a compromise …
“TagsFromDb”: never save/never try to read → if you try to access a route like /instances/…/tags, you would only get the tags that are stored in DB.

Best regards,

Alain.

Knut · August 31, 2020, 9:56pm

Hi Alain,

I have played around with the great new set of methods of the 1.7.2 version.
That led me back to our discussion about avoiding storing the JSON files.
Just one thought/question:

1/ is it possible to get the corresponding Dicom file uuid at plugin level when StorageCreate is called with OrthancPluginContentType type = OrthancPluginContentType_DicomAsJson
(kind of GetDicomFileUUIDFromJsonFileUUID() method?)
2/ is there any way to skip the Orthanc Core MD5 check?

I guess you get my idea → if both were possible, you could easily create the JSON data using the new OrthancPluginGetInstanceAdvancedJson method, thus maintaining the full original workflow without the need to store the JSON files.

Best Regards

Knut

Knut · August 31, 2020, 10:03pm

… just one correction to my post:
should be 1/ is it possible to get the corresponding Dicom file uuid at plugin level when StorageRead is called with OrthancPluginContentType type = OrthancPluginContentType_DicomAsJson

jodogne · September 2, 2020, 6:14am

Hello,

1/ It is not really safe for the “OrthancPluginStorageRead()” and “OrthancPluginStorageCreate()” callbacks to try and parse the content of the attachments they read/create. Indeed, your callbacks must take into account the fact that these attachments might have been compressed by the Orthanc core (if the “StorageCompression” configuration option is set to “true”).

That being said, you could consider parsing the “DicomAsJson” using any JSON parsing library (Orthanc uses the jsoncpp library to this end), then use the class “DicomInstanceHasher” that is used by Orthanc to hash DICOM identifiers (PatientID, StudyInstanceUID, SeriesInstanceUID and SOPInstanceUID) to Orthanc identifiers:
https://book.orthanc-server.com/faq/orthanc-ids.html

2/ The “StoreMD5ForAttachments” configuration option should be what you are looking for.
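
Regarding the hashing mentioned in 1/, here is a rough sketch of the idea in C. It assumes, as described on the Orthanc Book page linked above, that an instance identifier is the SHA-1 hash of the four DICOM identifiers joined with “|”, printed as five dash-separated groups of eight hexadecimal digits. This is not the “DicomInstanceHasher” class itself: the function names are hypothetical, and the compact SHA-1 is only there to keep the example self-contained (a real plugin would rather use a vetted library).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static uint32_t Rol(uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

/* Writes the SHA-1 of `s` as 5 dash-separated groups of 8 hex digits
   (44 characters plus the terminating NUL). Returns 0 on success. */
int OrthancIdFromString(const char* s, char out[45])
{
  assert(s != NULL && out != NULL);
  size_t len = strlen(s);
  size_t total = ((len + 8) / 64 + 1) * 64;  /* message + 0x80 + 64-bit length */
  uint8_t* buf = calloc(total, 1);
  if (buf == NULL) return -1;
  memcpy(buf, s, len);
  buf[len] = 0x80;
  uint64_t bits = (uint64_t) len * 8;
  for (int i = 0; i < 8; i++) buf[total - 1 - i] = (uint8_t) (bits >> (8 * i));

  uint32_t h[5] = { 0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u, 0xC3D2E1F0u };
  for (size_t off = 0; off < total; off += 64) {
    uint32_t w[80];
    for (int i = 0; i < 16; i++) {
      const uint8_t* p = buf + off + 4 * i;
      w[i] = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
             ((uint32_t) p[2] << 8) | (uint32_t) p[3];
    }
    for (int i = 16; i < 80; i++) {
      w[i] = Rol(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    }
    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    for (int i = 0; i < 80; i++) {
      uint32_t f, k;
      if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
      else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
      else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
      else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
      uint32_t t = Rol(a, 5) + f + e + k + w[i];
      e = d; d = c; c = Rol(b, 30); b = a; a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
  }
  free(buf);
  snprintf(out, 45, "%08x-%08x-%08x-%08x-%08x", (unsigned) h[0], (unsigned) h[1],
           (unsigned) h[2], (unsigned) h[3], (unsigned) h[4]);
  return 0;
}

/* Joins the four identifiers with '|' and hashes the result. */
int InstanceId(const char* patientId, const char* studyUid,
               const char* seriesUid, const char* sopUid, char out[45])
{
  size_t n = strlen(patientId) + strlen(studyUid) +
             strlen(seriesUid) + strlen(sopUid) + 4;  /* 3 separators + NUL */
  char* joined = malloc(n);
  if (joined == NULL) return -1;
  snprintf(joined, n, "%s|%s|%s|%s", patientId, studyUid, seriesUid, sopUid);
  int rc = OrthancIdFromString(joined, out);
  free(joined);
  return rc;
}
```

The four input strings would come from the parsed “DicomAsJson” content, subject to the compression caveat in 1/.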

HTH,
Sébastien-

Knut · September 3, 2020, 8:09pm

Hi Sébastien,
thank you very much and sorry for 2/
actually 100% an RTFM issue, indeed

Concerning the Orthanc identifiers: as far as I understand the uuid (filename) as handed over in the Storage Area

static OrthancPluginErrorCode StorageRead(void** content, int64_t* size, const char* uuid, OrthancPluginContentType type)

is different to the identifiers described in the Orthanc book (and being used at REST API level). It would be quite handy to have any possibility to know which sop instance is being handled in the StorageRead / StorageDelete methods - I could not find out anything straight-forward yet.

KR
Knut

jodogne · September 5, 2020, 10:44am

Hello,

The UUID of attachments is documented in the following section of the Orthanc Book:
https://book.orthanc-server.com/faq/orthanc-storage.html#storage-area

By definition, these UUID carry no information about which DICOM resources they are attached to.

You must think of the “storage area” of Orthanc as an object storage area that is only designed to store binary documents, and that abstracts very different technologies such as a simple filesystem, AWS S3, Ceph, PostgreSQL database, NFS network drive or Azure blob storage:
https://en.wikipedia.org/wiki/Object_storage

The “storage area” of Orthanc has no clue about what it stores. Attachments can be DICOM, JSON, or any custom type of file. This abstraction enables Orthanc to store its database to a wide variety of systems (such as the ones that are encountered in cloud deployments). Consequently, there will be no modification in future releases of Orthanc related to the “storage area”.

You are actually interested in more high-level constructions, such as the “OrthancPluginRegisterOnChangeCallback()” or “OrthancPluginRegisterOnStoredInstanceCallback()” callbacks that can be found in the Orthanc SDK:
https://sdk.orthanc-server.com/group__Callbacks.html#ga78140887a94f1afb067a15db5ee4099c

https://sdk.orthanc-server.com/group__Callbacks.html#ga1af7c8c9877aaf670208bfc53164b9fb

HTH,
Sébastien-