Anonymize/Modify with a lookup table?

Would it be possible to produce anonymized/modified DICOM with a "new" PatientID while maintaining a lookup table with the original PatientID values? The new PatientID would be unique to the original patient.

   We have a research/teaching use case where we'd like to have the users deal with anonymized images, but have our data administrator be able to track those DICOM back to the originals.

   Reading through the online documentation, I think one approach would be to use either the REST API or a backend Lua script to run "modify" on series while maintaining a separate database (e.g. SQLite) that maps old PatientID values to new ones. I'd then export the modified series.
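A minimal sketch of the lookup table described above, assuming a single SQLite table and randomly generated replacement IDs. The table name `patient_map`, the `ANON-` prefix and the helper names are hypothetical, not something Orthanc provides:

```python
import sqlite3
import uuid


def open_map(path=":memory:"):
    """Open the lookup database and create the mapping table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patient_map ("
        " original_id TEXT PRIMARY KEY,"
        " anonymized_id TEXT NOT NULL UNIQUE)"
    )
    return conn


def anonymized_id_for(conn, original_id):
    """Return the stable replacement PatientID, creating one on first use."""
    row = conn.execute(
        "SELECT anonymized_id FROM patient_map WHERE original_id = ?",
        (original_id,),
    ).fetchone()
    if row:
        return row[0]
    # Hypothetical naming scheme: a random, non-reversible identifier.
    new_id = "ANON-" + uuid.uuid4().hex[:12]
    conn.execute(
        "INSERT INTO patient_map (original_id, anonymized_id) VALUES (?, ?)",
        (original_id, new_id),
    )
    conn.commit()
    return new_id


def original_id_for(conn, anonymized_id):
    """Reverse lookup for the data administrator; None if unknown."""
    row = conn.execute(
        "SELECT original_id FROM patient_map WHERE anonymized_id = ?",
        (anonymized_id,),
    ).fetchone()
    return row[0] if row else None
```

The value returned by `anonymized_id_for` would then be passed as the replacement PatientID when modifying or anonymizing. The same original patient always maps to the same new ID, and only someone with access to the database file can map it back.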

   If I'm reading the documentation correctly, a big distinction between modify and anonymize is that the modified series will increase disk usage within Orthanc as the modified DICOM files are stored. Anonymize, on the other hand, simply strips (or changes) the relevant DICOM parameters during export. Is that correct?

   Does the anonymize API maintain the sort of map I'm describing between the old and new DICOM (e.g. by way of the new UID it generates)? Or is the association lost once the anonymized images leave Orthanc?

   I've been evaluating several different open source systems for managing our research DICOM (e.g. MIRC/CTP, Kitware MIDAS, dcm4che) and really like the simplicity and elegance of Orthanc.

Thanks,
John.

I went ahead and set up Orthanc in a virtual Fedora 22 workstation to test things out.

I anonymized a dataset via the web interface and see that the anonymized study has a “Before anonymization” link (in the web interface) back to the original study.

So, my next question would be whether the pointer embedded in the “Before anonymization” link is available to the REST API?

I don’t mean the href itself, but the map between the newly created PatientID in the anonymized study and the old study and/or old PatientID.

Thanks,
John.

Dear John,

Actually, Orthanc Explorer (the Web interface of Orthanc) only resorts to the REST API for all of its features. This implies that anything Orthanc Explorer can do, can be done with a call to the REST API: This rule includes accessing the “Before anonymization” and “Before modification” information.

Concretely, the corresponding JavaScript code can be found in the function “SetupAnonymizedOrModifiedFrom()” of Orthanc Explorer:
https://bitbucket.org/sjodogne/orthanc/src/Orthanc-0.9.3/OrthancExplorer/explorer.js#explorer.js-326

So, whenever you retrieve an anonymized or modified patient/study/series with the REST API, Orthanc returns an “AnonymizedFrom” or “ModifiedFrom” field in the JSON answer.
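As a rough illustration, the field can be read from the JSON answer like this. The sketch assumes an Orthanc server reachable without authentication; `fetch_study` and `origin_of` are hypothetical helper names:

```python
import json
from urllib.request import urlopen


def fetch_study(base_url, study_id):
    """GET /studies/{id} and decode the JSON answer."""
    with urlopen(f"{base_url}/studies/{study_id}") as resp:
        return json.load(resp)


def origin_of(resource):
    """Return (field, original Orthanc ID) if the resource was derived
    from another one by anonymization or modification, else None."""
    for field in ("AnonymizedFrom", "ModifiedFrom"):
        if field in resource:
            return field, resource[field]
    return None
```

For example, `origin_of(fetch_study("http://localhost:8042", study_id))` would give the identifier of the original study, which can then be queried in turn to recover the original PatientID.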

I hope this answers all your questions.

Regards,
Sébastien-

Hi Sebastien,

I’ve returned to this issue of accessing the AnonymizedFrom/ModifiedFrom data.

I have server side scripts initiated with OnStableStudy that anonymize the incoming DICOM images. I check the meta data associated with the study to make sure there are no “AnonymizedFrom” or “ModifiedFrom” tags in the meta data. That way, I avoid re-anonymizing already anonymized data.

That setup has worked for several months until recently when I had to switch from calling “/studies/{ID}/anonymize” to calling on an instance level, “instances/{ID}/anonymize”.

When I anonymize at the instance level this way, the “AnonymizedFrom” meta data is not being set. As a result, some time later, OnStableStudy is called with the anonymized study. Because no AnonymizedFrom is found in the meta data, the anonymized study is anonymized again. And again. And again. I finally shut down Orthanc when I realized it was stuck in an infinite loop.

My question is, where is the AnonymizedFrom meta stored? Or set for that matter? Should a call to anonymize at the instance level be setting the AnonymizedFrom field or do I need to do that myself? And if I need to set this meta data, how do I do it?

I should add that because I am running anonymize at the instance level, I am creating new Study, Series and Instance UID manually before calling anonymize and then feeding them to anonymize via the “replace” option. I do not let anonymize generate UID itself. Maybe anonymize ONLY sets AnonymizedFrom when it handles the UID generation? And I’ve skipped that step by generating UID myself?

Thanks,
John.

I found the routine AnonymizeOrModifyResource in OrthancRestAnonymizeModify and see the code where the metadata is set up to point from anonymized/modified files to original files.

Checking the metadata table in the PostgreSQL database, I can see that my newly anonymized/modified DICOM files are missing these metadata connections.

So, when I call generate-uid myself for each new study, series and instance and then use the values returned with a Replace keyword with the instances/{ID}/anonymize, the links between new and old are not recorded.

That would imply that when the call gets to AnonymizeOrModifyResource, the new and old hashes are somehow the same, and don’t trigger the addition of the links to the metadata table.

Perhaps that data is never updated when anonymizing at the instance level? After all, unlike anonymizing at the Patient/Study/Series levels, which generate new Patient/Study/Series within Orthanc itself and return metadata, a call to /instances/{ID}/anonymize will RETURN the anonymized DICOM itself. This must then be uploaded back into Orthanc, if you want it there.

I’m speculating that if Orthanc doesn’t bother to record anonymization links when run at the instance level, the newly uploaded DICOM looks like a new person/study/series.

Is there a way to call anonymize at the instance level and have Orthanc keep a record in the metadata table of the links to the original DICOM? Similarly, is there a way to have it simply retain the new DICOM, rather than sending it in the response?

All of this is related to an effort I described in a different thread where I’m attempting to keep the intra-series connections between MR series after anonymization that are normally broken by the default anonymization process.

Now that I think of it, there may be a way to accomplish the anonymization I want by returning to working at the study level and then probing the links in the metadata table after the fact. I should be able to reestablish the links between anonymized series if I keep track of the links before anonymization and use the metadata to lookup the new linking IDs.

Either that, or I will modify the metadata table myself to insert the links I want.

John.

Dear John,

Sorry for the delay.

That setup has worked for several months until recently when I had to switch from calling “/studies/{ID}/anonymize” to calling on an instance level, “instances/{ID}/anonymize”.

Yes, the anonymization and modification primitives work differently at the instance level than at the patient/study/series levels. As described in the Orthanc Book, anonymizing/modifying one single instance will return the anonymized/modified instance in the body of the REST answer (whereas when applied at the patient/study/series levels, the result is stored back in Orthanc):
http://book.orthanc-server.com/users/anonymization.html#modification-of-a-single-instance

My question is, where is the AnonymizedFrom meta stored? Or set for that matter?

It is set by the Orthanc core, using the metadata primitive. Metadata consists of an associative key-value array (mapping an integer key in the range [0,65535] to a string value) that is associated with each DICOM resource stored inside Orthanc (be it a patient, a study, a series or an instance). Metadata records information that is not readily available in the DICOM tags. “AnonymizedFrom” is one such metadata (whose integer key is 6).

Metadata associated with one instance “id” can be accessed through the REST API:

curl http://localhost:8042/instances/cb855110-5f4da420-ec9dc9cb-2af6a9bb-dcbd180e/metadata

When you modify/anonymize a single instance, the result is not stored inside the Orthanc database, so “AnonymizedFrom” is unavailable.

Should a call to anonymize at the instance level be setting the AnonymizedFrom field or do I need to do that myself? And if I need to set this meta data, how do I do it?

As “AnonymizedFrom” and “ModifiedFrom” are handled privately by Orthanc, they are read-only metadata. You cannot set these fields by yourself.

However, you can mimic their behavior by using user-defined metadata. Such metadata is associated with an integer key greater than or equal to 1024. You can associate a symbolic name with user-defined metadata using the “UserMetadata” configuration option of Orthanc. For instance, here is how you would set/get metadata 1024:

curl http://localhost:8042/instances/cb855110-5f4da420-ec9dc9cb-2af6a9bb-dcbd180e/metadata/1024 -X PUT -d 'hello'

curl http://localhost:8042/instances/cb855110-5f4da420-ec9dc9cb-2af6a9bb-dcbd180e/metadata/1024

hello
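The same PUT can be scripted to mimic “AnonymizedFrom” with a user-defined key. This is only a sketch: the choice of key 1024 for this purpose, the helper names, and the absence of authentication are assumptions, and `new_instance_id` is taken to be the Orthanc identifier of the anonymized instance after it has been uploaded back into Orthanc:

```python
from urllib.request import Request, urlopen

# Assumption: user-defined key 1024 is reserved for "original instance ID".
ORIGIN_KEY = 1024


def metadata_url(base_url, instance_id, key=ORIGIN_KEY):
    """URI of one metadata entry of an instance."""
    return f"{base_url}/instances/{instance_id}/metadata/{key}"


def build_put(base_url, instance_id, value, key=ORIGIN_KEY):
    """Prepare the PUT request that stores `value` under `key`."""
    return Request(
        metadata_url(base_url, instance_id, key),
        data=value.encode("utf-8"),
        method="PUT",
    )


def record_origin(base_url, new_instance_id, original_instance_id):
    """Store the original instance ID on the re-uploaded anonymized instance."""
    with urlopen(build_put(base_url, new_instance_id, original_instance_id)) as resp:
        return resp.status


def read_origin(base_url, instance_id):
    """Read the stored original instance ID back as text."""
    with urlopen(metadata_url(base_url, instance_id)) as resp:
        return resp.read().decode("utf-8")
```

A script can then test for this entry instead of “AnonymizedFrom” to decide whether an instance has already been processed.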

I should add that because I am running anonymize at the instance level, I am creating new Study, Series and Instance UID manually before calling anonymize and then feeding them to anonymize via the “replace” option. I do not let anonymize generate UID itself. Maybe anonymize ONLY sets AnonymizedFrom when it handles the UID generation? And I’ve skipped that step by generating UID myself?

You are free to generate UIDs yourself. The AnonymizedFrom metadata is unrelated to this.

I found the routine AnonymizeOrModifyResource in OrthancRestAnonymizeModify and see the code where the metadata is set up to point from anonymized/modified files to original files.
Checking the metadata table in the PostgreSQL database, I can see that my newly anonymized/modified DICOM files are missing these metadata connections.

Metadata are stored in the table called “metadata”. Viewed in pgadmin3, the AnonymizedFrom metadata appears there as type 6, with the original identifier as its value.

Perhaps that data is never updated when anonymizing at the instance level? After all, unlike anonymizing at the Patient/Study/Series levels, which generate new Patient/Study/Series within Orthanc itself and return metadata, a call to /instances/{ID}/anonymize will RETURN the anonymized DICOM itself. This must then be uploaded back into Orthanc, if you want it there.

Yes, that’s exactly the point: Metadata are only available for resources stored inside Orthanc. Once one instance leaves the Orthanc ecosystem (e.g. through anonymization), metadata is lost.

Is there a way to call anonymize at the instance level and have Orthanc keep a record in the metadata table of the links to the original DICOM?

“AnonymizedFrom” is read-only, so you cannot set it yourself. However, you could set a user-defined metadata entry.

Similarly, is there a way to have it simply retain the new DICOM, rather than sending it in the response?

No, this is not possible in the REST API.

That being explained, you have at least 2 solutions to your issue:

  1. As Orthanc is lightweight, you can use 2 instances of Orthanc. The first one is responsible for receiving files, anonymizing individual instances, then sending each anonymized instance to another Orthanc server (possibly located on the same host). You then know that the second Orthanc server only contains anonymized instances, which avoids infinite loops in your scripts.
  2. The “anonymize” URI of Orthanc automatically sets the (0012,0063) “DeidentificationMethod” DICOM tag. On receiving a new instance, your Lua script could check whether this tag is set to know whether the instance results from an anonymization or not.
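The check in the second option could look roughly like the following. It is written in Python rather than Lua for illustration, and assumes the tags arrive as a dictionary keyed by symbolic tag name (for instance the JSON returned by the `/instances/{id}/simplified-tags` URI); `is_deidentified` is a hypothetical helper name:

```python
def is_deidentified(tags):
    """True if the (0012,0063) DeidentificationMethod tag is present
    and non-empty in a dictionary of tags keyed by symbolic name."""
    value = tags.get("DeidentificationMethod")
    return isinstance(value, str) and value.strip() != ""
```

A script would call this on each incoming instance and skip anonymization when it returns true. Note that instances de-identified by other software may also carry this tag, so the check says "already de-identified by something", not necessarily "by this Orthanc".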

The “best” solution clearly depends upon your application.

HTH,
Sébastien-

Thanks for your detailed response, Sebastien. I did come up with a solution to my problem, but the ability to store information into the meta data portion seems like a useful tool I’ll be looking into.

To sum up my original problem, I had coded the Lua OnStableStudy routine to skip anonymization when it detected AnonymizedFrom data in the meta data. Then, when I switched from anonymizing at the study level to the instance level, I lost that key for stopping the anonymization of already anonymized data.

In my case, I anonymize the data in a very specific way that, among other things, modifies the patient name to have a common “last name” related to the project the data is associated with.

My solution to the infinite loop problem within OnStableStudy was to look at the incoming PatientName and detect subjects that had already been anonymized. Since the patient name is stored with the meta data at the study level ([‘PatientMainDicomTags’][‘PatientName’]), I only need to check once, rather than look at every DICOM instance in the study.
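The test described above might be sketched as follows, in Python rather than Lua for illustration. It assumes the project name occupies the family-name component of PatientName (the part before the first `^`) and that the comparison is case-insensitive; both are assumptions, as is the helper name `already_anonymized`:

```python
def already_anonymized(study, project_last_name):
    """Check the study-level JSON for a PatientName whose family-name
    component matches the project's common "last name"."""
    name = study.get("PatientMainDicomTags", {}).get("PatientName", "")
    # DICOM person names separate their components with '^'.
    last_name = name.split("^", 1)[0]
    return last_name.strip().upper() == project_last_name.strip().upper()
```

Because the name is read from the study-level answer, one lookup per study is enough. The check is only as reliable as the naming scheme: a real patient whose family name happens to equal the project name would be skipped.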

Thanks for such a robust and flexible system!

John.