Character Encoding in C-Find responses?

Hi,

I have a question about character encoding in C-Find responses by Orthanc.

I noticed that OsiriX doesn't display non-ASCII characters in patient names correctly in its query window. (http://imgur.com/kITHrYh)
(When I actually load the files, everything works fine; the output is only wrong in the query window.)

From what I can tell, it looks like Orthanc C-Find responses are always sent as UTF-8 and never have an explicit encoding tag.
Changing `"DefaultEncoding" : "Latin1"` to "Utf8" in orthanc.json does not seem to change the observed behaviour.

Here is what I tried:

I ran the current Orthanc from Docker [1] and imported some DICOM files with valid and invalid encodings:

  # this is on a UTF-8 terminal

  UTF8_STRING="Müller^Jürgen"
  # state the source encoding explicitly instead of relying on the locale
  LATIN1_STRING=$(printf '%s' "$UTF8_STRING" | iconv -f utf8 -t latin1)

  img2dcm -k "(0010,0010)"="$UTF8_STRING" -k "(0010,0020)"="utf8" -k "(0008,0005)"="ISO_IR 192" a.jpg a-utf8.dcm
  img2dcm -k "(0010,0010)"="$LATIN1_STRING" -k "(0010,0020)"="latin1" -k "(0008,0005)"="ISO_IR 100" a.jpg a-latin1.dcm
  img2dcm -k "(0010,0010)"="$LATIN1_STRING" -k "(0010,0020)"="latin1-as-utf8" -k "(0008,0005)"="ISO_IR 192" a.jpg a-latin1-as-utf8.dcm
  img2dcm -k "(0010,0010)"="$UTF8_STRING" -k "(0010,0020)"="utf8-as-latin1" -k "(0008,0005)"="ISO_IR 100" a.jpg a-utf8-as-latin1.dcm

  dcmsend -v localhost 4242 a-*.dcm

This creates four patients and they look as expected in the web frontend (http://imgur.com/qoP38Vw)

Now I'm querying these four Patients with `findscu`:

  findscu -k 0008,0052=STUDY \
    -k PatientID -k PatientName \
    -k SpecificCharacterSet="ISO_IR 192" \
    -k 0008,0005 -P localhost 4242 -v -aet findscu -X

  # output at [2]; sending "ISO_IR 100" instead does not make a difference

None of the responses has the SpecificCharacterSet-Tag set. They all appear to be UTF-8. [3]
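
To check a batch of responses for the tag, one option is to filter the dcmdump text. This is only a minimal sketch: `has_charset` is a hypothetical helper name of mine (not part of DCMTK), and the sample text is shortened from the dumps in [3].

```shell
# Hypothetical helper: succeeds if dcmdump-style text on stdin
# contains a SpecificCharacterSet (0008,0005) element.
has_charset() {
  grep -q '^(0008,0005)'
}

# Data set as returned by Orthanc in this report (no character set tag).
response_without='(0008,0052) CS [STUDY] # 6, 1 QueryRetrieveLevel
(0010,0020) LO [utf8] # 4, 1 PatientID'

# The same data set with an explicit character set.
response_with='(0008,0005) CS [ISO_IR 192] # 10, 1 SpecificCharacterSet
(0008,0052) CS [STUDY] # 6, 1 QueryRetrieveLevel
(0010,0020) LO [utf8] # 4, 1 PatientID'

if printf '%s\n' "$response_without" | has_charset; then
  without_result=present
else
  without_result=missing
fi

if printf '%s\n' "$response_with" | has_charset; then
  with_result=present
else
  with_result=missing
fi

echo "without tag: $without_result, with tag: $with_result"
```

With real files, the same function can be fed from `dcmdump rsp0001.dcm | has_charset`.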

OsiriX seems to treat unknown encodings as Latin-1. [4]
I can reproduce what OsiriX displays by treating the UTF-8 responses as Latin-1 and encoding them as UTF-8 a second time:

  $ dcmdump rsp*.dcm | iconv -f latin1 -t utf8 | grep -e PatientName -e PatientID
  (0010,0010) PN [Müller^Jürgen] # 14, 1 PatientName
  (0010,0020) LO [latin1-as-utf8] # 14, 1 PatientID
  (0010,0010) PN [Müller^Jürgen] # 16, 1 PatientName
  (0010,0020) LO [latin1] # 6, 1 PatientID
  (0010,0010) PN [Müller^Jürgen] # 20, 1 PatientName
  (0010,0020) LO [utf8-as-latin1] # 14, 1 PatientID
  (0010,0010) PN [Müller^Jürgen] # 16, 1 PatientName
  (0010,0020) LO [utf8] # 4, 1 PatientID
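
The value lengths in the dumps (14, 16 and 20) seem to fit this explanation, assuming DICOM pads each value to an even number of bytes. The sketch below counts the bytes of the name in each encoding. The octal escapes stand for the UTF-8 bytes of "ü" so that the script does not depend on the terminal encoding, and the helper names are made up for illustration.

```shell
# "Müller^Jürgen" in UTF-8; \303\274 is the UTF-8 byte pair for "ü".
utf8_name() { printf 'M\303\274ller^J\303\274rgen'; }

# Count bytes on stdin; tr strips the padding some wc versions print.
count_bytes() { wc -c | tr -d ' '; }

utf8_len=$(utf8_name | count_bytes)
latin1_len=$(utf8_name | iconv -f UTF-8 -t ISO-8859-1 | count_bytes)
# Misread the UTF-8 bytes as Latin-1 and encode them as UTF-8 again.
double_len=$(utf8_name | iconv -f ISO-8859-1 -t UTF-8 | count_bytes)

echo "utf8=$utf8_len latin1=$latin1_len double=$double_len"
```

Rounded up to even lengths, 13, 15 and 19 bytes give the 14, 16 and 20 reported by dcmdump for the Latin-1, UTF-8 and double-encoded names.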

Has anyone got OsiriX to work with Orthanc? Is there something else I can try?

Thank you,
Levin Alexander

[1]

$ docker run -v `pwd`/orthanc.json:/etc/orthanc/orthanc.json -p 4242:4242 -p 8042:8042 --rm jodogne/orthanc /etc/orthanc --verbose

I used orthanc.json from the docker image (`docker run --rm --entrypoint /bin/cat jodogne/orthanc /etc/orthanc/orthanc.json > orthanc.json`) and changed:

  "DicomModalities": {
    "osirix":["osirix", "127.0.0.1", 11112],
    "findscu":["findscu", "127.0.0.1", 104]
  }

and I also changed "DefaultEncoding" to "Utf8" and back a few times to see if it made a difference.
(restarting the server and reuploading the images after each change)

[2]

on the client:

I: Requesting Association
I: Association Accepted (Max Send PDV: 16372)
I: Sending Find Request (MsgID 1)
I: Request Identifiers:
I:
I: # Dicom-Data-Set
I: # Used TransferSyntax: Little Endian Explicit
I: (0008,0005) CS [ISO_IR 192] # 10, 1 SpecificCharacterSet
I: (0008,0052) CS [STUDY] # 6, 1 QueryRetrieveLevel
I: (0010,0010) PN (no value available) # 0, 0 PatientName
I: (0010,0020) LO (no value available) # 0, 0 PatientID
I:
I: Received Find Response 1 (Pending)
I: Writing response message to file: rsp0001.dcm
I: Received Find Response 2 (Pending)
I: Writing response message to file: rsp0002.dcm
I: Received Find Response 3 (Pending)
I: Writing response message to file: rsp0003.dcm
I: Received Find Response 4 (Pending)
I: Writing response message to file: rsp0004.dcm
I: Received Final Find Response (Success)
I: Releasing Association

on the server:

I1103 20:04:04.072624 CommandDispatcher.cpp:491] Association Received from AET findscu on IP 172.17.0.1
I1103 20:04:04.072891 CommandDispatcher.cpp:689] Association Acknowledged (Max Send PDV: 16372)
I1103 20:04:04.175367 main.cpp:119] No limit on the number of C-FIND results at the Patient, Study and Series levels
I1103 20:04:04.175416 main.cpp:129] No limit on the number of C-FIND results at the Instance level
I1103 20:04:04.175563 OrthancFindRequestHandler.cpp:578] DICOM C-Find request at level: Study
I1103 20:04:04.175594 OrthancFindRequestHandler.cpp:584] (0008,0005) SpecificCharacterSet = ISO_IR 192
I1103 20:04:04.175614 OrthancFindRequestHandler.cpp:584] (0008,0020) StudyDate =
I1103 20:04:04.175630 OrthancFindRequestHandler.cpp:584] (0008,0030) StudyTime =
I1103 20:04:04.175644 OrthancFindRequestHandler.cpp:584] (0008,0050) AccessionNumber =
I1103 20:04:04.175659 OrthancFindRequestHandler.cpp:584] (0008,0052) QueryRetrieveLevel = STUDY
I1103 20:04:04.175674 OrthancFindRequestHandler.cpp:584] (0008,0061) ModalitiesInStudy =
I1103 20:04:04.175689 OrthancFindRequestHandler.cpp:584] (0008,0080) InstitutionName =
I1103 20:04:04.175703 OrthancFindRequestHandler.cpp:584] (0008,0090) ReferringPhysicianName =
I1103 20:04:04.175729 OrthancFindRequestHandler.cpp:584] (0008,1030) StudyDescription =
I1103 20:04:04.175744 OrthancFindRequestHandler.cpp:584] (0008,1050) PerformingPhysicianName =
I1103 20:04:04.175759 OrthancFindRequestHandler.cpp:584] (0010,0010) PatientName =
I1103 20:04:04.175773 OrthancFindRequestHandler.cpp:584] (0010,0020) PatientID =
I1103 20:04:04.175788 OrthancFindRequestHandler.cpp:584] (0010,0030) PatientBirthDate =
I1103 20:04:04.175802 OrthancFindRequestHandler.cpp:584] (0010,0040) PatientSex =
I1103 20:04:04.175817 OrthancFindRequestHandler.cpp:584] (0020,000d) StudyInstanceUID =
I1103 20:04:04.175832 OrthancFindRequestHandler.cpp:584] (0020,0010) StudyID =
I1103 20:04:04.175847 OrthancFindRequestHandler.cpp:584] (0020,1208) NumberOfStudyRelatedInstances =
I1103 20:04:04.175862 OrthancFindRequestHandler.cpp:584] (0032,4000) RETIRED_StudyComments =
I1103 20:04:04.175876 OrthancFindRequestHandler.cpp:584] (4008,0212) RETIRED_InterpretationStatusID =
I1103 20:04:04.176014 OrthancFindRequestHandler.cpp:654] Number of candidate resources after fast DB filtering: 4
I1103 20:04:04.176047 FilesystemStorage.cpp:154] Reading attachment "db8fe58b-4b39-4ae6-a4c2-0b340c07a4be" of "JSON summary of DICOM" content type
I1103 20:04:04.176535 FilesystemStorage.cpp:154] Reading attachment "0355891a-4cdb-46e0-a412-f7f599a0993f" of "JSON summary of DICOM" content type
I1103 20:04:04.176932 FilesystemStorage.cpp:154] Reading attachment "3b0db155-872c-42cf-9cb2-a80979f6fdc7" of "JSON summary of DICOM" content type
I1103 20:04:04.177355 FilesystemStorage.cpp:154] Reading attachment "e39fe820-6596-48a7-b44c-16b42df8314c" of "JSON summary of DICOM" content type
I1103 20:04:04.177709 OrthancFindRequestHandler.cpp:680] Number of matching resources: 4
I1103 20:04:04.287153 CommandDispatcher.cpp:860] DUL Peer Requested Release
I1103 20:04:04.287223 CommandDispatcher.cpp:867] Association Release

[3]

(http://pastebin.com/raw/vEFbwz5H; "utf8" and "latin1" appear correct, the "*-as-*" files are broken. This is on a UTF-8 terminal.)

$ dcmdump rsp*.dcm

# Dicom-File-Format

# Dicom-Meta-Information-Header
# Used TransferSyntax: Little Endian Explicit
(0002,0000) UL 194 # 4, 1 FileMetaInformationGroupLength
(0002,0001) OB 00\01 # 2, 1 FileMetaInformationVersion
(0002,0002) UI [1.2.276.0.7230010.3.1.0.1] # 26, 1 MediaStorageSOPClassUID
(0002,0003) UI [1.2.276.0.7230010.3.1.4.0.20321.1478203214.527453] # 50, 1 MediaStorageSOPInstanceUID
(0002,0010) UI =LittleEndianExplicit # 20, 1 TransferSyntaxUID
(0002,0012) UI [1.2.276.0.7230010.3.0.3.6.1] # 28, 1 ImplementationClassUID
(0002,0013) SH [OFFIS_DCMTK_361] # 16, 1 ImplementationVersionName

# Dicom-Data-Set
# Used TransferSyntax: Little Endian Explicit
(0008,0052) CS [STUDY] # 6, 1 QueryRetrieveLevel
(0010,0010) PN [M?ller^J?rgen] # 14, 1 PatientName
(0010,0020) LO [latin1-as-utf8] # 14, 1 PatientID

# Dicom-File-Format

# Dicom-Meta-Information-Header
# Used TransferSyntax: Little Endian Explicit
(0002,0000) UL 194 # 4, 1 FileMetaInformationGroupLength
(0002,0001) OB 00\01 # 2, 1 FileMetaInformationVersion
(0002,0002) UI [1.2.276.0.7230010.3.1.0.1] # 26, 1 MediaStorageSOPClassUID
(0002,0003) UI [1.2.276.0.7230010.3.1.4.0.20321.1478203214.527454] # 50, 1 MediaStorageSOPInstanceUID
(0002,0010) UI =LittleEndianExplicit # 20, 1 TransferSyntaxUID
(0002,0012) UI [1.2.276.0.7230010.3.0.3.6.1] # 28, 1 ImplementationClassUID
(0002,0013) SH [OFFIS_DCMTK_361] # 16, 1 ImplementationVersionName

# Dicom-Data-Set
# Used TransferSyntax: Little Endian Explicit
(0008,0052) CS [STUDY] # 6, 1 QueryRetrieveLevel
(0010,0010) PN [Müller^Jürgen] # 16, 1 PatientName
(0010,0020) LO [latin1] # 6, 1 PatientID

# Dicom-File-Format

# Dicom-Meta-Information-Header
# Used TransferSyntax: Little Endian Explicit
(0002,0000) UL 194 # 4, 1 FileMetaInformationGroupLength
(0002,0001) OB 00\01 # 2, 1 FileMetaInformationVersion
(0002,0002) UI [1.2.276.0.7230010.3.1.0.1] # 26, 1 MediaStorageSOPClassUID
(0002,0003) UI [1.2.276.0.7230010.3.1.4.0.20321.1478203214.527455] # 50, 1 MediaStorageSOPInstanceUID
(0002,0010) UI =LittleEndianExplicit # 20, 1 TransferSyntaxUID
(0002,0012) UI [1.2.276.0.7230010.3.0.3.6.1] # 28, 1 ImplementationClassUID
(0002,0013) SH [OFFIS_DCMTK_361] # 16, 1 ImplementationVersionName

# Dicom-Data-Set
# Used TransferSyntax: Little Endian Explicit
(0008,0052) CS [STUDY] # 6, 1 QueryRetrieveLevel
(0010,0010) PN [Müller^Jürgen] # 20, 1 PatientName
(0010,0020) LO [utf8-as-latin1] # 14, 1 PatientID

# Dicom-File-Format

# Dicom-Meta-Information-Header
# Used TransferSyntax: Little Endian Explicit
(0002,0000) UL 194 # 4, 1 FileMetaInformationGroupLength
(0002,0001) OB 00\01 # 2, 1 FileMetaInformationVersion
(0002,0002) UI [1.2.276.0.7230010.3.1.0.1] # 26, 1 MediaStorageSOPClassUID
(0002,0003) UI [1.2.276.0.7230010.3.1.4.0.20321.1478203214.527456] # 50, 1 MediaStorageSOPInstanceUID
(0002,0010) UI =LittleEndianExplicit # 20, 1 TransferSyntaxUID
(0002,0012) UI [1.2.276.0.7230010.3.0.3.6.1] # 28, 1 ImplementationClassUID
(0002,0013) SH [OFFIS_DCMTK_361] # 16, 1 ImplementationVersionName

# Dicom-Data-Set
# Used TransferSyntax: Little Endian Explicit
(0008,0052) CS [STUDY] # 6, 1 QueryRetrieveLevel
(0010,0010) PN [Müller^Jürgen] # 16, 1 PatientName
(0010,0020) LO [utf8] # 4, 1 PatientID

[4]

OsiriX has a "Text Encoding" preference where I can choose different encodings. However, this only seems to change what OsiriX asks for
(`-k SpecificCharacterSet="ISO_IR 192"`), not how it treats the responses.

Dear Levin,

I have a question about character encoding in C-Find responses by Orthanc. […]
From what I can tell, it looks like Orthanc C-Find responses are always sent as UTF-8 and never have an explicit encoding tag.

You are perfectly right: Orthanc <= 1.1.0 does not correctly handle the encoding of C-FIND answers. Thanks for pointing this issue out to us!

This has just been fixed in the mainline (the Docker images have been updated):
https://bitbucket.org/sjodogne/orthanc/commits/9b373b7d671387a158176e69c4fff0befbf46eb4

Attached to this message is an adaptation of your original script. Here is its log, which shows that Orthanc now works as expected:

$ bash ./Levin.sh
Defining a sample modality on Orthanc
Setting the default encoding of Orthanc to Latin1
(0008,0005) CS [ISO_IR 100] # 10, 1 SpecificCharacterSet
(0010,0010) PN [Müller^Jürgen] # 14, 1 PatientName
(0010,0020) LO [latin1] # 6, 1 PatientID
(0008,0005) CS [ISO_IR 100] # 10, 1 SpecificCharacterSet
(0010,0010) PN [Müller^Jürgen] # 14, 1 PatientName
(0010,0020) LO [utf8] # 4, 1 PatientID

Setting the default encoding of Orthanc to Ascii
(0008,0005) CS [ISO_IR 6] # 8, 1 SpecificCharacterSet
(0010,0010) PN [Mller^Jrgen] # 12, 1 PatientName
(0010,0020) LO [latin1] # 6, 1 PatientID
(0008,0005) CS [ISO_IR 6] # 8, 1 SpecificCharacterSet
(0010,0010) PN [Mller^Jrgen] # 12, 1 PatientName
(0010,0020) LO [utf8] # 4, 1 PatientID

Setting the default encoding of Orthanc to Utf8
(0008,0005) CS [ISO_IR 192] # 10, 1 SpecificCharacterSet
(0010,0010) PN [Müller^Jürgen] # 16, 1 PatientName
(0010,0020) LO [latin1] # 6, 1 PatientID
(0008,0005) CS [ISO_IR 192] # 10, 1 SpecificCharacterSet
(0010,0010) PN [Müller^Jürgen] # 16, 1 PatientName
(0010,0020) LO [utf8] # 4, 1 PatientID

Setting the default encoding of Orthanc to Hebrew
(0008,0005) CS [ISO_IR 138] # 10, 1 SpecificCharacterSet
(0010,0010) PN [Mller^Jrgen] # 12, 1 PatientName
(0010,0020) LO [latin1] # 6, 1 PatientID
(0008,0005) CS [ISO_IR 138] # 10, 1 SpecificCharacterSet
(0010,0010) PN [Mller^Jrgen] # 12, 1 PatientName
(0010,0020) LO [utf8] # 4, 1 PatientID

As the log above shows, the default encoding used by Orthanc can now be changed dynamically (cf. the configuration option “DefaultEncoding”). This default encoding, which was previously ignored when handling C-FIND requests, is now used to encode the C-FIND answers.
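
For reference, the pairs of “DefaultEncoding” value and SpecificCharacterSet seen in the log can be written as a small lookup. This is only a sketch of what the log shows, not of how Orthanc implements it: `charset_for_encoding` is a hypothetical name, and only the four encodings from the log are covered.

```shell
# Map a "DefaultEncoding" value to the SpecificCharacterSet that
# appears in the log above. Other encodings are deliberately not guessed.
charset_for_encoding() {
  case "$1" in
    Latin1) echo 'ISO_IR 100' ;;
    Ascii)  echo 'ISO_IR 6' ;;
    Utf8)   echo 'ISO_IR 192' ;;
    Hebrew) echo 'ISO_IR 138' ;;
    *)      echo "unknown encoding: $1" >&2; return 1 ;;
  esac
}

for enc in Latin1 Ascii Utf8 Hebrew; do
  printf '%-6s -> %s\n' "$enc" "$(charset_for_encoding "$enc")"
done
```

Note that with Ascii and Hebrew the "ü" cannot be represented, which is why the names in the log come back as "Mller^Jrgen".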

This fix should also solve your problem with OsiriX. We would of course love to hear your feedback about it.

Regards,
Sébastien-

Levin.sh (1.35 KB)

Hi Sébastien,

I can confirm that this is fixed with Orthanc 1.2.0.

OsiriX/Horos now display Patient names in C-Find responses from Orthanc correctly.

Thank You!

-Levin