Garbled characters when I uploaded Japanese patient name.

VET_Atoms · December 30, 2022, 1:36pm

Hello,

We are seeing garbled characters in patient names.
This issue may only happen with Japanese patient names.
The details are given below.
Any advice or suggestions would be highly appreciated.

[Environment]
Orthanc version 1.11.2 (installed with the Windows installer)
Windows 11

[Uploaded DICOM files]
The files were exported from Unitea (CR console software made by Konica Minolta).
When exporting a DICOM file, we can choose one of the following 3 character sets:

ISO 2022 IR6/IR13/IR87
ISO 2022 IR13/IR87
ISO 2022 IR6/IR87

Then, I exported 3 studies (1 study for each character set).

You can download these files from the link below:
https://drive.google.com/drive/folders/1viMT4VQH3tkbPMyB9IvRYcvubrYTg_nx?usp=share_link

[Default encoding]
I tried uploading with each of the following 4 values of “DefaultEncoding”:

"DefaultEncoding" : "Utf8",
"DefaultEncoding" : "Ascii",
"DefaultEncoding" : "Japanese",
"DefaultEncoding" : "JapaneseKanji",

[Others]
When we upload DICOM files that contain Japanese patient names and were exported from other modalities (made by Fujifilm, Rayence, iRay and so on), the upload usually succeeds without any garbled characters.
The character encoding of those files is ISO IR13/87, ISO IR87 or ISO192.

When we import the files into Horos, there are no garbled characters.
But “==” is shown at the beginning of the patient’s name.

You can see a screenshot of Horos at the link below:
https://drive.google.com/file/d/1I8HRpC_s0v1V4BP2Nf0EnPOhgEw_D__z/view?usp=share_link

I think the “==” is the cause of the garbled characters.
But I can’t remove it when I export from the modality.

The sequence of steps is shown in the video below:
https://youtu.be/4lNQljA10DQ

Any advice or suggestions would be highly appreciated.

Best regards,
Yamada

alainmazy · December 30, 2022, 5:21pm

Hi Yamada,

First of all, I’d like to tell you that your detailed report is perfect ! We really appreciate it !

Note that the “DefaultEncoding” configuration is only used if the files are missing “SpecificCharacterSet”, which is not the case for your files. It is therefore normal that you observe the same behaviour whatever the value of “DefaultEncoding”.

I actually was not aware that SpecificCharacterSet can contain multiple values. It seems that, in this case, the DICOM file contains escape characters at the beginning of the string values that use the second or third CharacterSet. That might explain the ‘==’. Furthermore, Orthanc currently does not handle the second and third CharacterSet, which explains the garbage instead of Japanese characters.
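To illustrate the escape sequences mentioned above, here is a minimal Python sketch. It uses Python's built-in `iso2022_jp` codec as a stand-in decoder, which is an assumption made for illustration and not what Orthanc or DCMTK do internally. The sample patient name is modelled on the example in DICOM PS3.5 Annex H, not taken from the files in this thread, and the sketch only covers the IR 6 / IR 87 combination; the half-width katakana of IR 13 would need additional handling.

```python
# Sketch only: Python's built-in "iso2022_jp" codec stands in for a DICOM
# parser. It understands the ESC $ B (JIS X 0208, "ISO 2022 IR 87") and
# ESC ( B (ASCII, "ISO 2022 IR 6") designations used in this value.
ESC = b"\x1b"

# Sample PN value modelled on the example in DICOM PS3.5 Annex H
# (not taken from the files discussed in this thread).
raw_patient_name = (
    b"Yamada^Tarou="
    + ESC + b"$B" + b";3ED" + ESC + b"(B" + b"^"
    + ESC + b"$B" + b"B@O:" + ESC + b"(B" + b"="
    + ESC + b"$B" + b"$d$^$@" + ESC + b"(B" + b"^"
    + ESC + b"$B" + b"$?$m$&" + ESC + b"(B"
)


def decode_ir87(value: bytes) -> str:
    """Decode a value that switches between ASCII and JIS X 0208 via escapes."""
    return value.decode("iso2022_jp")


def decode_ignoring_escapes(value: bytes) -> str:
    """Show what remains if the ESC bytes are dropped and the rest read as ASCII."""
    return value.replace(ESC, b"").decode("ascii")


print(decode_ir87(raw_patient_name))
print(decode_ignoring_escapes(raw_patient_name))
```

The second function gives a rough idea of the garbage that appears when the escape sequences are not interpreted: the two-byte JIS X 0208 codes are shown as ASCII punctuation and letters.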

I’m a bit worried by the fact that dcmdump from DCMTK does not handle it correctly either (and DCMTK is the library that we use to parse DICOM files). So, this means that it won’t be easy to fix…

As a workaround, I would advise you to try to configure your modality to use a single CharacterSet if possible.

Best regards,

Alain.

alainmazy · December 30, 2022, 5:26pm

Note that, I just checked with Radiant and it shows the ‘==’ signs as well (same as Horos). So, the ‘==’ is maybe really included in the strings.
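One possible reading, which this thread does not confirm: in the DICOM PN (Person Name) value representation, “=” is the delimiter between the alphabetic, ideographic and phonetic component groups. A value that starts with “==” would then simply have empty alphabetic and ideographic groups. A minimal Python sketch of that split, using a hypothetical value rather than one from the uploaded files:

```python
from typing import NamedTuple


class PersonName(NamedTuple):
    alphabetic: str
    ideographic: str
    phonetic: str


def split_person_name(value: str) -> PersonName:
    """Split a decoded PN value into its three "="-delimited component groups."""
    groups = value.split("=")
    if len(groups) > 3:
        raise ValueError(f"PN value has more than three component groups: {value!r}")
    groups += [""] * (3 - len(groups))  # trailing groups may be omitted
    return PersonName(*groups)


# Hypothetical value, not taken from the files in this thread.
name = split_person_name("==ヤマダ^タロウ")
print(name)
```

Under this assumption, a viewer that displays the raw PN value would show the leading “==” even though the file itself is well-formed.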

VET_Atoms · December 31, 2022, 1:48am

Thank you for your quick reply.

I understand that the cause is the ‘==’ and the multiple values in SpecificCharacterSet.
I will ask the manufacturer of the modality.

Best regards,
Yamada

hito · September 8, 2023, 10:24am

I did some research and ended up here.

In a somewhat different situation, I get the following error when performing a C-FIND on the DICOM server: Unsupported value for the SpecificCharacterSet (0008,0005) tag: “ISO 2022 IR 13\ISO 2022 IR 87”

In Japan, it is very common for multiple character sets to be listed in (0008,0005), so if C-FIND is interrupted by this error, DICOM Q/R cannot be used.

If multiple character sets like this one are specified in (0008,0005), we hope that they will be handled without causing errors.

https://dicom.nema.org/Medical/dicom/current/output/chtml/part05/sect_H.3.2.html
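As a rough illustration of accepting a multi-valued (0008,0005) instead of rejecting it, the sketch below splits the value on the backslash and looks up each defined term. The mapping to Python codec names is an assumption made for this example and is not Orthanc's or DCMTK's table; the function name is hypothetical as well.

```python
from typing import List

# Illustrative mapping from DICOM defined terms to Python codec names.
# These pairings are assumptions for this sketch only.
PYTHON_CODECS = {
    "": "ascii",                    # empty value: default repertoire (ISO-IR 6)
    "ISO 2022 IR 6": "ascii",
    "ISO 2022 IR 13": "shift_jis",  # covers the JIS X 0201 single-byte range
    "ISO 2022 IR 87": "iso2022_jp",
    "ISO_IR 192": "utf_8",
}


def parse_specific_character_set(tag_value: str) -> List[str]:
    """Split a (0008,0005) value on the backslash; return one codec per value."""
    codecs = []
    for term in tag_value.split("\\"):
        term = term.strip()
        if term not in PYTHON_CODECS:
            raise ValueError(f"Unsupported SpecificCharacterSet value: {term!r}")
        codecs.append(PYTHON_CODECS[term])
    return codecs


print(parse_specific_character_set("ISO 2022 IR 13\\ISO 2022 IR 87"))
```

The first codec in the returned list would apply at the start of each string, and the later ones only after the corresponding escape sequence.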

Best regards,
Ito

alainmazy · September 9, 2023, 9:40am

Hi Ito,

This topic is already on our TODO list. However, could you provide us with the C-FIND query that you are trying to execute, ideally as a findscu command or else from the Orthanc logs in verbose mode?

That would help us with testing once we work on it.

Thanks,

Alain.

hito · September 11, 2023, 9:10am

Hi Alain,

Thank you for your prompt reply.
I have attached the relevant part of the error log.
The patient names are anonymized.

Best regards,
Ito

T0911 18:01:27.766644 DicomControlUserConnection.cpp:77] (dicom) Received Find Response -1:
===================== INCOMING DIMSE MESSAGE ====================
Message Type : C-FIND RSP
Message ID Being Responded To : 1
Affected SOP Class UID : FINDStudyRootQueryRetrieveInformationModel
Data Set : present
DIMSE Status : 0xff00: Pending: Matches are continuing
======================= END DIMSE MESSAGE =======================
T0911 18:01:27.766644 DicomControlUserConnection.cpp:85] (dicom) Response Identifiers -1:

Dicom-Data-Set

Used TransferSyntax: Little Endian Explicit

(0008,0005) CS [ISO 2022 IR 13\ISO 2022 IR 87] # 30, 2 SpecificCharacterSet
(0008,0020) DA [20230911] # 8, 1 StudyDate
(0008,0030) TM [000350] # 6, 1 StudyTime
(0008,0050) SH [RKU2023091108636] # 16, 1 AccessionNumber
(0008,0052) CS [STUDY] # 6, 1 QueryRetrieveLevel
(0008,0054) AE [vue5838FIR] # 10, 1 RetrieveAETitle
(0008,0061) CS [CT] # 2, 1 ModalitiesInStudy
(0008,0090) PN (no value available) # 0, 0 ReferringPhysicianName
(0008,1030) LO [e$B6;J"It#C#T!JC1=c!Ke(J] # 24, 1 StudyDescription
(0010,0010) PN [XXXXX^YYYYY] # 14, 1 PatientName
(0010,0020) LO [10006286] # 8, 1 PatientID
(0010,0021) LO [RKU] # 4, 1 IssuerOfPatientID
(0010,0022) CS [TEXT] # 4, 1 TypeOfPatientID
(0010,0030) DA [19931026] # 8, 1 PatientBirthDate
(0010,0040) CS [M] # 2, 1 PatientSex
(0020,000d) UI [1.2.392.200036.9184.1.1203973.10006286.20230910.1.1] # 52, 1 StudyInstanceUID
(0020,0010) SH [9784] # 4, 1 StudyID
(0020,1206) IS [6] # 2, 1 NumberOfStudyRelatedSeries

E0911 18:01:27.766644 OrthancException.cpp:61] Parameter out of range: Unsupported value for the SpecificCharacterSet (0008,0005) tag: “ISO 2022 IR 13\ISO 2022 IR 87”
T0911 18:01:27.876044 ServerContext.cpp:290] Serializing the content of the jobs engine
T0911 18:01:27.876044 JobsRegistry.cpp:296] Job backup is not supported for job of type: Archive
T0911 18:01:28.641664 Connection.cpp:405] (sqlite) SQLite::Connection::FlushToDisk
T0911 18:01:37.938540 ServerContext.cpp:290] Serializing the content of the jobs engine
T0911 18:01:37.938540 JobsRegistry.cpp:296] Job backup is not supported for job of type: Archive
T0911 18:01:38.891687 Connection.cpp:405] (sqlite) SQLite::Connection::FlushToDisk
T0911 18:01:48.001040 ServerContext.cpp:290] Serializing the content of the jobs engine
T0911 18:01:48.001040 JobsRegistry.cpp:296] Job backup is not supported for job of type: Archive
T0911 18:01:49.141662 Connection.cpp:405] (sqlite) SQLite::Connection::FlushToDisk
T0911 18:01:55.003894 OrthancWebDav.cpp:1244] Cleaning up the empty WebDAV upload folders
T0911 18:01:58.066322 ServerContext.cpp:290] Serializing the content of the jobs engine
T0911 18:01:58.066322 JobsRegistry.cpp:296] Job backup is not supported for job of type: Archive
T0911 18:01:59.347774 Connection.cpp:405] (sqlite) SQLite::Connection::FlushToDisk
T0911 18:02:08.144310 ServerContext.cpp:290] Serializing the content of the jobs engine
T0911 18:02:08.144310 JobsRegistry.cpp:296] Job backup is not supported for job of type: Archive
