Chinese characters in the file cannot be parsed properly

suniw · November 9, 2024, 11:49am

I am running Orthanc on Windows 10 (OrthancInstaller Win64-24.10.2), with DefaultEncoding set to Utf8 in the configuration.

When I set SpecificCharacterSet to ISO-IR 192, the Chinese characters in tags (00080080), (00100010), and (00204000) are not parsed properly, and the console does not report any error.

When I set SpecificCharacterSet to UTF8, the Chinese characters in tags (00080080), (00100010), and (00204000) are still not parsed properly, but this time the console reports the error 'Value of Specific Character Set (00080005) is not supported: UTF8, fallback to ASCII (remove all special characters)'.

I downloaded the source code and found that the GetDicomEncoding function in Enumerations.cpp only checks for ISO-IR 192 and not for UTF8. Since I have no development environment for debugging, I can only guess: would it parse properly if the condition were changed to 'else if (s=="ISO-IR 192" || s="UTF8")'? I hope to receive assistance.

alainmazy · November 12, 2024, 5:14pm

Hello,

In my understanding, UTF-8 is not a valid value for SpecificCharacterSet.

You should first try to open your file in another tool and, only if Orthanc shows a different result and you believe there is a bug in Orthanc, please share a file with us and a screen capture showing what the Chinese characters should look like.

Best,

Alain.

suniw · December 3, 2024, 4:21am

Hello Alain
The problem has been resolved. I changed line 2023 of OrthancFramework/Sources/Enumerations.cpp to 'else if (s=="GB18030" || s="GBK" || s="UTF8")'. After recompiling, the file is parsed normally, although I am not sure whether this modification is correct.
The file is available at http://suniw.com/DX000000.dcm

jodogne · December 3, 2024, 7:21am

Hello,

Glad to read that this patch solves your issue. However, as indicated by Alain, UTF8 is not a valid value according to the DICOM standard, so we cannot include it inside the official source code of Orthanc.

Also, note that you must write s=="UTF-8" instead of s="UTF-8" in C++. Otherwise, you are not testing anything, but assigning value UTF-8 to variable s. In addition, your file contains the UTF8 value, not the UTF-8 value. So, your patch seems incorrect to me.
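As a minimal sketch of the comparison written with the equality operator: the helper name `MatchesPatchedCharacterSet` below is hypothetical and is not part of Orthanc. It only mirrors the condition quoted in this thread, and it illustrates that "UTF8" and "UTF-8" are different strings for such a test.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not Orthanc code):
// returns true if the character set string equals one of the values
// listed in the patched condition from this thread.
bool MatchesPatchedCharacterSet(const std::string& s)
{
  // "==" compares the content of the strings; a single "=" would be
  // an assignment, not a test.
  return s == "GB18030" || s == "GBK" || s == "UTF8";
}
```

With this helper, a file declaring UTF8 matches, whereas a file declaring UTF-8 does not.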

Regards,
Sébastien-

suniw · December 3, 2024, 7:47am

Hello,
Sorry, that was my oversight: I did not check the text after it was translated. The actual source code is correct. Regarding files from non-compliant manufacturers, could you provide a configuration option to simplify this lengthy 'if else' section? This is just a personal suggestion. Thank you for your reply.
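
A minimal sketch of what such a configurable lookup might look like, replacing a chain of 'if else' tests with a table. Everything here is an assumption for illustration: the `Encoding` enum, `Normalize`, `ResolveCharacterSet`, and the `extraAliases` parameter are hypothetical names and do not exist in Orthanc. The built-in table holds only a few standard DICOM terms, and non-standard values such as UTF8 would come from the user-supplied alias map.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical, reduced set of target encodings (not Orthanc's enum).
enum class Encoding { Ascii, Utf8, Chinese };

// Strip surrounding spaces and convert to upper case, so that
// " iso_ir 192 " and "ISO_IR 192" are treated alike.
static std::string Normalize(std::string s)
{
  const std::string::size_type first = s.find_first_not_of(' ');
  if (first == std::string::npos)
  {
    return std::string();
  }
  const std::string::size_type last = s.find_last_not_of(' ');
  s = s.substr(first, last - first + 1);
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
  return s;
}

// Look the value up in the standard table first, then in the aliases
// provided by the user. Alias keys are expected in upper case.
// Returns false (leaving "target" untouched) if nothing matches.
bool ResolveCharacterSet(Encoding& target,
                         const std::string& value,
                         const std::map<std::string, Encoding>& extraAliases)
{
  static const std::map<std::string, Encoding> standard = {
    { "ISO_IR 192", Encoding::Utf8 },
    { "GB18030", Encoding::Chinese },
    { "GBK", Encoding::Chinese }
  };

  const std::string key = Normalize(value);

  std::map<std::string, Encoding>::const_iterator found = standard.find(key);
  if (found == standard.end())
  {
    found = extraAliases.find(key);
    if (found == extraAliases.end())
    {
      return false;
    }
  }

  target = found->second;
  return true;
}
```

In this sketch, a site that receives files labeled UTF8 would add the alias "UTF8" to the map itself, so the default behavior stays limited to the standard terms.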