Help Importing a Large Volume of DICOM Files into Orthanc Automatically

Hi, forum folks!

I am facing a challenge and would like to ask for your help. I have a dedicated Orthanc server at my company, which I use to manage imaging studies (DICOM). Recently, I needed to import a large number of ultrasound (US) studies that were archived in backups a few years ago.

Here is the scenario:

  • I have several terabytes of DICOM files compressed as .7z archives.
  • Currently, I am decompressing the files by hand and sending them to Orthanc through the graphical interface, using the Upload field.
  • This process is very slow, and since I have thousands of files, I am looking for a solution that lets me automate the upload.

What I have already tried:

  1. I searched the official Orthanc documentation for automation options and found the Orthanc Folder Synchronizer plugin, but I have not deployed it yet because I am not sure it is the best solution for my case.
  2. I know Orthanc exposes a RESTful API for uploading files directly, but since I am dealing with a very large volume, there may be more efficient ways to do it.
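What I mean by uploading through the REST API is something like this minimal sketch (the URL, folder path, and helper names are placeholders for illustration, assuming Orthanc's default HTTP port 8042):

```python
import requests
from pathlib import Path

# Assumed Orthanc REST endpoint; host, port and credentials are placeholders.
ORTHANC_URL = "http://localhost:8042/instances"

def find_dicom_files(root):
    """Collect every regular file under *root* (DICOM files often have no extension)."""
    return sorted(p for p in Path(root).rglob("*") if p.is_file())

def upload_instance(path, url=ORTHANC_URL, auth=None):
    """POST one DICOM file to Orthanc's /instances endpoint; returns the HTTP status."""
    with open(path, "rb") as f:
        response = requests.post(url, data=f.read(), auth=auth)
    return response.status_code

# Usage (once the URL and folder are adjusted for your environment):
# for dcm in find_dicom_files(r"C:\extracted"):
#     print(dcm, upload_instance(dcm))
```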

My questions:

  1. Is there any way for Orthanc to watch a local folder and automatically import the DICOM files placed there?
  2. Would it be possible to set something up to process multiple files in batches (instead of one by one)?
  3. If the RESTful API is the best option, what would be the best approach for handling thousands of files, considering performance and robustness?

I would be very grateful if you could share tips, scripts, or even best practices for dealing with this kind of scenario.

Thanks in advance for your help!

Additional information about my environment:

  • Operating system: Windows Server
  • Orthanc installed at the latest version
  • The DICOM files are compressed as .7z archives and can be extracted to a local folder if necessary.

If you need more details, I am happy to provide them!


Suggested tags:

  • Batch Import
  • Orthanc API
  • Folder Synchronizer Plugin
  • Automation

Looking forward to your input! :smile:

Hello,

From what I understand, you are a company looking for professional services in Spanish around the Orthanc ecosystem. If so, please check out our list of professional assistance.

Kind Regards,
Sébastien

Hello

If you know some Python, you could start with a script such as the one below.

The idea is to:

  • Walk your directory tree to find all .7z files
  • For each .7z file, uncompress it to a temporary folder and process all the extracted files
  • For each extracted file, upload it to Orthanc through a POST to the /instances endpoint
  • When all the files in a .7z archive have been uploaded, move the .7z file to a “finished” directory tree, replicating the same relative path, so that the finished directory keeps the exact same layout as the initial source directory (you might want to disable this specific step until you have validated that the script works correctly!)

In addition, in order not to overload Orthanc, a semaphore is used to limit the number of concurrent uploads to 30.

I just defined the overall architecture and asked ChatGPT to generate the skeleton: I have NOT attempted to run it. The code does look exactly like what I asked for, though.

Packages required here: aiohttp, aiofiles, py7zr

Hopefully, if you have some programming knowledge, this should point you in the right direction.

Feel free to ask questions.

HTH

Benjamin

import os
import asyncio
import aiohttp
import aiofiles
import py7zr
import tempfile
import shutil

async def main():
    root_dir = '/path/to/input/directory'
    finished_dir = '/path/to/finished/directory'
    upload_url = 'http://orthanc-server/instances'
    max_concurrent_posts = 30
    semaphore = asyncio.Semaphore(max_concurrent_posts)
    
    # Find all .7z files
    seven_z_files = find_7z_files(root_dir)
    
    for seven_z_file in seven_z_files:
        print(f"Processing {seven_z_file}")
        await process_7z_file(seven_z_file, upload_url, semaphore)
        # After processing, move the .7z file to the finished directory
        move_to_finished(seven_z_file, root_dir, finished_dir)

def find_7z_files(root_dir):
    seven_z_files = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if filename.endswith('.7z'):
                seven_z_files.append(os.path.join(dirpath, filename))
    return seven_z_files

async def process_7z_file(seven_z_file, upload_url, semaphore):
    # Extract to a temporary directory
    with tempfile.TemporaryDirectory() as tmpdirname:
        with py7zr.SevenZipFile(seven_z_file, mode='r') as z:
            z.extractall(path=tmpdirname)
        # Now, get all files in the temp directory
        file_paths = []
        for dirpath, dirnames, filenames in os.walk(tmpdirname):
            for filename in filenames:
                file_paths.append(os.path.join(dirpath, filename))
        
        # Create an aiohttp session
        async with aiohttp.ClientSession() as session:
            # Create tasks for uploading files
            tasks = []
            for file_path in file_paths:
                task = asyncio.create_task(upload_file(file_path, upload_url, semaphore, session))
                tasks.append(task)
            # Wait for all uploads to complete
            await asyncio.gather(*tasks)

async def upload_file(file_path, upload_url, semaphore, session):
    async with semaphore:
        # Load file into memory
        async with aiofiles.open(file_path, 'rb') as f:
            data = await f.read()
        # Perform POST request
        try:
            async with session.post(upload_url, data=data) as resp:
                if resp.status in (200, 201):
                    print(f"Uploaded {file_path}")
                else:
                    print(f"Failed to upload {file_path}, status code: {resp.status}")
        except Exception as e:
            print(f"Error uploading {file_path}: {e}")

def move_to_finished(seven_z_file, root_dir, finished_dir):
    # Get the relative path
    relative_path = os.path.relpath(seven_z_file, root_dir)
    # Destination path
    dest_path = os.path.join(finished_dir, relative_path)
    # Ensure the destination directory exists
    dest_dir = os.path.dirname(dest_path)
    os.makedirs(dest_dir, exist_ok=True)
    # Move the file
    shutil.move(seven_z_file, dest_path)

if __name__ == '__main__':
    asyncio.run(main())

Good morning!

First of all, sorry for my absence over the weekend: I spent Saturday and Sunday adapting your code (with the help of ChatGPT), and I am glad to say that it worked really well!

I will soon be writing an article on my LinkedIn, and I would like to know if I can also publish it here, as it will definitely help other people.

Thank you for your support!

(I used Google Translate)


Hello!

I am happy to read this!

Such an article can be useful for other forum members and publishing it here means that it will be indexed and searchable, so I think it is a very good idea (provided that its content is textual as opposed to an attachment).

In the meantime, it occurred to me that this script is perhaps not really efficient, because every .7z file is processed sequentially (while, within a single archive, the DICOM files are correctly uploaded concurrently).

If each archive contains a significant number of files this will be fine but, otherwise, the transfer could be slow.

Please let me know if this is a problem (from what you write, it seems to be OK, though). If you eventually find that the script's speed is not satisfactory with this architecture, let me know and we will try to improve it.
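For what it's worth, the fix would be to launch the per-archive work concurrently as well, with a single shared semaphore throttling the total amount of work in flight across all archives. A minimal, self-contained sketch of that pattern (a dummy coroutine stands in for the real extract-and-upload step, which in the actual script would wrap each POST with the semaphore):

```python
import asyncio

async def process_archive(name, semaphore):
    # Stand-in for "extract one .7z and upload its DICOM files".
    async with semaphore:
        await asyncio.sleep(0)  # simulate one bounded upload slot
        return name

async def run_all(archives, max_concurrent=30):
    # One shared semaphore throttles work across ALL archives,
    # so several archives can be in flight at the same time.
    semaphore = asyncio.Semaphore(max_concurrent)
    tasks = [process_archive(a, semaphore) for a in archives]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_all(["a.7z", "b.7z", "c.7z"]))
print(results)  # gather() preserves input order
```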

Benjamin

Hello, good morning… have you tried doing a DICOM send, or importing directly into the DICOM server?
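For example, with DCMTK's storescu tool one can push an extracted folder over the DICOM protocol via C-STORE (assuming Orthanc's default DICOM port 4242 and AE title ORTHANC; the host, port, AE title, and path below are placeholders to adjust for your setup):

```shell
# Recursively send every file under C:\extracted to Orthanc via C-STORE
# (+sd scans directories for input files, +r recurses into subdirectories,
#  -aec sets the called AE title of the peer).
storescu +sd +r -aec ORTHANC localhost 4242 C:\extracted
```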