Transfers plugin cannot work in a distributed environment (Swarm / K8s)

Hi Author,
I am experimenting with Orthanc running in Docker Swarm. Below is my Orthanc configuration. Orthanc1 and Orthanc2 each run with 2 replicas.

Orthanc1

version: '3.8'
services:
  orthanc1:
    image: orthancteam/orthanc:24.12.0-full
    deploy:
      replicas: 2
    # container_name: orthanc
    ports: [8042:8042, 4242:4242]
    volumes:
      - ./logs:/logs
      - ./storage/db:/var/lib/orthanc/db
    environment:
      # PHONG: $(hostname | awk -F. '{print $2}')
      PHONG: DB{{.Task.Slot}}
      TASK_SLOT: /mnt{{.Task.Slot}}
      # VERBOSE_STARTUP: "true"
      VERBOSE_ENABLED: "true"
      ORTHANC__POSTGRESQL: |
        {
          "Host": "orthanc-index1",
          "IndexConnectionsCount": 4
        }
      ORTHANC__NAME: "orthanc1"
      ORTHANC__ORTHANC_PEERS: |
        {
            "orthanc2": {
              "Url": "http://192.168.31.146:8043/",
              "RemoteSelf": "orthanc1"
            }
        }
      ORTHANC__DICOM_MODALITIES: |
        {
          "orthanc2" : ["ORTHANC", "192.168.31.146", 4243 ]
        }
      # DICOM_WEB_PLUGIN_ENABLED: "true"
      DELAYED_DELETION_PLUGIN_ENABLED: "true"
      ORTHANC__DATABASE_SERVER_IDENTIFIER: DB{{.Task.Slot}}
      TRANSFERS_PLUGIN_ENABLED: "true"
      ORTHANC__TRANSFERS__MAX_HTTP_RETRIES: 5
      ORTHANC__AUTHENTICATION_ENABLED: "false"
      LOGFILE: /logs/Orthanc.{{.Task.Slot}}.log

  orthanc-index1:
    image: postgres:15
    restart: unless-stopped
    volumes: ["orthanc-index1:/var/lib/postgresql/data"]
    environment:
      POSTGRES_HOST_AUTH_METHOD: "trust"


volumes:
  orthanc-index1:

Orthanc2

version: '3.8'
services:
  orthanc2:
    image: orthancteam/orthanc:24.12.0-full
    deploy:
      replicas: 2
    # container_name: orthanc
    ports: [8043:8042, 4243:4242]
    volumes:
      - ./logs:/logs
      - ./storage/db:/var/lib/orthanc/db
    environment:
      # PHONG: $(hostname | awk -F. '{print $2}')
      PHONG: DB{{.Task.Slot}}
      TASK_SLOT: /mnt{{.Task.Slot}}
      # VERBOSE_STARTUP: "true"
      VERBOSE_ENABLED: "true"
      ORTHANC__POSTGRESQL: |
        {
          "Host": "orthanc-index2",
          "IndexConnectionsCount": 4
        }
      ORTHANC__NAME: "orthanc2"
      ORTHANC__ORTHANC_PEERS: |
        {
          "orthanc1": ["http://192.168.31.146:8042/"]
        }
      # DICOM_WEB_PLUGIN_ENABLED: "true"
      DELAYED_DELETION_PLUGIN_ENABLED: "true"
      ORTHANC__DATABASE_SERVER_IDENTIFIER: DB{{.Task.Slot}}
      TRANSFERS_PLUGIN_ENABLED: "true"
      ORTHANC__TRANSFERS__MAX_HTTP_RETRIES: 5
      ORTHANC__AUTHENTICATION_ENABLED: "false"
      LOGFILE: /logs/Orthanc.{{.Task.Slot}}.log

  orthanc-index2:
    image: postgres:15
    restart: unless-stopped
    ports: ["5432:5432"]
    volumes: ["orthanc-index2:/var/lib/postgresql/data"]
    environment:
      POSTGRES_HOST_AUTH_METHOD: "trust"


volumes:
  orthanc-index2:

When I use the transfers plugin to transfer DICOM from Orthanc1 to Orthanc2 in PULL mode, Orthanc2 reports the following error:

I0201 05:25:15.022824          HTTP-24 JobsRegistry.cpp:795] New job submitted with priority 0: 0478f3e5-cafb-414e-9de2-7410425ba1c3
I0201 05:25:15.023571    JOBS-WORKER-1 JobsEngine.cpp:135] (jobs) Executing job with priority 0 in worker thread 1: 0478f3e5-cafb-414e-9de2-7410425ba1c3
I0201 05:25:15.025370    JOBS-WORKER-1 HttpClient.cpp:852] (http) New HTTP request to: http://192.168.31.146:8042/transfers/lookup (timeout: 60s)
I0201 05:25:15.035440    JOBS-WORKER-1 HttpClient.cpp:1074] (http) HTTP status code 200 in 9 ms after POST request on: http://192.168.31.146:8042/transfers/lookup
E0201 05:25:15.036312    JOBS-WORKER-1 PluginsManager.cpp:154] Invalid originator, check out the "RemoteSelf" configuration option of peer: orthanc1
I0201 05:25:15.037166    JOBS-WORKER-1 JobsRegistry.cpp:505] Job has completed with failure: 0478f3e5-cafb-414e-9de2-7410425ba1c3

I think it is related to the code snippet below:

      if (job_.query_.HasOriginator() &&
          job_.query_.GetOriginator() != answer[KEY_ORIGINATOR_UUID].asString())
      {
        LOG(ERROR) << "Invalid originator, check out the \"" << KEY_REMOTE_SELF
                   << "\" configuration option of peer: " << job_.query_.GetPeer();
        return StateUpdate::Failure();
      }

I think the above code is only there for validation, and it is what prevents the plugin from working in a distributed environment. Should we remove this check so that the Orthanc transfers plugin can run in Swarm or K8s?
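To make the suspected failure concrete, here is a minimal Python sketch of that check with two replicas behind one address. It is an illustration only, not the plugin's code: the `Replica` class, the replica names, and the value of `KEY_ORIGINATOR_UUID` are hypothetical, and it assumes that each replica generates its own originator UUID at startup.

```python
import uuid

# Assumed key name; the real value of KEY_ORIGINATOR_UUID lives in the plugin source.
KEY_ORIGINATOR_UUID = "OriginatorUuid"


class Replica:
    """Hypothetical stand-in for one Orthanc replica running the transfers plugin."""

    def __init__(self, name):
        self.name = name
        # Assumption: every process creates its own UUID when it starts.
        self.originator_uuid = str(uuid.uuid4())

    def lookup(self):
        # Simplified stand-in for the answer to POST /transfers/lookup.
        return {KEY_ORIGINATOR_UUID: self.originator_uuid}


def originator_matches(expected_originator, answer):
    """Mirror of the quoted check: reject an answer from a different originator."""
    if expected_originator is not None and expected_originator != answer[KEY_ORIGINATOR_UUID]:
        return False
    return True


replica_a = Replica("orthanc1.1")
replica_b = Replica("orthanc1.2")

# Replica A asks Orthanc2 to pull and announces its own UUID.
expected = replica_a.originator_uuid

same = originator_matches(expected, replica_a.lookup())   # lookup answered by A
other = originator_matches(expected, replica_b.lookup())  # lookup answered by B
print(same, other)  # True False
```

Under these assumptions, the job fails whenever the lookup request is routed to a replica other than the one that started the transfer, which would match the "Invalid originator" line in the log.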

There are notes in the Orthanc Book specifically on how to use the transfers plugin with a load balancer: see "Transfers accelerator plugin" in the Orthanc Book documentation.

My understanding is that there is a specific header you can use to direct the request to the correct instance.

Hth

James

Yes, I know that approach. However, I am using PULL mode, not PUSH mode, and I want to use the default load balancer in Docker Swarm, not third-party software (e.g. HAProxy).

Hi,
My understanding is that Docker uses simple round-robin load balancing and doesn’t support sticky sessions, so I don’t believe it will work with the default load balancer.
More advanced proxies (HAProxy, Traefik, Nginx, etc.) support sticky sessions for clients, so they should work.
James
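
The difference between the two routing strategies mentioned above can be sketched in a few lines of Python. This is a toy model, not how Docker Swarm or any real proxy is implemented: the backend names, the `round_robin` and `sticky` helpers, and the use of a CRC32 hash as the stickiness key are all assumptions made for illustration.

```python
import itertools
import zlib

BACKENDS = ["orthanc1.1", "orthanc1.2"]  # hypothetical replica names


def round_robin(backends):
    """Router that ignores the client and cycles through the backends."""
    cycle = itertools.cycle(backends)
    return lambda client_key: next(cycle)


def sticky(backends):
    """Router that pins each client key to one backend via a stable hash."""
    return lambda client_key: backends[zlib.crc32(client_key.encode()) % len(backends)]


def backends_seen(router, client_key, requests=6):
    """Set of backends that served consecutive requests from a single client."""
    return {router(client_key) for _ in range(requests)}


rr_seen = backends_seen(round_robin(BACKENDS), "orthanc2")
sticky_seen = backends_seen(sticky(BACKENDS), "orthanc2")
print(len(rr_seen), len(sticky_seen))  # 2 1
```

With round robin, consecutive requests from the same client reach both replicas; with a sticky router they all reach one. Note that this only shows the routing behaviour. In PULL mode the pinned replica would still have to be the one that initiated the transfer, so client stickiness alone may not be enough to satisfy the originator check.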