Transfer accelerator triggers "Error encountered in the plugin engine" without more details…

Hello :wave: !

We are using Orthanc a lot in our project ecosystem. We have a network of 29 Orthanc instances interconnecting 270 DICOM storage nodes around the world.

Most of the time it works perfectly, but recently the cryptic “Error encountered in the plugin engine” has been triggered more and more often. We don’t have a specific scenario that triggers the error every time, so I’ll give you a bit of context:

  • The project allows users to move and access DICOM data spread around the world,
  • Users can upload new data into the system from their computers,
  • We trigger transfers between Orthanc instances using the Transfer Accelerator plugin, then we perform a final “Move” between the last Orthanc and the final DICOM storage.

Since we are moving a lot of data, sometimes over long distances, we tried to optimize the transfers in the following way:

  • We ask Orthanc to create transfer jobs (accelerator or move) on a subset of the data (we are currently working with blocks of 5 series),
  • We keep track of the requested jobs and record their status and progress on our side; when Orthanc tells us a job is successful, we consider the data moved (see the sketch below),
  • Once all the series have been moved, we check through the Orthanc API that the data successfully reached the storage (Orthanc or DICOM).
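
To make this concrete, here is roughly what our orchestration does for one block (a simplified Python sketch; the host name, credentials, peer name and series IDs are placeholders, and the real code also handles the final “Move” and the verification step):

	import time
	import requests
	
	ORTHANC_A = "http://orthanc-a:8042"   # placeholder URL of the sending Orthanc
	AUTH = ("user", "password")           # placeholder credentials
	
	def send_block(series_ids, peer):
	    """Submit a transfer job for a block of series to a remote Orthanc peer."""
	    body = {
	        "Resources": [{"Level": "Series", "ID": sid} for sid in series_ids],
	        "Compression": "gzip",
	        "Peer": peer,
	    }
	    r = requests.post(f"{ORTHANC_A}/transfers/send", json=body, auth=AUTH)
	    r.raise_for_status()
	    return r.json()["ID"]             # ID of the Orthanc job that was created
	
	def wait_for_job(job_id, poll_seconds=5):
	    """Poll the standard /jobs/{id} route until the job succeeds or fails."""
	    while True:
	        job = requests.get(f"{ORTHANC_A}/jobs/{job_id}", auth=AUTH).json()
	        if job["State"] in ("Success", "Failure"):
	            return job
	        time.sleep(poll_seconds)
	
	# One block of (up to) 5 series, identified by their Orthanc IDs
	block = ["series-orthanc-id-1", "series-orthanc-id-2"]
	job = wait_for_job(send_block(block, peer="orthanc-b"))
	if job["State"] != "Success":
	    raise RuntimeError("Transfer failed: " + job.get("ErrorDescription", "no details"))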

Our issue seems to be located in the transfer process. I’m not expecting a quick solution here, maybe just some advice about the configuration…

So the final question is: how can we stabilize the transfer process here?

  • Should we move smaller data chunks (many more transfers, but smaller ones)?
  • Should we move bigger data chunks (fewer transfers, but longer ones)?
  • Should we tweak the local Orthanc configuration regarding packet size, parallel jobs or anything related?

If something isn’t clear enough, ask me some questions, I’ll be glad to give more input ^^

I know that we are talking about network optimization here and that a lot of things are involved in the process; we are just trying to take some steps in the right direction :wink:.

Regarding your question about optimizing the chunk size: I have a project that moves a lot of DICOM data in a short period of time. I encountered many errors indicating that jobs had not finished. I tried to fine-tune the chunk size a lot, but it did not improve things much. One of the culprits is that Orthanc handles a limited number of concurrent requests, due to the underlying library. Hence, if you reduce the chunk size, a lot of requests hit Orthanc and some of them may fail, which causes the job to fail. If you increase the chunk size, Orthanc may take a long time to process each request, which leads to congestion.

Yeah, it’s hard to refine those configurations, it depends on a lot of different details. We also tried different chunk sizes and we have something that works most of the time…

Hi Stéphane,

I do not really have recommendations about chunk size, concurrent jobs, … I think you have more experience than I do with the Transfer Accelerator plugin :wink:

But we should make the plugin robust enough that it works with “any” configuration.

So, next step is to identify this cryptic error:

  • If you have any insight from the logs about when it happens during the transfer, that might help target our search.
  • Would it be possible for you to deploy “mainline” versions of Orthanc on this system? In that case, I can add debug information both in Orthanc and in the plugin (there are at least 20 places where this exception is thrown without any details → I can add more details quite easily, but we will probably need to iterate a few times before we find the real issue). Note that we run the integration tests on the mainline versions too, so they are safe to use as well.

Best regards

Alain

Hello Alain and team,

Here are the logs I was able to get:

Logs on Orthanc A:

	Jul 31 11:14:13 orthanc_a Orthanc[1953]: I0731 11:14:13.145740 OrthancPlugins.cpp:2445] (plugins) Delegating HTTP request to plugin for URI: /transfers/lookup
	Jul 31 11:14:13 orthanc_a Orthanc[1953]: I0731 11:14:13.145825 OrthancPlugins.cpp:3146] (plugins) Plugin making REST GET call on URI /series/bc728de4-68222d7b-ce636673-2e62280b-ba6d5cb8/instances (built-in API)
	Jul 31 11:14:13 orthanc_a Orthanc[1953]: I0731 11:14:13.150050 PluginsManager.cpp:161] (plugins) Transfers accelerator reading DICOM instance: d71550dc-bcf8b767-70bb740c-92117830-4c1aef2d
	Jul 31 11:14:13 orthanc_a Orthanc[1953]: I0731 11:14:13.152367 PluginsManager.cpp:161] (plugins) Transfers accelerator reading DICOM instance: b133c81b-ba50a600-5d44a0d0-11fc961c-83f60468
	Jul 31 11:14:13 orthanc_a Orthanc[1953]: I0731 11:14:13.154656 PluginsManager.cpp:161] (plugins) Transfers accelerator reading DICOM instance: d7b2f473-c77c877b-e398f605-020ff6df-bf7839b5
	Jul 31 11:14:13 orthanc_a Orthanc[1953]: I0731 11:14:13.188777 OrthancPlugins.cpp:2445] (plugins) Delegating HTTP request to plugin for URI: /transfers/chunks/b133c81b-ba50a600-5d44a0d0-11fc961c-83f60468.d71550dc-bcf8b767-70bb740c-92117830-4c1aef2d.d7b2f473-c77c877b-e398f605-020ff6df-bf7839b5
	Jul 31 11:14:13 orthanc_a Orthanc[1953]: E0731 11:14:13.766240 PluginsManager.cpp:188] Exception while invoking plugin service 2000: Error in the network protocol
	Jul 31 11:14:13 orthanc_a Orthanc[1953]: E0731 11:14:13.766328 HttpOutput.cpp:73] This HTTP answer has not sent the proper number of bytes in its body
	Jul 31 11:14:13 orthanc_a Orthanc[1953]: I0731 11:14:13.792947 OrthancPlugins.cpp:2445] (plugins) Delegating HTTP request to plugin for URI: /transfers/lookup

Logs on Orthanc B:

	Jul 31 15:14:09 orthanc_b Orthanc[1124]: I0731 15:14:09.171791 OrthancPlugins.cpp:3203] (plugins) Plugin making REST POST call on URI /instances (built-in API)
	Jul 31 15:14:09 orthanc_b Orthanc[1124]: E0731 15:14:09.185517 OrthancException.cpp:61] Error in the network protocol: libCURL error: Failure when receiving data from the peer while accessing http://orthanc-a:8042/transfers/chunks/b133c81b-ba50a600-5d44a0d0-11fc961c-83f60468.d71550dc-bcf8b767-70bb740c-92117830-4c1aef2d.d7b2f473-c77c877b-e398f605-020ff6df-bf7839b5?offset=0&size=1577992&compression=gzip
	Jul 31 15:14:09 orthanc_b Orthanc[1124]: I0731 15:14:09.193459 JobsEngine.cpp:134] (jobs) Executing job with priority 0 in worker thread 1: 4677fb0b-3522-4d8b-bd58-5fd788c7957d
	Jul 31 15:14:09 orthanc_b Orthanc[1124]: I0731 15:14:09.195229 OrthancPlugins.cpp:3203] (plugins) Plugin making REST POST call on URI /instances (built-in API)
	Jul 31 15:14:09 orthanc_b Orthanc[1124]: I0731 15:14:09.244393 OrthancPlugins.cpp:3203] (plugins) Plugin making REST POST call on URI /instances (built-in API)
	Jul 31 15:14:09 orthanc_b Orthanc[1124]: I0731 15:14:09.259677 PluginsManager.cpp:161] (plugins) Importing transfered DICOM files from the temporary download area into Orthanc

Please let me know if you need additional information. The configuration options for both Orthanc instances are the default ones for the transfer accelerator.

Both servers are in the same region (US).

Thank you!

Best,

Sylvain

I’m in the same boat: I have a VM server in a datacenter with multiple VM instances of Orthanc that receive DICOM and send it to a centralized “mega” Orthanc server for archiving.

However, in our office we have old OsiriX servers (being phased out) which send to a newly built PC server running Orthanc; that server uses the Transfer Accelerator plugin to send to the mega Orthanc server. About half of the jobs fail, and the logs show a 404 HTTP status for the requests. I’m using large chunks to reduce the number of connections, but I’m not quite there. I was thinking of creating a Lua script to retry the job on failure, but I’m a bit worried it’s a sloppy solution.

Hi Sylvain,

The error "This HTTP answer has not sent the proper number of bytes in its body" usually indicates that the connection has been closed unexpectedly, most often by the client. In this case, A is the server and B is the client, and I do not see any reason why B would close the connection so fast. Could there be an HTTP proxy in between that would close it?

Anyway, even if the connection is closed unexpectedly, this should be handled by the transfers plugin, provided that "MaxHttpRetries" is larger than 0. I read that you are using the default configuration (in which it is 0). Could you try increasing the retries to, let’s say, 5?
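
Concretely, that would mean something along these lines in the "Transfers" section of the configuration file of both instances (the other options of that section can keep their default values):

	{
	  "Transfers" : {
	    "MaxHttpRetries" : 5
	  }
	}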

Hope this helps,

Alain.


Has a solution been found for this? I’m having the same problem.