Using more than 32 cores in 3dDeconvolve

Rick's point about memory is a good one. If you are running out of memory rather than CPUs, the speed can be limited by disk thrashing, as the system swaps virtual memory to and from the disk. Check memory and swap usage while the job is running with top, ps, or similar tools.
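If you would rather log this than watch top by hand, here is a minimal sketch in Python. It assumes a Linux machine where /proc/meminfo is available; the function names parse_meminfo and memory_pressure are just illustrative, not part of AFNI. It reports roughly what free would show: the fraction of RAM still available and how much swap is in use. If swap in use keeps growing while the job runs, memory is the likely bottleneck, not the core count.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_pressure(info):
    """Return (fraction of RAM still available, swap in use in kB)."""
    total = info.get("MemTotal", 0)
    available = info.get("MemAvailable", 0)
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    fraction = available / total if total else 0.0
    return fraction, swap_used


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")  # assumption: Linux only
    if meminfo.exists():
        fraction, swap_used = memory_pressure(parse_meminfo(meminfo.read_text()))
        print(f"available RAM: {fraction:.0%}, swap in use: {swap_used} kB")
    else:
        print("/proc/meminfo not found; use top or ps instead")
```

Run it a few times in another terminal while the job is processing, or wrap the call in a loop with a sleep, and compare the numbers over time.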

There are several previous threads with more good advice: