SMB is single-threaded. This means SMB uses only one CPU core to transfer a file.
See: https://www.samba.org/samba/docs/old/Samba3-Developers-Guide/architecture.html
Usually this is not a problem as SMB does not fully utilize a CPU core. But on extremely busy systems this could overload your CPU core.
A better CPU, especially a multi-threaded one, will make a huge difference.
The single-thread limitation of SMB can be bypassed through opening multiple connections to your server.
Samba’s default configuration is fairly poor.
Options which help speed up Samba:
[global] # FORCE THE DISK SYSTEM TO ALLOCATE REAL STORAGE BLOCKS WHEN # A FILE IS CREATED OR EXTENDED TO BE A GIVEN SIZE. # THIS IS ONLY A GOOD OPTION FOR FILE SYSTEMS THAT SUPPORT # UNWRITTEN EXTENTS LIKE XFS, EXT4, BTRFS, OCS2. # IF YOU USE A FILE SYSTEM THAT DOES NOT SUPPORT UNWRITTEN # EXTENTS, SET "strict allocate = no". # NOTE: MAY WASTE DRIVE SPACE EVEN ON SUPPORTED FILE SYSTEMS # SEE: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=798532 strict allocate = Yes # THIS IS TO COUNTERACT SPACE WASTAGE THAT CAN BE # CAUSED BY THE PREVIOUS OPTION # SEE: https://lists.samba.org/archive/samba-technical/2014-July/101304.html allocation roundup size = 4096 # ALLOW READS OF 65535 BYTES IN ONE PACKET. # THIS TYPICALLY PROVIDES A MAJOR PERFORMANCE BENEFIT. read raw = Yes # SERVER SIGNING SLOWS THINGS DOWN WHEN ENABLED. # THIS WAS DISABLED BY DEFAULT PRIOR TO SAMBA 4. server signing = No # SUPPORT RAW WRITE SMBs WHEN TRANSFERRING DATA FROM CLIENTS. write raw = Yes # WHEN "strict locking = no", THE SERVER PERFORMS FILE LOCK # CHECKS ONLY WHEN THE CLIENT EXPLICITLY ASKS FOR THEM. # WELL-BEHAVED CLIENTS ALWAYS ASK FOR LOCK CHECKS WHEN IT IS # IMPORTANT, SO IN THE VAST MAJORITY OF CASES, # "strict locking = auto" OR "strict locking = no" IS ACCEPTABLE. strict locking = No # TCP_NODELAY: # SEND AS MANY PACKETS AS NECESSARY TO KEEP DELAY LOW # IPTOS_LOWDELAY: # [Linux IPv4 Tweak] MINIMIZE DELAYS FOR INTERACTIVE TRAFFIC # SO_RCVBUF: # ENLARGE SYSTEM SOCKET RECEIVE BUFFER # SO_SNDBUF: # ENLARGE SYSTEM SOCKET SEND BUFFER socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=131072 SO_SNDBUF=131072 # SMBWriteX CALLS GREATER THAN "min receivefile size" WILL BE # PASSED DIRECTLY TO KERNEL recvfile/splice SYSTEM CALL. # TO ENABLE POSIX LARGE WRITE SUPPORT (SMB/CIFS WRITES UP TO 16MB), # THIS OPTION MUST BE NONZERO. # THIS OPTION WILL HAVE NO EFFECT IF SET ON A SMB SIGNED CONNECTION. # MAX VALUE = 128k min receivefile size = 16384 # USE THE MORE EFFICIENT sendfile() SYSTEM CALL FOR EXCLUSIVELY # OPLOCKED FILES. # NOTE: ONLY FOR CLIENTS HIGHER THAN WINDOWS 98/Me use sendfile = Yes # READ FROM FILE ASYNCHRONOUSLY WHEN SIZE OF REQUEST IS BIGGER # THAN THIS VALUE. # NOTE: SAMBA MUST BE BUILT WITH ASYNCHRONOUS I/O SUPPORT aio read size = 16384 # WRITE TO FILE ASYNCHRONOUSLY WHEN SIZE OF REQUEST IS BIGGER # THAN THIS VALUE # NOTE: SAMBA MUST BE BUILT WITH ASYNCHRONOUS I/O SUPPORT aio write size = 16384
https://wiki.samba.org/index.php/Performance_Tuning
https://wiki.samba.org/index.php/Linux_Performance
https://wiki.samba.org/index.php/Server-Side_Copy
https://www.samba.org/~ab/output/htmldocs/Samba3-HOWTO/speed.html
https://www.samba.org/samba/docs/current/man-html/smb.conf.5.html
https://lists.samba.org/archive/samba-technical/attachments/20140519/642160aa/attachment.pdf