Troubleshooting high sync times between Uyuni Server and Proxy over WAN connections

Depending on what changes are made in the Web UI or via an API call to distribution or system settings, running the cobbler sync command may be required to transfer files from the Uyuni Server to Uyuni Proxy systems. To accomplish this, Cobbler uses the list of proxies specified in /etc/cobbler/settings.

Due to its design, cobbler sync is not able to sync only the changed or recently added files.

Instead, executing cobbler sync triggers a full sync of the /srv/tftpboot directory to all proxies configured in /etc/cobbler/settings. The sync time is also influenced by the latency of the WAN connection between the involved systems.

According to the logs in /var/log/cobbler/, the sync process may take a considerable amount of time to finish.

For example, a sync started at:

Thu Jun  3 14:47:35 2021 - DEBUG | running python triggers from /var/lib/cobbler/triggers/task/sync/pre/*
Thu Jun  3 14:47:35 2021 - DEBUG | running shell triggers from /var/lib/cobbler/triggers/task/sync/pre/*

and ended at:

Thu Jun  3 15:18:49 2021 - DEBUG | running shell triggers from /var/lib/cobbler/triggers/task/sync/post/*
Thu Jun  3 15:18:49 2021 - DEBUG | shell triggers finished successfully

The transfer amount was roughly 1.8 GB. The transfer took almost 30 minutes.

By comparison, copying a single big file of the same size as /srv/tftpboot completes within several minutes.

Switching to an rsync-based approach to copy files between Uyuni Server and Proxy may help to reduce the transfer and wait times.

The script does not accept command line options. Before running the script, you need to manually edit it and correctly set the SUMAHOSTNAME, SUMAIP, and SUMAPROXY1 variables for it to work.

There is no support available for individual adjustments of the script. The script and the comments inside aim to provide an overview of the process and steps to be taken into consideration. If further help is required, contact SUSE Consulting.
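For orientation only, the core of such an rsync-based transfer could look like the following sketch. It is not the supported script; the proxy host name is a placeholder, and the rsync options shown are assumptions about a reasonable setup for slow WAN links.

```shell
#!/bin/bash
# Illustrative sketch only -- not the supported script.
# SUMAPROXY1 is a placeholder and must point to your actual proxy.
SUMAPROXY1="sumaproxy.sumaproxy.test"

# Archive mode (-a) preserves permissions and timestamps, -z compresses
# data on the wire (useful over WAN), and --delete removes files on the
# proxy that no longer exist on the server, mirroring a full sync.
RSYNC_CMD="rsync -az --delete /srv/tftpboot/ root@${SUMAPROXY1}:/srv/tftpboot/"

# Print the command instead of executing it, so the sketch is safe to run.
echo "${RSYNC_CMD}"
```

Because rsync transfers only changed files over a single connection, repeat runs after the initial sync are much faster than a full per-file copy.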

The proposed approach using the script is beneficial in the following environment:

  • Uyuni Proxy systems are connected via a WAN connection;

  • /srv/tftpboot contains a high number of files for distributions and client PXE boot files, in total several thousand files;

  • All proxies in /etc/cobbler/settings have been disabled; otherwise, Uyuni will continue to sync content to the proxies:

    #proxies:
    # - "sumaproxy.sumaproxy.test"
    # - "sumaproxy2.sumaproxy.test"
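To confirm the last point, a quick illustrative check of the settings file can be scripted. The grep pattern below is an assumption about the YAML layout of /etc/cobbler/settings; adjust it if your file differs.

```shell
#!/bin/bash
# Illustrative check: confirm that no active "proxies:" entry remains
# in the Cobbler settings file (commented entries start with #).
SETTINGS="/etc/cobbler/settings"

# -E enables extended regex, -c counts matching lines; commented lines
# such as "#proxies:" do not match. 0 means the proxy list is disabled.
ACTIVE=$(grep -Ec '^[[:space:]]*proxies:' "${SETTINGS}" 2>/dev/null || true)
echo "Active proxies entries: ${ACTIVE:-0}"
```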

Procedure: Analyzing New Sync Speed
  1. Take a dump of the TCP traffic between Uyuni and the involved systems.

    • On Uyuni Server:

      tcpdump -i ethX -s 200 host <ip-address-of-susemanagerproxy> and not ssh
    • On Uyuni Proxy:

      tcpdump -i ethX -s 200 host <ip-address-of-susemanager> and not ssh
    • The -s 200 option captures only the first 200 bytes of each packet, which is sufficient for this analysis. To save the capture for later analysis in Wireshark, additionally pass -w <capture-file>.pcap.

    • Adjust ethX to the respective network interface Uyuni uses to communicate with the proxy.

    • Finally, SSH communication is excluded from the capture to reduce the number of packets even further.

  2. Start a cobbler sync.

    • To force a sync, delete the Cobbler json cache file first and then issue cobbler sync:

      rm /var/lib/cobbler/pxe_cache.json
      cobbler sync
  3. When cobbler sync is finished, stop the tcpdump captures.

  4. Open the capture files in Wireshark, go to Statistics > Conversations, and wait for the dump to be analyzed.

  5. Switch to the TCP tab. The number shown on this tab gives the total number of conversations captured between Uyuni Server and Uyuni Proxy.

  6. Look for the column Duration.

    • Start by sorting in ascending order to find out the minimal amount of time it took to transfer a file.

    • Continue by sorting in descending order to find out the maximum values for the big files, for example kernel and initrd transfers.

      Ignore ports 4505 and 4506 as these are used for Salt communication.

Analysis of the tcpdump captures showed that the transfer of a small file of approx. 1800 bytes from Uyuni Server to Proxy took around 0.3 seconds.

While there were not many big files, the high number of smaller files resulted in a high number of established connections, because a new TCP connection is created for every single transferred file.

Therefore, knowing the minimal transfer time per file and the number of connections needed (approx. 5000 in this example) gives an approximate estimate of the overall transfer time: 5000 * 0.3 / 60 = 25 minutes.
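The estimate above can be reproduced with simple shell arithmetic, using the measured values from the example:

```shell
#!/bin/bash
# Estimate the overall sync time from the measured values above:
# about 5000 TCP connections, each taking roughly 0.3 seconds.
CONNECTIONS=5000
MS_PER_TRANSFER=300                # 0.3 seconds, in milliseconds

# Integer math: total milliseconds -> seconds -> minutes.
TOTAL_SECONDS=$(( CONNECTIONS * MS_PER_TRANSFER / 1000 ))
TOTAL_MINUTES=$(( TOTAL_SECONDS / 60 ))

echo "Estimated transfer time: ${TOTAL_MINUTES} minutes"
```

This matches the roughly 30 minutes observed in the Cobbler logs, confirming that connection setup overhead, not raw bandwidth, dominates the sync time.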