========
Advanced
========

-----------
Usage tips
-----------

Since ``yarsync`` has a command interface similar to that of ``git``,
one can synchronize several repositories simultaneously using `myrepos `_.

If new data was added to several repositories simultaneously,
commit the changes on one of them and synchronize it with the other.
``rsync`` should link the working directory with commits properly.
This may fail depending on how you actually copied files
(their attributes may have changed).
In this case, create new commits in both repositories
and manually rename them to be the same.
Then try to synchronize to check that everything is linked properly.

For example, when we move photographs from an SD card,
we want to have at least two copies of them.
It is more reliable to copy data from the original source to two repositories
than to push it from one of them to the other
(possible errors on the intermediate filesystem increase the risk).
Make sure that the two repositories were synchronized beforehand.

------------
Development
------------

Community contributions are very important for free software projects.
The best thing for the project in its early phase is
to spread information and create packages for new operating systems.

``yarsync`` was tested on ext4, NFSv4 and SimFS on Arch Linux and CentOS.
Tests on other systems would be useful.

----------
Hard links
----------

The file system must support hard links if you plan to use *commits*.
Multiple hard links are supported by POSIX-compliant
and partially POSIX-compliant operating systems,
such as Linux, Android, macOS, and also Windows NT4
and later Windows NT operating systems [`Wikipedia `_].

Notable file systems that **support hard links** include
[`hard links `_ and `comparison of file systems `_ from Wikipedia]:

* EncFS (an Encrypted Filesystem using FUSE).
  Note that it doesn't support hard links
  `when External IV Chaining is enabled `_
  (this is enabled by default in paranoia mode,
  and disabled by default in standard mode).
* ext2-ext4. Standard on Linux.
  Ext4 has a limit of `65000 hard links `_ on a file.
* HFS+. Standard on Mac OS.
* NTFS. The only Windows file system to support hard links.
  It has a limit of `1024 hard links `_ on a file.
* SquashFS, a compressed read-only file system for Linux.

Hard links are **not supported** on:

* FAT, exFAT. These are used on many flash drives.
* Joliet ("CDFS"), ISO 9660. File systems on CDs.

The majority of modern file systems support hard links.
A full list of `file system capabilities `_ can be found on Wikipedia.

One can copy data to file systems without hard links,
but this will reduce the functionality of ``yarsync``,
and one should take care not to consume too much disk space
if files are accidentally copied instead of hard linked.

-----------------
rsync limitations
-----------------

* `Millions of files `_ will be synced very slowly.
* ``rsync`` freezes when encountering **too many hard links**.
  Users report problems for repositories of `200 G `_ or `90 GB `_
  with many hard links.
  For the author's repository with 30 thousand files
  (160 thousand with commits) and 3 GB of data ``rsync`` works fine.
  If you have a large repository and want to copy it with all hard links,
  it is recommended to create a separate partition (e.g. LVM)
  and copy the filesystem as a whole.
  You can also remove some of the older backups.
* ``rsync`` may create separate files instead of hard linking them.
  This can be fixed quickly using the `hardlink `_ executable.

------------
Alternatives
------------

`Free software that uses rsync `_ includes:

* `Back In Time `_. See previous snapshots using a GUI.
* Grsync, a graphical interface for rsync.
* `LuckyBackup `_. It is written in C++
  and is mostly used from a graphical shell.
* `rsnapshot `_, a filesystem snapshot utility.
  ``rsnapshot`` makes it easy to make periodic snapshots of local machines,
  and remote machines over ssh.
  Files can be restored by the users who own them,
  without the root user getting involved.

Other synchronization / backup / archiving software:

* `casync `_ is a combination of the rsync algorithm
  and content-addressable storage.
  It is an efficient way to deliver and update directory trees
  and large images over the Internet in an HTTP and CDN friendly way.
  Other systems that use `similar algorithms `_ include `bup `_.
* `Duplicity `_ backs up directories by producing encrypted
  tar-format volumes and uploading them to a remote or local file server.
  ``duplicity`` uses ``librsync`` and is space efficient.
  It supports many cloud providers.
  As of 2021, ``duplicity`` supports deleted files, full unix permissions,
  directories, symbolic links, fifos, and device files,
  but not hard links.
  It can be run on Linux, macOS and Windows (`under Cygwin `_).
* `Git-annex `_ manages distributed copies of files using git.
  This is a very powerful tool written in Haskell.
  For each file, it tracks the number of backups that contain it
  and their names,
  and it allows one to schedule downloading a file to local storage.
  This is its author's `use case `_:
  "I have a ton of drives. I have a lot of servers.
  I live in a cabin on dialup and often have 1 hour
  on broadband in a week to get everything I need".
  I tried to learn ``git-annex``, found it `uneasy `_,
  and eventually discovered that it
  `doesn't preserve timestamps `_ (because ``git`` doesn't)
  or `permissions `_.
  If that suits you, there is also a list of specialized
  `related software `_.
  ``git-annex`` allows using many cloud services
  as `special remotes `_, including all `rclone remotes `_.
* `Rclone `_ focuses on cloud and other high latency storage.
  It supports more than 50 different providers.
  As of 2021, it doesn't preserve permissions and attributes.

Continuous synchronization software:

* `gut-sync `_ offers real-time bi-directional folder synchronization.
* `Syncthing `_. A very powerful and well-developed tool
  that works on Linux, macOS, Windows and Android.
  It mostly uses a GUI (the admin panel is managed through a Web interface),
  but also has a `command line interface `_.
* `Unison `_ is a file-synchronization tool for OSX, Unix, and Windows.
  It allows two replicas of a collection of files and directories
  to be stored on different hosts (or different disks on the same host),
  modified separately, and then brought up to date
  by propagating the changes in each replica to the other
  (much as other synchronization tools work).
* Dropbox, Google Drive, Yandex Disk and many other closed-source tools
  fall into this category.

ArchWiki includes several useful `scripts for rsync `_
and a list of its `graphical front-ends `_.
It also has a `list of cloud synchronization clients `_
and a `list of synchronization and backup programs `_.
Wikipedia offers a `comparison of file synchronization software `_
and a `comparison of backup software `_.
Git-annex has a list of `git-related `_ tools.