csync is a file synchronizer for Linux that allows you to keep two copies of files and directories in sync. It uses widely adopted protocols like smb or sftp, so there is no need for a csync server component. It is a user-level program, which means you do not need to be a superuser.

1. Introduction

It is often the case that we have multiple copies (called replicas) of a filesystem or part of a filesystem (for example on a notebook and on a desktop computer). Changes to each replica are often made independently, and as a result the replicas no longer contain the same information. In that case a file synchronizer is used to make them consistent again, without losing any information.

The goal is to detect conflicting updates (files which have been modified in both replicas) and to propagate non-conflicting updates to each replica. If there are no conflicts left, we are done and the replicas are identical.

2. Basics

This section describes some basics you might need to understand how file synchronization works.

2.1. Paths

A path normally refers to a point within a set of files which should be synchronized. It is specified relative to the root of the replica. The path is just a sequence of names separated by /.

Note: The path separator is always a forward slash /, even for Windows.

csync always uses the absolute path. This could be /home/gladiac or, for sftp, sftp://gladiac:secret@myserver/home/gladiac.
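To make the two path forms above more concrete, here is a minimal Python sketch that splits such a path into its parts. It is only an illustration and not how csync parses paths internally: the function name describe_replica and the "local" label are assumptions made for this example.

```python
from urllib.parse import urlsplit


def describe_replica(uri):
    """Split a replica path into protocol, user, host and absolute path.

    Hypothetical helper for illustration; csync's own parsing may differ.
    """
    parts = urlsplit(uri)
    return {
        # A path without a scheme is treated as a local path here (assumption).
        "protocol": parts.scheme or "local",
        "user": parts.username,
        "host": parts.hostname,
        "path": parts.path,
    }


local = describe_replica("/home/gladiac")
remote = describe_replica("sftp://gladiac:secret@myserver/home/gladiac")
```

In both cases the path component is the same absolute path, /home/gladiac, and the separator is a forward slash.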

2.2. What is an update?

The contents of a path could be a file, a directory or a symbolic link (symbolic links are not supported yet). To be more precise, if the path refers to:

csync keeps a record of each path which has been successfully synchronized. The path gets compared with the record, and if it has changed since the last synchronization, we have an update. This is done by comparing the modification time or the change time (the modification time of the metadata).
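The comparison described above could be sketched as follows in Python. This is a simplified illustration under assumptions: the Record type and the functions make_record and is_update are hypothetical names, and csync's real record holds more than these two timestamps.

```python
import os
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical record kept for a path after a successful synchronization."""
    mtime_ns: int  # modification time of the contents
    ctime_ns: int  # change time, i.e. modification time of the metadata


def make_record(path: str) -> Record:
    """Read the current timestamps of a path."""
    st = os.stat(path)
    return Record(st.st_mtime_ns, st.st_ctime_ns)


def is_update(path: str, record: Record) -> bool:
    """Return True if either timestamp differs from the stored record."""
    return make_record(path) != record
```

A caller would store the result of make_record after each successful synchronization and pass it to is_update on the next run.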

2.3. What is a conflict?

A path is conflicting if it fulfills the following conditions:

  1. it has been updated in one replica,

  2. it or any of its descendants has been updated on the other replica too, and

  3. its contents in the two replicas are not identical.
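The three conditions above can be written as a small predicate. The following Python sketch is an illustration only: the function names, the sets of updated paths and the dictionaries mapping paths to contents are assumptions made for this example and are not csync data structures.

```python
def is_same_or_descendant(candidate, path):
    """True if candidate is path itself or lies below it."""
    return candidate == path or candidate.startswith(path.rstrip("/") + "/")


def is_conflict(path, updated_a, updated_b, contents_a, contents_b):
    """Check the three conflict conditions for a path.

    updated_a, updated_b: sets of paths updated in replica A and B.
    contents_a, contents_b: dicts mapping a path to its contents.
    """
    def updated_here_and_below_there(here, there):
        # Condition 1: the path was updated in one replica.
        # Condition 2: it or a descendant was updated in the other replica.
        return path in here and any(
            is_same_or_descendant(p, path) for p in there
        )

    touched = (updated_here_and_below_there(updated_a, updated_b)
               or updated_here_and_below_there(updated_b, updated_a))
    # Condition 3: the contents are not identical.
    return touched and contents_a.get(path) != contents_b.get(path)
```

Note that a path which was updated in both replicas to the same contents is not a conflict under this definition.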

3. File Synchronization

The main goal of a file synchronizer is correctness. It changes the whole or separate parts of a user's file system, and the user is not able to monitor the complete synchronization process, so the synchronizer is in a position where it can damage the file system. It is important that the implementation behaves correctly under all conditions, even if there is an unexpected error (for example, a full disk).

One problem concerning correctness is the handling of conflicts. Each file synchronizer tries to propagate conflicting changes to the other replica. At the end both replicas should be identical. There are different strategies to fulfill these goals.

csync is a 3-phase file synchronizer. This design was chosen so that user interaction is possible and the process is easy to understand. The 3 phases are update detection, reconciliation and propagation. These will be described in the following sections.

3.1. Update detection

There are different strategies for update detection. csync uses a state-based modtime-inode update detector. This means it uses the modification time to detect updates, which doesn't require many resources. A record of each file is stored in a database (called statedb) and compared with the current modification time during update detection. If the file has changed since the last synchronization, an instruction is set to evaluate it during the reconciliation phase. If we don't have a record for a file we investigate, it is marked as new.

Detecting renamed files is a problem. This is also solved by the record we store in the statedb. If we don't find the file by its name in the database, we search for its inode number. If the inode number is found, the file has been renamed.
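The lookup by name and then by inode number could look roughly like the following Python sketch. It is a simplified illustration under assumptions: the statedb is modelled as a plain dictionary rather than a database, and the type FileStat, the function detect_update and the instruction names "NONE", "EVAL", "RENAME" and "NEW" are hypothetical labels chosen for this example.

```python
from typing import Dict, NamedTuple


class FileStat(NamedTuple):
    """Hypothetical statedb record: inode number and modification time."""
    inode: int
    mtime: int


def detect_update(name: str, current: FileStat,
                  statedb: Dict[str, FileStat]) -> str:
    """Return an instruction for one file found during update detection."""
    record = statedb.get(name)
    if record is not None:
        # Known name: compare the stored modification time with the current one.
        return "EVAL" if current.mtime != record.mtime else "NONE"
    # Unknown name: search the records for the same inode number.
    if any(rec.inode == current.inode for rec in statedb.values()):
        return "RENAME"
    # Neither the name nor the inode number is known.
    return "NEW"


statedb = {"a.txt": FileStat(10, 100), "b.txt": FileStat(11, 100)}
```

With this statedb, a file c.txt with inode number 11 would be reported as a rename of b.txt, while a file with an unknown name and an unknown inode number would be marked as new.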

3.2. Reconciliation

TODO

3.3. Propagation

TODO

4. Getting started

4.1. Installing csync

See the README and INSTALL files for install prerequisites and procedures. Packagers should take a look at Appendix A: Packager Notes.

4.2. Using the commandline client

TODO csync /home/csync sftp://TODO:secret@server:port/profile/TODO

4.3. The PAM module

TODO

5. Appendix A: Packager Notes

Read the README and INSTALL files (in the distribution root directory).