mirror of
https://github.com/nextcloud/desktop.git
synced 2024-11-28 11:48:56 +03:00
Add more test to the userguide.
parent
2a9ac9a91a
commit
0f6a55bb23
2 changed files with 194 additions and 32 deletions
109
doc/csync.txt
|
@ -13,13 +13,13 @@ Introduction
|
||||||
|
|
||||||
It is often the case that we have multiple copies (called replicas) of a
|
It is often the case that we have multiple copies (called replicas) of a
|
||||||
filesystem or part of a filesystem (for example on a notebook and on a desktop
|
filesystem or part of a filesystem (for example on a notebook and on a desktop
|
||||||
computer). Changes to each replica are often made independently and as a
|
computer). Changes to each replica are often made independently and as a
|
||||||
result they do not contain the same information. In that case a file
|
result they do not contain the same information. In that case a file
|
||||||
synchronizer is used to make them consistent again, without losing any
|
synchronizer is used to make them consistent again, without losing any
|
||||||
information.
|
information.
|
||||||
|
|
||||||
The goal is to detect conflicting <<X13, updates>> (files which have been
|
The goal is to detect conflicting <<X13, updates>> (files which have been
|
||||||
modified) and propagate non-conflicting updates to each replica. If there
|
modified) and propagate non-conflicting updates to each replica. If there
|
||||||
are no conflicts left we are done and the replicas are identical.
|
are no conflicts left we are done and the replicas are identical.
|
||||||
|
|
||||||
Basics
|
Basics
|
||||||
|
@ -115,10 +115,28 @@ filesystem. It decides which file:
|
||||||
A wrong decision of the reconciler leads in most cases to a loss of data. So there
|
A wrong decision of the reconciler leads in most cases to a loss of data. So there
|
||||||
are several conditions the file synchronizer has to follow.
|
are several conditions the file synchronizer has to follow.
|
||||||
|
|
||||||
Specification
|
Algorithms
|
||||||
^^^^^^^^^^^^^
|
^^^^^^^^^^
|
||||||
|
|
||||||
TODO
|
For conflict resolution several different algorithms could be implemented. The
|
||||||
|
most common algorithms are the merge and the conflict algorithm. The first
|
||||||
|
is a batch algorithm and the second is one which needs user interaction.
|
||||||
|
|
||||||
|
Merge algorithm
|
||||||
|
+++++++++++++++
|
||||||
|
|
||||||
|
The merge algorithm is an algorithm which doesn't need any user interaction. It
|
||||||
|
is simple and used for example by Microsoft for Roaming Profiles. If it detects
|
||||||
|
a conflict (the same file changed on both replicas) then it will use the most
|
||||||
|
recent file and overwrite the other. This means you can lose some data, but
|
||||||
|
normally you want the latest file.
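The "most recent file wins" rule of the merge algorithm can be sketched as follows. This is only an illustration in Python, not csync's actual implementation: the `FileState` type, the `resolve_conflict` name and the tie-breaking rule (the first replica wins on equal timestamps) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class FileState:
    """Hypothetical record describing one replica's copy of a file."""
    replica: str   # label of the replica holding this copy
    path: str      # path of the file relative to the replica root
    mtime: float   # modification time, seconds since the epoch


def resolve_conflict(a: FileState, b: FileState) -> FileState:
    """Return the copy that should overwrite the other one.

    The newer copy wins. On equal timestamps the first argument is
    kept, which is an arbitrary choice made for this sketch.
    """
    return a if a.mtime >= b.mtime else b
```

The losing copy is overwritten without asking the user, which is why this algorithm can discard changes made on the older replica.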
|
||||||
|
|
||||||
|
Conflict algorithm
|
||||||
|
++++++++++++++++++
|
||||||
|
|
||||||
|
This is not implemented yet.
|
||||||
|
|
||||||
|
If a file has a conflict the user has to decide which file should be used.
|
||||||
|
|
||||||
Propagation
|
Propagation
|
||||||
~~~~~~~~~~~
|
~~~~~~~~~~~
|
||||||
|
@ -126,28 +144,73 @@ Propagation
|
||||||
The next instance of the file synchronizer is the propagator. It uses the
|
The next instance of the file synchronizer is the propagator. It uses the
|
||||||
calculated records to apply them on the current replica.
|
calculated records to apply them on the current replica.
|
||||||
|
|
||||||
* 2-phase-copy
|
|
||||||
* merge trees and write journal
|
The propagator uses a 2-phase-commit mechanism to simulate an atomic filesystem
|
||||||
|
operation.
|
||||||
|
|
||||||
|
In the first phase we copy the file to a temporary file on the opposite
|
||||||
|
replica. This has the advantage that we can check if the file which has been copied
|
||||||
|
to the opposite replica has been transferred successfully. If the connection
|
||||||
|
gets interrupted during the transfer we still have the original state of the
|
||||||
|
file. This means no data will be lost.
|
||||||
|
In the second phase the file on the opposite replica will be overwritten by
|
||||||
|
the temporary file.
|
||||||
|
|
||||||
|
After a successful propagation we have to merge the trees to reflect the
|
||||||
|
current state of the filesystem tree. This updated tree will be written as a
|
||||||
|
journal into a database. The database is called the state database. It will be
|
||||||
|
used during the update detection of the next synchronization. See above.
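The two phases described above (copy to a temporary file, check it, then overwrite the target) can be sketched for the local-filesystem case. This is a minimal Python illustration under stated assumptions, not csync's code: the `propagate` name and the `.csync_tmp_` prefix are invented for the example, and a remote replica would go through a transfer plugin instead of `shutil`.

```python
import os
import shutil
import tempfile


def propagate(src: str, dst: str) -> None:
    """Copy src over dst in two phases so dst is never left half written."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Phase 1: copy into a temporary file next to the destination.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".csync_tmp_")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp)
        # Verify the transfer by comparing file sizes.
        if os.path.getsize(tmp) != os.path.getsize(src):
            raise OSError("size mismatch after transfer")
        # Phase 2: replace the destination with the verified copy.
        os.replace(tmp, dst)
    except BaseException:
        # On any failure the destination keeps its original content.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
```

Placing the temporary file in the destination directory keeps both names on the same filesystem, so the final `os.replace` is a rename rather than a second copy.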
|
||||||
|
|
||||||
Robustness
|
Robustness
|
||||||
~~~~~~~~~~
|
~~~~~~~~~~
|
||||||
|
|
||||||
TODO
|
This is a really important topic. The file synchronizer should not crash and if
|
||||||
|
it crashes, there should be no loss of data. To achieve this goal there are
|
||||||
|
several mechanisms. These mechanisms will be discussed in the
|
||||||
|
following sections.
|
||||||
|
|
||||||
Crash resistance
|
Crash resistance
|
||||||
^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
TODO
|
The synchronization process can be interrupted by different events, this can
|
||||||
|
be:
|
||||||
|
|
||||||
|
* the system could be halted due to errors.
|
||||||
|
* the disk could be full or the quota exceeded.
|
||||||
|
* the network or power cable could be pulled out.
|
||||||
|
* the user could force a stop of the synchronization process.
|
||||||
|
* different communication errors could occur.
|
||||||
|
|
||||||
|
To ensure that no data is lost when such an event occurs, we enforce the following
|
||||||
|
invariant:
|
||||||
|
|
||||||
|
IMPORTANT: At every moment of the synchronization each file has either its
|
||||||
|
original content or its correct final content.
|
||||||
|
|
||||||
|
So each interrupted synchronization process is a partial sync and can be
|
||||||
|
continued and completed by simply running csync again. The only problem could
|
||||||
|
be an error of the filesystem. So we reach this invariant only approximately.
|
||||||
|
|
||||||
Transfer errors
|
Transfer errors
|
||||||
^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
TODO
|
With the Two-Phase-Commit we check the file size after the file has been
|
||||||
|
transferred. So we can detect transfer errors. Better would be a transfer
|
||||||
|
protocol with checksums. This could possibly be done in the future.
|
||||||
|
|
||||||
|
Future filesystems like btrfs will help to compare checksums instead of the
|
||||||
|
filesize. This will make the synchronization itself safer.
|
||||||
|
|
||||||
Database loss
|
Database loss
|
||||||
^^^^^^^^^^^^^
|
^^^^^^^^^^^^^
|
||||||
|
|
||||||
TODO
|
It is possible that the state database gets corrupted. If this happens
|
||||||
|
all files get evaluated. In this case the file synchronizer won't delete any
|
||||||
|
file, but it could occur that deleted files will be restored from the other
|
||||||
|
replica.
|
||||||
|
To prevent a corruption or loss of the database if an error occurs or the user
|
||||||
|
forces an abort, the synchronizer is working on a copy of the database and will
|
||||||
|
use a 2-Phase-Commit to save it at the end.
|
||||||
|
|
||||||
Getting started
|
Getting started
|
||||||
---------------
|
---------------
|
||||||
|
@ -160,17 +223,33 @@ procedures. Packagers take a look at <<X90, Appendix B: Packager Notes>>.
|
||||||
|
|
||||||
Using the commandline client
|
Using the commandline client
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
TODO
|
|
||||||
csync /home/csync sftp://TODO:secret@server:port/profile/TODO
|
The synopsis of the commandline client is
|
||||||
|
|
||||||
|
csync [OPTION...] SOURCE DESTINATION
|
||||||
|
|
||||||
|
It synchronizes the content of SOURCE with DESTINATION and vice versa. The
|
||||||
|
DESTINATION can be a local directory or a remote file server.
|
||||||
|
|
||||||
|
csync /home/csync scheme://user:password@server:port/full/path
|
||||||
|
|
||||||
|
The remote destination is supported by plugins. By default csync ships with smb
|
||||||
|
and sftp support. For more information, see the manpage of csync(1).
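The parts of a remote DESTINATION of the form `scheme://user:password@server:port/full/path` can be illustrated with Python's standard `urllib.parse`. This only shows how such a URI decomposes; it says nothing about how csync itself parses it, and the `describe_destination` helper is a name made up for this example.

```python
from urllib.parse import urlsplit


def describe_destination(uri: str) -> dict:
    """Split a destination URI into the parts named in the synopsis."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,      # selects the plugin, e.g. smb or sftp
        "user": parts.username,
        "password": parts.password,
        "server": parts.hostname,
        "port": parts.port,          # None when no port is given
        "path": parts.path,
    }
```

For a local directory such as `/home/csync` the scheme is empty and only the path is set.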
|
||||||
|
|
||||||
The PAM module
|
The PAM module
|
||||||
~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~
|
||||||
TODO
|
|
||||||
|
pam_csync is a PAM module to provide roaming home directories for a user
|
||||||
|
session. This module is aimed at environments with central file servers where a user
|
||||||
|
wishes to store his home directory. The Authentication Module verifies the
|
||||||
|
identity of a user and triggers a synchronization with the server on the first
|
||||||
|
login and the last logout. More information can be found in the manpage of the
|
||||||
|
module pam_csync(8).
|
||||||
|
|
||||||
|
|
||||||
[[X90]]
|
[[X90]]
|
||||||
Appendix A: Packager Notes
|
Appendix A: Packager Notes
|
||||||
--------------------------
|
--------------------------
|
||||||
|
|
||||||
Read the `README` and `INSTALL` files (in the distribution root
|
Read the `README`, `INSTALL` and `FAQ` files (in the distribution root
|
||||||
directory).
|
directory).
|
||||||
|
|
|
@ -437,12 +437,12 @@ is a user-level program which means you don't need to be a superuser.</p></div>
|
||||||
<div class="sectionbody">
|
<div class="sectionbody">
|
||||||
<div class="para"><p>It is often the case that we have multiple copies (called replicas) of a
|
<div class="para"><p>It is often the case that we have multiple copies (called replicas) of a
|
||||||
filesystem or part of a filesystem (for example on a notebook and on a desktop
|
filesystem or part of a filesystem (for example on a notebook and on a desktop
|
||||||
computer). Changes to each replica are often made independently and as a
|
computer). Changes to each replica are often made independently and as a
|
||||||
result they do not contain the same information. In that case a file
|
result they do not contain the same information. In that case a file
|
||||||
synchronizer is used to make them consistent again, without losing any
|
synchronizer is used to make them consistent again, without losing any
|
||||||
information.</p></div>
|
information.</p></div>
|
||||||
<div class="para"><p>The goal is to detect conflicting <a href="#X13">updates</a> (files which have been
|
<div class="para"><p>The goal is to detect conflicting <a href="#X13">updates</a> (files which have been
|
||||||
modified) and propagate non-conflicting updates to each replica. If there
|
modified) and propagate non-conflicting updates to each replica. If there
|
||||||
are no conflicts left we are done and the replicas are identical.</p></div>
|
are no conflicts left we are done and the replicas are identical.</p></div>
|
||||||
</div>
|
</div>
|
||||||
<h2 id="_basics">2. Basics</h2>
|
<h2 id="_basics">2. Basics</h2>
|
||||||
|
@ -566,31 +566,98 @@ or gets <strong>deleted</strong>
|
||||||
</ul></div>
|
</ul></div>
|
||||||
<div class="para"><p>A wrong decision of the reconciler leads in most cases to a loss of data. So there
|
<div class="para"><p>A wrong decision of the reconciler leads in most cases to a loss of data. So there
|
||||||
are several conditions the file synchronizer has to follow.</p></div>
|
are several conditions the file synchronizer has to follow.</p></div>
|
||||||
<h4 id="_specification">3.2.1. Specification</h4>
|
<h4 id="_algorithms">3.2.1. Algorithms</h4>
|
||||||
<div class="para"><p>TODO</p></div>
|
<div class="para"><p>For conflict resolution several different algorithms could be implemented. The
|
||||||
|
most common algorithms are the merge and the conflict algorithm. The first
|
||||||
|
is a batch algorithm and the second is one which needs user interaction.</p></div>
|
||||||
|
<h5 id="_merge_algorithm">Merge algorithm</h5>
|
||||||
|
<div class="para"><p>The merge algorithm is an algorithm which doesn't need any user interaction. It
|
||||||
|
is simple and used for example by Microsoft for Roaming Profiles. If it detects
|
||||||
|
a conflict (the same file changed on both replicas) then it will use the most
|
||||||
|
recent file and overwrite the other. This means you can lose some data, but
|
||||||
|
normally you want the latest file.</p></div>
|
||||||
|
<h5 id="_conflict_algorithm">Conflict algorithm</h5>
|
||||||
|
<div class="para"><p>This is not implemented yet.</p></div>
|
||||||
|
<div class="para"><p>If a file has a conflict the user has to decide which file should be used.</p></div>
|
||||||
<h3 id="_propagation">3.3. Propagation</h3><div style="clear:left"></div>
|
<h3 id="_propagation">3.3. Propagation</h3><div style="clear:left"></div>
|
||||||
<div class="para"><p>The next instance of the file synchronizer is the propagator. It uses the
|
<div class="para"><p>The next instance of the file synchronizer is the propagator. It uses the
|
||||||
calculated records to apply them on the current replica.</p></div>
|
calculated records to apply them on the current replica.</p></div>
|
||||||
|
<div class="para"><p>The propagator uses a 2-phase-commit mechanism to simulate an atomic filesystem
|
||||||
|
operation.</p></div>
|
||||||
|
<div class="para"><p>In the first phase we copy the file to a temporary file on the opposite
|
||||||
|
replica. This has the advantage that we can check if the file which has been copied
|
||||||
|
to the opposite replica has been transferred successfully. If the connection
|
||||||
|
gets interrupted during the transfer we still have the original state of the
|
||||||
|
file. This means no data will be lost.
|
||||||
|
In the second phase the file on the opposite replica will be overwritten by
|
||||||
|
the temporary file.</p></div>
|
||||||
|
<div class="para"><p>After a successful propagation we have to merge the trees to reflect the
|
||||||
|
current state of the filesystem tree. This updated tree will be written as a
|
||||||
|
journal into a database. The database is called the state database. It will be
|
||||||
|
used during the update detection of the next synchronization. See above.</p></div>
|
||||||
|
<h3 id="_robustness">3.4. Robustness</h3><div style="clear:left"></div>
|
||||||
|
<div class="para"><p>This is a really important topic. The file synchronizer should not crash and if
|
||||||
|
it crashes, there should be no loss of data. To achieve this goal there are
|
||||||
|
several mechanisms. These mechanisms will be discussed in the
|
||||||
|
following sections.</p></div>
|
||||||
|
<h4 id="_crash_resistance">3.4.1. Crash resistance</h4>
|
||||||
|
<div class="para"><p>The synchronization process can be interrupted by different events, this can
|
||||||
|
be:</p></div>
|
||||||
<div class="ilist"><ul>
|
<div class="ilist"><ul>
|
||||||
<li>
|
<li>
|
||||||
<p>
|
<p>
|
||||||
2-phase-copy
|
the system could be halted due to errors.
|
||||||
</p>
|
</p>
|
||||||
</li>
|
</li>
|
||||||
<li>
|
<li>
|
||||||
<p>
|
<p>
|
||||||
merge trees and write journal
|
the disk could be full or the quota exceeded.
|
||||||
|
</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>
|
||||||
|
the network or power cable could be pulled out.
|
||||||
|
</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>
|
||||||
|
the user could force a stop of the synchronization process.
|
||||||
|
</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>
|
||||||
|
different communication errors could occur.
|
||||||
</p>
|
</p>
|
||||||
</li>
|
</li>
|
||||||
</ul></div>
|
</ul></div>
|
||||||
<h3 id="_robustness">3.4. Robustness</h3><div style="clear:left"></div>
|
<div class="para"><p>To ensure that no data is lost when such an event occurs, we enforce the following
|
||||||
<div class="para"><p>TODO</p></div>
|
invariant:</p></div>
|
||||||
<h4 id="_crash_resistance">3.4.1. Crash resistance</h4>
|
<div class="admonitionblock">
|
||||||
<div class="para"><p>TODO</p></div>
|
<table><tr>
|
||||||
|
<td class="icon">
|
||||||
|
<img src="./images/icons/important.png" alt="Important" />
|
||||||
|
</td>
|
||||||
|
<td class="content">At every moment of the synchronization each file has either its
|
||||||
|
original content or its correct final content.</td>
|
||||||
|
</tr></table>
|
||||||
|
</div>
|
||||||
|
<div class="para"><p>So each interrupted synchronization process is a partial sync and can be
|
||||||
|
continued and completed by simply running csync again. The only problem could
|
||||||
|
be an error of the filesystem. So we reach this invariant only approximately.</p></div>
|
||||||
<h4 id="_transfer_errors">3.4.2. Transfer errors</h4>
|
<h4 id="_transfer_errors">3.4.2. Transfer errors</h4>
|
||||||
<div class="para"><p>TODO</p></div>
|
<div class="para"><p>With the Two-Phase-Commit we check the file size after the file has been
|
||||||
|
transferred. So we can detect transfer errors. Better would be a transfer
|
||||||
|
protocol with checksums. This could possibly be done in the future.</p></div>
|
||||||
|
<div class="para"><p>Future filesystems like btrfs will help to compare checksums instead of the
|
||||||
|
filesize. This will make the synchronization itself safer.</p></div>
|
||||||
<h4 id="_database_loss">3.4.3. Database loss</h4>
|
<h4 id="_database_loss">3.4.3. Database loss</h4>
|
||||||
<div class="para"><p>TODO</p></div>
|
<div class="para"><p>It is possible that the state database gets corrupted. If this happens
|
||||||
|
all files get evaluated. In this case the file synchronizer won't delete any
|
||||||
|
file, but it could occur that deleted files will be restored from the other
|
||||||
|
replica.
|
||||||
|
To prevent a corruption or loss of the database if an error occurs or the user
|
||||||
|
forces an abort, the synchronizer is working on a copy of the database and will
|
||||||
|
use a 2-Phase-Commit to save it at the end.</p></div>
|
||||||
</div>
|
</div>
|
||||||
<h2 id="_getting_started">4. Getting started</h2>
|
<h2 id="_getting_started">4. Getting started</h2>
|
||||||
<div class="sectionbody">
|
<div class="sectionbody">
|
||||||
|
@ -598,19 +665,35 @@ merge trees and write journal
|
||||||
<div class="para"><p>See the <tt>README</tt> and <tt>INSTALL</tt> files for install prerequisites and
|
<div class="para"><p>See the <tt>README</tt> and <tt>INSTALL</tt> files for install prerequisites and
|
||||||
procedures. Packagers take a look at <a href="#X90">Appendix B: Packager Notes</a>.</p></div>
|
procedures. Packagers take a look at <a href="#X90">Appendix B: Packager Notes</a>.</p></div>
|
||||||
<h3 id="_using_the_commandline_client">4.2. Using the commandline client</h3><div style="clear:left"></div>
|
<h3 id="_using_the_commandline_client">4.2. Using the commandline client</h3><div style="clear:left"></div>
|
||||||
<div class="para"><p>TODO
|
<div class="para"><p>The synopsis of the commandline client is</p></div>
|
||||||
csync /home/csync sftp://TODO:secret@server:port/profile/TODO</p></div>
|
<div class="literalblock">
|
||||||
|
<div class="content">
|
||||||
|
<pre><tt>csync [OPTION...] SOURCE DESTINATION</tt></pre>
|
||||||
|
</div></div>
|
||||||
|
<div class="para"><p>It synchronizes the content of SOURCE with DESTINATION and vice versa. The
|
||||||
|
DESTINATION can be a local directory or a remote file server.</p></div>
|
||||||
|
<div class="literalblock">
|
||||||
|
<div class="content">
|
||||||
|
<pre><tt>csync /home/csync scheme://user:password@server:port/full/path</tt></pre>
|
||||||
|
</div></div>
|
||||||
|
<div class="para"><p>The remote destination is supported by plugins. By default csync ships with smb
|
||||||
|
and sftp support. For more information, see the manpage of <tt>csync(1)</tt>.</p></div>
|
||||||
<h3 id="_the_pam_module">4.3. The PAM module</h3><div style="clear:left"></div>
|
<h3 id="_the_pam_module">4.3. The PAM module</h3><div style="clear:left"></div>
|
||||||
<div class="para"><p>TODO</p></div>
|
<div class="para"><p>pam_csync is a PAM module to provide roaming home directories for a user
|
||||||
|
session. This module is aimed at environments with central file servers where a user
|
||||||
|
wishes to store his home directory. The Authentication Module verifies the
|
||||||
|
identity of a user and triggers a synchronization with the server on the first
|
||||||
|
login and the last logout. More information can be found in the manpage of the
|
||||||
|
module pam_csync(8).</p></div>
|
||||||
</div>
|
</div>
|
||||||
<h2 id="X90">5. Appendix A: Packager Notes</h2>
|
<h2 id="X90">5. Appendix A: Packager Notes</h2>
|
||||||
<div class="sectionbody">
|
<div class="sectionbody">
|
||||||
<div class="para"><p>Read the <tt>README</tt> and <tt>INSTALL</tt> files (in the distribution root
|
<div class="para"><p>Read the <tt>README</tt>, <tt>INSTALL</tt> and <tt>FAQ</tt> files (in the distribution root
|
||||||
directory).</p></div>
|
directory).</p></div>
|
||||||
</div>
|
</div>
|
||||||
<div id="footer">
|
<div id="footer">
|
||||||
<div id="footer-text">
|
<div id="footer-text">
|
||||||
Last updated 2008-11-20 12:16:02 CEST
|
Last updated 2008-12-17 15:38:27 CEST
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</body>
|
</body>
|
||||||
|
|