Add more text to the userguide.

2024-11-28 11:48:56 +03:00 · 2008-12-17 18:23:32 +01:00 · 2008-12-17 18:23:32 +01:00 · 0f6a55bb23
commit 0f6a55bb23
parent 2a9ac9a91a
2 changed files with 194 additions and 32 deletions
--- a/doc/csync.txt
+++ b/doc/csync.txt
@@ -13,13 +13,13 @@ Introduction
 It is often the case that we have multiple copies (called replicas) of a
 filesystem or part of a filesystem (for example on a notebook and on a desktop
-    computer). Changes to each replica are often made independently and as a
+computer). Changes to each replica are often made independently and as a
 result they do not contain the same information. In that case a file
 synchronizer is used to make them consistent again, without losing any
 information.
 The goal is to detect conflicting <<X13, updates>> (files which have been
-    modified) and propagate non-conflicting updates to each replica. If there
+modified) and propagate non-conflicting updates to each replica. If there
 are no conflicts left we are done and the replicas are identical.
 Basics
@@ -115,10 +115,28 @@ filesystem. It decides which file:
 A wrong decision of the reconciler leads in most cases to a loss of data. So there
 are several conditions the file synchronizer has to follow.
-Specification
+Algorithms
-^^^^^^^^^^^^^
+^^^^^^^^^^
-TODO
+For conflict resolution several different algorithms could be implemented. The
 most common algorithms are the merge and the conflict algorithm. The first
 is a batch algorithm and the second is one which needs user interaction.
 Merge algorithm
 +++++++++++++++
 The merge algorithm is an algorithm which doesn't need any user interaction. It
 is simple and used for example by Microsoft for Roaming Profiles. If it detects
 a conflict (the same file changed on both replicas) then it will use the most
 recent file and overwrite the other. This means you can lose some data, but
 normally you want the latest file.
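
The "most recent file wins" rule can be sketched as follows. This is a minimal illustration, not csync's code: `FileState` and `resolve_newest_wins` are hypothetical names, and keeping the local copy when both timestamps are equal is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    """Hypothetical record of one file on one replica (not a csync type)."""
    path: str
    mtime: float  # modification time in seconds since the epoch


def resolve_newest_wins(local: FileState, remote: FileState) -> str:
    """Return which replica's copy should overwrite the other one."""
    # Assumption of this sketch: on equal timestamps the local copy is kept.
    return "remote" if remote.mtime > local.mtime else "local"


winner = resolve_newest_wins(
    FileState("notes.txt", mtime=100.0),
    FileState("notes.txt", mtime=250.0),
)
print(winner)
```

The older copy is overwritten without asking, which is exactly where the possible data loss mentioned above comes from.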
 Conflict algorithm
 ++++++++++++++++++
 This is not implemented yet.
 If a file has a conflict the user has to decide which file should be used.
 Propagation
 ~~~~~~~~~~~
@@ -126,28 +144,73 @@ Propagation
 The next instance of the file synchronizer is the propagator. It uses the
 calculated records to apply them on the current replica.
-* 2-phase-copy
+
-* merge trees and write journal
+The propagator uses a 2-phase-commit mechanism to simulate an atomic filesystem
 operation.
 In the first phase we copy the file to a temporary file on the opposite
 replica. This has the advantage that we can check if the file which has been
 copied to the opposite replica has been transferred successfully. If the connection
 gets interrupted during the transfer we still have the original state of the
 file. This means no data will be lost.
 In the second phase the file on the opposite replica will be overwritten by
 the temporary file.
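
The two phases described above can be sketched for two local paths as follows. This is only an illustration under stated assumptions: the function name `propagate` and the `.sync-tmp-` prefix are hypothetical, and a real remote replica would be reached through a plugin rather than `shutil`.

```python
import os
import shutil
import tempfile


def propagate(src: str, dst: str) -> None:
    """Copy src over dst in two phases so dst is never left half-written."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Phase 1: copy into a temporary file next to the destination.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".sync-tmp-")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp)
        # Check the transfer by comparing file sizes.
        if os.path.getsize(tmp) != os.path.getsize(src):
            raise OSError("size mismatch after transfer")
        # Phase 2: replace the destination with the temporary file.
        os.replace(tmp, dst)
    except BaseException:
        # On any failure dst still has its original content;
        # only the partial temporary copy is removed.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Because the destination is only touched by the final replace step, an interruption during the copy leaves the original file intact.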
 After a successful propagation we have to merge the trees to reflect the
 current state of the filesystem tree. This updated tree will be written as a
 journal into a database. The database is called the state database. It will be
 used during the update detection of the next synchronization. See above.
 Robustness
 ~~~~~~~~~~
-TODO
+This is a really important topic. The file synchronizer should not crash and if
 it crashes, there should be no loss of data. To achieve this goal there are
 several mechanisms in place. These mechanisms will be discussed in the
 following sections.
 Crash resistance
 ^^^^^^^^^^^^^^^^
-TODO
+The synchronization process can be interrupted by different events, for
 example:
 * the system could be halted due to errors.
 * the disk could be full or the quota exceeded.
 * the network or power cable could be pulled out.
 * the user could force a stop of the synchronization process.
 * different communication errors could occur.
 To ensure that no data is lost when such an event occurs, we enforce the following
 invariant:
 IMPORTANT: At every moment of the synchronization each file has either its
 original content or its correct final content.
 So each interrupted synchronization process is a partial sync and can be
 continued and completed by simply running csync again. The only problem could
 be an error of the filesystem. So we reach this invariant only approximately.
 Transfer errors
 ^^^^^^^^^^^^^^^
-TODO
+With the Two-Phase-Commit we check the file size after the file has been
 transferred. So we can detect transfer errors. Better would be a transfer
 protocol with checksums. This could possibly be done in the future.
 Future filesystems like btrfs will help to compare checksums instead of the
 filesize. This will make the synchronization itself safer.
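
To illustrate why a checksum is a stronger check than the file size, here is a small sketch. It is not part of csync: `file_digest` and `transfer_ok` are hypothetical helpers, and SHA-256 is chosen here only as an example hash.

```python
import hashlib


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def transfer_ok(src: str, copy: str) -> bool:
    """Compare contents by digest instead of comparing only the size."""
    return file_digest(src) == file_digest(copy)
```

Two files of identical size but different content pass a size check and fail this one.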
 Database loss
 ^^^^^^^^^^^^^
-TODO
+It is possible that the state database gets corrupted. If this happens
 all files get evaluated. In this case the file synchronizer won't delete any
 file, but it could occur that deleted files will be restored from the other
 replica.
 To prevent a corruption or loss of the database if an error occurs or the user
 forces an abort, the synchronizer is working on a copy of the database and will
 use a 2-Phase-Commit to save it at the end.
 Getting started
 ---------------
@@ -160,17 +223,33 @@ procedures. Packagers take a look at <<X90, Appendix B: Packager Notes>>.
 Using the commandline client
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-TODO
+
-csync /home/csync sftp://TODO:secret@server:port/profile/TODO
+The synopsis of the commandline client is
  csync [OPTION...] SOURCE DESTINATION
 It synchronizes the content of SOURCE with DESTINATION and vice versa. The
 DESTINATION can be a local directory or a remote file server.
  csync /home/csync scheme://user:password@server:port/full/path
 The remote destination is supported by plugins. By default csync ships with smb
 and sftp support. For more information, see the manpage of csync(1).
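
The parts of a remote DESTINATION can be taken apart as in the sketch below. It uses Python's standard `urllib.parse` purely for illustration and says nothing about how csync itself parses the argument; `describe_destination` is a hypothetical helper, and the host, user, and port in the example are placeholders.

```python
from urllib.parse import urlsplit


def describe_destination(dest: str) -> dict:
    """Split a DESTINATION argument into its parts (illustrative only)."""
    parts = urlsplit(dest)
    if not parts.scheme:
        # No scheme: treat the argument as a local directory.
        return {"kind": "local", "path": dest}
    return {
        "kind": "remote",
        "scheme": parts.scheme,   # selects the plugin, e.g. smb or sftp
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,       # None when no port is given
        "path": parts.path,
    }


print(describe_destination("sftp://user:secret@server:22/full/path"))
print(describe_destination("/home/csync"))
```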
 The PAM module
 ~~~~~~~~~~~~~~
-TODO
+
 pam_csync is a PAM module to provide roaming home directories for a user
 session. This module is aimed at environments with central file servers where a user
 wishes to store his home directory. The Authentication Module verifies the
 identity of a user and triggers a synchronization with the server on the first
 login and the last logout. More information can be found in the manpage of the
 module pam_csync(8).
 [[X90]]
 Appendix A: Packager Notes
 --------------------------
-Read the `README` and `INSTALL` files (in the distribution root
+Read the `README`, `INSTALL` and `FAQ` files (in the distribution root
 directory).
--- a/doc/userguide/csync.html
+++ b/doc/userguide/csync.html
@@ -437,12 +437,12 @@ is a user-level program which means you don't need to be a superuser.</p></div>
 <div class="sectionbody">
 <div class="para"><p>It is often the case that we have multiple copies (called replicas) of a
 filesystem or part of a filesystem (for example on a notebook and on a desktop
-    computer). Changes to each replica are often made independently and as a
+computer). Changes to each replica are often made independently and as a
 result they do not contain the same information. In that case a file
 synchronizer is used to make them consistent again, without losing any
 information.</p></div>
 <div class="para"><p>The goal is to detect conflicting <a href="#X13">updates</a> (files which have been
-    modified) and propagate non-conflicting updates to each replica. If there
+modified) and propagate non-conflicting updates to each replica. If there
 are no conflicts left we are done and the replicas are identical.</p></div>
 </div>
 <h2 id="_basics">2. Basics</h2>
@@ -566,31 +566,98 @@ or gets <strong>deleted</strong>
 </ul></div>
 <div class="para"><p>A wrong decision of the reconciler leads in most cases to a loss of data. So there
 are several conditions the file synchronizer has to follow.</p></div>
-<h4 id="_specification">3.2.1. Specification</h4>
+<h4 id="_algorithms">3.2.1. Algorithms</h4>
-<div class="para"><p>TODO</p></div>
+<div class="para"><p>For conflict resolution several different algorithms could be implemented. The
 most common algorithms are the merge and the conflict algorithm. The first
 is a batch algorithm and the second is one which needs user interaction.</p></div>
 <h5 id="_merge_algorithm">Merge algorithm</h5>
 <div class="para"><p>The merge algorithm is an algorithm which doesn't need any user interaction. It
 is simple and used for example by Microsoft for Roaming Profiles. If it detects
 a conflict (the same file changed on both replicas) then it will use the most
 recent file and overwrite the other. This means you can lose some data, but
 normally you want the latest file.</p></div>
 <h5 id="_conflict_algorithm">Conflict algorithm</h5>
 <div class="para"><p>This is not implemented yet.</p></div>
 <div class="para"><p>If a file has a conflict the user has to decide which file should be used.</p></div>
 <h3 id="_propagation">3.3. Propagation</h3><div style="clear:left"></div>
 <div class="para"><p>The next instance of the file synchronizer is the propagator. It uses the
 calculated records to apply them on the current replica.</p></div>
 <div class="para"><p>The propagator uses a 2-phase-commit mechanism to simulate an atomic filesystem
 operation.</p></div>
 <div class="para"><p>In the first phase we copy the file to a temporary file on the opposite
 replica. This has the advantage that we can check if the file which has been
 copied to the opposite replica has been transferred successfully. If the connection
 gets interrupted during the transfer we still have the original state of the
 file. This means no data will be lost.
 In the second phase the file on the opposite replica will be overwritten by
 the temporary file.</p></div>
 <div class="para"><p>After a successful propagation we have to merge the trees to reflect the
 current state of the filesystem tree. This updated tree will be written as a
 journal into a database. The database is called the state database. It will be
 used during the update detection of the next synchronization. See above.</p></div>
 <h3 id="_robustness">3.4. Robustness</h3><div style="clear:left"></div>
 <div class="para"><p>This is a really important topic. The file synchronizer should not crash and if
 it crashes, there should be no loss of data. To achieve this goal there are
 several mechanisms in place. These mechanisms will be discussed in the
 following sections.</p></div>
 <h4 id="_crash_resistance">3.4.1. Crash resistance</h4>
 <div class="para"><p>The synchronization process can be interrupted by different events, for
 example:</p></div>
 <div class="ilist"><ul>
 <li>
 <p>
-2-phase-copy
+the system could be halted due to errors.
 </p>
 </li>
 <li>
 <p>
-merge trees and write journal
+the disk could be full or the quota exceeded.
 </p>
 </li>
 <li>
 <p>
 the network or power cable could be pulled out.
 </p>
 </li>
 <li>
 <p>
 the user could force a stop of the synchronization process.
 </p>
 </li>
 <li>
 <p>
 different communication errors could occur.
 </p>
 </li>
 </ul></div>
-<h3 id="_robustness">3.4. Robustness</h3><div style="clear:left"></div>
+<div class="para"><p>To ensure that no data is lost when such an event occurs, we enforce the following
-<div class="para"><p>TODO</p></div>
+invariant:</p></div>
-<h4 id="_crash_resistance">3.4.1. Crash resistance</h4>
+<div class="admonitionblock">
-<div class="para"><p>TODO</p></div>
+<table><tr>
 <td class="icon">
 <img src="./images/icons/important.png" alt="Important" />
 </td>
 <td class="content">At every moment of the synchronization each file has either its
 original content or its correct final content.</td>
 </tr></table>
 </div>
 <div class="para"><p>So each interrupted synchronization process is a partial sync and can be
 continued and completed by simply running csync again. The only problem could
 be an error of the filesystem. So we reach this invariant only approximately.</p></div>
 <h4 id="_transfer_errors">3.4.2. Transfer errors</h4>
-<div class="para"><p>TODO</p></div>
+<div class="para"><p>With the Two-Phase-Commit we check the file size after the file has been
 transferred. So we can detect transfer errors. Better would be a transfer
 protocol with checksums. This could possibly be done in the future.</p></div>
 <div class="para"><p>Future filesystems like btrfs will help to compare checksums instead of the
 filesize. This will make the synchronization itself safer.</p></div>
 <h4 id="_database_loss">3.4.3. Database loss</h4>
-<div class="para"><p>TODO</p></div>
+<div class="para"><p>It is possible that the state database gets corrupted. If this happens
 all files get evaluated. In this case the file synchronizer won't delete any
 file, but it could occur that deleted files will be restored from the other
 replica.
 To prevent a corruption or loss of the database if an error occurs or the user
 forces an abort, the synchronizer is working on a copy of the database and will
 use a 2-Phase-Commit to save it at the end.</p></div>
 </div>
 <h2 id="_getting_started">4. Getting started</h2>
 <div class="sectionbody">
@@ -598,19 +665,35 @@ merge trees and write journal
 <div class="para"><p>See the <tt>README</tt> and <tt>INSTALL</tt> files for install prerequisites and
 procedures. Packagers take a look at <a href="#X90">Appendix B: Packager Notes</a>.</p></div>
 <h3 id="_using_the_commandline_client">4.2. Using the commandline client</h3><div style="clear:left"></div>
-<div class="para"><p>TODO
+<div class="para"><p>The synopsis of the commandline client is</p></div>
-csync /home/csync sftp://TODO:secret@server:port/profile/TODO</p></div>
+<div class="literalblock">
 <div class="content">
 <pre><tt>csync [OPTION...] SOURCE DESTINATION</tt></pre>
 </div></div>
 <div class="para"><p>It synchronizes the content of SOURCE with DESTINATION and vice versa. The
 DESTINATION can be a local directory or a remote file server.</p></div>
 <div class="literalblock">
 <div class="content">
 <pre><tt>csync /home/csync scheme://user:password@server:port/full/path</tt></pre>
 </div></div>
 <div class="para"><p>The remote destination is supported by plugins. By default csync ships with smb
 and sftp support. For more information, see the manpage of <tt>csync(1)</tt>.</p></div>
 <h3 id="_the_pam_module">4.3. The PAM module</h3><div style="clear:left"></div>
-<div class="para"><p>TODO</p></div>
+<div class="para"><p>pam_csync is a PAM module to provide roaming home directories for a user
 session. This module is aimed at environments with central file servers where a user
 wishes to store his home directory. The Authentication Module verifies the
 identity of a user and triggers a synchronization with the server on the first
 login and the last logout. More information can be found in the manpage of the
 module pam_csync(8).</p></div>
 </div>
 <h2 id="X90">5. Appendix A: Packager Notes</h2>
 <div class="sectionbody">
-<div class="para"><p>Read the <tt>README</tt> and <tt>INSTALL</tt> files (in the distribution root
+<div class="para"><p>Read the <tt>README</tt>, <tt>INSTALL</tt> and <tt>FAQ</tt> files (in the distribution root
 directory).</p></div>
 </div>
 <div id="footer">
 <div id="footer-text">
-Last updated 2008-11-20 12:16:02 CEST
+Last updated 2008-12-17 15:38:27 CEST
 </div>
 </div>
 </body>