Linux.com

Feature: Security

Efficient rsyncrypto hides remote sync data

By Ben Martin on February 01, 2008 (9:00:00 AM)

The rsync utility is smart enough to send only the bytes of a changed file that a remote system needs to make its copy identical to the local file. When that information is sensitive, using rsync over SSH protects files while in transit. To protect the files when they are on the server, you might first encrypt them with GPG. But the manner in which GPG encrypts slightly changed files foils rsync's efficiency. rsyncrypto allows you to encrypt your files while still allowing you to leverage the speed of rsync.

One of the aims of encryption is to make any change to the unencrypted file completely alter how the encrypted file appears. As a result, somebody who has access to a series of encrypted files learns little about how the small changes you make to the unencrypted file affect the encrypted file over time. The downside is that such security causes the encryption program to modify most of the encrypted file. If you then use rsync to copy such a file to the remote server, it has to send almost the complete file each time.
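The effect is easy to reproduce with a small experiment. The following is a minimal sketch, not a description of how GnuPG or rsyncrypto work internally: it uses AES-128-CBC from the openssl command-line tool with a fixed, made-up key and IV as a stand-in for conventional chained encryption, changes one byte at the start of the plaintext, and counts how many ciphertext bytes differ.

```shell
#!/bin/sh
# Illustration only: AES-128-CBC with a fixed demo key and IV stands in for
# conventional chained encryption. The key, IV, and file names are made up.
command -v openssl >/dev/null 2>&1 || { echo "openssl not found; skipping demo"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

key=00112233445566778899aabbccddeeff
iv=000102030405060708090a0b0c0d0e0f

# 64KB of sample plaintext (all zero bytes, so the later edit is a real change).
dd if=/dev/zero of="$work/plain" bs=1024 count=64 2>/dev/null
openssl enc -aes-128-cbc -K "$key" -iv "$iv" -in "$work/plain" -out "$work/enc1"

# Overwrite only the first byte of the plaintext, then encrypt again.
printf 'X' | dd of="$work/plain" bs=1 count=1 conv=notrunc 2>/dev/null
openssl enc -aes-128-cbc -K "$key" -iv "$iv" -in "$work/plain" -out "$work/enc2"

# cmp -l prints one line per differing byte.
total=$(wc -c < "$work/enc1")
changed=$(cmp -l "$work/enc1" "$work/enc2" | wc -l)
echo "ciphertext bytes that differ: $changed of $total"
```

Because each ciphertext block in this mode feeds into the next, a change in the first block ripples through the rest of the output, which is exactly the property that leaves rsync with almost nothing it can reuse.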

The goal of rsyncrypto is to encrypt files in such a manner that only a slight and controlled amount of security is sacrificed so that rsync can send the encrypted files much more quickly. It aims to leak no more than 20 bits of aggregated information per 8KB of plaintext file.
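To put that figure in perspective, here is a rough back-of-the-envelope calculation. It assumes that the bound scales linearly with file size, that 8KB means 8192 bytes, and that a partial trailing block counts as a whole one; none of these assumptions comes from the rsyncrypto documentation, and leak_budget_bits is a hypothetical helper name.

```shell
#!/bin/sh
# leak_budget_bits SIZE_BYTES: upper bound implied by "20 bits per 8KB",
# under the assumptions stated above (linear scaling, 8192-byte blocks).
leak_budget_bits() {
    size=$1
    blocks=$(( (size + 8191) / 8192 ))   # round up to whole 8KB blocks
    echo $(( blocks * 20 ))
}

# A 512000-byte file spans 63 blocks.
budget=$(leak_budget_bits 512000)
echo "leak budget: $budget bits"   # prints "leak budget: 1260 bits"
```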

For an example of an information leak, suppose you have an XML file and you use rsyncrypto to copy the file to a remote host. Then you change a single XML attribute and use rsyncrypto to copy the updates across. Now suppose an attacker captured the encrypted versions in transit, and thus has copies of both the encrypted file before the change and after the change. The first thing they learn is that only the first 8KB of the file changed, because that is all that was sent the second time. If they can speculate what sort of file the unencrypted file was (for example, an XML file) then they can try to use that guess in an attempt to recover information.

Rsyncrypto encrypts parts of the file independently, thus keeping any changes you make to a single block of the file local to that block in the encrypted version. If you're protecting a collection of personal files from a possible remote system compromise, such a tradeoff in security might be acceptable. On the other hand, if you cannot allow any information leaks, then you'll have to accept that the whole encrypted file will change radically each time you change the unencrypted file. If that's the case, using rsync on GnuPG-encrypted files might suit your needs.
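One way to see this locality for yourself is to count how many blocks differ between two versions of a file, for example two successive encrypted outputs. The sketch below assumes fixed, aligned 8192-byte blocks purely for illustration; rsyncrypto's real block boundaries may not line up with that. The helper name changed_blocks and the sample file names are hypothetical.

```shell
#!/bin/sh
# changed_blocks OLD NEW [BLOCKSIZE]: print how many fixed-size blocks differ
# between two equally sized files. Defaults to 8192-byte blocks.
changed_blocks() {
    old=$1 new=$2 bs=${3:-8192}
    # cmp -l lists the 1-based offset of every differing byte; map each
    # offset to a block index and count the distinct blocks.
    cmp -l "$old" "$new" | awk -v bs="$bs" '
        { blk = int(($1 - 1) / bs); if (!(blk in seen)) { seen[blk] = 1; n++ } }
        END { print n + 0 }'
}

# Demo: a 500KB file, then a copy with 6KB overwritten at an offset of 17KB.
work=$(mktemp -d)
dd if=/dev/zero of="$work/v1" bs=1024 count=500 2>/dev/null
cp "$work/v1" "$work/v2"
dd if=/dev/zero bs=1024 count=6 2>/dev/null | tr '\0' 'A' |
    dd of="$work/v2" bs=1024 seek=17 conv=notrunc 2>/dev/null

n=$(changed_blocks "$work/v1" "$work/v2")
echo "$n"   # prints 1: bytes 17408-23551 all fall inside the third 8KB block
rm -rf "$work"
```

Run against two conventionally encrypted versions of the same file, the count would be close to the total number of blocks instead.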

On a Fedora 8 machine, you have to download both rsyncrypto and the dependency argtable2 and install them using the standard ./configure; make; make install combination, starting with argtable2.

rsyncrypto is designed to be used as a presync option to rsync. That is, you first use rsyncrypto on the plain unencrypted files to obtain an encrypted directory tree, and you then run rsync to send that encrypted tree to a remote system. The following command syntax shows a template for directory encryption and decryption:

# to encrypt
rsyncrypto -r srcdir /tmp/encrypted srcdir.keys mykey.crt
# to decrypt
rsyncrypto -d -r /tmp/encrypted srcdir srcdir.keys mykey.crt

The keys and certificates referenced in these commands are generated by OpenSSL, as we'll see in a moment. In the commands, the srcdir is encrypted and sent to /tmp/encrypted with the individual keys used to encrypt each file in srcdir saved into srcdir.keys. The mykey.crt is a certificate that is used to protect all the keys in srcdir.keys. If you still have all the keys, you can use your certificate in the decryption operation to obtain the plaintext files again. If you lose srcdir.keys, all is not lost, but you must use the private key for mykey.crt to regain the encrypted keys that are also stored in /tmp/encrypted.

The following is a full one-way sync to a remote server using both rsyncrypto and rsync to obtain an encrypted backup on a remote machine. The example first generates a master key and certificate using OpenSSL, then makes an encrypted backup of ~/foodir onto the remote machine v8tsrv:

$ mkdir ~/rsyncrypto-keys
$ cd ~/rsyncrypto-keys
$ openssl req -nodes -newkey rsa:1536 -x509 -keyout rckey.key -out rckey.crt
$ cd ~
$ mkdir foodir
$ date >foodir/df1.txt
$ date >foodir/df2.txt
$ rsyncrypto -r foodir /tmp/encrypted foodir.keys ~/rsyncrypto-keys/rckey.crt
$ rsync -av /tmp/encrypted ben@v8tsrv:~
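If you run this backup regularly, the encrypt and sync steps can be wrapped in a small shell function. This is only a sketch built from the two commands above: the function name encrypted_backup and the ENC_DIR, KEYS, CERT, and DRY_RUN variables are hypothetical, and the defaults simply mirror the paths used in this article.

```shell
#!/bin/sh
# encrypted_backup SRC_DIR REMOTE: encrypt SRC_DIR with rsyncrypto, then rsync
# the encrypted tree to REMOTE. Set DRY_RUN=1 to print the commands instead of
# running them. All names and defaults here are illustrative.
encrypted_backup() {
    src=$1 remote=$2
    enc=${ENC_DIR:-/tmp/encrypted}
    keys=${KEYS:-$src.keys}
    cert=${CERT:-$HOME/rsyncrypto-keys/rckey.crt}
    run=
    if [ "${DRY_RUN:-0}" = 1 ]; then
        run=echo
    fi
    # Stop before syncing if the encryption step fails.
    $run rsyncrypto -r "$src" "$enc" "$keys" "$cert" || return 1
    $run rsync -av "$enc" "$remote"
}

# Show what would be executed without touching any files.
DRY_RUN=1 encrypted_backup foodir 'ben@v8tsrv:~'
```

The dry-run mode is a design convenience: it lets you check the paths before the first real run creates the key directory and the encrypted tree.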

To test the speed gain of using rsyncrypto as opposed to using other encryption with rsync, I used /dev/urandom to create a file of random bytes, encrypted it with both rsyncrypto and GnuPG, and copied both encrypted versions to a remote system using rsync. I then modified the plaintext file, encrypted it again with each tool, and synced the encrypted files with the remote system. In this case, I used dd to overwrite 6KB of data at an offset of 17KB into the file and left all the other data intact. The final rsync commands show that the rsyncrypto-encrypted tree only needed to send 58,102 bytes, whereas the GnuPG-encrypted file had to be sent to the remote system in its entirety:

$ cd ~/foodir
$ rm -f *
$ dd if=/dev/urandom of=testfile.random bs=1024 count=500
512000 bytes (512 kB) copied, 0.088045 s, 5.8 MB/s
$ cd ~
$ rsyncrypto -r foodir foodir.rcrypto foodir.keys ~/rsyncrypto-keys/rckey.crt
$ ls -l foodir.rcrypto/
-rw-r--r-- ... 502K 2008-01-08 19:59 testfile.random
$ mkdir foodir.gpg
$ gpg --gen-key
...
$ gpg --output foodir.gpg/testfile.random.gpg -e foodir/testfile.random
$ ls -l foodir.gpg
-rw-r--r-- 1 ... 501K 2008-01-08 20:07 testfile.random.gpg
$ rsync -av foodir.rcrypto ben@v8tsrv:~
sent 513356 bytes received 48 bytes 342269.33 bytes/sec
$ rsync -av foodir.gpg ben@v8tsrv:~
sent 513026 bytes received 48 bytes 342049.33 bytes/sec
#
# modify the input file starting at 17KB into the file for 6KB
#
$ dd if=/dev/urandom of=~/foodir/testfile.random bs=1024 count=6 seek=17 conv=notrunc
$ ls -l testfile.random
-rw-r--r-- 1 ... 500K 2008-01-08 20:17 testfile.random
$ rsyncrypto -r foodir foodir.rcrypto foodir.keys ~/rsyncrypto-keys/rckey.crt
$ gpg --output foodir.gpg/testfile.random.gpg -e foodir/testfile.random
#
# See how much gets sent
#
$ rsync -av foodir.rcrypto ben@v8tsrv:~
sent 58102 bytes received 4368 bytes 124940.00 bytes/sec
$ rsync -av foodir.gpg ben@v8tsrv:~
sent 513024 bytes received 4368 bytes 1034784.00 bytes/sec
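The relative saving follows directly from the two "sent" figures in the second round of rsync runs. The snippet below only does that arithmetic; the variable names are made up for the example.

```shell
#!/bin/sh
# Byte counts taken from the second pair of rsync runs in the transcript.
rsyncrypto_sent=58102
gpg_sent=513024

summary=$(awk -v a="$rsyncrypto_sent" -v b="$gpg_sent" 'BEGIN {
    printf "rsyncrypto resend: %.1f%% of the GnuPG resend (%.1fx less data)", 100 * a / b, b / a
}')
echo "$summary"
# prints: rsyncrypto resend: 11.3% of the GnuPG resend (8.8x less data)
```

For this one test, then, the rsyncrypto resend was roughly a ninth the size of the GnuPG resend; other files and other edit patterns will give different ratios.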

Using rsyncrypto with rsync, you can protect the files that you send to a remote system while allowing modified files to be sent in a bandwidth-efficient manner. There is a slight loss of security using rsyncrypto, because changes to the unencrypted file do not propagate throughout the entire encrypted file. When this security trade-off is acceptable, you get much faster, bandwidth-friendly network syncs and still achieve good encryption on the files stored on the remote server.

If you wish to hide file names as well as their content on the remote server, you can use the --name-encrypt=map option to rsyncrypto, which stores a mapping from the original file name to a garbled random file name in a mapping file, and outputs files using only their garbled random file name in the encrypted directory tree.

Ben Martin has been working on filesystems for more than 10 years. He completed his Ph.D. and now offers consulting services focused on libferris, filesystems, and search solutions.

Comments

on Efficient rsyncrypto hides remote sync data

Note: Comments are owned by the poster. We are not responsible for their content.

Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 212.18.49.194] on February 01, 2008 10:24 AM
Excellent! That's what I've been looking for for quite a while, and it was in front of my nose (aptitude search rsync)! :-)
Thank you!

Darko

#

Necessary?

Posted by: Anonymous [ip: 169.233.27.26] on February 01, 2008 10:25 AM
Why is this necessary? rsync -avze 'ssh $REMOTE_USER' $DIR $REMOTE: works just fine.

#

Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 192.168.2.5] on February 01, 2008 10:59 AM
What are these "SSH's changes to files" which allegedly spoil rsync's efficiency? rsync uses SSH as a transport layer to communicate between two hosts; checking for differences in files and deciding which bits to transmit are done completely independently of SSH.

It seems like this is yet another half-baked article from linux.com

#

Re: Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 124.148.80.193] on February 01, 2008 03:03 PM
This part was meant to refer to using gpg to encrypt files before syncing to the server. So it should read that GPG's changes to files foil rsync's efficiency. ie, a small change to a file that is GPG encrypted will result in large changes to the encrypted file which will spoil rsync's efficiency.

#

Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 207.112.41.174] on February 01, 2008 01:19 PM
Either this is a half baked article, or the authors of rsyncrypto have no idea what they are doing.

The example in the 4th paragraph makes no sense. As the rsync connection is protected by SSH the attacker would have no idea which parts of the file were modified. All the attacker would know is that some data was sent from one computer to another. The attacker would not be able to determine that an 8kB file was sent or that the first part of the file was later modified.

#

Re: Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 124.148.80.193] on February 01, 2008 02:58 PM
Perhaps it is unwise to reply to a half baked comment but... Of course if you had read the first paragraph you would see that the point of rsyncrypto is to protect information *after* it has been sent. Yes, the rsync connection is protected by ssh which would make somebody snooping packets have a hard time of it. If the attacker was on the other hand able to compromise the server you sync to, then you are done if you just use rsync and ssh. If you use rsyncrypto then the attacker only gets encrypted content by hacking into the server. Depending on how you use rsyncrypto they can see file names and maybe that you are changing some blocks of files and which blocks. Not all attacks are on the wire. This is especially true when you are rsyncing to a machine where multiple people have root access and so the "attacker" can be one of many people who snoop around more than they ought to.

#

Re: Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 216.40.38.232] on February 01, 2008 08:48 PM
Maybe YOU have no idea what you're talking about?

One thing is to encrypt your communication channel (by using SSH in this case), and another COMPLETELY different one is to encrypt your file on the remote filesystem/host.

You could use GPG for that, but the generated file wouldn't be rsync-friendly. This tool generates encrypted files that are rsync-friendly. If you don't know what that means, you'd better read rsync documentation on how it works :-P

#

Re: Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 217.8.201.210] on February 02, 2008 02:22 PM
Not "either/or" this is both a half-baked article, AND the authors have no idea what they are doing security-wise anyway. It is very hard to believe that this system introduces INTENTIONAL collisions in the name of efficiency. They have just broken their implementation of AES. Congratulations, rsyncrypto, your modified CBC mode takes the cake as bad idea gone wrong.

#

Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 69.225.128.4] on February 01, 2008 05:37 PM
One thing that would be really useful is to incorporate this idea into a version control system. For example, suppose a friend and I are writing a book and we keep a synchronized copy on a remote virtual server where we would like all data to be encrypted. It is a Git repository, so it would be nice if there were a way for Git to do what rsyncrypto does.

#

20 leaked bits?!

Posted by: Anonymous [ip: 217.8.201.210] on February 02, 2008 02:13 PM
It leaks 20 "aggregate" bits per 8K plaintext? This is horribly, horribly, horribly bad. If it leaks only plaintext bits, it's not too evil, but the word "aggregate" seems to imply it's leaking from the Key or IV. This is unimaginably bad, especially PER 8K plaintext! If it is, in fact, leaking from anything _other_ than the plaintext, this is so bad that you're probably better off with the efficiency gains of just NOT-ing all the bits (i.e. rot128ing each byte)... in the worst case scenario of this, you've just reduced the complexity of breaking your system over 1,000,000 times. The fact that this is per 8K of plaintext means that the worst case scenario seems highly likely to be a gross underestimate. NOT-ing the bits will keep a random peruser away, and this system won't keep an attacker away. Note that 50 years divided by 1000000 is about 25 minutes, and note also that 20 bits leaked from the plaintext is nowhere near as drastic.

#

Efficient rsyncrypto hides remote sync data

Posted by: Anonymous [ip: 134.76.10.66] on February 03, 2008 08:11 PM
People could just use rsync between two encfs mounts. Of course that is a different class of encryption, but by far the best if you do not want to deal with special programs such as rsyncrypto.

#

advantages over duplicity?

Posted by: Anonymous [ip: 77.182.43.44] on February 05, 2008 09:27 AM
pls look http://duplicity.nongnu.org/

can somebody review and compare ?

#

Re: advantages over duplicity?

Posted by: Anonymous [ip: 195.176.162.18] on February 05, 2008 04:10 PM
duplicity's encryption is, afaik, not broken... but that means that it might be a little slower

#




 