﻿<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://purl.org/atom/ns#">
	<link xmlns="http://purl.org/atom/ns#" type="text/html" rel="alternate" href="http://sial.org/blog/" title="Jeremy Mates’s Blog"/>
	<title xmlns="http://purl.org/atom/ns#">Jeremy Mates’s Blog</title>
	<entry xmlns="http://purl.org/atom/ns#" xmlns:default="http://www.w3.org/1999/xhtml">
		<title xmlns="http://purl.org/atom/ns#">Atomic copies using rename()</title>
		<dc:subject xmlns:dc="http://purl.org/dc/elements/1.1/">Coding</dc:subject>
		<summary xmlns="http://purl.org/atom/ns#">summary</summary>
		<content xmlns="http://purl.org/atom/ns#" mode="escaped">&lt;p&gt;The Unix &lt;a href="http://www.freebsd.org/cgi/man.cgi?query=rename&amp;apropos=0&amp;sektion=2"&gt;&lt;tt&gt;rename(2)&lt;/tt&gt;&lt;/a&gt; system call (&lt;a href="http://perldoc.perl.org/functions/rename.html"&gt;&lt;tt&gt;rename&lt;/tt&gt;&lt;/a&gt; in Perl, Ruby, and other languages; camelCased &lt;tt&gt;renameTo&lt;/tt&gt; in Java) offers atomic file renaming within a single filesystem. &lt;a href="http://en.wikipedia.org/wiki/Atomism"&gt;Atomic&lt;/a&gt; here means the file is either renamed or it is not, even if the system should crash. Workflows that move files can benefit from atomic copies: a truncated &lt;tt&gt;/etc/passwd&lt;/tt&gt; file could render a system unusable; an incomplete &lt;tt&gt;index.html&lt;/tt&gt; may confuse customers or bring ridicule to a company; an incomplete &lt;tt&gt;report.csv&lt;/tt&gt; may silently omit crucial data.&lt;/p&gt;

&lt;h2&gt;Implementation&lt;/h2&gt;
&lt;p&gt;The sample Ruby code below first writes the source file to a temporary file in the same directory as the permanent filename, then uses the &lt;tt&gt;rename&lt;/tt&gt; call once the copy completes. &lt;a href="http://samba.anu.edu.au/rsync/"&gt;&lt;tt&gt;rsync&lt;/tt&gt;&lt;/a&gt;, &lt;a href="http://www.cfengine.org/"&gt;&lt;tt&gt;cfengine&lt;/tt&gt;&lt;/a&gt;, and other programs use this method to ensure complete all-or-nothing file copies.&lt;/p&gt;
&lt;p class="sial-block-code"&gt;#!/usr/bin/ruby
#
# Usage: $0 source_file destination_file
#
# TODO error, exception handling and cleanup as
# necessary for the implementation.

# fileutils replaces ftools, which was removed in Ruby 1.9
require "fileutils"
require "tempfile"

program  = File.basename($0)

src_file = ARGV[0]
src_stat = File.stat(src_file)
dst_file = ARGV[1]
# If the target is a directory, copy the file into
# that directory
if File.directory?(dst_file)
  dst_dir  = dst_file
  dst_file = File.join(dst_dir, File.basename(src_file))
else
  dst_dir  = File.dirname(dst_file)
end

FileUtils.mkdir_p(dst_dir)
# Create the temporary file in the destination directory
# so that the rename stays within a single filesystem
tf = Tempfile.new("#{program}-tmp", dst_dir)
tf.close
FileUtils.cp(src_file, tf.path)
File.chmod(src_stat.mode, tf.path)

# TODO apply checksum verification here if paranoid
# that the filesystem or underlying storage might have
# corrupted the file, or if calculating the MDN as part
# of the AS2 protocol.

File.rename(tf.path, dst_file)&lt;/p&gt;
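&lt;p&gt;The checksum TODO in the sample above could be filled in along the following lines. This is a minimal sketch rather than a definitive implementation: the &lt;tt&gt;verified_atomic_copy&lt;/tt&gt; name and the choice of SHA-256 are assumptions made for illustration, and are not part of the original script.&lt;/p&gt;

```ruby
require "digest"
require "fileutils"
require "tempfile"

# Hypothetical helper (not from the original script): copy src_file to
# dst_file by way of a temporary file in the destination directory, and
# compare SHA-256 digests before the rename. Returns the hex digest.
def verified_atomic_copy(src_file, dst_file)
  dst_dir = File.dirname(dst_file)
  FileUtils.mkdir_p(dst_dir)

  tf = Tempfile.new("verified-copy-tmp", dst_dir)
  tf.close
  FileUtils.cp(src_file, tf.path)
  File.chmod(File.stat(src_file).mode, tf.path)

  src_sum = Digest::SHA256.file(src_file).hexdigest
  tmp_sum = Digest::SHA256.file(tf.path).hexdigest
  unless src_sum == tmp_sum
    # Remove the bad copy; the destination is never touched.
    tf.unlink
    raise IOError, "checksum mismatch copying #{src_file}"
  end

  File.rename(tf.path, dst_file)
  src_sum
end
```

&lt;p&gt;Reading both files back costs two extra passes over the data, and the operating system may serve those reads from cache rather than from the underlying storage, so a matching digest is a sanity check and not proof that the bytes reached the disk.&lt;/p&gt;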
&lt;h2&gt;Concerns with rename()&lt;/h2&gt;
&lt;p&gt;The &lt;tt&gt;rename&lt;/tt&gt; documentation often includes dire warnings: that &lt;tt&gt;rename&lt;/tt&gt; does not work across filesystem boundaries, that it is not portable, or that it fails should the destination file exist. The first concern is not a problem for files being renamed under the same filesystem, much less the same parent directory. The common &lt;a href="http://www.freebsd.org/cgi/man.cgi?query=mv&amp;sektion=1"&gt;&lt;tt&gt;mv(1)&lt;/tt&gt;&lt;/a&gt; command uses &lt;tt&gt;rename(2)&lt;/tt&gt; should the source and destination reside on the same filesystem:&lt;/p&gt;
&lt;p class="sial-block-shell"&gt;$ &lt;kbd&gt;ktrace mv source/cat dest&lt;/kbd&gt;
$ &lt;kbd&gt;kdump -f ktrace.out| tail -22&lt;/kbd&gt;
  5228 mv       NAMI  "dest"
  5228 mv       RET   stat 0
  5228 mv       CALL  lstat(0xbffff41f,0xbffff220)
  5228 mv       NAMI  "source/cat"
  5228 mv       RET   lstat 0
  5228 mv       CALL  lstat(0xbffff42a,0xbffff1c0)
  5228 mv       NAMI  "dest"
  5228 mv       RET   lstat 0
  5228 mv       CALL  access(0xbfffedc0,0)
  5228 mv       NAMI  "dest/cat"
  5228 mv       RET   access 0
  5228 mv       CALL  lstat(0xbffff41f,0xbfffece0)
  5228 mv       NAMI  "source/cat"
  5228 mv       RET   lstat 0
  5228 mv       CALL  access(0xbfffedc0,0x2)
  5228 mv       NAMI  "dest/cat"
  5228 mv       RET   access 0
  5228 mv       CALL  &lt;b&gt;rename&lt;/b&gt;(0xbffff41f,0xbfffedc0)
  5228 mv       NAMI  "source/cat"
  5228 mv       NAMI  "dest/cat"
  5228 mv       RET   rename 0
  5228 mv       CALL  exit(0)&lt;/p&gt;
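&lt;p&gt;A program can handle the filesystem boundary itself in much the same way. The sketch below is illustrative only: &lt;tt&gt;same_filesystem?&lt;/tt&gt; and &lt;tt&gt;move_file&lt;/tt&gt; are hypothetical names, and the fallback is a plain copy and delete, which is not atomic. Matching device numbers are a hint and not a guarantee, so the &lt;tt&gt;EXDEV&lt;/tt&gt; error from &lt;tt&gt;rename&lt;/tt&gt; remains the authoritative signal.&lt;/p&gt;

```ruby
require "fileutils"

# Hypothetical helper: true when both paths report the same device
# number, which usually means they are on the same filesystem.
def same_filesystem?(path_a, path_b)
  File.stat(path_a).dev == File.stat(path_b).dev
end

# Hypothetical helper: try rename first, and fall back to a copy and
# delete only when the kernel reports a cross-device rename (EXDEV).
# The fallback is NOT atomic.
def move_file(src, dst)
  File.rename(src, dst)
  :renamed
rescue Errno::EXDEV
  FileUtils.cp(src, dst, preserve: true)
  File.unlink(src)
  :copied
end
```

&lt;p&gt;Writing the temporary file into the destination directory, as the Implementation sample does, avoids the fallback path altogether.&lt;/p&gt;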
&lt;p&gt;Portability can be a concern, as detailed in &lt;a href="http://perldoc.perl.org/perlport.html#System-Interaction"&gt;&lt;tt&gt;perlport&lt;/tt&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;&lt;tt&gt;Some platforms can't delete or rename files held open by
the system, this limitation may also apply to changing
filesystem metainformation like file permissions or
owners. Remember to "close" files when you are done with
them. Don't "unlink" or "rename" an open file. Don't
"tie" or "open" a file already tied or opened; "untie"
or "close" it first.&lt;/tt&gt;&lt;/blockquote&gt;
&lt;p&gt;Portability is best addressed by &lt;a href="http://perldoc.perl.org/Test/More.html"&gt;unit tests&lt;/a&gt;: when building the software, perform various test &lt;tt&gt;rename&lt;/tt&gt; calls, and use the results to determine whether the software will work on the new system. Unit tests should also cover failure and degraded performance conditions, for example data that arrives slowly (copies over a wide area network, or large amounts of data); such conditions are unlikely in a test environment unless deliberately forced.&lt;/p&gt;
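&lt;p&gt;Such test &lt;tt&gt;rename&lt;/tt&gt; calls might look like the probe below, which could be run at build or install time. It is a sketch under assumptions: &lt;tt&gt;probe_rename&lt;/tt&gt; and &lt;tt&gt;attempt&lt;/tt&gt; are hypothetical names, and the three cases shown are examples rather than a complete portability suite.&lt;/p&gt;

```ruby
require "tmpdir"

# Run the block and report whether it completed without a system error.
def attempt
  yield
  true
rescue SystemCallError
  false
end

# Hypothetical probe: exercise rename inside a scratch directory under
# parent and report which behaviours work on this platform.
def probe_rename(parent = Dir.tmpdir)
  results = {}
  Dir.mktmpdir("rename-probe", parent) do |dir|
    a = File.join(dir, "a")
    b = File.join(dir, "b")

    # Plain rename to a name that does not exist yet.
    File.write(a, "first")
    results[:plain] = attempt { File.rename(a, b) }

    # Rename over an existing target, checking the new content won.
    File.write(a, "second")
    replaced = attempt { File.rename(a, b) }
    results[:replace_existing] = (replaced and File.read(b) == "second")

    # Rename a file that is still held open, per the perlport warning.
    File.write(a, "third")
    File.open(a) do
      results[:rename_while_open] = attempt { File.rename(a, b) }
    end
  end
  results
end
```

&lt;p&gt;Passing the real destination directory as &lt;tt&gt;parent&lt;/tt&gt; tests the filesystem the software will actually write to, which may behave differently from the system temporary directory.&lt;/p&gt;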
&lt;p&gt;Behavior when the target file exists depends on the implementation. POSIX &lt;tt&gt;rename(2)&lt;/tt&gt; replaces an existing target, while &lt;tt&gt;mv(1)&lt;/tt&gt; may prompt before overwriting unless given the &lt;tt&gt;-f&lt;/tt&gt; option. Other software may need to clobber the target, fail with an alert, or first perform metadata comparisons, such as file modification time, file size, or a &lt;a href="http://en.wikipedia.org/wiki/Cryptographic_hash_function"&gt;checksum&lt;/a&gt; on the file contents. Checking whether the target exists prior to the &lt;tt&gt;rename&lt;/tt&gt; call is perhaps a &lt;a href="http://sial.org/blog/2008/05/file_exists_perldoc_f_x.html"&gt;needless race condition&lt;/a&gt;, as the target may appear or vanish between the check and the rename.&lt;/p&gt;
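&lt;p&gt;Where the software must fail and not clobber an existing target, one option that avoids a separate existence check is to hard link the temporary file into place, since &lt;tt&gt;link(2)&lt;/tt&gt; refuses to replace an existing name. This swaps &lt;tt&gt;rename&lt;/tt&gt; for a different technique and is only a sketch: &lt;tt&gt;install_without_clobber&lt;/tt&gt; is a hypothetical name, and the approach assumes a filesystem that supports hard links.&lt;/p&gt;

```ruby
# Hypothetical helper: place tmp_path at dst_file only if dst_file does
# not already exist. link(2) fails with EEXIST when the target exists,
# so there is no window between a check and the move. Returns true when
# the file was installed and false when the target was already present,
# in which case tmp_path is left in place for the caller to handle.
def install_without_clobber(tmp_path, dst_file)
  File.link(tmp_path, dst_file)
  File.unlink(tmp_path)
  true
rescue Errno::EEXIST
  false
end
```

&lt;p&gt;The complete file appears under the final name in one step, as with &lt;tt&gt;rename&lt;/tt&gt;; the temporary name lingers briefly as a second link until it is unlinked.&lt;/p&gt;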
&lt;h2&gt;Network Copies&lt;/h2&gt;
&lt;p&gt;A similar method can also be used with protocols such as &lt;a href="http://en.wikipedia.org/wiki/SSH_file_transfer_protocol"&gt;SFTP&lt;/a&gt; for file transfers: first upload the data to a temporary file, then use the &lt;tt&gt;rename&lt;/tt&gt; command to move the complete file to the final location. Note that uploading and then renaming, while better practice than writing directly to the final filename, does not guarantee that the target system received the data uncorrupted. In contrast, the &lt;a href="http://en.wikipedia.org/wiki/AS2"&gt;AS2&lt;/a&gt; protocol returns success only if the file contents are cryptographically verified by the target system, making it suitable for workflows where the data must not be truncated, corrupted, or otherwise altered.&lt;/p&gt;
		</content>
		<issued xmlns="http://purl.org/atom/ns#">2008-10-11T16:24:40-0700</issued>
		<link xmlns="http://purl.org/atom/ns#" type="text/html" rel="alternate" href="http://sial.org/blog/2008/10/atomic_copies_using_rename.html" title=""/>
		<id xmlns="http://purl.org/atom/ns#">317</id>
	</entry>
</feed>
