Right-shift a file

Right-shifting a file means moving its contents by a given offset to make space at the beginning of the file. It's a block-level operation that would be performed on raw block devices and involves reading data from the device and writing it back further on in the same device.

! A device can be a physical block device or a file containing a block device.

Writing to the same file that is being read requires care: every location must have been read before it is overwritten, so the reader has to stay ahead of the writer. This needs a buffered pipeline:

$ reader | buffer | writer

We can use dd to perform the reading and writing. Several utilities can provide the buffering, such as buffer, mbuffer, WC-Stream and Pipe Viewer (pv).

The requirement for buffering is to guarantee that a given number of bytes are read before passing anything to the writer so that the writer cannot write to a location that hasn't been read.

Both buffer and mbuffer have an option to require the buffer to be full before they output anything:

  • buffer -b 2048 -p 100 requires 100% of 2048 bytes to be read before writing, as does mbuffer -b 2048 -P 100. WC-Stream and Pipe Viewer lack such an option.

The examples that follow use mbuffer because, following various tests, mbuffer appeared to be the most reliable. In particular, buffer worked with a small buffer size but didn't fare so well with a larger one.

A test file

This is a small test to demonstrate the concept. It writes a file containing 100 bytes, incrementing from 1 to 100:

$ rm -rf /tmp/testfile
$ for n in {1..100}; do echo -ne "\x$(printf '%x' $n)" >> /tmp/testfile; done

The printf '%x' $n prints the value of n as hex, and echo -ne "\x..." writes the 8-bit value represented by that hex. The double quotes stop the shell from stripping the backslash before echo sees it. echo has escapes for values written in octal and hex but, sadly, not decimal.
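Because echo's escapes differ between shells, the same file can also be built with printf alone. This is a minimal sketch, assuming a POSIX shell with seq available; the path /tmp/testfile.printf is a hypothetical name chosen so that the /tmp/testfile used in this article is left alone.

```shell
# Hypothetical scratch path, so /tmp/testfile is not touched.
f=/tmp/testfile.printf
: > "$f"    # create or empty the file

for n in $(seq 1 100); do
    # The inner printf turns decimal n into three octal digits (e.g. 10 -> 012);
    # the outer printf interprets the resulting \012 escape and emits that one byte.
    printf "\\$(printf '%03o' "$n")" >> "$f"
done

# Report the size in bytes; 100 is expected.
wc -c < "$f"
```

The octal escape is used here because printf's \NNN form is specified by POSIX, whereas \x is an extension.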

Check its length is 100 bytes and dump its contents:

$ wc -c /tmp/testfile; xxd /tmp/testfile
100 /tmp/testfile
0000000: 0102 0304 0506 0708 090a 0b0c 0d0e 0f10  ................
0000010: 1112 1314 1516 1718 191a 1b1c 1d1e 1f20  ............... 
0000020: 2122 2324 2526 2728 292a 2b2c 2d2e 2f30  !"#$%&'()*+,-./0
0000030: 3132 3334 3536 3738 393a 3b3c 3d3e 3f40  123456789:;<=>?@
0000040: 4142 4344 4546 4748 494a 4b4c 4d4e 4f50  ABCDEFGHIJKLMNOP
0000050: 5152 5354 5556 5758 595a 5b5c 5d5e 5f60  QRSTUVWXYZ[\]^_`
0000060: 6162 6364                                abcd

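This shifts the file 5 bytes to the right. The resulting file is 5 bytes longer and its first 5 bytes are repeated.

$ dd if=/tmp/testfile bs=1 | pv -B 5 | dd of=/tmp/testfile bs=1 seek=5 conv=notrunc

dd's block size (bs) defaults to 512 bytes; it is set to one byte here so that seek=5 skips 5 bytes rather than 5 blocks. conv=notrunc stops the writing dd from truncating a regular file at the seek offset before the reader has finished with it; it makes no difference on a real block device. We've used pv for this test because buffer and mbuffer don't like the small 5-byte size of our example.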

$ wc -c /tmp/testfile; xxd /tmp/testfile 
105 /tmp/testfile
0000000: 0102 0304 0501 0203 0405 0607 0809 0a0b  ................
0000010: 0c0d 0e0f 1011 1213 1415 1617 1819 1a1b  ................
0000020: 1c1d 1e1f 2021 2223 2425 2627 2829 2a2b  .... !"#$%&'()*+
0000030: 2c2d 2e2f 3031 3233 3435 3637 3839 3a3b  ,-./0123456789:;
0000040: 3c3d 3e3f 4041 4243 4445 4647 4849 4a4b  <=>?@ABCDEFGHIJK
0000050: 4c4d 4e4f 5051 5253 5455 5657 5859 5a5b  LMNOPQRSTUVWXYZ[
0000060: 5c5d 5e5f 6061 6263 64                   \]^_`abcd
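Rather than comparing hex dumps by eye, the result can be checked mechanically: everything after the first 5 bytes of the shifted file should be identical to the original. The following is a minimal sketch, assuming GNU-style head, tail and cmp. The paths and variable names are hypothetical, and the shifted file is built here from a copy (not shifted in place) so that the check can be run on its own.

```shell
orig=/tmp/shift-check.orig        # hypothetical scratch paths
shifted=/tmp/shift-check.shifted
offset=5

# Recreate the 100-byte test pattern (values 1 to 100).
: > "$orig"
for n in $(seq 1 100); do
    printf "\\$(printf '%03o' "$n")" >> "$orig"
done

# Build what a right-shift by $offset should look like:
# the first $offset bytes left in place, followed by the whole original.
{ head -c "$offset" "$orig"; cat "$orig"; } > "$shifted"

# tail -c +N starts output at byte N (counting from 1), so this skips $offset bytes.
if tail -c +"$((offset + 1))" "$shifted" | cmp -s - "$orig"; then
    result="payload intact"
else
    result="payload damaged"
fi
echo "$result"
```

For a file shifted in place, the same tail and cmp check works as long as a copy of the original was kept beforehand.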

A real file

This time, we use a larger block size to make it faster:

$ sha1sum /tmp/one.flac
8bb0b79a8adc7926c8aec9f3b15fcdb670467fc5
$ ls -l /tmp/one.flac
-rw-r--r-- 1 john users 30052802 Nov  9 14:55 /tmp/one.flac
$ ls -lh /tmp/one.flac
-rw-r--r-- 1 john users 29M Nov  9 14:55 /tmp/one.flac
$ wc -c /tmp/one.flac
30052802 /tmp/one.flac

Let's move it to start at byte 2048 (four 512-byte blocks):

$ dd if=/tmp/one.flac | mbuffer -S 2048 -p 100 | dd of=/tmp/testfile seek=4
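The seek value above is the byte offset divided by dd's block size, so the offset has to be a whole number of blocks. A minimal sketch of that arithmetic, with hypothetical variable names:

```shell
offset=2048   # desired shift in bytes
bs=512        # dd's default block size in bytes

if [ $((offset % bs)) -ne 0 ]; then
    # seek counts whole output blocks, so a partial block cannot be expressed
    echo "offset is not a whole number of blocks" >&2
    seek=""
else
    seek=$((offset / bs))
    echo "seek=$seek"
fi
```

If the offset is not a multiple of the block size, a smaller bs that divides it evenly (down to bs=1) can be used instead, at the cost of speed.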