Splitting an archive on Mac OS X

Problem:  Your hard disk is running out of space.  You have access to an external or Internet-based storage device but:

  • it has a file-size limit of 2GB, and/or
  • you know (or strongly suspect) that it uses a single-fork file system and would thus destroy most of the Mac apps that you want to archive.

Many of your files are 2GB in size or larger, and you definitely don’t want your Mac apps ruined due to resource fork stripping.

Ideally, what you want to do is produce a platform-neutral, multi-part archive where each part is less than 2GB.

The solution below assumes that:

  1. You want to move your files from your Mac to the storage solution, then, at some point in the future, bring them back again.  You’re not planning on sharing the files with anyone else.
  2. Since you don’t plan on doing this again (or at least not very often) you don’t want to pay anything for a commercial solution.

With all that in mind, here’s a relatively easy and free method that makes use of the wonderful Unix subsystem under the hood of your Mac.  (If you are familiar with the Terminal and moving around the filesystem using the command line interface, skip steps 1–7 below.)

In everything that follows, remember: Case matters!  ‘Documents’ is not the same as ‘documents’.  Use the correct case at all times.

  1. Open up your Utilities folder
  2. Launch Terminal
    1. the pwd command prints the working directory — it tells you what directory (folder) you are currently in
    2. the ls command lists the contents of the current directory, with ls -l using a long format that lets you see lots of nice detail
    3. the cd command changes the directory (folder) — use it to move around the file system
      1. if there was a folder called “Movies” in the current directory, issuing the cd Movies command would move you into that directory
      2. cd .. will let you back up a level
    4. the man command displays the manual page for any given program on the machine — type man pwd or man ls or man cd to get more info on the commands listed above
  3. By default, you start in your home directory, so pwd should return something like /Users/yourName
  4. Type ls -l to get a directory listing.  You should see the usual suspects: Applications, Desktop, Documents, Movies, Music and so on
  5. Use cd <directoryName> to move into the directory that contains the folder (or file, but we’ll assume folder) that you want to archive.  If they are in your Documents directory, for example, then you would type cd Documents
  6. Type ls -l to get another listing — hopefully the directory (folder) that you want to archive is here
    1. If you made a mistake and are in the wrong place, just type cd .. and go back to step 3
  7. By using pwd, ls and cd you should eventually be able to navigate to the directory that contains the folder you want to archive (when you type ls -l it should appear in the listing)
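
If the navigation steps feel abstract, here is a minimal practice session. The scratch folder and the Documents/WoW folders inside it are made up for the demo (they stand in for your real home directory), so nothing on your disk is touched apart from one temporary folder.

```shell
# Hypothetical practice area: a scratch folder standing in for your home directory.
demo=$(mktemp -d "${TMPDIR:-/tmp}/navdemo.XXXXXX")
mkdir -p "$demo/Documents/WoW"

cd "$demo"      # start somewhere known
pwd             # prints the full path of the directory you are in
ls -l           # long listing; Documents should appear
cd Documents    # move into Documents
ls -l           # WoW should appear in this listing
cd ..           # back up one level, to where you started
```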

Before actually starting the archiving process, you need to know where the split archive is going to go.  Assuming that you have some sort of storage device already connected to your system (an external hard disk, a mounted network share, or your iDisk, for example), you can use the command line to see where it is, what it’s called, and what’s inside.

  1. ls -l /Volumes/ will show you what is currently mounted to your filesystem.  If you haven’t renamed your hard disk you’ll see Macintosh HD in there, if you have a flash device connected you’ll see the device’s name, a mounted iDisk will have your MobileMe username, and so on
  2. Decide where you want to save the split archive.  By way of example, I will assume that an external hard disk drive is attached, and it is mounted with the name ‘extHDD’
  3. ls -l /Volumes/extHDD/ will list the top-level contents of the chosen device
  4. mkdir /Volumes/extHDD/tempDir will make a temporary directory on the chosen device — your split archive will go in there (you can move or rename it later)
  5. Use ls -l to remind yourself of the name of the directory that you want to archive.  By way of example, I’ll assume you’ve quit playing the World of Warcraft and you have 20+ GB of files in a folder called ‘WoW’
  6. Now for the magic.  Everything in the following command is important and required.  Pay attention to where spaces are, and whether ‘-’ signs have spaces on only one side or both.  Every dash is a plain hyphen (the minus key); if copying and pasting turns one into a longer typographic dash, the command will fail
  7. tar -cvf - WoW/ | split -b 1024m - /Volumes/extHDD/tempDir/arc
    1. tar -cvf - WoW/ takes the entire WoW directory structure and turns it into one big file (the lone - after -cvf sends that file to standard output instead of to disk)
    2. | then pipes the output from the first command to the second command
    3. split -b 1024m - splits the incoming single file into 1024 MB chunks (the lone - tells split to read from the pipe)
    4. /Volumes/extHDD/tempDir/arc is where you will find your split archive.  Each piece of your archive will start with the letters ‘arc’ and then have two more letters after it (arcaa, arcab, arcac, arcad for example)
  8. You’ll know the archiving process has finished when filenames stop scrolling up your screen and you are back to the command line prompt
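
To try the pipeline from step 7 without committing 20+ GB of real data, a small dry run along these lines may help. The scratch folder, the 200 KB dummy file and the 64k chunk size are all assumptions made for the demo; the -v flag is left out only to keep the output quiet.

```shell
# Scratch area: src/ stands in for the folder holding WoW,
# tempDir/ stands in for /Volumes/extHDD/tempDir.
work=$(mktemp -d "${TMPDIR:-/tmp}/splitdemo.XXXXXX")
mkdir -p "$work/src/WoW" "$work/tempDir"

# 200 KB of random data plays the part of the big game folder.
dd if=/dev/urandom of="$work/src/WoW/data.bin" bs=1024 count=200 2>/dev/null

cd "$work/src"
# Same shape as step 7, with 64 KB chunks so the small demo still splits.
tar -cf - WoW/ | split -b 64k - "$work/tempDir/arc"

# The pieces are named arcaa, arcab, arcac, ...
ls -l "$work/tempDir"
```

Once the listing looks right, swap the scratch paths for the real ones and put the chunk size back to 1024m.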

You are now free to exit Terminal and use the regular Finder to rename tempDir and move it where it needs to be on the extHDD.

If you want to vary the size of the individual chunks of the archive, modify the 1024m to be 512m or 1536m or whatever.  Remember that the ‘m’ stands for MB.  (I like 1GB chunks though, and since 1024MB = 1 GB that’s why I chose 1024m in step 7 above.)
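
For a rough idea of how many pieces a given chunk size will produce, a bit of shell arithmetic is enough. The 20480 MB figure below is a hypothetical 20 GB folder, and chunks_needed is a helper name invented for this sketch; the real archive is slightly larger than the folder because tar adds its own headers, so treat the result as an estimate.

```shell
# Round-up division: how many chunks a payload of total_mb needs at chunk_mb each.
chunks_needed() {
  total_mb=$1
  chunk_mb=$2
  echo $(( (total_mb + chunk_mb - 1) / chunk_mb ))
}

# Hypothetical 20 GB (20480 MB) archive at three chunk sizes.
for size in 512 1024 1536; do
  echo "${size}m -> $(chunks_needed 20480 "$size") chunks"
done
# prints:
# 512m -> 40 chunks
# 1024m -> 20 chunks
# 1536m -> 14 chunks
```

With split's default two-letter suffixes (aa through zz) there is room for 676 pieces, so very small chunks on a very large archive can run out of names.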

So… fast forward to some day in the future when you want to reverse the process and get your original directory back.

  1. Launch Terminal
  2. use pwd, ls and cd to navigate back to the original folder where your archived folder was located
  3. cat /Volumes/extHDD/tempDir/arc* | tar -xvf -
    1. cat /Volumes/extHDD/tempDir/arc* concatenates (merges) all the split archive chunks starting with the letters ‘arc’ (in the tempDir of the extHDD), in alphabetical order, back into a single archive file
    2. tar -xvf - extracts the individual documents and applications from the archive file into the current directory (which is why step 2 matters), recreating the WoW folder and any sub-directories in the process
  4. There is no step 4.
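
Before trusting the round trip with real files, it can be rehearsed end to end on dummy data. This is a sketch under the same assumptions as any dry run: the scratch folders, the 200 KB random file and the 64k chunk size are invented for the demo. It also shows tar -tf -, which lists what is inside the pieces without extracting anything.

```shell
# Build a small split archive to practise on.
work=$(mktemp -d "${TMPDIR:-/tmp}/restoredemo.XXXXXX")
mkdir -p "$work/src/WoW" "$work/tempDir" "$work/restore"
dd if=/dev/urandom of="$work/src/WoW/data.bin" bs=1024 count=200 2>/dev/null
(cd "$work/src" && tar -cf - WoW/ | split -b 64k - "$work/tempDir/arc")

# Peek inside the pieces without extracting (t = list contents).
cat "$work/tempDir"/arc* | tar -tf -

# Restore into a different folder, as in step 3.
cd "$work/restore"
cat "$work/tempDir"/arc* | tar -xf -

# cmp prints nothing and succeeds when the two files are byte-for-byte identical.
cmp "$work/src/WoW/data.bin" "$work/restore/WoW/data.bin" && echo "restored copy matches the original"
```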

This approach makes use of Unix command line tools that have been around for decades and will be around for a long time to come, so you don’t need to worry about not being able to get your stuff back.

Commercial offerings often do little more than wrap a nice user interface around command line tools, so the ‘quality’ of the archives produced using this method is likely to be identical to that of shareware graphical apps that call on the same commands.

I hope the above has been useful.

PS:  If you do your research, you’ll often see a tar | gzip | split sequence instead of the shorter tar | split sequence that I used above.  The gzip command simply compresses the files being archived.  Whilst it may seem like a good idea to compress files, I find nowadays that most of the big files that you want to archive are stored in compressed form anyway.  Your .jpeg images, .mp3 audio and .mov video (for example) are already compressed and gzipping them won’t make much of a difference at all (usually only a few percent).  Add to that the fact that adding gzip to the archiving process will easily triple or quadruple the time it takes and, well, it just doesn’t seem worth it to me.
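
For anyone who wants to check the compression trade-off on their own files, the sketch below runs both sequences and compares the totals. Everything in it is a demo assumption: the scratch folders, the 64k chunk size, and the 200 KB of random data, which is used here as a stand-in for already-compressed media because it does not shrink either. Swap in a real folder to see what gzip actually saves you. Note that restoring needs an extra gzip -dc step between cat and tar.

```shell
work=$(mktemp -d "${TMPDIR:-/tmp}/gzdemo.XXXXXX")
mkdir -p "$work/src/WoW" "$work/plain" "$work/gz" "$work/restore"
dd if=/dev/urandom of="$work/src/WoW/data.bin" bs=1024 count=200 2>/dev/null

cd "$work/src"
# The shorter sequence used in this post.
tar -cf - WoW/ | split -b 64k - "$work/plain/arc"
# The longer sequence, with gzip compressing the stream before it is split.
tar -cf - WoW/ | gzip | split -b 64k - "$work/gz/arc"

# Total bytes across all pieces of each archive.
plain_bytes=$(( $(cat "$work/plain"/arc* | wc -c) ))
gz_bytes=$(( $(cat "$work/gz"/arc* | wc -c) ))
echo "tar | split:        $plain_bytes bytes"
echo "tar | gzip | split: $gz_bytes bytes"

# Restoring the gzipped version: decompress between cat and tar.
cd "$work/restore"
cat "$work/gz"/arc* | gzip -dc | tar -xf -
cmp "$work/src/WoW/data.bin" "$work/restore/WoW/data.bin" && echo "gzipped archive restores intact"
```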
