How to Back Up Your GMail
In light of the recent GMail account resets, I decided to stop procrastinating and back up all of my e-mail. It turned out to be pretty simple, with just a few caveats.
The following steps are specifically for Arch Linux, though they can be generalized to other Linux distributions too.
First, install getmail. Getmail is a mail retriever, meaning it can log into your e-mail accounts for you and download your e-mail. The nice thing is that, by default, it doesn’t do anything to the messages on the server, so you can still access your e-mail the way you normally do.
As root, run:
pacman -Sy getmail
If you try running getmail now, e.g.
getmail --version
you might notice an odd error:
ImportError: No module named getmailcore
The problem is that the getmail pacman package (as of this writing) is built for one specific version of Python, which may not be the one you have installed. In my case, I had installed getmail-4.20.0-2, and the package assumed python-2.7. As a result, the getmailcore module was placed in /usr/lib/python2.7/site-packages. I actually had python-2.6 installed, so getmail looked in /usr/lib/python2.6/site-packages and complained that it couldn’t find getmailcore. [1]
In order to fix this, first find out the python version that getmail assumed:
pacman -Ql getmail | grep python
(That is a Q followed by a lowercase L.)
This lists the package's files under /usr/lib/python#.#/site-packages. Take note of the python version.
Next, find out what version of python you have installed:
pacman -Q python
Finally, copy the getmail python module to the correct location (substituting in the appropriate python version numbers):
cd /usr/lib/python[installed python version]/site-packages
cp -a /usr/lib/python[getmail assumed python version]/site-packages/getmail* .
As an example, I had python-2.6 installed, while getmail was assuming python-2.7:
cd /usr/lib/python2.6/site-packages
cp -a /usr/lib/python2.7/site-packages/getmail* .
Now, getmail should work:
getmail --version
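The version check above can be partly scripted. What follows is only a minimal sketch, not something that ships with getmail or pacman: the function name getmail_site_packages is made up for illustration, and it assumes that pacman -Ql prints one "package path" pair per line. It reads that listing on standard input and prints the site-packages directory the package's files were installed into.

```shell
#!/bin/sh
# Hypothetical helper: read `pacman -Ql getmail` output on stdin and print
# the site-packages directory that the package's Python files live in.
getmail_site_packages() {
    # Each input line looks like:
    #   getmail /usr/lib/python2.7/site-packages/getmailcore/__init__.py
    # Keep only the site-packages prefix and collapse duplicates.
    sed -n 's|^[^ ]* \(/usr/lib/python[0-9.]*/site-packages\)/.*|\1|p' | sort -u
}

# Example (needs pacman, so it is left commented out):
#   pacman -Ql getmail | getmail_site_packages
```

If the directory it prints does not match the version reported by pacman -Q python, the copy step described above applies.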
Next, you will need to create the configuration files for getmail. By default, getmail will search for configuration settings in ~/.getmail/getmailrc. However, if you intend to back up multiple e-mail accounts, you will need a different configuration file for each account. As a result, I ended up with the following scheme (where [GMAIL ADDRESS] should be replaced with your actual GMail e-mail address in the format “user@gmail.com” without the quotes):
Configuration file: ~/getmail/getmail.d/[GMAIL ADDRESS]
Maildir location: ~/getmail/mail/[GMAIL ADDRESS]/
Getmail log file: ~/getmail/log/[GMAIL ADDRESS]
Oldmail directory: ~/getmail/oldmail
Note that the Oldmail directory contains the records of what getmail has already seen. This way, getmail can download new e-mails only, instead of downloading everything each time. By default, these records go in ~/.getmail. However, it is helpful to keep them with the rest of our data in order to make backups easier.
For this setup, the configuration file [2] is as follows (substituting in your e-mail and password):
[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
mailboxes = ("[Gmail]/All Mail",)
username = [GMAIL ADDRESS]
password = [GMAIL PASSWORD]
[destination]
type = Maildir
path = ~/getmail/mail/[GMAIL ADDRESS]/
[options]
verbose = 2
message_log = ~/getmail/log/[GMAIL ADDRESS]
# retrieve only new messages
read_all = false
# do not alter messages
delivered_to = false
received = false
Most of it is pretty self-explanatory. The read_all option tells getmail to download only messages it hasn’t already downloaded, and the delivered_to and received options tell getmail not to add any additional routing information to the downloaded messages.
The mailboxes option requires a bit more explanation for GMail. Specifying “[Gmail]/All Mail” makes getmail download all of your GMail messages, regardless of label, including sent messages. To download only messages with a specific label, enter that label instead of “[Gmail]/All Mail”. Note that the downloaded messages do not retain the label information.
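Since each account needs its own configuration file, it can be convenient to generate the files rather than write them by hand. The following is a rough sketch under a few assumptions: the function name write_getmailrc and its three arguments are invented for this example, and the base directory is passed in explicitly instead of being hard-coded. It writes the same settings shown above and then restricts the file's permissions, because the password is stored in plain text.

```shell
#!/bin/sh
# Hypothetical helper (not part of getmail): write the rc file for one account.
# usage: write_getmailrc BASE_DIR ADDRESS PASSWORD
write_getmailrc() {
    base=$1
    address=$2
    password=$3
    rcfile="$base/getmail.d/$address"

    mkdir -p "$base/getmail.d"
    cat > "$rcfile" <<EOF
[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
mailboxes = ("[Gmail]/All Mail",)
username = $address
password = $password

[destination]
type = Maildir
path = $base/mail/$address/

[options]
verbose = 2
message_log = $base/log/$address
# retrieve only new messages
read_all = false
# do not alter messages
delivered_to = false
received = false
EOF
    # The file stores a password in plain text, so make it owner-only.
    chmod 600 "$rcfile"
}

# Example: write_getmailrc "$HOME/getmail" "user@gmail.com" "your password"
```

Passing the password as an argument keeps the sketch short, but it may end up in your shell history; reading it from a prompt would be a reasonable alternative.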
Next, create the rest of the directory structure:
mkdir ~/getmail/log
mkdir -p ~/getmail/mail/[GMAIL ADDRESS]/{cur,new,tmp}
mkdir ~/getmail/oldmail
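For several accounts, the same layout can be created in one go. This is just a sketch: init_getmail_dirs is a made-up name, and the base directory and addresses are whatever you pass in. It uses mkdir -p so that it is safe to run again when you add another account later.

```shell
#!/bin/sh
# Hypothetical helper: create the directory layout described above.
# usage: init_getmail_dirs BASE_DIR ADDRESS [ADDRESS...]
init_getmail_dirs() {
    base=$1
    shift
    # Shared directories: config files, logs, and getmail's "seen" records.
    mkdir -p "$base/getmail.d" "$base/log" "$base/oldmail"
    for address in "$@"; do
        # A Maildir consists of three subdirectories: cur, new and tmp.
        for sub in cur new tmp; do
            mkdir -p "$base/mail/$address/$sub"
        done
    done
}

# Example: init_getmail_dirs "$HOME/getmail" first@gmail.com second@gmail.com
```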
Finally, run getmail:
getmail -q --getmaildir ~/getmail/oldmail --rcfile ~/getmail/getmail.d/[GMAIL ADDRESS]
That’s all there is to it! You can run getmail as often as you like. It will download only the new messages, since it remembers what it has already downloaded.
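With more than one account, the command above has to be repeated once per configuration file. One possible way to do that is sketched below. The function name run_all_accounts and the GETMAIL variable are assumptions of this example, not getmail features; GETMAIL only exists so that you can substitute echo for a dry run.

```shell
#!/bin/sh
# Hypothetical wrapper: run getmail once for every rc file in BASE_DIR/getmail.d.
# usage: run_all_accounts BASE_DIR
# Set GETMAIL=echo to print the commands instead of running them.
run_all_accounts() {
    base=$1
    for rcfile in "$base"/getmail.d/*; do
        # Skip anything that is not a regular file (including an unmatched glob).
        [ -f "$rcfile" ] || continue
        "${GETMAIL:-getmail}" -q --getmaildir "$base/oldmail" --rcfile "$rcfile" \
            || echo "getmail failed for $rcfile" >&2
    done
}

# Example: run_all_accounts "$HOME/getmail"
```

A failure for one account is reported on standard error and does not stop the remaining accounts from being processed.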
References:
[1] http://ubuntuforums.org/archive/index.php/t-29365.html
[2] https://wiki.archlinux.org/index.php/Backup_Gmail_with_getmail
