Profile and optimize IMAP operations

Our IMAP implementation can now almost be considered well-behaved, in the sense of not breaking MIME rendering of messages or hitting encoding bugs.

But it is still very slow. This is almost exclusively due to I/O bottlenecks: running the app on a ramdisk gives quite fast response times.

We've been considering two options: (a) changing some pragmas in our sqlcipher backend, and (b) rewriting the IMAP backend to use a transitional in-memory store.

h1. Profiling

To profile, we've been using:

* imaptest. There's a script at https://github.com/leapcode/leap_mail/blob/develop/src/leap/mail/imap/tests/leap_tests_imap.zsh that helps run imaptest for a fixed amount of time and aggregates several subsamples of 10 seconds each.
* line_profiler, to assess where the bottlenecks are.
* direct observation and wtf/min using Thunderbird.

h1. SQLITE tweaks

We've added a couple of testing flags:

LEAP_SQLITE_MEMSTORE=1 will set the temp_store pragma to "MEMORY".

LEAP_SQLITE_NOSYNC=1 will set the synchronous pragma to "OFF".
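As a rough illustration of how these two flags could map onto pragmas, here is a minimal sketch. It uses the standard-library @sqlite3@ module rather than sqlcipher, and the names @FLAG_PRAGMAS@ and @apply_testing_pragmas@ are made up for this example; it is not the code that actually reads the flags in our backend.

```python
import os
import sqlite3

# Map each testing flag to the pragma statement it enables.
# Pragma statements cannot take bound parameters, so they are fixed strings.
# (Hypothetical mapping for illustration, not the backend's real code.)
FLAG_PRAGMAS = {
    "LEAP_SQLITE_MEMSTORE": "PRAGMA temp_store = MEMORY",
    "LEAP_SQLITE_NOSYNC": "PRAGMA synchronous = OFF",
}


def apply_testing_pragmas(conn, environ=None):
    """Run the pragma for every flag set to "1"; return the flags applied."""
    environ = os.environ if environ is None else environ
    applied = []
    for flag, statement in FLAG_PRAGMAS.items():
        if environ.get(flag) == "1":
            conn.execute(statement)
            applied.append(flag)
    return applied
```

Reading a pragma back (for instance @conn.execute("PRAGMA synchronous").fetchone()@) is a cheap way to confirm that a flag really took effect before starting a benchmark run.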

Using the NOSYNC pragma alone does give a great speedup: APPENDs, which are normally quite costly, drop to about 50 msec on average. These results are quite close to what can be observed when running the app on a ramdisk.

After an initial discussion, we agreed to try to avoid the NOSYNC pragma, because it (a) would not allow us to control the writeback period, (b) would not give us any guarantee about whether a write operation had succeeded, and (c) according to the SQLite documentation, might result in corrupted database files.

Using the script mentioned above, we should be able to determine with statistical significance whether each of these pragmas contributes to the speedup.

Are we still missing any important SQLite optimization that we could apply using pragmas?
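One possible way to compare two sets of 10-second subsamples (for example, with and without a given pragma) is Welch's t-test. The sketch below is only a suggestion: the test and the function name @welch_t@ are choices made for this example, and the inputs would be the per-subsample averages produced by the script.

```python
import math
import statistics


def welch_t(baseline, candidate):
    """Return (t, degrees_of_freedom) for two independent samples.

    Each argument is a sequence of at least two measurements,
    e.g. mean APPEND times (in msec) per 10-second subsample.
    """
    n1, n2 = len(baseline), len(candidate)
    # Squared standard error of each sample mean.
    v1 = statistics.variance(baseline) / n1
    v2 = statistics.variance(candidate) / n2
    t = (statistics.mean(baseline) - statistics.mean(candidate)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

The resulting t value would then be compared against the critical value of Student's t distribution for the computed degrees of freedom. With only a handful of subsamples the test has little power, so more runs may be needed before drawing conclusions.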

h1. IN-MEMORY branch

The architecture for the rewrite is based on having two implementors of IMessageStore: MemoryStore and SoledadStore. The first one writes periodically to a queue that is consumed by the SoledadStore. We pass along wrappers for the different message parts, and keep both a 'new' and a 'dirty' flag for each of these parts, so that we can separate new docs that have to be created from docs that we can directly put to Soledad.

In initial tests, the in-memory branch showed append and fetch performance close to the figures obtained with the NOSYNC pragma.

The branch is currently close to being merged: https://github.com/leapcode/leap_mail/pull/123
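The flow described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the code in the branch: @PartWrapper@, @PersistentStore@ and all method names are hypothetical, a dict stands in for Soledad, and the real IMessageStore interface is not reproduced here.

```python
import queue
from dataclasses import dataclass


@dataclass
class PartWrapper:
    """Hypothetical wrapper around one message part (flags, headers, ...)."""
    doc_id: str
    content: dict
    new: bool = True     # never persisted: a doc has to be created
    dirty: bool = False  # persisted before, then modified: a put is enough


class MemoryStore:
    def __init__(self, write_queue):
        self._parts = {}
        self._queue = write_queue

    def put(self, doc_id, content):
        part = self._parts.get(doc_id)
        if part is None:
            self._parts[doc_id] = PartWrapper(doc_id, content)
        else:
            part.content = content
            part.dirty = True

    def flush(self):
        """Meant to be called periodically: enqueue parts with pending changes."""
        for part in self._parts.values():
            if part.new or part.dirty:
                self._queue.put(part)


class PersistentStore:
    """Stand-in for SoledadStore; a dict plays the role of the database."""
    def __init__(self, write_queue):
        self._queue = write_queue
        self.docs = {}
        self.created = 0
        self.updated = 0

    def consume(self):
        while True:
            try:
                part = self._queue.get_nowait()
            except queue.Empty:
                return
            if part.new:
                self.created += 1   # would be a "create doc" call
            elif part.dirty:
                self.updated += 1   # would be a "put doc" call
            else:
                continue            # already persisted by an earlier pass
            self.docs[part.doc_id] = dict(part.content)
            part.new = part.dirty = False
```

Reads can be served from the in-memory parts straight away, while the slow writes happen whenever the consumer gets to them. The price is that anything still sitting in the queue is lost if the process dies before it is consumed.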

h1. Microbenchmarks vs. real-world behavior

The two situations are really different, and a gain in one of them does not necessarily affect the observed results in the other.

For example, we SHOULD turn notifications off temporarily during imaptest runs, since we do not unregister the observers for a mailbox, and the notifications add significantly to the overall times.

Also, logging to stdout increases the averages by 100 msec or so.

Another thing to note is that the memorystore feature is going to give us fine control over which parts of the messages we keep in memory.

This will result in a great speedup of certain operations that are very likely given the nature of the MUA configuration that we recommend (i.e., no on-disk cache). Since the two most likely operations after a SELECT will be

FETCH 1:* (FLAGS)

and

FETCH 1:* (RFC822.HEADER)

we should prefetch and cache the flags and headers documents for all messages, memory limits allowing.
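A minimal sketch of what "memory limits allowing" could look like in practice is given below. The function name @prefetch_parts@, the @(uid, flags, headers)@ tuple shape and the simple byte budget are all assumptions made for illustration; the memorystore may well use a different policy.

```python
def prefetch_parts(messages, max_bytes):
    """Cache flags and headers per message until the byte budget is spent.

    `messages` is an iterable of (uid, flags, headers) tuples, a made-up
    shape used only for this sketch. `headers` is a str or bytes blob.
    """
    cache = {}
    used = 0
    for uid, flags, headers in messages:
        # Approximate cost: size of the headers plus the flag names.
        cost = len(headers) + sum(len(flag) for flag in flags)
        if used + cost > max_bytes:
            break  # limit reached: remaining messages are fetched on demand
        cache[uid] = {"flags": tuple(flags), "headers": headers}
        used += cost
    return cache
```

Both FETCH commands could then be answered from @cache@ for every UID present in it, falling back to the persistent store for the rest.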

Another situation that will benefit from fine control over the parts is the COPY command, where we should be able to iterate over these two parts and copy them very fast in memory.

h1. Other considerations

We should also consider whether the choice of reactor affects performance significantly.

Any comments, requests for clarification, or suggestions about any of the above sections are very welcome at the moment.

(from redmine: created on 2014-01-30, closed on 2014-03-13)