Simplify the CLI's parallel processing implementation

Currently, mat2 uses the multiprocessing module to create a Pool and call imap_unordered on it, with some itertools magic. I'm convinced that we should be able to write a more readable implementation, likely by using the fancy asyncio module of Python 3 instead.
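To make the idea concrete, here is a rough sketch of what an asyncio-based version might look like. It is not mat2's actual code: `clean_file`, `clean_all` and `main` are hypothetical names, and `clean_file` is only a placeholder for whatever per-file routine the CLI really calls. Since asyncio alone does not parallelise CPU-bound work, the sketch assumes the work is still handed to a process pool through `loop.run_in_executor`, and uses `asyncio.as_completed` to get results in completion order, roughly like `imap_unordered` does.

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor
from typing import Callable, Iterable, List, Tuple


def clean_file(path: str) -> bool:
    """Hypothetical stand-in for the per-file cleaning routine.

    Assumption for the sketch: returns True on success, False otherwise.
    """
    return not path.endswith(".unsupported")


async def clean_all(
    paths: Iterable[str],
    executor: Executor,
    worker: Callable[[str], bool] = clean_file,
) -> List[Tuple[str, bool]]:
    """Run `worker` on every path in `executor`, collecting results as they finish."""
    loop = asyncio.get_running_loop()

    async def run_one(path: str) -> Tuple[str, bool]:
        # Hand the blocking call to the pool and keep the path with its result.
        ok = await loop.run_in_executor(executor, worker, path)
        return path, ok

    results = []
    # as_completed yields awaitables in completion order, not submission order.
    for finished in asyncio.as_completed([run_one(p) for p in paths]):
        results.append(await finished)
    return results


def main(paths: Iterable[str]) -> bool:
    """Return True only if every file was processed successfully."""
    with ProcessPoolExecutor() as pool:
        results = asyncio.run(clean_all(paths, pool))
    return all(ok for _, ok in results)
```

On platforms where processes are spawned rather than forked, `main` would need to be called from under an `if __name__ == "__main__":` guard. Whether this actually ends up more readable than the current Pool-based code is an open question; using `concurrent.futures` directly, without asyncio, might be simpler still.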

Edited by jvoisin