    ```
    
     _____ _____ _____ ___
    |     |  _  |_   _|_  |  Keep your data,
    | | | |     | | | |  _|     trash your meta!
    |_|_|_|__|__| |_| |___|
    ```
    
    
    This software is currently in **beta**, please don't use it for anything
    critical.
    
    
    # Metadata and privacy
    
    
    Metadata consist of information that characterizes data.
    Metadata are used to provide documentation for data products.
    In essence, metadata answer who, what, when, where, why, and how about
    every facet of the data that are being documented.
    
    Metadata within a file can tell a lot about you.
    Cameras record data about when a picture was taken and what
    camera was used. Office applications automatically add author and
    company information to documents such as PDFs and spreadsheets.
    
    Maybe you don't want to disclose that information.
    
    This is precisely the job of mat2: getting rid, as much as possible, of
    metadata.
    
    mat2 provides a command line tool, and graphical user interfaces via a service
    menu for Dolphin, the default file manager of KDE, and an extension for
    Nautilus, the default file manager of GNOME.
    
    # Requirements
    
    - `python3-mutagen` for audio support
    - `python3-gi-cairo` and `gir1.2-poppler-0.18` for PDF support
    - `gir1.2-gdkpixbuf-2.0` for image support
    - `gir1.2-rsvg-2.0` for SVG support
    - `FFmpeg`, optionally, for video support
    - `libimage-exiftool-perl` for everything else
    - `bubblewrap`, optionally, for sandboxing
    
    Please note that mat2 requires at least Python 3.5.
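
As a rough pre-flight check, the requirements above can be probed from Python before running mat2. This is only a sketch: the module names (`mutagen`, `gi`) and executable names (`exiftool`, `ffmpeg`, `bwrap`) are assumptions about what the listed packages install, and `probe_requirements` is a hypothetical helper, not part of mat2. mat2's own `--check-dependencies` flag remains the authoritative check.

```python
import importlib.util
import shutil
import sys

# Assumed mapping from the packages listed above to importable module names.
ASSUMED_MODULES = {
    "audio (python3-mutagen)": "mutagen",
    "PDF, images, SVG (GObject introspection)": "gi",
}

# Assumed executable names; FFmpeg and bubblewrap are optional.
ASSUMED_EXECUTABLES = {
    "everything else (libimage-exiftool-perl)": "exiftool",
    "video, optional (FFmpeg)": "ffmpeg",
    "sandboxing, optional (bubblewrap)": "bwrap",
}


def probe_requirements():
    """Return a dict mapping each requirement label to True or False."""
    found = {"Python >= 3.5": sys.version_info >= (3, 5)}
    for label, module in ASSUMED_MODULES.items():
        # find_spec returns None when a top-level module is not installed.
        found[label] = importlib.util.find_spec(module) is not None
    for label, executable in ASSUMED_EXECUTABLES.items():
        # shutil.which returns None when the executable is not on PATH.
        found[label] = shutil.which(executable) is not None
    return found


if __name__ == "__main__":
    for label, ok in probe_requirements().items():
        print("{:<45} {}".format(label, "found" if ok else "missing"))
```

A missing optional entry (FFmpeg, bubblewrap) only disables the corresponding feature, so the sketch reports it rather than failing.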
    
    # Requirements setup on macOS (OS X) using [Homebrew](https://brew.sh/)
    
    ```bash
    brew install exiftool cairo pygobject3 poppler gdk-pixbuf librsvg ffmpeg
    ```
    
    
    # Running the test suite
    
    
    ```bash
    $ python3 -m unittest discover -v
    ```
    
    And if you want to see the coverage:
    
    ```bash
    $ python3-coverage run --branch -m unittest discover -s tests/
    $ python3-coverage report -m --include 'libmat2/*'
    ```
    
    
    # How to use mat2
    
    ```
    usage: mat2 [-h] [-V] [--unknown-members policy] [--inplace] [--no-sandbox]
                [-v] [-l] [--check-dependencies] [-L | -s]
                [files [files ...]]
    
    Metadata anonymisation toolkit 2
    
    positional arguments:
      files                 the files to process
    
    optional arguments:
      -h, --help            show this help message and exit
      -V, --verbose         show more verbose status information
      --unknown-members policy
                            how to handle unknown members of archive-style files
                            (policy should be one of: abort, omit, keep) [Default:
                            abort]
      --inplace             clean in place, without backup
      --no-sandbox          Disable bubblewrap's sandboxing
      -v, --version         show program's version number and exit
      -l, --list            list all supported fileformats
      --check-dependencies  check if mat2 has all the dependencies it needs
      -L, --lightweight     remove SOME metadata
      -s, --show            list harmful metadata detectable by mat2 without
                            removing them
    ```
    
    Note that unless `--inplace` is passed, mat2 **will not** clean files
    in-place: for a file named "myfile.png", for example, it produces a
    cleaned version named "myfile.cleaned.png".
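
To predict where the cleaned copy will land, the naming rule from the example above can be sketched in a few lines of Python. `cleaned_name` is a hypothetical helper written for illustration, not mat2's own code, and it only covers the single-extension case this README documents; names with several suffixes (such as `.tar.gz`) may be handled differently by mat2.

```python
import os.path


def cleaned_name(path):
    """Insert '.cleaned' before the last extension of `path`."""
    root, extension = os.path.splitext(path)
    return root + ".cleaned" + extension


print(cleaned_name("myfile.png"))  # myfile.cleaned.png
```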
    
    
    # Supported formats
    
    The following formats are supported: avi, bmp, css, epub/ncx, flac, gif, jpeg,
    m4a/mp2/mp3/…, mp4, odc/odf/odg/odi/odp/ods/odt/…, ogg/opus/oga/spx/…, pdf,
    png, ppm, pptx/xlsx/docx/…, svg/svgz/…, tar/tar.gz/tar.bz2/tar.xz/…, tiff,
    torrent, wav, wmv, zip, …
      
    
    Although mat2 does its best to display metadata when the `--show` flag is
    passed, a file is not necessarily free of metadata just because mat2 shows
    none. There is no reliable way to detect every possible piece of metadata
    in complex file formats.
    
    This is why you shouldn't rely on the reported presence of metadata to
    decide whether a file needs cleaning.
    
    
    # Notes about the lightweight mode
    
    By default, mat2 might slightly alter the data of your files in order to
    remove as much metadata as possible. For example, text in a PDF might no
    longer be selectable, compressed images might get compressed again, …
    Since some users might be willing to keep some metadata in exchange for
    the guarantee that mat2 won't modify the data of their files, the `-L`
    flag does precisely that.
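
When calling mat2 from a script, the choice between lightweight cleaning and merely showing metadata can be made explicit. The sketch below builds the argument list from flags documented in the usage text (`-L`, `-s`); `build_mat2_command` is a hypothetical helper, and it assumes the `mat2` executable is on `PATH`.

```python
import shutil
import subprocess


def build_mat2_command(files, lightweight=False, show=False):
    """Build an argument list for the mat2 CLI from the documented flags."""
    if lightweight and show:
        # The usage text lists these flags as alternatives: [-L | -s].
        raise ValueError("-L and -s cannot be combined")
    command = ["mat2"]
    if lightweight:
        command.append("-L")
    if show:
        command.append("-s")
    command.extend(files)
    return command


if __name__ == "__main__":
    command = build_mat2_command(["myfile.png"], lightweight=True)
    if shutil.which(command[0]) is None:
        print("mat2 not found on PATH; would run:", " ".join(command))
    else:
        subprocess.run(command, check=False)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.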
    
    
    # Related software
    
    - [Exiftool](https://sno.phy.queensu.ca/~phil/exiftool/mat)
    - [pdf-redact-tools](https://github.com/firstlookmedia/pdf-redact-tools), that
    	tries to deal with *printer dots* too.
    - [pdfparanoia](https://github.com/kanzure/pdfparanoia), that removes
    	watermarks from PDF.
    - [Scrambled Exif](https://f-droid.org/packages/com.jarsilio.android.scrambledeggsif/),
    	an open-source Android application to remove metadata from pictures.
    
    # Contact
    
    If possible, use the [issues system](https://0xacab.org/jvoisin/mat2/issues)
    or the [mailing list](https://mailman.boum.org/listinfo/mat-dev).
    Should a more private contact be needed (e.g. for reporting security issues),
    you can email Julien (jvoisin) Voisin at `julien.voisin+mat2@dustri.org`,
    using the GPG key `9FCDEE9E1A381F311EA62A7404D041E8171901CC`.
    
    
    # Donations
    
    If you want to donate some money, please give it to [Tails](https://tails.boum.org/donate/?r=contribute).
    
    
    # License
    
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Lesser General Public License for more details.
    
    You should have received a copy of the GNU Lesser General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
    
    
    Copyright 2018 Julien (jvoisin) Voisin <julien.voisin+mat2@dustri.org>  
    
    Copyright 2016 Marie-Rose for mat2's logo
    
    The `tests/data/dirty_with_nsid.docx` file is licensed under GPLv3,
    and was borrowed from the Calibre project: https://calibre-ebook.com/downloads/demos/demo.docx
    
    
    The `narrated_powerpoint_presentation.pptx` file is in the public domain.
    
    
    mat2 wouldn't exist without:
    
    - the [Google Summer of Code](https://summerofcode.withgoogle.com/);
    - the fine people from [Tails](https://tails.boum.org);
    - friends
    
    Many thanks to them!