7992cd0d
Commit
7992cd0d
authored
6 years ago
by
Julien (jvoisin) Voisin
Add some documentation
Parent: 9e7a4bd2
Showing 2 changed files with 118 additions and 0 deletions:
- doc/implementation_notes.md (+33, -0)
- doc/threat_model.md (+85, -0)
doc/implementation_notes.md (new file, 0 → 100644, +33 −0)
Implementation notes
====================
Symlink attacks
---------------
MAT2 outputs predictable filenames (like yourfile.jpg.cleaned).
This may lead to symlink attacks. Please check that your OS protects
against them.
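
As an illustration of the risk above, here is a minimal sketch of one common mitigation: creating the output file exclusively, so that a pre-planted symlink at the predictable path makes the write fail instead of being followed. This is not taken from MAT2's code; the helper names `open_output_exclusively` and `write_cleaned` are hypothetical, and the sketch assumes a POSIX-like system.

```python
import os


def open_output_exclusively(path: str, mode: int = 0o600) -> int:
    """Create `path` for writing; fail if anything already exists there.

    With O_CREAT | O_EXCL, open() refuses to proceed when the path already
    exists, including when it is a symlink (even a dangling one).
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    # O_NOFOLLOW is not defined on every platform, so add it only if present.
    flags |= getattr(os, "O_NOFOLLOW", 0)
    return os.open(path, flags, mode)


def write_cleaned(source_path: str, data: bytes) -> str:
    """Write `data` next to `source_path`, using the `.cleaned` suffix.

    Hypothetical helper: raises FileExistsError rather than overwriting
    (or following) whatever already sits at the output path.
    """
    output_path = source_path + ".cleaned"
    fd = open_output_exclusively(output_path)
    with os.fdopen(fd, "wb") as output_file:
        output_file.write(data)
    return output_path
```

With this approach, a caller has to decide what to do on `FileExistsError` (abort, or pick another name); silently overwriting would reintroduce the problem.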
Archives handling
-----------------
MAT2 doesn't support archives yet, because we haven't found a usable way to ask the user
what to do when non-supported files are encountered.
PDF handling
------------
MAT was doing some kind of rendering for PDF files on a cairo surface, then
printing it to a file. This kept the text selectable, but unfortunately it
didn't remove any *deep metadata*, like the ones in embedded pictures. This was
one of the reasons MAT was abandoned: the absence of a satisfying solution to
handle PDF. But apparently, people are ok with
[pdf redact tools](https://github.com/firstlookmedia/pdf-redact-tools), which simply
transform the PDF into images. So this is what MAT2 is doing too.
Images handling
---------------
When possible, images are handled like PDF: rendered on a surface, then saved
to the filesystem. This ensures that all metadata is removed.
doc/threat_model.md (new file, 0 → 100644, +85 −0)
Threat Model
============
The Metadata Anonymisation Toolkit 2 adversary has a number
of goals, capabilities, and counter-attack types that can be
used to guide us towards a set of requirements for MAT2.
This is an overhaul of the threat model of MAT (the first iteration of the software).
Warnings
--------
MAT2 only removes standard metadata from your files; it does _not_:
- anonymise their content
- handle watermarking
- handle steganography
- handle any non-standard metadata field/system

If you really want to be anonymous, use a format that does not contain any
metadata, or better: use plain-text. And as usual, think before clicking.
Adversary
------------
* Goals:
    - Identify the source of the document, since a document
      always has one. Who/where/when/how was a picture
      taken, where was the document leaked from and by
      whom, ...
    - Identify the author; in some cases documents may be
      anonymously authored or created. In these cases,
      identifying the author is the goal.
    - Identify the equipment/software used. If the attacker fails
      to directly identify the author and/or source, their next
      goal is to determine the source of the equipment used
      to produce, copy, and transmit the document. This can
      include the model of camera used to take a photo, or
      which software was used to produce an office document.
* Adversary Capabilities - Positioning
    - The adversary created the document specifically for this
      user. This is the strongest position for the adversary to
      have. In this case, the adversary is capable of inserting
      arbitrary, custom watermarks specifically for tracking
      the user. In general, MAT cannot defend against this
      adversary, but we list it for completeness.
    - The adversary created the document for a group of users.
      In this case, the adversary knows that they attempted to
      limit distribution to a specific group of users. They may
      or may not have watermarked the document for these
      users, but they certainly know the format used.
    - The adversary did not create the document. This is the weakest
      position for the adversary to have. The file format is (most of the time)
      standard and nothing custom is added: MAT
      should be able to remove all meta-information from the
      file.
Requirements
---------------
* Processing
    - MAT2 *should* avoid interactions with information.
      Its goal is to remove metadata, and the user is solely
      responsible for the information of the file.
    - MAT2 *must* warn when encountering an unknown
      format. For example, in a zipfile, if MAT encounters an
      unknown format, it should warn the user, and ask if the
      file should be added to the anonymised archive that is
      produced.
    - MAT2 *must* not add metadata, since its purpose is to
      anonymise files: every added item of metadata decreases
      anonymity.
    - MAT2 *should* handle unknown/hidden metadata fields,
      like proprietary extensions of open formats.
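
The zipfile requirement above can be sketched as follows. This is only an illustration under stated assumptions, not MAT2's implementation (the implementation notes in this same commit say archives are not supported yet). `SUPPORTED_EXTENSIONS`, `clean_member`, `clean_archive`, and the `ask_user` callback are all hypothetical names. The sketch also resets each member's timestamp, in the spirit of the "must not add metadata" requirement.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import Callable

# Hypothetical: the set of member formats a cleaner would know how to handle.
SUPPORTED_EXTENSIONS = {".txt"}

# Fixed timestamp (the default used by zipfile.ZipInfo) written to every member.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def clean_member(name: str, data: bytes) -> bytes:
    """Placeholder for a real per-format cleaner; returns the data unchanged."""
    return data


def clean_archive(source: bytes, ask_user: Callable[[str], bool]) -> bytes:
    """Rebuild a zip archive, asking before keeping members of unknown format.

    `ask_user(member_name)` is expected to warn the user about the unknown
    member and return True if it should still be added to the output archive.
    """
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(source)) as zin, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            if info.is_dir():
                continue
            data = zin.read(info)
            extension = PurePosixPath(info.filename).suffix.lower()
            if extension in SUPPORTED_EXTENSIONS:
                data = clean_member(info.filename, data)
            elif not ask_user(info.filename):
                continue  # the user chose to drop the unknown member
            # Write through a fresh ZipInfo so the original timestamp and
            # attributes are not copied into the new archive.
            clean_info = zipfile.ZipInfo(info.filename, date_time=FIXED_DATE)
            clean_info.compress_type = zipfile.ZIP_DEFLATED
            zout.writestr(clean_info, data)
    return output.getvalue()
```

Passing the decision in as a callback keeps the sketch independent of how the user is actually asked (command line prompt, graphical dialog, or a fixed policy).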