mat2 issues
https://0xacab.org/jvoisin/mat2/-/issues
2018-04-29T21:00:42Z

https://0xacab.org/jvoisin/mat2/-/issues/21
Provide a meaningful return code when something went wrong with the command line
2018-04-29T21:00:42Z
jvoisin

Currently, the command line always returns zero, even when something went wrong. This should be fixed.

jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/22
Write a Thunar extension
2023-01-07T16:07:22Z
jvoisin

While not as popular as Nautilus (#2), Thunar is the file explorer of XFCE, which gives it a non-negligible user base (with the main author of MAT2 amongst it). It thus deserves a MAT2 extension, either via a [custom action]( https://docs.xfce.org/xfce/thunar/custom-actions ), or via the [Python bindings]( https://github.com/xfce-mirror/thunarx-python ).

1.0 - Pony
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/23
Clean up the testsuite
2018-04-30T21:53:00Z
jvoisin

Currently, the CI is leaving some files behind; this is not cool™.
This issue is related to #20

0.1 - Turtle
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/24
Add some corrupted files to the testsuite
2018-07-05T23:02:54Z
jvoisin

This will be useful to check how MAT2 behaves on weird/malformed files, especially office documents, torrents and PDFs, since those are the most complex ones.
It's also important to check that MAT2 doesn't crash at runtime (yay, python3!) when processing malformed files.

0.1.3 - ostrich
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/25
Vectorize the logo
2018-05-15T19:49:40Z
jvoisin

Currently, the [logo]( https://0xacab.org/jvoisin/mat2/blob/master/data/mat2.png ) is a simple PNG that I drew in 3 minutes in GIMP. It would be nice to have a decent vectorized one, but unfortunately, I don't know how to use Inkscape :/

1.0 - Pony

https://0xacab.org/jvoisin/mat2/-/issues/26
Package MAT2 for the Python ecosystem
2018-11-11T19:55:43Z
jvoisin

I know nothing about [Python packaging]( https://packaging.python.org/ ), but apparently this is what cool kids are using these days, so we should join the party.

0.7.0 - Roaster
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/27
`pkgutil.walk_packages` not working when loading libmat2 from `PYTHONPATH`
2018-06-21T21:36:01Z
Jonas

Hello,
now that `setup.py` supports installing mat2 into the operating system, the way libmat2 parser modules are included needs to be changed. The relevant part of `libmat2/parser_factory.py`:
```python
# This loads every parser in a dynamic way
for module_loader, name, ispkg in pkgutil.walk_packages('.libmat2'):
    if not name.startswith('libmat2.'):
        continue
    elif name == 'libmat2.abstract':
        continue
    importlib.import_module(name)
```
This doesn't work with libmat2 being in `PYTHONPATH` as [`pkgutil.walk_packages`](https://docs.python.org/3/library/pkgutil.html#pkgutil.walk_packages) doesn't search in `PYTHONPATH` if you give it a path (`.libmat2`).
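A minimal sketch of one possible alternative, assuming the package can be imported by name: import it first, then iterate over its own `__path__`, so that the lookup follows wherever the import system found the package. `load_submodules` is a hypothetical helper written for illustration, not the code in `parser_factory.py`.

```python
import importlib
import pkgutil


def load_submodules(package_name, skip=()):
    """Import every direct submodule of an importable package.

    The package is located through the normal import system, so it is
    found wherever it lives on sys.path (including PYTHONPATH entries).
    """
    package = importlib.import_module(package_name)
    prefix = package.__name__ + '.'
    loaded = []
    for _finder, name, _ispkg in pkgutil.iter_modules(package.__path__, prefix):
        # Skip explicitly excluded modules and dunder ones such as __main__.
        if name in skip or name.rsplit('.', 1)[-1].startswith('__'):
            continue
        importlib.import_module(name)
        loaded.append(name)
    return loaded
```

With libmat2 installed, the analogous (untested here) call would be `load_submodules('libmat2', skip=('libmat2.abstract',))`.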
The best solution I could find is documented here: https://stackoverflow.com/questions/1707709/list-all-the-modules-that-are-part-of-a-python-package/1707786#1707786

0.1.2 - Duck
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/28
Using a proper logging system
2018-07-10T19:31:10Z
jvoisin

Currently, MAT2 is using `print()` everywhere; this isn't cool™.

2.0 - Eagle
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/29
Handle homoglyphs?
2018-06-21T21:36:01Z
jvoisin

I've been asked by some people to detect and remove [homoglyphs]( https://en.wikipedia.org/wiki/Homoglyph ) from plain-text files. I don't know if this is in MAT2's scope though.

0.1.2 - Duck

https://0xacab.org/jvoisin/mat2/-/issues/30
Deal with the fact that Python is idiotic with regard to zip files and path traversal
2018-06-21T21:36:01Z
jvoisin

Because Python3's stdlib thinks that it's ok in 2018 to be vulnerable to path traversal upon zip extraction, we should take care of this.

0.1.2 - Duck

https://0xacab.org/jvoisin/mat2/-/issues/31
Prevent argument injection in `exiftool`
2018-06-21T21:36:01Z
jvoisin

We're using [exiftool]( https://www.sno.phy.queensu.ca/~phil/exiftool/ ) with `Popen`, thus making us vulnerable to argument injection.
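One common mitigation, sketched here as an assumption rather than as MAT2's actual fix, is to rewrite a filename that starts with a dash so that it designates the same file but can no longer be parsed as an option. `safe_path_argument` and `build_command` are hypothetical helpers.

```python
import os


def safe_path_argument(path):
    """Return `path` in a form that cannot be mistaken for an option."""
    if path.startswith('-'):
        # './-name' designates the same file but no longer starts with a dash.
        return os.path.join(os.curdir, path)
    return path


def build_command(executable, path):
    """Build an argument list (no shell involved) for a subprocess call."""
    return [executable, safe_path_argument(path)]
```

For a file literally named `-ver`, `build_command('exiftool', '-ver')` passes the path with a `./` prefix, so the name is treated as a file and not as a flag.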
We can't simply blacklist files starting with a dash (`-`), because such filenames are legitimate.

0.1.2 - Duck

https://0xacab.org/jvoisin/mat2/-/issues/32
Add some file/folder related tests
2018-08-03T20:22:11Z
jvoisin

It would be nice to have tests for:
- [x] Non-writable files
- [x] Non-existing folders
- [ ] Folders with no files in them
- [x] Non-readable files
- [x] A file which is a character device
- [x] A broken symbolic link

0.3.0 - Pig
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/33
Increase the (branch) coverage close to 100%
2018-07-08T21:26:05Z
jvoisin

Currently, MAT2's coverage is a bit over 90%. It would be nice to be closer to 100%, and to add some pathologically broken files to test its resilience.
```
python3-coverage run -m unittest discover -s tests/
python3-coverage report -m --include 'libmat2/*'
```

0.2.0 - Bunny
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/34
Add more static/dynamic analysers to the CI
2018-06-22T21:16:34Z
jvoisin

Since Python is highly dynamic, we should add as many relevant static and dynamic analysers as possible to the CI:
- [x] [bandit]( https://github.com/openstack/bandit )
- [x] [mypy]( https://github.com/python/mypy )
- [x] [pyflakes]( https://github.com/PyCQA/pyflakes )
- [ ] [Coverity]( https://www.synopsys.com/software-integrity.html ): apparently, coverity [doesn't support Python 3 yet]( https://community.synopsys.com/s/article/From-Case-RE-Build-Failed-for-Python-code-in-Coverity-tool )
- And apparently, it's not going to happen [anytime soon]( https://www.synopsys.com/software-integrity/security-testing/static-analysis-sast.html )

2.0 - Eagle

https://0xacab.org/jvoisin/mat2/-/issues/35
Check for external dependencies at launch time
2018-07-10T19:24:43Z
jvoisin

It would be nice if MAT2 could check for external dependencies (like `exiftool`) at launch time, to avoid runtime crashes.
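Such a check could be sketched as follows, assuming the dependencies are executables looked up on `PATH`. `find_missing_dependencies` is a hypothetical name, and `shutil.which` is only one way to perform the lookup.

```python
import shutil


def find_missing_dependencies(executables):
    """Return the names from `executables` that cannot be found on PATH."""
    return [name for name in executables if shutil.which(name) is None]
```

A caller could run `find_missing_dependencies(['exiftool'])` at startup and report any returned names before processing files.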
Shall we do this directly inside the library, or in the CLI? I think that the way to go would be to provide a function in the library, and to use it in the CLI.

0.2.0 - Bunny
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/36
MAT2 doesn't work on Fedora
2018-06-21T21:36:02Z
jvoisin

As reported by [atenart]( https://github.com/atenart ), MAT2 doesn't work on Fedora, as indicated by the testsuite:
```python
(antoine) ~/hack/mat2 » python3 setup.py test
running test
running egg_info
writing mat2.egg-info/PKG-INFO
writing dependency_links to mat2.egg-info/dependency_links.txt
writing requirements to mat2.egg-info/requires.txt
writing top-level names to mat2.egg-info/top_level.txt
reading manifest file 'mat2.egg-info/SOURCES.txt'
writing manifest file 'mat2.egg-info/SOURCES.txt'
running build_ext
test_jpg (tests.test_climat2.TestCleanMeta) ... ok
test_version (tests.test_climat2.TestExclusiveArgs) ... ok
test_docx (tests.test_climat2.TestGetMeta) ... ok
test_flac (tests.test_climat2.TestGetMeta) ... FAIL
test_jpg (tests.test_climat2.TestGetMeta) ... ok
test_mp3 (tests.test_climat2.TestGetMeta) ... ok
test_odt (tests.test_climat2.TestGetMeta) ... ok
test_ogg (tests.test_climat2.TestGetMeta) ... ok
test_pdf (tests.test_climat2.TestGetMeta) ... ok
test_png (tests.test_climat2.TestGetMeta) ... ok
test_help (tests.test_climat2.TestHelp) ... ok
test_no_arg (tests.test_climat2.TestHelp) ... ok
test_pdf (tests.test_climat2.TestIsSupported) ... ok
test_nonzero (tests.test_climat2.TestReturnValue) ... ok
test_zero (tests.test_climat2.TestReturnValue) ... ok
test_version (tests.test_climat2.TestVersion) ... ok
test_bmp (tests.test_libmat2.TestCleaning) ... ok
test_flac (tests.test_libmat2.TestCleaning) ... ok
test_jpg (tests.test_libmat2.TestCleaning) ... ok
test_libreoffice (tests.test_libmat2.TestCleaning) ... content.xml's format (text/xml) isn't supported
FAIL
test_mp3 (tests.test_libmat2.TestCleaning) ... ok
test_odf (tests.test_libmat2.TestCleaning) ... settings.xml's format (text/xml) isn't supported
FAIL
test_odg (tests.test_libmat2.TestCleaning) ... settings.xml's format (text/xml) isn't supported
FAIL
test_office (tests.test_libmat2.TestCleaning) ... word/settings.xml's format (text/xml) isn't supported
FAIL
test_ogg (tests.test_libmat2.TestCleaning) ... ok
test_pdf (tests.test_libmat2.TestCleaning) ... INFO:root:Rendering page 1/20
INFO:root:Rendering page 2/20
INFO:root:Rendering page 3/20
INFO:root:Rendering page 4/20
INFO:root:Rendering page 5/20
INFO:root:Rendering page 6/20
INFO:root:Rendering page 7/20
INFO:root:Rendering page 8/20
INFO:root:Rendering page 9/20
INFO:root:Rendering page 10/20
INFO:root:Rendering page 11/20
INFO:root:Rendering page 12/20
INFO:root:Rendering page 13/20
INFO:root:Rendering page 14/20
INFO:root:Rendering page 15/20
INFO:root:Rendering page 16/20
INFO:root:Rendering page 17/20
INFO:root:Rendering page 18/20
INFO:root:Rendering page 19/20
INFO:root:Rendering page 20/20
ok
test_png (tests.test_libmat2.TestCleaning) ... ok
test_tiff (tests.test_libmat2.TestCleaning) ... ok
test_torrent (tests.test_libmat2.TestCleaning) ... ok
test_pdf (tests.test_libmat2.TestCorruptedFiles) ... ok
test_png (tests.test_libmat2.TestCorruptedFiles) ... ok
test_png2 (tests.test_libmat2.TestCorruptedFiles) ... ok
test_torrent (tests.test_libmat2.TestCorruptedFiles) ... DEBUG:root:Not a valid bencoded string: 137
DEBUG:root:Not a valid bencoded string: 137
DEBUG:root:Not a valid bencoded string: 137
ok
test_libreoffice (tests.test_libmat2.TestDeepCleaning) ... content.xml's format (text/xml) isn't supported
FAIL
test_office (tests.test_libmat2.TestDeepCleaning) ... word/settings.xml's format (text/xml) isn't supported
FAIL
test_pdf (tests.test_libmat2.TestExplicitelyUnsupportedFiles) ... ok
test_docx (tests.test_libmat2.TestGetMeta) ... ok
test_flac (tests.test_libmat2.TestGetMeta) ... ok
test_jpg (tests.test_libmat2.TestGetMeta) ... ok
test_libreoffice (tests.test_libmat2.TestGetMeta) ... ok
test_mp3 (tests.test_libmat2.TestGetMeta) ... ok
test_ogg (tests.test_libmat2.TestGetMeta) ... ok
test_pdf (tests.test_libmat2.TestGetMeta) ... ok
test_png (tests.test_libmat2.TestGetMeta) ... ok
test_tiff (tests.test_libmat2.TestGetMeta) ... ok
test_torrent (tests.test_libmat2.TestGetMeta) ... ok
test_pdf (tests.test_libmat2.TestLightWeightCleaning) ... INFO:root:Rendering page 1/20
INFO:root:Rendering page 2/20
INFO:root:Rendering page 3/20
INFO:root:Rendering page 4/20
INFO:root:Rendering page 5/20
INFO:root:Rendering page 6/20
INFO:root:Rendering page 7/20
INFO:root:Rendering page 8/20
INFO:root:Rendering page 9/20
INFO:root:Rendering page 10/20
INFO:root:Rendering page 11/20
INFO:root:Rendering page 12/20
INFO:root:Rendering page 13/20
INFO:root:Rendering page 14/20
INFO:root:Rendering page 15/20
INFO:root:Rendering page 16/20
INFO:root:Rendering page 17/20
INFO:root:Rendering page 18/20
INFO:root:Rendering page 19/20
INFO:root:Rendering page 20/20
ok
test_png (tests.test_libmat2.TestLightWeightCleaning) ... ok
test_ver_injection (tests.test_libmat2.TestParameterInjection) ... ok
test_subsubcalss (tests.test_libmat2.TestParserFactory)
Test that our module auto-detection is handling sub-sub-classes ... ok
test_docx_with_svg (tests.test_libmat2.TestUnsupportedEmbeddedFiles) ... [Content_Types].xml's format (text/xml) isn't supported
ok
test_odt_with_svg (tests.test_libmat2.TestUnsupportedEmbeddedFiles) ... Pictures/100006EF00026C3800042270F3FD4B490E928E68.svg's format (image/svg+xml) isn't supported
ok
test_pdf (tests.test_libmat2.TestUnsupportedFiles) ... ok
======================================================================
FAIL: test_flac (tests.test_climat2.TestGetMeta)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/hack/mat2/tests/test_climat2.py", line 123, in test_flac
    self.assertIn(b'comments: Thank you for using MAT !', stdout)
AssertionError: b'comments: Thank you for using MAT !' not found in b"[-] ./tests/data/dirty.flac's format (audio/x-flac) is not supported\n"
======================================================================
FAIL: test_libreoffice (tests.test_libmat2.TestCleaning)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/hack/mat2/tests/test_libmat2.py", line 378, in test_libreoffice
    self.assertTrue(ret)
AssertionError: False is not true
======================================================================
FAIL: test_odf (tests.test_libmat2.TestCleaning)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/hack/mat2/tests/test_libmat2.py", line 442, in test_odf
    self.assertTrue(ret)
AssertionError: False is not true
======================================================================
FAIL: test_odg (tests.test_libmat2.TestCleaning)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/hack/mat2/tests/test_libmat2.py", line 459, in test_odg
    self.assertTrue(ret)
AssertionError: False is not true
======================================================================
FAIL: test_office (tests.test_libmat2.TestCleaning)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/hack/mat2/tests/test_libmat2.py", line 361, in test_office
    self.assertTrue(ret)
AssertionError: False is not true
======================================================================
FAIL: test_libreoffice (tests.test_libmat2.TestDeepCleaning)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/hack/mat2/tests/test_libmat2.py", line 210, in test_libreoffice
    self.assertTrue(ret)
AssertionError: False is not true
======================================================================
FAIL: test_office (tests.test_libmat2.TestDeepCleaning)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/hack/mat2/tests/test_libmat2.py", line 190, in test_office
    self.assertTrue(ret)
AssertionError: False is not true
----------------------------------------------------------------------
Ran 53 tests in 9.180s
FAILED (failures=7)
Test failed: <unittest.runner.TextTestResult run=53 errors=0 failures=7>
error: Test failed: <unittest.runner.TextTestResult run=53 errors=0 failures=7>
Command exited with error 1
```

0.1.2 - Duck

https://0xacab.org/jvoisin/mat2/-/issues/37
Translation files
2023-01-07T16:07:33Z
emmapeel

It would be great to generate some .po files so we can translate MAT2 to other languages.
The sooner the better, as translators will also need some time to translate...

2.0 - Eagle

https://0xacab.org/jvoisin/mat2/-/issues/38
Warn the user of "harmless" filetypes
2018-06-21T21:36:02Z
Zachary Spector

mat2 currently considers plain application/xml files as being free of metadata, when it's really entirely possible that an XML file could have metadata in a schema that we don't know about. We can't support every possible schema, of course, but currently the tool runs the same way whether it's actually removing metadata or it isn't, and this could result in someone getting a false sense of security.
I'm imagining someone wanting to leak some in-house, totally undocumented schema in a hurry; knowing what metadata is; but not knowing a lot about how file formats really work. That person might run mat2 on their file and think it's clean. We should tell them not to assume this.

0.1.2 - Duck
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/39
What do we want to do with files that have a "revision mode"
2018-07-05T23:02:54Z
jvoisin

@joe pointed out that MAT2 doesn't handle (as in "remove") revisions from office files.
What should we do about this? Shall we keep the revisions and pretend that they are data, or shall we only keep the latest one?

0.1.3 - ostrich
jvoisin

https://0xacab.org/jvoisin/mat2/-/issues/40
`__remove_superficial_meta` not working with older Poppler versions
2018-07-05T23:02:55Z
Zachary Spector

I'm not sure when it happened exactly, but Poppler seems to have removed the `Document.set_producer` method used by `PDFParser.__remove_superficial_meta`. I'm using mat2 with Poppler version 0.41.0 and the PDF test is failing because of this.

0.1.3 - ostrich
jvoisin
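
One way to tolerate a binding that lacks a method is to probe for it before calling it. The sketch below uses stand-in classes, not the real Poppler bindings, and `clear_producer` is a hypothetical helper. Only the method name `set_producer` comes from the report above.

```python
class DocumentWithoutSetter:
    """Stand-in for a binding version that lacks `set_producer`."""


class DocumentWithSetter:
    """Stand-in for a binding version that provides `set_producer`."""

    def __init__(self):
        self.producer = 'some producer'

    def set_producer(self, value):
        self.producer = value


def clear_producer(document):
    """Blank the producer field if the binding offers a setter.

    Returns True when the field was cleared and False when the setter is
    missing, so the caller can report that this metadata was left in place.
    """
    setter = getattr(document, 'set_producer', None)
    if setter is None:
        return False
    setter('')
    return True
```

Returning `False` instead of silently continuing matters for a metadata-removal tool: the caller should surface the skipped field to the user.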