mat2 issues
https://0xacab.org/jvoisin/mat2/-/issues
2022-03-16T19:35:34Z

Keep links in PDF documents
https://0xacab.org/jvoisin/mat2/-/issues/166
2022-03-16T19:35:34Z, Romain

A user of Metadata Cleaner [reported that links are removed after cleaning](https://gitlab.com/rmnvgr/metadata-cleaner/-/issues/22).
Of course it is impossible to keep links with the regular cleaning, but I wonder if there's a way with lightweight cleaning. Maybe by using [`Poppler.Page.render()`](https://poppler.freedesktop.org/api/glib/poppler-Poppler-Page.html#poppler-page-render) instead of `Poppler.Page.render_for_printing()`? I haven't tried it and don't know the implications of using it.

Evaluate the relevance of mat2 wrt. the USA Library of Congress most used formats
https://0xacab.org/jvoisin/mat2/-/issues/157
2023-05-03T20:42:27Z, jvoisin

There is a [really nice paper](https://osf.io/cxh9s/) ([local mirror](/uploads/726ca748875f2aaa54a01068c823cc09/39_Mark_Cooper_LP.pdf)) about the most used file formats at the USA's Library of Congress. We should take a look at it, and implement formats used there but not supported by mat2.
It boils down to:
- [ ] jp2
- [x] tif
- [x] jpg
- [ ] xml - we can't really support it
- [x] pdf
- [x] txt
- [x] gif
- [x] gz
- [ ] i41
- [ ] mxf
- [ ] mpg
- [ ] wav
- [ ] mov
- [ ] iso
  - https://github.com/clalancette/pycdlib
- [ ] dv
- [x] zip
- [ ] rar - Python's library for this format, [rarfile](https://rarfile.readthedocs.io/api.html), doesn't provide enough control to remove all the metadata.
- [x] tar
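
To illustrate the kind of per-member control the rar entry is asking for, here is a minimal sketch using Python's standard `zipfile` module. It is not mat2's actual implementation: the `normalize_zip` helper, the fixed 1980 timestamp, the uniform permission bits, and the choice of which fields to reset are all assumptions made for this example.

```python
import io
import zipfile

# Fixed timestamp applied to every member (an assumption for this sketch);
# 1980-01-01 is the earliest date the zip format can represent.
EPOCH = (1980, 1, 1, 0, 0, 0)


def normalize_zip(data: bytes) -> bytes:
    """Return a copy of a zip archive with per-member metadata reset.

    Hypothetical helper, not part of mat2. Member contents are copied
    unchanged; only archive-level metadata is rewritten.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        # Sort by name so the original insertion order is not preserved.
        for info in sorted(src.infolist(), key=lambda i: i.filename):
            # A fresh ZipInfo starts with an empty comment and extra field,
            # so nothing is carried over from the source entry.
            clean = zipfile.ZipInfo(info.filename, date_time=EPOCH)
            clean.compress_type = zipfile.ZIP_DEFLATED
            clean.create_system = 3            # always report "Unix"
            clean.external_attr = 0o644 << 16  # uniform permissions
            dst.writestr(clean, src.read(info))
        # The archive-level comment is simply not copied to the new file.
    return out.getvalue()
```

Being able to rebuild each entry from scratch like this is what makes a format cleanable; a library that only reads archives, or hides these fields, leaves no way to guarantee that everything has been removed.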