mat2-web issues
https://0xacab.org/jvoisin/mat2-web/-/issues
2022-01-24T19:54:22Z

"bwrap: capset failed: Operation not permitted" when cleaning large pptx files
https://0xacab.org/jvoisin/mat2-web/-/issues/59
2022-01-24T19:54:22Z, L P

I'm experiencing some issues when I try to clean large .pptx files.
I get the following error:
```python
web_1 | bwrap: capset failed: Operation not permitted
web_1 | Traceback (most recent call last):
web_1 | File "/usr/local/lib/python3.7/dist-packages/libmat2/exiftool.py", line 29, in get_meta
web_1 | check=True, stdout=subprocess.PIPE).stdout
web_1 | File "/usr/local/lib/python3.7/dist-packages/libmat2/bubblewrap.py", line 106, in run
web_1 | completed_process = subprocess.run(prefix_args + args, **kwargs)
web_1 | File "/usr/lib/python3.7/subprocess.py", line 487, in run
web_1 | output=stdout, stderr=stderr)
web_1 | subprocess.CalledProcessError: Command '['/usr/bin/bwrap', '--ro-bind', '/usr', '/usr', '--ro-bind', '/lib', '/lib', '--ro-bind', '/lib64', '/lib64', '--ro-bind', '/bin', '/bin', '--ro-bind', '/sbin', '/sbin', '--ro-bind', '/etc/alternatives', '/etc/alternatives', '--ro-bind', '/var/www/mat2-web', '/var/www/mat2-web', '--ro-bind', '/etc/ld.so.cache', '/etc/ld.so.cache', '--dev', '/dev', '--proc', '/proc', '--chdir', '/var/www/mat2-web', '--unshare-user-try', '--unshare-ipc', '--unshare-pid', '--unshare-net', '--unshare-uts', '--unshare-cgroup-try', '--new-session', '--cap-drop', 'all', '--ro-bind', '/tmp/tmpo25dkbhr/docProps/thumbnail.jpeg', '/tmp/tmpo25dkbhr/docProps/thumbnail.jpeg', '/usr/bin/exiftool', '-json', '/tmp/tmpo25dkbhr/docProps/thumbnail.jpeg']' returned non-zero exit status 1.
web_1 |
web_1 | During handling of the above exception, another exception occurred:
web_1 |
web_1 | Traceback (most recent call last):
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2464, in __call__
web_1 | return self.wsgi_app(environ, start_response)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2450, in wsgi_app
web_1 | response = self.handle_exception(e)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_cors/extension.py", line 165, in wrapped_function
web_1 | return cors_after_request(app.make_response(f(*args, **kwargs)))
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_restful/__init__.py", line 272, in error_router
web_1 | return original_handler(e)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1867, in handle_exception
web_1 | reraise(exc_type, exc_value, tb)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/_compat.py", line 38, in reraise
web_1 | raise value.with_traceback(tb)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2447, in wsgi_app
web_1 | response = self.full_dispatch_request()
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1952, in full_dispatch_request
web_1 | rv = self.handle_user_exception(e)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_cors/extension.py", line 165, in wrapped_function
web_1 | return cors_after_request(app.make_response(f(*args, **kwargs)))
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_restful/__init__.py", line 272, in error_router
web_1 | return original_handler(e)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1821, in handle_user_exception
web_1 | reraise(exc_type, exc_value, tb)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/_compat.py", line 38, in reraise
web_1 | raise value.with_traceback(tb)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1950, in full_dispatch_request
web_1 | rv = self.dispatch_request()
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1936, in dispatch_request
web_1 | return self.view_functions[rule.endpoint](**req.view_args)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_restful/__init__.py", line 468, in wrapper
web_1 | resp = resource(*args, **kwargs)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/views.py", line 89, in view
web_1 | return self.dispatch_request(*args, **kwargs)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_restful/__init__.py", line 583, in dispatch_request
web_1 | resp = meth(*args, **kwargs)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flasgger/utils.py", line 248, in wrapper
web_1 | return function(*args, **kwargs)
web_1 | File "./matweb/rest_api.py", line 120, in post
web_1 | _, _, _, output_filename = utils.cleanup(parser, filepath, current_app.config['UPLOAD_FOLDER'])
web_1 | File "./matweb/utils.py", line 86, in cleanup
web_1 | meta_after = parser.get_meta()
web_1 | File "/usr/local/lib/python3.7/dist-packages/libmat2/archive.py", line 146, in get_meta
web_1 | local_meta = {**local_meta, **member_parser.get_meta()}
web_1 | File "/usr/local/lib/python3.7/dist-packages/libmat2/exiftool.py", line 35, in get_meta
web_1 | raise ValueError
web_1 | ValueError
```
Cleaning the same file with mat2 directly works absolutely fine.
I tried disabling all bubblewrap calls in mat2, but then I get the following error instead:
```python
web_1 | b'[{\n "SourceFile": "/tmp/tmphv8su1ih/ppt/media/image10.svg",\n "ExifToolVersion": 11.16,\n "FileName": "image10.svg",\n "Directory": "/tmp/tmphv8su1ih/ppt/media",\n "FileSize": "12 kB",\n "FileModifyDate": "2022:01:05 10:59:26+00:00",\n "FileAccessDate": "2022:01:05 10:59:26+00:00",\n "FileInodeChangeDate": "2022:01:05 10:59:26+00:00",\n "FilePermissions": "r--------",\n "Error": "File format error"\n}]\n'
web_1 | Traceback (most recent call last):
web_1 | File "/usr/local/lib/python3.7/dist-packages/libmat2/exiftool.py", line 27, in get_meta
web_1 | check=True, stdout=subprocess.PIPE).stdout
web_1 | File "/usr/lib/python3.7/subprocess.py", line 487, in run
web_1 | output=stdout, stderr=stderr)
web_1 | subprocess.CalledProcessError: Command '['/usr/bin/exiftool', '-json', '/tmp/tmphv8su1ih/ppt/media/image10.svg']' returned non-zero exit status 1.
web_1 |
web_1 | During handling of the above exception, another exception occurred:
web_1 |
web_1 | Traceback (most recent call last):
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2464, in __call__
web_1 | return self.wsgi_app(environ, start_response)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2450, in wsgi_app
web_1 | response = self.handle_exception(e)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_cors/extension.py", line 165, in wrapped_function
web_1 | return cors_after_request(app.make_response(f(*args, **kwargs)))
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_restful/__init__.py", line 272, in error_router
web_1 | return original_handler(e)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1867, in handle_exception
web_1 | reraise(exc_type, exc_value, tb)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/_compat.py", line 38, in reraise
web_1 | raise value.with_traceback(tb)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2447, in wsgi_app
web_1 | response = self.full_dispatch_request()
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1952, in full_dispatch_request
web_1 | rv = self.handle_user_exception(e)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_cors/extension.py", line 165, in wrapped_function
web_1 | return cors_after_request(app.make_response(f(*args, **kwargs)))
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_restful/__init__.py", line 272, in error_router
web_1 | return original_handler(e)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1821, in handle_user_exception
web_1 | reraise(exc_type, exc_value, tb)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/_compat.py", line 38, in reraise
web_1 | raise value.with_traceback(tb)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1950, in full_dispatch_request
web_1 | rv = self.dispatch_request()
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1936, in dispatch_request
web_1 | return self.view_functions[rule.endpoint](**req.view_args)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_restful/__init__.py", line 468, in wrapper
web_1 | resp = resource(*args, **kwargs)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask/views.py", line 89, in view
web_1 | return self.dispatch_request(*args, **kwargs)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flask_restful/__init__.py", line 583, in dispatch_request
web_1 | resp = meth(*args, **kwargs)
web_1 | File "/usr/local/lib/python3.7/dist-packages/flasgger/utils.py", line 248, in wrapper
web_1 | return function(*args, **kwargs)
web_1 | File "./matweb/rest_api.py", line 120, in post
web_1 | _, _, _, output_filename = utils.cleanup(parser, filepath, current_app.config['UPLOAD_FOLDER'])
web_1 | File "./matweb/utils.py", line 86, in cleanup
web_1 | meta_after = parser.get_meta()
web_1 | File "/usr/local/lib/python3.7/dist-packages/libmat2/archive.py", line 146, in get_meta
web_1 | local_meta = {**local_meta, **member_parser.get_meta()}
web_1 | File "/usr/local/lib/python3.7/dist-packages/libmat2/images.py", line 40, in get_meta
web_1 | meta = super().get_meta()
web_1 | File "/usr/local/lib/python3.7/dist-packages/libmat2/exiftool.py", line 30, in get_meta
web_1 | raise ValueError
web_1 | ValueError
```
I'm not quite sure whether this error is really caused by mat2-web, but since mat2 itself works fine with this file, this seems like the correct place to report it.
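The first line of the second log is ExifTool's `-json` output, and it already carries an `"Error"` field for `image10.svg`. As a debugging aid, a minimal sketch like the following could surface that field before the bare `ValueError` hides it. The function name `exiftool_errors` is hypothetical and not part of libmat2.

```python
import json


def exiftool_errors(raw):
    """Return (SourceFile, Error) pairs found in ExifTool's -json output.

    `raw` is the bytes object captured from ExifTool's stdout, i.e. a JSON
    array with one object per processed file.
    """
    entries = json.loads(raw.decode("utf-8"))
    # Only entries that carry an "Error" key describe a failed file.
    return [(entry.get("SourceFile"), entry["Error"])
            for entry in entries if "Error" in entry]
```

Fed with the output shown in the log, this would report the SVG file together with the message "File format error".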
BTW: Is it possible that bubblewrap is missing from the production Dockerfile? Before I added `apt install bubblewrap`, mat2-web gave me an error that bwrap was not found.

Run on subpath
https://0xacab.org/jvoisin/mat2-web/-/issues/53
2020-10-14T14:39:29Z, ng

I'd like to run the app/container on a subpath (e.g. `/mat2web`), but nginx and also flask seem to always assume they run on `/`, so script and asset paths always start with `/`.
I do not yet fully understand how flask & uwsgi play together when running on a subpath, or whether any code change is actually needed (beyond config changes), but it likely requires adapting the nginx templates on the fly.
My proposal would be to have the entrypoint check for an ENV variable (e.g. MAT2WEB_SUBPATH) and, if present, rewrite the templates (maybe nginx can even do it dynamically) and make sure flask is aware of it (e.g. by setting SCRIPT_NAME).
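To illustrate the `SCRIPT_NAME` part of the proposal, here is a minimal sketch of a WSGI middleware that moves a prefix from `PATH_INFO` to `SCRIPT_NAME`. It assumes the front-end forwards the full request path including the prefix. The names `SubpathMiddleware` and `wrap_from_env` are hypothetical, and `MAT2WEB_SUBPATH` is only the variable name suggested above, not an existing option.

```python
import os


class SubpathMiddleware:
    """Serve the wrapped WSGI app under a URL prefix such as /mat2web."""

    def __init__(self, app, prefix):
        self.app = app
        stripped = prefix.strip("/")
        # Normalise "mat2web" or "/mat2web/" to "/mat2web"; "" disables it.
        self.prefix = "/" + stripped if stripped else ""

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        under_prefix = path == self.prefix or path.startswith(self.prefix + "/")
        if self.prefix and under_prefix:
            # Move the prefix from PATH_INFO to SCRIPT_NAME so that URL
            # generation in the app includes it.
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + self.prefix
            environ["PATH_INFO"] = path[len(self.prefix):]
        return self.app(environ, start_response)


def wrap_from_env(app, env=os.environ):
    """Wrap `app` only if the (hypothetical) MAT2WEB_SUBPATH variable is set."""
    prefix = env.get("MAT2WEB_SUBPATH", "")
    return SubpathMiddleware(app, prefix) if prefix.strip("/") else app
```

With a Flask application object `app`, this could be hooked in as `app.wsgi_app = wrap_from_env(app.wsgi_app)`. Whether uwsgi or nginx should set `SCRIPT_NAME` instead is a separate design choice that this sketch does not settle.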
Motivation: This would allow bundling mat2web with other webapps, without having to host it on a separate subdomain.

Container fails to start on systems without IPv6 (or IPv4)
https://0xacab.org/jvoisin/mat2-web/-/issues/52
2020-10-14T14:39:31Z, ng

Nginx is hardcoded to listen on IPv6 (`[::]`): https://0xacab.org/jvoisin/mat2-web/-/blob/master/config/nginx-default.conf#L4
If you run this container on a system without IPv6 (disabled via kernel cmdline), you get the following:
```bash
$ podman run -ti -p8181:8080 --read-only --tmpfs /tmp --tmpfs /run/uwsgi --tmpfs=/app/upload --security-opt=no-new-privileges registry.0xacab.org/jvoisin/mat2-web:latest
2020/09/22 12:42:46 [emerg] 8#8: socket() [::]:8080 failed (97: Address family not supported by protocol)
nginx: [emerg] socket() [::]:8080 failed (97: Address family not supported by protocol)
[uWSGI] getting INI configuration from /etc/uwsgi/apps-enabled/mat2-web.ini
[...]
```
The container itself still appears to be running, because uwsgi starts and thus the main process does not exit, even though nginx failed.
You likely get the same issue on systems without IPv4, though I'm not sure how line 3 of the nginx config behaves when no IPv4 stack is present.
Maybe listening only via line 3 (without the `[::]` listener on line 4) would be sufficient?
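An alternative to dropping the IPv6 listener entirely would be to probe for IPv6 at container start and emit the `[::]` line only when it is usable. The following is a rough sketch under that assumption. The helper names are hypothetical, and the directive strings simply mirror the address and port seen in the log.

```python
import socket


def ipv6_available():
    """Return True if an IPv6 TCP socket can be created on this host."""
    if not socket.has_ipv6:
        return False
    try:
        # Creating the socket is the step that failed with errno 97
        # ("Address family not supported by protocol") in the log.
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM):
            return True
    except OSError:
        return False


def listen_directives(port, ipv6=None):
    """Build nginx `listen` lines, adding the IPv6 one only when usable."""
    if ipv6 is None:
        ipv6 = ipv6_available()
    lines = ["listen {};".format(port)]
    if ipv6:
        lines.append("listen [::]:{};".format(port))
    return lines
```

An entrypoint script could write these lines into the nginx config before starting nginx. The same kind of probe with `socket.AF_INET` would cover the IPv4-less case.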
Mainly recording the error here in case someone else tries to debug the same issue.

Add ability to work on links
https://0xacab.org/jvoisin/mat2-web/-/issues/28
2020-04-17T18:12:35Z, georg

I was told that it would be valuable if mat2-web were capable of cleaning files provided via an HTTP link, for example a .pdf made available on a website.
mat2-web should then probably download the file, clean it and serve it afterwards.
Without much thought put into this, I guess there are some dragons lurking here, as it might be easy to cause some sort of denial of service, for example if the file is big.
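One way to guard against the large-file case is to cap the download size and refuse to follow redirects. The sketch below uses only the standard library. The names `fetch`, `read_limited`, `NoRedirect` and the 16 MiB limit are hypothetical choices for illustration, not anything mat2-web currently provides.

```python
import io
import urllib.parse
import urllib.request

MAX_BYTES = 16 * 1024 * 1024  # hypothetical upper bound for remote files


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse redirects: returning None makes urllib raise HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def read_limited(stream, max_bytes, chunk_size=64 * 1024):
    """Read `stream` fully, but fail once more than `max_bytes` arrive."""
    buf = io.BytesIO()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return buf.getvalue()
        if buf.tell() + len(chunk) > max_bytes:
            raise ValueError("remote file exceeds the size limit")
        buf.write(chunk)


def fetch(url, max_bytes=MAX_BYTES, timeout=10):
    """Download `url` over HTTP(S) without redirects, bounded in size."""
    if urllib.parse.urlsplit(url).scheme not in ("http", "https"):
        raise ValueError("only http and https URLs are accepted")
    opener = urllib.request.build_opener(NoRedirect)
    with opener.open(url, timeout=timeout) as resp:
        declared = resp.headers.get("Content-Length")
        # The header is only a hint, so the body is size-checked as well.
        if declared is not None and int(declared) > max_bytes:
            raise ValueError("declared Content-Length exceeds the size limit")
        return read_limited(resp, max_bytes)
```

The size check happens twice on purpose: a server can omit or misreport the header, so the streamed body is bounded independently of it.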
Probably, the `Content-Length` should be checked and redirects should be ignored. More things likely need to be taken into consideration as well.

Introduce releases and signatures
https://0xacab.org/jvoisin/mat2-web/-/issues/26
2020-09-12T17:14:20Z, georg

Currently, the README advises to `git clone` this repository.
Probably, at least, we should recommend that people verify a GPG-signed git tag, like we do for mat2.
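For deployments that automate the checkout, the verification step could be scripted around `git verify-tag`. This is a minimal sketch assuming `git` is installed and the signer's public key is already in the local keyring. The helper names are hypothetical.

```python
import subprocess


def verify_tag_command(repo_dir, tag):
    """Build the git command that checks the GPG signature of `tag`."""
    return ["git", "-C", repo_dir, "verify-tag", tag]


def tag_is_verified(repo_dir, tag):
    """Return True if `git verify-tag` exits successfully for `tag`."""
    result = subprocess.run(verify_tag_command(repo_dir, tag),
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode == 0
```

A deployment script would call `tag_is_verified` after fetching and only check out the tag when it returns `True`.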