This causes issues, at the least, with compressed messages that 'look' like
they decompressed body is form-encoded. 18385 messages in the last month
rejected due to this.
No actually valid form-encoded messages in that time frame.
We suspect no-one is actually using this possible form of upload, and
ideally would remove the code for it entirely. First we need to log any
uploaders that do use this form to be sure.
Fixes the regression identified in #165
However the gzip code path can still erroneously think a decompressed
request body is form encoded when it is not. This happens when any text
in the decompressed body matches the regex:
.+=[^\&;]*
i.e. some text, followed by `=`, and then some more text, possibly
empty, followed by an ampersand `&`, or semi-colon `;`, or the end of the
string.
`&` and `;` are used to separate key=value pairs in form encoding, the
`=` separates a key from its value.
* This code worked if the request was *properly* form-encoded, with a
'data' key whose value was a valid message.
* It failed to detect where the request was form-encoded, with without a
'data' key. It would just assume 'not form-encoded' in that case, then
fail later on JSON parsing.
Thus, re-use the `urlparse.parse_qs()` check for form-encoded format.
This passes:
1. Properly, `data` key, form-encoded with valid value is fully JSON
parsed, schema checked and accepted.
2. *NOT* compressed *or* form-encoded valid message is properly parsed
and accepted.
2. Uncompressed, form-encoded, but no `data` key correctly returns the
same error status and body as the compressed+form-encoded+no data key
path.
For some reason the milliseconds portion of the %S timestamp is using a
comma for decimals separator, despite 'locale' saying we're set to (US)
English. /tableflip
Actually some logging was already there, just the logger had never been
set up properly, but then I decided to make the format of this message
more useful.
This fixes the problem I was having (on two separate machines, Debian
stretch and Debian buster) with the Gateway not actually sending
messages out port 8500 to the Relay and Monitor.
Something about the '@thing' syntax, or using bare 'run()' must be
interfering with zmq.green. The latter ends up thinking there are no
active/matching sockets to send to[0], despite the sockets definitely
being there (complete with TCP 3-way handshake visible on tcpdump
output).
With the problem:
* no network traffic was observed on port 8500.
* A test sender.send(...) just before the bottle run() call
*did* send the message. A similar test at the start of the
@post('/upload/') function did not succeed.
[0] - I ended up putting debug prints in both python-zmq and the zeromq
'libzmq' libraries, building the Debian packages and installing those
versions. I also ended up using 'gdb' on the process. The end result
of this was to find that the _matching variable (a count of matching
sockets I think) was empty deep in libzmq, when it should be counting
sockets to send to.
This was specifically in zeromq3-4.3.1/src/dist.cpp, in the
void zmq::dist_t::distribute (msg_t *msg_) function. The immediate:
if (_matching == 0) {
test was true. I didn't manage to track down which bit of libzmq code
should have been setting _matching before I 'recursed back up' the call
chain to investigate other things.