    json-lexer: make lexer error-recovery more deterministic · b011f619
    Michael Roth authored
    Currently, when the lexer reaches an error state, we effectively flush
    everything fed to it so far, which can leave us feeding tokens into the
    parser at arbitrary offsets in the stream. This makes it difficult for
    the lexer/tokenizer/parser to get back in sync after the client sends
    bad input.
    With these changes we emit an error state/token up to the tokenizer as
    soon as we reach an error state, and continue processing any data passed
    in rather than bailing out. The reset token will be used to reset the
    tokenizer and parser, such that they'll recover state as soon as the
    lexer begins generating valid token sequences again.
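The recovery behaviour described above can be illustrated with a minimal sketch. This is a toy lexer written for this note, not the code from json-lexer.c: the names `ToyLexer`, `ToyParser`, `toy_lexer_feed` and `toy_parser_on_token` are hypothetical, and the token set (punctuation and digit runs only) is an assumption chosen to keep the example short. It shows the two properties the message relies on: the error token is emitted as soon as the bad byte is seen, and the lexer keeps consuming the rest of the buffer instead of bailing out.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy token types; TOK_ERROR doubles as the "reset" signal (assumption). */
typedef enum { TOK_PUNCT, TOK_NUMBER, TOK_ERROR } ToyTokenType;

typedef void (*ToyEmit)(void *opaque, ToyTokenType type);

typedef struct {
    size_t pending;   /* digits buffered for a number still in progress */
    ToyEmit emit;     /* callback that receives each token */
    void *opaque;     /* passed through to the callback */
} ToyLexer;

/* Hypothetical downstream consumer standing in for tokenizer + parser. */
typedef struct {
    int tokens_since_reset;
    int resets;
} ToyParser;

static void toy_parser_on_token(void *opaque, ToyTokenType type)
{
    ToyParser *p = opaque;

    if (type == TOK_ERROR) {
        /* Drop any partially parsed state and start over. */
        p->tokens_since_reset = 0;
        p->resets++;
        return;
    }
    p->tokens_since_reset++;
}

static void toy_lexer_flush_number(ToyLexer *lx)
{
    if (lx->pending > 0) {
        lx->emit(lx->opaque, TOK_NUMBER);
        lx->pending = 0;
    }
}

static void toy_lexer_feed(ToyLexer *lx, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t ch = buf[i];

        if (ch >= '0' && ch <= '9') {
            lx->pending++;
        } else if (ch == '{' || ch == '}' || ch == ',') {
            toy_lexer_flush_number(lx);
            lx->emit(lx->opaque, TOK_PUNCT);
        } else if (ch == ' ' || ch == '\n') {
            toy_lexer_flush_number(lx);
        } else {
            /* Error: discard the partial token, report the error at once,
             * and continue with the next byte rather than returning. */
            lx->pending = 0;
            lx->emit(lx->opaque, TOK_ERROR);
        }
    }
}
```

Feeding this sketch a buffer with one bad byte in the middle produces a single reset in the consumer, after which the tokens that follow the bad byte are counted from a clean state.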
    We also map chr(192,193,245-255) to an error state here, since these
    bytes can never appear in valid UTF-8. The QMP guest proxy/agent will
    use chr(255) to force a flush/reset of previous input for reliable
    delivery of certain events, so we also document that thoroughly here.
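As a small illustration of the byte ranges named above, the check can be written as a predicate. The helper name `never_valid_in_utf8` is hypothetical and is not taken from the lexer source, which expresses this mapping through its state table; this is only a sketch of the same set of values (192 = 0xC0, 193 = 0xC1, 245-255 = 0xF5-0xFF).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true for bytes that cannot occur anywhere in
 * well-formed UTF-8. */
static bool never_valid_in_utf8(uint8_t byte)
{
    /* 0xC0 and 0xC1 could only begin overlong two-byte encodings. */
    if (byte == 0xC0 || byte == 0xC1) {
        return true;
    }
    /* 0xF5..0xFF would lead sequences beyond U+10FFFF or are unused. */
    return byte >= 0xF5;
}
```

Because 0xFF falls in this set, a sender can use it as an out-of-band marker: it cannot be confused with any byte of legitimate UTF-8 encoded JSON text.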
    Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
json-lexer.h 993 Bytes