Previously, we acknowledged them in PythonIndentingProcessor.adjustBraceLevel, inserting
a synthetic STATEMENT_BREAK in front of them to stop recovery in the parser, similarly
to how we handle other kinds of incomplete brackets. However, the state inside
PyLexerFStringHelper was not reset, so it kept looking for matching closing brackets and
quotes, and kept interpreting colons as PyTokenTypes.FSTRING_FRAGMENT_FORMAT_START instead
of just PyTokenTypes.COLON. As a result, the state in PyLexerFStringHelper and
PythonIndentingProcessor went out of sync, which led to assertion violations.
It's not an optimal solution, since these tokens are now listed both in
PythonTokenSetContributor.getUnbalancedBracesRecoveryTokens and in the Python.flex lexer
specification, and we need to keep the two lists in sync. Also, PythonTokenSetContributor
can provide additional tokens from other languages, such as Cython. But it's simple
and seems "good enough" to patch the problem in the release.
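
To illustrate why the f-string state has to be dropped when a recovery token shows up, here is a minimal toy model in Java. It is not the real PyLexerFStringHelper: the class name, the methods, the string-based token names and the contents of the keyword set are all assumptions made for the sketch. It only shows the idea that, once the open-fragment state is cleared on a recovery keyword, a following colon is classified as a plain COLON again.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

/** Toy model only; every name in this class is hypothetical. */
class FStringStateSketch {
    // Assumed subset of recovery keywords. The real set is provided by
    // PythonTokenSetContributor.getUnbalancedBracesRecoveryTokens.
    static final Set<String> RECOVERY_KEYWORDS = Set.of("def", "class", "return", "if", "for");

    // One entry per unclosed '{' of an f-string replacement field.
    private final Deque<Character> openFragments = new ArrayDeque<>();

    /** Records that a replacement field was opened and not yet closed. */
    public void openFragment() {
        openFragments.push('{');
    }

    /** Called for every keyword token; drops stale f-string state on a recovery keyword. */
    public void onKeyword(String keyword) {
        if (RECOVERY_KEYWORDS.contains(keyword)) {
            openFragments.clear();
        }
    }

    /** A colon starts a format specifier only while a replacement field is still open. */
    public String classifyColon() {
        return openFragments.isEmpty() ? "COLON" : "FSTRING_FRAGMENT_FORMAT_START";
    }
}
```

In this sketch the keyword set lives in one place. The downside described above comes from the real code having the same list in two places, so a keyword added to one and not the other would bring the two components out of sync again.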
GitOrigin-RevId: 4e156314cc02aba0634d5d9e3008177f49105051
The only lexer not updated is the ObjectiveC one, because it uses hacky manual patching; see CPP-27237.
IJ-CR-103186
GitOrigin-RevId: baf62050f2c4f3f7345c5553cb6b60bca3935ab8
We assumed that if the first token of a Python file is a string literal, then it's a docstring. It's not the case for the Python console, where each input is a separate "file".
Now we pass the `PythonLexerKind` to the `PythonLexer` so that we can parse string literals differently if we are in the console "file".
I've also added Cython as a lexer kind for the Python lexer, since there is at least one place in the lexical rules that is specific to Cython. Also, we have separate lexer subclasses for Cython, so expressing it as a kind for the JFlex rules seems logical, even if we don't use it right now.
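
A rough sketch of the idea in Java, under stated assumptions: the enum below is a hypothetical stand-in for `PythonLexerKind` (the real constant names may differ), and the string token names are placeholders rather than real `PyTokenTypes` members. It only shows how a kind passed to the lexer could change the docstring decision for a leading string literal.

```java
/** Illustrative only; not the actual PythonLexer code. */
class DocstringKindSketch {
    /** Hypothetical mirror of PythonLexerKind; the real constants may differ. */
    enum Kind { REGULAR, CONSOLE, CYTHON }

    /** Picks a token name for a string literal from its position and the lexer kind. */
    static String classifyString(Kind kind, boolean isFirstTokenInFile) {
        // In the console each input is its own "file", so a leading
        // string literal is an ordinary expression, not a docstring.
        if (isFirstTokenInFile && kind != Kind.CONSOLE) {
            return "DOCSTRING";
        }
        return "STRING_LITERAL";
    }
}
```

With this shape, `CYTHON` behaves like `REGULAR` for docstrings, which matches the note above that the Cython kind is not used yet.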
GitOrigin-RevId: f5e34fa2dc3b3da84cacf6cee69a4ba0ee674ad5