Commit Graph

10 Commits

Author SHA1 Message Date
Mikhail Golubev
10808546fc PY-63393 Handle keywords terminating lexing of f-string fragments in the lowermost JFlex lexer
Previously, we acknowledged them in PythonIndentingProcessor.adjustBraceLevel, inserting
a synthetic STATEMENT_BREAK in front of them to stop recovery in the parser, similarly
to how we handle other kinds of incomplete brackets. However, the state inside
PyLexerFStringHelper was not reset, so it kept trying to find matching closing brackets
and quotes, and kept interpreting colons as PyTokenTypes.FSTRING_FRAGMENT_FORMAT_START
instead of just PyTokenTypes.COLON. The state in PyLexerFStringHelper and
PythonIndentingProcessor went out of sync, which led to assertion violations.
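
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `classify_colons`, `RECOVERY_KEYWORDS` and the `reset_on_keyword` flag are hypothetical names invented for illustration, the keyword set is an arbitrary subset, and nothing here reproduces the real PyLexerFStringHelper or Python.flex rules.

```python
# Toy model only; it does not reproduce the real PyLexerFStringHelper.
# Hypothetical subset of statement keywords that cannot occur inside an expression.
RECOVERY_KEYWORDS = {"def", "class", "return", "while"}


def classify_colons(tokens, reset_on_keyword=True):
    """Label every ':' as FORMAT_START (inside an open fragment) or COLON."""
    open_fragments = 0  # number of '{' fragments that are still unclosed
    labels = []
    for tok in tokens:
        if reset_on_keyword and tok in RECOVERY_KEYWORDS:
            open_fragments = 0  # drop stale f-string state at the keyword
        elif tok == "{":
            open_fragments += 1
        elif tok == "}":
            open_fragments = max(0, open_fragments - 1)
        elif tok == ":":
            labels.append("FORMAT_START" if open_fragments else "COLON")
    return labels


# Tokens of an unterminated f-string fragment followed by `def f():`.
broken = ["{", "x", "def", "f", "(", ")", ":"]
print(classify_colons(broken, reset_on_keyword=False))  # ['FORMAT_START']
print(classify_colons(broken))                          # ['COLON']
```

Without the reset, the colon that ends the `def` header is still treated as a format specifier start, which is the kind of disagreement between the two components that the message describes.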

It's not an optimal solution, since now these tokens are listed both in
PythonTokenSetContributor.getUnbalancedBracesRecoveryTokens and in the Python.flex lexer
specification, and we need to keep them in sync. Also, PythonTokenSetContributor
can provide additional tokens from other languages, such as Cython. But it's simple
and seems "good enough" to patch the problem in the release.

GitOrigin-RevId: 4e156314cc02aba0634d5d9e3008177f49105051
2023-10-05 22:02:27 +00:00
Alexandr Evstigneev
2271eb1907 IDEA-313615 upgrade JFlex to 1.9.1
GitOrigin-RevId: 72933159ba8a1ae68d39a39a52be46214bb497c5
2023-03-11 11:18:03 +00:00
Alexandr Evstigneev
2dc83a5165 IDEA-313615 Migration to jflex 1.9.0 [regen]
The only lexer not updated is ObjectiveC, because it is using hacky manual patching, see CPP-27237

IJ-CR-103186

GitOrigin-RevId: baf62050f2c4f3f7345c5553cb6b60bca3935ab8
2023-02-24 17:20:31 +00:00
Andrey Vlasovskikh
63f960f624 PY-47974 Extracted isConsole() check
GitOrigin-RevId: f48a39b16babae1a600adc7f095d73cca15879a0
2021-04-07 22:34:57 +00:00
Andrey Vlasovskikh
4bd67d428d PY-47974 Parse single string literal in Python console as string, not as docstring
We assumed that if the first token of a Python file is a string literal, then it's a docstring. It's not the case for the Python console, where each input is a separate "file".

Now we pass the `PythonLexerKind` to the `PythonLexer` so that we can parse string literals differently if we are in the console "file".

I've also added Cython as a lexical kind for the Python lexer, since there is at least one place in the lexical rules that is specific to Cython. Also, we have separate lexer subclasses for Cython, so having it expressed as a kind for the JFlex rules seems logical, even if we don't use it right now.
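
The idea of choosing the token type from the lexer kind could look roughly like the sketch below. It is a minimal illustration, assuming a hypothetical `LexerKind` enum and a hypothetical `first_string_token_type` helper; the member names and the returned token names are placeholders, not the actual `PythonLexerKind` values or PyTokenTypes constants.

```python
from enum import Enum


class LexerKind(Enum):
    """Hypothetical stand-in for the kind passed to the lexer."""
    REGULAR = "regular"
    CONSOLE = "console"
    CYTHON = "cython"


def first_string_token_type(kind):
    """Token type for a string literal that is the first token of a "file"."""
    if kind is LexerKind.CONSOLE:
        # Each console input is its own "file", so a leading string is a plain value.
        return "STRING"
    return "DOCSTRING"


for kind in LexerKind:
    print(kind.name, first_string_token_type(kind))
```

Keeping Cython as a separate kind costs nothing in this sketch: it simply falls through to the regular behaviour until a rule needs it.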

GitOrigin-RevId: f5e34fa2dc3b3da84cacf6cee69a4ba0ee674ad5
2021-04-07 22:34:46 +00:00
Semyon Proshev
558798ed67 Mark \u000b as a bad character since it is not always considered as such (PY-40757)
GitOrigin-RevId: 75b979005b853403a8a8b25b27176e41e7c37fba
2020-02-27 14:35:13 +00:00
Mikhail Khorkov
eae34dc336 PY-14844 Add integer suffix support for Cython
Cython supports C-style integer suffixes (u, l, ll). I added them to the Python lexer and to the annotator checker so that they are highlighted in the Python language.
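
As a rough illustration of what recognizing such suffixes involves, the sketch below splits a decimal literal into digits and suffix. The regular expression and the `split_integer_literal` helper are assumptions made for this example (an optional `u` followed by an optional `l` or `ll`, case-insensitive); the pattern used in Python.flex may differ.

```python
import re

# Assumed shape of a suffixed decimal literal: digits, optional 'u', optional 'l' or 'll'.
SUFFIXED_INT = re.compile(r"(\d+)(u?(?:ll|l)?)", re.IGNORECASE)


def split_integer_literal(text):
    """Return (digits, suffix) for a suffixed decimal literal, or None if it does not match."""
    match = SUFFIXED_INT.fullmatch(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_integer_literal("10ull"))  # ('10', 'ull')
print(split_integer_literal("42"))     # ('42', '')
print(split_integer_literal("7lll"))   # None
```

An annotator could then report a non-empty suffix as an error in plain Python while accepting it in Cython.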

More information:

- https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html#differences-between-c-and-cython-expressions

- https://en.cppreference.com/w/cpp/language/integer_literal

GitOrigin-RevId: 97d7bcb19239f931d9ed5e5746aaed84ac09cbc8
2020-02-05 08:01:18 +00:00
Mikhail Golubev
be2d55e603 PY-32123 Ignore escape sequences in raw f-strings by adding special token type for their text
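
The Python behaviour that the lexer has to mirror here can be checked directly: in a raw f-string a backslash sequence stays as two literal characters, while replacement fields are still evaluated. The variable names below are only illustrative.

```python
value = 2

cooked = f"\n{value}"   # "\n" is an escape sequence: one newline character
raw = rf"\n{value}"     # raw prefix: backslash and 'n' are kept as two characters

print(len(cooked), len(raw))  # 2 3
print(raw == "\\n2")          # True
```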
GitOrigin-RevId: 0b15201c60ac56daa45f22bb5ff3c1f8836efee3
2020-01-28 16:04:27 +00:00
Alexey Kudravtsev
843d74524f fix tools/lexer/build.xml: correct paths, add missing lexers, cleanup skeletons, patch path; restore damaged formatting
GitOrigin-RevId: 4bff2a5b5dc1f01d90d470e6d7e65e4f7dbc7e9b
2019-09-16 12:01:16 +00:00
Dmitry Trofimov
a0bc048dcc python-psi-impl extracted
GitOrigin-RevId: e3d808c147ac793701c7b628dbf825a99bb71f2a
2019-09-11 19:15:01 +00:00