Make a reentrant parser with Flex and Bison
Making a reentrant (thread-safe) parser with Flex and Bison involves several stages.
To eliminate global variables from Flex, use the following line:
%option reentrant
This changes yylex () to yylex (void *). The argument to yylex points to the lexer's state, which is allocated and initialized using yylex_init:
void * something;
yylex_init (& something);
yylex (something);
and then release the memory after finishing the lexing:
yylex_destroy (something);
The Flex documentation gives the type of this argument as yyscan_t, which is a typedef for void *.
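The lifecycle above can be sketched as a complete, stand-alone Flex input file. This is a minimal sketch rather than a definitive implementation: the file name scanner.l, the word-matching rule, and the noyywrap option (used here only to avoid linking against the Flex library) are assumptions that do not come from the text above.

```lex
%option reentrant noyywrap

%{
#include <stdio.h>
%}

%%

[A-Za-z]+   { printf ("word: %s\n", yytext); }
.|\n        { /* Ignore everything else. */ }

%%

/* Hypothetical driver: scans standard input with one private scanner. */
int main (void)
{
    yyscan_t scanner;               /* yyscan_t is a typedef for void * */

    if (yylex_init (& scanner) != 0) {
        perror ("yylex_init");      /* yylex_init sets errno on failure */
        return 1;
    }
    yylex (scanner);                /* returns 0 at end of input */
    yylex_destroy (scanner);        /* release the scanner's memory */
    return 0;
}
```

Assuming the file is saved as scanner.l, it should build with "flex scanner.l" followed by "cc lex.yy.c -o scanner". Because all state lives behind the scanner argument, several such scanners can exist at once, each with its own yylex_init and yylex_destroy pair.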
If the lexer is combined with a Bison parser, add
%option bison-bridge
to the Flex input file options. This makes Flex add an extra argument to yylex, which is used instead of the global variable yylval:
int yylex ( YYSTYPE * lvalp, yyscan_t scanner );
You can then use yylval in the Flex lexer, and it will refer to whatever is passed in as the first argument to yylex above.
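The type of yylval, YYSTYPE, is also needed, so run Bison with the -d option to create the file y.tab.h, which defines it for you, and include this file in the lex file using the top section. (Bison names the file y.tab.h only in Yacc mode, -y; otherwise the header is named after the grammar file, for example grammar.tab.h.)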
%{
#include "y.tab.h"
%}
See "C Scanners with Bison Parsers" in the Flex manual for more details.
If your parser's value type is a union, any yylval. instances used to fill union members will need to change to yylval->.
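As an illustration of the yylval-> form, here is a sketch of a Flex input file that uses both options. It assumes a grammar whose %union has an int member called number and which declares a token called NUMBER; both names are hypothetical, as is the noyywrap option.

```lex
%option reentrant bison-bridge noyywrap

%{
#include <stdlib.h>
#include "y.tab.h"      /* From Bison; defines YYSTYPE and NUMBER (assumed names). */
%}

%%

[0-9]+      { yylval->number = atoi (yytext); return NUMBER; }
[ \t\n]+    { /* Skip white space. */ }
.           { return yytext[0]; /* Single characters such as '+'. */ }

%%
```

Inside the actions, yylval is a pointer to the YYSTYPE that the caller passed as the first argument of yylex, which is why members are reached with -> rather than a dot.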
When running Flex, use the --header-file=something.h option to generate a header file to include in the parser file.
In the Bison input file, add
%pure-parser
to make a reentrant parser,
%lex-param {void * scanner}
to tell it to send the lexer an extra argument, and
%parse-param {void * scanner}
to add another argument to yyparse, which is the scanner to pass on to Flex.
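Put together, the three directives might be used as in the following grammar. This is a sketch under stated assumptions, not a definitive implementation: the grammar itself (sums of numbers), the NUMBER token, the number union member, and the header name lexer.h (as produced by --header-file=lexer.h) are all hypothetical, and it assumes a Flex scanner built with the reentrant and bison-bridge options. Newer Bison releases prefer "%define api.pure" to %pure-parser, but accept the older spelling.

```yacc
%pure-parser
%lex-param   {void * scanner}
%parse-param {void * scanner}

%union { int number; }
%token <number> NUMBER
%type  <number> sum

%{
#include <stdio.h>
/* This prologue follows %union, so YYSTYPE is already defined here. */
int yylex (YYSTYPE * lvalp, void * scanner);
/* With %parse-param, yyerror receives the extra argument first. */
void yyerror (void * scanner, const char * message);
%}

%%

input : sum             { printf ("%d\n", $1); }
      ;

sum   : NUMBER          { $$ = $1; }
      | sum '+' NUMBER  { $$ = $1 + $3; }
      ;

%%

#include "lexer.h"      /* Hypothetical name; declares yylex_init and yylex_destroy. */

void yyerror (void * scanner, const char * message)
{
    (void) scanner;     /* Unused in this sketch. */
    fprintf (stderr, "parse error: %s\n", message);
}

int main (void)
{
    void * scanner;
    int status;

    if (yylex_init (& scanner) != 0)
        return 1;
    status = yyparse (scanner);     /* yyparse hands scanner on to yylex */
    yylex_destroy (scanner);
    return status;
}
```

The Flex header is included in the epilogue rather than the prologue so that YYSTYPE is already defined when the header's declaration of yylex is seen.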
The above is already enough to create a reentrant parser. If you also need to pass your own data in to Bison, you can make the scanner a member of a structure,
struct pass_to_bison {
    ....
    void * scanner;
};
pass the structure to the parser with
%parse-param {struct pass_to_bison * x}
then use a preprocessor macro like
#define scanner x->scanner
in the Bison preamble to make Bison send the scanner (but see John D. Robertson's addendum below). In this case, use
struct pass_to_bison x;
yylex_init (& x.scanner);
yyparse (& x);
yylex_destroy (x.scanner);
To use private data in the Flex lexer, set its value with yyset_extra:
struct user_type {
    int number;
};
struct user_type user_data = { 0 };
yyset_extra (& user_data, scanner);
after calling yylex_init. Here scanner is the value initialized by yylex_init. In the preamble of the Flex file, add
%{
#define YY_EXTRA_TYPE struct user_type *
%}
The data in user_data is then available in the lexer as yyextra:
%%
.* { yyextra->number++; }
%%
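The pieces of the private-data mechanism can be combined into one stand-alone Flex input file. This is a sketch under a few assumptions: the function that Flex generates for attaching the data is named yyset_extra, and the newline-counting rule, the main function, and the noyywrap option are illustrative additions rather than part of the text above.

```lex
%option reentrant noyywrap

%{
#include <stdio.h>

struct user_type { int number; };
#define YY_EXTRA_TYPE struct user_type *
%}

%%

\n      { yyextra->number++; /* Count lines in the caller's structure. */ }
.       { /* Ignore every other character. */ }

%%

int main (void)
{
    struct user_type user_data = { 0 };
    yyscan_t scanner;

    if (yylex_init (& scanner) != 0)
        return 1;
    yyset_extra (& user_data, scanner);     /* attach the private data */
    yylex (scanner);                        /* reads standard input */
    printf ("%d lines\n", user_data.number);
    yylex_destroy (scanner);
    return 0;
}
```

Because the counter lives in user_data rather than in a global variable, two scanners running in different threads would each update their own count.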
John D. Robertson adds:
I am implementing a reentrant expression parser, and I found your article on this topic. Since I am stuck using Bison v2.3, your approach is appropriate.
During implementation I discovered that the macro you suggest:
#define scanner x->scanner
does not work because the string "scanner" does not appear in the source code generated by Bison, specifically in the yylex() call. What does appear is:
#ifdef YYLEX_PARAM
# define YYLEX yylex (&yylval, YYLEX_PARAM)
#else
# define YYLEX yylex (&yylval, x)
#endif
This being the case, simply #define'ing YYLEX_PARAM (in the Bison file preamble) to what is needed will do the trick, e.g.
#define YYLEX_PARAM x->scanner
Acknowledgements
Eric S. Raymond pointed out many errors and omissions in this page.
John D. Robertson sent the addendum.
Web links
- Writing a Reentrant Parser with Flex and Bison, by Edsko de Vries
This is an example of a reentrant parser using C++. It also gives the parser and lexer different prefixes.
- JSON::Argo, a JSON parser for Perl
JSON::Argo is an example of a reentrant parser for JSON using Bison and Flex. Because this project is defunct (replaced by JSON::Parse), please locate the tar file in the above directory. See the files src/json_parse_lex.l for the Flex, and src/json_parse_grammar.y for the Bison, as well as json_parse.c for the caller.