Make a reentrant parser with Flex and Bison
Making a reentrant (thread-safe) parser with Flex and Bison involves several stages.
To eliminate global variables from Flex, use the following line:
%option reentrant
This changes yylex () to yylex (void *). The
argument to yylex points to the lexer's state, which
is allocated and initialized using yylex_init:
void * something;
yylex_init (& something);
yylex (something);
and then release the memory after finishing the lexing:
yylex_destroy (something);
The Flex documentation gives the type of this argument as yyscan_t, which is a typedef for void *.
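To see what moving the lexer's state into an argument buys, here is a minimal hand-written sketch of the same init/lex/destroy pattern. It is not Flex output: toy_scan_t, struct toy_state, toy_lex_init, toy_lex, toy_tokens_seen and toy_lex_destroy are hypothetical names invented for this illustration, and unlike yylex_init, toy_lex_init also takes the input string to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque handle (yyscan_t in Flex). */
typedef void *toy_scan_t;

/* Everything a non-reentrant scanner would keep in global variables. */
struct toy_state {
    const char *cursor;   /* next unread character */
    int tokens_seen;      /* per-scanner counter */
};

/* Allocate one scanner's state. Returns 0 on success, nonzero on failure. */
int toy_lex_init(toy_scan_t *scanner_out, const char *input)
{
    struct toy_state *s = malloc(sizeof *s);
    if (s == NULL)
        return 1;
    s->cursor = input;
    s->tokens_seen = 0;
    *scanner_out = s;
    return 0;
}

/* Return 1 for each whitespace-separated word, 0 at end of input. */
int toy_lex(toy_scan_t scanner)
{
    struct toy_state *s = scanner;
    assert(s != NULL);
    while (isspace((unsigned char)*s->cursor))
        s->cursor++;
    if (*s->cursor == '\0')
        return 0;
    while (*s->cursor != '\0' && !isspace((unsigned char)*s->cursor))
        s->cursor++;
    s->tokens_seen++;
    return 1;
}

/* Read back the counter kept inside one scanner's state. */
int toy_tokens_seen(toy_scan_t scanner)
{
    struct toy_state *s = scanner;
    return s->tokens_seen;
}

/* Release the state allocated by toy_lex_init. */
void toy_lex_destroy(toy_scan_t scanner)
{
    free(scanner);
}
```

Two handles created with toy_lex_init can be advanced in any order, or from different threads, without disturbing each other, because neither touches a global variable.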
If the lexer is combined with a Bison parser, add
%option bison-bridge
to the Flex input file options. This makes Flex add an extra argument
to yylex, which is used in place of the global
variable yylval:
int yylex ( YYSTYPE * lvalp, yyscan_t scanner );
You can then use yylval in the Flex lexer, and it will
refer to whatever is passed in as the first argument
to yylex above.
The type of yylval, YYSTYPE, is also needed,
so run Bison with the -d option to create the
file y.tab.h which defines it for you, and include this
file into the lex file using the top section:
%{
#include "y.tab.h"
%}
See "C Scanners with Bison Parsers" in the Flex manual for more details.
If your parser's value type is a union, any yylval.
instances used to fill union members will need to change
to yylval->.
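The change from . to -> is needed because the lexer now receives a pointer to the value rather than using the value itself. The following is a minimal hand-written sketch of a lexer function with the same shape as the bridged yylex. TOY_STYPE, the token codes and toy_lex are hypothetical names for illustration only, and the second argument is a plain string cursor rather than a Flex scanner.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical value type, shaped like a Bison %union. */
typedef union {
    long number;
    char op;
} TOY_STYPE;

/* Hypothetical token codes; the values are arbitrary. */
enum { TOY_END = 0, TOY_NUMBER, TOY_OP };

/* Read one token from *cursor and store its value through lvalp. */
int toy_lex(TOY_STYPE *lvalp, const char **cursor)
{
    const char *p = *cursor;
    assert(lvalp != NULL);
    while (*p == ' ')
        p++;
    if (*p == '\0') {
        *cursor = p;
        return TOY_END;
    }
    if (isdigit((unsigned char)*p)) {
        long n = 0;
        while (isdigit((unsigned char)*p))
            n = n * 10 + (*p++ - '0');
        lvalp->number = n;   /* with a global union this was yylval.number */
        *cursor = p;
        return TOY_NUMBER;
    }
    lvalp->op = *p++;        /* with a global union this was yylval.op */
    *cursor = p;
    return TOY_OP;
}
```

The caller owns the TOY_STYPE and passes its address, so two callers can lex at the same time without sharing a value slot.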
When running flex, use the --header-file=something.h
option to generate a header file to include in the parser file.
In the Bison input file, add
%pure-parser
to make a reentrant parser,
%lex-param {void * scanner}
to tell it to send the lexer an extra argument, and
%parse-param {void * scanner}
to add another argument to yyparse, which is the thing to
pass in to Flex.
The above is already enough to create a reentrant parser. If you also need to pass in something of your own to Bison, you can make the scanner a member of a structure
struct pass_to_bison {
....
void * scanner;
};
with
%parse-param {struct pass_to_bison * x}
then use a preprocessor macro like
#define scanner x->scanner
in the Bison preamble to make Bison send the scanner. In this case, use
struct pass_to_bison x;
yylex_init (& x.scanner);
yyparse (& x);
yylex_destroy (x.scanner);
To use private data in the Flex lexer, set its value
with yyset_extra:
struct user_type {
int number;
};
struct user_type user_data = { 0 };
yyset_extra (& user_data, scanner);
after calling yylex_init. Here scanner is
the value initialized by yylex_init. In the preamble of the
Flex file, add
%{
#define YY_EXTRA_TYPE struct user_type *
%}
The data in user_data is then available in the lexer
as yyextra:
%%
.* { yyextra->number++; }
%%
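The mechanism behind yyextra is simply a user pointer stored alongside the rest of the scanner's state. Here is a minimal hand-written sketch of that idea, not Flex output: struct toy_scanner, TOY_EXTRA_TYPE, toy_lex_init, toy_set_extra and toy_lex are hypothetical names, and toy_lex counts non-empty lines as a rough analogue of the rule above.

```c
#include <assert.h>
#include <stddef.h>

struct user_type {
    int number;
};

/* Plays the role of YY_EXTRA_TYPE. */
#define TOY_EXTRA_TYPE struct user_type *

/* Hypothetical scanner state; Flex keeps its equivalent behind yyscan_t. */
struct toy_scanner {
    const char *cursor;     /* next unread character */
    TOY_EXTRA_TYPE extra;   /* the caller's private data */
};

void toy_lex_init(struct toy_scanner *s, const char *input)
{
    s->cursor = input;
    s->extra = NULL;
}

/* Data first, scanner second, mirroring the yyset_extra call above. */
void toy_set_extra(TOY_EXTRA_TYPE user_data, struct toy_scanner *s)
{
    s->extra = user_data;
}

/* Consume one non-empty line and count it; returns 0 at end of input. */
int toy_lex(struct toy_scanner *s)
{
    assert(s->extra != NULL);   /* toy_set_extra must be called first */
    while (*s->cursor == '\n')
        s->cursor++;
    if (*s->cursor == '\0')
        return 0;
    while (*s->cursor != '\0' && *s->cursor != '\n')
        s->cursor++;
    s->extra->number++;         /* the yyextra->number++ step */
    return 1;
}
```

Each scanner carries its own pointer, so two scanners can update two different user_type objects without any locking between them.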
John D. Robertson adds:
I am implementing a reentrant expression parser, and I found your article on this topic. Since I am stuck using Bison v2.3, your approach is appropriate.
During implementation I discovered that the macro you suggest:
#define scanner x->scanner
does not work because the string "scanner" does not appear in the source code generated by Bison, specifically in the yylex() call. What does appear is:
#ifdef YYLEX_PARAM
# define YYLEX yylex (&yylval, YYLEX_PARAM)
#else
# define YYLEX yylex (&yylval, x)
#endif
This being the case, simply #define'ing YYLEX_PARAM (in the Bison file preamble) to what is needed will do the trick, e.g.
#define YYLEX_PARAM x->scanner
Acknowledgements
Eric S. Raymond pointed out many errors and omissions in this page.
John D. Robertson sent the addendum.
Web links
- Writing a Reentrant Parser with Flex and Bison, by Edsko de Vries
This is an example of a reentrant parser using C++. It also gives the parser and lexer different prefixes.
- JSON::Argo, a JSON parser for Perl
JSON::Argo is an example of a reentrant parser for JSON using Bison and Flex. Because this project is defunct (replaced by JSON::Parse), please locate the tar file in the above directory. See the files
src/json_parse_lex.l for the Flex, and src/json_parse_grammar.y for the Bison, as well as json_parse.c for the caller.