Make a reentrant parser with Flex and Bison

Making a reentrant (thread-safe) parser with Flex and Bison involves several stages.

To eliminate global variables from the lexer, add the following line to the Flex input file:

%option reentrant

This changes the signature of yylex from yylex () to yylex (void *). The argument points to the lexer's state, which must be allocated and initialized with yylex_init before the first call to yylex:

void * something;
yylex_init (& something);
yylex (something);

When lexing is finished, release the memory:

yylex_destroy (something);

In the Flex documentation, the type of this argument is given as yyscan_t, which is a typedef for void *.
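To show what the opaque handle is for, here is a minimal hand-written sketch with the same init, lex, destroy shape. It is not Flex output: the names toy_state, toy_lex_init, toy_lex and toy_lex_destroy are hypothetical, and unlike yylex_init the toy initializer also takes the input string. Each scanner keeps its position in its own allocated state, so two scanners can be interleaved without disturbing each other.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical stand-in for the state Flex keeps behind yyscan_t. */
struct toy_state {
    const char * pos;   /* next unread character of the input */
};

/* Allocate and initialize scanner state, in the manner of yylex_init.
   Returns 0 on success and 1 if memory could not be allocated. */
int toy_lex_init (void ** scanner, const char * input)
{
    struct toy_state * s = malloc (sizeof * s);
    if (s == NULL) {
        return 1;
    }
    s->pos = input;
    * scanner = s;
    return 0;
}

/* Return the length of the next whitespace-separated word,
   or 0 at the end of the input. */
int toy_lex (void * scanner)
{
    struct toy_state * s = scanner;
    int len = 0;
    assert (s != NULL);
    while (* s->pos != '\0' && isspace ((unsigned char) * s->pos)) {
        s->pos++;
    }
    while (* s->pos != '\0' && ! isspace ((unsigned char) * s->pos)) {
        s->pos++;
        len++;
    }
    return len;
}

/* Release the scanner state, in the manner of yylex_destroy. */
int toy_lex_destroy (void * scanner)
{
    free (scanner);
    return 0;
}
```

Because nothing is stored in global or static variables, calls on one handle never change what the next call on another handle returns.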

If the lexer is combined with a Bison parser, add

%option bison-bridge

to the Flex input file options. This makes Flex add an extra argument to yylex, which the lexer uses instead of the global variable yylval:

int yylex ( YYSTYPE * lvalp, yyscan_t scanner );

You can then use yylval in the Flex lexer, and it will refer to whatever is passed in as the first argument to yylex above.

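The type of yylval, YYSTYPE, is also needed, so run Bison with the -d option to create a header file which defines it. The header is called y.tab.h if Bison is also given the -y option, and is otherwise named after the grammar file (parser.tab.h for parser.y, for example). Include this file in the Flex input file using the top section: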

%{
#include "y.tab.h"
%}

See C Scanners with Bison Parsers - Flex manual for more details.

If your parser's value type is a union, any yylval. instances used to fill union members will need to change to yylval->, because yylval is now a pointer.
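As a rough sketch, a Flex input file using both options might look like the following. It assumes that the grammar declares a token called NUMBER and a union member called number; both names are made up for illustration.

```lex
%option reentrant bison-bridge noyywrap

%{
#include <stdlib.h>
/* Assumed to define YYSTYPE and the token NUMBER. */
#include "y.tab.h"
%}

%%
[0-9]+      { yylval->number = atoi (yytext); return NUMBER; }
[ \t\n]+    { /* skip white space */ }
.           { return yytext[0]; }
%%
```

Here yylval is a pointer to the YYSTYPE supplied by the caller of yylex, so the union member is reached with -> rather than with a dot.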

When running flex, use the --header-file=something.h option to generate a header file to include in the parser file.

In the Bison input file, add

%pure-parser

to make a reentrant parser (newer versions of Bison deprecate this directive in favor of %define api.pure),

%lex-param {void * scanner}

to tell it to send the lexer an extra argument, and

%parse-param {void * scanner}

to add an argument to yyparse, which is the scanner to pass on to the Flex lexer.
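Put together, a small Bison input file using these three directives might look like the sketch below. The token NUMBER, the union member number and the grammar itself are invented for illustration, and the prototypes assume a lexer generated by Flex with the reentrant and bison-bridge options, with yyscan_t written out as void *.

```yacc
%pure-parser
%lex-param {void * scanner}
%parse-param {void * scanner}

%union {
    int number;
}
%token <number> NUMBER

%{
#include <stdio.h>
/* Prototypes for the functions supplied by the Flex lexer. */
int yylex (YYSTYPE * lvalp, void * scanner);
int yylex_init (void ** scanner);
int yylex_destroy (void * scanner);
/* With %parse-param, yyerror receives the extra argument first. */
void yyerror (void * scanner, const char * message);
%}

%%
input:    /* empty */
        | input NUMBER  { printf ("%d\n", $2); }
        ;
%%

void yyerror (void * scanner, const char * message)
{
    (void) scanner;
    fprintf (stderr, "%s\n", message);
}

int main (void)
{
    void * scanner;
    yylex_init (& scanner);
    yyparse (scanner);
    yylex_destroy (scanner);
    return 0;
}
```

The second prologue is placed after %union so that YYSTYPE is already defined when the yylex prototype is read.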

The above is already enough to create a reentrant parser. If you also need to pass your own data in to Bison, you can make the scanner a member of a structure

struct pass_to_bison {
    ....
    void * scanner;
} x;

with

%parse-param {struct pass_to_bison * x}

then use a preprocessor macro like

#define scanner x->scanner

in the Bison preamble to make Bison send the scanner. In this case, use

struct pass_to_bison x;
yylex_init (& x.scanner);
yyparse (& x);
yylex_destroy (x.scanner);

To use private data in the Flex lexer, set its value with yyset_extra:

struct user_type {
    int number;
};
struct user_type user_data = { 0 };
yyset_extra (& user_data, scanner);

after calling yylex_init. Here scanner is the value initialized by yylex_init. In the preamble of the Flex file, add

%{
#define YY_EXTRA_TYPE struct user_type *
%}

The data in user_data is then available in the lexer as yyextra:

%%
.*             { yyextra->number++; }
%%

See Extra Data - Flex manual.
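As an illustration, the following complete Flex input file counts the lines on standard input using yyextra. It is only a sketch: it uses yylex_init_extra, a Flex function which initializes the scanner and sets the extra data in one call, in place of yylex_init followed by a separate call to set the extra data, and the counting rules are made up for the example.

```lex
%option reentrant noyywrap

%{
#include <stdio.h>

struct user_type {
    int number;
};
#define YY_EXTRA_TYPE struct user_type *
%}

%%
\n      { yyextra->number++; }
.       { /* ignore everything else */ }
%%

int main (void)
{
    struct user_type user_data = { 0 };
    yyscan_t scanner;

    /* Initialize the scanner and set yyextra at the same time. */
    if (yylex_init_extra (& user_data, & scanner) != 0) {
        return 1;
    }
    /* Neither rule returns, so this runs until the end of the input. */
    yylex (scanner);
    yylex_destroy (scanner);
    printf ("%d\n", user_data.number);
    return 0;
}
```

Since user_data belongs to the caller, it is still valid after yylex_destroy and can be read once lexing has finished.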


John D. Robertson adds:

I am implementing a reentrant expression parser, and I found your article on this topic. Since I am stuck using Bison v2.3, your approach is appropriate.

During implementation I discovered that the macro you suggest:

#define scanner x->scanner

does not work because the string "scanner" does not appear in the source code generated by Bison, specifically in the yylex() call. What does appear is:

#ifdef YYLEX_PARAM
# define YYLEX yylex (&yylval, YYLEX_PARAM)
#else
# define YYLEX yylex (&yylval, x)
#endif

This being the case, simply #define'ing YYLEX_PARAM (in the Bison file preamble) to what is needed will do the trick, e.g.

#define YYLEX_PARAM x->scanner
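The effect of this macro can be checked in isolation with a small C sketch which does not involve Bison at all. Everything prefixed with toy_ is a hypothetical stand-in, the errors field is invented, and yylval is a plain int here instead of a YYSTYPE. The conditional is the one quoted above with yylex renamed.

```c
#include <assert.h>
#include <stddef.h>

/* Context structure as in the article; the errors field is made up. */
struct pass_to_bison {
    int errors;
    void * scanner;
};

/* The definition suggested for the Bison file preamble. */
#define YYLEX_PARAM x->scanner

/* The conditional from the generated parser, calling a toy lexer. */
#ifdef YYLEX_PARAM
# define YYLEX toy_yylex (&yylval, YYLEX_PARAM)
#else
# define YYLEX toy_yylex (&yylval, x)
#endif

/* The second argument most recently passed to toy_yylex. */
static void * last_scanner;

/* Toy lexer: records its scanner argument and reports end of input. */
static int toy_yylex (int * lvalp, void * scanner)
{
    assert (lvalp != NULL);
    last_scanner = scanner;
    * lvalp = 0;
    return 0;
}

/* Stand-in for yyparse: calls the lexer once through YYLEX. */
int toy_yyparse (struct pass_to_bison * x)
{
    int yylval;
    (void) YYLEX;
    return 0;
}

/* Return the pointer which the toy lexer last received. */
void * toy_last_scanner (void)
{
    return last_scanner;
}
```

With YYLEX_PARAM defined, the lexer receives the scanner member. Without it, the second branch would pass the pointer to the whole structure.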

Acknowledgements

Eric S. Raymond pointed out many errors and omissions in this page.

John D. Robertson sent the addendum.

Copyright © Ben Bullock 2009-2024. All rights reserved. For comments, questions, and corrections, please email Ben Bullock (benkasminbullock@gmail.com).