Compiled tree of Oasis Linux based on own branch at <https://hacktivis.me/git/oasis/> git clone https://anongit.hacktivis.me/git/oasis-root.git

yacc.1 (13759B)


  1. .\" $Id: yacc.1,v 1.44 2024/12/31 15:46:49 tom Exp $
  2. .\"
  3. .TH YACC 1 2024-12-31 "Berkeley Yacc" "User Commands"
  4. .
  5. .ds N Yacc
  6. .ds n yacc
  7. .
  8. .ie n .ds CW R
  9. .el \{
  10. .ie \n(.g .ds CW CR
  11. .el .ds CW CW
  12. .\}
  13. .
  14. .de Ex
  15. .RS +7
  16. .PP
  17. .nf
  18. .ft \*(CW
  19. ..
  20. .de Ee
  21. .fi
  22. .ft R
  23. .RE
  24. ..
  25. .\" Escape single quotes in literal strings from groff's Unicode transform.
  26. .ie \n(.g \{\
  27. .ds `` \(lq
  28. .ds '' \(rq
  29. .ds ' \(aq
  30. .\}
  31. .el \{\
  32. .ie t .ds `` ``
  33. .el .ds `` ""
  34. .ie t .ds '' ''
  35. .el .ds '' ""
  36. .ie t .ds ' \(aq
  37. .el .ds ' '
  38. .\}
  39. .\" Bulleted paragraph
  40. .de bP
  41. .ie n .IP \(bu 4
  42. .el .IP \(bu 2
  43. ..
  44. .SH NAME
  45. \*N \-
  46. an LALR(1) parser generator
  47. .SH SYNOPSIS
  48. .B \*n [ \-BdghilLPrstvVy ] [ \-b
  49. .I file_prefix
  50. .B ] [ \-H
  51. .I defines_file
  52. .B ] [ \-o
  53. .I output_file
  54. .B ] [ \-p
  55. .I symbol_prefix
  56. .B ]
  57. .I filename
  58. .SH DESCRIPTION
  59. .B \*N
  60. reads the grammar specification in the file
  61. .I filename
  62. and generates an LALR(1) parser for it.
  63. The parsers consist of a set of LALR(1) parsing tables and a driver routine
  64. written in the C programming language.
  65. .B \*N
  66. normally writes the parse tables and the driver routine to the file
  67. .IR y.tab.c .
  68. .PP
  69. The following options are available:
  70. .TP 5
  71. \fB\-b \fIfile_prefix\fR
  72. The
  73. .B \-b
  74. option changes the prefix prepended to the output file names to
  75. the string denoted by
  76. .IR file_prefix .
  77. The default prefix is the character
  78. .IR y .
  79. .TP
  80. .B \-B
  81. create a backtracking parser (compile-time configuration for \fBbtyacc\fP).
  82. .TP
  83. .B \-d
  84. causes the header file
  85. .B y.tab.h
  86. to be written.
  87. It contains #define's for the token identifiers.
  88. .TP
  89. .B \-h
  90. print a usage message to the standard error.
  91. .TP
  92. \fB\-H \fIdefines_file\fR
  93. causes #define's for the token identifiers
  94. to be written to the given \fIdefines_file\fP rather
  95. than the \fBy.tab.h\fP file used by the \fB\-d\fP option.
  96. .TP
  97. .B \-g
  98. The
  99. .B \-g
  100. option causes a graphical description of the generated LALR(1) parser to
  101. be written to the file
  102. .B y.dot
  103. in graphviz format, ready to be processed by
  104. .BR dot (1).
  105. .TP
  106. .B \-i
  107. The \fB\-i\fR option causes a supplementary header file
  108. .B y.tab.i
  109. to be written.
  110. It contains extern declarations
  111. and supplementary #define's as needed to map the conventional \fIyacc\fP
  112. \fByy\fP-prefixed names to whatever the \fB\-p\fP option may specify.
  113. The code file, e.g., \fBy.tab.c\fP is modified to #include this file
  114. as well as the \fBy.tab.h\fP file, enforcing consistent usage of the
  115. symbols defined in those files.
  116. .IP
  117. The supplementary header file makes it simpler to separate compilation
  118. of lex- and yacc-files.
  119. .TP
  120. .B \-l
  121. If the
  122. .B \-l
  123. option is not specified,
  124. .B \*n
  125. will insert \fI#line\fP directives in the generated code.
  126. The \fI#line\fP directives let the C compiler relate errors in the
  127. generated code to the user's original code.
  128. If the \fB\-l\fR option is specified,
  129. .B \*n
  130. will not insert the \fI#line\fP directives.
  131. \&\fI#line\fP directives specified by the user will be retained.
  132. .TP
  133. .B \-L
  134. enable position processing,
  135. e.g., \*(``%locations\*('' (compile-time configuration for \fBbtyacc\fP).
  136. .TP
  137. \fB\-o \fIoutput_file\fR
  138. specify the filename for the parser file.
  139. If this option is not given, the output filename is
  140. the file prefix concatenated with the file suffix, e.g., \fBy.tab.c\fP.
  141. This overrides the \fB\-b\fP option.
  142. .TP
  143. \fB\-p \fIsymbol_prefix\fR
  144. The
  145. .B \-p
  146. option changes the prefix prepended to yacc-generated symbols to
  147. the string denoted by
  148. .IR symbol_prefix .
  149. The default prefix is the string
  150. .BR yy .
  151. .TP
  152. .B \-P
  153. create a reentrant parser, e.g., \*(``%pure\-parser\*(''.
  154. .TP
  155. .B \-r
  156. The
  157. .B \-r
  158. option causes
  159. .B \*n
  160. to produce separate files for code and tables.
  161. The code file is named
  162. .IR y.code.c ,
  163. and the tables file is named
  164. .IR y.tab.c .
  165. The prefix \*(``\fIy\fP\*('' can be overridden using the \fB\-b\fP option.
  166. .TP
  167. .B \-s
  168. suppress \*(``\fB#define\fP\*('' statements generated for string literals in
  169. a \*(``\fB%token\fP\*('' statement,
  170. to more closely match original \fByacc\fP behavior.
  171. .IP
  172. Normally when \fB\*n\fP sees a line such as
  173. .Ex
  174. %token OP_ADD "ADD"
  175. .Ee
  176. .IP
  177. it notices that the quoted \*(``ADD\*('' is a valid C identifier,
  178. and generates a #define not only for OP_ADD,
  179. but for ADD as well,
  180. e.g.,
  181. .Ex
  182. #define OP_ADD 257
  183. .br
  184. #define ADD 258
  185. .Ee
  186. .IP
  187. The original \fByacc\fP does not generate the second \*(``\fB#define\fP\*(''.
  188. The \fB\-s\fP option suppresses this \*(``\fB#define\fP\*(''.
  189. .IP
  190. POSIX (IEEE 1003.1 2004) documents only names and numbers
  191. for \*(``\fB%token\fP\*('',
  192. though original \fByacc\fP and bison also accept string literals.
  193. .TP
  194. .B \-t
  195. The
  196. .B \-t
  197. option changes the preprocessor directives generated by
  198. .B \*n
  199. so that debugging statements will be incorporated in the compiled code.
  200. .IP
  201. \fB\*N\fR sends debugging output to the standard output
  202. (compatible with both the original \fByacc\fP and \fBbtyacc\fP),
  203. while \fBbtyacc\fP writes debugging output to the standard error
  204. (like \fBbison\fP).
  205. .TP
  206. .B \-v
  207. The
  208. .B \-v
  209. option causes a human-readable description of the generated parser to
  210. be written to the file
  211. .IR y.output .
  212. .TP
  213. .B \-V
  214. print the version number to the standard output.
  215. .TP
  216. .B \-y
  217. \fB\*n\fP ignores this option,
  218. which bison supports for ostensible POSIX compatibility.
  219. .PP
  220. The \fIfilename\fP parameter is not optional.
  221. However, \fB\*n\fP accepts a single \*(``\-\*('' to read the grammar
  222. from the standard input.
  223. A double \*(``\-\-\*('' marker denotes the end of options.
  224. A single \fIfilename\fP parameter is expected after a \*(``\-\-\*('' marker.
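.PP
As a rough illustration of how the file-naming options combine
(the grammar file \fIcalc.y\fP is hypothetical),
the following invocation should write \fBcalc.tab.c\fP,
\fBcalc.tab.h\fP and \fBcalc.output\fP
rather than \fBy.tab.c\fP, \fBy.tab.h\fP and \fBy.output\fP:
.Ex
\*n \-d \-v \-b calc calc.y
.Ee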
  225. .
  226. .SH DIAGNOSTICS
  227. If there are rules that are never reduced, the number of such rules is
  228. reported on standard error.
  229. If there are any LALR(1) conflicts, the number of conflicts is reported
  230. on standard error.
  231. .SH EXTENSIONS
  232. .B \*N
  233. provides some extensions for
  234. compatibility with bison and other implementations of yacc.
  235. It accepts several \fIlong options\fP which have equivalents in \*n.
  236. The \fB%destructor\fP and \fB%locations\fP features are available
  237. only if \fB\*n\fP has been configured and compiled to support the
  238. back-tracking (\fBbtyacc\fP) functionality.
  239. The remaining features are always available:
  240. .TP
  241. \fB %code\fP \fIkeyword\fP { \fIcode\fP }
  242. Adds the indicated source \fIcode\fP at a given point in the output file.
  243. The optional \fIkeyword\fP tells \fB\*n\fP where to insert the \fIcode\fP:
  244. .RS 7
  245. .TP 5
  246. \fBtop\fP
  247. just after the version-definition in the generated code-file.
  248. .TP 5
  249. \fBrequires\fP
  250. just after the declaration of public parser variables.
  251. If the \fB\-d\fP option is given, the code is inserted at the
  252. beginning of the defines-file.
  253. .TP 5
  254. \fBprovides\fP
  255. just after the declaration of private parser variables.
  256. If the \fB\-d\fP option is given, the code is inserted at the
  257. end of the defines-file.
  258. .RE
  259. .IP
  260. If no \fIkeyword\fP is given, the code is inserted at the
  261. beginning of the section of code copied verbatim from the source file.
  262. Multiple \fB%code\fP directives may be given;
  263. \fB\*n\fP inserts those into the corresponding code- or defines-file
  264. in the order that they appear in the source file.
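.IP
As a sketch of how the keywords might be used in a grammar file
(the included header, the \fBstruct node\fP type
and the \fBparse_result\fP function are hypothetical):
.Ex
%code top {
#include <stdio.h>
}
%code requires {
struct node;
}
%code provides {
struct node *parse_result(void);
}
.Ee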
  265. .TP
  266. \fB %debug\fP
  267. This has the same effect as the \*(``\-t\*('' command-line option.
  268. .TP
  269. \fB %destructor\fP { \fIcode\fP } \fIsymbol+\fP
  270. defines code that is invoked when a symbol is automatically
  271. discarded during error recovery.
  272. This code can be used to
  273. reclaim dynamically allocated memory associated with the corresponding
  274. semantic value for cases where user actions cannot manage the memory
  275. explicitly.
  276. .IP
  277. On encountering a parse error, the generated parser
  278. discards symbols on the stack and input tokens until it reaches a state
  279. that will allow parsing to continue.
  280. This error recovery approach results in a memory leak
  281. if the \fBYYSTYPE\fP value is, or contains,
  282. pointers to dynamically allocated memory.
  283. .IP
  284. The bracketed \fIcode\fP is invoked whenever the parser discards one of
  285. the symbols.
  286. Within \fIcode\fP, \*(``\fB$$\fP\*('' or
  287. \*(``\fB$<\fItag\fB>$\fR\*('' designates the semantic value associated with the
  288. discarded symbol, and \*(``\fB@$\fP\*('' designates its location (see
  289. \fB%locations\fP directive).
  290. .IP
  291. A per-symbol destructor is defined by listing a grammar symbol
  292. in \fIsymbol+\fP. A per-type destructor is defined by listing
  293. a semantic type tag (e.g., \*(``<some_tag>\*('') in \fIsymbol+\fP; in this
  294. case, the parser will invoke \fIcode\fP whenever it discards any grammar
  295. symbol that has that semantic type tag, unless that symbol has its own
  296. per-symbol destructor.
  297. .IP
  298. Two categories of default destructor are supported that are
  299. invoked when discarding any grammar symbol that has no per-symbol and no
  300. per-type destructor:
  301. .RS
  302. .bP
  303. the code for \*(``\fB<*>\fP\*('' is used
  304. for grammar symbols that have an explicitly declared semantic type tag
  305. (via \*(``\fB%type\fP\*('');
  306. .bP
  307. the code for \*(``\fB<>\fP\*('' is used
  308. for grammar symbols that have no declared semantic type tag.
  309. .RE
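.IP
A minimal sketch, assuming a hypothetical \fBSTRING\fP token and
\fBname\fP nonterminal whose semantic values are allocated
with \fBmalloc\fP(3):
.Ex
%union { char *str; }
%token <str> STRING
%type <str> name
%destructor { free($$); } STRING
%destructor { free($$); } <str>
.Ee
.IP
Here the first destructor is per-symbol and applies only to \fBSTRING\fP;
the second is per-type and would cover \fBname\fP
and any other symbol tagged \fB<str>\fP.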
  310. .TP
  311. \fB %empty\fP
  312. ignored by \fB\*n\fP.
  313. .TP
  314. \fB %expect\fP \fInumber\fP
  315. tells \fB\*n\fP the expected number of shift/reduce conflicts.
  316. \fB\*N\fP then reports the number of conflicts only if it differs from \fInumber\fP.
  317. .TP
  318. \fB %expect\-rr\fP \fInumber\fP
  319. tells \fB\*n\fP the expected number of reduce/reduce conflicts.
  320. \fB\*N\fP then reports the number of conflicts only if it differs from \fInumber\fP.
  321. Unlike bison, \fB\*n\fP allows this directive in LALR parsers.
  322. .TP
  323. \fB %locations\fP
  324. tells \fB\*n\fP to enable management of position information associated
  325. with each token, provided by the lexer in the global variable \fByylloc\fP,
  326. similar to management of semantic value information provided in \fByylval\fP.
  327. .IP
  328. As for semantic values, locations can be referenced within actions using
  329. \fB@$\fP to refer to the location of the left hand side symbol, and \fB@\fIN\fR
  330. (\fIN\fP an integer) to refer to the location of one of the right hand side
  331. symbols.
  332. Also as for semantic values, when a rule is matched, a default
  333. action is used to compute the location represented by \fB@$\fP as the
  334. beginning of the first symbol and the end of the last symbol in the right
  335. hand side of the rule.
  336. This default computation can be overridden by
  337. explicit assignment to \fB@$\fP in a rule action.
  338. .IP
  339. The type of \fByylloc\fP is \fBYYLTYPE\fP, which is defined by default as:
  340. .Ex
  341. typedef struct YYLTYPE {
  342. int first_line;
  343. int first_column;
  344. int last_line;
  345. int last_column;
  346. } YYLTYPE;
  347. .Ee
  348. .IP
  349. \fBYYLTYPE\fP can be redefined by the user
  350. (\fBYYLTYPE_IS_DEFINED\fP must be defined, to inhibit the default)
  351. in the declarations section of the specification file.
  352. As in bison, the macro \fBYYLLOC_DEFAULT\fP is invoked
  353. each time a rule is matched to calculate a position for the left hand side of
  354. the rule, before the associated action is executed; this macro can be
  355. redefined by the user.
  356. .IP
  357. This directive adds a \fBYYLTYPE\fP parameter to \fByyerror()\fP.
  358. If the \fB%pure\-parser\fP directive is present,
  359. a \fBYYLTYPE\fP parameter is added to \fByylex()\fP calls.
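.IP
The following rule is a sketch of how locations might be used in an action,
assuming \fB\*n\fP was built with position processing.
The \fBexpr\fP rule and \fBDIV\fP token are hypothetical,
the field names are those of the default \fBYYLTYPE\fP shown above,
and \fB<stdio.h>\fP is assumed to be included in the declarations section:
.Ex
expr : expr DIV expr
    {
        if ($3 == 0)
            fprintf(stderr, "line %d, column %d: division by zero\en",
                    @3.first_line, @3.first_column);
        else
            $$ = $1 / $3;
    }
    ;
.Ee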
  360. .TP
  361. \fB %lex\-param\fP { \fIargument-declaration\fP }
  362. By default, the lexer accepts no parameters, e.g., \fByylex()\fP.
  363. Use this directive to add parameter declarations for your customized lexer.
  364. .TP
  365. \fB %parse\-param\fP { \fIargument-declaration\fP }
  366. By default, the parser accepts no parameters, e.g., \fByyparse()\fP.
  367. Use this directive to add parameter declarations for your customized parser.
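.IP
A sketch using a hypothetical \fBscanner\fP argument.
Declaring the same name with both directives is one way to make
the value passed to \fByyparse\fP available
where the parser calls \fByylex\fP:
.Ex
%parse\-param { void *scanner }
%lex\-param { void *scanner }
.Ee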
  368. .TP
  369. \fB %pure\-parser\fP
  370. Most variables (other than \fByydebug\fP and \fByynerrs\fP) are
  371. allocated on the stack within \fByyparse\fP, making the parser reasonably
  372. reentrant.
  373. .TP
  374. \fB %token\-table\fP
  375. Make the parser's names for tokens available in the \fByytname\fP array.
  376. However,
  377. .B \*n
  378. does not predefine \*(``$end\*('', \*(``$error\*(''
  379. or \*(``$undefined\*('' in this array.
  380. .
  381. .SH PORTABILITY
  382. According to Robert Corbett,
  383. .Ex
  384. Berkeley Yacc is an LALR(1) parser generator. Berkeley Yacc
  385. has been made as compatible as possible with AT&T Yacc.
  386. Berkeley Yacc can accept any input specification that
  387. conforms to the AT&T Yacc documentation. Specifications
  388. that take advantage of undocumented features of AT&T Yacc
  389. will probably be rejected.
  390. .Ee
  391. .PP
  392. The rationale in
  393. .Ex
  394. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/yacc.html
  395. .Ee
  396. .PP
  397. documents some features of AT&T yacc which are no longer required for POSIX
  398. compliance.
  399. .PP
  400. That said, you may be interested in reusing grammar files with some
  401. other implementation which is not strictly compatible with AT&T yacc.
  402. For instance, there is bison.
  403. Here are a few differences:
  404. .bP
  405. \fBYacc\fP accepts an equals sign preceding the left curly brace
  406. of an action (as in the original grammar file \fBftp.y\fP):
  407. .Ex
  408. | STAT CRLF
  409. = {
  410. statcmd();
  411. }
  412. .Ee
  413. .bP
  414. \fBYacc\fP and bison emit code in a different order, and in particular bison
  415. makes forward references to common functions such as yylex, yyparse and
  416. yyerror without providing prototypes.
  417. .bP
  418. Bison's support for \*(``%expect\*('' is broken in more than one release.
  419. For best results using bison, delete that directive.
  420. .bP
  421. Bison has no equivalent for some of \fB\*n\fP's command-line options,
  422. relying on directives embedded in the grammar file.
  423. .bP
  424. Bison's \*(``\fB\-y\fP\*('' option does not affect bison's lack of support for
  425. features of AT&T yacc which were deemed obsolescent.
  426. .bP
  427. \fBYacc\fP accepts multiple parameters
  428. with \fB%lex\-param\fP and \fB%parse\-param\fP in two forms
  429. .Ex
  430. {type1 name1} {type2 name2} ...
  431. {type1 name1, type2 name2 ...}
  432. .Ee
  433. .IP
  434. Bison accepts the latter (though undocumented), but depending on the
  435. release may generate bad code.
  436. .bP
  437. Like bison, \fB\*n\fP will add parameters specified via \fB%parse\-param\fP
  438. to \fByyparse\fP, \fByyerror\fP and (if configured for back-tracking)
  439. to the destructor declared using \fB%destructor\fP.
  440. Bison puts the additional parameters \fIfirst\fP for
  441. \fByyparse\fP and \fByyerror\fP but \fIlast\fP for destructors.
  442. \fBYacc\fP matches this behavior.
  443. .
  444. .SH SEE ALSO
  445. \fBbison\fP(1),
  446. \fBbtyacc\fP(1),
  447. \fBlex\fP(1),
  448. \fBflex\fP(1),
  449. \fByacc\fP(1)