From: Erick Tryzelaar
Date: Sun, 30 Mar 2008 19:14:31 +0000 (+0000)
Subject: Fix some documentation for the tutorial.
X-Git-Url: http://plrg.eecs.uci.edu/git/?p=oota-llvm.git;a=commitdiff_plain;h=d564686dff04c329285a68308cdf84df7afd5e37

Fix some documentation for the tutorial.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48966 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/docs/tutorial/OCamlLangImpl1.html b/docs/tutorial/OCamlLangImpl1.html
index 4b252a411ea..c7b0954021e 100644
--- a/docs/tutorial/OCamlLangImpl1.html
+++ b/docs/tutorial/OCamlLangImpl1.html
@@ -219,15 +219,15 @@ type token =

 Each token returned by our lexer will be one of the token variant values.
-An unknown character like '+' will be returned as Kwd '+'. If the
-curr token is an identifier, the value will be Ident s. If the
-current token is a numeric literal (like 1.0), the value will be
-Number 1.0.
+An unknown character like '+' will be returned as Token.Kwd '+'. If
+the curr token is an identifier, the value will be Token.Ident s. If
+the current token is a numeric literal (like 1.0), the value will be
+Token.Number 1.0.

 The actual implementation of the lexer is a collection of functions driven
-by a function named lex. The lex function is called to
-return the next token from standard input. We will use
+by a function named Lexer.lex. The Lexer.lex function is
+called to return the next token from standard input. We will use
 Camlp4 to simplify the tokenization of the standard input. Its definition starts as:

@@ -245,13 +245,13 @@ let rec lex = parser

-lex works by recursing over a char Stream.t to read
+Lexer.lex works by recursing over a char Stream.t to read
 characters one at a time from the standard input. It eats them as it recognizes
-them and stores them in in a token variant. The first thing that it
-has to do is ignore whitespace between tokens. This is accomplished with the
+them and stores them in in a Token.token variant. The first thing that
+it has to do is ignore whitespace between tokens. This is accomplished with the
 recursive call above.

-The next thing lex needs to do is recognize identifiers and
+The next thing Lexer.lex needs to do is recognize identifiers and
 specific keywords like "def". Kaleidoscope does this with this a pattern match and a helper function.

@@ -300,8 +300,8 @@ and lex_number buffer = parser

 This is all pretty straight-forward code for processing input. When reading a numeric value from input, we use the ocaml float_of_string function
-to convert it to a numeric value that we store in NumVal. Note that
-this isn't doing sufficient error checking: it will raise Failure
+to convert it to a numeric value that we store in Token.Number. Note
+that this isn't doing sufficient error checking: it will raise Failure
 if the string "1.23.45.67". Feel free to extend it :). Next we handle comments:
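
The error-checking gap mentioned in this hunk can be illustrated with a small sketch. It is not part of the commit or of the tutorial's lexer: number_of_string is a hypothetical helper name, and the sketch assumes OCaml 4.05 or later, where float_of_string_opt is available in the standard library.

```ocaml
(* Hypothetical helper, not from the tutorial: convert the text of a numeric
   literal to a float, reporting malformed input as an Error value instead
   of letting float_of_string raise Failure. *)
let number_of_string (s : string) : (float, string) result =
  match float_of_string_opt s with
  | Some f -> Ok f
  | None -> Error (Printf.sprintf "malformed numeric literal: %S" s)

let () =
  match number_of_string "1.23.45.67" with
  | Ok f -> Printf.printf "parsed %f\n" f
  | Error msg -> prerr_endline msg
```

A lexer built along these lines could turn the Error case into its own diagnostic rather than propagating an uncaught exception; how that is reported is left open here.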

diff --git a/docs/tutorial/OCamlLangImpl2.html b/docs/tutorial/OCamlLangImpl2.html
index 2aff51a030d..7d60aa6f9fd 100644
--- a/docs/tutorial/OCamlLangImpl2.html
+++ b/docs/tutorial/OCamlLangImpl2.html
@@ -240,13 +240,13 @@ error", where if the token before the ?? does not match, then
 Stream.Error "parse error" will be raised.

 2) Another interesting aspect of this function is that it uses recursion by
-calling parse_primary (we will soon see that parse_primary can
-call parse_primary). This is powerful because it allows us to handle
-recursive grammars, and keeps each production very simple. Note that
-parentheses do not cause construction of AST nodes themselves. While we could
-do it this way, the most important role of parentheses are to guide the parser
-and provide grouping. Once the parser constructs the AST, parentheses are not
-needed.
+calling Parser.parse_primary (we will soon see that
+Parser.parse_primary can call Parser.parse_primary). This is
+powerful because it allows us to handle recursive grammars, and keeps each
+production very simple. Note that parentheses do not cause construction of AST
+nodes themselves. While we could do it this way, the most important role of
+parentheses are to guide the parser and provide grouping. Once the parser
+constructs the AST, parentheses are not needed.

The next simple production is for handling variable references and function calls:

@@ -345,12 +345,12 @@ let main () =

 For the basic form of Kaleidoscope, we will only support 4 binary operators (this can obviously be extended by you, our brave and intrepid reader). The
-precedence function returns the precedence for the current token,
-or -1 if the token is not a binary operator. Having a Hashtbl.t makes
-it easy to add new operators and makes it clear that the algorithm doesn't
+Parser.precedence function returns the precedence for the current
+token, or -1 if the token is not a binary operator. Having a Hashtbl.t
+makes it easy to add new operators and makes it clear that the algorithm doesn't
 depend on the specific operators involved, but it would be easy enough to eliminate the Hashtbl.t and do the comparisons in the
-precedence function. (Or just use a fixed-size array).
+Parser.precedence function. (Or just use a fixed-size array).
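
As an illustration of the table-driven lookup this paragraph describes, here is a minimal sketch. The hunk itself only confirms that there are 4 binary operators, that '+' has precedence 20, and that non-operators yield -1; the table name binop_precedence, the choice of the other three operators, and their values are assumptions made for the example.

```ocaml
(* Sketch only: a precedence table keyed by operator character.
   Only '+' = 20 is stated in the surrounding text; the other three
   entries are assumed for illustration. *)
let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10

let () =
  Hashtbl.add binop_precedence '<' 10;
  Hashtbl.add binop_precedence '+' 20;
  Hashtbl.add binop_precedence '-' 20;
  Hashtbl.add binop_precedence '*' 40

(* Return the precedence of [c], or -1 if [c] is not a binary operator. *)
let precedence (c : char) : int =
  try Hashtbl.find binop_precedence c with Not_found -> -1
```

Adding an operator is then a single Hashtbl.add call, which is the extensibility argument the text makes for keeping the table.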

 With the helper above defined, we can now start parsing binary expressions. The basic idea of operator precedence parsing is to break down an expression
@@ -376,19 +376,19 @@ and parse_expr = parser

-parse_bin_rhs is the function that parses the sequence of pairs for
-us. It takes a precedence and a pointer to an expression for the part that has been
-parsed so far. Note that "x" is a perfectly valid expression: As such, "binoprhs" is
-allowed to be empty, in which case it returns the expression that is passed into
-it. In our example above, the code passes the expression for "a" into
-ParseBinOpRHS and the current token is "+".
+Parser.parse_bin_rhs is the function that parses the sequence of
+pairs for us. It takes a precedence and a pointer to an expression for the part
+that has been parsed so far. Note that "x" is a perfectly valid expression: As
+such, "binoprhs" is allowed to be empty, in which case it returns the expression
+that is passed into it. In our example above, the code passes the expression for
+"a" into Parser.parse_bin_rhs and the current token is "+".

-The precedence value passed into parse_bin_rhs indicates the
-minimal operator precedence that the function is allowed to eat. For
-example, if the current pair stream is [+, x] and parse_bin_rhs is
-passed in a precedence of 40, it will not consume any tokens (because the
-precedence of '+' is only 20). With this in mind, parse_bin_rhs starts
-with:
+The precedence value passed into Parser.parse_bin_rhs indicates the
+minimal operator precedence that the function is allowed to eat. For
+example, if the current pair stream is [+, x] and Parser.parse_bin_rhs
+is passed in a precedence of 40, it will not consume any tokens (because the
+precedence of '+' is only 20). With this in mind, Parser.parse_bin_rhs
+starts with:

@@ -497,10 +497,10 @@ context):

 has higher precedence than the binop we are currently parsing. As such, we know that any sequence of pairs whose operators are all higher precedence than "+" should be parsed together and returned as "RHS". To do this, we recursively
-invoke the parse_bin_rhs function specifying "token_prec+1" as the
-minimum precedence required for it to continue. In our example above, this will
-cause it to return the AST node for "(c+d)*e*f" as RHS, which is then set as the
-RHS of the '+' expression.
+invoke the Parser.parse_bin_rhs function specifying "token_prec+1" as
+the minimum precedence required for it to continue. In our example above, this
+will cause it to return the AST node for "(c+d)*e*f" as RHS, which is then set
+as the RHS of the '+' expression.
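
The "token_prec+1" recursion can be seen in isolation in the sketch below. It is not the tutorial's Camlp4 stream parser: it works on a plain token list, handles only numeric primaries, and the type names, constructor names, and every precedence value other than '+' = 20 are assumptions made for the example.

```ocaml
(* Simplified, hypothetical token and AST types for this sketch. *)
type token = Number of float | Kwd of char
type expr = Num of float | Binary of char * expr * expr

(* Precedence of an operator character, or -1 for a non-operator. *)
let precedence = function
  | '<' -> 10
  | '+' | '-' -> 20
  | '*' -> 40
  | _ -> -1

(* Primary expressions are just numbers here. *)
let parse_primary = function
  | Number n :: rest -> (Num n, rest)
  | _ -> failwith "expected a number"

(* Consume (operator, primary) pairs whose operator precedence is at least
   [expr_prec] (assumed non-negative); return the tree and unused tokens. *)
let rec parse_bin_rhs expr_prec lhs tokens =
  match tokens with
  | Kwd op :: rest when precedence op >= expr_prec ->
      let token_prec = precedence op in
      let rhs, rest = parse_primary rest in
      (* If the next operator binds tighter than [op], let it take [rhs]
         as its left operand first. *)
      let rhs, rest =
        match rest with
        | Kwd next :: _ when precedence next > token_prec ->
            parse_bin_rhs (token_prec + 1) rhs rest
        | _ -> (rhs, rest)
      in
      parse_bin_rhs expr_prec (Binary (op, lhs, rhs)) rest
  | _ -> (lhs, tokens)

let parse_expr tokens =
  let lhs, rest = parse_primary tokens in
  parse_bin_rhs 0 lhs rest
```

With these assumed precedences, parsing 1 + 2 * 3 + 4 groups 2 * 3 first and then associates the two additions to the left, and a call with a minimum precedence of 40 on a stream starting with '+' consumes nothing, matching the behaviour described for Parser.parse_bin_rhs.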

 Finally, on the next iteration of the while loop, the "+g" piece is parsed and added to the AST. With this little bit of code (14 non-trivial lines), we
@@ -705,7 +705,7 @@ course.) To build this, just compile with:

 # Compile
 ocamlbuild toy.byte
 # Run
-./toy
+./toy.byte