Term
What are top-down and bottom-up parsing? |
|
Definition
- Top-down:
- Predictive parsing
- Start with start symbol
- Try to derive string
- Bottop-up:
- Start with string
- Try to arrive at start symbol
- Use productions in reverse order
|
|
|
Term
What is right-to-left/left-to-right parsing? |
|
Definition
- Left-to-right
- Examine source string from left-to-right
- Right-to-left
- Start with last symbol and work your way backwards
- Not used for commpilers due to efficiency reasons
|
|
|
Term
|
Definition
- First L means Left-to-right scan
- Second L means it constructs Leftmost defivation
- k is the k-symbol lookahead
|
|
|
Term
What are the limitations using regular languages? |
|
Definition
Describing {"()", ...} using a regular expression is impossible as DFA statse have no memory
That is why Chomsky is used |
|
|
Term
What is the Chomsky hierarchy? |
|
Definition
- Type 3: Regular Languages
- Regular expressions, finite automata
- Type 2: Context-free languages
- Type 1: Context-Sensitive Languages
- Type 0: Recursive Enumerable
|
|
|
Term
What are the derivations and what type of derivations are there? |
|
Definition
Start with string S, select non-terminal n in string and one rule n -> s1,...sk and replace n by s1...sk - and repeat until you reach a string w with only terminal symbols
There can be Leftmost, rightmost derivations |
|
|
Term
|
Definition
- Leaves
- Terminal symbols
- Frontier: Generated string
- Internal nodes
- Non-terminal symbols
- Children: ordered, obtained using production rule
- Root
|
|
|
Term
What are ambiguous grammars? |
|
Definition
If two or more distinct parse trees exist for some string (Right and leftmost derivations might produce this) |
|
|
Term
|
Definition
All productions of form A -> xB or A -> x
Regex/DFA/NFA |
|
|
Term
How can we define parsing? |
|
Definition
- Given a string and a CFG, find a derivation for the string from the grammar or determine that no derivation exists
- Additional restrictions on CFGs are often required to facilitate efficient parsing
- Scan direction (right/left)
- Leftmost derivation -> reduces number of derivations
- Still ambiguous grammars exist
|
|
|
Term
What are the main principles of top-down predictive parsing? |
|
Definition
- Begins with the start symbol
- Parser code can be derived from the production rules of the grammar - one function for each nonterminal
- Left-to-right scanning
- Constructs a Left-most derivation
- Predict which production to apply based on current token and lookahead of next k tokens
- No backtracking -> requires unambiguous grammar
- LL(k) grammars are unambiguous
- Allows parsing in linear time
- Simple algorithm -> easy to understand and implement
|
|
|
Term
Show an example of a top-down parsing |
|
Definition
|
|
Term
How would you write this as a method in java/c?
[image] |
|
Definition
- void S() {
- switch (tok) {
- case IF: eat(IF); E(); eat(THEN); S(); eat(ELSE); S(); break;
- case BEGIN: eat(BEGIN); S(); L(); break;
- case PRINT: eat(PRINT); E(); break;
- default: error()
- }
- }
|
|
|
Term
What is recursive descent parsing? |
|
Definition
A type of top-down algorithm where we begin with the non-terminal, and we always try the rules for the non-terminal in order - needs backtracking in order to continue with the rules if one doesn't work.
Example with grammar:
E -> T | T + E
T -> int | int * T | (E)
Reading input (int)
We start with (:
E would take us to T, and T would take us to int. int does not match '(', so we backtrack to T and we try again with the next one which is int * T, but it does not match either. So we backtrack with the last one and we get (E). '(' does match '(', so we go to the next element, which is int. We have E so we expand it to T, then T is expanded to int - indeed int = int, so we proceed to the final one ')', which matches ')'. |
|
|
Term
|
Definition
- It is the set of terminal symbols that begin strings produced by y
- For a grammar with ε-productions, this is straightforward:
- FIRST(s) = {s} for s in T
- FIRST(A) = U First(B) for all A -> B
- FIRST(Ax) = FIRST(A)
- FIRST(xA) = FIRST(x)
|
|
|
Term
|
Definition
- It is the set of terminal symbols t that can immediately follow X, i.e. there exists a derivation from S -> ... -> aXnB
|
|
|
Term
|
Definition
Is true if non-terminal X can produce ε |
|
|
Term
What is the algorithm to Compute FIRST? |
|
Definition
- For all non-terminals X do First(X) := {};
- For all terminals t do FIRST(t) := {t};
- DO
- For every production X->s1...sk
- FIRST(X) := FIRST(X) U FIRST(S1);
- For i :=1 to k=1 do
- if nulllable(s1) then
- else exit;
- UNTIL NO MORE CHANGES
|
|
|
Term
What is the algorithm to compute FOLLOW? |
|
Definition
- for all non-terminals X do FOLLOW(X) := {};
- DO
- for every non-terminal X
- for each production of the form A->yXB
- FOLLOW(X) :=
- if nullable(B) then
- FOLLOW(X) :=
- UNTIL NO MORE CHANGES
|
|
|
Term
What is the algorithm to compute nullable(X)? |
|
Definition
- for all symbols x
- DO
- change := false
- for every production X -> s1, ... sk
- if nullable(X) = false and s1 ...sk are nullabl
- nullable(X) := true;
- change := true;
- UNTIL CHANGE = FALSE
|
|
|
Term
How to construct the parser? |
|
Definition
- Make table of applicable productions
- Rows: non-terminals X
- Columns: terminal symbols c (= next input token
- Production X-y is applicable iff
- c in FIRST(y) or
- NULLABLE(y) and c in FOLLOW(X)
- If more than one applicable production for some pair(X, c), then grammar is not LL(1).
- If no conflicts, to translate to java just one procedure per terminal as done in previous examples
|
|
|
Term
What are the main principles of predictive parsing? |
|
Definition
- Compute Nullable, FIRST, and FOLLOW sets
- Compute predictive parsing table
- Generate parsing code if no conflicts(i.e. no duplicate entries in the parsing table)
|
|
|
Term
Give a summary of the implemenations of nullable, FIRST and FOLLOW |
|
Definition
|
|
Term
Give a summary of the predictive parsing table |
|
Definition
|
|
Term
What are the problems with recursive descent parsing? |
|
Definition
- Ambiguous grammars
- Always lead to duplicate entres
- Either rewrite grammar without ambiguity or add a "disambiguating" rule:
- Where A -> x and A -> B are both applicable always use A -> x
|
|
|
Term
When is grammar not LL(1)? |
|
Definition
- Let A -> X|B be productions of grammar G
- G is LL(1) iff:
- For no t in T, do both X and B derive strings that begin with t
- At most one of X and B is nullable
- If B is nullable, then X does not derive strings that begin with a in FOLLOW(A)
|
|
|
Term
What can error recovery be implemented? |
|
Definition
- Raise exception
- Insertion
- e.g. print error, pretend everything ok
- Deletion
- Skil until a token in FOLLOW is reached
|
|
|