JavaCC Ambiguities: How do I tell the parser to choose a certain match from the list of “longer matches”?


For some input, the parser reports "Possible kinds of longer matches : { <EXPRESSION>, <TEXT> }", but for some odd reason it chooses the wrong one.
This is the grammar:
SKIP :
{
  " "
| "\r"
| "\t"
| "\n"
}

TOKEN :
{
  < DOT : "." >
| < LBRACE : "{" >
| < RBRACE : "}" >
| < LBRACKET: "[" >
| < RBRACKET: "]" >
| < #LETTER : [ "a"-"z" ] >
| < #DIGIT : [ "0"-"9" ] >
| < #IDENTIFIER: < LETTER > (< LETTER >)* >
| < EXPRESSION : (< IDENTIFIER> < DOT > < IDENTIFIER> < DOT > < IDENTIFIER> ((< DOT > < IDENTIFIER> )* | < LBRACKET > (< DIGIT>)* < RBRACKET >)*)*>
| < TEXT : (( < DOT >)* ( < LETTER > )+ (< DOT >)*)* >
}

void q0() :
{ Token token = null; }
{
  (
    < LBRACE > expression() < RBRACE >
  | ( token = < TEXT >
      {
        getTextTokens().add( token.image );
      }
    )
  )* < EOF >
}

void expression() :
{ Token token = null; }
{
  < EXPRESSION >
}
If we parse "a.bc.d" with this grammar, it reports "FOUND A <EXPRESSION> MATCH (a.bc.d)".
My question is: why did it tokenize the input as an <EXPRESSION> instead of a <TEXT>?
Also, how can I force the parser to choose the right path? I have tried countless LOOKAHEAD scenarios with no success.
The right path is, for instance, <TEXT> for the input "a.bc.d" and <EXPRESSION> for "{a.bc.d}".
Thanks in advance.
From the JavaCC FAQ:
If more than one regular expression describes the longest possible
prefix, then the regular expression that comes first in the .jj file
is used.
So a preference can be established by ordering ambiguous definitions accordingly.
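To see the quoted rule at work outside JavaCC, here is a minimal Java sketch that mimics the choice: the longest prefix wins, and on a tie the rule listed first wins. This is not JavaCC code. The class LongestMatchDemo, its helper names, and the java.util.regex patterns are assumptions made for illustration; the patterns are simplified approximations of <EXPRESSION> and <TEXT> that do not match the empty string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class LongestMatchDemo {
    // Hypothetical regex approximations of the token definitions in the question.
    static final String ID = "[a-z]+";
    static final Pattern EXPRESSION = Pattern.compile(
        "(?:" + ID + "\\." + ID + "\\." + ID + "(?:\\." + ID + "|\\[[0-9]*\\])*)+");
    static final Pattern TEXT = Pattern.compile("[a-z.]*[a-z][a-z.]*");

    // Length of the longest prefix of input that the pattern matches completely (0 if none).
    static int longestPrefix(Pattern p, String input) {
        for (int len = input.length(); len > 0; len--) {
            if (p.matcher(input.substring(0, len)).matches()) {
                return len;
            }
        }
        return 0;
    }

    // Returns the name of the winning rule, or null if no rule matches.
    // Rules are tried in declaration order; the strict '>' keeps the earlier rule on a tie.
    static String pick(Map<String, Pattern> rules, String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            int len = longestPrefix(rule.getValue(), input);
            if (len > bestLen) {
                best = rule.getKey();
                bestLen = len;
            }
        }
        return best;
    }

    // Builds the rule list in one of the two possible declaration orders.
    static Map<String, Pattern> rules(boolean expressionFirst) {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        if (expressionFirst) {
            rules.put("EXPRESSION", EXPRESSION);
            rules.put("TEXT", TEXT);
        } else {
            rules.put("TEXT", TEXT);
            rules.put("EXPRESSION", EXPRESSION);
        }
        return rules;
    }

    public static void main(String[] args) {
        System.out.println(pick(rules(true), "a.bc.d"));   // order used in the question
        System.out.println(pick(rules(false), "a.bc.d"));  // TEXT declared first
    }
}
```

For "a.bc.d" both rules match all six characters, so only the order decides: the first call prints EXPRESSION and the second prints TEXT. The order is fixed for the whole grammar, though, and the token manager runs independently of the parser, so LOOKAHEAD in a production cannot change which token kind is produced. Reordering alone would also turn the contents of "{a.bc.d}" into a <TEXT>.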
If expressions appear only within { braces }, only expressions (and white space) appear within braces, and braces are used only to delimit expressions, then you can do something like the following. See question 3.11 in the FAQ if you are not familiar with lexical states.
// The following abbreviations hold in any state.
TOKEN : {
  < #LETTER : [ "a"-"z" ] >
| < #DIGIT : [ "0"-"9" ] >
| < #IDENTIFIER: < LETTER > (< LETTER >)* >
}

// Skip white space in either state.
<DEFAULT,INBRACES> SKIP : { " " | "\r" | "\t" | "\n" }

// The following are recognized in the default state.
// A left brace forces a switch to the INBRACES state.
<DEFAULT> TOKEN : {
  < DOT : "." >
| < LBRACE : "{" > : INBRACES
| < LBRACKET: "[" >
| < RBRACKET: "]" >
| < TEXT : (( < DOT >)* ( < LETTER > )+ (< DOT >)*)* >
}

// A right brace forces a switch to the DEFAULT state.
<DEFAULT, INBRACES> TOKEN : {
  < RBRACE : "}" > : DEFAULT
}

// Expressions are only recognized in the INBRACES state.
<INBRACES> TOKEN : {
  < EXPRESSION : (< IDENTIFIER> < DOT > < IDENTIFIER> < DOT > < IDENTIFIER> ((< DOT > < IDENTIFIER> )* | < LBRACKET > (< DIGIT>)* < RBRACKET >)*)*>
}
It looks a bit dodgy that DOT is defined in one state and used in another. However, I think that it works fine.
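To illustrate how the state switch separates the two token kinds, here is a rough Java simulation of the idea. It is a sketch, not JavaCC output: the class LexicalStateDemo, the Rule helper, and the regex patterns are hypothetical, and only a reduced rule set (LBRACE, RBRACE, TEXT, EXPRESSION) is modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LexicalStateDemo {
    enum State { DEFAULT, INBRACES }

    // One token rule: the state it is active in, its pattern,
    // and the state to switch to after a match (null means stay).
    static final class Rule {
        final String kind;
        final State active;
        final Pattern pattern;
        final State next;

        Rule(String kind, State active, String regex, State next) {
            this.kind = kind;
            this.active = active;
            this.pattern = Pattern.compile(regex);
            this.next = next;
        }
    }

    static final String ID = "[a-z]+";

    // Simplified, assumed approximations of the token definitions.
    static final List<Rule> RULES = Arrays.asList(
        new Rule("LBRACE", State.DEFAULT, "\\{", State.INBRACES),
        new Rule("RBRACE", State.DEFAULT, "\\}", State.DEFAULT),
        new Rule("TEXT", State.DEFAULT, "[a-z.]*[a-z][a-z.]*", null),
        new Rule("RBRACE", State.INBRACES, "\\}", State.DEFAULT),
        new Rule("EXPRESSION", State.INBRACES,
            "(?:" + ID + "\\." + ID + "\\." + ID + "(?:\\." + ID + "|\\[[0-9]*\\])*)+", null));

    // Length of the longest prefix of rest that the pattern matches completely (0 if none).
    static int longestPrefix(Pattern p, String rest) {
        for (int len = rest.length(); len > 0; len--) {
            if (p.matcher(rest.substring(0, len)).matches()) {
                return len;
            }
        }
        return 0;
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        State state = State.DEFAULT;
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == ' ' || c == '\r' || c == '\t' || c == '\n') {
                pos++; // white space is skipped in either state
                continue;
            }
            String rest = input.substring(pos);
            Rule best = null;
            int bestLen = 0;
            for (Rule rule : RULES) {
                if (rule.active != state) {
                    continue; // only rules of the current state compete
                }
                int len = longestPrefix(rule.pattern, rest);
                if (len > bestLen) { // on a tie the earlier rule is kept
                    best = rule;
                    bestLen = len;
                }
            }
            if (best == null) {
                throw new IllegalArgumentException("Lexical error at position " + pos);
            }
            tokens.add(best.kind + "(" + rest.substring(0, bestLen) + ")");
            pos += bestLen;
            if (best.next != null) {
                state = best.next;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a.bc.d {a.bc.d}"));
    }
}
```

Because TEXT and EXPRESSION never compete in the same state, the same characters come out as TEXT(a.bc.d) outside the braces and EXPRESSION(a.bc.d) inside them, whatever the declaration order.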
