Title:    Lex program tips
  980   김윤중

IISPLParserGenerator V6.0

Summary of Lex language

  • Alphabet : 99 letters  : \n \t \r and \x20 (space) to \x7F (DEL; note that ~ is \x7E)
  • Operators  \ " . ^ $ [ ] - * + | ? { } ( ) / ' ,
             example [0-9a-zA-Z] [0-9]+ [a-z]*([a-z]|[0-9])*
             example  (\+|\-)?[0-9]+(\.[0-9]+)?(E(\+|\-)[0-9]+)?
  • c  any non-operator character c matches itself
    • a : {a}
  • '\xNN'  the character whose hexadecimal code is NN
    • '\x31' : {1}   '\x41' : {A}   '\xa' : {newline}
  • \c,[c],'c' character c literally
    • \a 'a' [a]
  • "s" string s literally
    • "*" : {*},   "+=" : {+=}
  • .   any single character except a newline [\n\r]
    • a.c   : {aac,abc,acc,adc, ... }
      [a.c] : {a,.,c}
      .at   : {hat,cat,bat, ... }
  • [s] a character set, any one of the characters in string s
    •  [abc] : {a,b,c}
       [a-z] : {a,b,c,d,e,..., z}
       [abcx-z] : {a,b,c,x,y,z}  = [a-cx-z]
       [".$[*+|?{}()/'\^\-\]] : {",.,$,[,*,+,|,?,{,},(,),/,',^,-,]}
                but   [\\], [\], [-], [^] are not allowed
       [hc]at : {hat,cat}
    • Inside a character set, only the three characters -, ^, ] are treated as operators, so they must be written with the escape character: \-, \^, \]
  • [^s] any one character not in string s; the same as the complement of the character set L(s), that is, Alphabet - L(s)
    • if Alphabet is [A-Za-z], then
          [^ABC] : {D,E,...,Z,a,b,...,z}
          [^a-z] : {A,B,...,Z}
    • [^hc]at : all strings of .at other than {hat,cat}  : L(.at) - {hat,cat}
  • ^   a starting position of any line or string
    • ^[hc]at  : {hat,cat}, but only at the beginning of the string or line
  • $   the ending position of the string or the position just before a string-ending newline. 
    • In line based tools, it matches the ending position of any line.
       [hc]at$  : {hat,cat}, but only at the end of the string or line : {hat\n,cat\n}
  • r*  zero or more strings matching r
    • a* : { ,a,aa,aaa, ... }
  • r+  one or more strings matching r
    • a+ : {a,aa,aaa, ... }
  • r?  zero or one r
    • a? : { , a}
  • r{m,n} : between m and n occurrences of r
    • a{1,5} : {a,aa,aaa,aaaa,aaaaa}
  • rs  an r followed by an s
    • ab : {ab}
  • r|s  an r or an s
    • a|b : {a,b}
  • (r) same as r
    • (a|b) : {a,b}
  • r/s  an r, but only when followed by an s
    • abc/123  : {abc}, but only when followed by 123
  • // line comment
    • // any characters up to \n
  • %{  any characters %} multiple-line comment
    • %{
               multiple-line comments
               this text is copied into the target program
               the string "%}" is not allowed inside
      %}
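
Several of the patterns in the summary can be checked quickly with Python's `re` module, whose syntax agrees with the notation above for these cases. This is only a sketch for experimenting with the regular expressions, not the matcher used by the generator. The helper name `matches` is made up for illustration, and the `r/s` operator has no direct Python equivalent, so a lookahead `(?=s)` is used as the closest stand-in.

```python
import re

# Pattern copied from the operator examples in the summary.
NUMBER = r"(\+|\-)?[0-9]+(\.[0-9]+)?(E(\+|\-)[0-9]+)?"


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True when the whole text is in L(pattern)."""
    return re.fullmatch(pattern, text) is not None


# Signed number with optional fraction and exponent.
assert matches(NUMBER, "-12.5E+3")
assert matches(NUMBER, "42")
assert not matches(NUMBER, "1E5")   # this pattern requires a sign after E
assert not matches(NUMBER, ".5")    # at least one digit must come first

# Character sets and their complements.
assert matches(r"[hc]at", "hat") and matches(r"[hc]at", "cat")
assert not matches(r"[hc]at", "bat")
assert matches(r"[^hc]at", "bat")
assert matches(r"[abcx-z]", "y") and matches(r"[a-cx-z]", "y")

# Repetition operators.
assert matches(r"a*", "") and not matches(r"a+", "")
assert matches(r"a{1,5}", "aaaaa") and not matches(r"a{1,5}", "aaaaaa")

# Trailing context abc/123, approximated with a lookahead:
# only "abc" is consumed, and only when "123" follows.
m = re.match(r"abc(?=123)", "abc123")
assert m is not None and m.group() == "abc"
assert re.match(r"abc(?=123)", "abc456") is None
```

The quoted forms such as '\x31' and the // and %{ %} comments are specific to the Lex notation and are not covered by this check.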

 

Lexical Analyzer generator

  • %{
              comment lines
              any character strings without "%}"
    %}
    // comment
    def1 re
    def2 re
    %%
    //comment
    re     { yyoval=bexbuf; return("ID"); }
    .|[\n\r]   { return("WS"); }
    %%
    aux procedures
    ...
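
To see how a rules section like the one above turns input into tokens, here is a minimal Python sketch. It assumes the conventional Lex behavior (the longest match wins, and the earlier rule wins a tie); whether IISPLParserGenerator follows exactly this policy is an assumption. The token names "ID" and "WS" come from the skeleton, while "NUM", the concrete patterns, and the function name `tokenize` are made up for illustration.

```python
import re
from typing import Iterator, List, Tuple

# (token name, pattern) pairs in rule order. The patterns are illustrative
# assumptions; only the names "ID" and "WS" appear in the skeleton.
RULES: List[Tuple[str, str]] = [
    ("ID", r"[a-z]([a-z]|[0-9])*"),
    ("NUM", r"[0-9]+"),
    ("WS", r".|[\n\r]"),  # catch-all, like the last rule of the skeleton
]

COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def tokenize(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (token name, lexeme) pairs using longest-match, first-rule-wins."""
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, rx in COMPILED:
            m = rx.match(text, pos)
            # A strictly longer match replaces the current best, so an
            # earlier rule keeps priority when two rules match equally far.
            if m is not None and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        yield best_name, text[pos:pos + best_len]
        pos += best_len


tokens = list(tokenize("x1 42"))
assert tokens == [("ID", "x1"), ("WS", " "), ("NUM", "42")]
```

Note that Python's `re` tries alternatives left to right rather than searching for the longest alternative, so this sketch matches Lex behavior only for patterns like the ones used here, where each pattern on its own already consumes as much as it can.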