[ 3 / biz / cgl / ck / diy / fa / g / ic / jp / lit / sci / tg / vr / vt ] [ index / top / reports / report a bug ] [ 4plebs / archived.moe / rbt ]

Due to resource constraints, /g/ and /tg/ will no longer be archived or available. Other archivers continue to archive these boards.Become a Patron!

/sci/ - Science & Math


View post   

[ Toggle deleted replies ]
>> No.7899496 [View]
File: 8 KB, 446x195, abstract syntax tree.png [View same] [iqdb] [saucenao] [google] [report]

Parsing is one of the steps of compilation. Formally a compiler is a function/program that takes input in one language and outputs in another language.

I'm going to skip many steps and details. The first step is typically lexing. A lexer for a language is a function that takes a string as input and
>checks that the string is in the language
>converts the string into tokens representing the atoms of your target language (perhaps capturing some amount of semantics, like having a separate token for a subtraction vs negative sign).
A parser for a language is a function takes a sequence/list of tokens and outputs an abstract syntax tree.
After this the compiler will take the abstract syntax tree and convert it to the new language.

An abstract syntax tree is an abstract representation of your input string (see picture for a simple example) that attempts to capture semantics. It is not to be confused with a parse tree. The abstract syntax tree is typically written in the form
>((node1 (node2 node4 ) node3 (node 5 node6 (node7 node8)))
where the basic structure is
>(root leg1 leg2 ... legk)
where each leg can also be a root (and thus a subtree).

Lisp is an unusual language because it is actually written in it's own abstract syntax tree (hence all the parenthesis). The time complexity of the parser is affected by the properties of the language. Also, being able to prove that the language produces a unique abstract syntax tree (i.e. that given a sentence there exists only a unique way to parse that sentence with regards to semantics) is also affected by said properties. In the past this has caused problems with ambiguity.

My explanation may not be very good and others may have more to add but that's the general idea. Pic related, an abstract syntax tree and corresponding input string.

View posts [+24] [+48] [+96]