CPSC 3520 - DAY 2 JANUARY 16, 2018 ================================================================================ Software development is relatively expensive and time-consuming A possible point of view: software, not hardware, is holding back advances in computing. Programming language syntax defines the allowable arrangement of symbols in programs, especially program fragments. For example, the syntax may constrain fragments to have matching parentheses, ::= main ::= int | void ::= lparens rparens Syntax also catalogs the basic elements of the language. Formal Grammars --------------- Grammers generate languages The syntax of a programming language may be described through specification of a grammar There are many types of grammars Part of a compiler is a grammatical recognizer or parser. A compiler or interpreter for a higher level language attempts to generate lower level machine instructions by determining the desired structure of the input high level language program through parsing the input or source code. The compiler/ interpreter begins with processing of the program as an input string. This processing is based upon checking if the string meets a formal specification of the programming language syntax. An Alphabet and Forming Strings ------------------------------- An alphabet (V) is a finite, nonempty set of symbols, e.g.: V = {a, b, c, ... z} The concatenation of a and b, denoted a o b produces a sequence of two symbols simply denoted hereafter as ab. A string has a natural ordering from left to right. Often it is convenient to denote a string like x = aaa...a, where a symbol (or sequence of symbols) is repeated n times as x = a ^ n aabbbcccc = a^2 b^3 c^4 The Closure Set --------------- Define V+ as V+ = V U V^2 U V^3 U ... V+ is the set of all nonempty sentences producible using V. Adding the empty string to V+ produces V*, i.e., V* = {E} U V+ V* is denoted the closure (set) of V and V+ = V* - {E} is often called the positive closure of V Languages Using Strings ----------------------- Grammars are used with V to give some meaning to a subset of strings L is called a language Furthermore: Languages are generated by grammars. Another viewpoint is that a grammar restricts the production of strings from V. Grammars are used to recognize (parse) elements of a language. Grammar: Formal Definition -------------------------- A grammar G(V_T, V_N, P, S) consists of the following four entities: 1. A set of terminal or primitive symbols (primitives) denoted V_T (or sigma). These are the elemental building blocks of the grammar. 2. A set of non-terminal symbols, or variables, which are used as intermediate quantities in the generation of an outcome consisting solely of terminal symbols. The set is denoted as V_N or alternatively N. A sample G_s S -> AB S -> C A -> C A -> a B -> b B -> c C -> d Stop when you receive a set consisting solely of terminals.