CPSC 3520 - DAY 4 JANUARY 23, 2018 ================================================================================ COCKE-YOUNGER-KASAMI (CYK) PARSING ALGORITHM -------------------------------------------- The CYK algorithm is a parsing approach which will parse string x in a number of steps proportional to |x|^3. The CYK algorithm requires the CFG to be in the Chomsky Normal Form (CNF) With this restriction, the derivation of any string involves a series of binary decisions. In chomsky normal form, each production of G must be in the form of either: A -> BC or A -> a Given a string x = x_1, x_2,...,x_n, where x_i E V_T, |x| = n, and a grammar G, we form a triangular table with entries t_ij indexed by i and j where 1 <= i <= n and 1 <= j <= (n - i + 1). The origin is at i = j = 1, and entry t_11 is the lower left hand entry in the table. t_1n is the uppermost entry in the table. This structure is shown below for the case n = 4 ^ | (Strings of length 4) 4|+-----+ || t_14| |+-----+ (Strings of length 3) 3|+-----++-----+ || t_13|| t_23| |+-----+------+ (Strings of length 2) 2|+-----++-----+-----+ || t_12|| t_22| t_32| |+-----+------+-----+ (Strings of length 1) 1|+-----++-----+-----+-----+ || t_11|| t_21| t_31| t_41| |+-----+------+-----+-----+ +-----------------------------------> 1 2 3 4 The CYK parse table is built, starting from location (1, 1). If a substring of x, beginning with x_i, and of length j can be derived from a nonterminal, this nonterminal is placed into cell (i, j) If cell (1, n) contains S, the table contains a valid derivation of x in L(G) It is convenient to the list the x_i, starting with i = 1, under the bottom row of the table. SAMPLE GRAMMAR PRODUCTIONS -------------------------- S -> AB | BB A -> CC | AB | a B -> BB | CA | b C -> BA | AA | b x = aabb will be the string we are parsing. ^ | (Strings of length 4) 4|+-----+ ||A,S,B| |+-----+ (Strings of length 3) 3|+-----+-----+ ||C,A |S,A,C| |+-----+-----+ (Strings of length 2) 2|+-----+-----+-----+ || C | S, A|S,B,A| |+-----+-----+-----+ (Strings of length 1) 1|+-----+-----+-----+-----+ || A | A | B, C| B, C| |+-----+-----+-----+-----+ +-----------------------------------> a a b b