Mealy J
Table of Contents
These are notes for help when I again forget again how to use J's ;:
dyad.
;: J's sequential machine operator
Hairy. The operation x ;: y
recognizes string \(y\) according to
machine \(x\). This verb encodes Mealy machines in J. Recall, (lol) a
Mealy machine is a 5-tuple \((S,S_0,\Sigma,\Lambda,T)\):
- \(S\) is the state space, (
ijr
in;:
docs) - \(S_0 \in S\) the initial state, the optional fourth component to
x
- \(\Sigma\) is the input alphabet, the classes of atoms of
y
determined bym
. - \(\Lambda\) is the output alphabet, the actions (specefied by code
f
) - \(T\) is a transition function \(T : S \times \Sigma \rightarrow S
\times \Lambda\) corresponding to
sc
table. Other presentations split this function in two functions \(T,G\) (see wiki's Mealy machines).
Parts of a machine \(x\)
The machine specification is given by x=. f ; sc; m; ijrd
, where the
last two components are optional.
Function code \(f\)
\(f\) is the function code, which determines what actions are taken (ie, the output style). For example, recognized words may be boxed using \(0\) or the intermediate states may be traced using \(5\).
\(f\) | definition ew/ev | comments | advice on use |
---|---|---|---|
0 | <y{~j+i.i-j |
box current word | words themselves are of interest |
1 | y{~j+i.i-j |
current word (no box) | some words omitted others retained |
2 | j,i-j |
word start & length | if words need further computation (see huffman lab) |
3 | c+q*r |
state table index | '' … |
4 | j,(i-j),c+q*r |
combo of 2/3 | '' … |
5 | i,j,r,c,s{~<r,c |
full trace | debugging |
It is said (in a jqt lab) that ev
may be used to form vector
constants but that it is not used often.
State transitions \(sc\)
\(sc\) describes the transition table. Items in the table are indexed by
pairs \(r,c\) where \(r\) is the current state and \(c\) the input
character's class. The entry at index \(r,c\) is a pair consisting of a
new state and an output action \(r',a\). The actual machine state is
comprised of an input pointer \(i\) of the current character in \(y\), a
word pointer \(j\) of the word begin (or _1
for no beginning), row
pointer \(r\) to state, and a column pointer \(d\) specifying an
additional action to take at termination.
code | action | comment |
---|---|---|
0 | no out | do nothing |
1 | j =. i |
start recognizing word or vector |
2 | j =. i [ ew (i,j,r,c) |
end recognizing word |
3 | j =. _1 [ ew (i,j,r,c) |
reset j (eg to drop spaces) |
4 | j =. i [ ev (i,j,r,c) |
analogue for emit vector |
5 | j =. _1 [ ev (i,j,r,c) |
analogous, again |
6 | stop |
Basically, we have a graph with vertices in domain \(r\) and edges labelled by character class \(c\), decorated with output actions \(a\) which accumulate machine state.
Letter Classes \(m\) (optional – default y
)
\(m\) is a description of character classes. It is either an array
mapping character to class or a list of boxed characters. In first
example in lab, Roger Hui maps \(m\) as ' '=a.
so that class 0 denotes
non-space chars and class 1 denotes spaces.
Seed \(ijrd\) (optional – default 0 _1 0 _1
)
Seed initial machine state with index pointer \(i\), word pointer \(j\), state \(r\) and \(d\) can specify an additional action \(a\) to take after recognition or not if negative.
Examples
Graphviz
First, some very shitty capabilities to visualize the machines. The color scheme corresponds to the actions \(a\), where
- gray => do nothing
- orange => bump start pointer
- green => emit word/vector and bump index
- blue => emit word/vector and reset index
- red => error/stop
CLR=: ;: 'dimgray darkorange seagreen royalblue seagreen royalblue maroon' dotviz =: 3 : 0 'S C M N'=. y G=. 'digraph {',LF,' overlap=scale;',LF for_uc. , { (<@i."0) 2 {. $ M do. 'u c' =. > uc [ 'v a' =. uc { M G=. G,' ',(>u{S),'->',(>v{S) G=. G,'[headlabel="',(>c{C),'",color=',(>a{CLR) G=. G,',labeldistance=1.8,style=dashed];',LF end. G=. G,'}',LF G 1!:2 < N,'.dot' 2!:0 'dot -Tsvg ',N,'.dot -o ../images/',N,'.svg' echo link './../images/',N,'.svg' )
Challenge 1 from machine lab
Write a word machine that excludes spaces (the first example in lab tokenized words but had leading spaces).
C=: ' '=a. M=: 2 2 2 $ 1 1 0 0 1 0 0 3 echo (0;M;C);:' four score and no more ice '
┌────┬─────┬───┬──┬────┬───┐ │four│score│and│no│more│ice│ └────┴─────┴───┴──┴────┴───┘
Dot viz
dotviz ((;:'out in') ,&< 'abc';' '),M;'word-machine'
S-expressions
First, the machine. The char classes are whitespace (with LF
separate to end comments), brackets ()
, string quotes "
, escapes,
and semicolons. There are states for being in limbo, being in token,
being in quote, escaping in quote, opening and closing paren, and
lastly comment.
LC =: 'ws' ;'()';'\"';'\\';';';'nl' ;'#';'''';'abc' C =: (' ',TAB);'()';'"' ;'\' ;';';(LF,CR);'#';'''' A =: (1 I.~ C e.&>~ ])"0 a. LV =: 'limbo';'symbol';'string';'escape';'paren';'comment';'hash';'quote' M =: 8 9 2 $ , (". ;. _2) 0 : 0 0 1 4 1 2 1 0 6 5 1 0 1 6 1 7 1 1 1 NB. neutral 0 3 4 2 2 2 1 0 5 2 0 3 0 6 7 2 1 0 NB. tok 2 0 2 0 0 3 3 0 2 0 2 0 2 0 2 0 2 0 NB. quot/" 2 0 2 0 2 0 2 0 2 0 2 0 2 0 2 0 2 0 NB. esc/\ 0 3 4 2 2 2 0 6 5 2 0 3 6 2 7 2 1 2 NB. bark/() 5 0 5 0 5 0 5 0 5 0 0 3 5 0 5 0 5 0 NB. comment 0 6 4 2 0 6 1 0 0 6 0 6 0 6 7 2 1 2 NB. # 0 3 4 2 4 2 0 6 5 2 0 3 6 2 7 2 1 2 NB. ' )
The above leads to a concise s-expression parser, which produces a depth vector along with a boxed array of tokens. To produce a sensible tree, the closing parens are dropped and the opening parens are used to signify list begins.
parse =: 3 : 0 m=. 0 <: p=. (;:'()') -/@(=/) t=. (0;M;A) ;: y t ,~&<&(m&#) (+/\ - 1&=) p )
Some examples:
sexps =: '((())())';'(a"(b\")" c) x';'(a ''(#\b)); (no comment)' echo (;:'sexps depths tokens') , (,. parse &>) sexps
┌────────────────────────┬───────────┬────────────────────────────┐ │sexps │depths │tokens │ ├────────────────────────┼───────────┼────────────────────────────┤ │((())()) │0 1 2 1 │┌─┬─┬─┬─┐ │ │ │ ││(│(│(│(│ │ │ │ │└─┴─┴─┴─┘ │ ├────────────────────────┼───────────┼────────────────────────────┤ │(a"(b\")" c) x │0 1 1 1 0 │┌─┬─┬──────┬─┬─┐ │ │ │ ││(│a│"(b\")│c│x│ │ │ │ │└─┴─┴──────┴─┴─┘ │ ├────────────────────────┼───────────┼────────────────────────────┤ │(a '(#\b)); (no comment)│0 1 1 1 2 0│┌─┬─┬─┬─┬───┬──────────────┐│ │ │ ││(│a│'│(│#\b│; (no comment)││ │ │ │└─┴─┴─┴─┴───┴──────────────┘│ └────────────────────────┴───────────┴────────────────────────────┘
Dot viz
dotviz (LV ,&< LC),M;'sexp'
J-expressions
J lexer, from machine lab slide 16/17.
LC =: 'ws';'abc';'012';'.';':';'\''';'+/-' LV =: ;: 'space jascii alpha numeric quote unquote' CS=: ' ';(a.{~,65 97+/i.26);'0123456789_';'.';':';'''' t=. 0 7 2$0 NB. S A 9 D C Q X t=.t,_2]\ 0 0 2 1 3 1 1 1 1 1 4 1 1 1 NB. 0 space t=.t,_2]\ 0 3 2 2 3 2 1 0 1 0 4 2 1 2 NB. 1 other t=.t,_2]\ 0 3 2 0 2 0 1 0 1 0 4 2 1 2 NB. 2 alphanumeric t=.t,_2]\ 0 5 3 0 3 0 3 0 1 0 4 4 1 4 NB. 3 numeric t=.t,_2]\ 4 0 4 0 4 0 4 0 4 0 5 0 4 0 NB. 4 quote t=.t,_2]\ 0 3 2 2 3 2 1 2 1 2 4 0 1 2 NB. 5 even quotes JM=: t dotviz (LV ,&< LC) , JM ; 'J'
CSV Parser
C=: (1 I.~ ('"';',';LF) e.&>~ ])"0 a. NB. states: 0 1 2 3 NB. limbo field quoted field escaped quote M =: 4 4 2 $ , ". ;. _2 ] 0 : 0 2 1 0 2 0 3 1 1 NB. limbo 0 6 0 2 0 3 1 0 NB. field 3 0 2 0 2 0 2 0 NB. quoted field, quote escapes self 2 0 0 2 0 3 2 0 NB. escaped quote or end of quoted field ) LC =: '\"';',';'LF';'abc' LS =: 'limbo';'field';'qfield';'esc' dotviz (LS ,&< LC),M;'nginx'
advent of code 2015 day 8
load '~/code/aoc/aoc.ijs' LC =: '\"';'\\';'x';'hex';'abc' CS=: (1 I.~ ('"';'\';'x';'0123456789abcdef') e.&>~ ])"0 a. LS =: 'string';'letter';'esc';'hexx';'hex' AM =: 5 5 2 $ , ". ;. _2 ] 0 : 0 1 1 0 6 0 6 0 6 0 6 NB. parse string or stop 0 2 2 0 1 2 1 2 1 2 NB. letter 1 2 1 2 3 0 0 6 0 6 NB. emit/do hex/bad 0 6 0 6 0 6 4 0 0 6 NB. read 1 hex/bad 0 6 0 6 0 6 1 2 0 6 NB. finish hex/bad ) dotviz (LS ,&< LC),AM;'aoc-2015-8'