authorPreston Pan <preston@nullring.xyz>2024-05-02 23:25:48 -0700
committerPreston Pan <preston@nullring.xyz>2024-05-02 23:25:48 -0700
commit52978baab0274bc594c8fd3cc749624a475229e2 (patch)
treee33b19050afaef26e66ec78500e07ebf6ce0a05c /blog/cognition.org
parentd6e2c196f799d0cd5bceb0b5c0260111e739c374 (diff)
a lot of stuff
Diffstat (limited to 'blog/cognition.org')
-rw-r--r--  blog/cognition.org  582
1 files changed, 582 insertions, 0 deletions
diff --git a/blog/cognition.org b/blog/cognition.org
new file mode 100644
index 0000000..f331ba0
--- /dev/null
+++ b/blog/cognition.org
@@ -0,0 +1,582 @@
+#+title: Cognition
+#+author: Preston Pan
+#+description: Other languages are inflexible and broken. Let's fix that.
+#+html_head: <link rel="stylesheet" type="text/css" href="../style.css" />
+#+html_head: <link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png">
+#+html_head: <link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
+#+html_head: <link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
+#+html_head: <link rel="manifest" href="/site.webmanifest">
+#+html_head: <link rel="mask-icon" href="/safari-pinned-tab.svg" color="#5bbad5">
+#+html_head: <meta name="msapplication-TileColor" content="#da532c">
+#+html_head: <meta name="theme-color" content="#ffffff">
+#+html_head: <meta name="viewport" content="width=1000; user-scalable=0;" />
+#+language: en
+#+OPTIONS: broken-links:t
+
+* Introduction
+Cognition is an active research project that Matthew Hinton and I have been working on for the past
+couple of months. Although my commit history for [[https://github.com/metacrank/cognition][this project]] has not been impressive, we came up with
+much of the theory together, working alongside each other in order to achieve one of the most generalized
+systems of syntax we know of. Let's take a look at the conceptual reason why cognition needs to exist, as
+well as some /baremetal cognition/ code (you'll see what I mean by this later). There's a paper about this language
+available in the repository, for those interested. Understanding cognition might require a
+lot of background in parsing, tokenization, and syntax, but I've done my best to write this in a very understandable way.
+The repository is available at https://github.com/metacrank/cognition, for your information.
+* The problem
+Lisp programmers claim that their system of s-expression code, together with its featureful macro system, makes lisp a
+metaprogrammable and generalized system. This is true, but there's something very broken with lisp: metaprogramming
+and programming /aren't the same thing/, meaning there will always be rigid syntax within lisp
+(its parentheses, or the fact that it needs characters that tell lisp to /read ahead/). The left parenthesis tells
+lisp that it needs to keep reading until the matching right parenthesis before it can stop
+and evaluate the whole expression. This makes the left and right parentheses unchangeable from within the language (not
+conceptually, but in practice it is not possible under many implementations), and, more importantly, it makes the process
+of retroactively changing the sequence in which these tokens are delimited /impossible/ without a heavy amount of string
+processing. Other languages have other ways in which they need to read ahead upon seeing a certain token in order to
+decide what to do. This process of having a program read ahead based on current input is called /syntax/.
+
+And as long as you read ahead, or assume a default way of reading ahead, you fall into the trap of having some form of syntax.
+Cognition is different in that it uses an antisyntax that is fully /postfix/. This has similarities with concatenative
+programming languages, but concatenative programming languages also suffer from two main problems: first, the introduction
+of the left and right bracket characters (which are in fact prefix notation, as they force a read ahead of the input stream),
+and second, the quote character for strings. This is unsuitable for such a general language. You can even see the same problem
+in lisp's C syntax implementation: escape characters everywhere, and awkward mandatory spaces delimiting the start and end
+of certain tokens (and where they're absent, post-processing is required). The Racket programming language has its macro
+system, but it is not /runtime dynamic/; it still utilizes preprocessing.
+
+So, what's the precise solution to this conundrum? Well, it's beautiful; but it requires some /cognition/.
+
+* Baremetal Cognition
+Baremetal cognition has a couple of peculiar attributes, and it looks remarkably like the /Brainfuck/ programming language.
+But unlike its look-alike, it has the ability to do some /serious metaprogramming/. Let's take a look at what the
+bootstrapping code for a /very minimal/ syntax looks like:
+#+begin_example
+ldfgldftgldfdtgl
+df
+
+dfiff1 crank f
+#+end_example
+And *do* note the whitespace (line 2 has a trailing space after ~df~, and the newlines matter). Erm, okay. What?
+
+So, our goal in this post is to get from a syntax that looks like /that/ to a syntax that looks like [[file:stem.org][Stem]].
+But how on earth does this piece of code even work? Well, we have to introduce two new ideas: delimiters, and ignores.
+
+** Tokenization
+Delimiters allow the tokenizer to figure out when one token ends and another begins. The list of single-character
+delimiters is public, allowing that list to be modified and read from within cognition itself. Ignored characters are
+characters that are completely ignored by the tokenizer in the first stage of every read-eval-print loop; that is, at the
+start of collecting a token, it first skips over any ignored characters. By default, every single character is a delimiter,
+and no characters are ignored. The delimiter and ignored character lists each carry a flag you can toggle to tell the
+tokenizer to treat the given characters as a blacklist or a whitelist, adding brevity (and practicality) to the language.
+
+Let's take the first line of code as an example:
+#+begin_example
+ldfgldftgldfdtgl
+#+end_example
+Because of the delimiter and ignore rules set by default, every single character is read as a token, and no character
+is skipped. We therefore read the first character, ~l~. By default, Cognition works off a stack-based programming language
+design. If you're not familiar, see the [[file:stem.org][Stem blogpost]] for more detail (in fact, if you're not familiar, this /won't work/
+as an explanation for you, so you should read it, or read up on the /Forth/ programming language).
+Though, we call them /containers/, as they are more general than stacks. Additionally, in this default environment, /no/
+word is executed except for the special /faliases/, which we will cover later.
+
+Therefore, the character ~l~ gets read in and is put on the stack. Then, the character ~d~ is read in and put on the stack.
+But ~f~ is different. In order to execute words in Cognition, we must take a look at the falias system.
+** Faliases
+Faliases are a list of words that get executed as soon as they are put on the stack, or container as we will call it from
+now on. Each of them executes the equivalent of stem's ~eval~, but immediately upon being placed on its container. Meaning,
+when ~f~, the default falias, is read, it doesn't go on the container, but rather executes the top of the container, which
+is ~d~. ~d~ changes the delimiter list to the string value of the word below it, meaning that here it changes the delimiters
+to /blacklist/ only the character ~l~ as a delimiter. Everything else remains a delimiter, because by default everything
+is parsed into single character words.
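+To make this concrete, here is the start of the bootstrap read again as a trace (the annotations are mine, not Cognition
+syntax):
+#+begin_example
+l    put on the container
+d    put on the container
+f    falias: not put on, but evaluates the top (d), which pops l
+     and makes l the only non-delimiter
+#+end_example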
+** Delimiter Caveats
+Delimiters have an interesting rule: a delimiter character ends token collection and is excluded from the tokenized word,
+unless it is the first character read and no characters were ignored before it, in which case we collect the character as
+part of the current token and keep going. This is in contrast to a third kind of tokenization category called the singlet,
+which /includes/ itself in the token before skipping itself and ending token collection.
+
+In addition, remember what I said about the /blacklist/? Well, you can toggle between /blacklisting/ and /whitelisting/
+your lists of delimiters, singlets, and ignored characters. By default, there are no /blacklisted/ delimiters, no
+/whitelisted/ singlets, and no /whitelisted/ ignored characters.
+
+All other characters simply collect themselves into the current token without ending the loop, so new characters keep
+being collected until the loop halts via the delimiter or singlet rules.
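+As a concrete illustration (this is the state right after ~ldf~ below, where ~l~ is the only non-delimiter and nothing
+is ignored), the next input ~gldf~ tokenizes like so:
+#+begin_example
+g    a delimiter, but the first character with nothing ignored: collected
+l    a non-delimiter: collected
+d    a delimiter mid-token: ends the token gl and is excluded from it
+#+end_example
+The excluded ~d~ then begins the next token as a first character, yielding the token ~d~, and ~f~ is tokenized the same
+way; we will see this play out in the walkthrough below.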
+** Continuing the Bootstrap Code
+So far, we looked at this part of the code:
+#+begin_example
+ldf
+#+end_example
+which simply creates ~l~ as a non-delimiter. Now, for the rest of the code:
+#+begin_example
+gldftgldfdtgl
+df
+
+dfiff1 crank f
+#+end_example
+~gldf~ puts ~gl~ on the stack due to ~d~ being a delimiter; ~d~ is then tokenized by itself, and ~f~ executes it, meaning
+that now ~g~ and ~l~ are the only non-delimiters. Then, ~tgl~ gets put on the stack, and its characters become
+non-delimiters with ~df~. ~dtgl~ gets put on the stack, and the newline becomes the only non-delimiter with ~\ndf~ (yes,
+the newline is actually part of the code here, and the spaces need to be as well for this to work). Then, a word made of
+a space and a newline gets put on the stack: per the delimiter rules above, the space is parsed into the token even though
+it is a delimiter, since it comes first and nothing was ignored, and the non-delimiter ~\n~ follows it. Then, another
+~\ \n~ word is tokenized (you might not see it, but there's another space on line 3). The current stack looks like this
+(bottom to top):
+#+begin_example
+3. dtgl
+2. [space char]\n
+1. [space char]\n
+#+end_example
+~df~ sets the non-delimiters to ~\ \n~. ~if~ sets the ignores to ~\ \n~, which skips these characters at the start
+of tokenization. ~f~ then executes ~dtgl~, a word that toggles the /dflag/, the flag that stores the whitelist/blacklist
+distinction for delimiters. Now, all non-delimiters are delimiters and all delimiters are non-delimiters.
+Finally, we're put in an environment where spaces and newlines are the delimiters for tokens, and they are ignored at the
+start of tokenizing each token. Next, ~1~ is tokenized and put on the stack, and then the ~crank~ word, which is then
+executed by ~f~ (the ~1~ token is treated as a number in this case, but everything textual in cognition is a word).
+We are done with our bootstrapping sequence! Now, you might wonder what ~crank~ does. That we will explain in a later section.
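+To summarize, the state after bootstrapping (as a recap of the above):
+#+begin_example
+delimiters: whitelist of space and newline (the dflag was toggled by dtgl)
+ignores:    space and newline, skipped at the start of each token
+crank:      1, so every token is evaluated
+#+end_example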
+
+* Bootstrapping Takeaways
+From this, we see a couple of principles: first, cognition is able to change how it tokenizes on the fly, and it can do so
+programmatically, allowing you to write a cognition program that automates the process of changing
+these delimiters, singlets, and ignores. This is something impossible in other languages: being able to
+/program your own tokenizer for some foreign language from within cognition/, and have
+/future code be tokenized exactly how you want it to be/. This is only possible because the language is postfix
+and doesn't read ahead, so it doesn't require more than one token to be parsed before an expression is evaluated. Second,
+faliases allow us to execute words without needing prefix words or any default execution of words.
+
+* Crank
+The /metacrank/ system allows us to set a default way in which tokens are executed on the stack. The ~crank~ word takes
+a number ~n~ as its argument and in effect executes the top of the stack once for every ~n~ words you put on the stack.
+To make this concept concrete, let's look at some code (running in what we call /crank 1/, as we set our environment to
+crank one at the end of the bootstrapping sequence):
+#+begin_example
+5 crank 2crank 2 crank
+1 crank unglue swap quote prepose def
+#+end_example
+The crank 1 environment allows us to stop using ~f~ in order to evaluate tokens. Instead, every /single/ token that is
+tokenized is evaluated. Since we programmed in a newline- and space-delimited syntax, we can safely read this code
+intuitively.
+
+The code begins by trying to evaluate ~5~, which evaluates to itself as it is not a builtin. ~crank~ evaluates and puts
+us in 5 crank, meaning every /5th/ token evaluates from here on. ~2crank~, ~2~, ~crank~, ~1~ are all put on the stack,
+leaving us with a stack that looks like so (notice that ~crank~ doesn't get executed even though it is a builtin, because
+we set ourselves to crank 5):
+#+begin_example
+4. 2crank
+3. 2
+2. crank
+1. 1
+#+end_example
+~crank~ is the 5th token, so it executes. Note that this puts us back in crank 1, meaning every word is evaluated.
+~unglue~ is a builtin that gets the value of the word at the top of the stack (the ~1~ was used up by the ~crank~ we
+evaluated), so it gets the value of ~crank~, which is a builtin. In effect, it fetches the function
+pointer associated with the crank builtin. Our new stack looks like this:
+#+begin_example
+3. 2crank
+2. 2
+1. [CLIB]
+#+end_example
+Where CLIB is our function pointer that points to the ~crank~ builtin. We then ~swap~:
+#+begin_example
+3. 2crank
+2. [CLIB]
+1. 2
+#+end_example
+then ~quote~, a builtin that quotes the top thing on the stack:
+#+begin_example
+3. 2crank
+2. [CLIB]
+1. [2]
+#+end_example
+then ~prepose~, a builtin like ~compose~ in stem, except that it preposes, and that it puts things in what we call a VMACRO:
+#+begin_example
+2. 2crank
+1. ( [2] [CLIB] )
+#+end_example
+then we call ~def~. This defines a word ~2crank~ that puts ~2~ on the stack and then calls the function pointer pointing
+to the crank builtin. Now, we still have to define what VMACROs are, and in order to do that we have to explain
+some differences between the cognition stack and the stem stack.
+** Differences
+In the stem stack, putting words on the stack directly is allowed. In cognition, words are put in containers when
+they are put on the stack without being evaluated. This means words like ~compose~ in stem work on words (or, more
+accurately, containers with a single word in them) as well as on other containers, making the API for this language more
+consistent. Words like ~cd~ also make use of this concept, though we won't cover them here.
+
+*** Macros
+Macros are another difference between stem quotes and cognition containers. When a macro is evaluated, everything in
+the macro is evaluated, ignoring the crank. If bound to a word, evaluating that word evaluates the whole macro, which
+ignores the crank completely and only increments the cranker by one, while evaluating each statement in the macro. Macros
+are useful for writing crank-agnostic code, and expanding macros is very useful for the purpose of optimization, although
+we will actually have to write the word ~expand~ from more primitive words later on (hint: it uses recursive ~unglue~).
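+As a sketch of the difference (with ~q~ a hypothetical word bound to a quote and ~m~ one bound to a macro with the same
+body; the annotations are mine):
+#+begin_example
+2 crank
+q    body is evaluated under the crank: every second word of it executes
+m    body is evaluated ignoring the crank: every word executes, and the
+     cranker only advances by one for the whole macro
+#+end_example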
+** More Code
+Here is the rest of the code in ~bootstrap.cog~ in ~coglib/~:
+#+begin_example
+getd dup _ concat _ swap d i
+_quote_swap_quote_compose_swap_dup_d_i eval
+
+2crank ing 0 crank spc
+2crank ing 1 crank swap quote def
+2crank ing 0 crank endl
+2crank ing 1 crank swap quote def
+2crank ing 1 crank
+2crank ing 3 crank load ../coglib/ quote
+2crank ing 2 crank swap unglue concat unglue fread unglue evalstr unglue
+2crank ing 1 crank compose compose compose compose VMACRO cast def
+2crank ing 1 crank
+2crank ing 1 crank getargs 1 split swap drop 1 split drop
+2crank ing 1 crank
+2crank ing 1 crank epop drop
+2crank ing 1 crank INDEX spc OUT spc OF spc RANGE
+2crank ing 1 crank concat concat concat concat concat concat =
+2crank ing 1 crank
+2crank ing 1 crank missing spc filename concat concat dup endl concat
+2crank ing 1 crank swap quote swap quote compose
+2crank ing 2 crank print compose exit compose
+2crank ing 1 crank
+2crank ing 0 crank fread evalstr
+2crank ing 1 crank compose
+2crank ing 1 crank
+2crank ing 1 crank if
+#+end_example
+Okay, well, the syntax still doesn't look so good, and it's still pretty hard to tell what this is doing. But the
+basic idea is that ~2crank~ is a macro and is therefore crank-agnostic, and we guarantee its execution with ~ing~, another
+falias (because it's funny). Then, we execute an ~n crank~, which standardizes what crank each line is in. (You might
+wonder how ~ing~ and ~f~ interact with the cranker: they just guarantee the evaluation of the previous token, so if the
+previous token has already evaluated, ~f~ and ~ing~ both do nothing.) In any case, this defines words that
+are useful, such as ~load~, which loads a file from the coglib. It does this by ~compose~-ing things into quotes and
+then ~def~-ing those quotes.
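+For example, once ~load~ is defined, it is used postfix; these exact lines appear later in the coglib:
+#+begin_example
+comment.cog load
+quote.cog load
+#+end_example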
+
+The crank system, and by extension the metacrank system, is needed in order to discriminate between /evaluating/ some
+tokens and /storing/ others for metaprogramming, without having to use ~f~ and while keeping the system postfix. Crank
+is just one word that allows for this type of behavior; the more general word, ~metacrank~, allows for much more
+interesting kinds of syntax manipulation. We have examples of ~metacrank~ down the line, but first I should explain
+the /metacrank word/ itself.
+** Metacrank
+~n m metacrank~ sets a periodic evaluation ~m~ for the element ~n~ items down the stack. The ~crank~ word is therefore
+equivalent to ~0 m metacrank~. Only one token can be evaluated per tokenized token, although /every/ metacrank counter is
+incremented per token; if you set two different metacranks, only /one/ of them can fire per tokenized token, and the
+lower metacrank gets priority. Note that metacrank, and by extension crank,
+doesn't /just/ apply to tokenized words; it also works while evaluating word definitions recursively, meaning that if a
+word is evaluated in ~2 crank~, one out of every two words will execute at each level of the evaluation tree. You can play
+around with this in the repl to get a sense of how it works: run ~../crank bootstrap.cog repl.cog devel.cog load~, and use
+stem-like syntax to define a function. Then, run that function in ~2 crank~. You will see how the evaluation tree
+respects cranking in the same way that the program file itself does.
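+Schematically (the annotations are mine):
+#+begin_example
+0 m metacrank    the same as m crank: the top element fires every m tokens
+1 1 metacrank    the element one down the stack fires after every single token,
+                 letting it operate on the token read right after it
+#+end_example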
+
+Metacrank allows for not only metaprogramming in the form of code building, but also
+direct syntax manipulation (i.e. /execute this token once I have read n other token(s)/). The advantages of
+this system over other programming languages' systems are clear: you can program a prefix word and ~undef~ it
+when you want to rip that part of the syntax out. You can write a prefix character that doesn't stop at an ending character
+but /always/ stops after reading a certain number of tokens. You can feed user input into a math program and feed the
+output into a syntax system like metacrank. The possibilities are endless! And with that, we will slowly build up the
+~stem~ programming language, v2, now with macros, from within our own /cognition/.
+* The Stem Dialect, Improved
+In this piece of code, we define the /comment/:
+#+begin_example
+2crank ing 0 crank ff 1
+2crank ing 1 crank cut unaliasf
+2crank ing 0 crank 0
+2crank ing 1 crank cut swap quote def
+2crank ing 0 crank
+2crank ing 0 crank #
+2crank ing 0 crank geti getd gets crankbase f d f i endl s
+2crank ing 1 crank compose compose compose compose compose compose compose compose compose
+2crank ing 0 crank drop halt crank s d i
+2crank ing 1 crank compose compose compose compose compose VMACRO cast quote compose
+2crank ing 0 crank halt 1 quote ing 1 quote ing metacrank
+2crank ing 1 crank compose compose compose compose VMACRO cast
+2crank ing 1 crank def
+2crank ing 2 crank # singlet # delim
+2crank ing 1 crank #comment: geti getd gets crankbase '' d '' i '\n' s ( drop halt crank s d i ) halt 1 1 metacrank
+#+end_example
+and it is our first piece of code that builds something /truly/ prefix. The comment character is a prefix that drops
+all the text between it and the next newline character; it is a type of word that tells the parser to /read ahead/. This
+is our first indication that everything we thought was possible within cognition truly /is/.
+
+But before that, we can look at the first couple of lines:
+#+begin_example
+2crank ing 0 crank ff 1
+2crank ing 1 crank cut unaliasf
+2crank ing 0 crank 0
+2crank ing 1 crank cut swap quote def
+2crank ing 0 crank
+#+end_example
+which simply unaliases ~f~ from the falias list, with ~ing~ being the only remaining falias. In cognition, even these
+faliases are changeable.
+
+Since we can't put ~f~ directly on the stack (if we tried by just writing ~f~, it would execute), we instead use some
+very minimal string processing: we put ~ff~ on the stack and then cut the string in half to get two copies
+of ~f~. We then want ~f~ to mean false, which in cognition is just the empty word. Therefore, we make an empty word by
+calling ~0 cut~ on this string, and then ~def~-ing ~f~ to the empty string. The following code is where the comment is
+defined:
+
+#+begin_example
+2crank ing 0 crank #
+2crank ing 0 crank geti getd gets crankbase f d f i endl s
+2crank ing 1 crank compose compose compose compose compose compose compose compose compose
+2crank ing 0 crank drop halt crank s d i
+2crank ing 1 crank compose compose compose compose compose VMACRO cast quote compose
+2crank ing 0 crank halt 1 quote ing 1 quote ing metacrank
+2crank ing 1 crank compose compose compose compose VMACRO cast
+2crank ing 1 crank def
+2crank ing 2 crank # singlet # delim
+2crank ing 1 crank #comment: geti getd gets crankbase '' d '' i '\n' s ( drop halt crank s d i ) halt 1 1 metacrank
+#+end_example
+Relevant: ~halt~ just puts you in 0 for all metacranks, and ~VMACRO cast~ just turns the top thing on the stack from a
+container into a macro. ~geti~, ~getd~, and ~gets~ get the ignores, delims, and singlets respectively as strings; ~drop~ is
+~dsc~ in stem. ~singlet~ and ~delim~ set the singlets and delimiters. ~endl~ is defined within ~bootstrap.cog~ and just
+puts the newline character as a word on the stack. ~crankbase~ gets the current crank.
+
+We call a lot of ~compose~ words in order to build this definition, and we make the ~#~ character a singlet delimiter in
+order to allow for spaces after the comment. In the ~#~ definition, we put ourselves in ~1 1 metacrank~ while altering
+the tokenization rules beforehand, so that everything up to a newline is tokenized as one token; ~#~ then operates on said
+word, effectively dropping the comment and putting us back in the original crank and metacrank. Thus, the brilliant
+~#~ character is written, operating on a token that is tokenized /in the future/, with completely default postfix syntax.
+With the information above, one can work out the specifics of how it works; the point is that it /does/, and one can test
+it by going into the ~coglib~ folder and running ~../crank bootstrap.cog repl.cog devel.cog load~, which will load
+the REPL and load ~devel.cog~, which will in turn load ~comment.cog~.
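+In use, it behaves as you'd expect (a sketch, assuming the usual arithmetic words):
+#+begin_example
+# this entire line is tokenized as one word and dropped
+1 2 +
+#+end_example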
+** The Great Escape
+Here we define a preliminary prefix escape character:
+#+begin_example
+2crank ing 2 crank comment.cog load
+2crank ing 0 crank
+2crank ing 1 crank # preliminary escape character \
+2crank ing 1 crank \
+2crank ing 0 crank halt 1 quote ing crank
+2crank ing 1 crank compose compose
+2crank ing 2 crank VMACRO cast quote eval
+2crank ing 0 crank halt 1 quote ing dup ing metacrank
+2crank ing 1 crank compose compose compose compose
+2crank ing 2 crank VMACRO cast
+2crank ing 1 crank def
+2crank ing 0 crank
+2crank ing 0 crank
+#+end_example
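+In use (the annotation is mine), ~\~ grabs the next token and puts it on the stack instead of evaluating it:
+#+begin_example
+\ dup    puts the word dup on the stack rather than executing it
+#+end_example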
+This allows for escaping, so that we can put something on the stack even if it would otherwise be evaluated,
+but we want to redefine this character eventually to be compatible with stem-like quotes. We're
+even using our comment character to annotate this code by now! Here is the full quote definition (once we have
+this definition, we can use it to improve itself):
+#+begin_example
+2crank ing 0 crank [
+2crank ing 0 crank
+2crank ing 1 crank # init
+2crank ing 0 crank crankbase 1 quote ing metacrankbase dup 1 quote ing =
+2crank ing 1 crank compose compose compose compose compose
+2crank ing 0 crank
+2crank ing 1 crank # meta-crank-stuff0
+2crank ing 3 crank dup ] quote =
+2crank ing 1 crank compose compose
+2crank ing 16 crank drop swap drop swap 1 quote swap metacrank swap crank quote
+2crank ing 3 crank compose dup quote dip swap
+2crank ing 1 crank compose compose compose compose compose compose compose compose
+2crank ing 1 crank compose compose compose compose compose \ VMACRO cast quote compose
+2crank ing 3 crank compose dup quote dip swap
+2crank ing 1 crank compose compose compose \ VMACRO cast quote compose \ if compose
+2crank ing 1 crank \ VMACRO cast quote quote compose
+2crank ing 0 crank
+2crank ing 1 crank # meta-crank-stuff1
+2crank ing 3 crank dup ] quote =
+2crank ing 1 crank compose compose
+2crank ing 16 crank drop swap drop swap 1 quote swap metacrank swap crank
+2crank ing 1 crank compose compose compose compose compose compose compose compose \ VMACRO cast quote compose
+2crank ing 3 crank compose dup quote dip swap
+2crank ing 1 crank compose compose compose \ VMACRO cast quote compose \ if compose
+2crank ing 1 crank \ VMACRO cast quote quote compose
+2crank ing 0 crank
+2crank ing 1 crank # rest of the definition
+2crank ing 16 crank if dup stack swap 0 quote crank
+2crank ing 2 crank 1 quote 1 quote metacrank
+2crank ing 1 crank compose compose compose compose compose compose compose compose
+2crank ing 1 crank compose \ VMACRO cast
+2crank ing 0 crank
+2crank ing 1 crank def
+#+end_example
+Um, it's quite the spectacle how Matthew Hinton ever came up with this thing, but alas, it exists. Then, we use it
+to redefine itself, better this time, as the old quote definition can't do recursive quotes
+(we can do this because the old definition is /used/ while the word is being redefined, thanks to the postfix ~def~; a
+development pattern seen often in low-level cognition):
+#+begin_example
+\ [
+
+[ crankbase ] [ 1 ] quote compose [ metacrankbase dup ] compose [ 1 ] quote compose [ = ] compose
+
+[ dup ] \ ] quote compose [ = ] compose
+[ drop swap drop swap ] [ 1 ] quote compose [ swap metacrank swap crank quote compose ] compose
+[ dup ] quote compose [ dip swap ] compose \ VMACRO cast quote compose
+[ dup dup dup ] \ [ quote compose [ = swap ] compose \ ( quote compose [ = or swap ] compose \ \ quote compose [ = or ] compose
+[ eval ] quote compose
+[ compose ] [ dup ] quote compose [ dip swap ] compose \ VMACRO cast quote compose [ if ] compose \ VMACRO cast
+quote compose [ if ] compose \ VMACRO cast quote quote
+
+[ dup ] \ ] quote compose [ = ] compose
+[ drop swap drop swap ] [ 1 ] quote compose [ swap metacrank swap crank ] compose \ VMACRO cast quote compose
+[ dup dup dup ] \ [ quote compose [ = swap ] compose \ ( quote compose [ = or swap ] compose \ \ quote compose [ = or ] compose
+[ eval ] quote compose
+[ compose ] [ dup ] quote compose [ dip swap ] compose \ VMACRO cast quote compose [ if ] compose \ VMACRO cast
+quote compose [ if ] compose \ VMACRO cast quote quote
+
+compose compose [ if dup stack swap ] compose [ 0 ] quote compose [ crank ] compose
+[ 1 ] quote dup compose compose [ metacrank ] compose \ VMACRO cast
+
+def
+#+end_example
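+A quick sketch of what now works (hypothetical input):
+#+begin_example
+[ 1 [ 2 3 ] swap ]    a quote containing a nested quote, as in stem
+#+end_example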
+Okay, so now we can use recursive quoting, just like in stem. But there are still a couple of things missing that we
+probably want: a good string quote implementation, and escape characters that work inside the brackets. Also, since
+Cognition utilizes macros, we probably want a way to notate those as well, and a way to expand macros. We can do
+all of that! First, we will have to redefine ~\~ once more:
+#+begin_example
+\ \
+[ [ 1 ] metacrankbase [ 1 ] = ]
+[ halt [ 1 ] [ 1 ] metacrank quote compose [ dup ] dip swap ]
+\ VMACRO cast quote quote compose
+[ halt [ 1 ] crank ] VMACRO cast quote quote compose
+[ if halt [ 1 ] [ 1 ] metacrank ] compose \ VMACRO cast
+def
+#+end_example
+This piece of code defines the bracket, but for macros (~split~ just splits a list in two):
+#+begin_example
+\ (
+\ [ unglue
+[ 11 ] split swap [ 10 ] split drop [ macro ] compose
+[ 18 ] split quote [ prepose ] compose dip
+[ 17 ] split eval eval
+[ 1 ] del [ \ ) ] [ 1 ] put
+quote quote quote [ prepose ] compose dip
+[ 16 ] split eval eval
+[ 1 ] del [ \ ) ] [ 1 ] put
+quote quote quote [ prepose ] compose dip
+prepose
+def
+#+end_example
+We want these macros to automatically expand, because it's more efficient to bind already-expanded macros to words,
+and they evaluate identically. (~isdef~ returns a boolean indicating whether a word is defined, where true is a
+non-empty string and false is an empty string.)
+#+begin_example
+\ (
+( crankbase [ 1 ] metacrankbase dup [ 1 ] =
+ [ ( dup \ ) =
+ ( drop swap drop swap [ 1 ] swap metacrank swap crank quote compose ( dup ) dip swap )
+ ( dup dup dup \ [ = swap \ ( = or swap \ \ = or
+ ( eval )
+ ( dup isdef ( unglue ) [ ] if compose ( dup ) dip swap )
+ if )
+ if ) ]
+ [ ( dup \ ) =
+ ( drop swap drop swap [ 1 ] swap metacrank swap crank )
+ ( dup dup dup \ [ = swap \ ( = or swap \ \ = or
+ ( eval )
+ ( dup isdef ( unglue ) [ ] if compose ( dup ) dip swap )
+ if )
+ if ) ]
+ if dup macro swap
+ [ 0 ] crank [ 1 ] [ 1 ] metacrank ) def
+#+end_example
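+As a sketch (with ~foo~ a hypothetical defined word):
+#+begin_example
+( foo swap )    builds a macro; the defined word foo is expanded as it is read
+#+end_example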
+You can see that as we define more things, our language is beginning to look more or less like it has syntax!
+There are more things in this ~quote.cog~ file we have been looking at, but the bulk of it is done.
+From here on, I will just explain the syntax programmed by ~quote.cog~ instead of showing the specific code.
+
+As an example, here is ~expand~:
+#+begin_example
+# define basic expand (works on nonempty macros only)
+[ expand ]
+( macro swap
+ ( [ 1 ] split
+ ( isword ( dup isdef ( unglue ) ( ) if ) ( ) if compose ) dip
+ size [ 0 ] > ( ( ( dup ) dip swap ) dip swap eval ) ( ) if )
+ dup ( swap ( swap ) dip ) dip eval drop swap drop ) def
+
+# complete expand (checks for definitions within child first without copying hashtables)
+[ expand ]
+( size [ 0 ] > ( type [ VSTACK ] = ) ( return ) if ?
+ ( macro swap
+ macro
+ ( ( ( size dup [ 0 ] > ) dip swap ) dip swap
+ ( ( ( 1 - dup ( vat ) dip swap ( del ) dip ) dip compose ) dip dup eval )
+ ( drop swap drop )
+ if ) dup eval
+ ( ( [ 1 ] split
+ ( isword
+ ( compose cd dup isdef
+ ( unglue pop )
+ ( pop dup isdef ( unglue ) ( ) if )
+ if ) ( ) if
+ ( swap ) dip compose swap ) dip
+ size [ 0 ] > ) dip swap
+ ( dup eval ) ( drop drop swap compose ) if ) dup eval )
+ ( expand )
+ if ) def
+#+end_example
+This recursively expands word definitions inside a quote or macro, using the word ~unglue~. We've used the first
+~expand~ definition in order to redefine itself for a more general case.
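+Usage is then simply (a sketch, with ~foo~ and ~bar~ hypothetical defined words):
+#+begin_example
+( foo bar ) expand    leaves the macro with foo and bar replaced by their definitions
+#+end_example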
+* The Brainfuck Dialect
+And returning to whence we came, we define the /Brainfuck/ dialect with our current advanced stem dialect:
+#+begin_example
+comment.cog load
+quote.cog load
+
+[ ] [ ] [ 0 ]
+
+[ > ] [[ swap [[ compose ]] dip size [ 0 ] = [ [ 0 ] ] [[ [ 1 ] split swap ]] if ]] def
+[ < ] [[ prepose [[ size dup [ 0 ] = [ ] [[ [ 1 ] - split ]] if ]] dip swap ]] def
+[ + ] [[ [ 1 ] + ]] def
+[ - ] [[ [ 1 ] - ]] def
+[ . ] [[ dup char print ]] def
+[ , ] [[ drop read byte ]] def
+
+[ pick ] ( ( ( dup ) dip swap ) dip swap ) def
+[ exec ] ( ( [ 1 ] * dup ) dip swap [ 0 ] = ( drop ) ( dup ( evalstr ) dip \ exec ) if ) def
+
+\ [ (
+ ( dup [ \ ] ] =
+ ( drop swap - [ 1 ] * dup [ 0 ] =
+ ( drop swap drop halt [ 1 ] crank exec )
+ ( swap [ \ ] ] concat pick )
+ if )
+ ( dup [ \ [ ] =
+ ( concat swap + swap pick )
+ ( concat pick )
+ if )
+ if )
+ dup [ 1 ] swap f swap halt [ 1 ] [ 1 ] metacrank
+) def
+
+><+-,.[] dup ( i s itgl f d ) eval
+#+end_example
+Test it with ~../crank -s 2 bootstrap.cog helloworld.bf brainfuck.cog~. You may of course load your favorite brainfuck
+file with this method. Note that ~brainfuck.cog~ isn't a brainfuck parser in the ordinary sense; it actually
+/defines brainfuck words/ and /tokenizes/ brainfuck, running it in the native cognition environment.
+
+It's also quite profound how our current syntax allows us to define an /alternate/ syntax with great ease. It might
+make you wonder whether it's possible to /specifically craft/ a syntax whose job is to write other syntaxes. Another
+interesting observation you might have is that Cognition defines syntax by defining a prefix character as a /word/ that
+uses metacrank, rather than by reading symbols and deciding what to do based on them. It's almost as if the syntax
+becomes /inherent/ to the word that's being defined.
+
+These two ideas synthesize to create something truly exciting, but that hasn't yet been implemented in the standard library
+(though we very much know that it is possible). Introducing: the /dialect dialect/ of Cognition...
+** The Dialect Dialect
+Imagine a word ~mkprefix~ that takes two input words (say, ~[~ and ~]~) and an operation, and
+/automatically defines/ ~[~ to apply said operation until it hits a ~]~ character. This is possible because constructs
+like ~metacrank~ and ~def~ are all just /regular words/, so it's possible to use /them/ as words to metaprogram with.
+In fact, /everything/ is just a word (even ~d~, ~i~, and ~s~), so you can imagine a hyperabstract dialect that includes
+words like ~mkprefix~, using syntax to automate the process of implementing more syntax. I have not encountered such a
+construct in /any other programming language/. Yet, in your own /Cognition/, you can make nearly anything a reality.
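+To be explicit about the idea (nothing here exists yet; the word and its argument order are purely hypothetical):
+#+begin_example
+\ [ \ ] ( some-op ) mkprefix    would define [ to apply some-op to each token until ]
+#+end_example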
+
+Matthew Hinton and I have discussed many such creative possibilities for the standard library. Right now, the
+standard library has metawords that generate abstract words automatically and call them. This is possible through string
+concatenation and through using ~def~ inside the definition of another word (which is also possible in my prior
+programming language, Stem). We have discussed the possibility of a word that searches for word-generators to abstract
+its current wordlist automatically, and the possibility of directing this abstraction framework toward the purpose
+of solving a problem. These are words one could conceivably write within cognition, and this might give you an idea
+of how /powerful/ this idea is.
+* Theoretical Musings
+There are a couple of things about Cognition that make it interesting beyond its quirks. For instance,
+string processing in this language is equivalent to tokenizer postprocessing, which makes string operations inherently
+extremely powerful. It also has potential applications in symbolic AI and in syntax and grammar research,
+where prototypes of languages and metalanguages can be tested with ease. I'd imagine that anyone writing a program
+that reads a configuration file would really want their configuration language to be something like this, where users
+have full freedom over the syntax (and metasyntax) in which they program (think about a Cognition-based shell,
+or a Cognition-based operating system!). Though, the point of working on this language was never its applications;
+its intrinsic beauty is its own philosophical statement.
+* Conclusion
+You can imagine that Cognition can program basically any syntax you would want, and in this article, we demonstrated the
+power of the already-existing code that makes cognition work. In short, the system allows for true /syntax as code/, as my
+friend Andrei put it; one can /dynamically program/ and even /automate/ the production of syntax. In this article, we
+didn't have the space to cover other important Cognition concepts like the /Metastack/ and words like ~cd~, but that
+can be done in a part 2 of this blog post.