Very cool. Neat how you managed to get logical symbols in to the language itself! When might someone use preconditions in Lean theorems?
Location: Pittsburgh, PA
Remote: Yes
Willing to relocate: No
Technologies: CLIPS, Ruby (on Rails), Go, JavaScript, C, SQL
Résumé: https://drive.google.com/file/d/1M-tC7qpehmsJfd3JBk3KtXEQ5wDJz89t
Email: mrryanjohnston (at) gmail
Website: https://ryjo.codes
12+ years of experience writing code professionally, active in open source, passionate about education and freedom of knowledge. Very interested in Rules Engines. Love working on teams of self-starters, enjoy pushing the boundaries of my understanding. Comfortable mentoring others as well as helping non-technical folks with difficult-to-grok concepts.
> Conventional programming languages, such as FORTRAN and C, are designed and optimized for the procedural manipulation of data (such as numbers and arrays). Humans, however, often solve complex problems using very abstract, symbolic approaches which are not well suited for implementation in conventional languages. Although abstract information can be modeled in these languages, considerable programming effort is required to transform the information to a format usable with procedural programming paradigms.
> One of the results of research in the area of artificial intelligence has been the development of techniques which allow the modeling of information at higher levels of abstraction. These techniques are embodied in languages or tools which allow programs to be built that closely resemble human logic in their implementation and are therefore easier to develop and maintain. These programs, which emulate human expertise in well defined problem domains, are called expert systems. The availability of expert system tools has greatly reduced the effort and cost involved in developing an expert system.
Really awesome. Thanks for this thorough write-up. I don't totally understand the deeper math concepts mentioned in this article around RNNs, but it's sparked some thoughts of my own. It feels similar to things I've been exploring lately; that is, building your app interwoven with forward-chaining algorithms. In your case you're using RNNs, and in mine I'm building on the Rete algorithm.
You also touch on something in this article that I've found quite powerful: putting things in terms of digesting an input string character-by-character. Then, we offload all of the reasoning logic to our algorithm. We write very thin i/o logic, and then the algorithm does the rest.
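To make that concrete, here's a minimal CLIPS sketch of the character-by-character idea. The fact names (`char`, `cursor`) and the rule are my own illustration, not from the article:

```clips
;; Sketch: the input is asserted one character per ordered fact,
;; and a single rule "digests" it in order. Names are illustrative.
(deffacts input
  (char 1 f) (char 2 o) (char 3 o)
  (cursor 1))

(defrule consume-next-char
  ?c <- (cursor ?i)
  (char ?i ?ch)
  =>
  (retract ?c)
  (assert (cursor (+ ?i 1)))
  (println "read: " ?ch))
```

After a `(reset)` and `(run)`, the rule fires once per character in order; the i/o side stays thin and the matching logic lives entirely on the rule's left-hand side.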
This is another reason I am learning CLIPS: it implements a form of the Rete algorithm. A Rete network provides many of the benefits of a neural network without the bells and whistles, and so could be a good candidate for anyone who thinks they could benefit from the kind of reasoning power LLMs provide.
Using debuggers makes a lot of sense in this case. If I had to switch context that frequently between relatively stable applications, it would be helpful to have a debugger framework for doing work.
`printf` statements are helpful when the error exists beneath the debugger: in the application framework itself.
> they're not always available
100% agreed. `printf` is one tool I can use to follow the control flow of a function call across frameworks, programming languages, and even operating systems. It's also something I can reliably assume my coworkers have in their tool belts, and so it provides a common language that even different organizations can use to collaborate on a specific debugging session.
The longer I'm on a project, the more inclined I may be to get a debugger going, but often there's plenty of lower-hanging fruit to deal with that doesn't necessarily need a debugger to figure out.
Writing a JSON parser is a good way to teach yourself better programming practices. I attribute my understanding of pointer arithmetic and i/o streams to my own efforts in parsing/generating JSON.
There is an objective benefit to using `constructs-to-c`, and it's something you mention in your original comment: you generate a single binary containing all of your business logic, thus avoiding running a "fully interpreted" app in production.
In order to determine if it gives a performance boost to your particular rules engine, you should test it out and benchmark it with something you'd consider a "real world" example.
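If it helps, CLIPS can give you a crude wall-clock number itself via its built-in `(time)` function, which returns elapsed seconds as a float. A rough sketch, assuming your rule base and facts are already loaded:

```clips
;; Crude benchmark: measure one (reset)/(run) cycle with (time).
(defglobal ?*start* = 0.0)
(reset)
(bind ?*start* (time))
(run)
(println "elapsed seconds: " (- (time) ?*start*))
```

For anything serious you'd want repeated runs and a warm-up, but this is enough to compare a load-based build against a constructs-to-c build on the same rule base.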
> you should test it out and benchmark it with something you'd consider a "real world" example.
I thought that you might have specific experience with "real world" applications, thus my question.
The CLIPS "Advanced Programming Guide" states in section 11.1 that "...compiles all of the constructs [..] into a single executable and reduces the size of the executable image. A run-time program will not run any faster than a program loaded using the load or bload commands"; so as far as I understand it, the code is still interpreted (i.e. not affected by the translation to C).
Good find in the APG! That thing is a wealth of info. It looks like constructs-to-c wouldn't make things faster, but would instead reduce the compiled code size.
If I'm understanding you correctly, you mean "interpreted" in the sense that it translates to C code, and C code is interpreted and translated into machine code. Is that accurate?
It's possible that you and I are referring to different things when we use the word "interpreted." When I think "interpreted" in the context of computer programming, I'm thinking in terms of interpreted-at-runtime programming languages like Ruby, Python, and CLIPS before you use `constructs-to-c`. In my mind, using `constructs-to-c` changes this from an "interpreted" to a "compiled" language in that the expert system you write in CLIPS code is its own language with its own data structures, algorithms, and DSL. When you use `constructs-to-c`, you "compile" your application using the programming language of your expert system, thus it is no longer interpreted at run time.
> you mean "interpreted" in a sense that it translates to C code, and C code is interpreted and translated into machine code
No, I mean that the Lisp variant of CLIPS is a pure interpreter: presumably not even threaded, and certainly no JIT. The "constructs-to-c" feature apparently doesn't change any of that, i.e. even in the "translated to C" version of the CLIPS code, it is still interpreted.
> I'm thinking in terms of interpreted-at-runtime programming languages like Ruby, Python
Me too.
> and CLIPS before you use `constructs-to-c` ... changes this from an "interpreted" to a "compiled" language
Didn't find evidence for this so far; as far as I understand it's still the same interpreter.
I'm sorry, but I'm still not sure what you mean by interpreter. I hope you don't mind if I press further; I'm very much enjoying this deep dive into the CLIPS C API with you.
When you compile your code, including the C files generated by the constructs-to-c command, you must update the main.c file as well as the makefile that ships with CLIPS. During that update, there are a few things you must do:
1. Add the generated C files to the makefile.
2. Remove the function calls in main.c that cause CLIPS to capture STDIN and STDOUT. That is, remove one layer of abstraction that sits between your rules and i/o streams.
3. Remove the generic CreateEnvironment function call in main.c and replace it with InitCImage_1, a function that calls CreateRuntimeEnvironment.
CreateRuntimeEnvironment differs from CreateEnvironment in that you are able to pass in pre-built structures: your rules engine constructs represented in C.
In addition to these pre-built constructs, CreateRuntimeEnvironment passes in any functions the user may have defined in C as User Defined Functions (UDFs). This is a more direct path than having CLIPS first interpret CLIPS rules like `(defrule foo =>)` and then translate them into these C representations.
I haven't yet found the time to check the details in the source code, so I can only speculate from experience with other VMs. From your description, it seems the constructs-to-c version removes the interactive part (the REPL), which is apparently not needed in stand-alone applications. But that's unrelated to the question of which format the code/rule base is kept and executed in. The C files could embed either a text or a byte-code version of the CLIPS code/rule base, and the interpreter is implemented in C anyway. I think I'll have to check the (generated) source code (the architecture manual doesn't even mention the term "interpreter"). For some years I've been thinking about using CLIPS for algorithmic music composition, but it might be too slow; it seems nontrivial to get reliable performance figures.
> For some years I was thinking about using CLIPS for algorithmic music composition
This is a great use case! I'd be interested in seeing what you come up with.
> in which format the code/rule-base is kept
When you use constructs-to-c, the generated C files represent your rules engine constructs as pointers to pointers of CLIPSLexeme, CLIPSFloat, CLIPSInteger, and CLIPSBitMap structs. It does not store the raw CLIPS code fed into the interpreter. You can later use commands like `save-facts` to write the current facts back out in CLIPS syntax based on these pointers to pointers of structs.
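For example (a sketch; the file names are arbitrary), you can dump both facts and constructs from a running environment and inspect the regenerated syntax:

```clips
;; Round-trip a live environment back to CLIPS syntax.
(assert (list 1 0 2 -1))
(save-facts "facts.clp")    ; writes the current facts
(save "constructs.clp")     ; writes constructs such as rules and templates
```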
Ok, then it's apparently the intermediate representation (what the parser feeds to the evaluator), which makes sense. The evaluator doesn't care whether it comes from the parser or from constructs-to-c generated code.
> Ok, then it's apparently the intermediate representation (what the parser feeds to the evaluator), which makes sense. The evaluator doesn't care whether it comes from the parser or from constructs-to-c generated code.
> I thought that you might have specific experience with "real world" applications, thus my question.
As to this point: I don't at present have "real world" experience writing CLIPS. My experience thus far has been entirely research-driven, based on observations I've made in "real world" scenarios.
If you want to read about other people who have used CLIPS in real world scenarios, here's the most comprehensive HN thread to date on the subject:
Thanks. I was also in that discussion. When CLIPS is used as an expert system shell, performance comparisons are pretty restricted. But since you seem to use it more for "indexing" and "caching", which is a domain shared by other interpreters used for the web, a performance comparison seems more feasible, especially if you've already implemented such features with other interpreters as well.
> you have seemed to use it more for "indexing" and "caching", which is a domain shared by other interpreters used for the web
That's the idea! I think CLIPS is a good framework for a generic run-loop (or `while` loop). If you can conceptualize your app as such, you can reap the benefits inherent in the rete algorithm within your application logic.
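Here's a toy sketch of that run-loop framing; the request/response facts are illustrative, not from a real app:

```clips
;; The "while loop" is just (run) draining the agenda:
;; each activation is one iteration of the loop.
(defrule handle-request
  ?r <- (request ?id)
  =>
  (retract ?r)
  (assert (response ?id ok))
  (println "handled request " ?id))

(assert (request 1))
(assert (request 2))
(run)
```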
This article caught my eye because it's focused on imperative programming, and I've been very focused on declarative vs imperative programming over the last few years. I implemented a version of your function in CLIPS, a Rules-based language that takes a declarative approach to code:
(defrule sum-is-0
  (list $? ?first $? ?second $?)
  (test (= 0 (+ ?first ?second)))
  =>
  (println TRUE))

(defrule sum-is-not-0
  (not (and (list $? ?first $? ?second $?)
            (test (= 0 (+ ?first ?second)))))
  =>
  (println FALSE))

(assert (list 1 0 2 -1))
(run)
(exit)
The theorem you write in Lean to prove the function kind-of exists in CLIPS Rules; you define the conditions that must occur in order to execute the Right Hand Side of the Rule. Note that the above simply prints `TRUE` or `FALSE`; it is possible to write imperative `deffunction`s that return values in CLIPS, but I wanted to see if I could draw parallels for myself between Lean code and theorems. Here's a gist with the simple version and a slightly more robust version that describes the index at which the matching numbers appear: https://gist.github.com/mrryanjohnston/680deaee87533dfedc74b...
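For anyone curious what the Lean side of that parallel might look like, here's a rough sketch of my own (not from the article) of the property the rules check; note it's simplified and ignores the distinct-positions constraint the CLIPS patterns encode:

```lean
-- hasZeroPair is my own name: "some two elements of the list sum to zero".
def hasZeroPair (xs : List Int) : Prop :=
  ∃ a b, a ∈ xs ∧ b ∈ xs ∧ a + b = 0

-- A witness pair for the list asserted in the CLIPS example above.
example : hasZeroPair [1, 0, 2, -1] :=
  ⟨1, -1, by simp, by simp, by decide⟩
```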
Thank you for writing this and for your work on Lean! This is a concept that's been circling in my head for a minute now, and I feel like this article has unlocked some level of understanding I was missing before.