clng
clng copied to clipboard
Common Lisp next generation (discussion)
- Common lisp next generation
This document's main goal is to collect ideas and propositions about next generation CL (standard/specification/othername). This is NOT a template for a new specification, but merely a place for discussion.
Ideas should be pull requests and discussions should take place in issues. Keep in mind that many things proposed here can be contradictory to each other, resulting in several differently designed languages. However this is a step forward compared to single messages and abrupt threads spread across different random chats, blogs and forums. Surely it is time to start accumulating ideas no matter how fundamental or small they are. While the main inspiration comes from Common Lisp, people contributing to this usually know more than one language.
- Motivation:
The Common Lisp standard was created in 1994. The major goal was to unite many (relatively big as well as small) dialects of Lisp making it an ultimate successor of MacLisp Lisp series. The natural advantage of such an approach was uniting communities around a single project, thus getting a large amount of manpower for further development. However, as a result the developers had to try to maintain a certain level of backwards compatibility with all the previously existing dialects, resulting in a language with different (not always consistent with each other) features and a certain historical baggage.
** Summary: So far it seems that all suggestions can be divided into three categories -- fully backwards compatible, almost backwards compatible (with very minor refactorizations required)... and basically a new lisp language. Apparently, the new language suggestions tend to lean towards low-level features, as the demand for high-level ones is already satisfied by high flexibility of CL.
Fully backwards compatible propositions share goals with the CDR project, read below.
- Alternatives:
-
[[https://common-lisp.net/project/cdr/][CDR Project]] The project is mainly focused on backwards compatible suggestions, implemented incrementally.
-
Honorable mention:
[[https://www.cliki.net/Proposed%20Extensions%20To%20ANSI][List of proposed ANSI revisions]]
And there is also [[https://www.cliki.net/Proposed%20ANSI%20Revisions%20and%20Clarifications][List of clarifications]]
- [[https://github.com/ciel-lang/CIEL][CIEL]] is an attempt at creating a batteries-included Common Lisp distribution. It shares similar goals to the first part of our document about fully backwards-compatible changes.
- Fully backwards compatible changes:
** High-level features:
*** Extensible loop
**** Details The very reason the Iterate library exists. However, having a loop construct (a basic feature) as a library makes it very non obvious, especially for newcomers. The "Iterate vs LOOP" argument, certain design choices (LOOP being "unlispy" vs Iterate using code-walker) -- are issues to be resolved.
**** Reasoning
**** Current status
*** Improved hash-tables.
**** Details There are many possible improvements for hash-tables -- readable hash-tables, initial-values for =make-hash-table=, typed hash-tables, arbitary =:test= function for hash-tables, standard support for weak hash table keys and values.
**** Reasoning
**** Current status
*** Unifying variables declarations.
**** Details =Let= handling multiple value bindings and/or list-pattern matching. Global variables with =defglobal=, see =sb-ext:defglobal=.
**** Reasoning
**** Current status
** General features:
*** Clear policy rules **** Details This includes, but is not limited to, TCO cases, inlining, compiler-macros expansion, general safety of =(safety 0)= etc.
**** Reasoning
**** Current status
*** Standard parser for lambda and macro lambda lists.
**** Details There are a lot of generic half-baked ones. It also may include the discussion of what can be added/changed about lambda-lists structure -- such as pattern matching.
**** Reasoning
**** Current status
*** CFFI **** Details
**** Reasoning
**** Current status
Existing one is decent but surely there are improvememnts. This part requires specific problems described.
** Low-level features:
*** Equivalence functions cleanup.
**** Details Depending on the implementation numbers sometimes are =eq= and sometimes aren't. This can be complicated for some implementations, therefore maybe this issue is rather a consequence of some other issues that should be highlighted. Some =equalp= quirks are also questionable.
**** Reasoning
**** Current status
*** Unicode support **** Details
**** Reasoning
**** Current status
*** Long string literals split across lines with indentation
**** Details
#+BEGIN_SRC
(foo bar "this is just one
\ string literal with only single spaces")
#+END_SRC
As well as special characters in string literals via something analogous to \x3F, \177, \n, \t, \u+1234.
**** Reasoning
**** Current status
*** Expand-full function **** Details Perform all expansion on an expression in a given macro environment. Optionally report all free variables.
**** Reasoning
**** Current status
*** Allow =eval= access to environment **** Details That implies eval being able to eval things that only make sense in certain environment.
**** Reasoning
**** Current status
*** Security (fixing reader eval, ...)
**** Details More security on certain areas.
**** Reasoning
**** Current status
*** Compilation
**** Details Different ways of compilation in more details, for example block-compilation, akin to [[https://mstmetent.blogspot.com/2020/02/block-compilation-fresh-in-sbcl-202.html][what]] is done in SBCL.
**** Reasoning
**** Current status
- Almost backwards compatible changes:
*** Extensible sequences **** Details Extensible data structures of different kind. The protocol for sequences is also a thing to discuss.
**** Reasoning
**** Current status
*** Native lazy list via lazy-cons type which satisfies consp. **** Details While laziness can be theoretically speaking implemented as a library, the efficient (that is, for production use) laziness is nontrivial to make. Therefore, it makes sense for maintainers of the language to implement it (at some point) as a part of (semi-)standard library.
**** Reasoning
**** Current status
*** Standard library redesign **** Details Some things that are in there can be in utility libs such as Alexandria, while some things from Alexandria can be too useful to not include them.
**** Reasoning
**** Current status
*** Standardize the Meta-Object Protocol for CLOS **** Details Instead of closer-mop we should have just mop. This includes both what currently is in MOP as well as some additions -- better definition lookup, all that concerns structures etc.
**** Reasoning
**** Current status
*** First-class macros **** Details Macros that can be bound to variables, passed as arguments and returned from functions. [[http://matt.might.net/articles/metacircular-evaluation-and-first-class-run-time-macros/][A more detailed explanation.]]
**** Reasoning
**** Current status
*** Executables and binary files **** Details A standard way to build them, maybe in different forms, with/without tree shaking.
**** Reasoning
**** Current status
*** Sockets **** Details (At least) BSD sockets interface standardization.
**** Reasoning Every modern language since 1990 includes BSD sockets library in its core.
**** Current status Interfaces are not standardized. There are incompatible implementation-dependent extensions and projects like [[https://github.com/usocket/trivial-sockets][trivial-sockets]] and IOlib.
*** GC finalization support: register callback for finalized object **** Details At least some control over it is in high demand. Better support for dynamic-extent. For more specific examples look [[https://github.com/trivial-garbage/trivial-garbage][here]].
**** Reasoning
**** Current status
*** Environments **** Details Standardized, and a set of basic functions to work with them.
**** Reasoning
**** Current status
*** Standardized code walking primitives: one body of user code which correctly walks all special forms. **** Details
**** Reasoning
**** Current status There is hu.dwim as a library.
*** Name conflicts **** Details
**** Reasoning
**** Current status As a compatibility [[https://github.com/phoe/trivial-package-local-nicknames][library]], [[http://www.sbcl.org/manual/#Package_002dLocal-Nicknames][here]] is how it looks for a specific implementation.
*** Unified naming patterns **** Details
- Have all constants be named like =+constant+= (wrapped in plus signs).
- Have all dynamic variables be named like =dynamic-variable= (wrapped in 'earmuffs').
- Either:
- have all predicates named like =?= or =-p= for full consistency.
- have 2 naming patterns (as it is now) but actually use them consistently:
- have predicates be named like =p= if == is 1 word.
- have predicates be named like =-p= if == is 2+ words separated by hyphens.
Having the standard dictate that incorrect usage of the above be a compilation or even runtime error would make this change (almost?) definitely backwards incompatible (to an unknown (but potentially huge) extent).
**** Reasoning It has been the community consensus for many years that these naming patterns should be used based on what meaning a given name holds. Any exception to this rule is just unexpected, unnecessary and probably should be treated as an error because the way a variable is named conveys the intended way of using it. Some compilers (i.e. SBCL) even give warnings about misuse of dynamic-variable-looking variables based on this particular convention.
**** Current status The names for constants are not consistent across the board. Some examples of 'incorrectly' named constants (from just the standard):
Examples of incorrectly named constants: #+BEGIN_SRC cl:most-negative-fixnum cl:most-positive-fixnum cl:internal-time-units-per-second cl:array-rank-limit cl:array-dimension-limit #+END_SRC
Example of incorrectly used dynamic variable naming pattern (before the proposed change) (SBCL implementation): #+BEGIN_SRC (let ((a 1)) (setf a 2))
; file: /tmp/slimer0wqXc ; in: LET ((A 1)) ; (LET ((COMMON-LISP::A 1)) ; (SETF COMMON-LISP::A 2)) ; ; caught COMMON-LISP:STYLE-WARNING: ; using the lexical binding of the symbol (COMMON-LISP::A), not the ; dynamic binding, even though the name follows ; the usual naming convention (names like FOO) for special variables ; ; compilation unit finished ; caught 1 STYLE-WARNING condition #+END_SRC
Examples of predicates named *p vs *-p (including a popular 3rd-party library):
'Good' examples (before the proposed change): #+BEGIN_SRC cl:evenp cl:oddp cl:stringp cl:base-string-p cl:hash-table-p cl:pathname-match-p #+END_SRC
'Bad' examples (before the proposed change) #+BEGIN_SRC bordeaux-threads:lock-p bordeaux-threads:semaphore-p cl:string-lessp cl:string-not-lessp cl:string-greaterp cl:string-not-greaterp #+END_SRC
- New (presumably low-level) language:
All of the above suggestions apply to this as well however if a new language is being made, it makes sense to care less about any kind of backwards compatibility and more about features. The ones presented here have very general description and are directions rather than something that can be put into actual specification.
** Object system *** Objects **** Details How objects should be made? Constructors, destructors, (multiple) inhertiance vs composition etc
**** Reasoning There are different object systems for different tasks, but some of them are easier implemented in terms of another. Currently classes/structures cannot be parameterized in any way.
**** Current status CLOS has classes that are more first-class citizens, compared to structures that are less used and supported. Both of them interact with the type system in a certain way, sometimes not the best.
*** Type system **** Details Which types should and should not be included, boxed vs unboxed types, type parameterization, linear types, dependent types etc. Interaction with other parts such as (compiler-)macros.
**** Reasoning The current type system is state of the art, but it is state of the art from the 90s. There was a lot of research in the area for the last decade that should be considered. While not all of this (if any) should go into the spec, the goal is to make it easier for the future developers to incorporate their preferred features into the language.
**** Current status
- [[https://blog.30dor.com/2014/03/21/performance-and-types-in-lisp/][Performnance and types in Lisp]]
- There are various attempts to extend the type system with proper parameterized types, recursive type definitions, more strict type checks and inference, the major example being [[https://github.com/stylewarning/coalton][Coalton]].
*** Methods **** Details Depending on the class system, methods or rather "methods" can be organized into traits/typeclasses, or generics, or belong to the class (unlikely). They can also specialize either on classes or types.
**** Reasoning THe way generics work in CLOS allows for a certain flexibility and famous runtime redefinition. However at the same time, the performance of the current approach is quite poor, restriciting its use. Sometimes specializing on classes instead of types can be limiting.
**** Current status There are several atttempts to deal with the inefficiency (in terms of raw performance and safety) of generic functions -- including [[https://github.com/marcoheisig/fast-generic-functions][fast-generic-functions]], [[https://github.com/markcox80/specialization-store][specialization-store]], and [[https://github.com/digikar99/adhoc-polymorphic-functions][others]]. However, they do not fully avoid the limitations mentioned above.
There are also attempts to do something completely different such as [[https://github.com/fare/lisp-interface-library][LIL]] -- they should not be forgotten.
** Syntax
While not a low-level thing on its own, any changes to syntax can imply changes so breaking (for backwards compatibility) that it cannot be put in any of the above categories. The opinions on the subject heavily differ, the opposite approaches suggested are:
No additional syntax at all (even for strings) with the ability to heavily modify the reader extending it for these types of things
vs
A lot of builtin syntactic sugar, for example for literals, but some other things as well (some examples presented below), since there aren't that many common special syntactic constructs, for ease of use.
The default case, case sensitivity etc are also a thing to discuss, although there isn't much talk about that part.
This section can be separated in the future if necessary.
*** Reader macros **** Details The way reader macros should work and interaction between user and reader.
**** Reasoning Currently the system does not allow users to hook into the reader. That could be a huge improvement, allowing for various modifications.
**** Current status There are libraries that try to extend possibilities, such as [[https://github.com/melisgl/named-readtables][named-readtables]].
*** Syntax **** Details If and where can =[]= or ={}= be introduced, slot/structure dot =.= access, etc.
**** Reasoning Compare =(slot-value (slot-value (slot-value x 'foo) 'bar) 'baz)= vs =x.foo.bar.baz= vs =(at x 'foo 'bar 'baz')=.
**** Current status
Available in the [[https://github.com/AccelerationNet/access/][Access]] library: dotted and nested access of data structures (including, but not limited to, slot values). Access is shipped in the [[https://github.com/ciel-lang/CIEL][CIEL]] project.
Available in the [[https://github.com/vseloved/rutils/blob/master/docs/tutorial.md][Rutils]] library: dot notation and index-based access. =(elt (nth 1 (foo-slot2 (bar-slot1 obj)) 0)= can be written [email protected]#1#0=.
*** Unified order of arguments **** Details Operations (functions, macros etc) have predictable (possibly identical) order of arguments. If an operation takes for example 2 arguments (data structure and an index/key) - it is expected that the data structure is always in the same position and the key is also in the same position across the board, regardless of what the actual positions are.
**** Reasoning This issue is an unnecessary burden on one's mind when developing a project and forces the developer to break their focus to think about an out of place implementation detail instead of on the program's business logic. It's just something that is there for no reason other than legacy.
**** Current status The standard contains functions like (gethash key hashtable) and (aref array index). In the case of gethash the data structure (hashtable) is the 2nd argument. In the case of aref the data structure (array) is the 1st argument. These are just 2 examples of such inconsistency. One potential example to follow would be the bulk of list operations, like mapcar, which take key (lambda) as the 1st argument and list(s) (data structure) as the 2nd or 3rd (etc) arguments and those kinds of functions are pretty much consistent across the board.
** System definitions *** Packages **** Details Allow to resolve names at runtime, more convenient export system etc.
**** Reasoning
**** Current status
*** Pathnames **** Details Interaction with pathnames w.r.t. current OS landscape. One standard way to parse a POSIX or Windows path string to a path name, or a URL. Path names should have a "method" for this.
**** Reasoning
**** Current status
*** Separation into libraries **** Details The core language can be separated into libraries with separate condition system, data structures library, algorithms library, math library, concurrency library, iteration library, code-walking library, etc.
**** Reasoning
**** Current status [[https://github.com/robert-strandh/SICL][SICL]] and [[https://github.com/clasp-developers/clasp][Clasp]] compilers are built with this idea in mind.
** Memory *** GC **** Details GC vs RAII in some form. There are several alternatives. Semantics of the language depends heavily on this as well. As this section is probably one of the most fundamental things for Common Lisp (as well as most other lisps), here's a more detailed description:
- Problems/Limitations/Use-cases with GC
- Aspects include real-time systems, working well with foreign code, embedded-systems aka devices with limited memory; maybe something else
- The whole page is an interesting read: [[https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Disadvantages][Garbage_collection_(computer_science) - Wikipedia]]
- https://en.wikipedia.org/wiki/Manual_memory_management#Manual_management_and_correctness
- What are the systems where GC is and isn't an issue (limited-memory systems?)
- Default GC, but if GC does have disadvantages, an option to handle the memory manually should be provided.
**** Reasoning GC vs (semi) manual memory management is about many things, including performance, convenience of use, precision of time estimations, more complicated structures etc.
**** Current status [[https://github.com/trivial-garbage/trivial-garbage][trivial-garbage]] provides a portable API to finalizers, weak hash-tables and weak pointers on all major implementations of the Common Lisp programming language.
*** Continuations **** Details A powerful low-level control construct.
**** Reasoning It is up to the debate for several reasons, one of them being its [[http://www.nhplace.com/kent/PFAQ/unwind-protect-vs-continuations-original.html][interaction]] with unwind-protect.
**** Current status
** Misc
- Useful accessors on macro environment objects.
** Is this idea new?
Of course not. Attempts to build low level C-like lisp exist, lots of them: [[https://github.com/eudoxia0/corvus][1]], [[https://github.com/tomhrr/dale][2]], [[https://github.com/kiselgra/c-mera][3]], [[https://github.com/eudoxia0/interim][4]] and there are more. Attempts to build low-level statically-typed lisp-like language are also well known: [[https://github.com/carp-lang/Carp][1]], [[https://github.com/u2zv1wx/neut][2]] and there are more. Two things they presumably lack are: pre-built well defined specification and community visibility and support.
Same can be said about attempts to just upgrade exiting CL implementation, such as famous [[https://lispcookbook.github.io/cl-cookbook/cl21.html][CL21]].
- Useful links:
[[http://nhplace.com/kent/Papers/cl-untold-story.html][Common Lisp: The Untold Story]] and [[http://nhplace.com/kent/Papers/][friends]] have a lot of useful info in them. [[https://pvk.ca/Blog/2013/11/22/the-weaknesses-of-sbcls-type-propagation/][Paul Khuong blog]] has many notes on potential compiler improvement, althoug specific to SBCL.
- Obstacles
While the discussion is good by itself, the important question is -- how can this come to life? There are 3 major components:
-
Money
-
Time
-
People
- Conclusion May not be written until the bulk of this document is finished.