sexpresso icon indicating copy to clipboard operation
sexpresso copied to clipboard

An s-expression library for C++

#+TITLE: Sexpresso

Sexpresso is c++ centric [[https://en.wikipedia.org/wiki/S-expression][s-expression]] parser library. It uses value types and move semantics and thus manages to completely avoid calling new/delete, so it's probably free from memory leaks!

Sexpresso aims to be very simple, nodes are parsed either as s-expressions or strings, even a number would be parsed a string, so if you expect a node to be a number, please convert the string to a number!

  • How to use

By default Sexpresso uses [[http://doc.cat-v.org/bell_labs/pikestyle][pike style]] which means that it does not include any header files in its headers by default. If you wanna keep using /pike style/ you'll have to locate the line in the header you are about to include that will look something like this:

#+BEGIN_SRC c++ #ifdef SEXPRESSO_OPT_OUT_PIKESTYLE #ifndef SEXPRESSO_HEADER #define SEXPRESSO_HEADER #include #include #include // #include "sexpresso.hpp" #endif #endif #+END_SRC

And copy all of the includes before you include the actual header. If you don't wanna do this and use header guards which can add extra processing time for the pre-processor, and all in all more unclearness of what includes what (since it ends up being a big chain), you can define ~SEXPRESSO_OPT_OUT_PIKESTYLE~ for the pre-processor.

After that you can move onward!

** Parsing

#+BEGIN_SRC c++ auto parsetree = sexpresso::parse(mysexpr); #+END_SRC

This code will parse the ~std::string~ in mysexpr and return a Sexp struct value. There are two main things you can do with this value you can:

  1. Turn it back to a string with ~parsetree.toString()~
  2. Query it using ~parsetree.getChildByPath("path/to/node")~

Number 2 might be slightly confusing, how does it follow the path? Well if you've ever used lisp, you know that the /command/ in an s-expression is the first element. The same thing here determines where it goes looking for values. For example if you have an s-expression such as

#+BEGIN_SRC lisp (my-values (sweet baby jesus) (hi mom) just-a-thing) #+END_SRC

You can query it like:

#+BEGIN_SRC c++ auto sub = parsetree.getChildByPath("my-values/sweet"); #+END_SRC

and

#+BEGIN_SRC c++ auto sub = parsetree.getChildByPath("my-values/hi"); #+END_SRC

Note that you get the sexpr node that contains the value you were looking for as its first value. The sexpr then simply holds an ~std::vector~ of all the sub-values. However, it might not always use the vector, if it's simply a string value like ~just-a-thing~ in the above example, then the vector will be empty, and you need to access the string value instead.

#+BEGIN_SRC c++ if(sub->isSexp()) { std::cout << sub->value.sexpr[1]; } else { std::cout << sub->value.str; }

// or

switch(sub->kind) { case sexpresso::SexpValueKind::SEXP: std::cout << sub->value.sexpr[1]; break; case sexpresso::SexpValueKind::STRING: std::cout << sub->value.str; } #+END_SRC

Sexpresso provides a comfortable way to iterate only over the "arguments of a s-expression. For example if we have an s-expression like ~(hi 1 2 3)~ then the arguments are ~1~, ~2~ and ~3~. If we've parsed and stored that s-expression in a variable called ~hi~, we iterate over its arguments like this:

#+BEGIN_SRC c++ for(auto&& arg : hi.arguments()) { // .. }

// or for(auto&& it = hi.arguments().begin(); it != hi.arguments().end(); ++it) { // .. } #+END_SRC

You can also check if the arguments are empty and how many there are with the ~empty~ and ~size~ methods of the ~SexpArgumentIterator~ class.

WARNING Be REALLY careful that your query result does not exceed the lifetime of the parse tree:

#+BEGIN_SRC c++ Sexp* sub; { auto sexp = sexpresso::parse(mysexpr); sub = sexp.getChildByPath("my-values/just-a-thing") } // sexp gets destroyed here cout << sub.toString(); // BAD! #+END_SRC

** Serializing Sexp structs have an ~addChild~ method that takes a Sexp method. Furthermore, Sexp has a constructor that takes a std::string, so this should make it really easy to build your own Sexp objects from code that you can serialize with ~toString~.

#+BEGIN_SRC c++ auto myvalues = sexpresso::Sexp{"my-values"};

auto sweet = sexpresso::Sexp{"sweet"}; sweet.addChild("baby"); sweet.addChild("jesus");

auto hi = sexpresso::Sexp{"hi"}; hi.addChild("mom");

auto justathing = sexpresso::Sexp{"just-a-thing"};

auto myvaluesholder = sexpresso::Sexp{}; myvaluesholder.addChild(std::move(myvalues)); myvaluesholder.addChild(std::move(sweet)); myvaluesholder.addChild(std::move(hi)); myvaluesholder.addChild(std::move(justathing));

auto sexp = sexpresso::Sexp{}; sexp.addChild(myvaluesholder);

// sexp should now hold the same s-expression we wrote in text earlier std::cout << sexp.toString(); #+END_SRC

*** Important

The outermost s-expression does not get surrounded by paretheses when calling toString, as it treats a string as being implicitly surrounded by parentheses. This is so that you can have multiple s-expressions in the "root" of your code, and serialization goes back to text the same way it came in. That's why we have the ~sexp~ in the above code example. If we simply called ~toString~ on ~myvaluesholder~ we would get

#+BEGIN_SRC lisp my-values (sweet baby jesus) (hi mom) just-a-thing #+END_SRC

instead of

#+BEGIN_SRC lisp (my-values (sweet baby jesus) (hi mom) just-a-thing) #+END_SRC

Cool? Cool.

  • S-expression primer

Confused? I mean what iiiis an s-expression?

s-expressions come from the lisp family of programming languages, it is an incredibly simple notation for lists, however, since these lists can be nested it also means that they are great for representing hierarchies as well, which makes it an excellent replacement for XML or JSON.

The notation is simply to surround the elements, separated by whitespace in parentheses, like this:

#+BEGIN_SRC lisp (here we have an s-expression) #+END_SRC

What you see here is a list of 5 symbols: ~here~, ~we~, ~have~, ~an~ and ~s-expression~. Like I said you can also put s-expressions inside s-expressions to create hierarchies:

#+BEGIN_SRC lisp (my-objects (object-a (name "isak andersson") (countries swe uk)) (object-b (name "joe bain") (countries uk))) #+END_SRC

And as you could see earlier in the [[How to use]] section you can query this hierachy easily with this library. Say that this s-expression is stored in a variable called ~objs~, you can query it like this:

#+BEGIN_SRC lisp auto joe = objs.getChildByPath("my-objects/object-b/name"); #+END_SRC

  • FAQ ** Why should I use s-expressions because they are more elegant and simple than XML or JSON. Much less work required to parse. And they look nice! (subjective)

  • Contributing This library is public domain (CC0). Generally speaking by default, you have copyright on anything you make. So you have to explicitly give up copyrights in order to put something in the public domain. In this repository, please add your signature to [[CONTRIBUTORS.txt]] when contributing.

  • Future direction Make it a header-only library instead perhaps?