positano
positano copied to clipboard
Provenance system for Clojure code
positano
"Positano bites deep. It is a dream place that isn't quite real when you are there and becomes beckoningly real after you have gone." -- John Steinbeck
Provenance system for Clojure code. This is a documentation-first project, so there very little code right now.
Provenance
Provenance (from the French provenir, "to come from"), is the chronology of the ownership, custody or location of a historical object. Within computer science, provenance means the lineage of data, as per data provenance, with research in the last decade extending the conceptual model of causality and relation to include processes that act on data and agents that are responsible for those processes.
Motivation
Immutability and complexity
One of the often-touted advantages of Clojure's immutability is that it makes it easier to reason about the code because, well, state is not mutated. Frequently, in practice, Clojure become very adventurous with chaining multiple data transformations. This, combined with the lack of types often leads to long chains of transformations and/or deep call stacks that transform the data into shapes that are not immediately apparent and require a painstaking exploratory approach to elucidate.
Debuggers can be used to step through the code, and to watch the data transformation as it happens and to construct a mental narrative of the different steps and data shapes that arise, but would it not be preferable to have a tool that constructs the narrative for you and then allows you to explore it visually and by running queries against the execution narrative? This type of exploration would have the added advantage of being able to record multiple runs and then to query them all at once.
Data shape exploration and troubleshooting
Information about the parameters passed to functions can be generalised to show the shape of data generally expected by the function. This can then be used to infer a Prismatic schema or a Typed Clojure type which can be incorporated into the code. This is similar to F# type providers which infer types from example data.
There is a certain class of errors that occurs in dynamically typed languages where a function is passed data of the correct shape in most cases, but under certain circumstances it is passed data of the wrong shape which then causes a bug. In most cases the bug is only revealed several levels deeper into the call stack because the data conflicts with the assumptions of a different function. Gathering data about what the passed parameters look like can reveal such bugs in a statistical way: if parameters conform to the same shape in 99% of the calls to a function, the programmer should look into the remaining 1% in case it reveals a bug.
Coverage exploration
Running a collection of unit tests against a codebase along with a provenance tool can reveal which parts of the codebase are not exercised, and therefore it can be used as a test-coverage metric.
Usage
How to
This section assumes the following:
(require '[positano.trace :as trace])
Find out how many vars are being traced:
(->> (trace/all-fn-vars) (filter trace/traced?) count)
Print out all the vars being traced:
(->> (trace/all-fn-vars)
(filter trace/traced?)
(map str)
clojure.pprint/pprint)
Trace vars that belong to a namespace with a specific prefix:
(->> (trace/all-fn-vars)
(filter #(trace/ns-prefix? % "my-project.my-ns."))
(map trace/trace-var*)
(doall))
Untrace everything:
(trace/untrace-all)
Architecture
- Total code tracing
- Datomic for execution data collection
- Utility code for querying/filtering/exploring/visualising execution narrative.
- Utility code for extracting Prismatic schemas and Typed Clojure types from execution information.
Limitations
positano will refuse to add tracing to certain namespaces when you use
the trace-ns* function (it will add tracing when you use the
trace-var* function on vars belonging to those namespaces, so do
that at your own risk):
- clojure.core
- clojure.core.protocols
- clojure.tools.trace
- datomic.*
- clojure.tools.analyzer.*
- clojure.core.async.*
- refactor-nrepl.*
- clojure.tools.nrepl.*
- clojure.repl.*
- cider.*
- deps.*
- positano.*
Next steps/roadmap
- Explore tracing individual S-expressions
- Utilities for deriving schemas and type annotations
License
Copyright © 2015 Efstathios Sideris
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.