iota icon indicating copy to clipboard operation
iota copied to clipboard

Transactional semantics for data modification

Open pythonesque opened this issue 10 years ago • 4 comments

LAYERS

High level

  • Intuitive "key binding" style commands ("delete next character", "insert tab", etc.)
  • Aliases for queries.

Query level

  • Declarative.
  • Ordered sequence of queries of the form:

LET (result set) := MAP (set expression of positions) (position => sequence expression of characters)

IR level

  • For simplicity, in initial implementation, linearly scan through all text and check conditions on each character.
  • Later: indexes from common statistics (LINE) to position in buffer, basic constant propagation, smart joins, etc.

Low level

Data modification should be of the form:

  • Begin transaction.
  • Acquire stable view of the document (for some value of stable).
  • Acquire start state
  • ?
  • Apply series of VERY SIMPLE operations manipulating text. These operations do not have to worry about anything but buffer data. These actions should be ABSOLUTE. i.e. it would not be "delete line, move to next line, delete line", but "destroy line 0, destroy line 1." They would follow simple rules (e.g. all writes proceed from left to right, top to bottom) and be completely tied to the current buffer implementation.
  • End transaction.

The ? is where we calculate the series of very simple operations from higher level actions. We would calculate cursor position here too, I think.

This is a very rough draft idea and I am highly open to changes.

"High level" user facing language example (semantics, not syntax!):

TRANSACTION1 := MAP (POSITION(character) WHERE LINE(character) = LINE($CURSOR) OR LINE(character) = NEXT(LINE($CURSOR))) (position => [])

TRANSACTION2 := DELETE character WHERE POSITION(character) = POSITION(NEXT($CURSOR))"

Note that this is data modification. Cursor math is entirely separate from data modification--such queries do not move the cursor at all intrinsically, but may be updated in individual transactions (which can see values updated by old transactions):

POSITION($CURSOR) := MAX(POSITION(character)) WHERE DELETED(TRANSACTION1,character)

pythonesque avatar Dec 19 '14 11:12 pythonesque

SQL for text? Hmmm...

Let me see if I'm understanding this correctly:

TRANSACTION1 := MAP (POSITION(character) WHERE LINE(character) = LINE($CURSOR) OR LINE(character) = NEXT(LINE($CURSOR))) (position => [])

POSITION(character) says we're collecting a SET of buffer offsets to individual characters LINE(thing) is the number of newlines preceding thing NEXT(number) is number + 1

This thing:

LINE(character) = LINE($CURSOR) OR LINE(character) = NEXT(LINE($CURSOR))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

is a conditional matching any characters (or buffer positions) on the same line as the cursor or one line below the cursor.

and this thing:

(position => [])
 ^^^^^^^^    ^^

describes a (function? relation? (same difference?)) between buffer positions and some value (nothing, in this case).

Together,

MAP ( <property-of-thing> WHERE <conditional-expression> ) ( <relation-from-property-of-thing-to-value> )

describes an operation in which all the matching <property-of-thing>s are sent to <relation-from-property-of-things-to-value>.

Finally, TRANSACTION1 now describes the equivalent of typing dj in a vi style editor.

Alternatively, one could replace (position => []) with (position => ['z']), and replace every character on both lines with 'z', or perhaps (position => UPPERCASE(CHARACTER(POSITION))) and obtain the vi operation gUj.

At some point, the editor can now take this labeled transformation and (compile?/optimize?) it before applying it to the buffer.

Is that about right?

This is a really interesting idea, and I'm having a lot of fun thinking about it, but I have to wonder what performance is going to be like, especially if we're going to be evaluating complex clauses very often. If pressing the delete key involves evaluating a WHERE clause that searches the entire buffer for one position, that seems sub-optimal. That said, I use SQL in my day job, and while I don't know much about the implementation details, the compile+optimize steps in existing SQL engines clearly shows that it can be done efficiently.

I think building all edit operations around something like this would be really cool, but we'll have to take care deciding which primitives to support and how to handle query optimization.

I posted some thoughts on a slightly different approach over in #31, though this seems much more general.

crespyl avatar Dec 19 '14 15:12 crespyl

It's actually not terribly far off from the selection methods vim exposes. vim demonstrates important optimizations that make it fast, most of which come down to optimizing sensible defaults:

  • explicit range limiting results
  • defaulting to just the current line
  • defaulting to starting from the current cursor
  • looking at only the first match by default

So most queries are only going to be looking at a handful of characters.

pythonesque avatar Dec 19 '14 15:12 pythonesque

I think this is a really interesting idea & I'd like to see it pursued further.

withoutboats avatar Dec 22 '14 07:12 withoutboats

Maybe implement an EDSL inspired by this. MongoDB-like declarative language which the compiler can check for (instead of running on SQL injections and runtime errors)?

vinipsmaker avatar May 22 '15 13:05 vinipsmaker