Quill: The DOM, Parchment, and Delta views of the document

This post was written by eli on December 16, 2021
Posted Under: JavaScript,Rich text editors

Everything in this post relates to Quill v1.3.7. I don’t do web tech for a living, and my knowledge on Quill is merely based upon reading its sources. Besides, I eventually picked another editor for my own use.

The trio

The most remarkable thing about the Quill editor is that it constantly maintains three representations of the edited document, all of which are kept in sync with each other:

  • The DOM, which is the browser’s view on the visible edit window
  • The Parchment, which is Quill’s parallel representation to the DOM (kept as the quill.scroll object)
  • The Delta, which is a sequence of editing operations for creating the document from scratch (kept as quill.editor.delta). This can be thought of as what to type, from top to bottom, to get the document that is currently in the editor window.

This post discusses these three representations and how the relation among them is maintained.

It’s worth mentioning that apparently Quill was conceived with the idea that the Delta representation should be used exclusively, instead of the HTML code. In other words, it was envisioned that the web page for publishing would load the Delta from the server in JSON format (or something of that sort), and then a Quill parser, running on the browser as JavaScript code, would extract it into the DOM representation for view.

This is probably the reason Quill’s formal API doesn’t provide a means to obtain the HTML of the document.

Positions: index and length

Even though it isn’t directly related to any of these three, it’s worth mentioning that there’s also an index / length representation, in which the document is viewed as a linear sequence of characters.

Formatting (e.g. bold, italics, links) doesn’t occupy any length. A new paragraph, however, has a length of one, on behalf of the newline that it results from (even though that newline isn’t included in the DOM, and hence not in innerText). Embedded objects, such as images, are also considered to have a length of one.

The index and length are used extensively in Quill’s API as well as internal machinery to define positions inside the document.
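
To make the counting rule concrete, here is a minimal sketch that computes a document’s length from a list of Delta-style insert operations (the Delta format is covered further down). The documentLength() helper and the sample data are made up for this illustration and are not part of Quill.

```javascript
// Hypothetical helper (not part of Quill): computes a document's length
// from the ops array of a Delta that describes a whole document.
function documentLength(ops) {
  return ops.reduce((total, op) => {
    // A string contributes one position per character (newlines included);
    // an embed object (e.g. an image) contributes exactly one.
    const size = typeof op.insert === "string" ? op.insert.length : 1;
    return total + size;
  }, 0);
}

const sampleOps = [
  { insert: "Hi " },
  { insert: "there", attributes: { bold: true } }, // formatting adds no length
  { insert: { image: "https://example.com/cat.png" } }, // embed: length 1
  { insert: "\n" }, // end of paragraph: length 1
];

console.log(documentLength(sampleOps)); // 3 + 5 + 1 + 1 = 10
```

In a live editor, quill.getLength() should report the same kind of number for the current document.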

Quill and DOM

As the document is edited, it’s of course shown in the browser. Hence it necessarily has a DOM representation in the browser, as a subtree of the DOM element that encloses the editing area (kept as the quill.root property). There is nothing special about this area, in fact, except that it’s editable and manipulated by Quill.

It’s nevertheless worth a quick recap of how the DOM tree is structured. Without going into the gory details, one can say that the DOM’s tree is a reflection of the HTML that would represent it properly: Whenever there’s a tag that needs a terminator (e.g. <p>, <div>, <b>, <a>), the node belonging to the tag becomes the parent of everything between the tag and its terminator. The siblings in this subtree are of course placed in the same order as they appear in this theoretical HTML document.

Note that the tree structure doesn’t imply the graphical packing of the elements: A <div> tag creates a subtree containing everything that is within its vertical limits, but <b> does nothing of that sort. On the other hand, <br> and <hr> tags don’t generate any subtree, but do influence vertical packing.

Another thing to be aware of is that superfluous tags create DOM nodes as they appear in the originating HTML. So if the HTML says <b><i><b>Text</b></i></b>, the superfluous, internal <b> tag creates a subtree. There’s no such optimization when building the DOM.

The Parchment

The Parchment is a tree that mimics the DOM’s tree structure quite accurately, but is based upon a completely different set of classes. Accordingly, the objects in the Parchment, the blots, have different properties and hence mostly contain different kinds of information.

There are differences between the structure of these two trees only where the Cursor blot is inserted (see a separate post on this) or in some other cases that go beyond plain document editing (e.g. when something that Quill can’t digest has been pasted into the editing area, in which case that chunk becomes uneditable, and hence the relevant subtree isn’t covered in the Parchment in detail). Other than that, a difference in the tree structure indicates a bug somewhere, most likely in some add-on module.

As mentioned before, the Parchment for a document is given as quill.scroll. The nodes in the Parchment are called blots, which is a collective name for the nodes themselves (i.e. the JavaScript objects) as well as for the classes of these objects. Each node has a .next and a .prev property, pointing at its next and previous sibling (null when there is none). When a node has children, .children.head points at the first child (and likewise .children.tail at the last). Otherwise these are null.

Even more interesting, each blot also has a .domNode property, which points at the DOM object it represents. Conversely, each DOM object inside the editor’s area has a __blot property, so that __blot.blot points back to the blot that represents it (except for those DOM nodes that are not covered).
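
The sibling and child links described above are enough to walk the whole tree. The following is only a sketch: walkBlots() is a made-up helper, and the plain objects stand in for real blots so that it can run without a browser.

```javascript
// Depth-first walk over a Parchment-like tree, using only the properties
// described above: .children.head for the first child and .next for siblings.
function walkBlots(blot, visit, depth = 0) {
  visit(blot, depth);
  let child = blot.children ? blot.children.head : null;
  while (child) {
    walkBlots(child, visit, depth + 1);
    child = child.next;
  }
}

// Hand-made stand-in for a tiny tree (scroll > block > two text leaves).
// These plain objects are hypothetical; with a live editor, quill.scroll
// would be passed instead.
const textB = { name: "text B", next: null };
const textA = { name: "text A", next: textB };
const block = { name: "block", next: null, children: { head: textA, tail: textB } };
const fakeScroll = { name: "scroll", next: null, children: { head: block, tail: block } };

const visited = [];
walkBlots(fakeScroll, (blot, depth) => visited.push("  ".repeat(depth) + blot.name));
console.log(visited.join("\n"));
```

With real blots there is no .name property as in this mock; something like blot.statics.blotName would be the thing to print instead.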

The important difference between a blot and the DOM node it represents is that blot classes represent the intention of the graphical elements they generate. For example, an editor may be customized to support two kinds of links: To sites of type A and sites of type B. They should have different formatting and possibly attributes. By representing each link type as a different blot class, the user creates them with different buttons, possibly with a different UI for feeding their details, and the formatting (i.e. selection of the correct class) is done automatically. Nevertheless, both blots end up as an <a> DOM element.

The vast majority of functions and methods in Quill operate on the Parchment tree and the DOM tree simultaneously. The low-level methods that relate to the Parchment tree separately are implemented as the Registry class.

The Delta format

Delta is a serialized, and completely different way to look at the document. Note that the Delta format is also used to represent differences and changes, which is discussed further down. For now, I’ll focus on Delta as a representation of the entire document.

The Delta object, which is kept as quill.editor.delta, but should be obtained by the application with the quill.getContents() method, has a single property, an array called ops. This array consists of the sequence of operations required to reproduce the document, starting from an empty one.

So the document isn’t described in terms of its structure, but how it would have been typed into the editing window from beginning to end. It’s worth trying out the live editor example at the bottom of this page which shows the Delta object side by side (and also explains the format in more detail).

The said ops array contains only insertion operations: Each operation is represented with an object, which has at least one property, called “insert”. If it contains a string, that’s the text to add at the specific point. Newlines in this string are newlines as typed on keyboard. These end up as some block blot (paragraph, header etc. depending on the context).

If the “insert” property is an object, that requests the generation of a blot. The name of the property of this object is the class of the blot, and the value of the property is the value to be assigned as the “value” of the blot object.

An insert operation may also have another property, “attributes”, which is an object. Its properties modify the insert op in a variety of ways: They may set the font or the color, and a “link” property turns the inserted element into a link.

There is hence a fundamental difference between how text is represented in Delta format vs. with HTML and the DOM. In the latter case, it goes “bold starts here, text, bold ends here”. With Delta, it goes “this is a segment of uniformly formatted text, and complete description for the formatting is this and that (among others, bold)”. So the textual parts of the document are chopped into chunks with uniform formatting, and they may span several lines.

To produce a pretty-printed JSON string of the document in Delta format:

JSON.stringify(quill.getContents(), null, '  ');
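
As an illustration of the ops format described above, this is a hand-written Delta for a short two-line document. The content and URLs are placeholders, and plainText() is a made-up helper rather than anything Quill offers.

```javascript
// A hand-written Delta: "Hello world" with "world" in bold, followed by a
// line holding a link and an image. Note how the third op spans a line break.
const doc = {
  ops: [
    { insert: "Hello " },
    { insert: "world", attributes: { bold: true } },
    { insert: "\nVisit " },
    { insert: "the site", attributes: { link: "https://example.com/" } },
    { insert: { image: "https://example.com/logo.png" } },
    { insert: "\n" },
  ],
};

// Hypothetical helper: keeps the typed text and drops embeds and formatting.
function plainText(delta) {
  return delta.ops
    .map((op) => (typeof op.insert === "string" ? op.insert : ""))
    .join("");
}

console.log(JSON.stringify(plainText(doc)));
```

Every chunk carries its complete formatting in its own “attributes” object, so nothing ever has to be “closed” as with HTML tags.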

Formatting in Delta ops

It may seem counterintuitive that a link is a formatting of the text segment, but when considering that links are almost always segments with uniform formatting, it turns out to be the natural solution: The link is just an attribute of the text.

Another confusing thing might be that a header (as in <h1>Header</h1>) is inserted as two ops: The first is the text, and the second is just a “\n” insert with the attributes object containing a “header” property giving the rank of the header (i.e. <h3> gets @header 3). This is because attributes that relate to blocks are applied only to the newline character(s) in the text, controlling which block-level blot is the parent of the text before the newline.
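
Here is a small sketch of the two-op header and of how the block attribute on the newline ends up belonging to the whole line. splitIntoLines() is a made-up helper with a simplification: it ignores embeds and assumes that an op containing a newline carries only block-level attributes.

```javascript
// Ops for <h1>Header</h1> followed by an ordinary paragraph.
const headerOps = [
  { insert: "Header" },
  { insert: "\n", attributes: { header: 1 } },
  { insert: "Plain paragraph\n" },
];

// Hypothetical helper: groups text into lines, attaching to each line the
// attributes of the newline that terminates it. Embeds are ignored.
function splitIntoLines(ops) {
  const lines = [];
  let text = "";
  for (const op of ops) {
    if (typeof op.insert !== "string") continue;
    const parts = op.insert.split("\n");
    parts.forEach((part, i) => {
      text += part;
      // Every part except the last one was followed by a newline in this op.
      if (i < parts.length - 1) {
        lines.push({ text, block: op.attributes || {} });
        text = "";
      }
    });
  }
  return lines;
}

const lines = splitIntoLines(headerOps);
console.log(lines);
```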

It also goes along with the fact that Quill ingests pasted HTML by traveling through the DOM of the pasted text in post-order, meaning that the children of a parent are scanned from left to right, and then the parent. Hence when scanning e.g. “<h1>This is <i>important</i></h1>” it goes “This is”, “important”, italic tag, header tag. Those used to RPN calculators will find this familiar.
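
The traversal order can be tried out on a plain-object stand-in for the DOM. The data structure below is made up for this illustration and has nothing to do with how Quill’s clipboard handling stores things.

```javascript
// Plain-object stand-in for the DOM of <h1>This is <i>important</i></h1>.
// Strings are text nodes; objects are elements.
const pasted = {
  tag: "h1",
  children: ["This is ", { tag: "i", children: ["important"] }],
};

// Post-order traversal: children from left to right first, then the parent.
function postOrder(node, out = []) {
  if (typeof node === "string") {
    out.push(JSON.stringify(node));
    return out;
  }
  node.children.forEach((child) => postOrder(child, out));
  out.push("<" + node.tag + ">");
  return out;
}

const visitOrder = postOrder(pasted);
console.log(visitOrder.join(", "));
```

The text comes out first and the enclosing tags afterwards, which matches the way block attributes trail the text in the Delta ops.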

Basic API for manipulating the document

The Quill API (mainly) supplies two methods for inserting things into the document: insertText() and insertEmbed().

insertText() is exactly like typing the text with the keyboard. Newlines (“\n”) are treated like pressing Enter. For example, if the text is inserted inside a bulleted list, a new line and bullet are created, exactly as pressing Enter would.

Inserting text with updateContents(), where plain text is given in the .insert property works completely differently, because the information is treated as Delta operations. For example, a newline in a Delta op may appear to have a quirky behavior unless the Delta format is properly understood.

Another aspect of updateContents() is that it doesn’t respect surrounding formatting. So for example, if insertText() adds text where the context is bold, the added text will be bold too. updateContents() will add non-bold text in this case, unless the Delta has been set up to generate bold text. Every “insert” entry in a Delta op lists all formatting that should be applied explicitly, regardless of the surroundings.

All calls to updateContents() relate to the beginning of the document. In order to reach the place to manipulate, “retain” is used to skip to that position (using index metrics) and possibly “delete” to remove parts, as described in the API page.
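
To get a feeling for how “retain”, “delete” and “insert” work together, here is a toy model that applies such ops to a plain string. It ignores attributes and embeds altogether, so it is by no means what Quill does internally; applyChange() is a made-up name.

```javascript
// Toy model of applying a change Delta to plain text.
function applyChange(text, ops) {
  let cursor = 0; // position in the original text
  let result = "";
  for (const op of ops) {
    if (typeof op.retain === "number") {
      // Keep the next op.retain characters as they are.
      result += text.slice(cursor, cursor + op.retain);
      cursor += op.retain;
    } else if (typeof op.delete === "number") {
      // Skip over the deleted characters without copying them.
      cursor += op.delete;
    } else if (typeof op.insert === "string") {
      result += op.insert;
    }
  }
  // Whatever follows the last op is kept unchanged.
  return result + text.slice(cursor);
}

// Skip "Hello ", remove "world", and put "Quill" in its place.
const change = [{ retain: 6 }, { delete: 5 }, { insert: "Quill" }];
const changed = applyChange("Hello world\n", change);
console.log(JSON.stringify(changed));
```

Passing the same ops to updateContents() in a live editor should have the corresponding effect on the document.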

In summary: insertText() is like typing, and updateContents() injects blots and text directly, with possibly counterintuitive behavior.

Several manipulation methods are listed in the Parchment’s API, in particular insertAt(), formatAt() and deleteAt(). The first two are used directly by insertText() (see core/editor.js). However, these shouldn’t be used directly except when implementing blots and other internal functionality, since they don’t update the Delta view of the document.

For usage as a replacement for insertText() and friends, remember that these are methods of the Parchment and not Quill, so a typical call would go

quill.scroll.insertAt(index, text);

It’s also possible to call these methods on any blot, however note that the index is then related to the blot’s beginning. This is in fact the case with calling the Scroll object too, since its zero index is the beginning of the document.

As mentioned in a separate post of mine, formats are in principle divided into Inline and Block formats. formatText() works with the Inline formats only (or with the Block formats when targeting the newline character), and formatLine() only with Block formats. format() checks if the format description is in the Block or Inline group, and delegates the call to formatLine() or formatText() accordingly (see core/quill.js).

So this snippet turns the selected part into red font, and gets the line in which the selection (or cursor) is included into a block quote:

quill.format("blockquote", true);
quill.format("color", "red");

These are the functions that are called by the toolbar, so using them in a script is equivalent to that. Note however that formatLine() or formatText() allow changing any place in the document, not just where the selection is.
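
For formatting a range that isn’t the current selection, a sketch along these lines may help. highlightAndQuote() is a made-up wrapper, and the recorder object is merely a stand-in for a Quill instance, so that the snippet runs outside a browser.

```javascript
// Formats an arbitrary range, independent of the current selection.
// `editor` is expected to be a Quill instance; index and length are in the
// index / length metrics discussed earlier.
function highlightAndQuote(editor, index, length) {
  editor.formatText(index, length, "color", "red");
  editor.formatLine(index, length, "blockquote", true);
}

// Stand-in that just records the calls it receives.
const calls = [];
const recorder = {
  formatText: (...args) => calls.push(["formatText", ...args]),
  formatLine: (...args) => calls.push(["formatLine", ...args]),
};

highlightAndQuote(recorder, 10, 5);
console.log(calls);
```

With a real editor, the call would simply be highlightAndQuote(quill, 10, 5).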

Assigning innerHTML directly

There are certain situations, where it’s easier to assign a DOM object’s innerHTML directly, as a quick and somewhat dirty way to update the document’s content. One use for direct innerHTML assignment is integration with Highlight.js’ module in syntax.js. It’s also a possibility for ugly hacks instead of modifying the document with Quill’s API. If you choose to do so, kindly do not refer to this post on where you got the idea from.

If and when an assignment is made to innerHTML anywhere in the editor’s area, the related blots are updated to follow suit, and the internal Delta representation is updated immediately as well. This happens as a result of the browser reporting a change (mutation) to Quill, and consequently the synchronization takes place, as explained in the next section.

This doesn’t affect blots that are away from the DOM hierarchy that was affected by the innerHTML update. In effect, this means that if these blots have object properties that are not reflected in the DOM object, they are retained nevertheless: It’s not like the entire document is refreshed from the DOM.

Hence it’s OK to hold hidden information in the blots’ objects, as long as their relevant DOM elements aren’t updated with an innerHTML assignment.

Synchronizing the Parchment and DOM with update()

In principle, the editor window is managed by the browser. In order to keep the Parchment in sync with the editor window’s content, Quill registers itself to listen to several events involving pressing keyboard keys, pasting etc. On top of that, the ScrollBlot class, which is what the top-level scroll blot is made from, registers itself as follows during its construction (from parchment/src/blot/scroll.ts):

    this.observer = new MutationObserver((mutations: MutationRecord[]) => {
      this.update(mutations);
    });
    this.observer.observe(this.domNode, OBSERVER_CONFIG);

The MutationObserver class is defined by the browser. The result of this registration is that when the browser makes any change in the editor’s window, the update() method is called with the mutation array as provided by the browser. Each entry in this array defines which DOM element has changed, and how.

Note that this doesn’t relate just to direct changes of innerHTML, but to the vast majority of user edits on the document.

The update() method that is defined in the same class (and file) goes

  update(mutations?: MutationRecord[], context: { [key: string]: any } = {}): void {
    mutations = mutations || this.observer.takeRecords();
    // TODO use WeakMap
    mutations
      .map(function(mutation: MutationRecord) {
[ ... ]
      })
      .forEach((blot: Blot | null) => {
[ ... ]
      });
[ ... ]

It loops on the array of mutations, finds the corresponding blot object for each DOM object, and calls its update() method. This allows the blot object to update itself, possibly by changing its attributes and content, or update its subtree structure to match the updated DOM tree (see update() method in src/blot/abstract/container.ts).

Note that if update() is called with no arguments, takeRecords() is called to fetch any pending mutation records from the browser. This ensures that when update() returns, any changes in the DOM have been registered in the Parchment, and hence they are in sync.

It’s important to note that this mechanism covers only changes to the DOM that are initiated by the browser, e.g. when typing text or when text is pasted. Changing the selection and pressing the Enter or Delete key initiate events that are handled differently, by the Keyboard module. Quill calls update() when such events involve changes in the Parchment and/or DOM, typically by calling quill.update(), defined as follows in core/quill.js:

  update(source = Emitter.sources.USER) {
    let change = this.scroll.update(source);   // Will update selection before selection.update() does if text changes
    this.selection.update(source);
    return change;
  }

As this call is made without any mutations, the purpose of this call is to ensure that the Parchment is in sync with the DOM.

Looking at insertText()

As calling insertText() is equivalent to typing text manually, it’s worth looking at its simple implementation to get an idea how Quill processes input. This is contrary to the complicated handling of interactive input.

This function is defined in core/quill.js, and essentially calls the editor object’s insertText method, which is defined as follows in quill/core/editor.js:

  insertText(index, text, formats = {}) {
    text = text.replace(/\r\n/g, '\n').replace(/\r/g, '\n');
    this.scroll.insertAt(index, text);
    Object.keys(formats).forEach((format) => {
      this.scroll.formatAt(index, text.length, format, formats[format]);
    });
    return this.update(new Delta().retain(index).insert(text, clone(formats)));
  }

What this demonstrates is that insertAt() is called to insert text into the DOM and Parchment. formatAt() then adds formatting as required, once again affecting both DOM and Parchment.

But then a Delta object that represents this change is generated, and the update() method is called with it. Note that “this” refers to the editor object, so this.update() is the method of the Editor class, and not the Scroll class. This is important, because the Editor class’ implementation of update() is completely different: Unlike the Scroll class’ implementation, which updates the Parchment according to the DOM, the Editor class’ implementation updates the Delta by manipulating Delta ops only.

This maintains the parallel view of the document in the quill.editor.delta.ops array. If this isn’t done properly, the Delta structure that is then used to save the document won’t match what’s seen on the editor window. It’s actually quite remarkable that this works.

This update() method is for low-level use, as it updates the Delta view of the document only. To apply Delta operations to a document, the API’s updateContents() should be used.

Internals: How attributes in Delta are applied

The Delta format also has other purposes, and is important within Quill’s API, in particular for requesting certain changes in the document. This is however of less interest to those not writing or modifying Quill modules. Anyhow, this is a deep dive into the machinery that makes this happen.

The function that is used both by setContents() and updateContents() is applyDelta(), the latter defined in core/editor.js.

Aside from inserting text and objects, the attributes are applied. That’s done with this simple loop:

Object.keys(attributes).forEach((name) => {
 this.scroll.formatAt(index, length, name, attributes[name]);
});

Or simply put, Object.keys() is applied to the attributes object to fetch the attribute names, and then formatAt() is called for each. formatAt() is hence called in the order returned by JavaScript’s built-in Object.keys(), which is the order in which the properties were inserted. However, Quill is designed to organize the blot structure (and hence the DOM and HTML) in a canonical manner, no matter the order of formatting, so the ordering effectively doesn’t matter.
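
The ordering is easy to check with a throwaway object. The attribute values here are placeholders.

```javascript
// The properties come back in the order they were written into the object.
const attrs = { color: "red", bold: true, link: "https://example.com/" };
const order = Object.keys(attrs);
console.log(order); // [ 'color', 'bold', 'link' ]
```

JavaScript lists integer-like keys (such as "1" or "42") ahead of the others, but format names aren’t of that kind, so insertion order is what counts here.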

To complete this issue, I’ll just mention what @attributes equals when the loop above is executed: At this point, the @attributes object contains the updates that are required relative to the format that exists anyway in the current position. Or in more detail, this is done by first fetching them from the Delta op:

let attributes = op.attributes || {};

and if it’s a text op (i.e. the “insert” property is a string) , the current position’s format is calculated by querying the blots above and adjacent to the left for the format they contribute. This is done by recursively calling bubbleFormats() (defined in Quill’s blots/block.js), which calls the blots’ formats() function. @attributes is modified with

attributes = DeltaOp.attributes.diff(formats, attributes) || {};

The said diff() method is defined in quill-delta/lib/op.js, and it loops through all keys in both object arguments (concatenation of the keys() array of the first argument and the second, in this order), and returns the properties in the second argument that have different values from the first argument’s object. Properties that are present only in the first argument are returned with their value set to null.

So all in all, @attributes ends up with the changes needed to update the format to the one required in the Delta op: The value if it wasn’t defined at all or was different, and null if it should be removed.
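
The behavior of diff() as described above can be mimicked with a few lines. This is a simplified re-implementation for illustration only: diffAttributes() is a made-up name, it compares values with ===, so it suits primitive values only, and the real method may compare values differently.

```javascript
// Simplified sketch of the attribute diff described above.
function diffAttributes(current, wanted) {
  const keys = Object.keys(current).concat(Object.keys(wanted));
  const changes = {};
  for (const key of keys) {
    if (current[key] !== wanted[key]) {
      // Missing in the wanted set: null requests removing the format.
      changes[key] = wanted[key] === undefined ? null : wanted[key];
    }
  }
  return changes;
}

const currentFormats = { bold: true, color: "red" }; // format at the position
const opAttributes = { color: "blue", italic: true }; // from the Delta op
const needed = diffAttributes(currentFormats, opAttributes);
console.log(needed); // { bold: null, color: 'blue', italic: true }
```

Bold is removed, the color is changed and italic is added, which are exactly the formatAt() calls needed to get from the existing format to the requested one.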
