Quill internals: formatAt() in detail, and nesting of DOM tags

This post was written by eli on December 8, 2021
Posted Under: JavaScript,Rich text editors

Everything in this post relates to Quill v1.3.7. I don’t do web tech for a living, and my knowledge on Quill is merely based upon reading its sources. Besides, I eventually picked another editor for my own use..

Introduction

This post is a collection of findings I made while trying to figure out the following: Consider this simple HTML code

None enabled, <em><strong>bold and italic, </strong>only italic.</em>

which is visually formatted as follows:

None enabled, bold and italic, only italic.

Note that there’s a part starting with bold and italic, and then the bold is taken off, leaving italic only. Oddly enough, when pasting the formatted line above into Quill, the resulting innerHTML is

<p>None enabled, <strong><em>bold and italic, </em></strong><em>only italic.</em></p>

Because Quill chose to put the <strong> tag first and the <em> afterwards, it was forced to insert an </em> to allow for inserting </strong>, and then re-enable italic with <em>.

This is suboptimal, however harmless with these tags, but recall that <a> tags for links are also implemented as Inline blot extensions, so what happens if the text of a link is partly bold? Could Quill create two separate links, one for the part in bold, and one for the part that isn’t, because of the same issue shown above with nested tags? The short answer is of course no, and I’ll just explain how Quill is designed to ensure this won’t happen.

But this is an opportunity to point out, that if a link spans several Block blots (for example, several paragraphs), Quill will create a different <a>-</a> tag pair inside each paragraph. Just try it out: Select several paragraphs in the editor, and assign that text a link, and then look at the exported HTML or the DOM with the browser’s DOM viewer.

Frankly speaking, I’m not sure if it’s legal to have a link spanning across paragraphs (something like <a href=”/”> <p> This </p> <p> That </p> </a>) but it’s an uncommon use case, so who cares.

What’s important is that changing the formatting inside a small text segment won’t turn it into two links, and that doesn’t happen. The magic behind this is explained next.

DOM tree ordering: The problem

First, let’s understand the problem. Suppose there’s a piece of text in the document saying Hello: Five characters, already italics. HTML-wise, it’s <em>Hello</em>. From a blot perspective, it’s an Italic (Inline) blot with one child, a Textblot, containing the “Hello” string.

Now I mark all five characters and press the bold button on the editor. This results in a call to formatAt() with the index and length of the selected text. Say, something like

quill.scroll.formatAt(105, 5, 'bold', 'true');

I should mention that formatAt() is an internal function, and it doesn’t update the Delta representation of the document. The recommended API call is format() and friends.

A call to formatAt() results in a whole lot of activity, but the punchline is the creation of a new Bold blot and the call to wrap() (defined in the ShadowBlot class), which puts a Bold blot at the position of the blot that it’s formatting, and makes the latter the child of the former. From an HTML point of view, the <strong>-</strong> wrap everything inbetween, hence the method’s name.

I was deliberately ambiguous regarding what exactly is being wrapped, because that’s the big question: The formatAt() call relates to positions in the document, but format blots themselves contribute zero length. Therefore, that range with length 5 could mean the text only, or the text wrapped with italic tags. In other words, the wrap() call could be applied to the Textblot directly or to the Italic blot, and both would satisfy the request.

To ensure consistent behavior, there’s this little snippet in Quill’s blots/inline.js:

// Lower index means deeper in the DOM tree, since not found (-1) is for embeds
Inline.order = [
  'cursor', 'inline',   // Must be lower
  'underline', 'strike', 'italic', 'bold', 'script',
  'link', 'code'        // Must be higher
];

So the @order array of the Inline class defines which formatting blot become the parent of which, which is the same as saying which DOM tag become the parent of which. Or which HTML tags wrap which.

As the comment says, the elements appearing later in this list get a higher place in the DOM hierarchy. As one would expect, “link” appears almost last, so other formatting tags are pushed between the <a>-</a> tag and not vice versa. So as one would expect, formatting inside link text never generates multiple links. Only a “code” blot can do that.

As for blot classes that aren’t listed, they get lowest down in the DOM (as if they were before “cursor”), and if two unknown blot classes compete, it’s as if they were listed in this array in alphabetical order.

The important takeaway is that if you’re writing a blot that extends Inline (even indirectly), you may want to add it to the list somewhere towards the end, if having it divided into segments is an issue (like with the Link blot).

This is done more or less like this — with emphasis on the part marked in red.

var Inline = Quill.import('blots/inline');

class MyBlot extends Inline {
  [ ... ]
}
MyBlot.blotName = 'myblot';
MyBlot.tagName = 'span';
MyBlot.className = 'myclass';

Inline.order.push(MyBlot.blotName);

It would of course have been nicer to have an official API for this rather than hacking an internal variable, but I can’t see a more robust way for this.

How ordering is implemented

If you just wanted the bottom line, there’s no point reading further in this post. What follows is the partial explanation to how I reached the conclusion above. For this section, I’ll stick with Quill’s blots/inline.js unless said otherwise.

First, this is the compare() utility function:

  static compare(self, other) {
    let selfIndex = Inline.order.indexOf(self);
    let otherIndex = Inline.order.indexOf(other);
    if (selfIndex >= 0 || otherIndex >= 0) {
      return selfIndex - otherIndex;
    } else if (self === other) {
      return 0;
    } else if (self < other) {
      return -1;
    } else {
      return 1;
    }
  }

It returns a negative number if its first argument’s name appears before the second in Inline.order and vice versa. That’s the main thing about it. Since indexOf() returns -1 if the string isn’t found, a name not found is considered as if it was before the first element in the array.

And now to Inline’s formatAt():

  formatAt(index, length, name, value) {
    if (Inline.compare(this.statics.blotName, name) < 0 && Parchment.query(name, Parchment.Scope.BLOT)) {
      let blot = this.isolate(index, length);
      if (value) {
        blot.wrap(name, value);
      }
    } else {
      super.formatAt(index, length, name, value);
    }
  }

Before getting into the details, I’ll explain this briefly: The call to compare() checks the position of the desired blot name (the format to add) vs. the name of the current blot (the one of the object that the method is called with). If the current blot appears before in Inline.order, it’s shred into pieces if necessary, and the piece that corresponds to desired segment (index, length) is pushed down as the child of a new blot that is created for the formatting.

An additional condition for this to happen is that the name of the formatting corresponds to a blot class (this is the Parchment.query() part).

Otherwise, the current blot remains intact, and formatAt() call is applied to all its children. This is the super.formatAt() part.

So now the details. It’s a long journey.

The super.formatAt() part is easier to explain. It falls on the method defined in the ContainerBlot class (see parchment/src/blot/abstract/container.ts):

  formatAt(index: number, length: number, name: string, value: any): void {
    this.children.forEachAt(index, length, function(child, offset, length) {
      child.formatAt(offset, length, name, value);
    });

forEachAt() (and similar methods) is implemented in parchment/src/collection/linked-list.ts. As its name implies, it calls a function on all blots within a range (index, length) with proper index and length for each call.

As for the shredding part with isolate() and wrap(): I’m going into these in detail below, but I’ll give the spoiler already:

  • isolate() returns a blot that consists of the segment given by index and length, relative to the blot it’s called on. If that blots needs to be chopped up into siblings (including chopping up the subtree), so be it.
  • wrap() creates a new blot based upon the name and value argument, and pushes the object on which it was called as the new blot’s child. Which is the equivalent of wrapping the HTML segment with the new blot’s tags.

The isolate() method

isolate() is defined in parchment/src/blot/abstract/shadow.ts:

  isolate(index: number, length: number): Blot {
    let target = this.split(index);
    target.split(length);
    return target;
  }

This method quite simply chops up the blot into pieces as necessary to obtain a blot that covers exactly the desired segment, as required by the index and length arguments.

To get an idea how this work, consider the split() method for plain text blots, as defined in parchment/src/blot/text.ts:

  split(index: number, force: boolean = false): Blot {
    if (!force) {
      if (index === 0) return this;
      if (index === this.length()) return this.next;
    }
    let after = Registry.create(this.domNode.splitText(index));
    this.parent.insertBefore(after, this.next);
    this.text = this.statics.value(this.domNode);
    return after;
  }

This method divides the blot into two at the index (relative to the blots beginning, of course) and returns the second (new) blot. And it does nothing in particular if the split point is between blots (index at the beginning or after segment).

This snippet concisely shows how a text node is split with the DOM API’s splitText, and how the new blot node is created and inserted, but that’s a different story.

When the blot for splitting is a Bold, Italic or some other Inline class, ContainerBlot’s split() method does the work instead (see parchment/src/blot/abstract/container.ts):

  split(index: number, force: boolean = false): Blot {
    if (!force) {
      if (index === 0) return this;
      if (index === this.length()) return this.next;
    }
    let after = <ContainerBlot>this.clone();
    this.parent.insertBefore(after, this.next);
    this.children.forEachAt(index, this.length(), function(child, offset, length) {
      child = child.split(offset, force);
      after.appendChild(child);
    });
    return after;
  }

So as one would expect, this method clones the blot in question, and then calls the split() method on all children that are in the range from the split point (i.e. @index) to the end of the what the blot covers. Effectively, this split() call changes only the first child in the list. The second command in the loop moves the child to the cloned blot object. So all in all, this clones the blot for which the method is called, and divides the child blots as appropriate, including splitting a child blot as necessary.

Due to the recursive nature of this method, child blots are split further down as necessary. As one would expect.

The wrap() method

So first a word about class hierarchy: FormatBlot extends ContainerBlot, which extends ShadowBlot.

The wrap() in FormatBlot (see parchment/src/blot/abstract/format.ts) reads:

  wrap(name: string | Parent, value?: any): Parent {
    let wrapper = super.wrap(name, value);
    if (wrapper instanceof FormatBlot && wrapper.statics.scope === this.statics.scope) {
      this.attributes.move(wrapper);
    }
    return wrapper;
  }

so there’s nothing interesting here: Aside from doing things with attributes, it just calls super.wrap(). None is defined in ContainerBlot, so that leaves us with ShadowBlot (parchment/src/blot/abstract/shadow.ts):

  wrap(name: string | Parent, value?: any): Parent {
    let wrapper = typeof name === 'string' ? <Parent>Registry.create(name, value) : name;
    if (this.parent != null) {
      this.parent.insertBefore(wrapper, this.next);
    }
    wrapper.appendChild(this);
    return wrapper;
  }

So pretty much as mentioned above: A new blot object is created with the name and value given, and inserted in place of the object for which the method was called (with insertBefore) and then the latter object becomes the child of the new blot object.

formatAt() for non-ContainerBlot descendants

This isn’t related related, but not all classes are derived from ContainerBlot, which catches formatAt() and applies it to all children. So what happens otherwise?

The generic formatAt() method (as well as several other manipulation methods) is defined in parchment/src/blot/abstract/shadow.ts:

  formatAt(index: number, length: number, name: string, value: any): void {
    let blot = this.isolate(index, length);
    if (Registry.query(name, Registry.Scope.BLOT) != null && value) {
      blot.wrap(name, value);
    } else if (Registry.query(name, Registry.Scope.ATTRIBUTE) != null) {
      let parent = <Parent & Formattable>Registry.create(this.statics.scope);
      blot.wrap(parent);
      parent.format(name, value);
    }
  }

So when the name corresponds to a blot class (as opposed to an attribute), formatAt() calls isolate() on the current object, and then calls wrap() on the object that isolate() returned. Just like the shredding option in Inline’s formatAt(). Actually, it was probably Inline’s formatAt() method that was copied from here, but never mind that.

And if you’re as far as this, it’s probably a good idea to start over to remind yourself what this post is about.

Reader Comments

“Besides, I eventually picked another editor for my own use..”

Which one did you pick?

#1 
Written By Gary Momenee on May 30th, 2022 @ 00:42

It was TinyMCE. Six months later, and lots of writing with that editor, I don’t regret it.

#2 
Written By eli on May 30th, 2022 @ 03:06

Add a Comment

required, use real name
required, will not be published
optional, your blog address