@gordonbrander gordonbrander commented Jul 9, 2024

This PR adds a new html function for tagged template literal strings.

TODO

  • html template literal tag function
  • Handle attr="", .prop="", and @event=""
  • Render string via template element, cache, and clone
  • Sanitize rendered templates
  • Bind primitive values
  • Bind reactive values
  • Support Sendable values as event handlers
  • Efficient reactive list diffing
  • Tests
  • DOM tests
  • Integrate with common-ui
    • Create mechanism that automatically registers common-ui tags with an allow list.
  • Integrate with common-frp
  • Integrate with propagators

Overview

import { html, render } from "@commontools/common-html";

const text = "Hello world!"; // Substitute primitive values
const click = () => console.log("Clicked!"); // Bind events
const hidden = state(false); // Bind reactive values

const template = html`
<button class="button" .hidden=${hidden} @click=${click}>
    ${text}
</button>
`;

const dom = render(template);

This template tag looks a lot like lit-html, but it comes with a few new tricks:

  • Rendered DOM is reactive.
    • Any value with a .sink<T>((value: T) => void): Cancel method is treated as a reactive value
    • Reactive values are bound to the template and the template will be efficiently re-rendered with a fine-grained reactive update whenever they change
  • Elements, props, attrs, and events are all sanitizable via allow lists or deny lists using setSanitizer
  • Templates can render nested templates, including reactive templates. E.g. if the substitution is another template object, or a template object within a reactive container, that template is recursively rendered into the DOM, and updated when the reactive value is updated.

Template objects are plain JavaScript objects and can be serialized and persisted.

How to use it

Set attributes, .properties, or @events, just like with Lit:

const classes = 'container';
const hidden = false;
const onclick = (event: Event) => console.log(event);
const text = "Hello world";

const template = html`
<div class="${classes}" .hidden=${hidden} @click=${onclick}>
    ${text}
</div>
`;

Set reactive attributes and .properties. Any type with a .sink<T>((value: T) => void): Cancel method will work.

const hidden = state(false);
const text = state("Hello world");

const template = html`
<div .hidden=${hidden}>
    ${text}
</div>
`;

const dom = render(template);

text.send("Hello Reactivity!");
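The examples above only require that a reactive value expose a `.sink()` method returning a cancel function. As a rough sketch of what such a container might look like (the `state` helper below is a hypothetical stand-in written for illustration, not the package's implementation), assuming the subscriber is called once immediately with the current value:

```typescript
// Hypothetical sketch: names mirror the PR's usage, but this is an
// illustration only, not the code shipped in the package.
type Cancel = () => void;

type Reactive<T> = {
  sink(subscriber: (value: T) => void): Cancel;
};

type State<T> = Reactive<T> & {
  send(value: T): void;
};

const state = <T>(initial: T): State<T> => {
  let current = initial;
  let subscribers: Array<(value: T) => void> = [];
  return {
    sink: (subscriber) => {
      subscribers = subscribers.concat([subscriber]);
      // Call once immediately so a late subscriber still sees the current value
      subscriber(current);
      // Returned function removes this subscriber again
      return () => {
        subscribers = subscribers.filter((s) => s !== subscriber);
      };
    },
    send: (value) => {
      current = value;
      subscribers.forEach((subscriber) => subscriber(value));
    },
  };
};

const label = state("Hello world");
const seen: string[] = [];
const stop = label.sink((value) => {
  seen.push(value);
});
label.send("Hello Reactivity!");
stop();
label.send("Ignored after cancel");
// seen is now ["Hello world", "Hello Reactivity!"]
```

Anything with this shape (a cell, a signal, a propagator wrapper) should be bindable in a template under the duck-typing rule described above.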

You can easily create template utility functions:

const Article = ({
  title,
  content
}: {
  title: string,
  content: string
}) => html`
<article class="article">
    <h1 class="title">${title}</h1>
    <div class="content">${content}</div>
</article>
`;

Templates can be nested:

const Title = (text: string) => html`<h1 class="title">${text}</h1>`;

const Content = (text: string) => html`<div class="content">${text}</div>`;

const Article = ({
  title,
  content
}: {
  title: string,
  content: string
}) => html`
<article class="article">
    ${Title(title)}
    ${Content(content)}
</article>
`;

Nested templates can be reactive:

// Put template object inside a reactive state container
// Anything with a  `.sink<T>((value: T) => void): Cancel` method will work
const date = state(html`<h1 class="time">${new Date()}</h1>`);

setInterval(() => {
  date.send(html`<h1 class="time">${new Date()}</h1>`);
}, 1000);

// This works
const container = html`
<div class="container">
    ${date}
</div>
`;

const dom = render(container); // Date is reactive

How it works

Under the hood, the implementation is very similar to lit-html, but with a few alterations to support reactivity and deep sanitization.

const template = html`
<button class="button" .hidden=${hidden} @click=${click}>
    ${text}
</button>
`;

html just gathers the template strings array and the substitutions array into a template object. The object returned from html is a plain, frozen JavaScript object of the shape { template, context }:

export type TemplateContext = Readonly<Array<unknown>>;

/**
 * A template object is an array of strings and an array of substitutions.
 * Typically created via the `html` template literal tagging function.
 */
export type Template = {
  template: Readonly<Array<string>>;
  context: TemplateContext;
};

The template object is a plain JavaScript object, and should be safe to serialize and persist (assuming the values in the context array are themselves serializable).
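To make the "just gathers" step concrete, here is a minimal sketch of a tag function that produces an object of this shape. It assumes nothing beyond the `Template` type above; the body is an illustration, not the PR's actual code.

```typescript
type TemplateContext = Readonly<Array<unknown>>;

type Template = {
  template: Readonly<Array<string>>;
  context: TemplateContext;
};

// Sketch of a tag function: copy the strings, keep the substitutions,
// and freeze the result so it behaves like a plain immutable value.
const html = (
  strings: TemplateStringsArray,
  ...values: Array<unknown>
): Template =>
  Object.freeze({
    template: Object.freeze(strings.slice()),
    context: Object.freeze(values),
  });

const greeting = html`<p class="${"note"}">${"Hi"}</p>`;
// greeting.template is ['<p class="', '">', '</p>']
// greeting.context is ["note", "Hi"]
```

Because the result holds only arrays of strings and values, `JSON.stringify(greeting)` works whenever the substitutions are themselves serializable.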

No actual work happens until you render():

const dom = render(template);

At this point, render takes the template and renders it to DOM using the following approach:

  • The template strings array is "flattened" into a single string
    • Placeholder strings are placed in the template "holes"
      • They look like this: #hole12345678-0#
      • They have a random seed to prevent faking a hole
      • The last number after the dash is the index of the hole within the scope of the template
    • The template is joined into a single string
  • A DOM node is efficiently cloned from the string
    • A template element is created, the template string is assigned to its innerHTML
    • That template is cached in a Map for re-use
    • The template's .content is cloned and the firstElementChild returned from the DocumentFragment
    • This gets us fast and error-tolerant HTML parsing by leveraging the browser's built-in mechanisms.
      • Note: elements cloned from template elements DO NOT execute scripts or script attributes UNTIL you append the element to a document. This means we CAN sanitize and prune the fragment before it is used.
  • We bind template substitutions to the DOM node, as well as prune and sanitize the node
    • We walk the cloned DOM with TreeWalker, visiting Element and Text nodes, and iterating over attributes.
    • Iterating over attributes:
      • At this point, attributes with template holes look something like attr="#hole12345678-0#", properties are placeholder attributes with a leading dot, like .prop="#hole12345678-0#", events are placeholder attributes with a leading at, like @event="#hole12345678-0#".
      • We remove the placeholder attribute
      • If attributes, properties, or events are disallowed, we prune them.
      • Otherwise, we match against the placeholder, extracting the corresponding index
      • If there is no placeholder (e.g. it's a static prop/attr), we set the value
      • If there is a placeholder, we get the corresponding replacement
        • If it's a primitive value, we set it
        • If it's a reactive value, we bind it to update that part of the DOM
    • Iterating over elements and text nodes:
      • If element is disallowed, we prune it
      • We replace the placeholder text node with a comment that acts as an anchor in the DOM
      • If the replacement is a primitive value we insert it as a text node
      • If the replacement is a nested template, we recursively render() it, and insert the DOM node
      • If the replacement is a reactive value (with a .sink() method), we bind updates so that the child will be replaced any time the reactive value changes.
        • Nested templates can also be reactive values.
  • Done! We return the element
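The first step above, flattening the strings array into a single string with salted placeholders, can be sketched without a DOM. The helper names `holePlaceholder` and `flatten` are hypothetical, and the salt is passed in explicitly here so the output is predictable:

```typescript
// Hypothetical helper: placeholder text for the hole at `index`
const holePlaceholder = (salt: string, index: number): string =>
  `#hole${salt}-${index}#`;

// Join the string parts, inserting one placeholder between each pair.
// The number of holes is derived from the parts, not from the context,
// so a short or long context array does not change the flattened string.
const flatten = (parts: ReadonlyArray<string>, salt: string): string => {
  let result = "";
  parts.forEach((part, index) => {
    if (index > 0) {
      result += holePlaceholder(salt, index - 1);
    }
    result += part;
  });
  return result;
};

const flattened = flatten(['<p class="', '">', "</p>"], "12345678");
// flattened is '<p class="#hole12345678-0#">#hole12345678-1#</p>'
```

The resulting string is what would be assigned to a template element's innerHTML and cached; the clone, walk, and bind steps need a real DOM and are not shown.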

Template grammar and render behavior

Template objects

Template objects are constructed via the html tagged template literal and consist of:

export type TemplateContext = Readonly<Array<unknown>>;

/**
 * A template object is an array of strings and an array of substitutions.
 * Typically created via the `html` template literal tagging function.
 */
export type Template = {
  template: Readonly<Array<string>>;
  context: TemplateContext;
};

Render behavior

  • template is an array of strings, where the breaks between array items represent "holes" in the template that will be replaced with template substitutions
  • context is an array of values to be placed in holes.
    • Its length is expected to be template.length - 1. However, the renderer is robust to missing or extra template context values.
  • Template context values are type unknown (any), and are disambiguated at runtime.
    • For attributes
      • If the key starts with on, the attribute is rejected and a warning is logged.
      • If the attribute is disallowed by the sanitizer, it is rejected and a debug message is logged.
      • If the value is reactive, the renderer binds the value reactively
        • Value updates are coerced to type string and set
      • If the value is anything else, it is coerced to string and set
    • For .properties
      • If the cleaned key (the name without the leading .) starts with on, the property is rejected and a warning is logged.
      • If the property is disallowed by the sanitizer, it is rejected and a debug message is logged.
      • If the value is reactive, the renderer binds the value reactively
        • Value updates are set directly on the corresponding DOM property, and the DOM is allowed to handle the unknown type
      • If the value is anything else, it is set on the corresponding DOM property, and the DOM is allowed to handle the unknown type
    • For @events
      • If the event is disallowed by the sanitizer, it is rejected and a debug message is logged.
      • If the value is a function, it is bound to the corresponding event
      • All other values are ignored, and a warning is logged
    • For content
      • If the value is another template, it is recursively rendered, and inserted at that location within the DOM
      • If the value is reactive, and contains a template, reactive updates are recursively rendered, and automatically updated at the corresponding location within the DOM
      • If the value is reactive, the renderer binds the value to update reactively
        • Value is coerced to string and rendered at that location within the DOM
      • Any other value type is coerced to string and rendered at that location in the DOM.
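The runtime disambiguation of content values described above could look roughly like the following. The guard and function names (`isReactive`, `isTemplate`, `classifyContent`) are assumptions made for this sketch; only the order of the checks is taken from the list.

```typescript
type Cancel = () => void;

type Reactive<T> = {
  sink(subscriber: (value: T) => void): Cancel;
};

type Template = {
  template: Readonly<Array<string>>;
  context: Readonly<Array<unknown>>;
};

const isObject = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null;

// Duck-typing: anything with a callable `.sink` counts as reactive
const isReactive = (value: unknown): value is Reactive<unknown> =>
  isObject(value) && typeof value.sink === "function";

// A template is a plain object with `template` and `context` arrays
const isTemplate = (value: unknown): value is Template =>
  isObject(value) &&
  Array.isArray(value.template) &&
  Array.isArray(value.context);

type ContentKind = "template" | "reactive" | "text";

// Decide how a substitution in content position would be rendered
const classifyContent = (value: unknown): ContentKind => {
  if (isTemplate(value)) return "template"; // render recursively
  if (isReactive(value)) return "reactive"; // bind and update on change
  return "text"; // coerce to string
};
```

For example, `classifyContent(42)` yields "text", while an object with a `sink` function yields "reactive". A reactive value that emits templates would be classified as "reactive" here, with each emitted value classified again when it arrives.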

Template string parsing

We rely on the browser to parse template strings via the <template> tag. Once the template part array is flattened to a string with placeholder values, the formal parsing behavior for template strings can be considered to be the same as the HTML spec for parsing documents, with some minor variations on top:

  • Attributes:
    • Follow normal attribute parsing behavior
      • Exception: disallow any attributes beginning with on (event attributes are not allowed)
  • Properties
    • Example: .hidden=${true}
    • Start with a leading ., but otherwise follow normal attribute name grammar
      • Exception: disallow any properties beginning with on (event properties are not allowed)
    • "Holes" in the property value position will bind the replacement to the corresponding DOM element property.
      • This allows efficient assignment of primitive values to elements without coercion to string
  • Events
    • Example: @click=${handle}
    • Start with leading @, but otherwise follow normal attribute name grammar
    • "Holes" in the event value position will bind the replacement to the corresponding event.
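As an illustration of the name grammar above, the prefix rules can be sketched as a small classifier. `Binding` and `classifyName` are hypothetical names; the sanitizer allow/deny check is a separate step and is left out.

```typescript
type Binding = {
  kind: "attr" | "prop" | "event" | "rejected";
  name: string;
};

// Classify a raw attribute name as written in the template string
const classifyName = (raw: string): Binding => {
  const first = raw.charAt(0);
  // @click -> event named "click"
  if (first === "@") {
    return { kind: "event", name: raw.slice(1) };
  }
  // .hidden -> property named "hidden"; anything else is a plain attribute
  const isProp = first === ".";
  const name = isProp ? raw.slice(1) : raw;
  // Names beginning with "on" are rejected for both attributes and properties
  if (name.indexOf("on") === 0) {
    return { kind: "rejected", name };
  }
  return { kind: isProp ? "prop" : "attr", name };
};
```

With this sketch, `classifyName(".hidden")` gives a "prop" binding for `hidden`, `classifyName("@click")` gives an "event" binding for `click`, and both `classifyName("onclick")` and `classifyName(".onclick")` are rejected.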

Contributor

this appears unused by the rest (which i think is good, since we might use a different implementation in other contexts).

hence this question might be unrelated: sink here calls the callback with the initial state, but i don't think updates did. which is the proper semantics assumed by your code?

Contributor Author

@gordonbrander gordonbrander Jul 10, 2024

Q1 This is just a simple reactive value container without scheduling that I added to the package for test mocking purposes. It’s not intended for use outside of tests.

Q2 Regarding behavior, it’s typical for reactive data containers like cells and signals to call the subscriber once immediately with the current value. This ensures that the subscriber doesn’t have to time its subscription correctly to avoid missing the first value.

Our signals implementation did this, and I would recommend our propagators implementation do the same.

Note that event streams (e.g. observables) don’t have this “call immediately” behavior because they don’t have state. They are just streams of events over time.

Contributor

Ah, then this is a difference with updates: That one doesn't send values, it accepts Sendable<void>.

That's actually useful for driving the scheduling, which when ultimately invoking the function will sample the value. So we might want both on cells, for different purposes.

};

/** Generates a random salt: a string of 9 decimal digits */
const randomSalt = () => Math.random().toFixed(9).slice(2);

Contributor

Is this for security reasons or just to reduce the chance of accidental collisions?

Since the context is created with and associated with the template, is there really something disallowed code could pull off by guessing the numbers?

In any case, the template's hash can't be inserted into the template, so that would be a safe choice.

(Mostly trying to understand)

Contributor Author

My rationale here basically boils down to trying to reduce the possibility of failure.

Before we were using code to construct markup, and could enforce invariants through that code so that it was impossible to construct broken markup. Now that we're parsing a string (or rather allowing the browser to parse it), we have to be a bit clever. We walk over the resulting markup, and so can guarantee it is correct on the way out. However, the template holes are one place where it might be possible to break the render by faking a hole placeholder.

The random salt prevents collision (this is also what Lit does under the hood). I'm not sure if it is a hard security requirement to do better than this. If it is, I like the idea of using the hash, as you mentioned.
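To illustrate the point about faked holes, here is a small sketch of how a salted placeholder might be matched. `holePattern` and `matchHole` are hypothetical names, and the sketch assumes the salt consists only of digits, so it needs no regex escaping.

```typescript
// Build a pattern that only matches placeholders carrying this render's salt
const holePattern = (salt: string): RegExp =>
  new RegExp(`^#hole${salt}-(\\d+)#$`);

// Return the hole index, or null when the text is not a genuine placeholder
const matchHole = (value: string, salt: string): number | null => {
  const match = holePattern(salt).exec(value.trim());
  return match ? Number(match[1]) : null;
};

const genuine = matchHole("#hole12345678-3#", "12345678");
// genuine is 3
const forged = matchHole("#hole00000000-3#", "12345678");
// forged is null, because the salt does not match
```

Markup that happens to contain placeholder-like text is treated as ordinary content unless it carries the current salt. As noted above, this reduces accidental collisions rather than offering a hard security guarantee.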

@gordonbrander
Contributor Author

Closing in favor of #117.

#117 is already at and past the stage that this PR was in, and has resolved the issues uncovered in tests with the approach tried here.
