
Build markdown documents to HTML in the CI workflow #13432

Merged
tabatkins merged 1 commit into main from build-markdown on Feb 3, 2026

@sideshowbarker
Member

The legacy draft server converts .md files in spec directories (explainers, transition documents, proposals, etc.) to HTML. This adds that conversion to the GitHub Actions workflow so the files are served from GitHub Pages.

A new bin/build-markdown.py script finds all .md files one level deep in spec directories and converts them to HTML using Python-Markdown, with a minimal stylesheet for readability. The workflow installs the markdown package and runs the script after building specs.

Relates to #12054
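
A minimal sketch of the discover-and-wrap step described above. This is not the actual bin/build-markdown.py: the names `find_markdown`, `build_all`, `STYLE`, and `PAGE` are hypothetical, the stylesheet is a placeholder, and it assumes spec directories are the immediate children of the repository root. The Markdown converter is passed in as a callable so the sketch does not commit to one library.

```python
import html
from pathlib import Path
from typing import Callable, Iterator, List

# Placeholder stylesheet (assumption); the real script's CSS is not shown in this thread.
STYLE = "body { max-width: 50em; margin: 2em auto; font-family: sans-serif; }"

PAGE = """<!DOCTYPE html>
<html lang="en">
<meta charset="utf-8">
<title>{title}</title>
<style>{style}</style>
{body}
</html>
"""


def find_markdown(root: Path) -> Iterator[Path]:
    """Yield .md files exactly one directory below root (root/<spec>/<name>.md)."""
    for spec_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if spec_dir.name.startswith("."):
            continue  # skip .git, .github, and similar
        yield from sorted(spec_dir.glob("*.md"))


def build_all(root: Path, out: Path, convert: Callable[[str], str]) -> List[Path]:
    """Convert each discovered file with `convert` and write it under out/."""
    written = []
    for src in find_markdown(root):
        body = convert(src.read_text(encoding="utf-8"))
        dest = out / src.parent.name / (src.stem + ".html")
        dest.parent.mkdir(parents=True, exist_ok=True)
        page = PAGE.format(title=html.escape(src.stem), style=STYLE, body=body)
        dest.write_text(page, encoding="utf-8")
        written.append(dest)
    return written
```

With Python-Markdown installed, `convert` could be `lambda text: markdown.markdown(text, extensions=["fenced_code", "tables"])`.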

@frivoal
Collaborator

frivoal commented Feb 3, 2026

Thanks, this is the kind of thing we need.

I wonder if we have a dependency on any particular flavor of markdown. I don't know what the current server uses, but it'd be good to check if that's the same as what's proposed here or not. Another consideration is that quite a few of our markdown files start their life as documents served by GitHub, and might be using various features of GitHub flavored markdown (which may or may not be preserved by the current server). If so, and if we want to support that, maybe using a tool like https://github.com/joeyespo/grip could help.

Add a build step that converts .md files in spec directories
(explainers, transition documents, proposals, etc.) to HTML so
they're served from GitHub Pages. The legacy draft server that
previously handled this conversion is struggling with ~60% uptime.

Relates to #12054
@sideshowbarker
Member Author

@frivoal Short answer: I think what’s set up here now — using Python-Markdown — is the best choice and does everything y’all need, as far as supporting all markdown features in the sources here. So that’s what I’d recommend.

Longer answer:

> I wonder if we have a dependency on any particular flavor of markdown.

I think there are some markdown tables in some of the sources, and we need the fenced-code syntax-highlighting stuff. So for those, we need GitHub-flavored-markdown support. But the good news is: Python-Markdown supports those. See the bit of the change at https://github.com/w3c/csswg-drafts/pull/13432/files#diff-eeeae3642da5cedb2d38612b2e6193054b2171632b614a104842144be1b5f74eR31:

    md = markdown.Markdown(extensions=["fenced_code", "tables"])

> I don't know what the current server uses, but it'd be good to check if that's the same as what's proposed here or not. Another consideration is that quite a few of our markdown files start their life as documents served by GitHub, and might be using various features of GitHub flavored markdown (which may or may not be preserved by the current server). If so, and if we want to support that, maybe using a tool like joeyespo/grip could help.

I think the only other tool to possibly consider is cmark-gfm https://github.com/github/cmark-gfm. It’s a C program (though it has Python bindings we could use). I don’t know what the install story for it is — as far as us being able to add a cross-platform install step, to enable contributors to run it locally if they wanted to. I do know that there is a brew package for it. But regardless, if we were to use that, I think we’d need to have the docs explain how to install it.

For Python-Markdown in contrast, we don’t need to do that.

But anyway, as far as I can see, the only additional GFM feature that switching to cmark-gfm would buy us is: You could use the special > [!NOTE] etc GFM admonitions stuff, and get the pretty HTML rendering for that in the HTML output. Python-Markdown doesn’t (yet) support converting that to the same pretty rendering.

But none of the existing markdown sources here are using those admonitions. Yet. So there’d be no reason to choose cmark-gfm at this point, I mean, just to get support for a GFM feature that none of the sources are using.
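
One way to re-check that claim later is to scan the sources for alert markers. The sketch below is an illustration only: `find_alerts` is a hypothetical helper, the five alert types are the ones GitHub documents, and it assumes the marker must sit alone on a blockquote line. It does not skip fenced code blocks, so an explainer that merely shows the syntax as an example would be reported too.

```python
import re
from typing import List, Tuple

# The five alert types GitHub documents. Matching case-insensitively is an assumption.
ALERT = re.compile(
    r"^ {0,3}> ?\[!(NOTE|TIP|IMPORTANT|WARNING|CAUTION)\]\s*$",
    re.IGNORECASE,
)


def find_alerts(text: str) -> List[Tuple[int, str]]:
    """Return (line number, alert type) for each line that looks like an alert marker."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = ALERT.match(line)
        if match:
            hits.append((number, match.group(1).upper()))
    return hits
```

Running `find_alerts(path.read_text(encoding="utf-8"))` over each `.md` file and reporting non-empty results would show whether any source has started using alerts.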

That said, it would not be a big deal at all at some point later to update the build here to switch over to using cmark-gfm. I would be happy to set that up, if at some point later y’all decide you want to use those GFM admonitions.

Otherwise, all that said, as far as I can see, the choice really is just between Python-Markdown and cmark-gfm. There are no other alternatives that need to be considered.

@tabatkins
Member

I think we're fine. GH-flavored markdown's only real innovation is the NOTE/etc blocks, and those are done via an existing markup pattern that looks reasonable enough when rendered by a vanilla markdown renderer.

@tabatkins tabatkins merged commit 0a496e6 into main Feb 3, 2026
1 check passed
@sideshowbarker sideshowbarker deleted the build-markdown branch February 4, 2026 00:42
@sideshowbarker
Member Author

sideshowbarker commented Feb 4, 2026

Not to beat this into the ground, but for the record here: After looking into cmark-gfm further, I find it also doesn’t handle the GH > [!NOTE] etc admonitions stuff (which apparently are officially called “alerts”).

So there’d be zero point in using cmark-gfm rather than Python-Markdown.

But there is one thing that does handle those GFM “alerts”: The GH API https://api.github.com/markdown endpoint.

So, if at some point later y’all do find a need to use GFM features that Python-Markdown doesn’t support, then you could switch the build to using that GH API markdown endpoint. The main downside would be the obvious one: That it would require having a network connection / being online. There’s also the fact that the API is rate limited, and the fact that API requests are limited to source markdown of 400KB or less.

The rate limit and size limit would not be problems here in practice. But I reckon the requirement to have to be online could be a PITA. And it seems like there’d be no point in introducing an extra PITA for no real additional benefit.

So yeah, I’d imagine all y’all and your contributors would agree that relying on Python-Markdown seems fine.
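
For reference, a hedged sketch of what calling that endpoint could look like with only the standard library. `build_request` and `render` are hypothetical names, reading "400KB" as 400 KiB is an assumption, and so is defaulting `mode` to `"gfm"` (the API documents both `"markdown"` and `"gfm"`; which one renders alerts is not confirmed here).

```python
import json
import urllib.request
from typing import Optional

API_URL = "https://api.github.com/markdown"
MAX_BYTES = 400 * 1024  # assumption: "400KB" interpreted as 400 KiB


def build_request(text: str, token: Optional[str] = None,
                  mode: str = "gfm") -> urllib.request.Request:
    """Build the POST request locally; refuse sources over the size limit."""
    if len(text.encode("utf-8")) > MAX_BYTES:
        raise ValueError("markdown source exceeds the 400KB API limit")
    payload = json.dumps({"text": text, "mode": mode}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Authenticated requests get a higher rate limit.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(API_URL, data=payload, headers=headers, method="POST")


def render(text: str, token: Optional[str] = None, timeout: float = 30.0) -> str:
    """Send the request and return the HTML fragment (requires network access)."""
    with urllib.request.urlopen(build_request(text, token), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Keeping request construction separate from the network call means the size check can run offline, which leaves only `render` dependent on being online.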

@frivoal
Collaborator

frivoal commented Feb 4, 2026

> I think we're fine. GH-flavored markdown's only real innovation […]

I am more concerned about incompatibilities. Compare the rendering of this file:
GH: https://github.com/webplatformco/project-image-animation/blob/main/image-animation-property/README.md
python: https://drafts.csswg.org/css-image-animation-1/explainer

I don't think I'm using any unique feature, but the way things get escaped, new lines get processed, etc, is different, and running the GH file through python is just broken. I could probably fix by adjusting the file, but if this explainer is broken, it may not be the only one…
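
To find other explainers affected the same way, one could render each source with two converters and diff the output. A minimal sketch under that assumption: `render_diff` is a hypothetical helper, and the two renderers are passed in as callables (for example the Python-Markdown call from this PR and a CommonMark-based one).

```python
import difflib
from typing import Callable, List


def render_diff(source: str,
                render_a: Callable[[str], str],
                render_b: Callable[[str], str],
                label_a: str = "a",
                label_b: str = "b") -> List[str]:
    """Return unified-diff lines between two renderings; empty if they agree."""
    lines_a = render_a(source).splitlines()
    lines_b = render_b(source).splitlines()
    return list(difflib.unified_diff(lines_a, lines_b,
                                     fromfile=label_a, tofile=label_b, lineterm=""))
```

An empty result means the two renderers agree on that file. A non-empty one flags a file worth checking by eye, though harmless whitespace differences will show up as well.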

@plinss
Member

plinss commented Feb 4, 2026

The legacy server used CommonMark

@sideshowbarker
Member Author

> > I think we're fine. GH-flavored markdown's only real innovation […]
>
> I am more concerned about incompatibilities. Compare the rendering of this file:
> GH: webplatformco/project-image-animation@main/image-animation-property/README.md
> python: drafts.csswg.org/css-image-animation-1/explainer

Yipes yeah that’s awful

> The legacy server used CommonMark

OK see #13448.

https://github.com/readthedocs/commonmark.py appears to be archived and no longer maintained. So, switching to the GitHub-supported CommonMark successor would seem to be the best choice.

@frivoal
Collaborator

frivoal commented Feb 6, 2026

#13448 made things look good now. Thanks.
