
Improve integration tests (stability + performance) #15125

Merged: 12 commits into `next` on Dec 12, 2024

Conversation

RobinMalfait
Member

@RobinMalfait RobinMalfait commented Nov 22, 2024

This PR improves the integration tests in two ways:

  1. Make the integration tests more reliable and thus less flaky
  2. Make the integration tests faster (by introducing concurrency)

I tried a lot of different things to make sure that these tests are fast and stable.


The biggest issue we noticed is that some tests are flaky. These are the tests with long-running dev-mode processes, where watchers are used and/or dev servers are created.
To solve this, every test that spawns a process now looks at stdout/stderr and waits for a message from the process, so we know when we can start making changes.

For example, in the case of an Astro project you get a `watching for file changes` message. In the case of a Nuxt project you can wait for a `server warmed up in` message, and in the case of Next.js there is a `Ready in` message.

These messages depend on the tools being used, so they are hardcoded per test instead of relying on a magical automatic solution.

These messages allow us to wait until all the necessary initial work is done and the internal watchers and/or dev servers are set up, before we start making changes to files or requesting CSS stylesheets from servers that aren't ready yet.
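Roughly, the pattern looks like this (a minimal sketch using the `spawn` / `onStdout` helpers from the test harness, which are shown in the diff excerpts further down; the ready messages are the tool-specific ones mentioned above):

// Sketch: block until the tool reports that it is ready.
let process = await spawn('pnpm astro dev')

// Astro prints "watching for file changes" once its watcher is up; for
// Next.js we would wait for "Ready in", for Nuxt for "server warmed up in".
await process.onStdout((message) => message.includes('watching for file changes'))

// Only after this resolves do we start editing files or requesting CSS.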


Another improvement is how we set up the dev servers. Before, we used to look for a free port on the system and pass it via a `--port` flag or a `PORT` environment variable. Instead of doing this (which is slow), we rely on the process itself to print a URL with a port. Basically all tools will try to find a free port if the default port is in use. We can then use the stdout/stderr messages to get the URL and the port to use.

To reduce the amount of potential port conflicts, we used to run every test and every file sequentially, to basically guarantee that ports are free. With this new approach, where we rely on the process, I noticed that we don't really run into this issue anymore (I reran the tests multiple times and they were always stable).
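Condensed, the pattern looks like this (taken almost verbatim from one of the diff excerpts further down; the exact regex and whether we read stdout or stderr differ per tool):

// Sketch: read the dev server URL (and thus the port) from the process
// output instead of allocating a port ourselves. Vite, for example,
// prints a line like "Local: http://localhost:5173/".
let url = ''
await process.onStdout((m) => {
  let match = /Local:\s*(http.*)/.exec(m)
  if (match) url = match[1]
  return Boolean(url)
})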

Note: these tests run on Linux, Windows, and macOS in this branch just for testing purposes. Once this is done, we will only run Linux tests on PRs and run all three on the `next` branch.

We do make the tests concurrent by default now, which in theory means that there could be port conflicts (which in practice just means that a process has to try a few more times to find a free port). To reduce these conflicts, we split up the integration tests such that the Vite, PostCSS, CLI, … tests all run in a separate job in the GitHub Actions workflow.

[screenshot: integration tests split into separate CI jobs]

Comparing this branch against the `next` branch, this is what CI looks like right now:

[screenshots: CI runs on `next` vs. `feat/improve-integration-tests`]

At one point I also introduced sequential tests, such that all spawned processes would still run one after the other. But since I didn't run into issues keeping them concurrent, I dropped that code.

Some small changes I made to make things more reliable:

  1. When relying on stdout/stderr messages, we split the output into lines on `\n` and strip all ANSI escapes, which means we don't have to worry about special ANSI characters when looking for the URL or for a specific message to wait for.
  2. Once a test is done, we `child.kill()` the spawned process. If that doesn't work, for whatever reason, we run `child.kill('SIGKILL')` to force-kill the process. This could technically leave some memory or files not cleaned up properly, but once CI is done, everything is thrown away anyway. (A rough sketch of points 1 and 2 follows below.)
  3. As you can see in the screenshots, I used some nicer names for the workflows.
[screenshots: workflow names on `next` vs. `feat/improve-integration-tests`]

They also look a bit nicer in the PR overview as well:

[screenshot: PR checks overview]
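For reference, here is a rough sketch of points 1 and 2 from the list above (the ANSI regex and the exact cleanup flow are illustrative assumptions, not the literal implementation in this PR):

import type { ChildProcess } from 'node:child_process'

// (1) Split a chunk of output into lines and strip ANSI escape sequences,
// so that message matching and URL parsing don't trip over color codes.
// Note: this regex only covers CSI sequences (colors, cursor movement);
// a real implementation may need to handle more escape types.
function stripAnsi(input: string): string {
  return input.replace(/\x1B\[[0-9;]*[A-Za-z]/g, '')
}

function toLines(chunk: string): string[] {
  return chunk.split('\n').map(stripAnsi)
}

// (2) Kill the spawned process once a test is done. `child.kill()` returns
// false when the signal could not be sent, in which case we force-kill.
function cleanup(child: ChildProcess) {
  if (!child.kill()) {
    child.kill('SIGKILL')
  }
}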

The very last commit just filters out the Windows and macOS tests again for PRs (but they are still executed on the `next` branch).


Next steps

I think we are in a pretty good state for now, but there are some things we can do to further improve everything (mainly making things faster) that aren't strictly necessary. I also ran into issues while trying them, so there is more work to do.

  1. More splits — instead of having a Vite folder and PostCSS folder, we can go a step further and have folders for Next.js, Astro, Nuxt, Remix, …
  2. Caching — right now we have to run the build step for every OS in every job. We can re-use the work here by introducing a setup job that the other jobs depend on. @thecrypticace and I tried it already, but we were running into some Bun-specific standalone CLI issues when doing that.
  3. Remote caching — we could re-enable remote caching such that the build step can be full turbo (e.g. after a PR is merged into `next` and we run everything again).

@RobinMalfait force-pushed the feat/improve-integration-tests branch 12 times, most recently from 1ecbc12 to 763ecb8 on December 11, 2024 at 15:51
This ensures that we run CLI, PostCSS, Vite, ... tests per OS per type
of test (sequential or concurrent).
For all shared configuration files instead of using CLI flags
All tests will be concurrent by default; only tests marked using
`test.sequential` will be run in sequential order.
Split stdout/stderr output on newlines instead of blocks; also make sure to
strip all ANSI escapes. This will be necessary when we are parsing values
from the stdout/stderr output.
This is the biggest change. Whenever we start a long-running process
(via `spawn(…)`), we will make sure to wait for a specific message on
`stdout` or `stderr` from the process.

These typically indicate when a process is ready (e.g. the initial build
is done, or watchers are set up and we are waiting for file changes).

This fixes race conditions where we would already start writing files and
expect a rebuild to happen before the process was ready. These messages
can differ per command we run, so they are implemented on a per-test
basis.

Additionally, most long-running processes that require a dev server
expose the URL (host + port) on stdout/stderr. We will use that
information instead of the `getFreePort()` helper that we used before.

The big benefit is that these processes typically already handle port
conflicts themselves and retry until a free port is available.

Because most tests are concurrent, we also use the `expect` exposed by
the test helper.
@RobinMalfait force-pushed the feat/improve-integration-tests branch from 8bcbff8 to cfd371a on December 11, 2024 at 18:04
With the split in CI, properly waiting for dev servers to be ready, and
properly handling ANSI escapes, it seems like we don't even need
sequential tests at all.

Going to undo all of that in this commit, so we can re-enable it later
if needed.
@RobinMalfait force-pushed the feat/improve-integration-tests branch from 24581c4 to a266124 on December 11, 2024 at 18:37
@RobinMalfait force-pushed the feat/improve-integration-tests branch from 182b21f to 6edba4e on December 11, 2024 at 18:45
@RobinMalfait changed the title from "[WIP] Improve integration tests" to "Improve integration tests (stability + performance)" on Dec 12, 2024
@RobinMalfait marked this pull request as ready for review on December 12, 2024 at 11:24
@RobinMalfait requested a review from a team as a code owner on December 12, 2024 at 11:24
let process = await spawn(
'pnpm tailwindcss --input src/index.css --output dist/out.css --watch',
)
await process.onStderr((m) => m.includes('Done in'))
RobinMalfait (Member, Author):

All `spawn(…)` tests will wait for a specific message, to ensure everything is ready before we start making changes. This improves the reliability of the tests.

@@ -491,7 +492,7 @@ test(
'pages/nested/foo.jsx': 'content-["pages/nested/foo.jsx"] content-["BAD"]',
},
},
- async ({ fs, exec }) => {
+ async ({ fs, exec, expect }) => {
RobinMalfait (Member, Author):

Due to the concurrency, we have to use the `expect` provided by the test itself instead of the globally imported `expect`.

@@ -4,7 +4,6 @@
"private": true,
"devDependencies": {
"dedent": "1.5.3",
"fast-glob": "^3.3.2",
"kill-port": "^2.0.1"
RobinMalfait (Member, Author):

We removed the `getFreePort()` helper, which relied on the `kill-port` package.

Comment on lines +133 to +138
let url = ''
await process.onStdout((m) => {
let match = /Local:\s*(http.*)/.exec(m)
if (match) url = match[1]
return Boolean(url)
})
RobinMalfait (Member, Author):

We can potentially abstract this into a function, but the regex is different per project. It also depends on the project whether stdout or stderr is used. Because this is only a handful of lines, I just duplicated them.
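If we ever do extract a helper, it could look roughly like this (a hypothetical `waitForUrl`, written for illustration only; the name, parameters, and types are made up and not part of this PR):

type OnMessage = (predicate: (message: string) => boolean) => Promise<void>

// Hypothetical: wait for the dev server URL on the given stream, using a
// per-tool regex whose first capture group is the URL.
async function waitForUrl(
  process: { onStdout: OnMessage; onStderr: OnMessage },
  pattern: RegExp,
  stream: 'stdout' | 'stderr' = 'stdout',
): Promise<string> {
  let url = ''
  let listen = stream === 'stdout' ? process.onStdout : process.onStderr
  await listen((message) => {
    let match = pattern.exec(message)
    if (match) url = match[1]
    return Boolean(url)
  })
  return url
}

// e.g. for Vite: let url = await waitForUrl(process, /Local:\s*(http.*)/)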

Contributor:

Seems fine to me.

Comment on lines +428 to +430
test.skip = (name: string, config: TestConfig, testCallback: TestCallback) => {
return test(name, config, testCallback, { skip: true })
}
RobinMalfait (Member, Author):

Introduced this during debugging. While it's not used right now, I think it's still useful (similar to how `.only` and `.debug` are not used right now).

let process = await spawn('pnpm vite build --watch', {
cwd: path.join(root, 'project-a'),
})
await process.onStdout((m) => m.includes('built in'))
RobinMalfait (Member, Author):

This is an example of a test where we weren't properly waiting for the initial build to finish.

test('dev mode', SETUP, async ({ fs, spawn, expect }) => {
let process = await spawn('pnpm nuxt dev', {
env: {
TEST: 'false', // VERY IMPORTANT OTHERWISE YOU WON'T GET OUTPUT
RobinMalfait (Member, Author):

Vitest (I think) sets a `process.env.TEST` environment variable. If this is set to `true`, then `nuxt dev` doesn't show any output. Why? I'm not sure, but it was confusing because everything worked when running the command manually. It also worked when copying the `spawn()` setup into a custom `whatisgoingon.js` file…

@@ -1,143 +1,148 @@
import { describe, expect } from 'vitest'
Contributor:

This test still uses the global `expect` — this is probably fine, but maybe we should use the one from the test for consistency.

@RobinMalfait merged commit 2a29c29 into `next` on Dec 12, 2024
5 checks passed
@RobinMalfait deleted the feat/improve-integration-tests branch on December 12, 2024 at 12:48