
Improve integration tests (stability + performance) #15125

Merged: 12 commits into `next` on Dec 12, 2024

Conversation

RobinMalfait
Member

@RobinMalfait RobinMalfait commented Nov 22, 2024

This PR improves the integration tests in two ways:

  1. Make the integration tests more reliable and thus less flaky
  2. Make the integration tests faster (by introducing concurrency)

I tried a lot of different things to make sure that these tests are fast and stable.


The biggest issue we noticed is that some tests are flaky. These are the tests with long-running dev-mode processes, where watchers are used and/or dev servers are created.
To solve this, every test that spawns a process now looks at stdout/stderr and waits for a message from the process, so we know when we can start making changes.

For example, in the case of an Astro project you get a `watching for file changes` message. In the case of a Nuxt project you can wait for a `server warmed up in` message, and in the case of Next.js there is a `Ready in` message.

These messages depend on the tools being used, so they are hardcoded per test instead of relying on a magical automatic solution.

These messages allow us to wait until all the necessary initial work is done and the internal watchers and/or dev servers are set up, before we start making changes to files or requesting CSS stylesheets from servers that aren't ready yet.
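Roughly, the pattern looks like this (a minimal sketch using the `spawn` / `onStdout` helpers from the test harness, which are shown in the diff excerpts further down; the ready messages are the tool-specific ones mentioned above):

// Sketch: block until the tool reports that it is ready.
let process = await spawn('pnpm astro dev')

// Astro prints "watching for file changes" once its watcher is up; for
// Next.js we would wait for "Ready in", for Nuxt for "server warmed up in".
await process.onStdout((message) => message.includes('watching for file changes'))

// Only after this resolves do we start editing files or requesting CSS.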


Another improvement is how we set up the dev servers. Before, we used to look for a free port on the system and pass it via a `--port` flag or a `PORT` environment variable. Instead of doing this (which is slow), we rely on the process itself to print a URL with a port. Basically all tools will try to find a free port if the default port is in use. We can then use the stdout/stderr messages to get the URL and the port to use.

To reduce the amount of potential port conflicts, we used to run every test and every file sequentially, to basically guarantee that ports are free. With this new approach, where we rely on the process, I noticed that we don't really run into this issue anymore (I reran the tests multiple times and they were always stable).
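Condensed, the pattern looks like this (taken almost verbatim from one of the diff excerpts further down; the exact regex and whether we read stdout or stderr differ per tool):

// Sketch: read the dev server URL (and thus the port) from the process
// output instead of allocating a port ourselves. Vite, for example,
// prints a line like "Local: http://localhost:5173/".
let url = ''
await process.onStdout((m) => {
  let match = /Local:\s*(http.*)/.exec(m)
  if (match) url = match[1]
  return Boolean(url)
})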

Note: these tests run on Linux, Windows, and macOS in this branch just for testing purposes. Once this is done, we will only run Linux tests on PRs and run all three on the `next` branch.

We do make the tests concurrent by default now, which in theory means that there could be port conflicts (which in practice just means that a process has to try a few more times to find a free port). To reduce these conflicts, we split up the integration tests such that the Vite, PostCSS, CLI, … tests all run in a separate job in the GitHub Actions workflow.

[screenshot: integration tests split into separate CI jobs]

Comparing this branch against the `next` branch, this is what CI looks like right now:

[screenshots: CI runs on `next` vs. `feat/improve-integration-tests`]

At one point I also introduced sequential tests, such that all spawned processes would still run one after the other. But since I didn't run into issues keeping them concurrent, I dropped that code.

Some small changes I made to make things more reliable:

  1. When relying on stdout/stderr messages, we split the output into lines on `\n` and strip all ANSI escapes, which means we don't have to worry about special ANSI characters when looking for the URL or for a specific message to wait for.
  2. Once a test is done, we `child.kill()` the spawned process. If that doesn't work, for whatever reason, we run `child.kill('SIGKILL')` to force-kill the process. This could technically leave some memory or files not cleaned up properly, but once CI is done, everything is thrown away anyway. (A rough sketch of points 1 and 2 follows below.)
  3. As you can see in the screenshots, I used some nicer names for the workflows.
[screenshots: workflow names on `next` vs. `feat/improve-integration-tests`]

They also look a bit nicer in the PR overview as well:

[screenshot: PR checks overview]
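For reference, here is a rough sketch of points 1 and 2 from the list above (the ANSI regex and the exact cleanup flow are illustrative assumptions, not the literal implementation in this PR):

import type { ChildProcess } from 'node:child_process'

// (1) Split a chunk of output into lines and strip ANSI escape sequences,
// so that message matching and URL parsing don't trip over color codes.
// Note: this regex only covers CSI sequences (colors, cursor movement);
// a real implementation may need to handle more escape types.
function stripAnsi(input: string): string {
  return input.replace(/\x1B\[[0-9;]*[A-Za-z]/g, '')
}

function toLines(chunk: string): string[] {
  return chunk.split('\n').map(stripAnsi)
}

// (2) Kill the spawned process once a test is done. `child.kill()` returns
// false when the signal could not be sent, in which case we force-kill.
function cleanup(child: ChildProcess) {
  if (!child.kill()) {
    child.kill('SIGKILL')
  }
}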

The very last commit just filters out the Windows and macOS tests again for PRs (but they are still executed on the `next` branch).


Next steps

I think we are in a pretty good state for now, but there are some things we can do to further improve everything (mainly making things faster) that aren't strictly necessary. I also ran into issues while trying them, so there is more work to do.

  1. More splits — instead of having a Vite folder and PostCSS folder, we can go a step further and have folders for Next.js, Astro, Nuxt, Remix, …
  2. Caching — right now we have to run the build step for every OS in every job. We can re-use the work here by introducing a setup job that the other jobs depend on. @thecrypticace and I tried it already, but we were running into some Bun-specific standalone CLI issues when doing that.
  3. Remote caching — we could re-enable remote caching such that the build step can be full turbo (e.g. after a PR is merged into `next` and we run everything again).

@RobinMalfait force-pushed the feat/improve-integration-tests branch 12 times, most recently from 1ecbc12 to 763ecb8 on December 11, 2024 at 15:51
This ensures that we run CLI, PostCSS, Vite, ... tests per OS per type
of test (sequential or concurrent).
For all shared configuration files instead of using CLI flags
All tests will be concurrent by default; only tests marked using
`test.sequential` will be run in sequential order.
Split stdout/stderr output on newlines instead of blocks; also make sure to
strip all ANSI escapes. This will be necessary when we are parsing values
from the stdout/stderr output.
This is the biggest change. Whenever we start a long-running process
(via `spawn(…)`), we will make sure to wait for a specific message on
`stdout` or `stderr` from the process.

These typically indicate when a process is ready (e.g. the initial build
is done, or watchers are set up and we are waiting for file changes).

This fixes race conditions where we would already start writing files and
expect a rebuild to happen before the process was ready. These messages
can differ per command we run, so they are implemented on a per-test
basis.

Additionally, most long-running processes that require a dev server
expose the URL (host + port) on stdout/stderr. We will use that
information instead of the `getFreePort()` helper that we used before.

The big benefit is that these processes typically already handle port
conflicts themselves and retry until a free port is available.

Because most tests are concurrent, we also use the `expect` exposed by
the test helper.
@RobinMalfait force-pushed the feat/improve-integration-tests branch from 8bcbff8 to cfd371a on December 11, 2024 at 18:04
With the split in CI, properly waiting for dev servers to be ready, and
properly handling ANSI escapes, it seems like we don't even need
sequential tests at all.

Going to undo all of that in this commit, so we can re-enable it later
if needed.
@RobinMalfait force-pushed the feat/improve-integration-tests branch from 24581c4 to a266124 on December 11, 2024 at 18:37
@RobinMalfait force-pushed the feat/improve-integration-tests branch from 182b21f to 6edba4e on December 11, 2024 at 18:45
@RobinMalfait changed the title from "[WIP] Improve integration tests" to "Improve integration tests (stability + performance)" on Dec 12, 2024
@RobinMalfait marked this pull request as ready for review on December 12, 2024 at 11:24
@RobinMalfait requested a review from a team as a code owner on December 12, 2024 at 11:24
let process = await spawn(
'pnpm tailwindcss --input src/index.css --output dist/out.css --watch',
)
await process.onStderr((m) => m.includes('Done in'))
RobinMalfait (Member, Author):

All `spawn(…)` tests will wait for a specific message, to ensure everything is ready before we start making changes. This improves the reliability of the tests.

@@ -491,7 +492,7 @@ test(
'pages/nested/foo.jsx': 'content-["pages/nested/foo.jsx"] content-["BAD"]',
},
},
- async ({ fs, exec }) => {
+ async ({ fs, exec, expect }) => {
RobinMalfait (Member, Author):

Due to the concurrency, we have to use the `expect` provided by the test itself instead of the globally imported `expect`.

@@ -4,7 +4,6 @@
"private": true,
"devDependencies": {
"dedent": "1.5.3",
"fast-glob": "^3.3.2",
"kill-port": "^2.0.1"
RobinMalfait (Member, Author):

We removed the `getFreePort()` helper, which relied on the `kill-port` package.

Comment on lines +133 to +138
let url = ''
await process.onStdout((m) => {
let match = /Local:\s*(http.*)/.exec(m)
if (match) url = match[1]
return Boolean(url)
})
RobinMalfait (Member, Author):

We can potentially abstract this into a function, but the regex is different per project. It also depends on the project whether stdout or stderr is used. Because this is only a handful of lines, I just duplicated them.
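If we ever do extract a helper, it could look roughly like this (a hypothetical `waitForUrl`, written for illustration only; the name, parameters, and types are made up and not part of this PR):

type OnMessage = (predicate: (message: string) => boolean) => Promise<void>

// Hypothetical: wait for the dev server URL on the given stream, using a
// per-tool regex whose first capture group is the URL.
async function waitForUrl(
  process: { onStdout: OnMessage; onStderr: OnMessage },
  pattern: RegExp,
  stream: 'stdout' | 'stderr' = 'stdout',
): Promise<string> {
  let url = ''
  let listen = stream === 'stdout' ? process.onStdout : process.onStderr
  await listen((message) => {
    let match = pattern.exec(message)
    if (match) url = match[1]
    return Boolean(url)
  })
  return url
}

// e.g. for Vite: let url = await waitForUrl(process, /Local:\s*(http.*)/)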

Contributor:

Seems fine to me.

Comment on lines +428 to +430
test.skip = (name: string, config: TestConfig, testCallback: TestCallback) => {
return test(name, config, testCallback, { skip: true })
}
RobinMalfait (Member, Author):

Introduced this during debugging. While it's not used right now, I think it's still useful (similar to how `.only` and `.debug` are not used right now).

let process = await spawn('pnpm vite build --watch', {
cwd: path.join(root, 'project-a'),
})
await process.onStdout((m) => m.includes('built in'))
RobinMalfait (Member, Author):

This is an example of a test where we weren't properly waiting for the initial build to finish.

test('dev mode', SETUP, async ({ fs, spawn, expect }) => {
let process = await spawn('pnpm nuxt dev', {
env: {
TEST: 'false', // VERY IMPORTANT OTHERWISE YOU WON'T GET OUTPUT
RobinMalfait (Member, Author):

Vitest (I think) sets a `process.env.TEST` environment variable. If this is set to `true`, then `nuxt dev` doesn't show any output. Why? I'm not sure, but it was confusing because everything worked when running the command manually. It also worked when copying the `spawn()` setup into a custom `whatisgoingon.js` file…

@@ -1,143 +1,148 @@
import { describe, expect } from 'vitest'
Contributor:

This test still uses the global `expect` — this is probably fine, but maybe we should use the one from the test for consistency.

@RobinMalfait merged commit 2a29c29 into `next` on Dec 12, 2024
5 checks passed
@RobinMalfait deleted the feat/improve-integration-tests branch on December 12, 2024 at 12:48