Browser automation
Windmill makes it easy to perform browser automation tasks, such as testing or web scraping.
Not sure what a worker group is? You should probably read about it first.
By default, a worker group named reports is available, which handles jobs with the chromium tag.
Workers assigned to this group will install chromium on start (learn more about init scripts).
You have to set the worker group of at least one worker to reports.
There is a sample worker container definition called windmill_worker_reports in the docker-compose.yml file, which you can uncomment to quickly start a worker with the right worker group.
The chromium binary will be available on these workers at /usr/bin/chromium.
You will need to disable the sandbox to run it inside Windmill workers. You can do this by passing the --no-sandbox flag.
Running chromium without the sandbox is a security risk. Make sure you trust the website you are visiting.
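One way to act on this warning is to check each target URL against an allowlist before handing it to the browser. The following is a minimal sketch, not part of Windmill: ALLOWED_HOSTS and isAllowedUrl are hypothetical names, and the hosts listed are placeholders.

```typescript
// Hypothetical allowlist: replace with the hosts your script is meant to visit.
const ALLOWED_HOSTS = new Set(["example.com", "www.example.com"]);

// Returns true only for http(s) URLs whose hostname is on the allowlist.
export function isAllowedUrl(
  rawUrl: string,
  allowed: Set<string> = ALLOWED_HOSTS,
): boolean {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  const isHttp = parsed.protocol === "https:" || parsed.protocol === "http:";
  return isHttp && allowed.has(parsed.hostname);
}
```

A script could call isAllowedUrl(url) before page.goto(url) and throw if it returns false. This only checks the initial URL; a trusted page may still redirect or load resources from elsewhere, so it reduces the risk rather than removing it.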
To run jobs on a chromium-equipped worker, you have to select the chromium tag in the settings of the script or flow step.
Learn how here.
Examples
Playwright (Bun)
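import { chromium } from "playwright";

export async function main() {
  // Use the chromium binary installed on the worker.
  // No --no-sandbox flag is passed here: Playwright's chromiumSandbox
  // launch option is off by default.
  const browser = await chromium.launch({
    executablePath: "/usr/bin/chromium",
  });
  const page = await browser.newPage();
  await page.goto("https://google.com");
  const title = await page.title();
  await browser.close();
  return title;
}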
Puppeteer (Bun)
import puppeteer from "puppeteer-core";

export async function main() {
  const browser = await puppeteer.launch({
    headless: "new",
    executablePath: "/usr/bin/chromium",
    args: ["--no-sandbox"],
  });
  const page = await browser.newPage();
  await page.goto("https://google.com");
  const title = await page.title();
  await browser.close();
  return title;
}
Puppeteer (Deno)
import puppeteer from "npm:puppeteer-core";

export async function main() {
  const browser = await puppeteer.launch({
    headless: "new",
    executablePath: "/usr/bin/chromium",
    args: ["--no-sandbox"],
    // Only required when Windmill is running with nsjail, where you
    // don't have access to the default user data dir:
    // userDataDir: "./user-data",
  });
  const page = await browser.newPage();
  await page.goto("https://google.com");
  const title = await page.title();
  await browser.close();
  return title;
}
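The Puppeteer examples above share the same launch options, with userDataDir needed only under nsjail. As a rough sketch, those options could be built in one place. The names ChromiumLaunchOptions, chromiumLaunchOptions and underNsjail are hypothetical; Windmill does not provide them, and you would set the flag yourself based on how your workers are deployed.

```typescript
// Shape of the options used by the Puppeteer examples above.
interface ChromiumLaunchOptions {
  headless: "new";
  executablePath: string;
  args: string[];
  userDataDir?: string;
}

// `underNsjail` is a hypothetical flag supplied by the caller.
export function chromiumLaunchOptions(
  underNsjail: boolean,
): ChromiumLaunchOptions {
  const options: ChromiumLaunchOptions = {
    headless: "new",
    executablePath: "/usr/bin/chromium",
    args: ["--no-sandbox"],
  };
  if (underNsjail) {
    // The default user data dir is not accessible under nsjail,
    // so point chromium at a directory relative to the job.
    options.userDataDir = "./user-data";
  }
  return options;
}
```

A script would then call puppeteer.launch(chromiumLaunchOptions(true)) on nsjail-enabled workers and puppeteer.launch(chromiumLaunchOptions(false)) elsewhere. The accepted values for headless vary between Puppeteer versions, so the type may need adjusting for the version you install.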