Headfull browsers beat headless
September 7, 2022
Twenty years ago a simple curl would open up the world. HTML markup was largely hand-designed, so name attributes were easily interpretable and parsable. Now most sites render dynamic content or use template-defined class attributes to define the styling of the page. To handle this richness of rendering, most production crawlers use headless browsers - at least for a part of the pipeline. Since you're running a Chromium or WebKit build, these should render sites exactly how users see them. In reality, headless browsers are sometimes quite different:
- The headless chromium build is a different executable - there are some codepaths that are only available in the full build or have different behavior in headless mode. Extensions are one example; Chrome supports them but headless does not.
- There are canvas elements that render differently - both because of missing fonts and underlying pixel driver differences.
- Headless browsers are harder to inspect and debug. By definition they are hidden, so they must be debugged remotely via control tools. But because of the above points, this still doesn't give a full 1:1 comparison to visible browsers.
Instead of working around the shortfalls of headless browsers, I've been building with headfull browsers recently and couldn't be happier.
Headfull browsers are simple to use when running them locally on your desktop. In Puppeteer or Playwright, you can activate one through a headless: false parameter and use the same control code as your normal headless logic. Chromium will launch a window that behaves almost 1:1 like Chrome.
When running in the cloud inside a container runtime like Docker, though, headfull browsers require more work and some finesse. The standard ubuntu-base image is intended for CLI execution, so it doesn't come with any of the display code needed for headfull mode.
If you're interested in a pre-packaged docker solution, skip to the bottom.
One thought for building a headfull container is whether we can build on top of an existing base. After all, Playwright ships with an image that makes it easy to get started in docker headless mode. When running in headfull mode it throws the following error:
> Looks like you launched a headed browser without having a XServer running.
> [pid=106][err] [106:106:0907/081624.714406:ERROR:ozone_platform_x11.cc(247)] Missing X server or $DISPLAY
> [pid=106][err] [106:106:0907/081624.714534:ERROR:env.cc(226)] The platform failed to initialize. Exiting.
Installing an XServer gets past this error but throws others in turn.
In theory we could leverage this base and install the additional libraries iteratively. However, one drawback of their base image is that it bundles all browsers together: Chromium, WebKit, and Firefox. If you're only using one browser in your deployment, the additional browsers are just wasting bandwidth. A larger container payload means longer bootup times in a serverless function or cluster deployment, where containers have to be downloaded before they can start. Instead, you'll likely want to fork it and only install the bare minimum.
Font / Library Support
Headfull Chromium needs to access core libraries that the headless version doesn't need. These fulfill some of the system logic that I was referring to above, mostly around cursors, rendering functions, and font support.
apt-get carries all of these as packages, so installation is long-winded but straightforward.
$ apt-get install gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget xfonts-100dpi xfonts-75dpi xfonts-scalable xfonts-cyrillic libgbm-dev
This clears up the Chrome dependency issues that will prevent launch. Now we have to render the browser graphics somewhere.
Chromium needs to be launched in an environment where the graphics code can be successfully painted. After all, that's the key differentiating factor of headfull browsers. Containers don't have the notion of a window server, however, since they're just a raw shell implementation. We therefore virtualize a display through X11, which mirrors the X server that some OS implementations are built on.
$ apt-get install xvfb x11-apps x11-xkb-utils libx11-6 libx11-xcb1
The xvfb-run script that's bundled into xvfb makes it easy to launch a virtual display and spawn an executable that will draw into this display.
$ xvfb-run --server-args="-screen 0 1524x768x24" npm run start
While debugging, it's often helpful to view the browser's current state, inspecting the viewport and DOM. To connect to the X11 server from your host and therefore bridge from host->container, we'll need a VNC server hosted inside the container. VNC is one of the most common screen sharing protocols and plays nicely with X11.
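$ apt-get install x11vnc
The server can then be launched on the given screen, in this case display :0. Note that xvfb-run defaults to display :99 unless you pass --server-num, so point x11vnc at whichever display your X server is actually using. The VNC server can then be accessed over port 5900 when you port forward to your host.
$ x11vnc -display :0 -noxrecord -noxfixes -noxdamage -quiet -forever -passwd mypassword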
There's a bit of nuance to stringing these dependencies together into an entrypoint.
xvfb-run won't work out of the box because it launches the display, runs the application, and cleans up before yielding. This gives no time for us to launch our VNC server. It also intercepts some SIGINT and SIGTERM signals that we would rather forward to our application code.
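One way around both problems is to skip xvfb-run and start each piece yourself. What follows is a minimal sketch under stated assumptions, not the entrypoint of any particular image. It assumes Xvfb and x11vnc are on the PATH (from the apt-get steps above) and that Xvfb exposes its socket under /tmp/.X11-unix. The names DISPLAY_NUM, SCREEN_GEOMETRY, VNC_PASSWORD, wait_for_path, and main are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Hypothetical entrypoint sketch; variable names and defaults are assumptions.
set -eu

DISPLAY_NUM="${DISPLAY_NUM:-0}"                     # matches "-display :0" above
SCREEN_GEOMETRY="${SCREEN_GEOMETRY:-1524x768x24}"   # same geometry as the xvfb-run example
VNC_PASSWORD="${VNC_PASSWORD:-mypassword}"          # placeholder, override in real use

# Poll once per second until a path exists; give up after $2 attempts (default 10).
wait_for_path() {
  path="$1"
  tries="${2:-10}"
  while [ "$tries" -gt 0 ]; do
    if [ -e "$path" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

main() {
  export DISPLAY=":${DISPLAY_NUM}"

  # Start the virtual display directly so it stays up alongside the VNC server.
  Xvfb "$DISPLAY" -screen 0 "$SCREEN_GEOMETRY" &
  xvfb_pid=$!

  # Xvfb creates this Unix socket when it starts listening for clients.
  if ! wait_for_path "/tmp/.X11-unix/X${DISPLAY_NUM}"; then
    echo "Xvfb did not start on $DISPLAY" >&2
    kill "$xvfb_pid" 2>/dev/null || true
    return 1
  fi

  # Share the display over VNC (default port 5900), same flags as above.
  x11vnc -display "$DISPLAY" -noxrecord -noxfixes -noxdamage \
    -quiet -forever -passwd "$VNC_PASSWORD" &
  vnc_pid=$!

  # Run the application in the background so this shell can handle signals.
  "$@" &
  app_pid=$!

  # Forward both signals as SIGTERM: background jobs in a non-interactive
  # shell start with SIGINT ignored, so relaying SIGINT itself may do nothing.
  trap 'kill -TERM "$app_pid" 2>/dev/null || true' INT TERM

  # wait returns early when a trapped signal arrives, so keep waiting
  # until the application has really exited.
  status=0
  wait "$app_pid" || status=$?
  while kill -0 "$app_pid" 2>/dev/null; do
    status=0
    wait "$app_pid" || status=$?
  done

  # Tear down the helpers and report the application's exit status.
  kill "$vnc_pid" "$xvfb_pid" 2>/dev/null || true
  return "$status"
}

# Only start anything when a command is given, e.g.: entrypoint.sh npm run start
if [ "$#" -gt 0 ]; then
  main "$@"
fi
```

Saved as a hypothetical entrypoint.sh, this would be invoked as entrypoint.sh npm run start. Publishing port 5900 when starting the container (for example with docker run -p 5900:5900) lets a VNC client on the host connect to localhost:5900. Running the application in the background and waiting on it is what keeps the shell free to relay signals instead of swallowing them.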
For a pre-packaged Docker image that contains all the above, plus execution code and a more detailed getting started guide, see my headfull-chromium image. It's licensed under Apache and free for personal and commercial use.