Using Google Chrome instead of Chromium in Google Cloud Functions
May 5, 2024
When using Puppeteer, Playwright and similar tools, you need to have Chrome installed. When you're running on AWS Lambda or Google Cloud Functions, it can get tricky.
Google Cloud Functions used to bundle Chromium in its base images, but that hasn't been the case for a few years. That's where packages like chrome-aws-lambda come in handy: they bundle Chromium directly inside an npm package and expose a function that extracts the Chromium binary and returns its path:
const chromium = require('chrome-aws-lambda')
const path = await chromium.executablePath
Note: unnecessary pedantic detail: the above code doesn't look like a function call, but it is, in fact, a getter function that returns a promise.
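For context, here is a minimal sketch of how that path typically ends up in a launch call. It assumes `puppeteer-core` is installed next to `chrome-aws-lambda` (the package exposes it as `chromium.puppeteer`). The function name `launchBundledChromium` and its `chromium` parameter are made up for illustration, the parameter being there only so the function can be tried with a stand-in object.

```javascript
// Sketch: launch the Chromium bundled in chrome-aws-lambda.
// `launchBundledChromium` is a hypothetical helper name. The `chromium`
// parameter only exists so a stand-in object can be passed in; by default
// the real package is loaded.
async function launchBundledChromium (chromium = require('chrome-aws-lambda')) {
  // executablePath is a getter returning a promise, hence the await
  // without any function call.
  const executablePath = await chromium.executablePath

  return chromium.puppeteer.launch({
    args: chromium.args,
    defaultViewport: chromium.defaultViewport,
    executablePath,
    headless: chromium.headless
  })
}
```

Everything except `executablePath` is read synchronously from the package, so the getter is the only part that needs awaiting.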
However, that's Chromium, and you may have reasons to want Google Chrome instead (mainly, proprietary codecs).
This article is about Google Cloud Functions, but if you're on AWS Lambda, the above option is your best bet. Because of the Lambda total size limit of 250 MB (all layers combined), it's really hard to get a Chrome binary that fits in there. That's why chrome-aws-lambda uses LambdaFS under the hood, to aggressively compress the Chromium installation with Brotli and make it fit in that limited space.
But again, with that build, you won't have proprietary codecs. I tried to trim down a Chrome Linux build and compress it with the same technique, but never managed to make it fit on AWS Lambda. Recent Chrome versions are just too big.
There's another option, which is to compile Chromium yourself with proprietary codecs. I never found any prebuilt binaries of Chromium that include proprietary codecs (maybe because of license issues with redistributing them), so you're on your own here.
Remotion successfully does that for Remotion Lambda, and they publish instructions for compiling Chromium with proprietary codecs for Lambda.
Fair warning: it gets hairy, fast.
Google Cloud Functions is more generous with bundle size, so we don't need to resort to those tricks, and we can include a complete, uncompressed Google Chrome installation.
Google publishes Chrome for Testing, builds made specifically for automated testing and browser automation.
We can just download a recent build from there as part of the gcp-build script in our package.json. The version is pinned in the URL, so bump it when you want a newer build.
{
  "scripts": {
    "gcp-build": "curl -s -O 'https://storage.googleapis.com/chrome-for-testing-public/124.0.6367.91/linux64/chrome-linux64.zip' && unzip chrome-linux64.zip && rm chrome-linux64.zip"
  }
}
Note: the gcp-build script allows you to run a custom build step in Google Cloud Build, which is what Cloud Functions (both 1st and 2nd gen), as well as Cloud Run and App Engine, use to build your function image.
It would work just fine with a postinstall script as well, but gcp-build makes sure you run it only on Google Cloud Build, which is probably desirable in this particular case.
You will then have the Chrome binary in chrome-linux64/chrome, which you can pass to the tool of your choice.
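As an illustration, here is a rough sketch of passing that binary to Puppeteer. It assumes the archive was unzipped at the root of the function's source (next to this file) and that `puppeteer-core` is the tool in use. The helper names `chromePath` and `launchChrome` are hypothetical, and any extra launch flags your environment may need are left out.

```javascript
const path = require('node:path')

// Hypothetical helper: resolves the binary unzipped by the gcp-build script,
// assuming the archive was extracted at the root of the function's source.
function chromePath (rootDir = __dirname) {
  return path.join(rootDir, 'chrome-linux64', 'chrome')
}

// Launches the downloaded Chrome instead of a bundled browser.
// puppeteer-core is required lazily so chromePath stays usable without it.
async function launchChrome () {
  const puppeteer = require('puppeteer-core')
  return puppeteer.launch({ executablePath: chromePath() })
}

module.exports = { chromePath, launchChrome }
```

Other tools generally accept an equivalent option; with Playwright, for instance, `executablePath` is also a launch option.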
Courtesy of this post, with Puppeteer, you don't need to download Chrome manually, since it provides a nifty script to do just that.
Actually, Puppeteer's postinstall script automatically downloads the latest version of Chrome for Testing for your platform.
The caveat is that this script by default installs it to ~/.cache/puppeteer, which in the case of Google Cloud Build, is not gonna be preserved in the final image. So we need to instruct Puppeteer to install Chrome in a directory that Cloud Build will keep.
This can be done with the following .puppeteerrc.js:
module.exports = {
  // Keep the browser inside the project directory so it ends up in the final image
  cacheDirectory: `${__dirname}/.cache/puppeteer`
}
But even then, there's another caveat. Puppeteer's postinstall script only runs when Puppeteer actually gets installed. However, because of build caching, you can end up in a state where node_modules is restored with Puppeteer already installed (so postinstall will not run), but the .cache/puppeteer directory is not restored.
To mitigate that, we need to make sure Chrome gets installed on every build.
Again, we can leverage the gcp-build script for that:
{
  "scripts": {
    "gcp-build": "npx puppeteer browsers install chrome"
  }
}
Note: you could call Puppeteer's postinstall script directly by doing node node_modules/puppeteer/install.mjs instead, but I found the above command cleaner.
The good thing is that this command knows not to re-download Chrome if it's already found in the cache directory, so when the postinstall script does run, the extra gcp-build command will be a no-op.