Reduce cold start and execution time of Google Cloud Functions
Save seconds with simple tips
A Google Cloud Function can be in one of two states before it executes: cold or hot.
A cold Cloud Function is a function that hasn’t been called for a while (or was freshly deployed). It needs to be instantiated to run the serverless function.
During this warm-up, the global context of the function is evaluated and the execution environment is initialised. This warm-up takes time: it’s a cold start, or boot-from-scratch, or cold-boot… let’s say cold start.
A hot Cloud Function, on the contrary, is boiling, ready for action, waiting for a visit like a pretty tiny puppy is waiting for his/her owner to come back.
It was initialised by a previous execution and is still running for a few minutes before it scales to 0 and goes back to the cold state.
In this article I am giving advice about how to:
- reduce cold start time of cold Cloud Functions
- speed up execution time of the function itself (cold or hot)
Some companies, like Doctolib, have a 70ms request time policy.
I’m afraid this utopia is not compatible with a serverless architecture, where a cold start alone automatically reaches this time limit… Except if you follow these tips!
Basics
Before I introduce the tips, one thing to remember about execution time is randomness.
Everything is random.
Execution time is random. Cold start time is random. Period before stop is random. There is no rule like “every 15 minutes a function is restarted” or “after 100 executions the function has to be stopped”.
For this article, we will follow the executions of a single Cloud Function: same input, same output, same number of executions per hour ⇒ We will see that the execution time, the period before scaling to 0 and the warm-up time are all different.
In the photo below you can see the times when the function had to warm up. The durations between warm-ups are always different (the function was triggered 190 times per hour).
Let’s see how to reduce cold start and then how to reduce execution time of Google Cloud Functions.
Reduce cold start
Avoid terminating your function
I start easy with an obvious tip that is never mentioned, except slightly in the official documentation.
As explained above, a function has to warm up at the very first execution or if it hasn’t been triggered for a little while.
Another reason is a failed execution: if the previous execution failed, the function has crashed and terminated. After a crash, the next invocation will need a warm-up, so there will be a cold start.
Prevent your function from crashing.
A crash can have multiple sources:
- code error
- API call error
- timeout
- out-of-memory error
For code errors and API call errors, handle them.
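A minimal sketch of what "handle them" can look like, assuming a handler that calls an external API. The names `call_external_api` and `handler` are made up for illustration, and the fake API call only stands in for a real client:

```python
import logging


def call_external_api(payload):
    """Hypothetical stand-in for a real API call; fails on empty payloads."""
    if not payload:
        raise ConnectionError("upstream API unavailable")
    return {"echo": payload}


def handler(payload):
    """Return a (body, status) pair instead of letting an exception escape."""
    try:
        result = call_external_api(payload)
    except ConnectionError as exc:
        # Expected failure: log it and answer with an error status.
        logging.warning("API call failed: %s", exc)
        return {"error": "upstream unavailable"}, 502
    except Exception:
        # Unexpected failure: log the stack trace, but still return a response.
        logging.exception("unexpected error")
        return {"error": "internal error"}, 500
    return result, 200
```

The idea is simply that a caught exception produces an error response, whereas an uncaught one takes the instance down and forces a cold start on the next invocation.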
For timeouts, be sure to have the correct settings in the function parameters. Usually 60 seconds is enough, but you might need to increase this depending on the complexity of the operations.
Don’t use the timeout as a way to cap execution at X minutes; improve your code instead.
For out-of-memory errors, and that’s the most important point, be sure to explicitly delete the files you created during the execution, if any. Files left behind consume memory available to your function and sometimes persist between invocations.
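One way to make sure temporary files are always deleted is a try/finally block. This is a sketch only: the function name `process` and what it does with the file are assumptions for the example.

```python
import os
import tempfile


def process(data: bytes):
    """Write data to a temporary file, use it, and always delete it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
        # Real work on the file would go here; we only read its size.
        size = os.path.getsize(path)
        return size, path
    finally:
        # Runs on success and on error, so the file never outlives the call.
        os.remove(path)
```

The path is returned only so that the cleanup can be checked afterwards: `os.path.exists(path)` should be false once `process` has returned.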
Your functions should never crash.
Use minimum instance
A cool feature introduced in mid-2021 is min instances.
This feature is straightforward: it keeps X instances warm even if there are no requests for a while.
However, I will add a few clarifications:
- Minimum instances keeps X instances warm, X being the number set in the UI or with the --min-instances gcloud flag ⇒ If more than X instances are needed at the same time, the additional ones will need to warm up, so there will still be some cold starts.
- There are no hard guarantees about the behaviour. The official documentation states:
Cloud Functions attempts to keep function instances idle for an unspecified amount of time after handling a request.
- There will still be a cold start right after deployments and after crashes (see previous sub-part)
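The first clarification is back-of-the-envelope arithmetic. Here is a toy model, assuming each instance handles one request at a time and that all the minimum instances are warm and idle when the burst arrives (the function name is made up for illustration):

```python
def expected_cold_starts(concurrent_requests: int, min_instances: int) -> int:
    """Requests beyond the warm pool each need a freshly started instance."""
    return max(0, concurrent_requests - min_instances)


# With min-instances = 2, a burst of 5 simultaneous requests
# still triggers 3 cold starts; a burst of 2 triggers none.
print(expected_cold_starts(5, 2))  # 3
print(expected_cold_starts(2, 2))  # 0
```

It is only a simplification (remember that everything is random), but it shows why min instances reduces cold starts without eliminating them.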
This feature comes at a cost. It’s almost like paying the price of a Cloud Function running 24/7 (100% of the GB-Second price). See pricing. For low memory Cloud Functions, it’s cheap, but if the function needs a lot of memory, there might exist cheaper solutions.
I would definitely recommend setting a minimum instance value for user-facing functions.
Remember the function from the intro? We had a cold start after anywhere from 17 minutes to more than an hour. By setting min-instances = 2, we had only one cold start over the last day!
Lazy import
This is not a common pattern, frankly, it’s ugly. A good tech lead would refuse a PR with that kind of imports, but for performance reasons, it’s cool!
Lazy import answers the question: why should we import packages that might not be used?
Packages and dependencies are the #1 contributors to GCF cold-boot performance.
We should keep them to a minimum and, more importantly, avoid importing them in global scope if they are not always used.
First, only the dependencies that are actually used should be imported ⇒ if only one specific function of a dependency is used, import that specific function.
Secondly, if dependencies are used only in some specific paths, they should be imported inside these paths. This isn’t standard practice, but it can save some precious milliseconds of the Google Cloud Function cold start.
I would say that if a package is used in less than 75% of the cases, lazy import makes sense.
For example, suppose a bank wants to notify clients after a withdrawal of more than 1000 euros. This is not common: the Cloud Function might need the notification package in maybe 10% of invocations, so the following code makes a lot of sense:
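A minimal sketch of that pattern, with the standard-library email package standing in for a heavier notification dependency. The function names and the threshold constant are illustrative assumptions:

```python
THRESHOLD_EUR = 1000


def build_notification(amount_eur):
    # Lazy import: the package is loaded only when this rare path runs,
    # so it adds nothing to the cold start of ordinary invocations.
    from email.message import EmailMessage

    message = EmailMessage()
    message["Subject"] = "Large withdrawal"
    message.set_content(f"A withdrawal of {amount_eur} EUR was made on your account.")
    return message


def process_withdrawal(amount_eur):
    if amount_eur > THRESHOLD_EUR:
        message = build_notification(amount_eur)
        return {"status": "ok", "notified": True, "subject": str(message["Subject"])}
    # Common path: no notification, and no notification package imported.
    return {"status": "ok", "notified": False}
```

The first large withdrawal handled by an instance pays the import cost; Python caches the module afterwards, so later calls on the same instance don't pay it again.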
Ping function with Cloud Scheduler
This tip works and doesn’t need any coding skills.
The concept is: trigger a Cloud Function every 5 or 10 minutes to keep it warm as long as possible.
This solution is not ideal.
First, because of randomness: a function could be triggered and still scale down to 0 a few minutes after the trigger.
Secondly, because you will still have a cold start if you suddenly need 2 or 3 instances of the Cloud Function.
Finally, with this method, a Cloud Function will be called tons of times for… nothing.
To set up this method, go to Cloud Scheduler, create a job, and give it a name and a frequency of */10 * * * * if you want it running 24/7. Choose target type HTTP and insert the Cloud Function URL.
With these tips, seconds can be saved (which adds up to thousands of seconds after a year, which gives you a lot of time to clap this article and follow me on Medium!).
Once a Cloud Function is warm, let’s see how to accelerate the processing time…
Reduce execution time
The general coding rules for reducing execution time apply to serverless functions too; they are no exception.
Serverless functions also have their own best practices to improve execution time.
Time consuming functions in global scope
Cloud Functions often recycles the execution environment of a previous invocation. If you declare a variable in global scope, its value can be reused in subsequent invocations without having to be recomputed.
Using global scope caches the results of heavy computations. This is particularly useful for database connections, so the function doesn’t have to reconnect at every invocation.
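Here is a small sketch of the idea, assuming a database-like client. `FakeDatabase` is a made-up stand-in that counts how many connections are opened, and `handler` is an illustrative name:

```python
import time


class FakeDatabase:
    """Hypothetical stand-in for a real database client."""

    connections_opened = 0

    def __init__(self):
        FakeDatabase.connections_opened += 1
        time.sleep(0.05)  # simulate the cost of opening a connection

    def query(self, key):
        return f"value-for-{key}"


# Global scope: evaluated once per instance, during the cold start.
db = FakeDatabase()


def handler(key):
    # Warm invocations reuse the existing connection instead of reconnecting.
    return db.query(key)
```

However many times `handler` runs on the same instance, `FakeDatabase.connections_opened` stays at 1; only a new instance (a cold start) opens a new connection.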
Packages import into global scope
This part is very close to the previous one, but it is worth going into detail.
As we saw a few times in this article, “there is no guarantee the previous environment will be used”, but if it is, the dependencies won’t need to be reloaded. That’s a lot of time saved, and developers should take advantage of it.
The careful reader that you are noticed it: in the previous section Lazy import, I’m stating exactly the opposite ⇒ Don’t put all the dependencies into the global scope.
The decision to import dependencies in the global scope or not depends on the trigger frequency and function tree. For packages used 100% of the time ⇒ go for global scope. For those used 50% of the time ⇒ lazy import. Between 50% and 100%… well… It’s up to you ;)
In the first case, you get longer cold starts for cold invocations; in the other, longer execution for some warm invocations.
Wise region selection
The region where a Cloud Function is deployed has an impact on processing speed.
Guillaume Blaquiere wrote a very good article about this selection, you can find it here.
There could be a difference of 4 seconds in the execution time between the fastest region of Google Cloud Functions and the slowest. If execution time is important, you should definitely give it a look.
General
More generally, language selection is important: prefer Go or Java over JavaScript. Use library functions, limit printing of variables, remove unused code, import only the methods you use from packages, and so on…
Sources
Official documentation : https://cloud.google.com/functions/docs/bestpractices/tips
Cloud Functions release notes : https://cloud.google.com/functions/docs/release-notes
Min instance doc : https://cloud.google.com/functions/docs/configuring/min-instances#idle_instances_and_cold_starts
Metered article : https://medium.com/@duhroach/improving-cloud-function-cold-start-time-2eb6f5700f6