FaaS systems have storage limits just as they have memory limits. If you have looked at the code in the example application, you may have noticed that I download files and store them in /tmp before processing them. This strategy isn't strictly necessary, since it's possible to read data from S3 directly into memory. In my testing, however, I found performance gains when downloading the files to disk and then reading them with the standard filesystem open calls. Some of the CSV APIs I was using were also easier to use with a real file handle than with a String in memory.
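As a rough illustration of the download-then-read approach, here is a minimal Python sketch. It is not the code from the example application. The function name `download_and_read_csv` and its parameters are hypothetical, and `s3_client` is assumed to be a boto3 S3 client (or anything exposing a compatible `download_file` method). The sketch also assumes the object key ends in a file name and that the file is small enough to fit in local storage.

```python
import csv
import os
import tempfile


def download_and_read_csv(s3_client, bucket, key, tmp_dir=None):
    """Download an S3 object to local disk, then parse it as CSV.

    Hypothetical helper: s3_client is assumed to behave like a boto3
    S3 client, i.e. to provide download_file(Bucket, Key, Filename).
    """
    # Default to the platform temp directory (typically /tmp on Linux).
    tmp_dir = tmp_dir or tempfile.gettempdir()
    local_path = os.path.join(tmp_dir, os.path.basename(key))

    # Write the object to local storage instead of buffering it in memory.
    s3_client.download_file(bucket, key, local_path)
    try:
        # The csv module reads from a real file handle.
        with open(local_path, newline="") as handle:
            return list(csv.DictReader(handle))
    finally:
        # Delete the file so it does not keep consuming local storage.
        os.remove(local_path)
```

Removing the file in the `finally` block is a deliberate choice: local storage is limited, and a file left behind may still be there if the same execution environment is reused for a later invocation.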
When downloading data and storing the files locally, you must keep in mind the storage limits enforced by your FaaS provider. For example, AWS Lambda currently ...