AWS S3 Zip Files - AntStack
Anaica Grouver
3 min read Oct 15, 2020

Create Zip Using AWS Lambda with Files Streamed from S3

We often find ourselves needing to download multiple files or folders from an S3 bucket as a single zipped file. However, S3 does not yet offer this functionality directly.

I needed to zip and download multiple files from a bucket. Before settling on the method I am about to demonstrate, I tried an approach in which the files are first downloaded to EFS or to the Lambda /tmp folder (limited to 512 MB) and then zipped with an archiver. That approach turned out to be taxing, complicated, slow, and unreliable.

I then discovered this blog (see the reference below), which describes how to pipe the data directly through streams so that no disk storage is involved at all. This intrigued me, as most of my complications disappeared once I didn't have to touch the disk.

In this post, I will walk through how to zip files from an S3 bucket and store the zipped file back into the bucket. We will also see how to set up and use Lambda layers with the Serverless Framework.

[Figure: S3 stream architecture]

To create a Serverless Framework template for Node.js, we use the command below:

serverless create --template aws-nodejs

We will be using Lambda functions, S3 buckets, IAM roles, and Lambda layers for this task. Below is how the serverless.yml file will look. We have included layers for zipping (archiver) and for streaming (stream).

provider:
  name: aws
  runtime: nodejs12.x
  region: us-west-2
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "s3:PutObject"
        - "s3:ListBucket"
        - "s3:GetObject"
        - "s3:PutObjectAcl"
      Resource:
        Fn::Join:
          - ""
          - - "arn:aws:s3:::"
            - Ref: TestBucket
            - "/*"

layers:
  archiver:
    # referenced in functions as ArchiverLambdaLayer
    path: ./Layers/archiver
  stream:
    # referenced in functions as StreamLambdaLayer
    path: ./Layers/stream

functions:
  Zip:
    handler: Zip/index.handler
    layers:
      - { Ref: ArchiverLambdaLayer }
      - { Ref: StreamLambdaLayer }
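Note that for the require calls in the function to resolve, each layer directory typically needs to hold its module under nodejs/node_modules (for example, ./Layers/archiver/nodejs/node_modules/archiver), since that is the path Lambda adds to the Node.js module search path.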

The index handler passes an object of parameters to the streaming logic. The parameter object looks like this:

const params = {
  files: [
    {
      fileName: "excel/1.xlsx", // prefix followed by file name
      key: "excel/1.xlsx", // prefix followed by key name
    },
    {
      fileName: "excel/2.xlsx",
      key: "excel/2.xlsx",
    },
  ],
  zippedFileKey: "zip/zippedFile.zip", // prefix followed by file name
}
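The snippets that follow assume a small amount of setup at the top of Zip/index.js. Here is a minimal sketch; the BUCKET_NAME environment variable is illustrative, so use whatever your stack exposes for the bucket name:

const AWS = require("aws-sdk")
const Archiver = require("archiver") // provided by the archiver layer
const Stream = require("stream") // Node core module

const s3 = new AWS.S3()
const BUCKET_NAME = process.env.BUCKET_NAME // hypothetical env var holding the bucket name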

Next, we create a read stream for each file listed in params.files.

const s3FileDwnldStreams = params.files.map(item => {
  // Bucket is required along with the object key (BUCKET_NAME from the setup above)
  const stream = s3
    .getObject({ Bucket: BUCKET_NAME, Key: item.key })
    .createReadStream()
  return {
    stream,
    fileName: item.fileName,
  }
})
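These streams are lazy, so a failed GetObject only surfaces once the archiver starts reading. As an optional hardening step (a sketch, not part of the original flow), an error listener can be attached to each download stream:

// Optional: surface download failures per file with the file name attached
s3FileDwnldStreams.forEach(({ stream, fileName }) => {
  stream.on("error", err => console.error(`download error for ${fileName}`, err))
})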

We now create a PassThrough stream for the archive to write into, and use it as the body of an S3 upload.

const streamPassThrough = new Stream.PassThrough()

const uploadParams = {
  Bucket: BUCKET_NAME, // bucket to write the zipped file into (see setup above)
  ACL: "public-read", // change to private as per requirement
  Body: streamPassThrough,
  ContentType: "application/zip",
  Key: params.zippedFileKey,
}

const s3Upload = s3.upload(uploadParams, (err, data) => {
  if (err) console.error("upload error", err)
  else console.log("upload done", data)
})

const archive = Archiver("zip", {
  zlib: { level: 0 }, // level 0 stores entries without compression; raise it to compress
})
archive.on("error", error => {
  throw new Error(
    `${error.name} ${error.code} ${error.message} ${error.path} ${error.stack}`
  )
})

We can also use the following listener to track the progress of the upload.

s3Upload.on("httpUploadProgress", progress => {
  console.log(progress)
})
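The progress object includes loaded and (when known) total byte counts, so a rough percentage can be logged. A small sketch:

s3Upload.on("httpUploadProgress", ({ loaded, total }) => {
  // total can be undefined for streamed bodies, so guard before computing a percentage
  if (total) console.log(`uploaded ${((loaded / total) * 100).toFixed(1)}%`)
  else console.log(`uploaded ${loaded} bytes`)
})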

Our last step is to pipe the archive into streamPassThrough, append each download stream to the archive, and wait for the upload to finish.

await new Promise((resolve, reject) => {
  // Pass the handler functions themselves; calling resolve()/reject() here
  // would settle the promise immediately instead of on the event.
  s3Upload.on("close", resolve)
  s3Upload.on("end", resolve)
  s3Upload.on("error", reject)

  archive.pipe(streamPassThrough)
  s3FileDwnldStreams.forEach(s3FileDwnldStream => {
    archive.append(s3FileDwnldStream.stream, {
      name: s3FileDwnldStream.fileName,
    })
  })
  archive.finalize()
}).catch(error => {
  throw new Error(`${error.code} ${error.message} ${error.data}`)
})
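Since s3.upload returns a ManagedUpload, another option (a simpler sketch, useful if the event-based wait above proves unreliable) is to finalize the archive and await the upload's promise() directly:

archive.pipe(streamPassThrough)
s3FileDwnldStreams.forEach(({ stream, fileName }) => {
  archive.append(stream, { name: fileName })
})
archive.finalize() // signal that no more entries will be added
await s3Upload.promise() // resolves once the zipped object has been fully uploaded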

Reference

https://dev.to/lineup-ninja/zip-files-on-s3-with-aws-lambda-and-node-1nm1
