Implementing Multipart Uploads to Amazon S3 using Spring Boot

Table of contents
- Workflow
- Initiating upload
- Uploading Chunks
- Finalizing Upload
- Creating Bucket in AWS S3
- Dependencies
- Configuring the S3 client
- S3Service
- Initiating the upload
- Getting PreSignedUrl for each chunk
- Finalizing upload

In this tutorial we will see how to handle multipart uploads to AWS S3 with Spring Boot.
Simply put, a multipart upload splits a file into chunks, uploads them separately, and finally merges them back into a single object.
The advantages of this approach: we get higher throughput, we can implement pausing and resuming of uploads easily, very large files can be uploaded seamlessly without worrying about server load, and last but not least, error recovery (if one part fails, we retry just that part; I know, that's awesome 😼).
Let's get started. We will be using Spring Boot, but you can use any other framework or language, as the flow stays the same.
Workflow
There are basically three steps.
Initiating upload
The client initiates the upload by sending the metadata of the file, i.e. fileName, fileType, fileSize, etc.
The server then asks S3 to initiate a multipart upload, receives an uploadId in response, and sends that uploadId back to the client, because the uploadId and key will be required in the steps that follow.
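For illustration, the initiation request from the client to our backend might carry a body like this (a sketch; the field names are my own, nothing here is mandated by S3):

// Hypothetical request body the client sends to initiate an upload.
// Only the object key we derive from these fields matters to S3.
public record InitiateUploadRequest(String fileName, String fileType, long fileSize) {}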
Uploading Chunks
Once the client has the uploadId, it starts uploading the file in chunks. For each chunk, the client asks the server for a presigned URL, sending the uploadId and key so the server can identify the corresponding ongoing multipart upload. On the client's behalf, the server requests a presigned URL for that chunk from S3 and, on receiving it, forwards it to the client.
The client then uploads that chunk directly to S3 (no role for the server here) using the presigned URL, which is authorized for that upload only.
Note: for a multipart upload, every chunk except the final one must be at least 5 MB in size. We can also request presigned URLs for multiple chunks at once and upload them in parallel, but that adds complexity, so I will leave it to you.
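To make the chunk step concrete, here is a minimal client-side sketch in Java (assuming Java 11+; in a real app the client is often a browser, but the flow is identical). The presignedUrl parameter is assumed to come from our backend, and the Content-Type must match the one the URL was signed with:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ChunkUploader {

    // Uploads one chunk to its presigned URL and returns the ETag S3 sends back.
    static String uploadChunk(String presignedUrl, byte[] chunk) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(presignedUrl))
                // must match the content type the URL was signed with
                .header("Content-Type", "application/octet-stream")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(chunk))
                .build();
        HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
        // S3 returns the part's ETag in a response header; remember it per
        // part number, because the completion step needs it.
        return response.headers().firstValue("ETag").orElseThrow();
    }
}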
Finalizing Upload
Now, once all the chunks are uploaded, the client asks our backend to finalize the upload and merge the chunks. The question is: how on earth does S3 ensure the chunks end up in the correct order and the data is valid? S3 does this with the help of an ETag map. For every chunk uploaded, S3 returns an ETag in the response headers, which is tied to that chunk number. To finalize the upload, a map of key-value pairs (key = part number, value = the ETag for that chunk) is sent to the backend, where each entry is mapped to a PartETag, a class with two self-explanatory fields:
private int partNumber;
private String eTag;
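For illustration, after uploading three parts the client would end up with a map like this (the ETag values here are made up):

// Illustrative only: the partNumber -> ETag map the client sends to finalize.
// Each value is whatever S3 returned in that part's response headers.
Map<Integer, String> etagMap = Map.of(
        1, "\"a54357aff0632cce46d942af68356b38\"",
        2, "\"0c78aef83f66abc1fa1e8477f296d394\"",
        3, "\"acbd18db4cc2f85cedef654fccc4a4d8\""
);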
I hope the workflow is clear by now. It's time for code, but one important thing is left first: creating the bucket in AWS.
Creating Bucket in AWS S3
- Log in to the AWS console and search for S3.
- Click on "Create bucket".
- Choose General purpose and enter a unique name for the bucket.
- Leave everything as default, but allow all public access for now (as our main concern today is uploading).
- The bucket is created; now we need to create access credentials for it.
- Go to IAM and create a user.
- Attach the required S3 permission policies (for a quick demo, the AmazonS3FullAccess managed policy works) and finalize.
- Now create access credentials.
- Choose "Application running outside AWS".
- Copy the credentials, the Access Key and Secret Access Key, and save them.
Once we have created the bucket and the user, we are ready for the code part.
Dependencies
These are the dependencies we will use. For this tutorial, only spring-boot-starter-web and aws-java-sdk-s3 (plus Lombok, optionally) are actually required; the rest carried over from my project.
<dependencies>
    <!-- Required for this tutorial -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-s3</artifactId>
        <version>1.12.707</version>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <!-- The dependencies below come from my larger project and are not needed for multipart uploads -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-amqp</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.amqp</groupId>
        <artifactId>spring-rabbit-test</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.kafka</groupId>
        <artifactId>spring-kafka</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.kafka</groupId>
        <artifactId>spring-kafka-test</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
    </dependency>
</dependencies>
Configuring the S3 client
We can configure our S3 client by creating a bean of AmazonS3. Inside that bean we set up the credentials using AWSStaticCredentialsProvider and AWSCredentials. The AmazonS3 object itself is created with AmazonS3ClientBuilder, on which we set the credentials, the client configuration, and the region of the bucket.
package com.vsnt.asset_onboarding.config;

import com.amazonaws.ClientConfiguration;
import com.amazonaws.Protocol;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AWSConfig {

    String accessKeyId = Secrets.AWS_ACCESS_KEY_ID;
    String secretAccessKey = Secrets.AWS_SECRET_KEY;

    @Bean
    public AmazonS3 getS3Client() {
        AWSCredentials credentials = new BasicAWSCredentials(accessKeyId, secretAccessKey);
        ClientConfiguration config = new ClientConfiguration();
        // Use HTTPS so credentials and presigned URLs are not sent in the clear
        config.setProtocol(Protocol.HTTPS);
        return AmazonS3ClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(credentials))
                .withClientConfiguration(config)
                .withRegion(Regions.AP_SOUTH_1) // the region your bucket lives in
                .build();
    }
}
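The Secrets class referenced above is not shown in this article; a minimal stand-in could look like the following (reading from environment variables is my own choice here; hardcoding also works, but credentials should never be committed to source control):

package com.vsnt.asset_onboarding.config;

// Minimal sketch of the Secrets holder used throughout this article.
// In real projects, prefer Spring's externalized configuration over constants.
public class Secrets {
    public static final String AWS_ACCESS_KEY_ID = System.getenv("AWS_ACCESS_KEY_ID");
    public static final String AWS_SECRET_KEY = System.getenv("AWS_SECRET_KEY");
    public static final String AWS_BUCKET_NAME = System.getenv("AWS_BUCKET_NAME");
}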
S3Service
Create a service class that contains the methods for interacting with S3, with that AmazonS3 client injected into it.
package com.vsnt.asset_onboarding.services;

import com.amazonaws.HttpMethod;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.*;
import com.vsnt.asset_onboarding.config.Secrets;
import org.springframework.stereotype.Service;

import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

@Service
public class S3Service {

    private final AmazonS3 s3;

    public S3Service(AmazonS3 s3) {
        this.s3 = s3;
    }

    // Step 1: ask S3 to start a multipart upload and hand the uploadId back to the client
    public String startMultiPartUpload(String key) {
        InitiateMultipartUploadRequest request = new InitiateMultipartUploadRequest(Secrets.AWS_BUCKET_NAME, key);
        InitiateMultipartUploadResult result = s3.initiateMultipartUpload(request);
        return result.getUploadId();
    }

    // Step 2: generate a presigned PUT URL for one chunk of the ongoing upload
    public String getPreSignedURLForMultipartUploadChunk(String uploadId, int chunkNumber, String key) {
        GeneratePresignedUrlRequest request = new GeneratePresignedUrlRequest(Secrets.AWS_BUCKET_NAME, key)
                .withMethod(HttpMethod.PUT)
                .withContentType("application/octet-stream");
        request.addRequestParameter("uploadId", uploadId);
        request.addRequestParameter("partNumber", String.valueOf(chunkNumber));
        URL url = s3.generatePresignedUrl(request);
        return url.toString();
    }

    // Step 3: tell S3 to stitch the uploaded parts together, using the partNumber -> ETag map
    public void completeMultipartUpload(String uploadId, Map<Integer, String> etagMap, String key) {
        CompleteMultipartUploadRequest request = new CompleteMultipartUploadRequest();
        request.setUploadId(uploadId);
        request.setBucketName(Secrets.AWS_BUCKET_NAME);
        request.setKey(key);
        List<PartETag> partETags = new ArrayList<>();
        for (Map.Entry<Integer, String> etag : etagMap.entrySet()) {
            partETags.add(new PartETag(etag.getKey(), etag.getValue()));
        }
        request.setPartETags(partETags);
        s3.completeMultipartUpload(request);
    }
}
I will discuss each method one by one
Initiating the upload
We initiate the upload with the help of the InitiateMultipartUploadRequest class and return the resulting upload ID:
public String startMultiPartUpload(String key) {
    InitiateMultipartUploadRequest request = new InitiateMultipartUploadRequest(Secrets.AWS_BUCKET_NAME, key);
    InitiateMultipartUploadResult result = s3.initiateMultipartUpload(request);
    return result.getUploadId();
}
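For context, a hypothetical controller endpoint wiring this method up could look like this (the class, paths, and parameter names are my own, not from the original project):

import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/uploads")
public class UploadController {

    private final S3Service s3Service;

    public UploadController(S3Service s3Service) {
        this.s3Service = s3Service;
    }

    // The client sends the desired object key and gets back the uploadId
    // it must pass along in every subsequent step.
    @PostMapping("/initiate")
    public String initiate(@RequestParam String key) {
        return s3Service.startMultiPartUpload(key);
    }
}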
Getting PreSignedUrl for each chunk
Now we need to generate a presigned URL, which we do with the GeneratePresignedUrlRequest class, passing the bucket name and key along with the HTTP method and content type.
For a chunk we also need to add the upload ID and part number as request parameters, and finally we generate the presigned URL via the injected S3 client's generatePresignedUrl(GeneratePresignedUrlRequest) method:
public String getPreSignedURLForMultipartUploadChunk(String uploadId, int chunkNumber, String key) {
    GeneratePresignedUrlRequest request = new GeneratePresignedUrlRequest(Secrets.AWS_BUCKET_NAME, key)
            .withMethod(HttpMethod.PUT)
            .withContentType("application/octet-stream");
    request.addRequestParameter("uploadId", uploadId);
    request.addRequestParameter("partNumber", String.valueOf(chunkNumber));
    URL url = s3.generatePresignedUrl(request);
    return url.toString();
}
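Continuing the hypothetical UploadController sketched above, the client could fetch one presigned URL per chunk like this:

// Added to the hypothetical UploadController from the previous section.
// The client calls this once per chunk, then PUTs the chunk bytes directly
// to the returned URL.
@GetMapping("/presigned-url")
public String presignedUrl(@RequestParam String uploadId,
                           @RequestParam int partNumber,
                           @RequestParam String key) {
    return s3Service.getPreSignedURLForMultipartUploadChunk(uploadId, partNumber, key);
}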
Finalizing upload
The completeMultipartUpload method finalizes a multipart upload to AWS S3. It takes the uploadId, a map of partNumber → ETag, and the file key. Using this data, it creates a request that tells S3 to stitch all the uploaded parts into the final, complete file. Without this step, the uploaded chunks stay incomplete and unused.
Inside the method, we first create a CompleteMultipartUploadRequest and set the bucket name, object key, and upload ID. Then we iterate over the ETag map and build a list of PartETag objects; these carry the per-chunk metadata S3 needs to verify and stitch the parts together. Once the list is built, it is attached to the request, which is sent using the completeMultipartUpload method of the AmazonS3 client.
public void completeMultipartUpload(String uploadId, Map<Integer, String> etagMap, String key) {
    CompleteMultipartUploadRequest request = new CompleteMultipartUploadRequest();
    request.setUploadId(uploadId);
    request.setBucketName(Secrets.AWS_BUCKET_NAME);
    request.setKey(key);
    List<PartETag> partETags = new ArrayList<>();
    for (Map.Entry<Integer, String> etag : etagMap.entrySet()) {
        partETags.add(new PartETag(etag.getKey(), etag.getValue()));
    }
    request.setPartETags(partETags);
    s3.completeMultipartUpload(request);
}
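And to round off the hypothetical UploadController, a completion endpoint could accept the partNumber → ETag map as the request body:

// Added to the hypothetical UploadController. Jackson deserializes the JSON
// object body into the partNumber -> ETag map expected by the service.
@PostMapping("/complete")
public void complete(@RequestParam String uploadId,
                     @RequestParam String key,
                     @RequestBody Map<Integer, String> etagMap) {
    s3Service.completeMultipartUpload(uploadId, etagMap, key);
}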
Now you can take these methods and adapt them to your use case.
That's it for today. If you have any questions, ask below. See you in the next article soon; until then, bye.