Merging Video Segments with AWS MediaConvert

SHUBHAM MEHRA

A step‑by‑step guide to building a Temporal activity that:

  1. Uploads video segments & audio to S3

  2. Concatenates the video segments (no audio)

  3. Overlays the voice‑over track

  4. Polls for completion and returns the final S3 URL


Prerequisites

  • Node.js >= 20

  • AWS SDK v3 (@aws-sdk/client-s3, @aws-sdk/client-mediaconvert)

  • Temporal Node.js SDK

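  • IAM Role MediaConvertExecutionRole, assumable by mediaconvert.amazonaws.com, with S3 access:

      {
        "Effect":"Allow",
        "Action":["s3:GetObject","s3:PutObject"],
        "Resource":["arn:aws:s3:::<BUCKET>/*"]
      }

    The credentials that submit jobs separately need mediaconvert:CreateJob, mediaconvert:GetJob, and iam:PassRole on this role; mediaconvert:* has no effect on an S3 resource ARN.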
    
  • .env variables:

      AWS_ACCESS_KEY_ID=...
      AWS_SECRET_ACCESS_KEY=...
      AWS_REGION=us-east-1
      MEDIACONVERT_ENDPOINT=https://<endpoint>.mediaconvert.<region>.amazonaws.com
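
A missing variable otherwise only surfaces later as an opaque SDK error, so it can help to validate the environment once at startup. The following is a minimal sketch of that check; the `AwsEnvConfig` interface and the `loadAwsEnvConfig` function are hypothetical names, not part of the original project.

```typescript
// Hypothetical shape for the values read from .env; field names are illustrative.
interface AwsEnvConfig {
  region: string;
  accessKeyId: string;
  secretAccessKey: string;
  mediaConvertEndpoint: string;
}

// Reads the four variables from an env-like object and throws early,
// listing every missing name, instead of failing later inside an SDK call.
function loadAwsEnvConfig(env: Record<string, string | undefined>): AwsEnvConfig {
  const required = [
    'AWS_ACCESS_KEY_ID',
    'AWS_SECRET_ACCESS_KEY',
    'AWS_REGION',
    'MEDIACONVERT_ENDPOINT',
  ];
  const missing = required.filter(name => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    region: env['AWS_REGION']!,
    accessKeyId: env['AWS_ACCESS_KEY_ID']!,
    secretAccessKey: env['AWS_SECRET_ACCESS_KEY']!,
    mediaConvertEndpoint: env['MEDIACONVERT_ENDPOINT']!,
  };
}
```

In config/aws.ts this would presumably be called once as `loadAwsEnvConfig(process.env)`.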
    

File Structure

src/
├─ config/
│  ├─ aws.ts            # AWS bucket & role config
│  └─ context.ts        # Instantiates S3Client & MediaConvertClient
├─ activities/
│  └─ mergeSegments.ts  # Temporal activity with 2-step MediaConvert pipeline
└─ workflows/
   └─ videoWorkflow.ts  # Calls mergeSegments activity

Step-by-Step Implementation

Step 1: Upload files to S3

Upload each video segment and the audio file with the correct ContentType so that S3 stores accurate metadata for every object.

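// snippet: uploadToS3 helper (s3 is the S3Client created in config/context.ts)
import fs from 'node:fs';
import { PutObjectCommand } from '@aws-sdk/client-s3';

async function uploadToS3(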
  localPath: string,
  s3Key: string,
  bucket: string,
  contentType: string
) {
  const stream = fs.createReadStream(localPath);
  await s3.send(new PutObjectCommand({ Bucket: bucket, Key: s3Key, Body: stream, ContentType: contentType }));
  return `s3://${bucket}/${s3Key}`;
}

In mergeSegments:

const prefix = `videogeneration/${workflowId}/`;
const sceneInputs = await Promise.all(
  sceneFiles.map((file, i) =>
    uploadToS3(file, `${prefix}scene-${i}.mp4`, bucket, 'video/mp4')
      .then(uri => ({ FileInput: uri }))
  )
);
const audioUri = await uploadToS3(audioFile, `${prefix}audio.wav`, bucket, 'audio/wav');
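
The calls above hard-code 'video/mp4' and 'audio/wav'. If the segments or voice-over may arrive in other formats, the content type could be looked up from the file extension instead. This is a small sketch under that assumption; `CONTENT_TYPES` and `contentTypeFor` are hypothetical names, and the map covers only a handful of common formats.

```typescript
// Extension-to-MIME map: the two formats used in this guide plus a few common ones.
const CONTENT_TYPES: Record<string, string> = {
  mp4: 'video/mp4',
  mov: 'video/quicktime',
  wav: 'audio/wav',
  mp3: 'audio/mpeg',
  m4a: 'audio/mp4',
};

// Returns the MIME type for a path based on its extension (case-insensitive),
// falling back to a generic binary type for unknown or missing extensions.
function contentTypeFor(filePath: string): string {
  const dot = filePath.lastIndexOf('.');
  const ext = dot === -1 ? '' : filePath.slice(dot + 1).toLowerCase();
  return CONTENT_TYPES[ext] ?? 'application/octet-stream';
}
```

An upload call would then read `uploadToS3(file, key, bucket, contentTypeFor(file))`.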

Step 2: Concatenate video segments (Job A)

Strip audio and merge the scenes.

const jobAParams: CreateJobCommandInput = {
  Role: role,
  Settings: {
    Inputs: sceneInputs.map(inp => ({ ...inp, VideoSelector: {}, AudioSelectors: {} })),
    OutputGroups: [{
      Name: 'File Group',
      OutputGroupSettings: { Type: 'FILE_GROUP_SETTINGS', FileGroupSettings: { Destination: `s3://${bucket}/${prefix}` } },
      Outputs: [{
        NameModifier: '_concat',
        ContainerSettings: { Container: 'MP4' },
        VideoDescription: {
          CodecSettings: { Codec: 'H_264', H264Settings: { RateControlMode: 'QVBR', SceneChangeDetect: 'TRANSITION_DETECTION', MaxBitrate: 5_000_000 } },
          Width: 1280, Height: 720
        }
      }]
    }],
    TimecodeConfig: { Source: 'ZEROBASED' }
  },
  UserMetadata: { workflowId }
};
const { Job: jobA } = await mediaConvert.send(new CreateJobCommand(jobAParams));
if (!jobA?.Id) throw new Error('MediaConvert returned no job ID for Job A');
await pollUntilComplete(jobA.Id);

// Derive URI of merged file. MediaConvert names the output after the first
// input's S3 file name (scene-0.mp4, not the local path) plus the NameModifier:
const mergedUri = `s3://${bucket}/${prefix}scene-0_concat.mp4`;
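
When a file-group Destination ends with a slash, MediaConvert builds the output name from the first input's file name, the NameModifier, and the container extension. The rule can be captured in a small helper so the URI is not assembled by hand in several places. `expectedOutputUri` is a hypothetical name, and the sketch assumes a single output per job with a destination that ends in "/".

```typescript
// Derives the S3 URI a file-group output is expected to land at:
// <destination><base name of first input><NameModifier>.<extension>
function expectedOutputUri(
  destination: string,   // e.g. s3://bucket/prefix/ (assumed to end with "/")
  firstInputUri: string, // S3 URI of the first entry in Settings.Inputs
  nameModifier: string,
  extension = 'mp4'
): string {
  // Keep only the file name, then drop its extension if it has one.
  const fileName = firstInputUri.slice(firstInputUri.lastIndexOf('/') + 1);
  const dot = fileName.lastIndexOf('.');
  const base = dot > 0 ? fileName.slice(0, dot) : fileName;
  return `${destination}${base}${nameModifier}.${extension}`;
}
```

For Job A this would be called with the destination prefix, the URI of the first uploaded scene, and '_concat'.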

Step 3: Overlay voice‑over (Job B)

Use the concatenated video as input and attach the audio as an external track.

const jobBParams: CreateJobCommandInput = {
  Role: role,
  Settings: {
    Inputs: [{
      FileInput: mergedUri,
      VideoSelector: {},
      AudioSelectors: {
        'Audio Selector 1': { DefaultSelection: AudioDefaultSelection.DEFAULT, ExternalAudioFileInput: audioUri }
      }
    }],
    OutputGroups: [{
      Name: 'File Group',
      OutputGroupSettings: { Type: 'FILE_GROUP_SETTINGS', FileGroupSettings: { Destination: `s3://${bucket}/${prefix}` } },
      Outputs: [{
        NameModifier: `final_${sanitizeForFilename(workflowId)}`,
        ContainerSettings: { Container: 'MP4' },
        VideoDescription: { CodecSettings: { Codec: 'H_264', H264Settings: { RateControlMode: 'QVBR', SceneChangeDetect: 'TRANSITION_DETECTION', MaxBitrate: 5_000_000 } }, Width: 1280, Height: 720 },
        AudioDescriptions: [{ AudioSourceName: 'Audio Selector 1', CodecSettings: { Codec: 'AAC', AacSettings: { Bitrate: 96_000, CodingMode: 'CODING_MODE_2_0', SampleRate: 48_000 } } }]
      }]
    }],
    TimecodeConfig: { Source: 'ZEROBASED' }
  },
  UserMetadata: { workflowId }
};
const { Job: jobB } = await mediaConvert.send(new CreateJobCommand(jobBParams));
if (!jobB?.Id) throw new Error('MediaConvert returned no job ID for Job B');
await pollUntilComplete(jobB.Id);
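
The `sanitizeForFilename` helper used in the NameModifier is not shown in this guide. One possible implementation, offered as an assumption rather than the author's actual code, keeps letters, digits, underscores, and dashes and replaces everything else:

```typescript
// Hypothetical implementation: makes an arbitrary ID safe to embed in an S3 key / file name.
function sanitizeForFilename(value: string): string {
  return value
    .replace(/[^A-Za-z0-9_-]+/g, '-') // each run of other characters becomes one dash
    .replace(/^-+|-+$/g, '')          // trim dashes left at either end
    .slice(0, 100);                   // arbitrary cap to keep names short
}
```

Temporal workflow IDs can contain slashes, colons, or spaces, which is why some normalization is worth doing before the ID ends up in an output name.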

Step 4: Poll for job completion

async function pollUntilComplete(jobId: string) {
  let status: string | undefined = 'SUBMITTED';
  while (status === 'SUBMITTED' || status === 'PROGRESSING') {
    await new Promise(r => setTimeout(r, 10_000));
    status = (await mediaConvert.send(new GetJobCommand({ Id: jobId }))).Job?.Status;
  }
  // Anything other than COMPLETE (ERROR, CANCELED, or a missing status) is a failure.
  if (status !== 'COMPLETE') throw new Error(`Job ${jobId} failed: ${status}`);
}
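
A fixed 10-second interval with no upper bound works for short clips, but a stuck job would keep the activity polling indefinitely. A variant with exponential backoff and an overall deadline is sketched below. `backoffDelayMs` and `pollWithBackoff` are hypothetical names, the default delays and timeout are arbitrary, and the status lookup is passed in as a function so the loop itself does not depend on the AWS SDK.

```typescript
type JobStatus = 'SUBMITTED' | 'PROGRESSING' | 'COMPLETE' | 'CANCELED' | 'ERROR';

// Delay before the next status check: doubles with each attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 2_000, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Polls getStatus until the job completes, fails, or the deadline passes.
// getStatus and sleep are injected so the loop can be exercised without AWS or real timers.
async function pollWithBackoff(
  getStatus: () => Promise<JobStatus | undefined>,
  timeoutMs = 30 * 60_000,
  sleep: (ms: number) => Promise<void> = ms => new Promise<void>(r => setTimeout(r, ms))
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (let attempt = 0; Date.now() < deadline; attempt++) {
    const status = await getStatus();
    if (status === 'COMPLETE') return;
    if (status !== 'SUBMITTED' && status !== 'PROGRESSING') {
      throw new Error(`Job ended with status ${status ?? 'unknown'}`);
    }
    await sleep(backoffDelayMs(attempt));
  }
  throw new Error(`Job did not finish within ${timeoutMs} ms`);
}
```

In the activity, `getStatus` would wrap the same `GetJobCommand` call used above. Inside a Temporal activity it may also be worth calling `Context.current().heartbeat()` from `@temporalio/activity` on each iteration, so that a heartbeat timeout can detect a stalled worker.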

Step 5: Return final URL

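// MediaConvert writes <input base name><NameModifier>.mp4, so Job B's output is scene-0_concat + final_<id>
return `s3://${bucket}/${prefix}scene-0_concatfinal_${sanitizeForFilename(workflowId)}.mp4`;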
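
The activity returns an s3:// URI, which is convenient for other AWS jobs but not directly usable in a browser. If callers need an HTTPS address, the URI can be converted to the virtual-hosted-style form. This is a sketch with a hypothetical helper name, `s3UriToHttpsUrl`; the resulting URL is only fetchable if the object is publicly readable, otherwise a presigned URL would be needed.

```typescript
// Converts s3://bucket/key into https://bucket.s3.<region>.amazonaws.com/key,
// percent-encoding each key segment while preserving the "/" separators.
function s3UriToHttpsUrl(s3Uri: string, region: string): string {
  const scheme = 's3://';
  if (!s3Uri.startsWith(scheme)) throw new Error(`Not an S3 URI: ${s3Uri}`);
  const rest = s3Uri.slice(scheme.length);
  const slash = rest.indexOf('/');
  if (slash <= 0 || slash === rest.length - 1) {
    throw new Error(`S3 URI must contain a bucket and a key: ${s3Uri}`);
  }
  const bucket = rest.slice(0, slash);
  const key = rest.slice(slash + 1);
  const encodedKey = key.split('/').map(part => encodeURIComponent(part)).join('/');
  return `https://${bucket}.s3.${region}.amazonaws.com/${encodedKey}`;
}
```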

Benefits

  • Clear, numbered steps improve readability

  • Offloads heavy work to AWS MediaConvert

  • Leverages Temporal for retries & orchestration

Feel free to copy this template and adapt it for your pipelines!
