Why We Picked Fabric.js for Our Video Editor’s Crop Feature

le jack

✨ Overview

While working on a short-form video editor project at my company, we needed to overlay various elements—video clips, text, images, background music, templates, fonts, blur boxes, and more—on top of the video preview.

To achieve this, we set up a 9:16 aspect ratio container and placed all elements within it. The elements needed to support drag-and-drop (DnD) and resizing by default, and the text had to be editable directly inside the container.

For subtitles, we chose to use the .ass (Advanced SubStation Alpha) format instead of WebVTT. While WebVTT only supports basic text and styling, ASS offers rich features like font, color, position, and animation. We also considered compatibility with ffmpeg on the server side, which made ASS the better fit.

To figure out the right approach, we looked at several references (like CapCut, Canva, and other web-based tools) and noticed they all rely on canvas as the base. The next question was: how should we leverage the canvas?

The first idea that came to mind was using the native Canvas API. I assumed there would be plenty of resources, especially for implementing cropping, and that turned out to be true. I also recalled using fabric.js back in university when building a drawing-based web editor for secondary media production—it’s vector-based, so it allowed smooth drawing. Though that was long ago and I hadn’t implemented cropping before, I knew Fabric could handle DnD and resizing.

Lastly, I discovered pixi.js through an article by the team behind the game video editor Dor. After reviewing the official documentation and various resources, I concluded it was worth testing.

In the end, no solution covered everything perfectly. After experimenting with all three, we ultimately chose fabric.js. The reason wasn’t just cropping; it was that, compared to Fabric, the other options were less suitable for implementing the broader feature set we needed.

In this post, I’ll compare the three approaches (with a focus on cropping) and share how we reached the final decision.


📦 Crop Data Structure

The crop data sent to the server consists of four fields:

  • start_x: the starting x-coordinate for the crop

  • start_y: the starting y-coordinate for the crop

  • width: the width of the cropped area

  • height: the height of the cropped area

These values are normalized to the range 0 to 1, relative to the original video resolution; they are not absolute pixel values.

For example, to select the full area of a 1920×1080 video, you would set start_x: 0, start_y: 0, width: 1, and height: 1.
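
As a rough sketch of how these normalized values map back to pixels on the source video: the `CropData` and `PixelRect` types and the `toPixelRect` helper below are hypothetical names used for illustration, and the clamping behavior is an assumption rather than something our server requires.

```typescript
// Hypothetical types and helper, for illustration only.
type CropData = { start_x: number; start_y: number; width: number; height: number };
type PixelRect = { x: number; y: number; w: number; h: number };

// Clamp a value into [0, 1] so out-of-range input cannot escape the frame.
const clamp01 = (v: number): number => Math.min(1, Math.max(0, v));

// Convert normalized crop data into a pixel rectangle on the source video.
function toPixelRect(crop: CropData, videoWidth: number, videoHeight: number): PixelRect {
  const x = clamp01(crop.start_x);
  const y = clamp01(crop.start_y);
  // The crop may not extend past the right or bottom edge of the frame.
  const w = Math.min(clamp01(crop.width), 1 - x);
  const h = Math.min(clamp01(crop.height), 1 - y);
  return {
    x: Math.round(x * videoWidth),
    y: Math.round(y * videoHeight),
    w: Math.round(w * videoWidth),
    h: Math.round(h * videoHeight),
  };
}

// Full frame of a 1920x1080 video.
const fullFrame = toPixelRect({ start_x: 0, start_y: 0, width: 1, height: 1 }, 1920, 1080);

// Centered crop covering half the width and half the height.
const centered = toPixelRect(
  { start_x: 0.25, start_y: 0.25, width: 0.5, height: 0.5 },
  1920,
  1080,
);
```

Keeping the payload normalized means the same crop data stays valid if the server works on a transcoded copy with a different resolution.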


🧪 Comparison of Crop Approaches by Technology

We’ll go into the details of each rendering method and background knowledge in a future post. For now, here’s a summary:

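| Item                 | Canvas API                     | Pixi.js            | Fabric.js                       |
|----------------------|--------------------------------|--------------------|---------------------------------|
| Rendering Type       | Canvas 2D                      | WebGL + Canvas     | Canvas 2D                       |
| Precise Crop Control | Manual coordinate calculations | Using mask objects | Injecting cropRect into objects |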

1️⃣ Using the Canvas API

This approach uses an HTML <video> element with the Canvas API and requestAnimationFrame().

Example Code

'use client';

import {
  forwardRef,
  useCallback,
  useEffect,
  useRef,
} from 'react';

type Props = {
  crop: { start_x: number; start_y: number; width: number; height: number };
  url: string;
  videoEl: HTMLVideoElement | null;
  metadata: {
    width: number;
    height: number;
  };
};

export default forwardRef<HTMLVideoElement, Props>(
  function VanillaVideoCanvas(props, ref) {
    const { crop, url, videoEl, metadata } = props;
    const w = metadata.width;
    const h = metadata.height;

    const canvasRef = useRef<HTMLCanvasElement>(null);
    const requestRef = useRef<number>(0);
    const cropRef = useRef(crop);

    useEffect(() => {
      cropRef.current = crop;
    }, [crop]);

    const render = useCallback(() => {
      const canvas = canvasRef.current;
      const ctx = canvas?.getContext('2d');
      if (!canvas || !ctx || !videoEl || videoEl.paused || videoEl.ended)
        return;

      const crop = cropRef.current;

      ctx.drawImage(
        videoEl,
        w * crop.start_x,
        h * crop.start_y,
        w * crop.width,
        h * crop.height,
        0,
        0,
        canvas.width,
        canvas.height,
      );

      requestRef.current = requestAnimationFrame(render);
    }, [w, h, videoEl]);

    useEffect(() => {
      if (!videoEl) return;

      const onPlay = () => {
        render();
      };

      videoEl.addEventListener('play', onPlay);
      return () => {
        videoEl.removeEventListener('play', onPlay);
        cancelAnimationFrame(requestRef.current);
      };
    }, [render, videoEl]);

    return (
      <>
        <div className="absolute left-0 top-0 z-10 h-full w-full">
          <canvas
            ref={canvasRef}
            width={w}
            height={h}
            style={{ backgroundColor: 'white' }}
          />
        </div>
        <video
          ref={ref}
          className="invisible absolute left-0 top-0 h-full"
          width={w}
          height={h}
          src={url}
        />
      </>
    );
  },
);

About drawImage()

The drawImage() method on the HTML5 Canvas 2D context draws images, videos, or other canvases. It has three forms, depending on how many arguments you pass:

drawImage(image, dx, dy)
drawImage(image, dx, dy, dWidth, dHeight)
drawImage(image, sx, sy, sWidth, sHeight, dx, dy, dWidth, dHeight)

| Parameter       | Description                                                                            |
|-----------------|----------------------------------------------------------------------------------------|
| image           | The source (HTMLImageElement, HTMLVideoElement, HTMLCanvasElement, ImageBitmap, etc.)  |
| sx, sy          | Coordinates of the top-left corner of the source rectangle                             |
| sWidth, sHeight | Width and height of the source rectangle                                               |
| dx, dy          | Destination coordinates on the canvas                                                  |
| dWidth, dHeight | Width and height to draw (scaling)                                                     |
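
To make the nine-argument form concrete, here is a small sketch that computes its eight numeric arguments from normalized crop data. It assumes you want the cropped region scaled to fit the canvas without distortion, whereas the VanillaVideoCanvas component above stretches it over the whole canvas. The `drawImageArgs` helper and the `Crop` and `Size` types are hypothetical.

```typescript
// Hypothetical types and helper, for illustration only.
type Crop = { start_x: number; start_y: number; width: number; height: number };
type Size = { width: number; height: number };
type DrawArgs = [number, number, number, number, number, number, number, number];

// Returns [sx, sy, sWidth, sHeight, dx, dy, dWidth, dHeight].
function drawImageArgs(crop: Crop, video: Size, canvas: Size): DrawArgs {
  // Source rectangle: the normalized crop scaled to video pixels.
  const sx = crop.start_x * video.width;
  const sy = crop.start_y * video.height;
  const sWidth = crop.width * video.width;
  const sHeight = crop.height * video.height;

  // Destination rectangle: the largest size that fits the canvas
  // while keeping the crop's aspect ratio, centered on both axes.
  const scale = Math.min(canvas.width / sWidth, canvas.height / sHeight);
  const dWidth = sWidth * scale;
  const dHeight = sHeight * scale;
  const dx = (canvas.width - dWidth) / 2;
  const dy = (canvas.height - dHeight) / 2;

  return [sx, sy, sWidth, sHeight, dx, dy, dWidth, dHeight];
}

// Middle half of a 1920x1080 video, drawn into a 1080x1920 (9:16) canvas.
const args = drawImageArgs(
  { start_x: 0.25, start_y: 0, width: 0.5, height: 1 },
  { width: 1920, height: 1080 },
  { width: 1080, height: 1920 },
);

// In a browser: ctx.drawImage(videoEl, ...args);
```

Keeping the arithmetic in a pure function like this makes it easy to check the numbers without a canvas or a video element.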

✅ Advantages

  • Cropping is straightforward.

  • Good browser compatibility and performance.

❌ Limitations

  • Rendering styled text (outlines, shadows, etc.) with the browser’s native text rendering is limited.

  • Managing visibility and placement of multiple timeline elements adds complexity.

  • Frequent reflows from transforms and size adjustments can affect performance.

📌 Verdict

While great for simple cropping, it didn’t meet the full set of editor requirements.


2️⃣ Using Pixi.js

Pixi.js is a WebGL-based 2D rendering engine. Because it batches draw calls and renders through the GPU, it generally outperforms the Canvas 2D API on scenes with many objects.

Example Code

// Assumes Pixi.js v8; `videoUrl` and `crop` come from the surrounding component.
const app = new PIXI.Application();
await app.init({ /* options */ });

const video = document.createElement('video');
video.src = videoUrl;

const container = new PIXI.Container({ label: 'video-container' });
app.stage.addChild(container);

// Wait until the browser has buffered enough to play without stalling.
const videoLoadPromise: Promise<HTMLVideoElement> = new Promise((resolve) => {
  video.addEventListener('canplaythrough', () => resolve(video), { once: true });
});

const loadedVideo = await videoLoadPromise;
const videoTexture = PIXI.Texture.from(loadedVideo);
const videoSprite = new PIXI.Sprite(videoTexture);

// Convert the normalized crop into source pixels.
const cropX = crop.start_x * loadedVideo.videoWidth;
const cropY = crop.start_y * loadedVideo.videoHeight;
const cropWidth = crop.width * loadedVideo.videoWidth;
const cropHeight = crop.height * loadedVideo.videoHeight;

// The mask shape needs a fill; a bare path is not rendered.
const mask = new PIXI.Graphics()
  .rect(cropX, cropY, cropWidth, cropHeight)
  .fill(0xffffff);
videoSprite.setMask({ mask, inverse: false });

container.addChild(videoSprite, mask);

⚙️ How setMask Works

  • Defines a mask object (like a Graphics shape or Sprite) that controls the visible area.

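  • When the mask is a Graphics shape, it is applied through the GPU’s stencil buffer at render time.

  • With inverse: true, the mask is inverted: the area inside the shape is hidden and everything outside it stays visible.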
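
One detail worth spelling out: unlike the nine-argument drawImage(), a mask only hides pixels and does not move the visible region. Below is a minimal sketch of the arithmetic, assuming you want the cropped area to sit at the container's origin. The `maskLayout` helper and its return shape are hypothetical, and the code deliberately makes no Pixi calls.

```typescript
// Hypothetical types and helper, for illustration only.
type NormalizedCrop = { start_x: number; start_y: number; width: number; height: number };

type MaskLayout = {
  // Where to place the video sprite inside its container.
  spritePosition: { x: number; y: number };
  // Rectangle to draw into the mask shape, in container coordinates.
  maskRect: { x: number; y: number; width: number; height: number };
};

function maskLayout(
  crop: NormalizedCrop,
  videoWidth: number,
  videoHeight: number,
): MaskLayout {
  const cropX = crop.start_x * videoWidth;
  const cropY = crop.start_y * videoHeight;

  return {
    // Shift the sprite up and left so the crop's top-left corner lands on (0, 0).
    spritePosition: { x: -cropX, y: -cropY },
    // The mask then starts at the origin and has the size of the crop.
    maskRect: {
      x: 0,
      y: 0,
      width: crop.width * videoWidth,
      height: crop.height * videoHeight,
    },
  };
}

// Bottom-right quadrant-sized crop starting at 25% / 50% of a 1920x1080 video.
const layout = maskLayout(
  { start_x: 0.25, start_y: 0.5, width: 0.5, height: 0.5 },
  1920,
  1080,
);
```

In a Pixi scene, these values would feed the sprite's position and the rectangle drawn into the mask, with both objects added to the same container.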

✅ Advantages

  • Excellent GPU-accelerated performance.

  • Cropping can be handled elegantly using masks.

❌ Limitations

  • Hard to implement blur effects (especially for layered, backdrop-like blurs).

  • Handling text input (line breaks, cursor positioning, resizing) is tricky.

  • Custom solutions exist but are often unstable.

  • Steep learning curve with WebGL; given our deadlines, ramping up quickly wasn’t feasible.

📌 Verdict

While performant, it lacked the stability and flexibility we needed for the broader editor feature set. That said, it left me wanting to properly learn WebGL later!


3️⃣ Using Fabric.js (Final Choice)

fabric.js offers a robust object model on top of the canvas, making it easy to independently manage video, text, images, and more.

Fabric doesn’t ship a ready-made crop feature for video, so we built a custom subclass based on this reference and adapted it for modern Fabric (v6.x). The original was written for Fabric 1.7.1, which is about eight years old.

Example Code

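// Constructor options: Fabric's image props (all optional) plus our crop rectangle.
// cropRect is expressed in source-video pixels, as passed to drawImage().
type FabricImageProps = Partial<fabric.ImageProps> & {
  cropRect?: { x: number; y: number; w: number; h: number };
};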

class CropVideo extends fabric.FabricImage {
  static type = 'crop-video';
  cropRect?: { x: number; y: number; w: number; h: number };

  constructor(
    element: HTMLVideoElement | HTMLImageElement,
    options?: FabricImageProps,
  ) {
    const defaultOpts = { /* default options */ };
    super(element as any, Object.assign({}, defaultOpts, options));
    this.cropRect = options?.cropRect;
  }

  _draw(ctx: CanvasRenderingContext2D) {
    const element = this.getElement() as HTMLVideoElement;
    const c = this.cropRect;
    const d = {
      x: -this.width / 2,
      y: -this.height / 2,
      w: this.width,
      h: this.height,
    };

    if (c) {
      ctx.drawImage(element, c.x, c.y, c.w, c.h, d.x, d.y, d.w, d.h);
    } else {
      ctx.drawImage(element, d.x, d.y, d.w, d.h);
    }
  }

  _render(ctx: CanvasRenderingContext2D) {
    this._draw(ctx);
  }
}

// Register the class so Fabric can resolve the 'crop-video' type when deserializing.
fabric.classRegistry.setClass(CropVideo);

// Usage
const fabricVideo = new CropVideo(videoElement, { /* options */ });
fabricVideo.setCoords();
canvas.add(fabricVideo);
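
Because video frames change without Fabric knowing about it, the canvas has to be redrawn continuously while the video plays. Here is a minimal sketch of such a loop, assuming only that the canvas exposes Fabric's `requestRenderAll()`. The `startRenderLoop` helper and its injectable scheduler parameters are hypothetical; the injection exists mainly so the loop can be exercised outside a browser.

```typescript
// Hypothetical helper, for illustration only.
type Renderable = { requestRenderAll(): void };
type Schedule = (cb: () => void) => number;
type Cancel = (id: number) => void;

// Starts a redraw loop and returns a function that stops it.
function startRenderLoop(canvas: Renderable, schedule: Schedule, cancel: Cancel): () => void {
  let frameId = 0;
  let running = true;

  const tick = () => {
    if (!running) return;
    // Ask the canvas to repaint so the current video frame is drawn.
    canvas.requestRenderAll();
    frameId = schedule(tick);
  };
  frameId = schedule(tick);

  return () => {
    running = false;
    cancel(frameId);
  };
}

// In a browser, the scheduler would be requestAnimationFrame:
//   const stop = startRenderLoop(
//     canvas,
//     (cb) => requestAnimationFrame(cb),
//     (id) => cancelAnimationFrame(id),
//   );
//   videoElement.addEventListener('pause', stop, { once: true });
```

Stopping the loop when the video pauses or the component unmounts avoids repainting a canvas whose content is not changing.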

✅ Advantages

  • Easily manages multiple independent elements.

  • Clear layer ordering and alignment.

  • Highly customizable and maintainable.

❌ Limitations

  • Some learning curve.

  • Certain features (like blur effects) still require creative solutions.

    • For example, we used placeholder boxes during DnD/resizing and applied backdrop-filter only after the interaction finished.

📌 Verdict

The most stable and flexible solution, ultimately meeting all project requirements.


🔚 Final Thoughts

Comparing these three approaches helped us not just understand cropping, but also think deeply about the editor’s overall architecture and future scalability.

Building an editor is more than just creating a nice-looking UI. It requires handling time-based control, multi-source playback, text input, custom styling, and layer management—all demanding lots of experimentation and iteration.

We’re still learning and improving every day.


References

  • https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/drawImage


Written by le jack, Web Front End Engineer