Why We Picked Fabric.js for Our Video Editor’s Crop Feature

✨ Overview
While working on a short-form video editor project at my company, we needed to overlay various elements—video clips, text, images, background music, templates, fonts, blur boxes, and more—on top of the video preview.
To achieve this, we set up a 9:16 aspect ratio container and placed all elements within it. The elements needed to support drag-and-drop (DnD) and resizing by default, and the text had to be editable directly inside the container.
For subtitles, we chose to use the .ass (Advanced SubStation Alpha) format instead of WebVTT. While WebVTT only supports basic text and styling, ASS offers rich features like font, color, position, and animation. We also considered compatibility with ffmpeg on the server side, which made ASS the better fit.
To figure out the right approach, we looked at several references (like CapCut, Canva, and other web-based tools) and noticed they all rely on canvas as the base. The next question was: how should we leverage the canvas?
The first idea that came to mind was using the native Canvas API. I assumed there would be plenty of resources, especially for implementing cropping, and that turned out to be true. I also recalled using fabric.js back in university when building a drawing-based web editor for secondary media production—it’s vector-based, so it allowed smooth drawing. Though that was long ago and I hadn’t implemented cropping before, I knew Fabric could handle DnD and resizing.
Lastly, I discovered pixi.js through an article by the team behind the game video editor Dor. After reviewing the official documentation and various resources, I concluded it was worth testing.
In the end, no solution covered everything perfectly. After experimenting with all three, we ultimately chose fabric.js. The reason wasn’t just cropping; it was that, compared to Fabric, the other options were less suitable for implementing the broader feature set we needed.
In this post, I’ll compare the three approaches (with a focus on cropping) and share how we reached the final decision.
📦 Crop Data Structure
The crop data sent to the server consists of four fields:
start_x: the starting x-coordinate for the crop
start_y: the starting y-coordinate for the crop
width: the width of the cropped area
height: the height of the cropped area
These values are normalized between 0 and 1, relative to the original video resolution—not absolute pixel values.
For example, to select the full area of a 1920×1080 video, you would set start_x: 0, start_y: 0, width: 1, and height: 1.
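To make the normalization concrete, here is a minimal sketch of converting between the normalized crop and pixel coordinates. The field names come from the structure above; the `PixelRect` type, the helper names, and the clamping behavior are my own assumptions for illustration, not what our server or editor necessarily does.

```typescript
// Normalized crop as sent to the server (field names from the article).
type Crop = { start_x: number; start_y: number; width: number; height: number };

// Pixel-space rectangle. This type and both helpers are hypothetical.
type PixelRect = { x: number; y: number; w: number; h: number };

const clamp01 = (v: number): number => Math.min(1, Math.max(0, v));

// Normalized crop -> pixels, relative to the original video resolution.
function toPixelRect(crop: Crop, videoWidth: number, videoHeight: number): PixelRect {
  const x0 = clamp01(crop.start_x);
  const y0 = clamp01(crop.start_y);
  // Keep the rectangle inside the frame even if start + size exceeds 1.
  const w = Math.min(clamp01(crop.width), 1 - x0);
  const h = Math.min(clamp01(crop.height), 1 - y0);
  return { x: x0 * videoWidth, y: y0 * videoHeight, w: w * videoWidth, h: h * videoHeight };
}

// Pixels -> normalized crop, e.g. before sending the values to the server.
function toNormalizedCrop(rect: PixelRect, videoWidth: number, videoHeight: number): Crop {
  return {
    start_x: rect.x / videoWidth,
    start_y: rect.y / videoHeight,
    width: rect.w / videoWidth,
    height: rect.h / videoHeight,
  };
}

const full = toPixelRect({ start_x: 0, start_y: 0, width: 1, height: 1 }, 1920, 1080);
// full is { x: 0, y: 0, w: 1920, h: 1080 }

const centered = toPixelRect(
  { start_x: 0.25, start_y: 0.25, width: 0.5, height: 0.5 },
  1920,
  1080,
);
// centered is { x: 480, y: 270, w: 960, h: 540 }
```

Keeping the values normalized means the same crop applies regardless of which resolution the server ends up encoding.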
🧪 Comparison of Crop Approaches by Technology
We’ll go into the details of each rendering method and background knowledge in a future post. For now, here’s a summary:
| Item | Canvas API | Pixi.js | Fabric.js |
| --- | --- | --- | --- |
| Rendering Type | Canvas 2D | WebGL + Canvas | Canvas 2D |
| Precise Crop Control | Manual coordinate calculations | Using mask objects | Injecting cropRect into objects |
1️⃣ Using the Canvas API
This approach uses an HTML <video> element with the Canvas API and requestAnimationFrame().
Example Code
'use client';

import { forwardRef, useCallback, useEffect, useRef } from 'react';

type Props = {
  crop: { start_x: number; start_y: number; width: number; height: number };
  url: string;
  videoEl: HTMLVideoElement | null;
  metadata: {
    width: number;
    height: number;
  };
};

export default forwardRef<HTMLVideoElement, Props>(
  function VanillaVideoCanvas(props, ref) {
    const { crop, url, videoEl, metadata } = props;
    const w = metadata.width;
    const h = metadata.height;
    const canvasRef = useRef<HTMLCanvasElement>(null);
    const requestRef = useRef<number>(0);

    // Keep the latest crop in a ref so the render loop reads it
    // without having to be recreated on every crop change.
    const cropRef = useRef(crop);
    useEffect(() => {
      cropRef.current = crop;
    }, [crop]);

    const render = useCallback(() => {
      const canvas = canvasRef.current;
      const ctx = canvas?.getContext('2d');
      // Stop the loop while the video is not playing.
      if (!canvas || !ctx || !videoEl || videoEl.paused || videoEl.ended)
        return;
      const crop = cropRef.current;
      // Source rectangle: normalized crop scaled to the video resolution.
      // Destination rectangle: the whole canvas.
      ctx.drawImage(
        videoEl,
        w * crop.start_x,
        h * crop.start_y,
        w * crop.width,
        h * crop.height,
        0,
        0,
        canvas.width,
        canvas.height,
      );
      requestRef.current = requestAnimationFrame(render);
    }, [w, h, videoEl]);

    useEffect(() => {
      if (!videoEl) return;
      // Restart the render loop every time playback starts.
      const onPlay = () => {
        render();
      };
      videoEl.addEventListener('play', onPlay);
      return () => {
        videoEl.removeEventListener('play', onPlay);
        cancelAnimationFrame(requestRef.current);
      };
    }, [render, videoEl]);

    return (
      <>
        <div className="absolute left-0 top-0 z-10 h-full w-full">
          <canvas
            ref={canvasRef}
            width={w}
            height={h}
            style={{ backgroundColor: 'white' }}
          />
        </div>
        {/* The video itself stays hidden; only the canvas is shown. */}
        <video
          ref={ref}
          className="invisible absolute left-0 top-0 h-full"
          width={w}
          height={h}
          src={url}
        />
      </>
    );
  },
);
About drawImage()
The drawImage() method on the HTML5 Canvas 2D context draws images, videos, or other canvases. It has three forms, distinguished by the number of arguments you pass:
drawImage(image, dx, dy)
drawImage(image, dx, dy, dWidth, dHeight)
drawImage(image, sx, sy, sWidth, sHeight, dx, dy, dWidth, dHeight)
| Parameter | Description |
| --- | --- |
| image | The source (HTMLImageElement, HTMLVideoElement, HTMLCanvasElement, ImageBitmap, etc.) |
| sx, sy | Coordinates of the top-left corner of the source rectangle |
| sWidth, sHeight | Width and height of the source rectangle |
| dx, dy | Destination coordinates on the canvas |
| dWidth, dHeight | Width and height to draw (scaling) |
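One thing to watch with the nine-argument form: the example component above draws the crop onto the whole canvas, so the picture stretches whenever the crop's aspect ratio differs from the canvas's. As a rough sketch, a destination rectangle that preserves the aspect ratio could be computed like this. The `Rect` type and the `containFit` helper are hypothetical names of mine, not part of the Canvas API.

```typescript
// Rectangle in pixels. Hypothetical helper type for this sketch.
type Rect = { x: number; y: number; w: number; h: number };

// Returns the destination rectangle (dx, dy, dWidth, dHeight) that fits the
// source rectangle inside the canvas without distortion, centered.
function containFit(src: Rect, canvasWidth: number, canvasHeight: number): Rect {
  // Use the smaller of the two scale factors so nothing overflows.
  const scale = Math.min(canvasWidth / src.w, canvasHeight / src.h);
  const w = src.w * scale;
  const h = src.h * scale;
  return { x: (canvasWidth - w) / 2, y: (canvasHeight - h) / 2, w, h };
}

// A 960x540 crop shown inside a 1080x1920 (9:16) canvas.
const src: Rect = { x: 480, y: 270, w: 960, h: 540 };
const dst = containFit(src, 1080, 1920);
// dst is { x: 0, y: 656.25, w: 1080, h: 607.5 }

// In the browser, the two rectangles map onto drawImage like this:
// ctx.drawImage(video, src.x, src.y, src.w, src.h, dst.x, dst.y, dst.w, dst.h);
```

Whether to letterbox like this or fill the canvas and cut off the overflow is a product decision; the arithmetic only differs in using Math.max instead of Math.min for the scale.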
✅ Advantages
Cropping is straightforward.
Good browser compatibility and performance.
❌ Limitations
Rendering styled text (like outlines, shadows, etc.) using the browser’s native text is limited.
Managing visibility and placement of multiple timeline elements adds complexity.
Frequent redraws triggered by transforms and size adjustments can affect performance.
📌 Verdict
While great for simple cropping, it didn’t meet the full set of editor requirements.
2️⃣ Using Pixi.js
Pixi.js is a WebGL-based 2D rendering engine. It can outperform the Canvas 2D API, especially in scenes with many objects, because it renders through WebGL on the GPU.
Example Code
import * as PIXI from 'pixi.js';

// `videoUrl` and `crop` are assumed to be defined by the surrounding code.
const app = new PIXI.Application();
await app.init({ /* options */ });

const video = document.createElement('video');
video.src = videoUrl;

const container = new PIXI.Container({ label: 'video-container' });
app.stage.addChild(container);

// Wait until the browser has buffered enough to play through.
const loadedVideo = await new Promise<HTMLVideoElement>((resolve) => {
  video.addEventListener('canplaythrough', () => resolve(video), { once: true });
});

const videoTexture = PIXI.Texture.from(loadedVideo);
const videoSprite = new PIXI.Sprite(videoTexture);

// Convert the normalized crop into source pixels.
const { videoWidth, videoHeight } = loadedVideo;
const cropX = crop.start_x * videoWidth;
const cropY = crop.start_y * videoHeight;
const cropWidth = crop.width * videoWidth;
const cropHeight = crop.height * videoHeight;

// The path needs a fill to produce geometry the mask can use.
const mask = new PIXI.Graphics()
  .rect(cropX, cropY, cropWidth, cropHeight)
  .fill(0xffffff);
videoSprite.setMask({ mask, inverse: false });
container.addChild(videoSprite, mask);
⚙️ How setMask Works
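Defines a mask object (like a Graphics shape or Sprite) that controls the visible area.
For Graphics masks, the GPU's stencil buffer applies the mask at render time; Sprite masks are applied through a filter instead.
With inverse: true, the mask works in reverse: the masked area is hidden and everything outside it stays visible.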
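A mask only hides what lies outside the crop; the visible region still sits at its original offset inside the sprite. As a minimal sketch, assuming the sprite is unscaled at (0, 0) inside the container and the crop should fill a view of a given width, the container transform could be derived like this. The type and function names are hypothetical, and this is not necessarily how we positioned things in our prototype.

```typescript
// Crop rectangle in source pixels. Hypothetical helper types for this sketch.
type PixelRect = { x: number; y: number; w: number; h: number };
type ContainerTransform = { scale: number; x: number; y: number };

// Computes a uniform scale and offset for the container so that the masked
// region starts at the view's top-left corner and spans `viewWidth`.
function cropToContainerTransform(cropPx: PixelRect, viewWidth: number): ContainerTransform {
  const scale = viewWidth / cropPx.w;
  return {
    scale,
    // Shift left/up by the scaled crop origin so the crop lands at (0, 0).
    x: -cropPx.x * scale,
    y: -cropPx.y * scale,
  };
}

const t = cropToContainerTransform({ x: 480, y: 270, w: 960, h: 540 }, 480);
// t is { scale: 0.5, x: -240, y: -135 }

// With Pixi, the result would be applied to the container holding both the
// sprite and the mask:
// container.scale.set(t.scale);
// container.position.set(t.x, t.y);
```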
✅ Advantages
Excellent GPU-accelerated performance.
Cropping can be handled elegantly using masks.
❌ Limitations
Hard to implement blur effects (especially for layered, backdrop-like blurs).
Handling text input (line breaks, cursor positioning, resizing) is tricky.
Custom solutions exist but are often unstable.
High learning curve with WebGL, and given deadlines, it wasn’t feasible to ramp up quickly.
📌 Verdict
While performant, it lacked the stability and flexibility we needed for the broader editor feature set. That said, it left me wanting to properly learn WebGL later!
3️⃣ Using Fabric.js (Final Choice)
fabric.js offers a robust object model on top of the canvas, making it easy to independently manage video, text, images, and more.
Fabric didn't offer the crop-rectangle behavior we needed out of the box, so we customized it based on this reference. We had to adapt it for modern Fabric (v6.x), because the original was written for Fabric 1.7.1, which is eight years old!
Example Code
import * as fabric from 'fabric';

type CropRect = { x: number; y: number; w: number; h: number };

// Options are partial so callers only pass what they need.
type FabricImageProps = Partial<fabric.ImageProps> & {
  cropRect?: CropRect;
};

class CropVideo extends fabric.FabricImage {
  static type = 'crop-video';

  // Crop rectangle in source (video) pixels.
  cropRect?: CropRect;

  constructor(
    element: HTMLVideoElement | HTMLImageElement,
    options?: FabricImageProps,
  ) {
    const defaultOpts = { /* default options */ };
    super(element as any, Object.assign({}, defaultOpts, options));
    this.cropRect = options?.cropRect;
  }

  _draw(ctx: CanvasRenderingContext2D) {
    const element = this.getElement() as HTMLVideoElement;
    const c = this.cropRect;
    // Fabric renders objects around their center, hence the negative offsets.
    const d = {
      x: -this.width / 2,
      y: -this.height / 2,
      w: this.width,
      h: this.height,
    };
    if (c) {
      ctx.drawImage(element, c.x, c.y, c.w, c.h, d.x, d.y, d.w, d.h);
    } else {
      ctx.drawImage(element, d.x, d.y, d.w, d.h);
    }
  }

  _render(ctx: CanvasRenderingContext2D) {
    this._draw(ctx);
  }
}

// Register the class so Fabric can resolve it by its `type`.
fabric.classRegistry.setClass(CropVideo);

// Usage
const fabricVideo = new CropVideo(videoElement, { /* options */ });
fabricVideo.setCoords();
canvas.add(fabricVideo);
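Fabric only repaints when the canvas is asked to render, so a video object also needs a loop that requests a render on each frame while the video plays. Below is a rough sketch of such a loop. To keep it runnable outside the browser, the canvas, video, and frame scheduler are passed in as small interfaces; those interface names and `startVideoRenderLoop` are hypothetical, and only `requestRenderAll()` is an actual Fabric canvas method.

```typescript
// Minimal shapes this sketch depends on (hypothetical names).
type RenderTarget = { requestRenderAll(): void }; // e.g. a fabric.Canvas
type PlaybackState = { paused: boolean; ended: boolean }; // e.g. an HTMLVideoElement
type FrameScheduler = {
  request(cb: () => void): number;
  cancel(id: number): void;
};

// Requests a canvas render on every frame while the video is playing.
// Returns a function that stops the loop.
function startVideoRenderLoop(
  canvas: RenderTarget,
  video: PlaybackState,
  scheduler: FrameScheduler,
): () => void {
  let handle = 0;
  let stopped = false;

  const tick = (): void => {
    if (stopped) return;
    // Skip the repaint while nothing changes on screen.
    if (!video.paused && !video.ended) {
      canvas.requestRenderAll();
    }
    handle = scheduler.request(tick);
  };

  handle = scheduler.request(tick);

  return () => {
    stopped = true;
    scheduler.cancel(handle);
  };
}

// In the browser, the scheduler would wrap requestAnimationFrame:
// const stop = startVideoRenderLoop(canvas, videoElement, {
//   request: (cb) => requestAnimationFrame(cb),
//   cancel: (id) => cancelAnimationFrame(id),
// });
```

Calling the returned function when the component unmounts keeps the loop from outliving the canvas.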
✅ Advantages
Easily manages multiple independent elements.
Clear layer ordering and alignment.
Highly customizable and maintainable.
❌ Limitations
Some learning curve.
Certain features (like blur effects) still require creative solutions.
- For example, we used placeholder boxes during DnD/resizing and applied backdrop-filter only after the interaction finished.
📌 Verdict
The most stable and flexible solution, ultimately meeting all project requirements.
🔚 Final Thoughts
Comparing these three approaches helped us not just understand cropping, but also think deeply about the editor’s overall architecture and future scalability.
Building an editor is more than just creating a nice-looking UI. It requires handling time-based control, multi-source playback, text input, custom styling, and layer management—all demanding lots of experimentation and iteration.
We’re still learning and improving every day.
References
- https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/drawImage
Written by
le jack
Web Front end Engineer