Study: Video Datasets


Checking out some video datasets and evaluating them.
Disclaimer: This is not a comprehensive evaluation, just me manually browsing Hugging Face.
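Most of the Link fields below are Hugging Face repo IDs in owner/name form, while a couple are full URLs. Here is a minimal sketch for turning either form into a browsable page URL. It assumes every entry lives under the https://huggingface.co/datasets/ path, which matches the two full URLs in this list; the helper name dataset_url is made up for illustration.

```python
# Assumed prefix: taken from the two full URLs that appear in this list.
HF_DATASETS_PREFIX = "https://huggingface.co/datasets/"


def dataset_url(link: str) -> str:
    """Return a dataset page URL for a repo ID or an already-full URL."""
    link = link.strip()
    # Full URLs are passed through untouched.
    if link.startswith(("http://", "https://")):
        return link
    # Otherwise treat the value as an owner/name repo ID.
    return HF_DATASETS_PREFIX + link.strip("/")


if __name__ == "__main__":
    print(dataset_url("facebook/PE-Video"))
    print(dataset_url("https://huggingface.co/datasets/minh132/pexels-videos"))
```

Entries whose Link is incomplete (an owner without a dataset name, or a name without an owner) will not resolve this way and need to be looked up by hand.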
PE Video
Link: facebook/PE-Video
Tier: SSS
Description: Facebook's PE Video contains high-quality, high-resolution videos with proper keywords and captions. The videos feel staged, though, and contain a diverse set of objects.
LLaVA-Video-178K
Link: lmms-lab/LLaVA-Video-178K
Tier: SS
Description: Contains a wide variety of videos, nicely distributed across video types.
ByteDance Synthetic Videos
Link: kevinzzz8866/ByteDance_Synthetic_Videos
Tier: A
Description: Turntable-style, highly detailed 3D renderings of objects. It almost feels like it was designed for creating 3D renderings from which 3D models could later be generated.
Anime Landscape Videos Splited
Link: svjack/Anime_Landscape_Videos_Splited
Tier: A
Description: Contains excerpts from anime videos, mostly high-quality background scenes with delicate and visually pleasing movements.
anime-giph
Link: nev/anime-giph
Tier: A
Description: Short excerpts from popular animated films. Contains interesting motions and special effects. High quality.
Anime Segmentation
Link: skytnt/anime-segmentation
Tier: A
Description: Has anime characters segmented out with a proper alpha channel. Seems like it will be useful for background removal.
VideoMMMU
Link: lmms-lab/VideoMMMU
Tier: A
Description: A peculiar set of videos from distinct domains such as art, humanities, science, and engineering. A decent number of the videos contain PowerPoint-like transitions.
Anime Video and Anime Screenshots
Link: Fitsd/anime_video_and_anime_screenshots
Tier: B
Description: Video recordings of animated films.
Anime Person Detection (Image Data)
Link: deepghs/anime_person_detection
Tier: B
Description: Contains various anime screenshots with labeled bounding boxes for humanoids.
VideoFeedback-videos-mp4
Link: hexuan21/VideoFeedback-videos-mp4
Tier: A
Description: Short videos. Relatively well balanced. 37.7k.
WenhaoWang/VidProM
Link: WenhaoWang/VidProM
Tier: B
Description: Contains lots of generated videos from different models.
zhiqiulin/video_captioning
Link: zhiqiulin/video_captioning
Tier: S
Description: Very good captioned video dataset. Good data balance.
VatsaDev/video-danbooru-final
Link: VatsaDev/video-danbooru-final
Tier: B
Description: Anime videos (100 hours, it says).
Nahrawy/VIDIT-Depth-ControlNet-E
Link: Nahrawy/VIDIT-Depth-ControlNet-E
Tier: A
Description: Realistically rendered 3D scenes, each paired with a depth map.
daiua/video
Link: daiua/video
Tier: B
Description: Short-form videos (mostly Chinese). Contains lots of live-action and emotional scenes.
VideoHallu
Link: IntelligenceLab/VideoHallu
Tier: B
Description: Generated videos from various tools. Good for comparing different video generation services.
WenhaoWang/VideoUFO
Link: WenhaoWang/VideoUFO
Tier: A
Description: Video dataset with detailed captions per keyframe. The author seems to have a bunch of papers; it would be good to look at his other datasets.
youtube_videos_11
Link: 2e8konjak/youtube_videos_11
Tier: B
Description: Self-explanatory. There are also other compilations, like 2e8konjak/youtube_videos_12 and 2e8konjak/youtube_videos_2.
video-reasoning/morse-500-view
Link: video-reasoning/morse-500-view
Tier: A
Description: Not directly related to the spriteDX project, but it can be used to learn visual semantics.
Nagi_no_Asukara_Videos_Captioned
Link: svjack/Nagi_no_Asukara_Videos_Captioned
Tier: B
Description: Focuses on a single anime, but every shot is paired with a very detailed caption and scene description.
Sprite Animation
Link: Loacky/sprite-animation
Tier: B
Description: Retro-style sprite animations collected frame by frame. The dataset is not very big.
Synth-Vid-Detect
Link: https://huggingface.co/datasets/ductai199x/synth-vid-detect
Tier: A
Description: Has both synthetic and real videos. It can be used to train a model that distinguishes fake videos from real ones.
video-dataset-disney-organized
Link: sayakpaul/video-dataset-disney-organized
Tier: A
Description: Well-detailed captions for the different shots of each video.
8k-video-game-dataset
Link: Kim2091/8k-video-game-dataset
Tier: B
Description: Contains video game footage captured at 1-5 second intervals, from various games.
2D_Video_Game_Cartoon_Character_Sprite-Sheets
Link: mgane/2D_Video_Game_Cartoon_Character_Sprite-Sheets
Tier: C
Description: Contains sprite sheets, mostly Flash-style animations for side-scroller games. The dataset is small.
Pexels Videos
Link: https://huggingface.co/datasets/minh132/pexels-videos
Tier: A
Description: High-quality short videos with natural camera movements.
Tiger Lab VideoFeedback
Link: TIGER-Lab/VideoFeedback
Tier: A
Description: A diverse set of videos.
Changli/Ytb_Video
Link: Changli/Ytb_Video
Tier: B
Description: YouTube videos. Not sure about data balance.
svjack/Star_Rail_Tribbie_MMD_Videos_Splited_Captioned_512x384x1_conn
Link: svjack
Tier: C
Description: MMD video dataset that seems relatively safe. It also contains alpha-mask versions of the frames, which would be useful for training background removal models.
QimingLi/videos
Link: QimingLi/videos
Tier: C
Description: Small video dataset.
laion2B-en-aesthetic (Image Data)
Link: laion/laion2B-en-aesthetic
Tier: A
Description: A subset of the LAION dataset made up of aesthetically pleasing images. The caption quality is much better than what we have seen elsewhere.
novelai-anime-v3-artist-comparison (Image Data)
Link: novelai-anime-v3-artist-comparison
Tier: C
Description: Mostly generated data that uses artist names as trigger words.
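The entries above are listed in the order I looked at them, not by tier. Here is a rough sketch for regrouping them, assuming the name / Link / Tier / Description layout used in this post and assuming the tiers rank SSS > SS > S > A > B > C. The function names and the SAMPLE text are just for illustration.

```python
# Assumed ranking, best first.
TIER_ORDER = ["SSS", "SS", "S", "A", "B", "C"]

# Three entries copied from this post, descriptions omitted for brevity.
SAMPLE = """\
PE Video
Link: facebook/PE-Video
Tier: SSS
Anime Video and Anime Screenshots
Link: Fitsd/anime_video_and_anime_screenshots
Tier: B
zhiqiulin/video_captioning
Link: zhiqiulin/video_captioning
Tier: S
"""


def parse_entries(text: str) -> list[dict]:
    """Collect name, link and tier for each entry in the post's layout."""
    entries = []
    previous = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Link:"):
            # The line right before "Link:" is the entry's name.
            entries.append({"name": previous, "link": line[len("Link:"):].strip()})
        elif line.startswith("Tier:") and entries:
            entries[-1]["tier"] = line[len("Tier:"):].strip()
        previous = line
    return entries


def sort_by_tier(entries: list[dict]) -> list[dict]:
    """Sort best tier first; unknown or missing tiers go last."""
    def rank(entry: dict) -> int:
        tier = entry.get("tier")
        return TIER_ORDER.index(tier) if tier in TIER_ORDER else len(TIER_ORDER)
    return sorted(entries, key=rank)


if __name__ == "__main__":
    for entry in sort_by_tier(parse_entries(SAMPLE)):
        print(entry.get("tier", "?"), entry["name"], entry["link"])
```

Python's sort is stable, so entries within the same tier keep the order they have in the post.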
Steer Clear List
One other thought: Seedance 1 Pro generates high-quality 5s videos relatively quickly and cheaply. It may be cheaper to generate a synthetic dataset instead and use that for training.
Sprited Dev: Stay safe, stay cool.
