Software/Datasets

Software

vidchapters

VidChapters-7M: Video Chapters at Scale

NeurIPS 2023 D&B

vid2seq

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

CVPR 2023

frozenbilm

Zero-Shot Video Question Answering via Frozen Bidirectional Language Models

NeurIPS 2022

collision

Collision Detection Accelerated: An Optimization Perspective

RSS 2022

tubedetr

TubeDETR: Spatio-Temporal Video Grounding with Transformers

CVPR 2022

vln_duet

Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation

CVPR 2022

lookforthechange

Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos

CVPR 2022

lod

Large-Scale Unsupervised Object Discovery

NeurIPS 2021

vln_hamt

History Aware Multimodal Transformer for Vision-and-Language Navigation

NeurIPS 2021

ccvs

CCVS: Context-aware Controllable Video Synthesis

NeurIPS 2021

justask

Just Ask: Learning to Answer Questions from Millions of Narrated Videos

ICCV 2021

ris

Goal-Conditioned Reinforcement Learning with Imagined Subgoals

ICML 2021

obman

Learning joint reconstruction of hands and manipulated objects

CVPR 2019

crosstask

Cross-task weakly supervised learning from instructional videos

CVPR 2019

d2net

D2-Net: A Trainable CNN for Joint Detection and Description of Local Features

CVPR 2019

sfnet

SFNet: Learning Object-aware Semantic Flow

CVPR 2019

weakactionloc

A flexible model for training action localization with varying levels of supervision

NeurIPS 2018

ncnet

Neighbourhood Consensus Networks

NeurIPS 2018

bodynet

BodyNet: Volumetric Inference of 3D Human Body Shapes

ECCV 2018

weakalign

End-to-end weakly-supervised semantic alignment

CVPR 2018

LTC

Long-term Temporal Convolutions for Action Recognition

PAMI 2018

learningvideotext

Learning from Video and Text via Large-Scale Discriminative Clustering

ICCV 2017

objectstates

Joint Discovery of Object States and Manipulation Actions

ICCV 2017

UNREL

Weakly-supervised learning of visual relations

ICCV 2017

biogans

GANs for Biological Image Synthesis

ICCV 2017

SCNet

SCNet: Learning Semantic Correspondence

ICCV 2017

CNNgeometric

Convolutional neural network architecture for geometric matching

CVPR 2017

SURREAL

Learning from Synthetic Humans

CVPR 2017

ContextLocNet

ContextLocNet: Context-aware Deep Network Models for Weakly Supervised Localization

ECCV 2016

Thin-slice

Thin-Slicing for Pose: Learning to Understand Pose without Explicit Pose Estimation

CVPR 2016

NetVLAD

NetVLAD: CNN Architecture for Weakly Supervised Place Recognition

CVPR 2016

Instruction Videos

Unsupervised Learning from Narrated Instruction Videos

CVPR 2016

ProposalFlow

Proposal Flow

CVPR 2016

Discriminative Part Detector

Learning Dictionary of Discriminative Part Detectors for Image Categorization and Cosegmentation

IJCV 2016

Head Detection

Context-aware CNNs for person head detection

ICCV 2015

P-CNN

P-CNN: Pose-based CNN Features for Action Recognition

ICCV 2015

Is object localization for free?

Is object localization for free? Weakly Supervised Object Recognition with Convolutional Neural Networks

CVPR 2015

Unsupervised Object Discovery

Unsupervised Object Discovery and Localization in the Wild

CVPR 2015

24/7 place recognition

24/7 Place Recognition by View Synthesis

CVPR 2015

SD Filter

The SD Filter: Robust Image Filtering Using Joint Static and Dynamic Guidance

PAMI 2017, CVPR 2015

Painting alignment

Automatic Alignment of Paintings to a 3D Model

3DRRW-ICCV 2011

Deblurring

Non-uniform deblurring for shaken images

IJCV 2012, CVPR 2010

PMVS

Patch-based Multi-view Stereo Software (PMVS)

PAMI 2010, CVPR 2007

Resampling Penalization for histogram selection in regression software

Resampling Penalization for histogram selection in regression software

EJS 2009

SPAMS

Sparse Modeling Software (SPAMS)

STIP

Space-Time Interest Points (STIP)

IJCV 2005



Datasets

justask

HowToVQA69M dataset

large-scale video question answering training dataset

ICCV 2021

justask

iVQA dataset

instructional video question answering benchmark

ICCV 2021

ObMan

ObMan Dataset

synthetically rendered hand-object images

CVPR 2019

CrossTask

CrossTask Dataset

CVPR 2019

UNREL

UNREL Dataset

unusual relation dataset

ICCV 2017

SURREAL

SURREAL Dataset

synthetically rendered person videos

CVPR 2017

THUMOS

THUMOS Dataset

challenges for large-scale action recognition

CVIU 2017

Charades

Charades Dataset

videos of human activities in home environments

ECCV 2016

Instance-level video segmentation

Inria 3DMovie Dataset v2

CVPR 2016

Proposal Flow

PF-WILLOW & PF-PASCAL Datasets

image annotations for semantic correspondence

CVPR 2016

Thin-slice

Thin-slicing for Pose

CVPR 2016

NetVLAD

Tokyo Time Machine Dataset

CVPR 2016

Instruction Videos

Instruction Videos Dataset

CVPR 2016

Head Detection

HollywoodHeads Dataset

ICCV 2015

24/7 place recognition

24/7 Place Recognition

CVPR 2015

Time-lapse videos

Time-lapse videos for long-term observation of people

common human-object interactions

ECCV 2012

Time-lapse sequences

Time-lapse sequences of indoor scenes

single-view 3D scene understanding

ECCV 2012

Video segmentation

Annotated video clips for spatio-temporal video segmentation

CVPR 2011

Willow Actions

Willow Actions

human action classification in still images

BMVC 2010

Video segmentation

Annotated movie data of face tracks

face tracks from six movies

ECCV 2010

Other datasets for Computer Vision

Includes 15 scene categories, a 3D object recognition stereo dataset, a 3D photography dataset, visual hull datasets, birds, butterflies, an object recognition dataset, a texture dataset, and video sequences.