WILLOW Software

Software

	VidChapters-7M: Video Chapters at Scale NeurIPS 2023 D&B		Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning CVPR 2023
	Zero-Shot Video Question Answering via Frozen Bidirectional Language Models NeurIPS 2022		Collision Detection Accelerated: An Optimization Perspective RSS 2022
	TubeDETR: Spatio-Temporal Video Grounding with Transformers CVPR 2022		Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation CVPR 2022
	Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos CVPR 2022		Large-Scale Unsupervised Object Discovery NeurIPS 2021
	History Aware Multimodal Transformer for Vision-and-Language Navigation NeurIPS 2021		CCVS: Context-aware Controllable Video Synthesis NeurIPS 2021
	Just Ask: Learning to Answer Questions from Millions of Narrated Videos ICCV 2021		Goal-Conditioned Reinforcement Learning with Imagined Subgoals ICML 2021
	Learning joint reconstruction of hands and manipulated objects CVPR 2019		Cross-task weakly supervised learning from instructional videos CVPR 2019
	D2-Net: A Trainable CNN for Joint Detection and Description of Local Features CVPR 2019		SFNet: Learning Object-aware Semantic Flow CVPR 2019
	A flexible model for training action localization with varying levels of supervision NeurIPS 2018		Neighbourhood Consensus Networks NeurIPS 2018
	BodyNet: Volumetric Inference of 3D Human Body Shapes ECCV 2018		End-to-end weakly-supervised semantic alignment CVPR 2018
	Long-term Temporal Convolutions for Action Recognition PAMI 2018		Learning from Video and Text via Large-Scale Discriminative Clustering ICCV 2017
	Joint Discovery of Object States and Manipulation Actions ICCV 2017		Weakly-supervised learning of visual relations ICCV 2017
	GANs for Biological Image Synthesis ICCV 2017		SCNet: Learning Semantic Correspondence ICCV 2017
	Convolutional neural network architecture for geometric matching CVPR 2017		Learning from Synthetic Humans CVPR 2017
	ContextLocNet: Context-aware Deep Network Models for Weakly Supervised Localization ECCV 2016		Thin-Slicing for Pose: Learning to Understand Pose without Explicit Pose Estimation CVPR 2016
	NetVLAD: CNN Architecture for Weakly Supervised Place Recognition CVPR 2016		Unsupervised Learning from Narrated Instruction Videos CVPR 2016
	Proposal Flow CVPR 2016		Learning Dictionary of Discriminative Part Detectors for Image Categorization and Cosegmentation IJCV 2016
	Context-aware CNNs for person head detection ICCV 2015		P-CNN: Pose-based CNN Features for Action Recognition ICCV 2015
	Is object localization for free? Weakly Supervised Object Recognition with Convolutional Neural Networks CVPR 2015		Unsupervised Object Discovery and Localization in the Wild CVPR 2015
	24/7 Place Recognition by View Synthesis CVPR 2015		The SD Filter: Robust Image Filtering Using Joint Static and Dynamic Guidance PAMI 2017, CVPR 2015
	Automatic Alignment of Paintings to a 3D Model 3DRRW-ICCV 2011		Non-uniform deblurring for shaken images IJCV 2012, CVPR 2010
	Patch-based Multi-view Stereo Software (PMVS) PAMI 2010, CVPR 2007		Resampling Penalization for histogram selection in regression software EJS 2009
	Sparse Modeling Software (SPAMS)		Space-Time Interest Points (STIP) IJCV 2005

Datasets

	HowToVQA69M dataset: large-scale video question answering training dataset ICCV 2021		iVQA dataset: instructional video question answering benchmark ICCV 2021
	ObMan Dataset: synthetically rendered hand-object images CVPR 2019		CrossTask Dataset CVPR 2019
	UNREL Dataset: unusual relation dataset ICCV 2017		SURREAL Dataset: synthetically rendered person videos CVPR 2017
	THUMOS Dataset: challenges for large-scale action recognition CVIU 2017		Charades Dataset: videos of human activities at home environment ECCV 2016
	Inria 3DMovie Dataset v2 CVPR 2016		PF-WILLOW & PF-PASCAL Datasets: image annotations for semantic correspondence CVPR 2016
	Thin-slicing for Pose CVPR 2016		Tokyo Time Machine Dataset CVPR 2016
	Instruction Videos Dataset CVPR 2016		HollywoodHeads Dataset ICCV 2015
	24/7 Place Recognition CVPR 2015		Time-lapse videos for long-term observation of people: common human-object interactions ECCV 2012
	Time-lapse sequences of indoor scenes: single-view 3D scene understanding ECCV 2012		Annotated video clips for spatio-temporal video segmentation CVPR 2011
	Willow Actions: human action classification in still images BMVC 2010		Annotated movie data of face tracks: face tracks from six movies ECCV 2010

Other datasets for Computer Vision

These include 15 scene categories, a 3D object recognition stereo dataset, a 3D photography dataset, visual hull datasets, birds, butterflies, an object recognition dataset, a texture dataset, and video sequences.