Facebook has been in the spotlight this year for its major rebrand, renaming itself Meta. As one of the world's leading innovators, Meta came out with really interesting algorithms and models in areas spanning computer vision, robotics, 3D simulation, NLP, and more.
Let us look at a few of them.
Habitat 2.0
Meta came out with Habitat 2.0 (H2.0), a simulation platform for training virtual robots in interactive 3D environments and complex physics-enabled scenarios. The work spans all levels of the stack, including data, simulation, and benchmark tasks. One such contribution is ReplicaCAD, an artist-authored, annotated, reconfigurable 3D dataset of apartments with articulated objects, said Meta. H2.0 also comes with a physics-enabled 3D simulator with speeds exceeding 25,000 simulation steps per second (850× real-time) on an 8-GPU node, as well as the Home Assistant Benchmark (HAB), a suite of common tasks for assistive robots that test mobile manipulation capabilities.
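To give a feel for how the platform is driven, here is a minimal sketch of stepping a physics-enabled scene with habitat-sim's Python API. It is illustrative only: the scene path is a placeholder, and the class names follow recent habitat-sim releases rather than Meta's Habitat 2.0 documentation.

```python
# Minimal illustrative sketch (not from Meta's docs): load a scene, attach an
# RGB camera, and step the physics-enabled simulation with habitat-sim.
import habitat_sim

# Backend configuration: which scene to load and whether physics is enabled.
backend_cfg = habitat_sim.SimulatorConfiguration()
backend_cfg.scene_id = "data/replica_cad/stages/frl_apartment_stage.glb"  # placeholder path
backend_cfg.enable_physics = True

# Give the agent a single RGB camera sensor.
rgb_spec = habitat_sim.CameraSensorSpec()
rgb_spec.uuid = "rgb"
rgb_spec.sensor_type = habitat_sim.SensorType.COLOR
rgb_spec.resolution = [480, 640]

agent_cfg = habitat_sim.agent.AgentConfiguration()
agent_cfg.sensor_specifications = [rgb_spec]

# Create the simulator and step it with the default "move_forward" action.
sim = habitat_sim.Simulator(habitat_sim.Configuration(backend_cfg, [agent_cfg]))
for _ in range(10):
    observations = sim.step("move_forward")
    frame = observations["rgb"]  # RGBA image as a numpy array

sim.close()
```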
For more details, click here.
Animating children’s hand-drawn figures of humanlike characters
Just days back, Meta came out with something really unique, calling it "a first-of-its-kind method for automatically animating children's hand-drawn figures of people and humanlike characters" in minutes using AI. The prototype system it has built lets users upload a child's drawing and then download an animated version of it. Meta said that it wanted to build an AI system that can identify and automatically animate the humanlike figures in children's drawings with a high success rate and without any human guidance.
For more details, click here.
Ego4D
Meta introduced Ego4D, a massive-scale egocentric video dataset and benchmark suite. It said that Ego4D comes with 3,025 hours of daily life activity video spread over hundreds of scenarios, such as household, outdoor, workplace, and leisure settings. It added that portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and synchronized videos from multiple egocentric cameras at the same event.
For more details, click here.
Few-Shot Learner to take on harmful content
Meta came out with new AI technology called Few-Shot Learner (FSL) that can adapt to take action on new or evolving types of harmful content within weeks instead of months. FSL works in more than 100 languages and can learn from different kinds of data, both image and text. In addition, it can work alongside AI models that are already being used to detect harmful content.
“Few-shot learning” starts with a large, general understanding of many different topics, then uses far fewer, and in some cases zero, labelled examples to learn new tasks, said Meta.
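To make the idea concrete, the sketch below shows generic zero-shot text classification with an off-the-shelf NLI model from Hugging Face Transformers. This is not Meta's FSL system; it only illustrates how a model with broad general knowledge can assign labels it has never seen labelled examples for.

```python
# Illustration only, not Meta's Few-Shot Learner: a generic zero-shot
# classifier scores a piece of text against labels it was never explicitly
# trained on, which is the core idea behind few/zero-shot learning.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

post = "Claim your free prize now, just send us your bank details!"
labels = ["scam or harmful content", "benign content"]

result = classifier(post, candidate_labels=labels)
print(result["labels"][0], round(result["scores"][0], 3))  # top label and its score
```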
For more details, click here.
XLS-R: Self-supervised speech processing for 128 languages
XLS-R is a self-supervised model for speech tasks. It improves upon previous multilingual models by training on nearly ten times more public data in more than twice as many languages. Meta said that it fine-tuned XLS-R to perform speech recognition, speech translation, and language identification, setting a new state of the art on a diverse set of benchmarks: BABEL, CommonVoice, and VoxPopuli for speech recognition, CoVoST-2 for foreign-to-English translation, and VoxLingua107 for language identification.
Based on wav2vec 2.0, the model is trained on more than 436,000 hours of publicly available speech recordings. In addition, Meta has expanded it to 128 languages, nearly two and a half times as many as its predecessor covered.
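As a quick illustration, the sketch below loads the published facebook/wav2vec2-xls-r-300m checkpoint with Hugging Face Transformers and extracts speech representations from a dummy waveform; task-specific heads for recognition, translation, or language identification would be fine-tuned on top of such features. The dummy audio and usage pattern are assumptions for illustration, not Meta's training recipe.

```python
# Minimal sketch: extract representations from a pretrained XLS-R checkpoint
# via Hugging Face Transformers. The zero-filled waveform stands in for a real
# 16 kHz recording; downstream task heads would be fine-tuned on top.
import numpy as np
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2Model

checkpoint = "facebook/wav2vec2-xls-r-300m"
feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
model = Wav2Vec2Model.from_pretrained(checkpoint)

# One second of silence at 16 kHz as placeholder audio.
waveform = np.zeros(16000, dtype=np.float32)
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (batch, frames, hidden_size)
print(hidden_states.shape)
```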
For more details, click here.