Modern Australian

Facebook wants AI to find your keys and understand your conversations

  • Written by Jumana Abu-Khalaf, Research Fellow in Computing and Security, Edith Cowan University

Facebook has announced a research project that aims to push the “frontier of first-person perception”, and in the process help you remember where you left your keys.

The Ego4D project provides a huge collection of first-person video and related data, plus a set of challenges for researchers to teach computers to understand the data and gather useful information from it.

In September, the social media giant launched a line of “smart glasses” called Ray-Ban Stories, which carry a digital camera and other features. Much like the Google Glass project, which met with mixed reviews in 2013, this one has prompted complaints of privacy invasion.

The Ego4D project aims to develop software that will make smart glasses far more useful, but may in the process enable far greater breaches of privacy.

Read more: Ray-Ban Stories let you wear Facebook on your face. But why would you want to?

What is Ego4D?

Facebook describes the heart of the project as

a massive-scale, egocentric dataset and benchmark suite collected across 74 worldwide locations and nine countries, with over 3,025 hours of daily-life activity video.

Ego4D: Teaching AI to perceive the world through your eyes.

The “Ego” in Ego4D means egocentric (or “first-person” video), while “4D” stands for the three dimensions of space plus one more: time. In essence, Ego4D seeks to combine photos, video, geographical information and other data to build a model of the user’s world.

There are two components: a large dataset of first-person photos and videos, and a “benchmark suite” consisting of five challenging tasks that can be used to compare different AI models or algorithms with each other. These benchmarks involve analysing first-person video to remember past events, create diary entries, understand interactions with objects and people, and forecast future events.

The dataset includes more than 3,000 hours of first-person video from 855 participants going about everyday tasks, captured with a variety of devices including GoPro cameras and augmented reality (AR) glasses. The videos cover activities at home, in the workplace, and hundreds of social settings.

What is in the dataset?

Although this is not the first such video dataset to be introduced to the research community, it is 20 times larger than publicly available datasets. It includes video, audio, 3D mesh scans of the environment, eye gaze, stereo, and synchronized multi-camera views of the same event.

Most of the recorded footage is unscripted or “in the wild”. The data is also quite diverse as it was collected from 74 locations across nine countries, and those capturing the data have various backgrounds, ages and genders.

What can we do with it?

Commonly, computer vision models are trained and tested on annotated images and videos for a specific task. Facebook argues that current AI datasets and models represent a third-person or a “spectator” view, resulting in limited visual perception. Understanding first-person video will help design robots that better engage with their surroundings.

Future robotic agents will benefit from a better understanding of their environment. (Wikimedia)

Furthermore, Facebook argues egocentric vision can potentially transform how we use virtual and augmented reality devices such as glasses and headsets. If we can develop AI models that understand the world from a first-person viewpoint, just like humans do, VR and AR devices may become as valuable as our smartphones.

Can AI make our lives better?

Facebook has also developed five benchmark challenges as part of the Ego4D project. The challenges aim to build a better understanding of video material in order to develop useful AI assistants. The benchmarks focus on understanding first-person perception. The benchmarks are described as follows:

  • Episodic memory (what happened when?): for example, figuring out from first-person video where you left your keys
  • Hand-object manipulation (what am I doing and how?): this aims to better understand and teach human actions, such as giving instructions on how to play the drums
  • Audio-visual conversation (who said what and when?): this includes keeping track of and summarising conversations, meetings or classes
  • Social interactions (who is interacting with whom?): this is about identifying people and their actions, with a goal of doing things like helping you hear a person better if they’re talking to you
  • Forecasting activities (what am I likely to do next?): this aims to anticipate your intentions and offer advice, like pointing out you’ve already added salt to a recipe if you look like you’re about to add some more.

What about privacy?

Obviously there are significant concerns regarding privacy. If this technology is paired with smart glasses constantly recording and analysing the environment, the result could be constant tracking and logging (via facial recognition) of people moving around in public.

Read more: Face masks and facial recognition will both be common in the future. How will they co-exist?

While the above may sound dramatic, similar technology has already been trialled in China, and the potential dangers have been explored by journalists.

Facebook says it will maintain high ethical and privacy standards for the data gathered for the project, including consent of participants, independent reviews, and de-identifying data where possible.

As such, Facebook says the data was captured in a “controlled environment with informed consent”, and in public spaces “faces and other PII [personally identifying information] are blurred”.

But despite these reassurances (and noting this is only a trial), there are concerns over the future of smart-glasses technology coupled with the power of a social media giant whose intentions have not always been aligned with those of its users.

Read more: Artificial intelligence in Australia needs to get ethical, so we have a plan

The future?

The ImageNet dataset, a huge collection of tagged images, has helped computers learn to analyse and describe images over the past decade or more. Will Ego4D do the same for first-person video?

We may get an idea next year. Facebook has invited the research community to participate in the Ego4D competition in June 2022, and pit their algorithms against the benchmark challenges to see if we can find those keys at last.

Authors: Jumana Abu-Khalaf, Research Fellow in Computing and Security, Edith Cowan University

Read more https://theconversation.com/facebook-wants-ai-to-find-your-keys-and-understand-your-conversations-170092
