At least 20 computer vision and imaging scientists from Xerox will join their peers from Google, Facebook, Microsoft Research, Amazon and many of the world’s top academic institutions later this month to share their research on making computers more “human-like”, mimicking how the brain sees and thinks.
The annual IEEE Conference on Computer Vision and Pattern Recognition, set for June 23-28 in Columbus, Ohio, draws top scientists from around the world who are working to advance computer vision, a field that enables machines to “see” and make sense of the world, augmenting and often exceeding human capabilities. Their work is steadily closing the gap between reality and Hollywood depictions of artificial intelligence.
“Xerox has first-hand knowledge of business processes across many industries, and is a pioneer in teaching computers to extract meaningful and actionable analytics from images and video,” says Raja Bala, a Xerox principal scientist.
“Although there’s been significant progress in recent years, a number of scientific challenges remain to be resolved.”
Xerox research presented at this year’s conference includes:
* Detecting cell phone use by highway drivers
Because of its impact on public safety and property, several state and federal government organisations prohibit cell phone use while driving. Xerox scientists are working on a camera system for highways that uses pattern recognition technology to detect whether a driver is using a cell phone.
Researchers in Webster are also working on a computer vision project that would turn smartphones into driving assistants. Using facial feature detection technology, the phone would estimate a driver’s gaze direction and detect when the driver is distracted from the road.
Researchers at both Xerox in Europe and Harvard University are studying what attracts people’s attention first when they look at a picture. Understanding that eye-catching element enables visuals to be composed for greater effect and makes it possible to predict where people will look when facing a scene, a photo or a game.
Images and video make up 90% of today’s Internet traffic. Exploring and using this massive amount of data requires technology that can automatically analyse an image and create a unique ‘visual signature’ that distinguishes it from other images.
The Xerox Research Centre Europe (XRCE) has developed a patented, state-of-the-art method that creates such signatures in an extremely compact and robust fashion, outperforming current deep learning methods on challenging image classification problems such as recognising the make and model of a car.
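To make the idea of a visual signature concrete, here is a minimal Python sketch. It is not the XRCE method, which this announcement does not describe. It assumes a toy signature (a normalised histogram of grayscale pixel values) and compares two signatures by cosine similarity. The function names `signature` and `similarity` are hypothetical.

```python
import math


def signature(pixels, bins=8):
    """Toy signature (an assumption, not XRCE's method): an L2-normalised
    histogram of grayscale pixel values in the range 0..255."""
    hist = [0.0] * bins
    for p in pixels:
        # Map the pixel value to one of `bins` equal-width buckets.
        hist[min(p * bins // 256, bins - 1)] += 1.0
    norm = math.sqrt(sum(h * h for h in hist)) or 1.0
    return [h / norm for h in hist]


def similarity(a, b):
    """Cosine similarity of two L2-normalised signatures of equal length."""
    return sum(x * y for x, y in zip(a, b))
```

Under these assumptions, two images with the same distribution of pixel values score close to 1, and images whose pixel values fall in entirely different buckets score 0. A production system would use far richer features than raw intensities.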
Crowdsourcing is frequently used to find individuals to label images, but it presents challenges when specific subject matter experts are required (such as ornithologists to identify bird species), since such experts are rarely available on crowdsourcing platforms.
To assist non-expert annotators, the Xerox research centres in Europe and India have designed a system that performs a first-pass filter and proposes a very limited number of categories to choose from. The system automatically includes ‘gold’ questions, whose correct answers are already known, to identify untrustworthy annotators and ensure the quality of labelling.
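The two ideas in this workflow, a shortlist of candidate categories and gold questions for vetting annotators, can be illustrated with a short Python sketch. This is an assumption-laden illustration, not the Xerox system: the function names, the 0.8 accuracy threshold and the data layout are all hypothetical.

```python
from collections import defaultdict


def shortlist(scores, k=3):
    """Return the k highest-scoring categories from a first-pass classifier.
    `scores` maps category name -> score (hypothetical layout)."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def annotator_accuracy(answers, gold):
    """Score each annotator on gold questions only.
    `answers` is a list of (annotator, question_id, label) tuples;
    `gold` maps question_id -> known correct label."""
    correct = defaultdict(int)
    seen = defaultdict(int)
    for annotator, question_id, label in answers:
        if question_id in gold:
            seen[annotator] += 1
            if label == gold[question_id]:
                correct[annotator] += 1
    return {a: correct[a] / seen[a] for a in seen}


def trusted_annotators(answers, gold, min_accuracy=0.8):
    """Keep annotators whose gold-question accuracy meets the threshold
    (0.8 is an arbitrary illustrative value)."""
    accuracy = annotator_accuracy(answers, gold)
    return {a for a, score in accuracy.items() if score >= min_accuracy}
```

In this sketch, labels from annotators outside the trusted set would simply be discarded. A real system might instead weight each label by the annotator's estimated reliability.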

