Agetech World

New apps help visually impaired users see the world differently

Visually impaired iPhone users can now download two new free apps developed to give them more autonomy in their everyday lives.

Created by a team based at the University of Michigan, VizLens is essentially a screen reader that uses the smartphone camera to let users understand and operate a variety of interfaces in everyday environments, including home appliances and public kiosks, just by touching buttons on their mobile.

Meanwhile, Image Explorer identifies features in a picture, allowing the user to examine it through touch and audio feedback.

Anhong Guo, assistant professor of computer science and engineering at the University of Michigan, led the development of both software applications, which are available to download from the Apple app store. He said: “A blind user can take a picture of an interface, and we use optical character recognition to automatically detect the text labels.

“A user can first familiarise themself with the layout on their smartphone touchscreen. Then, they can move their finger on the physical appliance control panel, and the app will speak out the button under the user’s finger.”
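The lookup Professor Guo describes can be illustrated with a minimal sketch (hypothetical code, not the team's actual implementation): an OCR pass yields text labels with bounding boxes, and the app announces whichever label sits under the user's tracked fingertip.

```python
# Hypothetical sketch of VizLens's core lookup step: given OCR results
# for a photographed control panel and a tracked fingertip position,
# find the button label under the finger. Names and data shapes here
# are assumptions for illustration only.

def label_under_finger(ocr_results, finger_x, finger_y):
    """ocr_results: list of (label, (x, y, width, height)) boxes
    produced by an OCR pass over the panel image."""
    for label, (x, y, w, h) in ocr_results:
        if x <= finger_x <= x + w and y <= finger_y <= y + h:
            return label  # this is what the app would speak aloud
    return None  # finger is not over any detected button

# Toy example: a microwave panel with three OCR-detected buttons.
panel = [
    ("Start", (10, 10, 60, 30)),
    ("Stop", (10, 50, 60, 30)),
    ("Timer", (10, 90, 60, 30)),
]
print(label_under_finger(panel, 30, 60))  # → Stop
```

In practice the app must also track the finger in the camera feed and align it with the photographed panel; the sketch covers only the final label lookup.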

Loss of vision can affect people of all ages, but the majority of those with vision impairment and blindness are over the age of 50. The leading causes of eyesight problems are cataracts and uncorrected refractive errors.

VizLens uses a smartphone’s camera to view control interfaces, such as the one on this microwave, and read each label. Image: Human-AI Lab, University of Michigan

According to the UK-based RNIB (Royal National Institute of Blind People), nearly 80% of those living with sight loss are 65 or older, and around 60% are 75-plus. More than half of all people with sight loss are women.

But the advent of smartphones and specially tailored apps has revolutionised how the visually impaired now interact with the world around them.

For Image Explorer, Professor Guo and his team integrated a suite of object detection and segmentation models – including Meta’s Detectron2 visual recognition library and Google OCR (optical character recognition) and image analysis models – to enable visually impaired users to explore what’s in an image and how the objects within it relate to one another.

Professor Guo’s aim has been to offer visually impaired people an accurate way of forming a mental image when alt text is missing or incomplete, as AI-generated captions are often not sufficient.

He explained: “There are a number of automated caption programmes out there that blind people use to understand images, but they often have errors, and it’s impossible for users to debug them because they can’t see the images. Our goal, then, was to stitch together a bunch of AI tools to give users the ability to explore images in more detail with a greater degree of agency.”

When a picture is uploaded, Image Explorer provides a thorough analysis of its content. It gives a general overview of the image, including the objects detected, relevant tags, and a caption.

The app also features a touch-based interface that allows users to explore the spatial layout and content of the image by pointing to different areas.
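The touch-based exploration described above can be sketched as follows (hypothetical code, with names and data shapes assumed for illustration): the models' outputs are merged into a list of labelled regions, and the app announces whichever region the user touches, preferring smaller regions so a detected shirt is reported before the person box that contains it.

```python
# Hypothetical sketch of a touch lookup for Image Explorer: regions
# come from merged model outputs (object detection, OCR, etc.); the
# smallest region containing the touch point is announced.

def describe_touch(regions, tx, ty):
    """regions: list of (label, (x, y, width, height)).
    Returns the label of the smallest region containing the touch
    point, or 'background' if nothing was detected there."""
    hits = [
        (w * h, label)
        for label, (x, y, w, h) in regions
        if x <= tx <= x + w and y <= ty <= y + h
    ]
    if not hits:
        return "background"
    return min(hits)[1]  # smallest area wins

# Toy scene: a person wearing a shirt, next to a sign with text.
scene = [
    ("person", (0, 0, 100, 200)),
    ("shirt", (20, 60, 60, 70)),
    ("sign reading 'OPEN'", (150, 20, 80, 40)),
]
print(describe_touch(scene, 50, 90))   # → shirt
print(describe_touch(scene, 160, 30))  # → sign reading 'OPEN'
```

A real implementation would also use segmentation masks rather than plain boxes and speak the result through the phone's screen reader; the sketch covers only the spatial lookup.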

Image Explorer correctly auto captions the image as “a couple of women walk down a sidewalk.” Image: Human-AI Lab, University of Michigan

Image Explorer developers say it is unique in the level of detail it provides.

It gives users a comprehensive description of the objects in an image, even down to the level of what type of clothing a person is wearing and what activities they are engaged in, as well as their position within the picture.

Professor Guo said: “Image Explorer helps users understand the content of an image even though they cannot see it.”

Hundreds of visually impaired user-testing participants have experimented with VizLens and Image Explorer, offering feedback to Professor Guo’s team, which is continuing to develop these tools.

First discussed in 2022, Image Explorer is a much newer concept than VizLens, which made its academic debut in 2016.

Some of its details need further refinement – for instance, most tops are simplified to ‘shirts,’ and different tools within Image Explorer sometimes give conflicting information.

“The accuracy relies on the models we use, and as they improve, Image Explorer will improve,” Professor Guo said. “In spite of these errors, the results we presented in 2022 show that Image Explorer enables users to make more informed judgements of the accuracy of the AI-generated captions.”

Professor Guo is looking forward to the feedback that will come with public deployment. “We will be able to observe how people use these tools and adapt them to their lives,” he said.
