Computer Vision

INESC TEC Science Bits – Episode 11

PODCAST INESC TEC Science Bits

 

Guest speakers:

Paula Viana, Centre for Telecommunications and Multimedia (CTM)

Pedro Carvalho, Centre for Telecommunications and Multimedia (CTM)

 

Keywords: computer vision | artificial intelligence | computer | vision | (deep) machine learning

 

Paula Viana and Pedro Carvalho

When we must teach machines how to see

Teaching a machine to see the way humans do is a very difficult task, mainly because computers must replicate, by means of mathematical calculation, the brain’s seemingly trivial processing of visual information, a complex process that is still not fully understood.

In a broad sense, computer vision is the technology that teaches computers to see and interpret the world through images. People learn to do this from birth, but computers must be taught to perform the same kind of identification. How can we teach a computer to tell a dog from a chair, when both have four legs? We cannot expect computers to pick out the characteristics of a dog (e.g., tail, ears or snout) the way people do. Instead, we must feed them millions of images of dogs and chairs, so that they can find the patterns those images share and learn for themselves what distinguishes a dog from a chair.
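The idea of learning from labelled examples can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions: the two-number "features" (furriness and rigidity) and the helper names `train` and `predict` are invented here, and real computer vision systems learn their features directly from the pixels of very large image collections.

```python
# Toy illustration: learning "dog" vs "chair" from labelled examples.
# The feature values below are made up for this sketch.
from math import dist


def train(examples):
    """Average the feature vectors of each label into one centroid per class."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features: (furriness, rigidity), each between 0 and 1.
training_set = [
    ((0.9, 0.1), "dog"),
    ((0.8, 0.2), "dog"),
    ((0.1, 0.9), "chair"),
    ((0.2, 0.8), "chair"),
]

centroids = train(training_set)
print(predict(centroids, (0.7, 0.3)))
```

Nobody tells the program what a tail or a snout is. It only sees examples with labels and works out for itself where the boundary between the two classes lies.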

Many abilities yet to explore

Computer vision, which also incorporates artificial intelligence, is increasingly present in our daily lives. We use solutions based on these technologies without even realising it, and without understanding the complexity hidden behind actions as immediate as editing a photo on the computer, choosing a filter for an Instagram post or tagging a friend in a photo on Facebook.

However, the applicability of these technologies is not limited to these actions. They can be useful in many different fields, from medicine to sports, from security to energy, from mobility to healthcare, from multimedia to agriculture. All of them can benefit from new, intelligent machines whose operational and data processing capabilities are, in some cases, already superior to those of humans.

Some high-potential innovations

INESC TEC has been promoting several initiatives in the field of multimedia systems and communication, taking part in or leading innovative projects based on computer vision technologies and developing solutions that improve the relationship between people and technology.

Transforming a simple photograph into a video can be interesting and even valuable in some areas. This is the main goal of the FotoInMotion project, which focuses on developing a tool that transforms a single photo into a high-quality video with dynamic storytelling and branding effects. Using computer vision and artificial intelligence techniques, the tool supports audio-visual professionals and the creative public in general by turning static content (a photograph) into a video that tells a story suited to the image’s content and context.

Moreover, FotoInMotion addresses three creative industries: photojournalism, fashion and event organization, although it can be adapted to any other area that relies on photographic content.

As for the CHIC project (Cooperative Holistic View on Internet and Content), the idea is to develop an ecosystem of technologies that help stakeholders create and promote content. INESC TEC’s contribution to the consortium is the development of solutions for managing and operating audio-visual archives: they identify specific people in video pieces and give direct access to the moment (timecode) at which they appear on screen, which makes it easier to search the archive for particular elements. This technology enables efficient reuse of archived content while reducing operators’ costs. It also makes searches by ordinary users on a television station’s online platforms more efficient.
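To make the idea of jumping straight to the moment a person appears more concrete, here is a minimal Python sketch of a person-to-timecode index. It is not the CHIC implementation: the detection records, the video identifiers and the function names `build_index` and `to_timecode` are all hypothetical, and the person-recognition step itself is assumed to have already run.

```python
# Sketch of an archive index: person -> list of (video, second) appearances.
# The detections are invented; in practice they would come from a
# face or person recognition model applied to each video.
from collections import defaultdict


def build_index(detections):
    """Group (person, video_id, seconds) detections by person, sorted by video and time."""
    index = defaultdict(list)
    for person, video_id, seconds in detections:
        index[person].append((video_id, seconds))
    for appearances in index.values():
        appearances.sort()
    return dict(index)


def to_timecode(seconds):
    """Format a number of seconds as HH:MM:SS."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical output of a recognition step.
detections = [
    ("Person A", "news_02", 3600),
    ("Person B", "news_01", 12),
    ("Person A", "news_01", 75),
]

index = build_index(detections)
for video_id, seconds in index["Person A"]:
    print(video_id, to_timecode(seconds))
```

Once such an index exists, a search for a person becomes a simple lookup rather than a manual scan through hours of footage.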

Finally, the application of computer vision has also been increasing in sports, particularly through tools that support the analysis of movements and provide indicators for improving performance. That is the main goal of the Ténis Video Sports AI project, which applied these technologies in a commercial application (Tennis Tracking – AI Training) that is already available on the App Store and has been establishing itself as a powerful solution for obtaining results.

The future is here

Digital transformation has reached almost every area, albeit with differing levels of maturity. The current challenge is to address regulation and ethics, and there is still a long way to go. Each step forward opens up more possibilities. Will there be any limits to these technologies?

 
