Alvin Wong, vice president of marketing & business development in the Programmable System Solutions Group at Spansion, explains to CIE why the user interface (UI) is the next battlefield for consumer electronics manufacturers.
“With only relatively minor differences between the features that competing devices offer, OEMs have realised that consumers no longer choose between devices based solely on what they do. Instead, consumers turn to the industrial design and UI of a product.
The UI defines the personality of a device, and its look-and-feel can help to create an emotional link between people and their devices and to bind their loyalty. The truth of this can be seen in everything from the iPhone to kitchen appliances: the decision to buy a particular toaster is less about the toast than about how the product looks on the counter.
UI features that don’t work well aren’t used, so devices must work out of the box and adapt to the specific preferences of individuals. To achieve this, devices need to become more intelligent and will require more processing power and memory. Predictive intelligence is the ability of a system to predict a user’s desired outcome by understanding the different choices the person can make and which is the most likely to be selected. It can be active or passive. For example, auto-correction in SMS texting can actively replace misspelled words. Passive prediction is often used to suggest selections to users; for example, Apple Genius and Pandora use a listener’s selection history to create a list of suggested songs.
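As a rough illustration of passive prediction, the sketch below ranks a set of options by how often the user has chosen each one before. The function name, the data and the simple frequency count are all assumptions made for this example; they are not a description of how Apple Genius or Pandora actually work.

```python
from collections import Counter


def suggest(history, candidates, k=3):
    """Return the k candidates the user has selected most often.

    Hypothetical helper: a plain frequency count stands in for whatever
    model a real product would use.
    """
    counts = Counter(history)
    # sorted() is stable, so ties keep the order given in `candidates`.
    ranked = sorted(candidates, key=lambda c: -counts[c])
    return ranked[:k]


# Made-up listening history for illustration only.
history = ["jazz", "rock", "jazz", "folk", "jazz", "rock"]
print(suggest(history, ["folk", "rock", "jazz", "pop"], k=2))
```

The suggestions are only offered to the user, never applied automatically, which is what makes this passive rather than active prediction.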
Predictive intelligence can introduce complexity to a system, however. For a UI with fixed options, such as controlling the infotainment console, a command-and-control speech recognition interface supports a limited vocabulary to provide low latency and high reliability. Natural language processing, where a user can talk as if to another person, not only requires more advanced storage and computing capabilities but also needs to adapt to individual users to improve accuracy.
The more data the system can collect about a user, the more confidently it can predict what the user wants. Today, predictive technology assumes that people are consistent in their interactions. In reality, a person’s emotions and distractions affect their choices. Consider a car’s speech-recognition system. A person who is angry when getting into the car is likely to speak more quickly or yell at the car, resulting in lower recognition accuracy and further user frustration. A system that can adjust its behaviour to compensate for agitated responses is needed. For example, the system could apologise for not understanding the user and ask for help by suggesting that the user roll up the window. By proactively adapting to the situation, the system can begin to help calm the user, thereby improving accuracy.
Predicting the user’s state of mind is extremely complex and, to maximise accuracy, multiple sources are needed to determine the user’s intent and emotion. The camera on a mobile phone could recognise not only the face of a user but also his or her current emotional state from facial expressions and body language. The touchscreen could sense an urgent or agitated state by how hard and how quickly a user is pressing keys. Similarly, a speech recognition system could monitor changes in the user’s voice and mood even as it adjusts for them. In addition, these systems could coordinate their results to improve accuracy.
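One simple way to picture how several sensors might coordinate their results is a confidence-weighted average. In the sketch below, each source reports a hypothetical agitation score and a confidence, both between 0 and 1; the names, the numbers and the weighting scheme are illustrative assumptions rather than a method described here.

```python
def fuse_agitation(readings):
    """Combine per-sensor (score, confidence) pairs into one estimate.

    Each score is weighted by its sensor's confidence, so a source that
    is unsure of its reading contributes less to the result.
    """
    total_weight = sum(conf for _, conf in readings.values())
    if total_weight == 0:
        return None  # no sensor produced a usable reading
    weighted = sum(score * conf for score, conf in readings.values())
    return weighted / total_weight


# Made-up readings: the camera is the most confident source.
readings = {
    "camera": (0.8, 0.5),
    "touchscreen": (0.6, 0.25),
    "voice": (0.4, 0.25),
}
estimate = fuse_agitation(readings)
print(f"fused agitation estimate: {estimate:.2f}")
```

A real system would need a more careful model, but the principle is the same: no single source has to be right on its own.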
Determining a user’s emotional state has further applications beyond just improving the accuracy of the UI. For example, a user could allow friends and family to assess his or her emotional state before they complete a call. This could enable a spouse to decide to use texting instead of a voice call. And advertisers would also be likely to pay a premium to place their ads when a person is more engaged, such as when talking to friends over social networks.
However, this requires significant complexity within the device, and one challenge for OEMs is deciding whether to implement the intelligence processing on board the device or in the cloud.
In the cloud, processing is centralised over the network so even systems with limited processing capabilities can support more advanced technologies. However, because data has to be sent over the network connection, there is noticeable latency. Also, IP networks can drop packets, resulting in unreliable responsiveness, and there are interoperability issues that must be addressed.
A wholly on-board implementation, in contrast, offers high reliability with fast responsiveness but at a higher equipment cost. For this reason, many automotive applications will employ a hybrid approach, with on-board resources used for functions such as operating the air-conditioning or radio, and the cloud used for features that require substantial memory and processing capabilities, such as analysing user behaviour. The resulting user profiles can then be downloaded to the on-board system.
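The hybrid split can be sketched as a simple routing decision. The command names and the fallback behaviour when the network is unavailable are assumptions made for illustration only.

```python
# Hypothetical set of functions that must respond quickly and reliably,
# so they are always handled by the on-board system.
ONBOARD_COMMANDS = {"air_conditioning", "radio"}


def route(command, cloud_available):
    """Decide where a request should be processed."""
    if command in ONBOARD_COMMANDS:
        return "onboard"
    # Heavier work goes to the cloud; if the network is down it is
    # deferred rather than attempted locally (an assumed policy).
    return "cloud" if cloud_available else "deferred"


print(route("radio", cloud_available=False))
print(route("analyse_user_behaviour", cloud_available=True))
```

Keeping the time-critical functions in the on-board set means they continue to work even when the connection drops.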
Advanced system intelligence requires sophisticated analysis capabilities involving a mix of hardware accelerators, flexible software algorithms and look-up tables. Systems will also need to be highly customisable, secure and able to handle errors elegantly. The typical embedded system has an applications processor that is already heavily burdened so, to achieve the level of performance, accuracy and power efficiency required for advanced UI processing, systems will need specialised hardware-based accelerators.
While the architecture of these devices will be important, it will be the efficiency of the IP that each vendor has to offer and the level of integration which will determine market leaders.
Of course, numerous technologies – graphics, encryption, digital signal processing, high-speed communications – have already leveraged specialised hardware to offload processing from the host processor.
In the same way, OEMs will need to rely on hardware-based ICs to keep pace with the leading edge of UI technology. Over the next five to 10 years, single-feature ICs will quickly evolve into dedicated UI processors that offload multiple forms of recognition – speech, voice, image, facial and emotion – from the applications processor. These will eventually be integrated into the general-purpose processor architecture. And with UI processors, OEMs will be able to introduce advanced features that provide greater ease of use.”