Voice search (also referred to as conversational search) is dramatically on the increase, and according to Comscore it is anticipated that by 2020 50% of search queries will be voice driven (Comscore, 2017). It is important to note, however, that this statistic is thought to be heavily misinterpreted and taken out of the context it was originally meant to cover. It is probably one of the most criticised figures of the past year or so. Beyond the Comscore prediction there is a whole host of other support for a move away from ten blue links and a search box, so whilst the Comscore statistic might be on shaky ground, it is pretty clear that the rise of voice and conversational search is a real ‘thing’. The search engines have made no secret of their investment and research into this area, and it is only a matter of time before conversational search follows mobile and the ‘year of voice’ finally rolls around. If voice is to emulate mobile and become the dominant query input form, in the way that mobile overtook desktop search, then voice search would certainly have to overtake text input across the barometer, a measurement that Google uses to monitor overall digital technology behaviour across a number of countries.
Difference between conversational search (voice search) and conversational actions (voice actions)
It is important to note, however, the continued confusion between voice search and voice actions in predictions. Many predictions do not appear to separate conversational search (using voice activation to query a search engine results index) from conversational actions (using voice activation to use Google Actions, Alexa Skills or Google Assistant, for example). These are not the same.
Conversational Search – What is it?
Conversational search is the non-keyboard, verbal input version of a traditional search engine. Instead of using a keyboard or mobile phone screen to enter a query, users speak the query into their mobile phone, their desktop or their smart speaker (e.g. Google Home). Search engine queries (or keywords / keyword phrases) entered by voice tend to be longer and more conversational than typed queries, i.e. they are nearer to how a human would speak. It certainly takes far less time to speak a query than to type one in with a keyboard or keypad, because people speak far more quickly than they type.
Spoken word versus keyboard search
Search engines implementing voice search still aim to return search results as they would if the user were entering the query via a keyboard, but part of the challenge here is the complexity and length of the queries. To this end, understanding natural language is now a strong focus for search engines, which are training parts of their algorithms to understand just what the user’s informational needs are, and in what context, so they can return the most relevant result. Disambiguation is a problem because the English language, for example, has multiple meanings for virtually every other word. How does the search engine know that a horse in a query is one of the four legged variety rather than a clothes horse or a saw horse, or that the user’s voice is ‘hoarse’?
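To make the horse example concrete, here is a toy sketch of one very simple disambiguation idea: pick the sense whose cue words overlap most with the other words in the query. The sense names, cue words and the `guess_sense` helper are all invented for illustration, and this is not a description of how any search engine actually resolves meaning.

```python
# Hypothetical sense inventory: each sense of "horse" gets a few hand-picked cue words.
SENSES = {
    "animal": {"ride", "riding", "stable", "pony", "gallop", "feed"},
    "clothes horse": {"clothes", "laundry", "drying", "washing", "airer"},
    "saw horse": {"saw", "timber", "wood", "workbench", "cutting"},
}


def guess_sense(query, senses=SENSES):
    """Return the sense whose cue words overlap most with the query, or None if nothing matches."""
    words = set(query.lower().split())
    # Score each sense by how many of its cue words appear in the query.
    scores = {sense: len(words & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_sense("where can I buy a horse for drying laundry"))    # clothes horse
print(guess_sense("what size horse do I need for cutting timber"))  # saw horse
print(guess_sense("how much does a horse cost"))                    # None (no context to go on)
```

The last query shows why short, context-free queries are hard: with no surrounding cue words, the sketch has nothing to base a decision on.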
Precision over recall
With voice search the search engine needs to be more precise in its interpretation of the user’s information needs (and context), so in this case precision beats recall. Precision in the result for a conversational search question or query is of paramount importance because currently only one result / answer can be returned, whereas keyboard search can list several results (some even included for diversity) in the traditional 10 blue links. Keyboard search allows the user to interact with the results, select the most appropriate for themselves and indicate their context further. For example, if a keyboard searcher enters “dresses”, a range of results is returned which could meet transactional intent (e.g. shopping), or show images (pictures of dresses), dress shops on maps (perhaps with some geo-location detection), videos of dress shows, and so forth. The query is very top level without much context (a cold start query). However, the user can interact and click around, providing a form of feedback as to their intent. Voice search does not provide for this feedback loop in the same way.
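For readers less familiar with the two terms, the short sketch below computes precision (the share of returned results that are relevant) and recall (the share of relevant results that were returned) for a ten-result keyboard page and a single voice answer. The result IDs and relevance judgements are made up for the “dresses” example, and `precision_recall` is a hypothetical helper, not part of any search engine’s tooling.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a list of returned result IDs and a set of relevant IDs."""
    returned_set = set(returned)
    hits = len(returned_set & relevant)
    precision = hits / len(returned_set) if returned_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented relevance judgements for one user searching for "dresses".
relevant = {"shop_a", "shop_b", "images", "map_pack"}

# A keyboard results page: ten links, four of which suit this user.
keyboard_results = ["shop_a", "images", "video", "map_pack", "news",
                    "shop_b", "blog", "forum", "wiki", "advert"]

# A voice response: only one answer can be returned.
voice_good = ["shop_a"]
voice_bad = ["news"]

print(precision_recall(keyboard_results, relevant))  # (0.4, 1.0)
print(precision_recall(voice_good, relevant))        # (1.0, 0.25)
print(precision_recall(voice_bad, relevant))         # (0.0, 0.0)
```

The keyboard page can afford low precision because the user picks from the list. The single voice answer is either right or wrong, which is why precision matters most there.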
Answering questions and meeting queries
Conversational search is focused very much around searching in the same way as a user would do with a traditional search engine, entering queries, asking questions, and so forth, but with voice.
Conversational search is not without its challenges, however. Some of these were covered in depth at the 2017 European Summer School for Information Retrieval, in a lecture by Google’s Enrique Alfonseca, who is on the conversational search and natural language processing research team at Google Switzerland.
Conversational Actions – What are they?
Conversational actions are different to conversational search. Conversational Actions tend to be commands of an assistive nature and are very much part of a task-based assistive system. These types of actions might be “Play Calvin Harris”, “Play my morning playlist”, or “What’s on my calendar today?”. Conversational Actions are mapped to, in Google’s case, Google Assistant. Conversational Actions can also be mapped and integrated with other platforms and products such as Spotify (for playlists, music and radio), your Google Calendar (for diary management), your Nest or Hive installation (for heating and lighting control), and of course your phone.
There is even now functionality for developers outside of search engines to build their own conversational actions and some of these have already been implemented by hotel and travel sites, takeaway sites, mobile phone providers and even train ticket providers, with a view to helping their users to accomplish tasks associated with their product or service offering.
2019 Voice search statistics
There’s no shortage of statistics out there on voice (conversational) search. Let’s take a look at some of the statistics currently doing the rounds:
- Gartner predicts by 2020 some 30% of searches will be undertaken without a screen.
- According to Alpine.AI, in January 2018 alone there were over a billion searches using voice search.
- Gartner predicts by 2021 early adopter brands will increase revenue by 30% by redesigning websites to cater for visual and voice search.
- Voice shopping is set to jump to $40 billion by 2022, rising from $2 billion today, according to OC & C Strategy Consultants.
- Three giants of tech are leading the virtual assistant space Stateside: Google has 4% of the market with its Google Home device, Microsoft has 2% with Cortana, and Amazon has 10% penetration with the Echo. These figures are for the US homes market.
- There are currently only 39 apps in the market with the capacity (developed skills) to integrate retail offerings into voice offerings (according to OC & C Strategy Consultants).
- Household smart speaker penetration, dominated by Amazon, is expected to soar to 55% over the next four years, from 13% currently (OC & C Strategy Consultants).
- 69% of customers using voice search know the exact product they wish to buy. (Source: OC & C Strategy Consultants).
- 58% of consumers have used voice search to find local business information within the last year (Bright Local).
- 27% visit the website of a local business after conducting a voice search (Bright Local).
- Only 39% of consumers trust in the “personalized” product selection of smart speakers.
- 39 million Americans now own a smart speaker (Source: Techcrunch).
- 30% of smart speaker owners say the device is replacing time spent with TV (Source: NPR).
- Technavio’s analysts forecast “the global voice recognition biometrics market to grow at a CAGR of 22.15% during 2014-2019”.
- 52% of smart speaker owners, according to Google, keep their voice device activated in their living room.
- 41% of people who own a voice-activated speaker say they feel that they are talking to a friend when talking to their device, according to 2017 research undertaken by Google and Peerless Insights.
- According to 2018 research by Brightlocal, 46% of voice search users look for a local business with voice search every day.
- According to Google research, 72% of people who own a voice activated device say they use it as part of their daily routine (Source: Google / Peerless Insights).
- 18 million smart speakers were sold in Q4 of 2017 alone according to Voicebot AI.
- Search Engine Watch reports that Google Trends shows a 35% increase in the usage of voice activated devices since 2008, with “call Mom”, “call Dad” and “navigate home” the top voice requests. (NB: It should be noted these are not voice searches but conversational actions using Google Assistant.)
- According to Global Web Index, 1 in 5 adults (20%) use voice search at least monthly.
- The largest user group, at 25%, is the 16–24 year old demographic, who use voice search on mobile devices, says Global Web Index.
- 5% of individuals aged 16–24 use voice search on their mobile devices.
- Mary Meeker’s Kleiner Perkins Annual Internet Trends Report 2018 cites Andrew Ng, former Chief Scientist at Baidu, as saying in 2014: “In two years time at least 50% of all searches are going to be either through images or speech” (Andrew Ng, Baidu, 2014). The citation comes from an interview Andrew Ng gave to Fast Company.
- Research by Sistrix using a Google Survey sample of 26,720 answers collated from Germany, Spain, United Kingdom and United States found 74% of respondents were not using voice search at all (Source: Sistrix, 2018). (NB: Of course, conversely this does equate to 26% using voice search).
- 65% of Amazon Echo or Google Home owners don’t want to go back to the days of keyboard input.
- According to Google, 20% of all searches are voice.
- 46% of smart speaker owners use voice search to find a business situated in their vicinity every day.
- 22% of smart speaker owners have bought something using voice search assistance.
So what do these statistics mean for the future of voice search?
Whilst it’s easy to be confused about how these statistics might turn into commercial opportunities right now, it’s evident from the above, and from the many more statistics available on voice search, that the tide is turning. Even if voice search optimisation is not on your SEO strategy radar right now, you should certainly be keeping a very keen eye on this changing landscape as we move further into the age of assistive search.
Some of the other main challenges with this quickly evolving and emerging area of search were covered at the European Search Conference and the slides are below, as well as on Slideshare: