Last week I stumbled upon a news story about the voice-driven assistance services that have gained so much traction recently: “Mozilla is crowdsourcing voice recognition to make AI work for the people”. I have been an enthusiastic Alexa user myself right from the start, and I believe voice-based assistance services, and voice-based human-computer interaction in general, have a bright future. At least bright in terms of a sharp surge in usage and a myriad of use cases that bring benefits to most consumers and businesses.

The depth of reach into our everyday life will be breathtaking

When thinking about a future with omnipresent voice assistants, it is paramount to consider the question of ownership. Who will rule the digital assistants that will organize our everyday lives? Today, voice-based services like Google Home, Amazon Alexa, Apple’s Siri, or Microsoft’s Cortana are in their infancy, and you run into their shortcomings the moment you really try to do everything you can imagine. If you want to see that for yourself, ask Alexa to play some of the early Bon Jovi songs written by Desmond Child. It doesn’t lead to anything meaningful. But that will change.

The gigantic artificial intelligence platforms being developed right now are taking big steps forward through continuous learning. Every day. With every word spoken into one of these speakers, and with every data source connected to them, they get better and better. They will soon occupy a space in our everyday lives so significant that it is hard to grasp from today’s perspective. They will make our appointments, carry out e-commerce orders, buy tickets when our favorite artists are in town, manage and enhance our communication, and remind us to grab a coat because it is a little chillier that day than usual. They will know a tremendous amount about us. They will answer our questions in their own distinct fashion and, last but not least, direct our money in one direction or the other.

Will AI & Voice-based Assistance Services be dominated only by the Tech Giants?

At the moment, the tech giants are the only players possessing the means and funds to develop and run such all-rounder services. And here we are back at the Mozilla story from the beginning of this post. Mozilla wants to step in and change some of the procedures. The first step is an open-source system for speech recognition and natural language processing. That would be the basis for a full-blown digital voice-based assistant. One that is owned neither by Google, nor Apple, nor Amazon, but by all of us.

For me, it is important to narrow the considerations down to one question: do you want a service with such intimate and encompassing access to your life to be a product of one of the tech giants? Or not?

By donating your voice to Mozilla’s Project Common Voice, everyone can help push an independent service forward by providing learning material for the project’s speech recognition and natural language processing algorithms. This could be your contribution to a future where the robots and software we talk to are owned by all of us.

This all reminds me of my own ideation work on assistance services from 2013

Thinking about this idea of liberating digital assistance services reminds me of some work I did in 2013. I was one of the driving forces in a future ideation project aimed at identifying next-generation services and product ideas in the communications sector. It was a heavily design-thinking-based initiative: develop future scenarios, condense them into a single normative scenario, and then generate service and product ideas from day-in-a-life ideation exercises. The time horizon was 2022, so basically a ten-year future ideation scope. Looking back, I have to admit that the day-in-a-life ideation is where the exercise completely captured me, as I was totally triggered by the outlook of how much I imagined digital assistance services would already dominate our lives in 2022.


The following graphic shows how many of the almost 200 ideas were somehow centered around and tagged with “assistance”. It’s the crowded lower “Next Generation VAS” section, with a total heavyweight in “assistance” service ideas. At some point during the creative process, I felt totally carried away. I could sense my fellow participants’ lack of understanding of some of the really low-level, detailed digital assistance management service and feature ideas. It was a lot of fun, though, and ideally I should have started building a company around services like these… digital, artificially driven, voice-based assistance services. Today I often bite my lip when I realize that exactly this kind of stuff is becoming really relevant, and five years earlier than anticipated back then. Well, at least I did anticipate it… remembering the others scratching their heads like “what the heck is this guy talking about?” 🙂


Here are some highlights of the ideas I had in 2013

  • Biometric activation of voice and visual augmentation based assistance services (so that very personal information is not accessible to others)
  • Automated settings and preferences proposals when signing up for services (to make for an easy start and not have to skim through hundreds of preferences… rather shape and, if necessary, adjust them as the service is being used)
  • Payment account & settings management for assistance services (where are funds taken from? auto-pay settings, automated payment thresholds, etc.)
  • Universal and unified access to multiple content sources by digital assistance services (connect to cloud services, local services, dedicated apps and services to learn preferences and control and carry out service tasks)
  • Access-type-agnostic assistance services (same baseline services, but accessed via voice, messaging, apps, etc. in various environments and contexts)
  • Avatar-based personification of assistance services
  • Appointment management assistance (organize all my appointments and requests for such; up to a degree where private and personal appointment requests can be re-arranged and/or rejected based on elaborate rules and preferences… I had the idea of a “contacts appreciation level manager”…)
  • The “user-driven feedback assistance service optimizer” (take 2-3 minutes at the end of the day to review and rate recommendations, auto-carried-out tasks, etc. to improve quality of service; could also work via instant feedback the moment such service results surface)
  • Enabler program / service (for digital assistance services to make maximum impact, they should stretch into every aspect of life… including areas which are “digital laggards”; enable kindergartens, schools, doctors, and all sorts of businesses to interact with consumers’ digital assistance services in a bi-directional fashion)
  • “Ambient life observation & guidance” (going beyond the “do this and this, dear assistance service” principle and letting assistance services auto-recommend and propose decision corrections, with interactions like “you really want to go into this area? … statistics say it’s not safe at this time of day”)
  • Life situation templates (e.g. load the “young family assistance services” template to enable my assistant to perfectly help me in my new life situation as a young dad, etc.)
  • Inter-service coordination (how do we ensure that my provider 1 assistance service is able to interact with provider 2’s service that my business partner uses?)

…to name a few.

Detaching services and data ownership is crucial

Ultimately it all culminated in the key idea that, whatever kinds of really cool, helpful, and profound assistance services there are, there should be a detaching function or layer allowing users to encapsulate all the data used as input into, and gathered by, digital assistance services. I hated the idea of being “trapped” with a single provider holding years’ worth of my individual input and preference data, unable to just take my data and use another service without having to start from scratch there. In a world where service logic and underlying data are decoupled, I would feel much more comfortable. Mozilla’s move is a step in the right direction: put more value creation into open-source hands and democratize those crucial future assistance services.
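To make the decoupling idea a bit more concrete, here is a minimal sketch in plain Python of what such a detaching layer could look like. All names and the JSON-export approach are my illustrative assumptions, not any existing assistant’s API: the assistant reads from and writes to a profile the user owns, and switching providers becomes a data hand-off rather than a restart.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical sketch: a user-owned profile that any assistant can
# read from and append to, but never owns or silos away.

@dataclass
class AssistantProfile:
    preferences: dict = field(default_factory=dict)
    interaction_history: list = field(default_factory=list)

    def record(self, utterance: str, outcome: str) -> None:
        # Assistants append their learning data here instead of
        # keeping it in a provider-internal store.
        self.interaction_history.append(
            {"utterance": utterance, "outcome": outcome}
        )

    def export(self) -> str:
        # The user can take this portable blob to a competing
        # assistant and keep all accumulated context.
        return json.dumps(asdict(self))

    @classmethod
    def load(cls, blob: str) -> "AssistantProfile":
        return cls(**json.loads(blob))

# Switching providers as a data hand-off, not a restart:
profile = AssistantProfile(preferences={"language": "en"})
profile.record("play early Bon Jovi written by Desmond Child",
               "played playlist")
blob = profile.export()                 # leave provider A with my data
migrated = AssistantProfile.load(blob)  # arrive at provider B intact
print(migrated.preferences["language"])
```

The point of the sketch is only the separation of concerns: the service logic (whatever the assistant does with `record` and `preferences`) sits on one side, while the accumulated data lives in a format the user can export and re-import elsewhere.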

That’s my two cents on where AI and voice assistance services should be heading.

Read this post on LinkedIn

Read this post on Medium