“Alexa, tell Smoothie Bomb I want to make a smoothie”
Build a voice skill that allows the user to select quick 5-minute recipes from several options and then provide step-by-step instructions.
A voice-forward Alexa Skill for finding and making the perfect smoothie recipe
User Research | Personas | Script & Flow | User Testing | Multimodal Prototyping
Myself & VUI mentor
The original challenge was to create a 5-minute recipe skill that allowed users to select from several options and then provided step-by-step instructions.
I felt that smoothie recipes would be an effective focus due to the wide variety of quick options that can be made without complicated techniques or equipment.
Target audience → 18-35-year-old current smoothie-makers who are looking for variety, and new smoothie-makers who want clear instructions to introduce them to making smoothies.
Able to physically make a smoothie, and have made one before.
Value quick snack/meals made of whole food ingredients.
Own an Alexa device and a blender.
“Alexa, ask Smoothie Bomb to find me a recipe with banana”
“I’ve got a bunch of recipes with banana. The 3 most popular are...”
Conducting interviews that explored motivations for making smoothies led to the creation of Proto-Persona: Erin Trujillo.
As health efforts and ‘trying new things’ can be frustrating or even shame-inducing, I took care to build a system persona that would be witty and fun but never at the expense of being supportive and encouraging.
It felt important to cater to the set of users who may be more sensitive to failure. The encouraging nature of the system persona could increase their level of engagement without negatively affecting the set of users who did not 'need' this positive reinforcement.
FIRST PHASE: Learn, Search, Select
For the purposes of this challenge, I focused on building out the script for Intro and Find Smoothie. For users to successfully find a recipe, they have to know what's in it and so I also worked on the Hear Ingredients intent. Similarly, for users who didn't know what they wanted, I created Discover Categories. On-boarding, searching and evaluating recipe ingredients felt inseparable for testing purposes and so I chose to build them together for this challenge.
SECOND PHASE: Get in the kitchen
After building a pathway to access and evaluate recipes, developing the Make Smoothie intent would be the next logical step. Make Smoothie seemed the most compatible with a voice-only experience, so I decided to focus instead on the other, more difficult intents. Additionally, testing Make Smoothie would require recipe development, which was beyond the scope of my efforts.
THIRD PHASE: Helpful non-essentials
If a user has selected a recipe but is not ready to make it yet, they may also need to Make a Shopping List or Save Recipe. I would assign these intents to a third tier of development. Depending on the resources allocated for developing this skill, this may also be the time to work on Create Profile.
Welcome to Smoothie Bomb
I initially wanted the introduction to feel like the user had just walked into their favorite juice bar. If this were actually the case, the user could browse the signboard menu or ask for a recommendation. In a voice-only world, I wanted to empower the user with an orientation to Smoothie Bomb's search options.
*Blender Sound Effect* [Time-sensitive greeting]! Smoothie Bomb is all about helping you find and make the perfect smoothie recipe. I can search for recipes by ingredient or by category. Some popular categories are [fruity], [green smoothies] and [high protein].
What sounds good today?
Actually... we are the ones being welcomed
The logic I had started with was "does the user know how to search?" But as I read through the dialogue, I realized this wasn't a very conversational approach nor did it focus on the central user need.
A more fundamental question was "does the user know what kind of smoothie they want to make?" This perspective makes much more sense as the user isn't walking into a juice bar. Instead, the whole interaction is more appropriately viewed as a professional chef being invited into their kitchen.
[Time-sensitive greeting]! Smoothie Bomb is all about helping you find and make the perfect smoothie.
Do you know what kind of smoothie you want to make today?
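The revised introduction could be assembled roughly as follows. This is a minimal sketch rather than the skill's actual implementation: the function names and the hour cutoffs for morning, afternoon and evening are assumptions made for illustration.

```python
from datetime import time


def time_sensitive_greeting(now: time) -> str:
    """Pick a greeting from the user's local time (cutoffs are assumed)."""
    if now.hour < 12:
        return "Good morning"
    if now.hour < 18:
        return "Good afternoon"
    return "Good evening"


def build_intro(now: time) -> str:
    """Fill the [Time-sensitive greeting] slot of the revised intro prompt."""
    greeting = time_sensitive_greeting(now)
    return (
        f"{greeting}! Smoothie Bomb is all about helping you find and make "
        "the perfect smoothie. "
        "Do you know what kind of smoothie you want to make today?"
    )


if __name__ == "__main__":
    print(build_intro(time(8, 15)))
```

Keeping the greeting separate from the prompt text means the question that ends the intro can be changed after testing without touching the time logic.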
With this new framing of the user/Smoothie Bomb relationship, the pathways branch from a more solid grounding. If the user doesn't have a preconceived desire, it's a great opportunity to introduce the various smoothie categories and search options available to them. If they do know what they want, the next step is to accommodate the different ways they may try to make their query.
User doesn't know what they want
Would you like to hear more categories?
You can search for recipes by ingredient or by smoothie category. Our top categories are [fruity], [green] and [high-protein].
User requesting what they want
I want to make an [ingredient] smoothie
Tell me about your [category] smoothies
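For Wizard of Oz testing or a quick prototype, the two request shapes above can be told apart with simple pattern matching. The sketch below is illustrative only: the intent names are hypothetical, and a deployed Alexa skill would declare sample utterances and slots in its interaction model rather than match text by hand.

```python
import re
from typing import Optional

# Hypothetical intent names paired with the two request shapes shown above.
PATTERNS = [
    (
        "FindSmoothieByIngredient",
        re.compile(r"^i want to make an? (?P<value>[\w\- ]+?) smoothie$"),
    ),
    (
        "FindSmoothieByCategory",
        re.compile(r"^tell me about your (?P<value>[\w\- ]+?) smoothies$"),
    ),
]


def match_request(utterance: str) -> Optional[tuple[str, str]]:
    """Return (intent, slot value) for a recognized request, else None."""
    text = utterance.strip().lower().rstrip(".!?")
    for intent, pattern in PATTERNS:
        found = pattern.match(text)
        if found:
            return intent, found.group("value")
    return None


if __name__ == "__main__":
    print(match_request("I want to make a banana smoothie"))
    print(match_request("Tell me about your high-protein smoothies"))
```

Anything that returns None would fall through to the category introduction for users who don't yet know what they want.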
GUI vs VUI: Playing to strengths
Redirecting users to an easier pathway
For users with dietary restrictions or allergies, the ability to substitute or filter ingredients is an essential task. Additionally, users might want to substitute due to taste preference or ingredient availability. Creating a user profile would save users from having to constantly make the same substitutions. This requires a fair amount of data gathering, which is often better suited to a GUI.
Create Profile prompt that offers a GUI alternative
Creating a profile will help make finding the perfect recipe faster and easier. Which reminds me—you can also edit your profile on smoothiebomb.com.
Who are we making a profile for?
Voice-forward not voice-only
Building out a voice-only pathway (as referenced in the above Alexa prompt) would maximize accessibility and preserve a unified voice-only experience. However, if this project were limited in terms of development resources, a shortcut could be made by directing skill users to complete or edit their profiles on the web.
When actually making the smoothie, voice-forward interactions shine.
The need to be moving around the kitchen gathering ingredients and having occupied or messy hands makes interacting with a touchscreen problematic.
Interacting with the instructions and ingredient list with voice allows users to best focus on the literal task at hand. Granted, during this time, users with a screen-equipped device would certainly benefit from a reinforcing visual of the ingredient/instruction. Stacking communication with a visual helps to reduce cognitive load. This would increase accessibility for populations that struggle with decoding.
Example of reinforcing visual for Echo Show
Looking good vs sounding good
Users who don’t know exactly what they want may find searching via voice-only cumbersome to navigate or even deficient in information. As most restaurant menus can attest, pictures can be very helpful to the decision maker. Additionally, the inability to scan large amounts of information via voice may frustrate the indecisive user.
Usability testing the discovery process
As previously mentioned, I focused on building out the sections of the script that focused on users finding recipes: Intro, Find Smoothie and Hear Ingredients.
I conducted 5 usability tests with subjects from the target demographic. The tests consisted of 4 scenario tasks in Wizard of Oz format in which I played the part of Alexa. The tests sought to assess the following:
Intuitiveness of navigation
Recipe discovery pathways
Fit of system personality
Analysis of the 5 usability tests revealed three prominent issues. I categorized them by severity (High, Medium & Low) and fix-ability (easy, moderate, difficult).
ISSUE ONE: Unclear prompt after being told the most popular smoothie:
HIGH PRIORITY - Easy fix
People really love [topSmoothie].
Would you like to hear more?
All 4 testers who encountered this issue made a decision but reported not being confident about what would happen as a result.
SHORT and CLEAR is best, but LONGER and CLEAR is better than SHORT and UNCLEAR.
SUGGESTED CHANGE: “Want to hear more about [topSmoothie] or would you like another recommendation?”
ISSUE TWO: Scope of FilterRec_Previous not expansive enough.
MED PRIORITY - Easy fix
Last time we made [smoothieName].
Would you like to make it again?
Several testers asked what would happen if the last smoothie made was not the one they were looking for. Expanding the prompt to include the previous three increases utility without adding much complexity.
SUGGESTED CHANGE: “The last [amount] smoothies you made were [smoothieName1], [smoothieName2] and [smoothieName3]. You can select one of these or I can tell you more of what we made before.”
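The suggested prompt can be filled from a user's history along these lines. This is a sketch under stated assumptions: the history is assumed to be a list ordered oldest to newest, the function names are hypothetical, and the sample smoothie names are placeholders.

```python
from typing import Optional


def join_spoken(items: list[str]) -> str:
    """Join names the way the prompt reads them aloud: 'A, B and C'."""
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


def previous_smoothies_prompt(history: list[str], amount: int = 3) -> Optional[str]:
    """Build the FilterRec_Previous prompt; None if nothing was made yet."""
    amount = max(1, amount)
    recent = history[-amount:][::-1]  # most recent first
    if not recent:
        return None  # caller should route to a different pathway
    if len(recent) == 1:
        # Fall back to the original single-recipe wording.
        return f"Last time we made {recent[0]}. Would you like to make it again?"
    return (
        f"The last {len(recent)} smoothies you made were {join_spoken(recent)}. "
        "You can select one of these or I can tell you more of what we made before."
    )


if __name__ == "__main__":
    # Placeholder names, oldest to newest.
    print(previous_smoothies_prompt(["Berry Blast", "Green Start", "Mango Sun", "PB Power"]))
```

Using the actual count for [amount] keeps the prompt truthful for users who have made fewer than three smoothies.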
ISSUE THREE: How/when to handle substitution requests.
MED PRIORITY - Moderate fix
The main ingredients of [smoothieName] are [main1], [main2] and [main3]. Would you like to hear the full ingredient list?
When hearing summaries or lists of ingredients, several testers wondered if they could make substitutions.
This requires accommodating responses such as "I can't have _______" or "I don't have any _______."
SUGGESTED CHANGE: Adding a response stating that any ingredient can be substituted once a recipe has been selected. The resulting substitution decision tree would occur during the Make Smoothie intent instead of during Find Smoothie.
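Catching the two response shapes testers used might look like the sketch below. It is not the skill's actual logic: the reason labels are hypothetical, and the deferral wording is placeholder copy that would need its own round of testing.

```python
import re
from typing import Optional

# Hypothetical labels for why the user wants a substitution.
SUBSTITUTION_PATTERNS = [
    ("dietary", re.compile(r"\bi can'?t have (?:any )?(?P<ingredient>[\w\- ]+)")),
    ("unavailable", re.compile(r"\bi don'?t have (?:any )?(?P<ingredient>[\w\- ]+)")),
]

# Placeholder wording for the response that defers substitution to Make Smoothie.
DEFERRED_RESPONSE = (
    "No problem. Once you've picked a recipe, you can substitute any ingredient."
)


def detect_substitution(utterance: str) -> Optional[tuple[str, str]]:
    """Return (reason, ingredient) if the user signals a substitution need."""
    text = utterance.lower().replace("\u2019", "'").strip().rstrip(".!?")
    for reason, pattern in SUBSTITUTION_PATTERNS:
        found = pattern.search(text)
        if found:
            return reason, found.group("ingredient").strip()
    return None


if __name__ == "__main__":
    if detect_substitution("I don't have any spinach.") is not None:
        print(DEFERRED_RESPONSE)
```

Recording the reason alongside the ingredient would let the later Make Smoothie decision tree treat an allergy differently from an empty fridge.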
One flaw of the testing I conducted was that participants had little investment in what they were trying to find. The scenarios asked them to find smoothies with various ingredients or to ask for recommendations, but the outcome carried little weight for them. Future testing should be conducted in actual kitchens, where environmental factors and constraints such as available ingredients and actual taste preferences are factored in.
This next phase of testing would require developing or licensing actual recipes. This process would provide another opportunity to focus or expand the target group due to the myriad of goals and values associated with food.
BEYOND THE SKILL
This challenge led me to think about the recipe process in general. As previously noted, voice shines during the actual making of the recipe but has more limitations when it comes to the discovery process. I began to wonder if there was a way to connect the strength of GUI recipe searching with VUI implementation.
Another way to put this is...
When searching for a recipe online, why can I send it practically anywhere in the world—except my kitchen??
Proposal: A bridge between browser & voice assistant.
Capture recipes and save to Alexa
A natural extension of this smoothie skill would be a browser extension that adapts web-based recipes to the voice skill. When this technology becomes available, Smoothie Bomb users could capture recipes from their favorite sites and make them without having to fumble with messy printouts or touchscreens.
Due to the vast amount of recipe resources online (and the diverse needs and goals they cater to), I think it would be advantageous for this skill to be a medium for delivering the user's desired content rather than a self-contained source of recipes.
Capturing recipes with browser extension
Ensuring the delivery of value
Optimizing the transition from written word to spoken directions would certainly be a challenge. The issue wouldn't so much be inserting the captured steps into the proper place within the script as making sure that what is inserted translates into a positive voice-only experience. There seems to be significant value to offer here, but testing would be required to determine whether that value is undermined by incompatible or indigestible content.
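One possible first pass at spotting indigestible content is to break each captured step into sentence-sized chunks and flag the ones that run long. The sketch below is only a heuristic under stated assumptions: the function names are hypothetical, and the 20-word limit is a guess that user testing would need to confirm or replace.

```python
import re

MAX_WORDS_PER_CHUNK = 20  # assumed threshold, to be tuned through testing


def to_spoken_chunks(step: str) -> list[str]:
    """Split a written recipe step into sentence-sized chunks for speech."""
    sentences = re.split(r"(?<=[.!?])\s+", step.strip())
    return [sentence for sentence in sentences if sentence]


def flag_long_chunks(chunks: list[str], max_words: int = MAX_WORDS_PER_CHUNK) -> list[str]:
    """Return the chunks that are probably too long to follow by ear."""
    return [chunk for chunk in chunks if len(chunk.split()) > max_words]


if __name__ == "__main__":
    captured = "Add the banana and milk to the blender. Blend until smooth."
    chunks = to_spoken_chunks(captured)
    print(chunks)
    print(flag_long_chunks(chunks))
```

Flagged chunks could be surfaced to the user at capture time in the browser, where editing text is easier than it would be by voice.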
User requesting previously captured info