At least, mine are. I'm not entirely sure how to feel about the fact that "HEY GOO-GOH, PAY TWA TWUH ON NEH-FLIH" (translation: Hey Google, play Trash Truck on Netflix) is among the first few phrases in my child's vocabulary.
Our Google smart speaker is my children's most tangible exposure to AI. It's the piece of technology in my home that I have to be most mindful of. How I interact with it and talk about it matters because it can seem like there's a person behind it: a human speaking through the speaker, when it's really a computer.
A Google smart speaker gearing up to play "Baby Shark" for the millionth time. Image: WIRED
Moving forward, our children will likely interact with computers by talking to them, not necessarily typing or tapping. I wanted to understand how today's smart speakers process voice and language, and what we should be mindful of as parents.
I couldn't think of a better person to ask than my friend Arwa Mookhtiar. With her background in linguistics and computer science, plus a decade on the Google Assistant team, she offers amazing insight into talking to AI.
Today, you'll discover:
How Arwa's multilingual childhood and passion for languages led to a decade of shaping Google Assistant
The inside scoop (expected and unexpected) on working at Google
How smart speakers process voice commands and why they struggle with children's voices
Practical features and safety guidance for home assistants that most parents don't know about
Let's grow!
Insights from fellow technologists and parents on AI and their areas of expertise
This conversation has been edited for length and clarity.
Arwa at her UCLA graduation
Ruqaiya Akbari: How would you fill in this blank to describe your relationship with AI? __________.ai
Arwa Mookhtiar: I think I would say Cautiously Optimistic.
I've been "inside" AI through my work, but not on teams building models. My team uses those models to improve the product. I also want to clarify that everything I say in this interview is my own opinion, not that of the company.
When I started in voice assistants, what they called "AI" was mostly heuristics like "if X, do Y." But over the years, real AI started to come in, incorporating neural network models for speech-to-text and intent-based understanding. Now, generative AI can improve the flow 100x, and our job is setting guardrails to prevent hallucinations. Neural networks have been around since the 1940s, but only in the last few years did researchers start making models quite so large. ChatGPT and Gemini are so powerful because of their sheer size.
Going back to being cautiously optimistic, I think of Alfred Nobel, who invented dynamite for mining but saw it used in war. AI has amazing uses, but also risks like deepfakes, which could harm kids even if they've done nothing wrong.
Could you share your background and what you're passionate about?
Language is a big thing I'm passionate about. We grew up speaking Gujarati or Lisan-al-Dawat at home, but after starting preschool, my brother and I switched entirely to English. Our parents, having faced discrimination for their accents, let it happen so we'd fit in.
In middle school, I realized I'd lost something valuable. I could understand the language but wasn't comfortable speaking it. I started relearning it, though I still have an accent, and people sometimes laugh. Watching Bollywood movies helped me pick up Hindi, and later, I studied French for eight years. But when I arrived in France, I struggled to speak. I found out the hard way that learning a language in school is not the same as learning it in real life.
Now, living in Miami, I speak Spanish, though not fluently. The lady who cleans my house doesn't speak English, nor does the A/C repairman. I manage, though sometimes I have to look things up or just point. I always cringe at the stereotype of Americans saying, "This is America, speak English." But it's that person's broken English which allows you to communicate with them at all. That's a skill, not a flaw.
Was the intersection of language and computer science something you were always interested in, or did you find your way there by happenstance?
I found my way there. I started college unsure. Maybe English, then psychology. In my second quarter, I took a general education linguistics class and it was like, full stop, this is it. I had no idea there was a whole science behind language.
Around the same time, I met a student who planned to major in Linguistics & Computer Science to avoid the tougher engineering requirements. She convinced me to take an intro CS class with her that spring. She hated it and dropped it, but I stuck with it.
Later, I interned at Google to see if I could actually do this as a job since I'm not one of those people who codes for fun. I ended up really liking it. They offered me a full-time role, and now, 10 years later, I'm still here.
Arwa's mug shot!
For those outside the tech world, could you share one thing that might be unexpected about working at Google and one thing that's exactly what people might expect?
Something totally expected: Google feeds you really well. Free food, snacks, coffee, everything! I barely had to grocery shop in my twenties because work provided all my meals. Google believes people connect over food: if you eat lunch with coworkers, you'll end up discussing ideas and making the company better. They even gift you a one-hour massage on your "Googleversary" each year.
Something unexpected: the sheer amount of scrutiny and regulation. This type of company is "the man," and regulators are always finding reasons to say it's doing something wrong. You can't just come up with a cool idea, build it, and launch it. There are layers of process to ensure it follows privacy principles, fits guidelines, and meets standards. This is for good reason and there have definitely been privacy issues in the past, but it's not like the movies where it's all fun demos.
You've been on the Google Assistant team for about 10 years, even before it officially launched. Could you talk about your role?
I worked on reminders for over five years. One big project was merging Google Reminders with Google Tasks and making sure Assistant worked seamlessly with it. Now, if you say, "Hey Google, remind me to reply to that email," it appears in your inbox as a Google Task when you sit down at your computer.
Now, I work on overall Assistant quality. People have noticed a decline in reliability, so we're focused on fixing issues, simplifying complex features, and using large language models to enhance understanding. We're also improving internal tools, like using AI to categorize bugs and evaluate interaction quality.
Has the decline in perception correlated with the launch of applications like ChatGPT and people expecting more from an assistant product?
We think the decline started before ChatGPT launched. People have been complaining on the Google Home Reddit for many years, and we wanted to take it seriously. But you're totally right that expectations change over time. If a competitor launches an amazing feature, people naturally expect more from the industry. If we did nothing for a year while competitors kept launching updates, user perception of our product would still drop in comparison. So, instead, our job is to define what "quality" means for us, build metrics that tell that story, and focus on what truly improves the user experience.
My children now interact with our Google Home device a lot. How does Google process and respond to users' voices, especially children's?
When you speak to a device, it records the waveform and converts it to text. Then, it interprets intent: if you ask, "How tall is Barack Obama?" the system understands that you want to retrieve a fact about a person's height.
The system also manages dialogue: it checks whether there's enough information to execute the action, whether it needs to search for an answer, or whether it should ask a follow-up question. If you say, "Remind me on Saturday morning," we assume you mean 9 AM. But if you ask to set an alarm for the morning, we'll ask what time, since you likely have something specific in mind.
Finally, the request is executed, whether it's setting a reminder, fetching an answer, or controlling a smart home device. If you have a screen-enabled device, we generate a visual response, too.
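Ammi note: For the curious, here is a toy Python sketch of the steps Arwa describes after speech has been converted to text: working out the intent, deciding whether there is enough information, and then acting or asking a follow-up. It is only an illustration. The names `Intent`, `parse_intent`, and `next_step` are made up for this example, and the simple text rules stand in for the trained models a real assistant would use.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Intent:
    """What the user wants, plus any details ("slots") we managed to extract."""
    name: str
    slots: dict = field(default_factory=dict)


def parse_intent(text: str) -> Intent:
    """Toy rule-based stand-in for an intent-understanding model (hypothetical)."""
    t = text.lower().strip().rstrip("?.!")
    match = re.fullmatch(r"how tall is (.+)", t)
    if match:
        return Intent("get_fact", {"attribute": "height", "person": match.group(1)})
    if t.startswith("remind me"):
        return Intent("set_reminder", {"time": "morning"} if "morning" in t else {})
    if "alarm" in t:
        return Intent("set_alarm", {"time": "morning"} if "morning" in t else {})
    return Intent("unknown")


def next_step(intent: Intent) -> str:
    """Dialogue management: execute, search, or ask a follow-up question."""
    if intent.name == "get_fact":
        return f"search: {intent.slots['attribute']} of {intent.slots['person']}"
    if intent.name == "set_reminder":
        if intent.slots.get("time") == "morning":
            # A vague "morning" is good enough for a reminder: default to 9 AM.
            return "execute: reminder at 09:00"
        return "ask: When should I remind you?"
    if intent.name == "set_alarm":
        # Alarms need an exact time, so "morning" alone triggers a follow-up.
        return "ask: What time should I set the alarm for?"
    return "ask: Sorry, could you rephrase that?"


for phrase in ["How tall is Barack Obama?",
               "Remind me on Saturday morning",
               "Set an alarm for the morning"]:
    print(phrase, "->", next_step(parse_intent(phrase)))
```

The point of the sketch is the branching: the same word ("morning") leads to a default for a reminder but a follow-up question for an alarm.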
People often say they were talking about something and then saw an ad for it. Can you explain how devices handle listening?
That totally happens to me too, but that ad targeting is based on a ton of other things. On social media, for example, the things your friends click on are often shown to you too. Most smart speakers use a hotword system. They're always listening locally for the wake word ("Hey Google") but don't send anything to the server until they hear it. The device processes sound locally, and unless it detects the wake word, that data is deleted immediately.
There are some false activations, like how saying "Hey Boo-boo" used to trigger Google Assistant. But as technology improves, newer models are shifting toward on-device processing, meaning they can handle more requests without sending data to servers at all.
Ammi note: When a device processes audio locally, it means the sound is analyzed directly on your device without sending any data over the internet. In contrast, server processing sends your audio to remote computers for analysis.
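To make the wake-word idea concrete, here is a minimal Python sketch of the gating logic, under some loud assumptions: the function name `chunks_sent_to_server` is invented for this example, short text strings stand in for audio, and a simple substring check stands in for the on-device wake-word detector. It is not how any particular speaker is implemented.

```python
def chunks_sent_to_server(chunks, wake_word="hey google"):
    """Return only the chunks that would leave the device in this toy model.

    Each chunk is a string standing in for a short slice of audio
    (an assumption; real detectors work on sound, not text).
    """
    uploaded = []
    awake = False
    for chunk in chunks:
        if awake:
            # The wake word was just heard: this request goes to the server.
            uploaded.append(chunk)
            awake = False  # one request per activation in this toy model
        elif wake_word in chunk.lower():
            # Detected locally; the next chunk is treated as the request.
            awake = True
        # Anything else is dropped on the device and never uploaded.
    return uploaded


overheard = ["what's for dinner", "Hey Google", "play Baby Shark", "turn it down please"]
print(chunks_sent_to_server(overheard))
```

In this toy run, only the request that follows the wake word is uploaded; the dinner chatter before it and the comment after it stay on the device and are discarded.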
Regarding kids interacting with the technology: can it distinguish between adults and children? How does it handle those interactions differently?
Speech recognition works best on data it has been trained on, and when the training data is biased, the model performance can be biased too. Our speech models are trained on adult voices, and kids tend to have higher-pitched voices and different pronunciation, which makes those adult models worse at understanding kids.
Regulations like the Children's Online Privacy Protection Act (COPPA) and Age-Appropriate Design Code Act (AADC) aim to protect kids' data, but they also make it difficult to build kid-specific voice models. We can't store kids' voice data to improve accuracy because that would make it a child-targeted feature, subject to strict regulations. So when voice assistants work well for kids, it's somewhat by luck.
Are there guardrails and safety mechanisms parents can use with these devices?
There are a lot: YouTube autoplay is off by default for kids, screen time limits can be set, and SafeSearch is enabled automatically. Parents can also manage content through Family Link.
I actually feel somewhat relieved hearing that toddlers use home speakers, because these devices provide small bursts of technology as needed rather than the brain-rot way of being glued to a screen, as Gen Z and Gen Alpha kids are with TikTok. Home devices have a higher barrier to entry: you have to keep asking for each interaction rather than just zoning out in front of it. I'm curious if this generation of "Google babies" or "Alexa babies" might grow up with a healthier relationship to technology than the "iPad baby" generation.
How many Google Home devices do you have in your home?
One per room. I have a smart display in the kitchen for schedules and reminders, a speaker in the bathroom for music, and a smart alarm clock by my bed. My husband also sees reminders pop up on the screen, which is useful for shared tasks.
Do you have any special features or hacks people might not know about?
Google just launched improved Assistant answers using the Gemini model. You have to opt into experimental AI features in the Google Home app, but it lets you ask for explanations in different styles, like "explain it like I'm five" or in Shakespearean English.
Another great feature is "Family Bell," which was developed during COVID. You can set up daily chimes for things like reading time, snack breaks, or bedtime. It now exists as customizable routines, where you can dim the lights, play white noise, and set a sleep schedule automatically.
What should moms and people in general be mindful of in this AI age?
These are tools available to help you - just like anything else, it's what you make of it. AI can be incredibly helpful. One mom told me she used it to generate an age-appropriate summer learning plan for her kids - things like frying an egg or balancing a checkbook. You can ease some of the mental load by asking AI to come up with ideas for you. And don't feel bad about not using them if they aren't helpful for you.
In terms of privacy, there's a saying: if you're not paying for a service, you are the product. This is true for social media, where data is collected for ad targeting. I can definitely understand the worry with your voice data, but that's pretty protected and only saved if you donate it to help improve models. Google doesn't use Assistant transcripts for ads, but some data is stored to improve accuracy. You can always manage or delete your activity at myactivity.google.com.
AI also raises questions for education. What's an acceptable way to use it and what's not? It's an incredible technology, and it would be unfair to tell students they can't use it at all; I think it would also fail to prepare them for the real world. This creates a need to show them how to use it safely and without compromising their education, and that kind of thing starts at home. My friend works at a university writing center, and they put together a resource for students on ethical AI use.
Apple or Android?
iPhone for personal use, though I have a Pixel for work. Once you're in the Apple ecosystem with family members having iPhones and iPads, it's hard to leave.
Mac or PC?
Mac. At Google, most engineers use Macs because the file system is closer to Linux, which our servers use.
Favorite technology?
Google Tasks - I use it constantly because I forget things if I don't write them down. I'm always creating reminders for one-offs and things I need to do every few weeks or months.
Tech recommendation for Ammis and friends?
Something I recently heard about but haven't tried myself is the Skylight, a kitchen device that shows family calendars, meal plans, and schedules. We really wanted to build something similar for smart displays a few years ago, and this one seems really cool.
Many thanks to Arwa for sharing her invaluable insights and helping us grow.
Connect with Arwa on LinkedIn!
Your insights nourish our garden!
Which AI Voice Assistant do you use most often? Feel free to elaborate on your favorite way to use them in the comments!
Thanks for spending a few of your precious, precious minutes with us.
See y'all soon,
Ruqaiya
Ammi by day, Ammi by night