This year, TurboTax and H&R Block added artificial intelligence to the tax-prep software used by millions of us. Now while you’re doing your taxes online, there are AI chatbots on the right side of the screen to answer your burning questions.
But watch out: Rely on either AI for even mildly challenging tax questions, and you could end up confused. Or maybe even audited.
Should you trust that AI?
Here’s one example: Where should your child file taxes if she goes to college out of state? When I asked, TurboTax’s “Intuit Assist” bot offered irrelevant advice about tax credits and extensions. H&R Block’s “AI Tax Assist” bot left me with the wrong impression that she has to file in both places. (The correct answer: She files in the other state only if she has earned income there.)
Question after question, I got the same kinds of random, misleading or inaccurate AI answers.
What’s going on? We’ve all heard about the incredible possibilities of generative AI. But now we have to wade through a parade of terrible AI products, as companies stuff still-experimental AI into everyday things. For consumers, it’s on us to figure out how to size up each new AI we encounter. (Come across an AI that needs some investigation? Send me an email.)
The good news is that you can completely ignore the chatbots and still do your taxes. My experience shows we should be especially wary of generative AI when there are real-life consequences to it being wrong. And we can’t necessarily trust companies experimenting with AI to make the right decisions to protect our interests.
I tested the AI from TurboTax and H&R Block with the assistance of two real, live tax pros I recruited from EP Wealth Advisors, an independent wealth management firm. TurboTax’s self-help AI, which you access by clicking on the question mark in a circle in the top right corner of the screen, flubbed more than half of the 16 test questions I asked. Most often, it gave wildly irrelevant responses. After I shared my results with TurboTax maker Intuit, the company made changes to how the bot picks its answers. But its new version of Intuit Assist was still unhelpful on a quarter of the questions.
H&R Block’s AI gave unhelpful answers to more than 30 percent of the questions. It did well on questions about 529 plans and mortgage deductions, but confidently recommended an incorrect filing status and erroneously described IRS guidance on cryptocurrency.
“I feel that my job as a tax professional is very secure,” said Beverly Goodman, a tax manager at EP Wealth who helped me analyze the AI advice.
Both companies include text underneath their chatbots describing them as works in progress. “Intuit Assist is still developing and will improve with your help,” says TurboTax. “AI Tax Assist is a digital helper that’s still learning, so please review all responses,” says H&R Block.
This much is clear: When a product’s fine print says “don’t trust us,” you shouldn’t.
Both companies say risks to taxpayers are limited because they’re not using the AI to actually calculate taxes — just to answer questions and help people understand their returns. They both took issue with my tests, saying their AI answers shouldn’t be judged apart from their broader tax-prep offerings, which include more traditional software to check returns. Both companies also offer a path to ask your questions of human agents, though in the case of TurboTax, that usually comes with an additional fee.
But just imagine someone relied on the bad advice from the chatbot to make critical decisions about what to report, or even what filing status to use. Being audited is just terrifying; so, to a lesser extent, is wasting even more time filing your taxes. Wrong and unhelpful answers are wrong and unhelpful. Period.
My results are anecdotal. Both companies declined to disclose their own data about their chatbots’ accuracy and relevancy. But if a journalist and a few tax experts can so easily spot holes in their AI, it makes me wonder why these companies hadn’t done the same. Or worse: Maybe they just didn’t care.
How tax bots went off the rails
Intuit and H&R Block both told me the goal of integrating AI into their products was to help users with questions that otherwise lead them to turn to Google for potentially unreliable answers. That could save you time, or the expense of speaking with a human expert (which, I suspect, these companies are motivated to try to reduce).
But in the performance of these chatbots, we can see limitations in the current state of the art for AI that should have been red flags for products where accuracy is paramount.
[Graphic: How the AI chatbots answered our tax questions]
The companies are using generative AI in different ways. Intuit integrated it into TurboTax for several purposes: to provide self-help for user questions, to provide extra feedback during the filing process, to provide customized explanations of finished returns and to translate its service into Spanish. My tests focused on that first part — the Q&A — because it sounded the most useful.
The responses you get back are a mix of written-out answers and links to pages written by TurboTax staff and its self-help community of experts. Intuit said the bot is a hybrid of the self-help system it’s had for years and newer generative AI technology designed to help people with less-common questions.
The first time I tested, the AI frequently directed me to results written by the TurboTax community. But it did a very bad job finding relevant answers, sometimes linking to pages that left the wrong impression about what the answer should be. It reminded me of the unhelpful old Clippy from Microsoft Office.
Why was the AI so off-point? “We had our legacy technology positioned to be the first place Intuit Assist pulled answers from,” said Intuit spokeswoman Karen Nolan. “We have now updated that so Intuit Assist’s self-help answers are pulled first from a more advanced, multifaceted search capability.”
After Intuit updated its software, my second round of tests pulled up many more correct, straightforward answers, like to questions about claiming tax credits for education and a new air conditioner. But the chatbot still responded with answers and links to irrelevant information a quarter of the time.
H&R Block’s AI is a more prominent part of its website, and more of a typical chatbot in the vein of ChatGPT. H&R Block told me it uses underlying tech from Microsoft and specifically trained it to get answers from its in-house Tax Institute.
Its responses to my questions were more on-topic — but it was also more likely to make up answers that were just wrong. For example, at the recommendation of a tax pro, I asked a specific question about cryptocurrency: Do you have to report so-called wash transactions, which net out to zero? H&R Block’s bot said the IRS hadn’t weighed in on the question. But that’s not true — it has, and wash sale rules don’t apply to crypto.
The company says it curated AI Tax Assist’s answers using the most common questions it received from clients in the prior year — so “niche” questions like one about crypto sale rules “may not be fully addressed.”
H&R Block also suggested my questions lacked the “specificity and clarity” that AI needs to be effective.
Just imagine if you got that kind of excuse from a human tax preparer.
Users are not guinea pigs
Clearly, both bots needed more work. Then why put them into a product that’s being used by real people stressed out about doing real taxes?
The companies appear to have a different bar than I do for what’s responsible. They think it’s okay to give some people bad answers so long as others get good ones and the system continues to improve. In other words, we’re their guinea pigs.
The Q&A feature I tested is still in “very early stages” and “a small sliver of the broader utilization of AI in TurboTax,” said Intuit’s Nolan. “In only our first year, we continue to innovate on Intuit Assist and how to better serve our customers,” she said.
My view: If AI is in your product, it needs to work.
H&R Block spokeswoman Teri Daley wrote in an email: “Is it perfect, no. Will it ever be, probably not. Mainly because we have a very complicated taxation system with ambiguous wording.”
She continued: “We believe we are being very prudent by putting the language directly above the prompt area to remind individuals to confirm answers. We feel strongly that the other capabilities built into the DIY software, such as the guided interview, and allowing individuals to connect with a Tax Pro free are what ultimately ensure a correct return.”
Including a disclaimer isn’t being prudent — it’s shifting responsibility onto users who probably aren’t in a position to evaluate the answers. Being responsible would mean testing and improving the product behind the scenes until it can give the right answers and say “I don’t know” when it can’t.
H&R Block also touted the fact that it offers access to human agents at no additional cost. That’s a good thing, but on complex tax topics, you’d often need to be a tax expert already to know that one of the AI’s answers was suspect.
Shouldn’t a business marketing AI as a source of information be liable for harms that come from it? Unfortunately, without stronger laws, the companies’ fine-print disclaimers might be enough to protect both Intuit and H&R Block from legal responsibility for giving you bad advice.
At least they’re doing the right thing on one front: Both companies tell me they guarantee the accuracy of your final tax return, and will provide assistance if you get audited because you listened to their AI’s bad advice. Let’s hope that comes through a human, and not a chatbot.