Michael R. Bock

@michaelrbock.com.bsky.social

co-founder of Column Tax // michaelrbock.com

35 Followers  |  81 Following  |  123 Posts  |  Joined: 22.09.2024  |  2.3595

Latest posts by michaelrbock.com on Bluesky

Post image

imagine falling for the most obvious spy of all time on bumble ???

(a friend sent me this screenshot, I'm married 😅)

03.11.2025 04:40 - 👍 1    🔁 0    💬 0    📌 0
Preview
Column Tax’s “secret” master plan to automate tax filing Column Tax is in a unique position at a unique moment in technology history.

The full plan:

www.columntax.com/blog/our-se...

29.10.2025 13:48 - 👍 1    🔁 0    💬 0    📌 0

We’re so confident that we’re publishing an internal roadmap document: our “secret” master plan to automate tax filing (just between you & me).

29.10.2025 13:48 - 👍 1    🔁 0    💬 1    📌 0

And now the combination of the latest AI progress and our expert team & large proprietary eval datasets means we’re the group that can finally fully automate tax filing and save people time & money.

29.10.2025 13:48 - 👍 1    🔁 0    💬 1    📌 0
Post image

How will we know that AI has really “made it”?

The task that most exemplifies our ability to automate knowledge work is “doing your taxes”.

At Column Tax we’re now within line of sight to fully automating taxes. We started the company at the perfect moment, with LLMs just on the horizon.

29.10.2025 13:48 - 👍 1    🔁 0    💬 1    📌 0

No matter how many times we do it...

I always get nervous before a big announcement (coming tomorrow!)

28.10.2025 16:25 - 👍 1    🔁 0    💬 0    📌 0
Preview
Hypothesis Sheets: how to navigate and exit the idea maze with a (good) startup idea In 2020 when we were at the beginning of our startup journey I had a conversation with Erik Goldman where he shared this process, which we used to start Column Tax.

The blog post in question: michaelrbock.com/hypothesis

23.10.2025 15:35 - 👍 1    🔁 0    💬 0    📌 0
Post image

Positive review of my most popular blog post: "Hypothesis Sheets - how to navigate and exit the idea maze with a (good) startup idea".

Glad to hear the founder whisper networks are still sharing this knowledge around.

23.10.2025 15:35 - 👍 1    🔁 0    💬 1    📌 0
column-tax/tax-calc-bench Code & data for TaxCalcBench. Contribute to column-tax/tax-calc-bench development by creating an account on GitHub.

4/ next up?

adding tool use (code execution & web search) to see how that helps models calculate tax returns

also testing Claude Opus 4.1 and GPT-5 mini & nano

follow here: github.com/column-tax/...

18.09.2025 17:39 - 👍 0    🔁 0    💬 0    📌 0
Post image

3/ GPT-5 is impressive in many ways

especially because its knowledge cutoff is still September 2024

but it's not the leader in tax calculation today

(even with maximal test time compute)

18.09.2025 17:39 - 👍 0    🔁 0    💬 1    📌 0

2/ back in July, we published the first-ever eval for US personal income tax calculations

x.com/michaelrboc...

18.09.2025 17:38 - 👍 0    🔁 0    💬 1    📌 0
Post image

1/ GPT-5 is worse than Gemini 2.5 Pro at filing your taxes (but it's really close and they both can't do it yet)

we proved it via our tax calculation benchmark:

18.09.2025 17:38 - 👍 1    🔁 0    💬 1    📌 0
Post image Post image

I got married last month.🤵‍♂️👰‍♀️

Here's what it taught me about B2B2C tax software:

Just kidding :) but I do really recommend getting married to the love of your life with all your friends & family around!

17.09.2025 14:00 - 👍 3    🔁 0    💬 0    📌 0
Post image

no one had even heard of git worktrees before claude code

13.08.2025 20:49 - 👍 1    🔁 0    💬 0    📌 0
Post image

amazing ChatGPT Agent Mode use case: find & validate coupon codes without having to test them yourself

03.08.2025 19:08 - 👍 2    🔁 0    💬 0    📌 0
Preview
TaxCalcBench: Can AI file your taxes? AI can’t do your taxes on its own (yet).

10/ Read more about the work, research, and results here:

www.columntax.com/blog/taxcal...

23.07.2025 15:18 - 👍 1    🔁 1    💬 0    📌 0
Preview
GitHub - column-tax/tax-calc-bench: Code & data for TaxCalcBench Code & data for TaxCalcBench. Contribute to column-tax/tax-calc-bench development by creating an account on GitHub.

9/ This work wouldn’t have been possible without the hard work of our Tax Analyst team over the past 4 years & the success of our commercial product: you can’t buy this dataset on Scale or Surge.

View the dataset and testing harness here:

github.com/column-tax/...

23.07.2025 15:18 - 👍 0    🔁 0    💬 1    📌 0
Post image

8/ Models are also inconsistent:

using pass^k (a measure of reliability of a model across multiple runs on the same task), performance degrades with additional runs meaning models mess up in new & surprising ways when calculating tax returns.

23.07.2025 15:18 - 👍 0    🔁 0    💬 1    📌 0
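
A rough sketch of how pass^k, as mentioned in post 8/, could be estimated. This assumes the commonly used estimator C(c, k) / C(n, k) for a task with c correct results out of n runs; the posts do not say which formula the TaxCalcBench harness uses, and the function names and run counts below are hypothetical.

```python
from math import comb


def pass_hat_k(num_runs: int, num_correct: int, k: int) -> float:
    """Estimate the chance that k runs drawn from num_runs are ALL correct.

    Assumed estimator: C(num_correct, k) / C(num_runs, k).
    """
    if not 1 <= k <= num_runs:
        raise ValueError("k must be between 1 and num_runs")
    if not 0 <= num_correct <= num_runs:
        raise ValueError("num_correct must be between 0 and num_runs")
    # math.comb returns 0 when k > num_correct, so the estimate drops to 0.
    return comb(num_correct, k) / comb(num_runs, k)


def benchmark_pass_hat_k(correct_counts: list[int], num_runs: int, k: int) -> float:
    """Average the per-task estimate over every task in the benchmark."""
    if not correct_counts:
        raise ValueError("correct_counts must not be empty")
    return sum(pass_hat_k(num_runs, c, k) for c in correct_counts) / len(correct_counts)


# Hypothetical example: three tasks, four runs each, with 4, 2 and 0 correct runs.
counts = [4, 2, 0]
scores = {k: benchmark_pass_hat_k(counts, num_runs=4, k=k) for k in (1, 2, 4)}
```

With these made-up counts the score falls as k grows, because only the task that is correct on every run still counts at k = 4. That is the sense in which a higher k measures reliability rather than best-case accuracy.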
Post image

7/ For some models, performance improves with increased inference-time compute (thinking budget tokens)

but not for the best model (Gemini 2.5 Pro), suggesting alternative techniques/scaffolding/orchestration is required to get AI to do this tax calculation task.

23.07.2025 15:18 - 👍 0    🔁 0    💬 1    📌 0
Post image Post image

6/ Models consistently:

1. Misuse tax tables
2. Make calculation errors

For example, models will hallucinate line numbers on Forms or use incorrect eligibility limits.

23.07.2025 15:18 - 👍 0    🔁 0    💬 1    📌 0
Post image

5/ Takeaway: models can’t calculate tax returns reliably today.

Even on this simplified data set and allowing the models to output to a simplified format, the best model only calculates 32.35% of returns correctly.

23.07.2025 15:17 - 👍 0    🔁 0    💬 1    📌 0
Post image Post image

4/ TaxCalcBench is a dataset of 51 pairs of user inputs and the expected tax return output + a testing harness.

We made the task easy for the models. We provide:
- all of the data (e.g. W-2s) needed to file a return
- the expected output in IRS XML format

23.07.2025 15:17 - 👍 0    🔁 0    💬 1    📌 0
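
To make the "input/expected-output pairs plus a testing harness" idea from post 4/ concrete, here is a minimal sketch of how such a harness might score a model. It is not the actual TaxCalcBench code: it assumes each return has already been parsed from XML into a flat mapping of field name to whole-dollar amount, that a return counts as correct only when every expected field matches exactly, and all names and amounts are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass
class BenchCase:
    name: str                 # hypothetical test-case identifier
    expected: dict[str, int]  # hypothetical: field name -> whole-dollar amount


def return_is_correct(expected: dict[str, int], actual: dict[str, int]) -> bool:
    """A return passes only if every expected field is present and equal."""
    return all(actual.get(field) == amount for field, amount in expected.items())


def fraction_correct(cases: list[BenchCase], outputs: dict[str, dict[str, int]]) -> float:
    """Share of test cases whose model output matches the expected return."""
    if not cases:
        raise ValueError("no test cases")
    # A missing model output is scored as an incorrect return.
    passed = sum(return_is_correct(c.expected, outputs.get(c.name, {})) for c in cases)
    return passed / len(cases)


# Placeholder amounts for illustration only; these are not real tax results.
cases = [
    BenchCase("case-a", {"total_income": 50000, "total_tax": 4000}),
    BenchCase("case-b", {"total_income": 80000, "total_tax": 9000}),
]
model_outputs = {
    "case-a": {"total_income": 50000, "total_tax": 4000},
    "case-b": {"total_income": 80000, "total_tax": 9999},  # one wrong field
}
score = fraction_correct(cases, model_outputs)
```

Scoring whole returns rather than individual fields is a deliberate choice in this sketch: a single wrong line makes the return count as wrong.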
Post image

3/ Tax calculation means taking a user’s "inputs" (W-2s, 1099s) and outputting the Form 1040 in the IRS XML format.

75k pages of English text define the transformations required to do this.

Companies like @ColumnTax use deterministic tax engines to do these calculations.

23.07.2025 15:17 - 👍 0    🔁 0    💬 1    📌 0
Preview
TaxCalcBench: Evaluating Frontier Models on the Tax Calculation Task Can AI file your taxes? Not yet. Calculating US personal income taxes is a task that requires building an understanding of vast amounts of English text and using that knowledge to carefully...

2/ Today, we’re releasing TaxCalcBench: a first-ever benchmark dataset & eval framework for testing AI’s ability to calculate US personal income tax returns.

Tax is a secretive industry, so we’re proud to release a research paper sharing our findings:

arxiv.org/abs/2507.16126

23.07.2025 15:17 - 👍 0    🔁 0    💬 1    📌 0
Post image

1/ Can AI file your taxes? Not yet.

We tested the latest frontier models and the results were full of catastrophic errors.

Letting AI do your taxes would mean IRS rejections, audits, and penalties:

23.07.2025 15:17 - 👍 4    🔁 2    💬 1    📌 0
Post image

this is the wildest cold twitter dm opener i've ever received

21.07.2025 23:17 - 👍 1    🔁 0    💬 0    📌 0
Post image

this is what founder <> founder private text messages look like (and what makes the job so fun)

21.07.2025 14:05 - 👍 2    🔁 0    💬 0    📌 0
Post image 19.07.2025 18:10 - 👍 1    🔁 0    💬 0    📌 0
Post image

why is everyone complaining about a GPU shortage if it turns out you can just buy them on amazon ;)

21.06.2025 15:48 - 👍 2    🔁 0    💬 0    📌 0

11/ Thanks to the folks who worked on Direct File. We have a lot of gratitude.

05.06.2025 16:02 - 👍 0    🔁 1    💬 0    📌 0
