Interesting to see the discourse around the GPT 5.2 Excel demo last night and this morning. It started with promotional hype, and ended with people having a laugh at the egregious calculation mistakes in the (now deleted) Uber DCF example that was posted. It’s a funny set of errors for sure, but the reality is that basic DCF mechanics, terminal value calculations, etc. have already been solved by other AI Excel agents - Shortcut for example already does these DCF calculations with correct formulas - and I expect this will be corrected by OpenAI as well.
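For readers less familiar with the mechanics being botched in that demo, here is a minimal sketch of the standard DCF math (explicit forecast period plus a Gordon-growth terminal value). All numbers are illustrative assumptions, not figures from the Uber example or any vendor's model.

```python
# Minimal DCF sketch: PV of forecast free cash flows plus a
# Gordon-growth terminal value. Inputs are made-up illustrations.

def dcf_value(fcfs, wacc, terminal_growth):
    """Value = PV of explicit-period FCFs + PV of terminal value.

    fcfs: forecast free cash flows for years 1..N
    wacc: discount rate (decimal)
    terminal_growth: perpetuity growth rate (must be < wacc)
    """
    if terminal_growth >= wacc:
        raise ValueError("terminal growth must be below the discount rate")

    # Present value of the explicit forecast period
    pv_fcfs = sum(fcf / (1 + wacc) ** t for t, fcf in enumerate(fcfs, start=1))

    # Gordon-growth terminal value at end of year N, discounted back to today
    tv = fcfs[-1] * (1 + terminal_growth) / (wacc - terminal_growth)
    pv_tv = tv / (1 + wacc) ** len(fcfs)

    return pv_fcfs + pv_tv

# Hypothetical 5-year forecast, 9% WACC, 2.5% terminal growth
value = dcf_value([100, 110, 120, 130, 140], wacc=0.09, terminal_growth=0.025)
print(round(value, 1))
```

Nothing here is exotic, which is the point: these formulas are mechanical and already handled correctly by several agents, so errors at this layer are embarrassing but fixable.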
Based on my reviews of both Shortcut and Ramp Sheets over the past week, I think the more nuanced and difficult hurdles to institutional adoption that Excel AI providers face relate to 1.) Historical data retrieval accuracy, 2.) Incorporating appropriate levels of granularity, and 3.) The ability to take existing models/templates and accurately replicate the formatting and general model structure/logic in a new model.
After some great discussions with several new subscribers, I wanted to share some constructive thoughts on the path forward on each of these issues and what I will be watching for in early 2026.
1.) The historical data retrieval problem and potential near-term solutions:
I’ve already pointed out data-pulling accuracy issues in the reviews over the past week, so I won’t harp on them here. Let’s discuss potential solutions. Pulling large volumes of historical financials and reported KPIs with a high degree of accuracy is not a trivial problem to solve, so it is to be expected that newly launched Excel AI providers experience some difficulties there. Daloopa, for example, has spent 5+ years building algorithms/systems focused narrowly on this exact issue and, at least for public companies where reporting is more standardized, seems to have figured it out, reporting 99% accuracy since 2021 (link). Daloopa has announced MCP integrations with Claude (link) and OpenAI (link) that give enterprise customers who subscribe to Daloopa direct data connectivity. So, assuming data-pull accuracy via the MCP continues to improve (the latest report on Daloopa + Opus via MCP, in Figure 1 below, shows 94% accuracy), there may already be a near-term solution to the data retrieval accuracy problem, at least for public equities. I hope we see providers like Shortcut offer these types of integrations to enterprise and retail customers as well. For private companies, which have less standardized reporting and are not covered by providers like Daloopa, solving this issue is a much heavier lift and will have to wait for improvements in the tech stack.

Figure 1: Daloopa MCP Number Retrieval Accuracy (Latest Available)
2.) The “granularity requirement” in institutional use cases and potential near-term solutions:
If doing a deep dive on a name, buyside-quality models require thoughtful granularity that reflects key business drivers. When asking an AI agent for a granular revenue build, my experience using Shortcut was: 1.) ask for a granular model -> 2.) get a “halfway” granular revenue build with some segment/KPI drivers -> 3.) dig around myself for other available KPIs and drivers, and guide the model to incorporate them. (See link to this process here.)
If asked for a granular revenue build, Excel AI agents would be far more useful to institutional users doing deep dives if they defaulted to first pulling all available KPIs and segment financials from uploaded materials (presentations, press releases, financials) into the model, and then confirming/suggesting which to use in the revenue build.
3.) Using existing models and templates: current Excel AI prompting pain points and where I (and others) hope prompting is going.
Can we get to a point where the main prompt for building a new financial model = uploading a similar existing financial model that already has the desired formatting and general structure?
In the experiment I wrote about in my first review of Ramp Sheets this past week (here), I asked Ramp to solve an analyst/associate modeling test and, in another tab, provided Ramp with a full model solution for a slightly different modeling test. As of today, getting to the right output would require a lot of additional guidance via follow-up prompting (Figure 2). Having to explain nuances of formatting and model structure in natural language is ironically unnatural and a productivity barrier. Getting to a point where the best prompt consists simply of uploading a similar existing model would be a huge development.

Figure 2: Ramp experiment highlighted challenges of applying existing model formatting/structure to similar tasks.
If steps forward on all of the above are taken in 2026, analysts may be able to shift more of their focus to creative research that sharpens their model driver assumptions (which will remain the hardest part to get right!).
Thanks for reading. If you have any suggestions for what I should look at next, please reach out!