Most of the public focus to date has been on the large public benchmarks for things like coding. I think the problem is, though... >> Matt Fitzpatrick, CEO of Invisible Technologies: We're an infrastructure company. What that infrastructure allows us to do is what I would call hyper-personalized software at scale. >> It sounds like your position is we need thousands of new narrow benchmarks to capture maybe every labor category, every industry vertical. >> That is an interesting second part of this... >> We're going to see the largest disruption ever in 2026 from companies that don't make this change. >> There are many sectors where the structure of what the industry does is going to change. If you think about knowledge work, the production of large amounts of documentation, these technologies are very disruptive. I think the question is which parts of your business can really change with AI. >> What are you seeing most companies get wrong on their mission to implement AI? >> You've got two different
[00:01:01] challenges. One is... >> Now that's a moonshot, ladies and gentlemen. Hey everybody, welcome to Moonshots. In today's episode, we're going to be discussing why all companies need to become AI companies in 2026, how they do that, and what happens if they don't. We'll discuss whether big legacy companies can even make such a dramatic change and how they can best do it. We're going to go over some fun and meaningful AI use cases, so listen up; I think they're ones that are going to get you excited about what you can do. And we'll dive into some predictions from our guest for 2026. Joining the Moonshot Mates today is a friend of the pod, Matt Fitzpatrick, who for more than a decade was at McKinsey, rising to the position of global head of QuantumBlack Labs. I love that name, QuantumBlack; it's so cool. He led the firm's AI software development, R&D, and global AI products. A year ago, Matt joined as the CEO of
[00:02:01] Invisible Technologies, a company started by a brilliant friend of mine, Francis Pedraza. For those of you who don't know Invisible, the company is a modular AI software platform that provides AI training for most of the large language model providers out there and builds custom workflows and agents for enterprises. They anchor their work in creating clean data and human-in-the-loop delivery to ensure measurable business results. Matt, welcome. Good to have you here. >> Hey, Matt. >> Thank you for having me. >> We've got Dave, AWG, and Salim. Almost happy holidays, guys. It feels like we're on this pod every other day. I think we should just move into a big podcast house, and we will have fully documented the singularity. >> [laughter] >> Well, I'm really looking forward to hearing from Matt, because on Thursday we have to do our predictions for next year, and Matt is going to give us a ton of insight today. One of my predictions out of the gate is that enterprises are
[00:03:01] going to move stupidly slowly compared to AI capabilities. And Matt is the world-leading expert on the intersection between AI and enterprise, so I cannot wait for this. >> You can't cheat this way, Dave, and use Matt's predictions as yours. >> I can't? >> No. >> Well, everybody listening to this pod will know that. >> All right, well, take good notes, if nothing else. You guys ready to jump in? Matt, I'm going to kick it off with a broad question for you, and here it is. Within the past year, we've heard from just about every company and every CEO out there that they're going to pivot to become an AI company. Salim, in the last pod, you said something like we're going to see the largest disruption ever in 2026 from companies that don't make this change. And Alex, I think the term you used is they're going to be cooked if they don't. >> I said
[00:04:00] knowledge work is cooked. Not knowledge workers, not companies. Knowledge work as we currently know it. >> Aha, okay. So you don't think companies that don't make the transition to AI are going to be cooked? >> I think we're going to see many more companies over time, and many more smaller companies as well. >> Okay, well, we're going to dive into that. >> Look, in an earlier episode we pointed out that when you thought you had product-market fit and were scaling a SaaS company, you're toast, because everything needs to be rethought now given AI. So this now applies to big companies also. >> All right. So Matt, the question to kick this off: can every company truly become an AI company, and how? And then which companies and industries do you think need to disrupt themselves now, before they become basically irrelevant? A softball question to kick us all off here. [laughter] >> Peter, save the hardballs for me, Matt. >> [laughter] >> Always. I think your second question relates
[00:05:00] to your first question in some ways. I don't think, and I think all the data that has come out on this so far supports it, that all industries are going to be impacted equally by this. There are some sectors and areas where you're going to see materially different impacts: areas like media, legal services, business process outsourcing. There are many sectors where the structure of what the industry does is going to change; if you think about knowledge work, the production of large amounts of documentation, these technologies are very disruptive. Where I think the hype has been a bit overblown is in a lot of sectors like oil and gas or real estate. The function of what they do is going to stay pretty consistent, and I think most of the good analytics of how job dynamics will change over the next couple of years get at this. The decision on which apartment building or office building to buy is going to function pretty similarly to how it did five or six years ago. So I think the question is which parts of your business can really change with AI. It's not all of them, and some sectors will be affected more or less than others.
[00:06:00] And then the second part of your question, which is also an interesting one: can everyone actually become an AI company? There are not that many people who know how to build these sorts of models, or deploy them well. So one of the big challenges is: do you have the expertise in-house to do this? How do you think about adjusting the operating functions of your company to do it? Is it the same team you have in your IT function doing it now? And particularly, Peter, for the set of folks you and I have spoken with in the past, like small businesses: if you're a 50-person company, it's hard to deploy a lot of this stuff at scale if you don't even have a CTO in-house. So I think there's a mix of: is your industry going to fundamentally change, and then what are the actual core competencies your company has to implement it. >> Yeah, so do you end up bringing a chief AI officer into your company, are you going to build that capability, or are you basically renting it? I mean, part of the other
[00:07:01] thing that’s going on right now, we’ve talked about this on the pod a lot, is your competition isn’t really the large, you know, multinational. It’s the AI-native startup that came out of no place that’s reinvented themselves from the ground up as an AI-first company, right? Yeah. >> like down down and dirty, Matt. Like which happens first? I can get a mortgage by talking to an AI and get it done in under an hour or we’re walking on Mars with our own two feet? Like Like which which of those two things is going to happen in the real world first? Yeah, so the way I’ve I’ve heard the the question asked is do the do the startups get distribution before the big companies build the technology? And I do think that will be the tension in a lot of ways. And look, I think there’s a lot of big established companies that are going to figure out how to do this really well. Like I I think if you take a sector like legal services, I do think the big law firms will figure out how to use a lot of this over time. Um you know, I think there are sectors where I think banking is a really
[00:08:00] interesting one to look at right now. If you look at the age of the application footprints in banking, most of the tech that exists in banking is north of 20 years old. And so you do have a bunch of very fast-moving newer fintechs that are approaching it in different ways, companies like Revolut. I don't know how that plays out, but I do think that becomes the question in a lot of ways: which moves faster, the emerging entrants or the modernization of the incumbents. And Peter, to hit on the second part of what you were asking: do you buy or rent? I think that's something you've got to be really honest with yourself about as a company. The idea that everyone can buy, that everyone can hire people to do this, is challenging. The challenge of trying to adapt an existing IT function to do this is that there are gaps in many of the skill sets people hire for, even things like whether they know Python. And so the answer for most companies I've seen
[00:09:01] who don’t have the resources in-house through or being just directive about a push that is they they are finding ways to rent or buy this externally and to partner with folks that can allow them to do it. Every week my team and I study the top 10 technology meta trends that will transform industries over the decade ahead. I cover trends ranging from humanoid robotics, AGI, and quantum computing to transport, energy, longevity, and more. There’s no fluff, only the most important stuff that matters that impacts our lives, our companies, and our careers. If you want me to share these meta trends with you, I write a newsletter twice a week, sending it out as a short 2-minute read via email. And if you want to discover the most important meta trends 10 years before anyone else, this report’s for you. Readers include founders and CEOs from the world’s most disruptive companies and entrepreneurs building the world’s most disruptive tech. It’s not for you if you don’t want to be informed about what’s coming, why it matters, and how you can benefit from it. To subscribe for free, go to
[00:10:00] diamandis.com/metatrends to gain access to the trends 10 years before anyone else. All right, now back to this episode. >> I think legal and accounting are really cool case studies, and I know you know more about this from your McKinsey time and QuantumBlack; you're like the guy understanding and parsing all of this. But they're really cool because they can be replaced by a startup, like Harvey. >> Dave, two things I'd say about that. I think one challenge of implementing gen AI in the enterprise setting is having a statistically validatable baseline to compare against. As an example, something like mortgage underwriting has made huge progress, in a very positive way actually: the percentage of mortgage underwriting that's now done by a very guardrailed and very effective set of algorithms developed by the banks is pretty high, because they can backtest and say this is a correct credit decision with no redlining or anything else. And the reason contact centers have been one of the cases where we've seen a lot of
[00:11:01] adoption of this is that you do have a clear baseline, right? Time per call, CSAT, cost per call: you have a set of metrics you can compare against. Something like "generate an investment memo," which is different in format at every firm, could be 10 pages versus 40 pages, and the content is different; it's been harder for folks to build baselines there. I do think that's why legal services is an interesting one: there are certain areas of legal where those baselines are clear, where you can look at which documents are really good, like an ISA agreement. But I think you're going to see this in a lot of different segments: the high end of that market still persists in a really differentiated way. If you're doing a large M&A transaction, you're still going to want a really good lawyer's advice. Where it changes, I think, is in the more basic "produce an NDA" type of work. And I think that's going to be one of the shifts again: really good human guidance is going to persist
[00:12:00] forever. It's the basic commodity information, which right now a lot of people are paid probably excessive amounts of money to produce. >> Yeah, well, the NDA is a pretty extreme case. But I'll tell you about the venture fundings that we do; we do tons of these every year. The term sheets always say the company we're investing in will bear the cost of the legal, capped at $50,000. And then the documents are freaking identical every single time. There are like eight knobs. >> [laughter] >> You could store all combinations on the smallest thumb drive in the world. And I'm like, how is this a $50,000 job? And it always runs up to $49,999.99. It's like, wow, what a miracle. >> [laughter] >> So, I don't know. That to me feels like it would be on the mid-to-hard end of the scale, yet it's still so doable. An NDA is a no-brainer. Mortgages are no-brainers.
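A minimal sketch of the baseline comparison Matt describes for contact centers, with invented per-call records and hypothetical field names; time per call, CSAT, and cost per call are the metrics he names:

```python
# Compare an AI-agent cohort against the existing human baseline on the
# operational metrics Matt lists. All data and field names are invented.
from statistics import mean

def cohort_metrics(calls):
    """Aggregate per-call records into the three baseline metrics."""
    return {
        "avg_handle_time_sec": mean(c["handle_time_sec"] for c in calls),
        "avg_csat": mean(c["csat"] for c in calls),  # 1-5 survey score
        "avg_cost_usd": mean(c["cost_usd"] for c in calls),
    }

human_calls = [
    {"handle_time_sec": 420, "csat": 4.1, "cost_usd": 6.50},
    {"handle_time_sec": 380, "csat": 3.8, "cost_usd": 5.90},
]
ai_calls = [
    {"handle_time_sec": 150, "csat": 4.0, "cost_usd": 0.40},
    {"handle_time_sec": 200, "csat": 3.2, "cost_usd": 0.55},
]

baseline, candidate = cohort_metrics(human_calls), cohort_metrics(ai_calls)
for metric in baseline:
    print(f"{metric}: human={baseline[metric]:.2f} ai={candidate[metric]:.2f}")
```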
[00:13:01] I completely agree. What's been interesting, though, is how slow the actual adoption curve has been in, say, contact centers. Contact centers should have been disrupted already; CSAT scores are generally low, and people don't really like most contact center interactions. The general customer feedback you get is pretty unhappy, and that's been true for a decade. So you would have expected... >> Yeah, I guess technology would have... Let's talk about the whole Klarna thing, actually. I know you're an expert on this. The Klarna thing has been really interesting to watch. >> Wait, tell us the story. What is the Klarna thing? >> Well, I was not involved with Klarna, but I can say at least what I know from reading about it, and what my hypothesis would be. Basically, Klarna announced that they were going to move entirely to a fully end-to-end agentic contact center. And by the way, the interesting thing was that at the time they were the most frequently cited example of agentic success in deployments. And then about 8 to 12 months later they
[00:14:01] basically announced they were rolling the whole thing back and moving entirely back to human contact center agents. And I found the entire evolution kind of interesting, because think of how these systems should be deployed. In a multi-agent system, the way it should work is you'd have an orchestration of what the types of calls are, you'd have a set of validations on which calls go well or badly, and you'd have some sense of where you need escalations to human agents versus where you don't. So you actually would never want to move to doing everything agentically, and I think this is a theme here forever: you're going to want humans in the loop in almost every industry and on almost any topic. If these models are trained off of precedent data, you can train them really well to continue that logic, but you're going to want humans for the things where you don't really have precedent data, or where you need them to work through complex cases without enough historical
[00:15:00] information. And so I found the structure of how the change happened quite confusing, because you would always want to keep a contact center a mix of humans and agents, and then evolve the mix between those and the topics each handles. The whole movement from all humans, to all agents, back to all humans was confusing to me. >> Salim, you look like you have a question. >> [laughter] >> No, no, I just wanted to give some details here. So, the Klarna situation: they rolled out an AI to do customer service calls, and the claim was that in the first month it did the work of 700 full-time agents, handled 2.3 million calls a month, and they projected it would save them $40 million a year. And they were really proudly saying this was month one and it's only ever going to get better from here. When I saw that, I thought: this sounds like a PR exercise more than anything real, because you'd never put that out in the first month. You'd wait a couple of months to see what exactly happened.
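A minimal sketch, with an invented intent list and a stubbed classifier, of the orchestration Matt describes: route calls by type, keep confident first-line intents agentic, and escalate the rest to humans:

```python
# Route each call by intent; only confident, first-line-safe intents stay
# agentic. Intent names, thresholds, and the classifier stub are invented.
AGENT_SAFE_INTENTS = {"check_balance", "order_status", "reset_password"}
CONFIDENCE_FLOOR = 0.85  # below this, don't trust the model's own routing

def classify_intent(transcript: str) -> tuple[str, float]:
    """Stand-in for a real intent model; returns (intent, confidence)."""
    if "balance" in transcript.lower():
        return "check_balance", 0.95
    if "refund" in transcript.lower():
        return "process_refund", 0.90  # complex: writes back to source systems
    return "unknown", 0.30

def route(transcript: str) -> str:
    intent, confidence = classify_intent(transcript)
    if intent in AGENT_SAFE_INTENTS and confidence >= CONFIDENCE_FLOOR:
        return f"AI agent handles '{intent}'"
    return f"escalate '{intent}' to human (no precedent data or too complex)"

print(route("What's my account balance?"))
print(route("I want a refund for a duplicate charge"))
```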
[00:16:00] And Matt, you may be able to give a little more color on why they rolled it back in the end. Did they find the hard cases were too many? Was the exception handling too much? Or was it a cultural backlash? What exactly had them undo the whole thing? >> I don't know, in the sense that I haven't worked with Klarna, but you hear a variety of different pieces of feedback on why folks have struggled in contact centers. One reason is that there are cases in which humans just want to talk to another human, so PR saying "we're moving to only agents" has its challenges. Two, a lot of the challenges, and where contact centers are most sensitive, are the topics beyond first-line call resolution. It's not something like "check your balance"; it might be something like "process a refund," right? Something that's pretty complex, where you have to write back to the source systems. It was surprising to me how quickly they rolled that out, and I wonder how well it was able to
[00:17:01] deal with some of the more complex functionality in that example. >> Right. You go from level one to levels two and three very quickly on those support calls, and then you do not want an AI dealing with you. Can we get back to the main question here? 2026 is coming up; if you're listening to this in 2026, it's here now. So here's the question. You're a medium-sized or large company, and your board of directors has just said to the CEO or CTO: guys, what's your AI plan? What are you doing? We're seeing that over and over again. What is their first reaction, typically, and what should they do? I want to get some of the fundamentals here, because I want to serve our listener base in that fashion. >> Yeah, so if you're that CEO, you've got two different challenges. One is: what are the things I should focus on? And two is: who should do them, and do I have those skills in-house? And so the first thing
[00:18:01] I would start with is the first question, and I do think this is a matter of following the value. I'd go down a list. I would not start by letting a thousand flowers bloom. I would start with: what are the two or three things that, if you do them well, materially move the needle for your business? We were just talking about customer service; maybe that's one example. Maybe it's forecasting in your FP&A function. Maybe it's inventory management. Digital marketing is probably another one you see pretty frequently. But there are definitely two or three things that almost any business on earth has, even a small company. You focus on one or two of those, and you make sure you get to a pilot stage in that one or two, meaning not a strategy document. The one thing that anyone who's spent real time in this space will tell you: take the paradigm of how machine learning is deployed, where you spend months and months building something, and then it works, and you can underwrite statistically that it works.
[00:19:01] This is kind of the exact opposite paradigm, in that you can get a prototype up and running in a month, but you have to do a lot of testing and validation to make sure you can trust it. So it is really a function of making sure you can get something up and running, then testing and validating. You know, Peter, the question I always ask is: would you bet your annual bonus that whatever use case you deploy works? And that's a complicated thing. If it's, say, generating a claims processing review, and you have to do 10,000 of them, most companies don't know how to say whether it works or not. So, to summarize what I would do: make sure you have a list of the two or three things that move the needle. Make sure you get to a proof of concept in one of them. And I would probably run that first use case as an RFP to a third-party vendor that gets compensated based on results. >> Mhm. >> And I say that very specifically, because if you do it in-house, the odds are the in-house team has not had a lot of experience with this. And so you also
[00:20:00] can’t hold them accountable in the same way of you get paid if it works. And so, I do think tying it to outcomes limits your risk. I mean, that that is still the business model for Invisible, right? You’re paid by money saved. Correct. We we outcomes. Yeah, outcomes in various ways. Yeah. Yeah. Alex, want to bring you into the game here. Much appreciated. So, maybe just as a preliminary matter in in for us to full disclosure, I have no financial interest in Matt’s company, Invisible. I do have a number of questions, though. First question, maybe pulling the the thread on testing. One of the things that we talk about here on the pod all the time is benchmarks, the importance of benchmarking. I’m I’m curious given that >> talk about that constantly, Alex. That is all we talk about. We talk about nothing else. That is all we talk about. Oh, wait. Maybe that’s you. Okay. >> [laughter] >> Given that’s all we talk about as as Dave just mentioned, and given that Invisible is is also in the business of training so many models, what benchmarks do you think most need
[00:21:01] to be brought into existence in the world? What's most missing? Top three benchmarks you'd like to see summoned into existence. >> Yeah, look, you've seen a bunch of these start to get publicized in the past couple of months, but most of the public focus to date has been on the large public benchmarks for things like coding. And I think those are very useful as metrics for whether the models are improving broadly. That is how you've been able to see that, by any standard, over the last three years the models have shown 50% to 100% improvement on most dimensions you can look at. The problem is, though, if you think about enterprises or small businesses, your benchmark in most cases is not a broad-based cognitive benchmark. It's accuracy, or human equivalence, on a specific task. So what I think you're going to see more and more need for is custom evals on highly specific topics. So, if you go back to
[00:22:00] the contact center example, the benchmark you'd want to build before rolling this out is how a set of expert human agents in your contact center perform, and then how the AI agents perform on the same tasks. Same with claims processing. Basically, most businesses are going to have to get comfortable with building what's called an eval, a custom benchmark, for the tasks they're trying to modernize, because an 80% accurate, very smart deployment still carries too much risk in that rollout framework. So I think the way we think about benchmarking will evolve from broad-based benchmarks to hyper-specific benchmarks. >> I freaking love that, because I can immediately see 10,000 listeners who just found a calling in life based on what you said. Benchmarking within any of these domains is really, really hard to figure out unless you know the domain. Take title insurance: what's the benchmark for successful AI in title insurance? Somebody in that industry listening to this pod right now is going to say, "You know what? I was an early adopter of AI, and I know this space inside and out. That's my benchmark to own." And if you declare yourself the owner of it and then broadcast the benchmark, the evidence so far is that you become an instant star. Nobody's grabbing topic ownership in all these topics; if you just get there first, you become an instant star. >> I completely agree with that. >> Especially a geek like an Alex type or a Matt type. >> In this era of post-training as a commodity, if you own the benchmark, often the benchmark is the hard part, and you can leverage existing resources to post-train an off-the-shelf model. I am curious, though, Matt, maybe following up on this. It sounds like your position is that we need thousands of new narrow benchmarks to capture maybe every labor category, every industry vertical. Assuming that's correct, is that something that Invisible is working on, can be working on, should be working on? >> Yeah, we do spend quite a bit of time working on that. In fact, a lot of the time what we're building is customer-specific benchmarks for an individual task. So a lot of what we think about is how to test equivalence for a given task. And one of the things folks have not fully realized: let's say you take a really high-performing LLM and you want to tailor it to your individual context; there's the process of actually fine-tuning it off of your data. One of the challenges is that people were hoping this would be a SaaS buyer's paradigm, meaning I could just buy something off the shelf that would solve everything I needed. If I wanted to buy a sales agent, I wouldn't have to do anything; I could just take in a sales agent that would sell well. The reality is that's pretty hard to do. You need to actually train it up on your specific knowledge corpus, your information. So the way we would think about it is: you take the LLM, or you take an agent that's been trained for sales, and then you fine-tune it off of your specific company information, your products, the way you sell, your way of speaking. And then you have to build an eval, a benchmark, against that to say whether it is performing well or not on that task.
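A minimal sketch of the customer-specific eval Matt describes, with a hypothetical call_sales_agent stub, a toy grader, and an invented pass threshold; a real harness would use expert-written references and a richer rubric:

```python
# Score a fine-tuned agent against human-expert reference answers before
# rollout. The stub, grader, cases, and threshold are all invented.
def call_sales_agent(prompt: str) -> str:
    """Stand-in for your fine-tuned model's API call."""
    return "Our premium plan includes onboarding support."

def grade(model_answer: str, reference: str) -> bool:
    """Toy grader: require every key word from the human reference.
    A real eval might use an expert rubric or an LLM judge."""
    return all(word in model_answer.lower() for word in reference.lower().split())

eval_set = [
    {"prompt": "What does the premium plan include?", "reference": "onboarding support"},
    {"prompt": "Which regions do we ship to?", "reference": "united states and canada"},
]

passed = sum(grade(call_sales_agent(case["prompt"]), case["reference"])
             for case in eval_set)
accuracy = passed / len(eval_set)
print(f"human-equivalence accuracy: {accuracy:.0%}")
if accuracy < 0.95:  # an "80% accurate" deployment is still too much risk
    print("below threshold: keep humans in the loop, keep fine-tuning")
```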
[00:25:01] >> Well, quick follow-up question, if I may, because there was the sort of infamous BloombergGPT moment, where Bloomberg was in quasi-competition with the frontier labs. They had a wide variety of internal proprietary data sets. Their original plan, and this is now a sort of infamous episode from one to two years ago, was to offer their own proprietary frontier model, critically pre-trained and/or post-trained off of their internal data sets. The plan was to achieve superb performance in the financial domain, because they had a lot of data that was not broadly available to the general public. But what actually happened is that the generalist models offered by the frontier labs, trained basically off of the internet and more or less publicly available data sets, within a few months leapfrogged the BloombergGPT project. And so I guess the moral of that parable, in my mind, is: how far do you think we
[00:26:01] can really get with proprietary data sets and proprietary benchmarks before the generalist models completely wipe the floor with them? >> Sorry, to clarify: I'm saying you use an LLM. The process I'm describing, of fine-tuning a large language model for your specific context, is basically adding more context. Most of the LLMs offer a paradigm where you can do this, where you can add your knowledge corpus and train it to be more specific to your individual context. I don't think you'll see individual institutions building their own LLMs; that's a very compute-intensive, very difficult thing to do. I think you'll see them tailoring the large language models to their context. >> Sure, sure. One more question, if I may. To be clear, I wasn't asking whether you think every institution is going to get into the business of pre-training its own models. I was rather asking whether you think post-training, which is inclusive of supervised fine-tuning, reinforcement fine-tuning, and a variety of other mechanisms, whether you think that
[00:27:01] has a long-term future. Or will we, maybe in one to two years, just use a pre-trained-plus-post-trained generalist model off the shelf, and not need any internal benchmarks or any internal data sets for post-training? >> Well, I think there are clearly going to be use cases where you are going to need the context of the individual company, right? Just take the law firm example. There are documents a company has on how they want their future-state documents for an M&A agreement to look, and the LLMs are not going to have that information. So at some point you are going to see the post-processing layer happening at the enterprise. And what we're seeing more and more is that there are ways to design that layer so that, as new models evolve, you can drop them in; more and more folks are experimenting with that, so they're using all the new tech that's being rolled out. >> I think, in fact, what's going to happen is that over time that edge in data is going
[00:28:00] to be the most valuable part of any company. It's that trade-secret type of "how we do things." Now, at some point it may leak into the public models, but... >> Like if you used OpenAI, right? >> Yeah, if you use any of the frontier models connected to them. I remember we were talking to Replika, etc. People are using it, and the data is going straight into the cloud, right? And that's kind of dangerous. They're going to have to solve that layer in a very powerful way. That's one of my predictions to forecast: we're going to need to see a layer of protection between company data and the broader AI world. >> Matt, I want to make this a little more tangible. Now, I know you can't talk about the work you've done with the hyperscalers, but you've identified, I think, five or six cases where you can speak publicly about it. So if you don't mind, maybe we can toss a few of those in and then talk about them as concrete examples. And since Alex made his no-financial-
[00:29:02] involvement statement, I will say I'm a proud advisor and am conflicted, in a positive fashion, supporting what Matt and Francis are doing. So, do you want to pick one of those? I loved the example on the basketball court. Can you speak to that one? >> Yeah, sure. So we worked with the Charlotte Hornets on fine-tuning custom computer vision models for their draft prep. In their case, they wanted to look at the spatial movement patterns of players on a very broad scale, from single-point cameras across a whole host of different college universities and international locations. And so we fine-tuned a custom computer vision model to specifically look at the movement patterns they were interested in before the draft, and that was a big part of their draft evaluation. >> In English: you basically took the video and were able to use models to
[00:30:02] evaluate every player, based upon the video, to see how well they performed. I'm not really a sports guy. >> Yeah, that's becoming clear here, actually. >> [laughter] >> Yeah, sure. So think of typical NBA stats: things like points, rebounds, and what's called plus-minus, one ratio that's often used, which is the amount you score versus give up while you're in the game. But they're mostly transactional stats. What they don't look at is the movement patterns of the players: who creates space, where people are positioned at any point in time. And that's actually a lot of the most interesting data. If you go back to some of the original baseball analytics that Billy Beane did for the A's, it's the movement patterns of players and who is in the best spacing, right? And there are companies that do this in very consistent formats, like on the same court, but what we've been able to do is do that
[00:31:00] over many different camera angles and many different stadiums, very quickly, using custom computer vision models. So we are effectively able to take a single-point camera and understand the movement patterns of players in many different environments. >> And the Hornets use this how? For team selection? Player selection? >> Draft selection: to understand which players fit certain characteristics they were looking for. >> Fascinating. >> Yeah, it's a complicated problem, too, because it's not just about finding the best player; the chemistry between players matters, too. >> It gets infinitely complex, and it's a cool little case study. But you know, Gavin Baker was saying recently that in fantasy football leagues all over the country, which I used to love before I ran out of time... >> [laughter] >> Now you have an agent doing it for you, and having the fun. >> That's exactly the point. Now we're obsoleting human sports leagues, replacing them with robot sports leagues and e-sports. Very 21st century, not 20th.
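A minimal sketch of the movement analytics described here, assuming a hypothetical upstream detector has already produced per-frame court positions; the distances and spacing metric are illustrative, not the Hornets' actual pipeline:

```python
# From per-frame (x, y) court positions, compute distance covered per player
# and average spacing between players. Positions below are invented.
import math

# frames[t][player] = (x_feet, y_feet) on a normalized court
frames = [
    {"A": (10.0, 20.0), "B": (30.0, 25.0)},
    {"A": (14.0, 22.0), "B": (31.0, 24.0)},
    {"A": (19.0, 25.0), "B": (33.0, 22.0)},
]

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Distance covered per player: sum of frame-to-frame displacements.
for player in frames[0]:
    covered = sum(dist(frames[t][player], frames[t + 1][player])
                  for t in range(len(frames) - 1))
    print(f"{player}: {covered:.1f} ft covered")

# Average spacing between the two players across frames, a crude proxy
# for the "who creates space" signal Matt mentions.
spacing = sum(dist(f["A"], f["B"]) for f in frames) / len(frames)
print(f"avg A-B spacing: {spacing:.1f} ft")
```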
[00:32:00] I’m betting on T-800 again. Yes. That’s right. Yeah, but people are losing their leagues all you know, great great fantasy football people are losing all over the place cuz the AI agent is tracking a huge amount of more detailed data. And you know, if you if you look at the video footage, you know, somebody’s like making it up and down court very slowly. Nobody’s going to notice that, but the AI will notice it in a heartbeat. And then that just goes into the great model. It’s it’s really a cool little case study. Since you asked a little bit about kind of if I have a traditional business were were thinking about how to do this, I’ll give a I’ll give a slightly different one which is Lifespan MD, which actually Peter, I think this one will resonate with you in particular, which is a concierge I I know I know Chris who runs it. Yeah. Yeah, so so so Lifespan MD is a concierge medicine business. And you can think of it as they have a network of practices both internationally and in the United States, which all have very different sets of data on their patients kind of practice information. And so the the thing I always start with with any
[00:33:01] AI use case is that you have to get the data right. Before you can even start with AI, you have to make sure you have the structured and unstructured data together that you want. So the first thing we're doing for them, on our data platform Neuron, is creating a HIPAA-compliant multi-tenant cloud instance where we bring together all the patient and provider data that's of interest, and we start to build a 360-degree view of both the patient and the practice. Then you can start to ask things that are really interesting for patient outcomes, like: which longevity-focused tests are male patients 35 to 50 using most frequently? You can look at practice performance, or at where you have certain patients who are not compliant or not as engaged. It's effectively a control tower for understanding everything that's going on across that footprint of practices. And then the area where generative AI has become more important for that is chat
[00:34:00] agents, where people can ask questions, knowledge management systems, and can really interrogate all the key data from all of those practices. One of the key things that's challenging about that: obviously, in health care, you have to be extremely careful about which data is stored locally at the practice versus what's brought central. So the HIPAA-compliant multi-tenant cloud is one of the key components: making sure that no patient data leaves the premises of the individual practices, while doctors are able to access certain things and certain practice metrics are organized centrally.
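A minimal sketch of the local-versus-central split Matt describes, with invented practice data: raw patient rows stay on-premise, and only de-identified aggregates reach the central control tower:

```python
# Each practice runs local_aggregate() on-premise; the central side only
# ever sees aggregate counts, never patient rows. All data is invented.
practice_local_db = {
    "practice_nyc": [
        {"age": 42, "sex": "M", "test": "ApoB panel"},
        {"age": 38, "sex": "M", "test": "VO2 max"},
    ],
    "practice_london": [
        {"age": 47, "sex": "M", "test": "ApoB panel"},
    ],
}

def local_aggregate(rows, lo=35, hi=50):
    """Runs on-premise: counts tests for men aged lo-hi; no PHI in output."""
    counts = {}
    for r in rows:
        if r["sex"] == "M" and lo <= r["age"] <= hi:
            counts[r["test"]] = counts.get(r["test"], 0) + 1
    return counts

# Central control tower combines the de-identified aggregates.
central = {}
for practice, rows in practice_local_db.items():
    for test, n in local_aggregate(rows).items():
        central[test] = central.get(test, 0) + n

print("Most-used longevity tests, men 35-50:", central)
```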
[00:35:00] >> I heard the coolest thing this week. It's a QA company that has invented "talk to your defect." It's just the coolest concept: the defect actually has a personality, and you can ask it questions about itself, like "where did you originate?" I can totally imagine what you just said in health care becoming "talk to your illness." You have a conversation with it: Where did you come from? How do I treat you? Are you getting better or worse if I do this thing? And it's talking back to you with a personality. Isn't that the coolest idea ever? >> I think it's amazing. It's one thing with a defect; it's a little awkward when you say, "here's the bacteria you're talking to." >> Well, the defect is real. "Talk to your illness" maybe gets a little weird. I don't know what voice you would give it. A Voldemort voice or something. >> [laughter] >> Tell me, how do I kill you? How do I dispatch you? >> Dave, one thing I'd note there, too, and I get asked this often; Peter, you asked earlier how sectors evolve. I think the question of whether decisioning of individual patient care changes with gen AI is a much murkier one. The easier place to start, and in many ways the more interesting one: the US, as an example, spends about $13,000 to $14,000 per patient per capita on health care, compared to $2,500 to $3,000 per capita in, say, Germany or Canada. Something like 30% to 40% of that is admin cost.
[00:36:01] And that is not admin cost that anyone wants to bear. So this is where I actually think the idea that Lifespan MD is pursuing is not to change the standard of care but to make the physician even more empowered: take all of the really painful admin and scheduling, and make that the part they don't have to deal with anymore. >> And AI should do a huge amount of damage in those areas, exactly. What are you seeing most companies get wrong on their mission to implement AI? >> Yeah, I think it's a couple of different things. The first is a lack of focus on data as the starting point. If you tried to build an AI agent on fragmented customer and product data, it's going to break, by definition, right? So you have to be in a place where the data you're going to feed into the models is clean and working. That's been one major challenge. >> Do you think... I mean, if you
[00:37:01] had to look at medium and large companies as a whole, do they have clean data? How long does it take a company to get its data into a format and a level of fidelity that's useful? Is this a hard lift or an easy lift? >> It depends. If you take the paradigm of "I'm going to put everything in a data lake and get everything right," that can take five years, and the reality is most big companies have spent a half decade trying to get all their major data schemas in order. But start instead with the question of what data you need for this specific use case. Take credit underwriting. To do that well, you need a set of data around the credit itself and the market. You probably have five to six core data variables you need: the core financials of the business, the security of the credit, all
[00:38:01] those kinds of core pieces. But you don't need every piece of data across the entire commercial bank to be right; you need the core elements for that use case. So companies that focus on the exact data they need to get right have done pretty well, but trying to get all data right... I mean, you've seen the enterprise for a long time, Peter. If you asked any Fortune 1000 company how much of its full data repository is accurate, working, clear, and accessible right now, very few companies have that. So be very tactical about what data you need. The other thing, for gen AI in particular, is that a lot of the most important data is non-system-of-record, unstructured data: things like images, videos, and text files. It's just not what people have tried to master historically. And so the first step in this is saying: what is the thing I'm trying to solve, and how do I make sure I have that data ready?
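A minimal sketch of that tactic, with an illustrative (not authoritative) field list for credit underwriting: declare the core variables a use case needs and check readiness before building:

```python
# Declare per-use-case data requirements and report what is missing.
# Field names and the available set are invented for illustration.
REQUIRED_FIELDS = {
    "credit_underwriting": [
        "business_financials", "credit_security", "market_data",
        "payment_history", "collateral_valuation",
    ],
}

available = {"business_financials", "market_data", "payment_history"}

def readiness(use_case: str):
    """Return (fields ready, fields needed, missing field names)."""
    needed = REQUIRED_FIELDS[use_case]
    missing = [f for f in needed if f not in available]
    return len(needed) - len(missing), len(needed), missing

have, total, missing = readiness("credit_underwriting")
print(f"credit_underwriting: {have}/{total} core fields ready; missing {missing}")
```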
[00:39:00] >> Yep. One thing I see a lot: I had a long board meeting this morning with a company that's very AI-forward, a portfolio accounting company called Vestmark. And the data, for account reconciliation for example, is abundant, but it doesn't tell you what the person actually does; it just tells you how it was reconciled. So the path to success is first the AI assistant, which helps accelerate you through the day but also learns what you're actually doing. That accumulates, and it becomes the RLHF, the training or tuning data, because what you're really trying to capture is: what are you actually doing, guys? And that's not represented in the data. A lot of times you go talk to a bank or an insurance company and they say, "Our data is our advantage. Go ahead, bomb it into the neural net and train it." And you're like, "I don't even know what that means. I'm just going to throw terabytes of spreadsheet data in and see what happens? That's going to go Clippy on you." >> Well, you have all sorts of other issues as well. I was talking to the CIO of one of the biggest banks in the world, and they have 300
[00:40:01] different customer databases. Okay, 300: one for mortgages, one for loans, one for this. And because the mortgage people don't want to tell the loans people about their customer data, they guard it jealously. It's a total disaster for the poor CIO. >> Fascinating. Alex? >> Yeah, I think these are all very interesting points. I'd like to, if I may be so bold, jump up several levels and maybe speak a little more about the business model of Invisible. My understanding, and correct me if I'm wrong, Matt, is that there's an element of the business, I think it's called Meridial, that acts as a sort of marketplace for ML freelancers, if I understand correctly. And I'm curious. In my mind, one of the many elephants in the room in this conversation is that we're arguably on the edge of recursive self-improvement. All of the frontier labs, more or less, would agree with the assertion that we're nearing the point where you
[00:41:01] just turn over compute resources to an AI researcher, and the AI researcher does as good a job, if not better, than the human AI researchers who work for the frontier labs. If that is indeed the case, then surely, and this is one of several elephants in this room, but given limited time let's focus on this one, the need for a marketplace of freelance ML researchers to train models evaporates entirely as we reach the point where AI researchers can build custom models off of custom data sets and custom benchmarks for each client? >> Yeah, so as you said, we have two sides of our business: one side, Meridial, where we train all the large language models, and the enterprise side, where we build custom applications for enterprises. Look, there has been a five-year evolution where folks have consistently said that at some point you will not need reinforcement learning from human feedback to validate and test models.
[00:42:00] And I think the challenge with that logic is a couple of different things. One is the spectrum of expertise: language, multimodality, extreme expertise on things like computational biology. And then there's the fact that a lot of these are reasoning tasks; there's a whole host of studies showing that pairing synthetic and human data together is stronger, but you do need human feedback on almost every different sort of agent you want to roll out. So I think the nature of RLHF is changing. You're moving more towards things like RL gyms, controlled environments, simulations. Much more of the expert work now is done by PhDs and master's degrees, so it's less of what I'd call commodity "cat, dog, cat, dog" labeling. But if you say that tomorrow you're going to train a model to figure out, say, different evolutions in 17th-century French architecture, in French, you are going to want RLHF to validate it. And I think you're
[00:43:00] seeing that over and over: as the models move more and more into very specific areas, there is more and more RLHF needed for them. >> That's interesting. Maybe I'll share my intuition, and then I'd be curious to hear what you're seeing in your version of the ground truth. My impression is that we're seeing greater and greater data efficiency. RLHF was obviously very fashionable over the past three years; maybe it went through peak fashion, if you will, and then we saw the rise of reinforcement fine-tuning mechanisms that are far more data efficient, and maybe even more human-time efficient. If you just have to build an RL environment, that's arguably, per human hour involved, a lot more time efficient than staffing out to folks in some so-called developing country to, as you say, "cat, dog, cat, dog,"
[00:44:00] do supervised fine-tuning or some other RLHF-type mechanism. Surely, and I'm projecting here, my intuition is you'd see more data efficiency, not less, and therefore less time, effort, and money expended on RLHF. Even if we buy your assertion that we're seeing a hyper-parochialization of lots of different tasks, and each of them is going to need artisanal annotation, surely there is a competing force, which is increasing data efficiency from algorithmic advances like reinforcement fine-tuning. What are you seeing? >> Yeah, people have been arguing that for five years, but what I've seen on the ground is that, given the accuracy you want, if you think about a reasoning task that involves a several-step leap, and you think about the risk of hallucinations, it is more useful to have human feedback involved in
[00:45:00] some form. And consider that RLHF happens after all the pre-training compute cost; it's a pretty small percentage of the total cost of training, and it is some of the most valuable feedback. And you see more and more specific agents being trained for specific tasks. Take legal services as an example: if you get a new legal services data set, which is interesting, and you want to train a model off of it, you are going to want some sort of comparable equivalence, whether it's an associate equivalent or an M&A-lawyer equivalent, where you actually test whether it works. Now, is it possible that at some point, 10 or 15 years from now, you run out of things to train on? Possibly. But if you take the number of languages and modalities (robotics is probably the next frontier of this in some ways), RL gyms, contact centers, there's a lot. As a company, we are full believers, and I talked about this on the enterprise side too, that human in the loop is going to be a feature, not a bug, for a long, long time. And I think
[00:46:00] the entire red herring of the enterprise, for example, is the idea that autonomous agents will do all of this with no human in the loop. I actually think you're going to need more and more humans at every step. >> Alex, you're saying that the level of intelligence of these agents, as we pass through AGI and get to ASI, is such that they'll figure it all out as well as any human, and replace that human in the loop. What's your timing on that? >> That was exactly my question, Peter. My timeline, if I had to spitball, and of course this is not the predictions episode, so don't hold me to it, hold me to my predictions in the next episode: approximately two to three years, as a conservative outer bound, for the sub-element of recursive self-improvement where we get an AI researcher that's as good as, if not stronger than, the human researchers at building ML models. Not 10 to 15 years; two to three years, max, as the outer, outer edge. But I also believe Matt's totally right that 2026 is going to be the year of recursive
[00:47:00] self-improvement capabilities growing crazily, exponentially, while corporations move at a snail's pace compared to what they could be doing. It's all going to be stuck and bottlenecked and logjammed, and it's going to frustrate the hell out of Google and OpenAI. And companies like Invisible are the lubricant that's actually going to get it from point A to point B. >> But that Clippy use case is a really good one. In our tests for contact centers, 80% of the people massively prefer the AI, but the 20% who don't like it will more than torture the whole thing to death >> [laughter] >> and would rather repeal the entire thing. There are probably eight ways to fix that quickly, but it's not going to come from Google, and it's not going to come from OpenAI. And it's going to involve data that isn't in the natural data set, you know? If you'd told me two years ago that everyone in the world would know what RLHF stands for, and that there would be three multi-billionaires who built RLHF
[00:48:01] companies walking around, I'd have said, "That's not even a thing." Oh, wait, now it's not only a thing, it's massive in scale. There will be new terminology in 2026 for many, many of these other bottlenecks, where, yeah, the AI can do it, but for whatever reason the bank is not doing it, the contact center is not doing it. And those bottlenecks are going to be so lucrative for companies like Invisible to just plow down. I can't answer the specific question of whether your workforce is going to involve the distributed workforce you just described. What was it called, Alex? Or Matt? >> It's called Meridial. >> Meridial, yeah. So there is a really healthy debate: is Meridial a key part of this, or is a network of even more agents a key part of this? Or is 2026 the transition year between the two? It's going to be a really interesting footrace between those two approaches. >> That really is... Dave, I think you
[00:49:00] put your finger on it, Dave. That really is what I'm asking, which I think is a distinct question from "is there value in supervised fine-tuning or reinforcement learning from human feedback going forward?" Of course there is. What I'm really asking is how much of that can come from AI bootstrapping it in the near-term future, versus needing human inputs. >> And what I'm saying is: think about the balance between generalizability and hyper-specificity. I agree with you on generalizability; I actually don't think RLHF is that important even now for that. Where it gets more complicated is when you want to train off of specific tasks. Let's take the insurance claim example I mentioned earlier, right? You're going to generate a 10-page insurance claim, and you could apply this to any enterprise use case and many consumer use cases. In that world, an LLM is producing an outcome, and it's fine-tuned off of a specific company's data, but you need a way to actually say, at that point: does this produce an output comparable
[00:50:00] to what a human doing this task before was producing? So when I mentioned custom benchmarks earlier, that is the process by which you do that: you actually do need human-equivalence testing. You need a human to provide a comparable data set and to say whether this looks good or not. And you just don't have precedent data inside any of these LLMs to train that off of, because the human input is not there. Now, again, that's going to keep going down into more and more specific tasks. Take legal services: take it by language, take it by topic, take it by document type; there's human feedback required for all of that. >> Not to put too fine a point on it, but I want to make sure that those in this episode who want to drink the bitter pill with the bitter glass of water, for the bitter lesson, are so drinking. I'm curious, Matt, to understand how you see this. Surely there's a wave of generalism, and over time
[00:51:00] maybe we can finesse what the appropriate time scale is. It sounds like your so-called timelines are a little longer than perhaps mine, but would you at least agree with the premise that, over time, even the specialized skills end up getting subsumed by generalist models? Or do you think that's just never going to happen, where by "never" I mean on time scales of 10 to 15 years, which is a pretty long time scale: will we always have generalist models that are specially fine-tuned? >> I don't think all specialized expertise is going to go away. Again, for a lot of the information that specific experts have, there's no training data available. It's stuff that sits in people's heads; it's experience. I'm aware of many of the narratives that human expertise becomes less important; again, we are a company that actually thinks the human-touch elements become more and more important. Take sales, for example.
[00:52:00] Many of the best selling patterns, many of the people who have done it best: there is no information you can train off of what they do. They live on human interaction. In a world where there are 500 companies selling email-based SDRs, I think human beings become more important. So I actually think the shift is that expertise becomes more and more important in many different areas, and human-in-the-loop stays really important. Alex, I understand the theory of what you're saying, but we're four to five years into this, and if you look at the number of US contact centers that have migrated to using agents, it's a pretty small percentage. >> Can I ask you... actually, the Jane Street question is really burning a hole in my pocket. It's really clear that stock picking is moving to AI at warp speed, and the reason is that there are no barriers: you're just placing a trade, which is already automated. And that's the bellwether to me. It's a great benchmark. >> More money. >> Yeah, and also almost all of the
[00:53:01] volume on public equities markets has long since been dominated by algos. That happened decades ago. >> Yeah, well, it started with rapid trading, so the quants were already there. Now that it's moving to fundamental analysis, it's the same mindset, which is one of the reasons it's taking off. But like Peter said, you're making more money, so okay, let's just keep going. Nobody is saying, "But I'm going to lose my job." It's, "No, we'll just pay you more. Let's go." So it's a really interesting bellwether. But within that world, they're struggling because the data is so proprietary. >> Mhm. >> And it's looking more and more [clears throat] likely that these self-improving massive foundation models are going to get to superhuman IQ this year, this year being 2026. The context window is getting massive, and the recursive chain-of-thought reasoning is getting really good, so you can actually feed the model data without having to retrain it and have it achieve the job. So if I take that mindset from Jane Street and move it
[00:54:00] over: now I'm a mechanic trying to fix a car, trying to diagnose what's wrong with it, and I have audio and sensor data. Great, easy use case. But am I going to put that data into the LLM API and transmit it to OpenAI, where they can accumulate it, and then, if they decide later they want to be a garage, they have all my data? Or am I going to run some kind of walled-off model? Garage mechanic is maybe not the best example; that's why I chose Jane Street, because they're never going to hand their proprietary data to OpenAI. But in the middle ground you have banks, insurance companies, hospitals: how are they going to deal with this? Sometime in 2026 this becomes easy, but the data is proprietary; it's my only reason for having a competitive advantage. I don't want to just hand it over to the API.
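Dave's mechanic example maps onto a pattern worth sketching: with large context windows, proprietary data can stay out of any training run and be supplied per request, ideally to a model you host yourself. Below is a rough illustration under those assumptions; the local endpoint, payload shape, and response schema are all invented, not any specific vendor's API.

```python
# Sketch of "feed it data instead of retraining": proprietary diagnostics go
# into the prompt at inference time, sent to a locally hosted model so the
# raw data never leaves the shop. Endpoint and payload are hypothetical;
# substitute whatever local inference server you actually run.
import json
import urllib.request

def diagnose(sensor_log: str, audio_transcript: str,
             endpoint: str = "http://localhost:8080/v1/generate") -> str:
    prompt = (
        "You are assisting a car mechanic. Using only the data below,\n"
        "list the three most likely faults and a test for each.\n\n"
        f"SENSOR LOG:\n{sensor_log}\n\nAUDIO NOTES:\n{audio_transcript}\n"
    )
    body = json.dumps({"prompt": prompt, "max_tokens": 512}).encode()
    req = urllib.request.Request(endpoint, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]  # response schema is assumed
```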
>> Yeah, look, there are definitely sectors, many of which you just named, banking, health care, where people are deciding to keep their data on premise, or they're using things like
[00:55:01] small language models for those sorts of reasons, and you may continue to see that as a trend. But one mistake folks often make is assuming all data is proprietary. Take the Jane Street case: maybe their trading data is proprietary, but their back-office finance or forecasting data might not be. So one thing is being clear about which data you need to keep proprietary, and want to put more security parameters around, and which data you can say, "Look, I'm going to be very careful as a company, but this is not as proprietary." It's that balance. Similar to what we discussed with contact centers, the idea of "I will not give anything to the LLM; I'll keep it all in-house" doesn't make sense either, but it is a paradigm you're seeing more and more.
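One way to operationalize the split Matt describes, sketched below, is a simple classification-and-routing layer: sensitive records go to the on-prem small model, everything else may use a hosted frontier model. The marker list and model handles are invented for illustration; a real deployment would drive classification from a data catalog or DLP tagging rather than keywords.

```python
# Illustrative data-sensitivity router. The keyword check is a naive
# placeholder for real data classification; model handles are plain
# callables standing in for actual client objects.
SENSITIVE_MARKERS = ("trading", "position", "patient", "account_number")

def is_proprietary(record: dict) -> bool:
    """Crude sensitivity check over a record's values."""
    text = " ".join(str(v).lower() for v in record.values())
    return any(marker in text for marker in SENSITIVE_MARKERS)

def route(record: dict, on_prem_model, hosted_model):
    """Send proprietary records to the walled-off model only."""
    model = on_prem_model if is_proprietary(record) else hosted_model
    return model(record)
```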
>> Yeah. So I want to change tack a bit, if that's okay. I actually do agree that we'll
[00:56:01] automate, but I think we'll automate in a way that's different from this discussion. Let me give an example. Say I'm Canon and I'm selling home printers. Right now I have a bunch of people doing marketing, content development, and brand management; salespeople selling to the Best Buys and so on, plus online folks; then post-purchase people getting the customer to register the dang printer; then all the repair and technical support staff; and then the accounting folks. So you've got pockets of people doing different functions across the board. If I were going to build an AI-native printer company, I might think about having all of those things automated completely with AI, so you're not human-centric but function-centric across those functions. The printer could report when it's running out of ink and you ship a replacement; it
[00:57:00] tells you when there's a problem, or a problem coming up, and you alert your repair staff: "Hey, maybe we can upsell this guy a printer." And so on: you essentially automate all the functionality with AI and leave the human out of the loop almost completely, because you've automated the core functionality. Right now what I'm seeing is what I used to call radio over TV. When we first had television, we took radio announcers, put them on TV, and had them read radio scripts; we didn't adapt to the medium. What I'm seeing right now is that we're automating what the human beings are doing at each of those functions, but surely over time we're going to automate the functional flow and take the human beings out entirely. AI-native, AI-first, right? >> Not to mention getting rid of the printers. >> Well, that's a separate question. I'm just using that example. >> [laughter] >> Who's going to be doing any of the printing? Let's leave that part aside for the moment.
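Salim's function-centric model is essentially event-driven automation: device telemetry triggers the supply, support, and sales workflows directly, with no human routing in between. A toy sketch under that assumption follows; the event names and handler logic are hypothetical, chosen only to mirror the printer example.

```python
# Toy event-driven "function-centric" automation for the printer example.
# Telemetry events fan out to automated handlers; humans would only be
# paged for exceptions. Events and handlers are invented for illustration.
from collections import defaultdict

handlers = defaultdict(list)

def on(event):
    """Decorator registering a handler for a telemetry event."""
    def register(fn):
        handlers[event].append(fn)
        return fn
    return register

def emit(event, payload):
    for fn in handlers[event]:
        fn(payload)

@on("ink_low")
def ship_replacement(p):
    print(f"auto-ordering ink for device {p['device_id']}")

@on("fault_predicted")
def open_ticket_and_upsell(p):
    print(f"scheduling repair for {p['device_id']}; flagging upsell offer")

emit("ink_low", {"device_id": "CN-1138"})
emit("fault_predicted", {"device_id": "CN-1138"})
```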
>> I think you're absolutely right, Salim. This is where a young AI-native company reimagines an entire field with
[00:58:04] zero legacy and zero friction. The question, as Matt said at the beginning, is: do they have the distribution? But this is where a large company, Canon in this case, should actually be investing in entrepreneurs. One of the things you and I talk about often is: if I'm a large company and I don't know what to do, I would hold a competition and ask young AI entrepreneurs around the world, "How would you disrupt my company? Give me a pitch." Then I would pick the best five of them and fund them. I'd say, we're going to fund you to disrupt us, give you access to our data, to everything we have, and ultimately buy you, or buy a majority stake in you, and make you our new company. This is innovation at the edge and displacement of the core, however you want to call it. This
[00:59:00] episode is brought to you by Blitzy, autonomous software development with infinite code context. Blitzy uses thousands [music] of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code. Engineers start every development sprint with the Blitzy platform, bringing in their development requirements. >> [music] >> The Blitzy platform provides a plan, then generates and precompiles code for each task. Blitzy delivers 80% or more of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Enterprises are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their org. Ready to 5x your engineering velocity? Visit blitzy.com to schedule a demo.
[01:00:02] >> [music] >> Say you're a medium-size or large company; I'm not going to focus on the startup right now. What do you do in 2026? Because you're going to have to do something. You're going to have pressure from your board, from your shareholders, >> From Alex. >> from competition. So you've got to do something. What I've heard you say so far, Matt, is: number one, get clean data; make sure you understand your data situation. Number two, pick two or three areas, call them benchmarks, where you're going to run experiments, and not as a proposal or an idea; actually run the experiment and see how it works. What else? And then pour money on the things that do work, and have an expanding
[01:01:01] circumference around the company's major revenue engines. How do you think about that? Walk us through a few more steps. >> Yeah. One of the things that's been a big topic of conversation here is: given all the improvements in the models, and given what Salim was walking through on the potential to clean-sheet a company from scratch, why did that MIT report find that only about 5% of enterprise AI pilots make it to production right now? So there's a starting question: given all this tech excitement, why has it been so much harder? It's not the technical challenges we talked about; it's the data, and it's the focus on which priorities to pursue. The other big one, though, is the organizational structure by which you pursue those initiatives. In particular, the advice I give everyone is: do not locate this in your technology organization. Take your best operator, your best ops person, give them an operational KPI, and track to that. And make sure it's a really clear operational KPI. So we talked a bunch
[01:02:00] about contact centers. You should have an operational person lead it around CSAT score, time per call, whatever core metrics you're looking at, and that should be your guide. If you take something like inventory forecasting, you should do it around inventory days, stockouts, those kinds of metrics. If you have a clear sense of which operational person is leading it, how they're marshaling resources around it, and you have a clear KPI, you're going to make progress if you focus on a couple of things. The failure mode has been letting a thousand flowers bloom, none of them with an operational metric, and you end up with a science-project dynamic.
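Matt's "clear operational KPI" advice translates naturally into something machine-checkable: pin the pilot to the baseline the ops lead committed to, and only call it a success if it beats that baseline. The sketch below is hypothetical; the metric names, baseline values, and pass rule are illustrative, not anything Invisible ships.

```python
# Hypothetical KPI gate for an AI pilot: the initiative only "counts" if it
# beats the operational baseline. All numbers are illustrative placeholders.
from dataclasses import dataclass

@dataclass
class ContactCenterKPIs:
    csat: float              # customer satisfaction, 0-100
    avg_handle_seconds: float
    containment_rate: float  # share of contacts resolved without escalation

BASELINE = ContactCenterKPIs(csat=82.0, avg_handle_seconds=340.0,
                             containment_rate=0.15)

def pilot_passes(pilot: ContactCenterKPIs,
                 baseline: ContactCenterKPIs = BASELINE) -> bool:
    """Pass only if the pilot improves every KPI the operator committed to."""
    return (pilot.csat >= baseline.csat
            and pilot.avg_handle_seconds <= baseline.avg_handle_seconds
            and pilot.containment_rate >= baseline.containment_rate)

print(pilot_passes(ContactCenterKPIs(84.5, 295.0, 0.31)))  # True
```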
>> Yes, exactly. That's exactly right. With the thousand flowers, you walk in and say, "I am going to give you a million genius-level people for free. Do something," and it fails. It's like, here's a million people for free, they're all geniuses, and it fails for the same reason: I didn't think of an
[01:03:01] idea, so I said, a thousand flowers, just go bloom. I couldn't think of anything, so maybe you will. How's that going to work? I've seen it, and it's just so sad, you know? >> We go even further. We say: don't just take the operator; put them outside the organization and let them build something from scratch at the edge. Otherwise you're encumbered by all the internal rules and bureaucracies, everything slows down enormously, and it fails for legacy reasons. It's not just a skunk works; it's the Apple Macintosh team. >> Yeah, look, Apple is actually a master at this. What Apple will do is form a small team that's very disruptive, put them at the edge of the company, keep them secret and in stealth, and say to them, "Go disrupt another industry," whether it's watches or retail or whatever. At last count, I think they have 18 teams looking at different industries. And when they think an industry is ready to disrupt, they go into it, and they patiently iterate.
[01:04:00] Right? The Apple Watch, for example. So this is the model I think we're going to see many other companies take on. Because if you think of any operational company, the insight they have into all sorts of adjacent industries is incredible. It's very hard for them to disrupt their own industry, because they're probably pretty optimized for it, unless you come at it with the AI startup, but they can really disrupt a lot of the industries around them. So I expect them to launch AI-native startups that go into adjacent industries and attack some of their neighbors. >> Nice. Matt, before we get to a few of your 2026 predictions, can you share a couple more of the use cases here, just because they're fun? >> So we worked with SAIC Vantour and the US Navy on building intelligence for underwater drone swarms, for unmanned underwater vehicles. Think of it as having a series of drones, with enormous numbers of sensors on each of those drones, and you need to understand the movement patterns of those different drones.
[01:05:00] And in each case, you see an object underwater: what do you do? Do you engage? Do you step back? Do you move with other drones? That whole movement-pattern decisioning for unmanned underwater vehicles is what we worked on: fine-tuning a model to do that, training it, looking at all the movement-pattern data. One of the interesting things about these drones is that they are autonomous, so thinking about how those movement patterns evolve in complex environments is very hard to do, but you also have lots and lots of interesting sensor data to work with. [snorts] One that anchors more on the human decisioning side is Swiss Gear, like Swiss Army, the luggage brand. Peter, I actually think this is one a lot of folks in the audience may relate to in some form: they had an enormous mix of data tables around products, customers, and so on that they couldn't bring together for inventory forecasting. So we used our data platform Neuron to bring together 750 tables really quickly, then optimized the forecasting to both minimize
[01:06:01] stockouts and optimize which inventory to hold. If you get inventory forecasting right, and it's probably one of the major issues for most big and small businesses, you minimize lost revenue and you make sure you don't hold lots of excess inventory. It's one of the hardest things to do, particularly if you've got a six- to eight-month order cycle time. That's something we partnered with them on, and I think it was a great outcome: we ended up expanding their overall inventory coverage by about 30% and basically 2x'd the number of SKUs with a reliable prediction. And that was done in about a couple of months.
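For readers who want the textbook core of that forecasting problem: with a long order cycle, the stockout-versus-excess trade-off is usually managed through safety stock and a reorder point. A minimal sketch of the standard formula follows; the demand numbers are invented, and this illustrates the classic baseline, not Neuron's actual method.

```python
# Classic reorder-point calculation, the standard baseline behind the
# stockout-vs-excess-inventory trade-off. Demand figures are made up.
import math

def reorder_point(mean_daily_demand: float, std_daily_demand: float,
                  lead_time_days: int, z: float = 1.65) -> float:
    """z = 1.65 targets roughly a 95% service level under normal demand."""
    safety_stock = z * std_daily_demand * math.sqrt(lead_time_days)
    return mean_daily_demand * lead_time_days + safety_stock

# A ~7-month (210-day) order cycle makes the safety buffer substantial:
print(round(reorder_point(mean_daily_demand=40, std_daily_demand=12,
                          lead_time_days=210)))  # ~8687 units
```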
>> All right. So later this week, my Moonshots mates and I are recording our 2026 predictions. We'll have Emad back, and each of us will provide two predictions for 2026: our top 10 from the Moonshots podcast. It's going to be fun, and it's going to be a battle; we're going to ask our listeners to vote on which predictions they like best. Of
[01:07:00] course, they're all going to vote for Alex's, but hey. Matt, talk to us about what you see coming in 2026. >> Yeah, I'll call out a couple. We've just done a bunch of research on our 2026 predictions, so I won't give all of them, but I'll call out a few. The first one I'd anchor on is multi-agent teams. One of the challenges, and it's inherent in a lot of what we discussed here, is that if you're a large enterprise or a medium-size company implementing a use case, you won't necessarily have one decisioning agent that does everything. You'll train task-specific agents for individual tasks, usually orchestrated by an LLM. What that allows you to do is pinpoint the accuracy on those specific tasks, and then use the broader logic of the orchestrating LLM to make sure they all work together properly. That's an architecture that's been discussed pretty broadly for a while, but we're just starting to see the green shoots of more and more folks having success with it.
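The orchestrated-specialists pattern Matt describes boils down to a router plus narrow task handlers. In the sketch below, the LLM routing call is stubbed out with a keyword heuristic; the task names and agent bodies are invented for illustration and are not a description of Invisible's platform.

```python
# Minimal multi-agent pattern: an orchestrator classifies the request, then
# dispatches to a narrow task-specific agent. `llm_classify` is a stand-in
# for a real LLM call; tasks and agent logic are hypothetical.

def refund_agent(request: str) -> str:
    return "refund initiated per policy"

def billing_agent(request: str) -> str:
    return "billing record explained"

def escalation_agent(request: str) -> str:
    return "routed to human supervisor"

AGENTS = {
    "refund": refund_agent,
    "billing": billing_agent,
    "other": escalation_agent,
}

def llm_classify(request: str) -> str:
    """Placeholder for an LLM routing call; here, a keyword heuristic."""
    if "refund" in request.lower():
        return "refund"
    if "charge" in request.lower() or "invoice" in request.lower():
        return "billing"
    return "other"

def orchestrate(request: str) -> str:
    task = llm_classify(request)                        # broad LLM logic
    return AGENTS.get(task, escalation_agent)(request)  # narrow specialist

print(orchestrate("I was double charged on my invoice"))
```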
[01:08:00] Contact centers are a good example. So that's a big one I'd call out. The second one is the multimodal leap. More and more, video, images, and audio are going to become a bigger part of how people engage with these models, and audio is probably one of the most interesting. The way you'll be able to speak to them, interact with them, and visualize them is going to be a really interesting moment for 2026, and I don't think it will all be text-based as it has been historically. >> Mhm. >> And then maybe one other thing to mention. >> No, I was going to ask Alex for feedback. But go ahead, finish up, Matt. >> Yeah, the third one I'll call out, because we've touched on it a couple of times this episode, is what we call either the mirror world or RL gyms. I don't think that's a well-understood concept for many folks in the audience, but think of it as creating simulated environments, digital twins, for tasks you might want to test. Maybe that's a coding
[01:09:01] environment; maybe that's a contact center, as we've used a couple of times. It allows you to simulate a series of function calls, tasks, or environments so that if you're going to train a model on a task, you can test how it's going to work, say in a manufacturing environment, before you roll it out to your actual physical world. And we're seeing that become a more and more interesting topic with both the model builders and the enterprises.
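As a concrete picture of an "RL gym," here is a tiny simulated contact-center task in the conventional reset/step style. It is deliberately toy-sized and entirely hypothetical: real gyms wrap actual tools and function calls, not a three-action script.

```python
# Toy "RL gym": a simulated contact-center ticket an agent can practice on
# before touching production. States, actions, and rewards are invented.
import random

class RefundTicketEnv:
    ACTIONS = ("ask_clarifying_question", "issue_refund", "escalate")

    def reset(self) -> dict:
        self.clarified = False
        self.eligible = random.random() < 0.6  # hidden ground truth
        return {"message": "Customer requests a refund."}

    def step(self, action: str):
        """Returns (observation, reward, done)."""
        if action == "ask_clarifying_question" and not self.clarified:
            self.clarified = True
            return ({"message": "Order details provided."}, 0.1, False)
        if action == "issue_refund":
            reward = 1.0 if (self.eligible and self.clarified) else -1.0
            return ({"message": "Ticket closed."}, reward, True)
        # escalation is safe but unrewarded
        return ({"message": "Handed to supervisor."}, 0.0, True)

env = RefundTicketEnv()
obs = env.reset()
obs, r, done = env.step("ask_clarifying_question")
obs, r, done = env.step("issue_refund")
print(r)  # +1 if the hidden ticket was eligible, else -1
```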
>> I want to go around and ask some final questions of Matt. Alex, you want to kick us off? >> Yeah. I think the most interesting crux of what we're discussing here is: what is the future of human expertise? For that matter, does human expertise have a future? And assuming it does, what's the half-life? >> It's cooked. >> What's the half-life of the value of human expertise? To put that in a question for Matt: of all the forms of human expertise, of all
[01:10:01] the labor categories and job roles that exist in the economy today, which do you think will be the last three to disappear, to ultimately succumb to AI? What are the last three to survive? >> Last expert standing. Okay. >> That's right. >> I'll go back to where I started the episode. I think a lot of the commentary on mass shifts ignores the actual function of jobs in society today. Take sectors, for example: oil and gas. A lot of the functional expertise there, geo-seismic work, if you look at seismic engineers, people drilling on oil and gas sites, that is a human function; you do need them. Or take real estate as another example: humans actually help with selection. You can go down a whole list of different areas. I think there are sectors where you're going to see more disruption near term; I'd call out a couple of them: BPOs, legal services. I think media is a fast-changing area. But
[01:11:02] I'm also not sure those lead to negative employment consequences. Take media; it's a really interesting one. Five, six, eight years ago, I think media as a category really struggled in a lot of ways, paid media being an example. And in the last couple of years you've seen Substack, Medium, all these blogs become much more interesting; you have way more media entrepreneurs. So the function changes, and where the money comes from changes, but it has not changed total employment. Look, I understand a lot of the skepticism that says AI is going to radically change everything, but if you look at American society over the last 100 years, something like 25% of every high school class goes into a field that did not exist when they were in high school. And the reason that persists is that people go into the working world understanding the tools they have,
[01:12:01] and thinking about what they can create from them. One of my favorite statistics, which I saw The Wall Street Journal report a couple of weeks ago, is that 20% of US employment right now is digital-ecosystem jobs, and something like 9% of US citizens are full-time social media influencers. It's mind-boggling to me. >> Yeah. >> Again, this is the changing nature of work, and I think that pattern will persist. The core of what will change is that the process of looking up information across multiple systems and documents is going to become less valuable. But I think all the jobs that involve human interaction and physical work will stay. I actually think one of the most interesting things in the next couple of years is that the job ecosystem around data centers, electricians, et cetera, is going to become way more in demand. >> Optimistic. >> I was actually sitting on another panel
[01:13:00] with someone who runs a recruiting company. They were saying that job profile will 2x, 3x, 4x over the next couple of years, and that will have pretty interesting implications for the education system and everything else. So I don't think it disappears; we will see an evolution. >> I meant the humanoid-robot electrician and plumber. Alex, very quickly, what are your three last-standing human roles? >> I'll present multiple competing hypotheses. >> Briefly. >> [laughter] >> One hypothesis is that it's the politician, because they have to make the laws. >> Yeah, that's true. >> Another hypothesis is that it's the greatest intellects, the physicists or mathematicians. Even though, as we discuss on the pod, math and the sciences are all getting solved, still, to the extent that they represent the culmination of human intellectual accomplishment, maybe the greatest intellects will be the last to be automated. There's another
[01:14:01] school of thought that says no, it's the roles that involve the greatest need for human authenticity. Even though it's not actually a capabilities question, people nonetheless demand human contact, or something to that effect. So it's going to be the highest-touch job roles, where people just want to know there's a human counterparty on the other side of the interaction. That's a set of three hypotheses. >> Tastemakers will dominate. >> That's authenticity, bucket number three. >> Saylor said that word for word, actually. >> Yeah, on his boat we had that enjoyable sunset conversation. Thank you, Alex. Salim, do you want to go next with a closing question for Matt? >> You've covered some of it with the industries you're going after, and you guys have done some government work. Where in government functionality do you see the biggest opportunity for AI automation, efficiency, et cetera?
[01:15:02] >> Yeah. Everywhere. >> Look, I actually think this could be one of the really positive trends for society. I saw a study recently that AI-assisted permitting could cut energy and data center project implementation timelines by 50%. Think about housing: one of the biggest challenges for housing development in the US right now is NIMBY regulation and how complex it is to build housing because of the myriad of different regulations and zoning rules by location. Or take the OECD, which came out with a report that AI could shrink public-sector process cycle times by 70% on licensing, benefits approvals, and compliance, basically accelerating infrastructure deployment. So to me, the simplest thing AI can do is the project management and timelines around all spending and infrastructure deployment. That would be a really positive thing
[01:16:01] for society, in my mind. >> Amazing. Good question, Salim. Dave, why don't you close us out on the questions? >> Oh, I've got so many, but I'll pick the best. First, Matt, how many hours of video footage will there be of you one year from today compared to one year ago? I know we saw each other in Riyadh a few weeks ago. >> [laughter] >> And I know you're the thought leader on this whole bottleneck of AI getting into the enterprise. It feels like what we're doing right now: the footage of you that's out there today is all the CNBC, Bloomberg type, five-minute format, but here we're getting your real thoughts, and it's just so much better. So how many hours can we count on a year from today? >> Well, look, as of 12 months ago I had done almost no interviews of any kind, so this job has been fun on that front. What I enjoy about the podcast format is that it allows you to talk about more complex topics, and I find podcasts like this really interesting. So hopefully many more in the year to
come. >> Well, I'm hoping for at least a 10x on that. And my follow-up question: the avatar version of you that's also out there talking, is that a 2026 thing, or when? >> Yeah, it probably happens in 2026. I don't think it would be that hard to train an avatar off my public statements, so I think that'll be interesting. We're actually working in the sports space on the topic of avatar training, and I think it's an interesting area. You could imagine a lot of contexts where, rather than a chatbot interaction, people want to speak with someone they know via an avatar. I actually think that will become a more natural part of society, and a pretty interesting one. >> I totally agree. I just think the timeline could be as soon as two months, as far as I'm concerned. >> What makes you think it's not an avatar you're speaking to right now, Dave? >> That's a good question. >> [laughter] >> That seems very human, actually. I don't know. >> The best ones are. >> The orbs behind you kind of give it away. >> Yeah, they are pretty strange. That's not real. >> [laughter] [01:18:00] >> Matt, where do people find you? Where do people find Invisible? Who should come to Invisible to check out what you do and how you do it? >> Sure. So we have seven offices now: New York; San Francisco; Austin, Texas; DC; London; Poland; and Paris. I'm probably the easiest to find; we have an office right off Union Square, which is where I am at least half the time when I'm not on the road. [snorts] In terms of who should come to us, from this listener base in particular: any mid-cap or enterprise company that knows there is potential in its business, that knows AI can transform it in a positive way, and that is struggling to bring all the pieces together. The main thing I would say is there's no doubt, with everything Alex is asking, that the technology has made an enormous step change over the last couple of years. The hard thing is the change management, the operationalization, the metric tracking, the evaluation. It's bringing it all together. Our founder Francis has
[01:19:02] an idea: you have all the components to build a cake, but you don't have a cake. What we do is actually bake the cake in the end. We build you something that works. We make AI work, and we use all the modern tools to do it. >> Amazing. And the website? >> invisible.tech.ai. >> All right. Thank you, Matt. Salim, [clears throat] Dave, AWG, I'm going to see you guys in a couple of days for our 2026 predictions. Make them brilliant. It's going to be fun. >> A benchmark for tracking benchmarks. >> [laughter] >> That's yours? >> No, that's not the one I'm going to talk about. >> Okay. All right, guys. Have a great day. >> Every week, my team and I study the top 10 technology meta-trends that will transform industries over the decade ahead. I cover trends ranging from humanoid robotics, AGI, and quantum computing to transport, energy, longevity, and more. There's no fluff, only the most important stuff that
[01:20:00] matters, that impacts our lives, our companies, and our careers. If you want me to share these meta-trends with you, I write a newsletter twice a week, sending it out as a short 2-minute read via email. And if you want to discover the most important meta-trends 10 years before anyone else, this report’s for you. Readers include founders and CEOs from the world’s most disruptive companies and entrepreneurs building the world’s most disruptive tech. It’s not for you if you don’t want to be informed about what’s coming, why it matters, and how you can benefit from it. To subscribe for free, go to diamandis.com/meta-trends to gain access to the trends 10 years before anyone else. All right, now back to this episode. >> [music]