Original broadcast 2/3/26
Presented by Microsoft
Federal agencies are moving quickly from AI strategy to real implementation, shaped in part by Office of Management and Budget guidance and growing expectations that AI should deliver measurable mission outcomes. In this program, leaders from Microsoft, the Centers for Medicare & Medicaid Services, the Cybersecurity and Infrastructure Security Agency, and the Government Accountability Office share how agencies can become “frontier agencies” by improving productivity, strengthening decision-making, modernizing workflows, and maintaining trust and security as AI adoption expands. Across four conversations, a consistent theme emerges: success depends on mission alignment, data quality and governance, cybersecurity and risk management, workforce readiness, and infrastructure that can adapt as AI evolves.
A key point in the discussion is that agencies need clear ways to measure progress toward becoming a frontier organization. Carmen explains that government doesn’t use private-sector markers like profitability to define success. Instead, she points to productivity measures such as time saved by employees, increases in accuracy, improvements in quality of work, faster processing times, and better delivery of benefits and services to citizens. She also frames frontier progress as a broader national advantage: when federal agencies become more capable and more effective, the benefits extend to the citizens and communities they serve, as well as warfighters and federal personnel who depend on mission execution.
Carmen also outlines several foundational principles that separate effective innovation from innovation that becomes a distraction. She explains that AI should be used to augment human decision-making rather than replace it, allowing employees to work with more insight and confidence. She warns against “random acts of innovation,” where organizations chase new technologies without connecting them to mission needs, and she stresses that innovation must be tied to measurable mission impact in order to build trust and sustain momentum.
Another theme is process modernization. Carmen argues that many current workflows were built when modern AI capabilities didn’t exist, and agencies will limit results if they simply bolt AI onto outdated processes. Instead, she recommends reshaping business processes around what is now possible, with the goal of shortening cycle times and freeing employees from lower-value tasks so they can focus on higher-impact work.
The conversation closes with examples of measurable outcomes from early adoption. Carmen shares that one large agency’s Copilot pilot produced a 74% improvement in quality of work, a 75% productivity boost, and up to two hours saved per week for many employees. She also highlights an agency effort to process legal, contractual, and financial documents that was expected to take years but was completed in weeks. In another modernization example, developers using GitHub Copilot reported a 25% to 50% reduction in coding time when upgrading legacy systems, demonstrating how AI can accelerate one of government’s most persistent challenges: application modernization at scale.
Jeneen describes a practical example of improving analytic outcomes through experimentation. CMS hosted an event called the “chili cook off,” providing a limited dataset to participants to test analytic approaches. The agency learned that expanding access to quality data produced better results and improved CMS’s ability to generate operational insights, especially in program integrity work where identifying risk quickly is essential.
Nelli emphasizes that while governance and data preparation are critical, agencies should avoid getting stuck in endless pilot activity. She discusses the importance of selecting high-value use cases tied directly to mission outcomes, defining success criteria before testing, and then moving from proof of concept into production. She notes that agencies cannot afford to run hundreds of pilots simultaneously, so prioritization is essential. In her view, use cases should be evaluated based on mission importance, feasibility, implementation time, measurable success outcomes, and risk, including the risk of not adopting AI and falling behind.
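One way to make that kind of prioritization concrete is a simple weighted scorecard. The sketch below is illustrative only: the weights, the 1-to-5 scale, and the names `UseCase`, `score`, and `prioritize` are assumptions, not anything described in the program.

```python
from dataclasses import dataclass

# Hypothetical weights for the five criteria named in the discussion.
# They are illustrative assumptions and sum to 1.0.
WEIGHTS = {
    "mission_importance": 0.30,
    "feasibility": 0.20,
    "time_to_implement": 0.15,
    "measurability": 0.20,
    "risk": 0.15,
}


@dataclass
class UseCase:
    """A candidate AI use case, rated 1 (worst) to 5 (best) per criterion."""
    name: str
    mission_importance: int
    feasibility: int
    time_to_implement: int  # 5 = fastest to implement
    measurability: int      # 5 = clear success criteria defined up front
    risk: int               # 5 = most favorable risk profile, including the cost of not acting


def score(use_case):
    """Return the weighted score of a use case on the 1-to-5 scale."""
    return sum(weight * getattr(use_case, field) for field, weight in WEIGHTS.items())


def prioritize(use_cases, top_n):
    """Return the top_n use cases, highest weighted score first."""
    return sorted(use_cases, key=score, reverse=True)[:top_n]
```

An agency would substitute its own criteria and weights; the point of the sketch is only that success criteria and trade-offs are written down before any pilot begins.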
One of the most operationally significant examples discussed is the CMS fraud war room. Jeneen explains that the war room brought together law enforcement, investigators, and analytics professionals in a two-hour session, two days a week. This approach collapsed a traditionally slow, linear process into a faster cycle of reviewing high-risk cases, making decisions, and taking action. She shares that the fraud war room enabled CMS to suspend $1.8 billion in payments, and that the model expanded over time as more participants recognized its value. She also highlights that the team improved the approach after an early pilot period by incorporating feedback, adjusting the process, and reducing obstacles that slowed decision-making.
Nelli adds that AI success requires agencies to think beyond a single model choice, since model popularity can shift quickly. She argues that agencies should focus on secure platforms that support multiple model options, enable secure data integration, allow continuous monitoring, and make it easier to manage AI applications through their lifecycle rather than rebuilding from scratch each time. She also notes that AI-enhanced search, summarization, and natural language querying can help agencies manage data overload, allowing teams to find relevant information faster across structured and unstructured sources.
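The idea of not tying an application to a single model can be sketched as a thin routing layer. Everything here is hypothetical: `model_a` and `model_b` are stand-in functions rather than real model clients, and `ModelRouter` is an illustrative name, not a feature of any specific platform.

```python
from typing import Callable, Dict, List, Optional, Tuple


# Hypothetical stand-ins for real model clients: each maps a prompt to a reply.
def model_a(prompt: str) -> str:
    return f"[model-a] {prompt}"


def model_b(prompt: str) -> str:
    return f"[model-b] {prompt}"


class ModelRouter:
    """Routes prompts to whichever registered model is currently active."""

    def __init__(self) -> None:
        self._backends: Dict[str, Callable[[str], str]] = {}
        self._active: Optional[str] = None
        # Minimal monitoring hook: records (model name, prompt length) per call.
        self.call_log: List[Tuple[str, int]] = []

    def register(self, name: str, backend: Callable[[str], str]) -> None:
        """Add a model; the first one registered becomes the active model."""
        self._backends[name] = backend
        if self._active is None:
            self._active = name

    def use(self, name: str) -> None:
        """Switch the active model without changing any calling code."""
        if name not in self._backends:
            raise KeyError(f"unknown model: {name}")
        self._active = name

    def complete(self, prompt: str) -> str:
        """Send the prompt to the active model and log the call."""
        if self._active is None:
            raise RuntimeError("no model registered")
        reply = self._backends[self._active](prompt)
        self.call_log.append((self._active, len(prompt)))
        return reply
```

Because application code calls only `complete`, swapping models becomes a configuration change rather than a rebuild, and the call log gives a single place to attach monitoring.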
Bob Costello, CIO at the Cybersecurity and Infrastructure Security Agency, and Steve Faehl, US Government Security Leader at Microsoft, focus on how agencies are approaching AI in cybersecurity through multiple lenses. The discussion highlights that agencies face two immediate needs: using AI to strengthen cyber defense operations and securing AI tools and systems so they function as intended. Bob also adds a third factor that determines success: workforce readiness. He emphasizes that agencies must train and educate teams on how to use AI-enabled tools responsibly, especially in an environment where CISA’s workforce supports cybersecurity best practices across government.
Steve introduces a framework for structuring AI and cybersecurity priorities: security with AI, security of AI, and security from AI. Security with AI focuses on using AI to improve defense capabilities. Security of AI focuses on protecting AI systems and ensuring they operate safely and as intended. Security from AI focuses on how adversaries use AI to become more effective attackers. He stresses that agencies should start by identifying which of these problems they are trying to solve so they can choose the right tools and manage risk appropriately.
The conversation also explores how AI is changing the balance between attackers and defenders. Steve notes that cybersecurity has long been asymmetric: attackers need to exploit only one weakness, while defenders must secure everything. AI can help shift this dynamic by improving vulnerability discovery, vulnerability prioritization, and operational scale for defenders. Bob adds that defenders need more than a basic “patch everything” approach. He describes how AI-enabled tools can help agencies determine whether they are truly vulnerable to specific threats and identify situations where multiple low-severity vulnerabilities, when chained together, create serious risk. That allows security teams to prioritize work more intelligently while still maintaining urgency.
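The chaining concern can be illustrated with a small graph search. This is a simplified sketch, not a description of any CISA or Microsoft tool: the findings, position names, and the `find_chain` function are all hypothetical. Each finding is modeled as a step that moves an attacker from one position to another, and a breadth-first search looks for the shortest chain from an entry point to a critical asset.

```python
from collections import deque

# Hypothetical findings: (vuln_id, severity, attacker_position_before, position_after)
FINDINGS = [
    ("V1", "low", "internet", "web-server"),
    ("V2", "low", "web-server", "app-server"),
    ("V3", "low", "app-server", "database"),
    ("V4", "high", "internet", "test-box"),
]


def find_chain(findings, start, target):
    """Return the shortest list of vuln IDs leading from start to target, or None."""
    # Build an adjacency map: position -> [(vuln_id, next_position), ...]
    edges = {}
    for vuln_id, _severity, src, dst in findings:
        edges.setdefault(src, []).append((vuln_id, dst))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        position, path = queue.popleft()
        if position == target:
            return path
        for vuln_id, next_position in edges.get(position, []):
            if next_position not in seen:
                seen.add(next_position)
                queue.append((next_position, path + [vuln_id]))
    return None
```

In this toy data, the single high-severity finding leads nowhere important, while the three low-severity findings together reach the database. A severity-only ranking would patch V4 first and miss the more serious path.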
The discussion concludes with how AI could support faster risk decision-making. Bob describes moving beyond infrequent assessments toward near real-time risk posture visibility by synthesizing continuous telemetry. Both leaders emphasize that agencies cannot afford to stand still, and that creating a culture of experimentation—trying new solutions quickly, learning from failures, and moving on when necessary—will be critical to keeping pace with accelerating threats.
Dave Hinchman, Director of IT and Cybersecurity at the Government Accountability Office, and Wole Moses, Chief AI Officer, US Federal Civilian at Microsoft, discuss how infrastructure decisions determine whether AI can scale inside agencies. Dave explains that the federal government is still in the early stages of AI adoption. While tools and pilots are expanding, AI is not yet integrated broadly across operations, and agencies are still working through how to apply this technology in sustainable, scalable ways.
Wole explains that major cloud providers offer managed AI services that simplify many architecture decisions, including access to AI models, GPUs for training and inferencing, and tools for building AI applications. However, he stresses that agencies still must make key decisions about model selection and benchmarking, governance, observability, and how AI applications will integrate with existing mission systems. He notes that many of the highest-value AI deployments depend on integration with systems like case management and records platforms, meaning architecture planning must include connectivity and workflow integration from the beginning.
The conversation also focuses on integration strategies for legacy environments. Wole explains that agencies are building abstraction layers, such as REST APIs, to enable modern connectivity into older systems. He also points to AI-native connectivity approaches, including the Model Context Protocol (MCP), along with other integration technologies that can help connect systems that were never designed for AI-era workflows. These approaches can help agencies adopt AI while modernizing incrementally rather than waiting for full system replacement.
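A minimal sketch of such an abstraction layer, using only the Python standard library, might look like the following. The fixed-width record layout, the `/cases/<case_id>` path, and all names here are assumptions made for illustration; a real legacy system would have its own formats and access methods.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical legacy store: fixed-width records with a 6-character case ID,
# a 10-character status field, and an 8-character YYYYMMDD date.
LEGACY_RECORDS = {
    "C00001": "C00001" + "OPEN".ljust(10) + "20250114",
}


def parse_legacy_record(raw):
    """Translate one fixed-width legacy record into a JSON-friendly dict."""
    return {
        "case_id": raw[0:6].strip(),
        "status": raw[6:16].strip(),
        "opened": f"{raw[16:20]}-{raw[20:22]}-{raw[22:24]}",
    }


def lookup_case(case_id):
    """Return the parsed record for case_id, or None if it does not exist."""
    raw = LEGACY_RECORDS.get(case_id)
    return parse_legacy_record(raw) if raw else None


class CaseHandler(BaseHTTPRequestHandler):
    """Serves GET /cases/<case_id> as JSON on top of the legacy records."""

    def do_GET(self):
        parts = self.path.strip("/").split("/")
        is_case_path = len(parts) == 2 and parts[0] == "cases"
        record = lookup_case(parts[1]) if is_case_path else None
        body = json.dumps(record if record else {"error": "not found"}).encode()
        self.send_response(200 if record else 404)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080):
    """Build (but do not start) a local HTTP server exposing the legacy data."""
    return HTTPServer(("127.0.0.1", port), CaseHandler)
```

Calling `make_server().serve_forever()` would start the service locally. The legacy format stays untouched; newer applications, including AI tools, only ever see the JSON interface, which is what allows incremental modernization.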
The conversation concludes with a forward-looking view of what agencies should be preparing for next. Wole encourages agencies to begin building awareness and readiness for agentic AI, moving from today’s model of AI assistants toward environments where AI agents operate as collaborative teammates—and eventually toward more autonomous scenarios. Both leaders reinforce that long-term success will depend on combining adaptable infrastructure, strong governance, cybersecurity discipline, and workforce training so agencies can scale AI responsibly while maintaining trust and mission performance.