It’s hard to read the news or browse social media without hearing the words “Artificial Intelligence.” OpenAI’s 2022 announcement of ChatGPT pushed artificial intelligence firmly into mainstream consciousness. Already, a huge number of companies have begun offering products that claim to use AI, to be created by AI, or to be AI systems meant for use in other products. It’s not hyperbolic to say that a tech company without the word AI on its home page or in its domain name is now in the minority.
At the heart of GPT and related offerings are large language models (LLMs) capable of understanding natural language and performing complex tasks. Unlike previous advances in machine learning, these models are readily available through APIs, so organizations no longer need to hire a team of PhD-level data scientists and engineers to spend years gathering data, building, testing, and iterating on models, and then spend millions more deploying and maintaining them. Open source frameworks like LangChain and LlamaIndex have made LLMs even easier to adopt and have helped standardize common patterns like Retrieval Augmented Generation (RAG).
One indicator of just how easy it is to integrate LLMs into existing products is the proliferation of “AI Assistant” buttons popping up everywhere, from sales and marketing websites to in-application help in products like Salesforce and Zoom. And if you tune into social media streams like LinkedIn, you’ll be inundated with advertisements for products that use LLMs to do everything imaginable: handling customer support, analyzing medical records, offering psychiatric care, and helping you play the stock market (alarmingly, often without any discussion of risk).
Given the flurry of adoption, CIOs are faced with intense pressure to develop an AI strategy. A key component of any IT strategy is whether to build and integrate internally or develop with a partner.
Common questions to be answered include:
- Is integrating LLM technology just another product feature to roll out?
- Is achieving AI success as simple as creating a nice UI and plugging in another API?
- OpenAI, Microsoft, Google, Anthropic, and other players have heavily invested in trying to make these models trustworthy, but are they trustworthy for your use cases (more on this below)?
If the answers to these questions are mostly yes, IT leaders could conclude that the best strategy is to build in house with internal teams, possibly leveraging the AI/LLM features that the big incumbent software vendors are already adding to the products in place. The build versus buy decision has been debated for decades, and the path forward with AI is now part of that discussion.
Conversely, thousands of companies have emerged that are focused entirely on AI and on making it work across the full spectrum of industries and process categories. These companies have bright, hungry innovation teams that can push the AI envelope in ways the big incumbent software companies will struggle to match.
One of the reasons the build versus buy debate continues to rage in the world of technology is that there’s really no “right” answer. Each approach has benefits and perils, and over time, as the hype settles, a dominant approach may emerge.
To really make an informed decision, it’s worth reviewing how AI systems can fail, how they are evaluated by regulators, and how organizations will need to systematically manage these risks.
Here are some things to consider:
- Brand, Reputational and Financial Risk, Bias: AI chat functionality gone rogue can create reputational and financial exposure. Models can also perpetuate unfair biases without human guidance and extensive safeguards. How do you measure and prevent this?
- Validation, Maintenance, Measurement: Closely related to the above: how do you ensure the AI system operates as intended during and after development? To deploy AI safely, you need MLOps and LLMOps practices to confirm that the AI models you use (whether developed in house or sourced from third parties) are working as intended. This includes adapting your existing approaches to change management, the software development lifecycle (SDLC), operations, and monitoring. For example, if you’re building a system that uses Retrieval Augmented Generation (RAG), how does it perform against the RAG Triad metrics, and how does that performance change as the underlying models and configuration evolve? (A minimal evaluation sketch follows this list.)
- Data Quality, Data Pipelines: If you’re training an AI model, you will need data. Even if you are not training your own model and are instead using an off-the-shelf foundation model like GPT-4, you need to understand the provenance and limitations of its training data. After training and deployment, you will need to feed data to your model, either directly from users or from sources such as your own documents and databases. How do you ingest that data, ensure its accuracy, and deal with new versions and change? How do you measure the performance of queries against it and improve them over time?
- Security, Access Control: Models can be poisoned, stolen, or misused in a huge variety of ways. Risks include leakage of personal information contained in the underlying LLMs’ training data, as well as privilege escalation in the context of AI agents. Adversaries are getting smarter; does your internal security team have the knowledge and resources to keep up? And how do you control access to the underlying data the LLM consumes and presents back? (A simple retrieval-time permission check is sketched after this list.)
- Licensing, Vendor Lock-in: Using third-party AI services directly can get expensive, and usage limits add friction. Is GPT-4 the right solution for every use case, or could a smaller, less capable, but cheaper open source model work? If so, how do you evaluate the trade-offs while maintaining performance? Is being locked into OpenAI a risk? (One way to keep providers swappable is sketched after this list.)
- Hidden Costs: In-house teams need cloud resources, GPUs, and more. If you decide to implement your own model, or fine-tune an existing one, building and maintaining the necessary infrastructure can demand a significant ongoing investment.
- Compliance: Current and planned regulations affect how AI systems are categorized and controlled. Do you have a framework in place (for example, the NIST AI Risk Management Framework) that can help you manage the complexity of the system as your internal team implements it, and prove to regulators that you have adequately considered risk?
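To make the measurement question concrete, here is a minimal sketch of how a team might score a RAG pipeline on the RAG Triad (context relevance, groundedness, and answer relevance) using an LLM as a judge. The `ask_llm` callable, the `rag_triad` helper, and the prompt wording are illustrative assumptions rather than any vendor’s API; libraries such as TruLens offer more complete implementations of these metrics.

```python
from statistics import mean
from typing import Callable, List

# Assumed interface: `ask_llm(prompt) -> str` is any LLM call that returns a
# numeric rating from 0 to 10 as text. The prompts below are illustrative only.
def score(ask_llm: Callable[[str], str], prompt: str) -> float:
    return float(ask_llm(prompt)) / 10.0

def rag_triad(ask_llm: Callable[[str], str], question: str,
              contexts: List[str], answer: str) -> dict:
    """Score a single RAG interaction on the three RAG Triad metrics."""
    context_relevance = mean(
        score(ask_llm, f"Rate 0-10 how relevant this passage is to the question.\n"
                       f"Question: {question}\nPassage: {passage}")
        for passage in contexts
    )
    groundedness = score(
        ask_llm, f"Rate 0-10 how well the answer is supported by the passages.\n"
                 f"Passages: {contexts}\nAnswer: {answer}")
    answer_relevance = score(
        ask_llm, f"Rate 0-10 how well the answer addresses the question.\n"
                 f"Question: {question}\nAnswer: {answer}")
    return {"context_relevance": context_relevance,
            "groundedness": groundedness,
            "answer_relevance": answer_relevance}
```

Run over a fixed test set before and after each model, prompt, or retrieval change, scores like these give you a regression signal as the underlying LLM and configuration evolve.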
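On access control, one common pattern is to enforce existing document permissions at retrieval time, so the LLM never sees content the requesting user cannot already read. The sketch below is a simplified illustration: the `allowed_groups` attribute and `retrieve_for_user` function are hypothetical names, and a real deployment would hook into your identity provider and the ACLs of your document store.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class Document:
    text: str
    allowed_groups: Set[str] = field(default_factory=set)  # hypothetical ACL metadata

def retrieve_for_user(candidates: List[Document], user_groups: Set[str]) -> List[Document]:
    """Drop any retrieved document the user is not entitled to see
    before it is passed to the LLM as context."""
    return [doc for doc in candidates if doc.allowed_groups & user_groups]

# Usage: filter the retriever's raw hits, then build the prompt
# only from the documents the user is permitted to read.
hits = [Document("Q3 board minutes", {"executives"}),
        Document("Public product FAQ", {"everyone"})]
context = retrieve_for_user(hits, user_groups={"everyone", "support"})
```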
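On lock-in, a lightweight mitigation is to put a thin interface between your application and any specific model provider, so GPT-4 and a smaller open source model can be swapped and compared on the same evaluation set. The classes and names below (`ChatModel`, `OpenAIModel`, `LocalOpenSourceModel`, `answer_question`) are placeholders under that assumption; the point is the seam, not any particular SDK call.

```python
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class OpenAIModel:
    """Placeholder wrapper around a hosted API client (details omitted)."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call your hosted provider here")

class LocalOpenSourceModel:
    """Placeholder wrapper around a locally hosted open source model."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call your local inference server here")

def answer_question(model: ChatModel, question: str) -> str:
    # Application code depends only on the ChatModel interface,
    # so providers can be swapped without touching business logic.
    return model.complete(f"Answer concisely: {question}")
```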
The current AI revolution seems to be following a pattern familiar to anyone who lived through previous technology revolutions like cloud computing and big data: an early focus on speed to market, followed rapidly by concern over the ramifications. Evidence of this includes current and planned regulations such as the newly adopted European Union AI Act, President Biden’s executive order on AI, state regulations in Texas, Connecticut, and Illinois, and even local regulations in Seattle and New York City.
Another parallel: with increased risk come huge opportunities. CIOs, decision makers, and the legal, security, privacy, and compliance teams that support them will need to weigh all of the above as they shape AI strategy, in order to capture these opportunities and set their organizations up for success.
One thing is certain: AI and LLMs are here to stay, and companies must develop a strategy to implement, experiment, tune, and learn. Some will opt to build it themselves, others will engage partners and consultants, and still others will turn to startups to blaze a new path forward.