This blog post focuses on Microsoft’s Copilot product; however, some of the concepts I discuss here can be applied to any Copilot product or offering. As such, in an attempt to keep things clear, where I am discussing something purely in the context of Microsoft’s Copilot I shall refer to it as “MS Copilot”, while for more generic discussions I’ll simply refer to “Copilots”.
The explosion of Copilots
AI is undoubtedly proving itself to be a massively disruptive force in businesses (and society in general) right now. One of the ways this is most visible is through the introduction of Copilots; they are popping up everywhere.
For those living under the metaphorical rock for the past few months, let’s just take a moment to explore what I mean when I say Copilot. Needless to say, I am not talking about the second/relief pilot of an aircraft. I’m not an expert, but those have been around for decades and are not major disruptive forces in businesses.
No, the Copilot definition I’m talking about hasn’t even made it into the dictionary yet. It is the term being applied to generative artificial intelligence chatbots that use large language models to reason over data and produce text and other outputs in response to the user’s input.
One of the biggest Copilots in the context of business is Microsoft Copilot, previously known as Microsoft 365 Copilot. At NewOrbit we’re starting to make great use of MS Copilot to find and reason over the vast amount of data we hold as a business within the Microsoft 365 systems we use, such as Teams messages, Outlook emails and Office documents in SharePoint.
Microsoft Copilot extensibility
In the last blog I looked at how you can use Microsoft Graph Connectors to expand the data and knowledge available to MS Copilot. In this post I’ll continue to dive into some of the other extensibility options offered by Microsoft, specifically “Plugins”, which give MS Copilot the ability to do things as well as simply querying data. But as you’ll discover, while these plugins appear all fancy and shiny on the surface, I find they are probably not all that great. I also consider whether my human brain just isn’t sophisticated enough to appreciate the brilliance these plugins can offer.
Plugins
Microsoft allow the extension of MS Copilot via something called a “Plugin”, which allows both querying of data and performing actions on data and external systems. One such plugin is the API plugin, a mechanism developed to allow MS Copilot to interact with REST APIs via OpenAPI specifications.
Microsoft provide a great example and tutorial for API plugins using a sample Budgets API, which I’ll refer to in this blog. The API is a basic one with five endpoints that allow for the management of budgets, including endpoints to add money to and withdraw money from those budgets.
Not for plain old MS Copilot
The first thing I found when developing the API plugin is that API plugins cannot be used to extend MS Copilot itself; instead they are used to extend declarative agents. So on the face of it they don’t fit my brief at all, but it’s not all bad. In the previous blog post I discussed how one of our next steps would be to investigate developing a custom MS Copilot agent, and that is all a declarative agent is: a custom agent that requires limited code. As you’ll see, though, that does not necessarily translate to less work.
Great results at the surface
It should not be overlooked that setting up an API plugin is easy, especially when you have an API which is fully defined via OpenAPI with full summaries and descriptions for each endpoint. On the face of it, it delivers great results; just see the screen grab below. With nothing more than a few descriptions taken from the OpenAPI definition, the MS Copilot agent has been able to comprehend the purpose of the API as well as the data it returns, and tie that in with the user’s requests. That includes grabbing the previous answer, which listed all the budget names, and utilising it to understand the user’s request to update the “plugin budget”. Plus it has been able to work out that it needed to do a little arithmetic: the request was to increase a budget to a specified amount, whereas the available endpoint increases a budget by a specified amount.
Beyond the surface
Beyond the surface, however, we can quickly start to see where the limitations might lie. In my previous example I asked the AI to increase the plugin budget to $60,000, and the AI correctly worked out that this would be an increase of $10,000 and correctly used the full budget name of “Contoso Copilot plugin project”. Look at what happens, however, if we ask it the same thing in a fresh chat window without asking it to list the budgets first.
As you can see it doesn’t do quite so well. At first it thinks I am asking for a new budget to be created with a budget of $60,000. Then, when I try again and tell the AI to increase the budget by an amount, it understands my request but takes the budget name I’ve given it at face value rather than working out the true budget name that it will need to use.
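The two steps the agent managed in the first chat and missed in the fresh one (resolving a loose name to the real budget name, and turning "increase to" into "increase by") can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how MS Copilot works internally: the `Budget` class, the word-overlap matching and the second budget in the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    name: str
    available_funds: float


def resolve_budget(search_term: str, budgets: list[Budget]) -> Budget:
    """Pick the single budget whose name shares the most words with the search term."""
    # Ignore the generic word "budget" so "plugin budget" is matched on "plugin".
    terms = {word for word in search_term.lower().split() if word != "budget"}
    scored = [(len(terms & set(b.name.lower().split())), b) for b in budgets]
    best_score = max((score for score, _ in scored), default=0)
    matches = [b for score, b in scored if score == best_score and score > 0]
    if len(matches) != 1:
        raise LookupError(f"Could not resolve {search_term!r} to exactly one budget")
    return matches[0]


def increase_needed(budget: Budget, target: float) -> float:
    """Convert an 'increase to target' request into the 'increase by' amount."""
    delta = target - budget.available_funds
    if delta <= 0:
        raise ValueError("Target is not above the current available funds")
    return delta


# Sample data: the first budget name and the $50,000 -> $60,000 figures follow
# the post; the second budget is hypothetical and only there to be ruled out.
budgets = [
    Budget("Contoso Copilot plugin project", 50_000),
    Budget("Fabrikam website refresh", 20_000),
]

budget = resolve_budget("plugin budget", budgets)
print(budget.name, increase_needed(budget, 60_000))
```

In the first chat the list of budgets was already in the conversation, so the lookup step came for free; in the fresh chat nothing prompted the agent to fetch that list before acting.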
It might not all be lost
That’s not the end of the story though, because remember we’re looking at the out-of-the-box example; there is plenty of room for improvement.
The main place for improvement is the API plugin manifest file, which allows for a greater breadth of instructions to the AI. These can help it understand the endpoints (or functions) available to it and give it context as to when certain functions might be unsuitable.
So, let’s try adding some instructions to these functions to help avoid the two issues we saw last time:
{
  "name": "GetBudgets",
  "description": "Returns details including name and available funds of budgets, optionally filtered by budget name",
  "states": {
    "reasoning": {
      "instructions": [
        "This function is used to get all budgets or budgets based on budget name, only use this function with a parameter when you are certain on the EXACT budget name, the user must have been explicit that they have given you the EXACT budget name (for example placing the budget name in quotes), otherwise use it without any parameters and find the closest budget name to the search term you've been given.",
        "The function takes a budget name as input and returns details including name and available funds of budgets, optionally filtered by budget name."
      ]
    }
  }
}
As you can see I’ve added a load of instructions to try and help smooth out some of the missteps the AI previously made. These new instructions include telling the AI not to use the CreateBudget function to increase a budget, as well as adding an instruction to check the budget name first.
Let’s see how that gets on then:
Actually, I don’t need to include another example here, the AI did the same again. Not great.
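Since every function entry shown here has the same shape, the instructions can be attached with a small script rather than by hand-editing the manifest. The following is a minimal sketch, assuming the manifest has been loaded into a Python dict with a top-level `functions` list; that key, the helper name `add_instructions` and the shortened instruction text are assumptions for illustration, and only `name`, `description` and `states.reasoning.instructions` are taken from the snippet above.

```python
import json


def add_instructions(manifest: dict, function_name: str, instructions: list[str]) -> None:
    """Append reasoning instructions to one function entry, creating the nesting if needed."""
    for function in manifest.get("functions", []):
        if function.get("name") == function_name:
            reasoning = function.setdefault("states", {}).setdefault("reasoning", {})
            reasoning.setdefault("instructions", []).extend(instructions)
            return
    raise KeyError(f"No function named {function_name!r} in the manifest")


# Trimmed-down stand-in for the plugin manifest (the "functions" key is assumed).
manifest = {
    "functions": [
        {
            "name": "GetBudgets",
            "description": "Returns details including name and available funds of budgets, optionally filtered by budget name",
        }
    ]
}

add_instructions(
    manifest,
    "GetBudgets",
    ["Only pass a budget name when the user has given the EXACT name; otherwise call without parameters."],
)
print(json.dumps(manifest, indent=2))
```

This only makes the experiment quicker to repeat; it does nothing to change how the agent interprets the instructions once they are in place.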
To do two or not do two
Let’s look at another example, where I ask the bot to transfer some money from one budget to another.
As you can see it works out that it needs to add some budget to the website budget but does not remove budget from the plugin budget. Once again I try to add some helpful instructions to help it out:
{
  "name": "ChargeBudget",
  "description": "Charge an amount to a budget with a specified name, removing the amount from available funds",
  "states": {
    "reasoning": {
      "instructions": [
        "This function is used to charge an amount to a budget with a specified name, removing the amount from available funds.",
        "The function takes a budget name and an amount to charge as input.",
        "The function returns a message indicating the success or failure of the operation.",
        "You should only call this function if you have a valid budget name and amount to charge, ensure that you validate the budget name exists or find the correct budget name using the `GetBudgets` function first (the user may have not used the exact budget name but has instead given you a close enough string).",
        "If you are asked to transfer an amount to a budget, use this function in conjunction with the `ExtendBudget` function, you MUST use **both** functions to complete a transfer, otherwise state that you are unable to transfer money."
      ]
    }
  }
},
{
  "name": "ExtendBudget",
  "description": "Add an amount to a budget with a specified name, adding the amount to available funds",
  "states": {
    "reasoning": {
      "instructions": [
        "This function is used to add an amount to a budget with a specified name, adding the amount to available funds.",
        "The function takes a budget name and an amount to add as input.",
        "The function returns a message indicating the success or failure of the operation.",
        "You should only call this function if you have a valid budget name and amount to add, ensure that you validate the budget name exists or find the correct budget name using the `GetBudgets` function first (the user may have not used the exact budget name but has instead given you a close enough string).",
        "If you are asked to charge a budget, use this function in conjunction with the `ChargeBudget` function.",
        "If you are asked to transfer an amount to a budget, use this function **in conjunction** with the `ChargeBudget` function, you MUST use **both** functions to complete a transfer, otherwise state that you are unable to transfer money."
      ]
    }
  }
}
But, once again, this is fruitless! So I try being more explicit and verbose in my request to the AI and ask it simply to remove money from one budget and add some money to another.
As you can see even this is too much for the AI and we have the same result which leads to my belief that these agents simply cannot call two functions at once. Quite a limitation.
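For reference, the behaviour being asked of the agent is small when written out by hand: a transfer is a charge against one budget followed by an extension of another. Below is a minimal sketch against an in-memory stand-in for the Budgets API. The `BudgetStore` class, the `transfer` helper, the website budget's name and all the amounts are hypothetical; only the two operation names mirror the `ChargeBudget` and `ExtendBudget` functions above.

```python
class BudgetStore:
    """In-memory stand-in for the Budgets API (not the real service)."""

    def __init__(self, funds: dict[str, float]) -> None:
        self._funds = dict(funds)

    def get_budgets(self) -> dict[str, float]:
        return dict(self._funds)

    def charge_budget(self, name: str, amount: float) -> None:
        """Mirror of ChargeBudget: remove an amount from available funds."""
        if amount <= 0:
            raise ValueError("Amount must be positive")
        if self._funds[name] < amount:
            raise ValueError(f"Insufficient funds in {name!r}")
        self._funds[name] -= amount

    def extend_budget(self, name: str, amount: float) -> None:
        """Mirror of ExtendBudget: add an amount to available funds."""
        if amount <= 0:
            raise ValueError("Amount must be positive")
        self._funds[name] += amount


def transfer(store: BudgetStore, source: str, target: str, amount: float) -> None:
    """A transfer is two calls: charge the source, then extend the target."""
    store.charge_budget(source, amount)
    try:
        store.extend_budget(target, amount)
    except Exception:
        # Put the money back so a failed second call does not lose funds.
        store.extend_budget(source, amount)
        raise


store = BudgetStore({
    "Contoso Copilot plugin project": 60_000,
    "Fabrikam website refresh": 20_000,
})
transfer(store, "Contoso Copilot plugin project", "Fabrikam website refresh", 5_000)
print(store.get_budgets())
```

The agent in my tests only ever made the second of these two calls, which is exactly the half of the job that leaves the books unbalanced.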
Summary
Let’s summarise where we are and what we have then. API plugins for MS Copilot seem a great and easy way to take existing APIs and integrate them into MS Copilot (overlooking the fact they cannot hook into the default MS Copilot agent). Many existing APIs will already have OpenAPI definitions so will easily fit in, and their integration can be further tuned via additional instructions, so on the face of it they look great.
The limitations, however, are numerous. One example is that if the agent wants to query things it’s forced to use the API available to it, which usually means the user must be quite explicit because it can’t do the usual loose retrieval/lookup that we’ve come to expect from RAG-powered LLMs. I did investigate trying to combine the graph connector I discussed in the previous blog with an API plugin, but my experience was that even when told to do so via the instructions, the agent was not able to tie together the items found in the connector with the operations available to it via the plugin. That leaves the user only able to query the data via the limited API we can describe in OpenAPI.
The more important limitation is that the API plugins can only perform one operation at once, which is highly limiting. As things stand we’ll never be able to fully harness the power of AI to reason over the multitude of things available via an API and offer complex and highly valuable operations that a human would never think to perform.
Having said all this, there is another side to it that is worth closing this blog on. I started writing this blog in December 2024 and a Twitter/X tweet/post caught my eye. Now while Matt Pocock is a self-confessed TypeScript “Wizard”, he is not an AI expert, but I think his post still complements a key closing thought of mine.
The first rule of building something with AI:
If it's possible without AI, DON'T USE IT.
Lot of folks getting this wrong right now.
Matt Pocock (@mattpocockuk), December 20, 2024
The examples I’ve worked through today have been simple and relatable. In my example a “relatable” action has been to transfer money between budgets. But the API can do that already very easily via two endpoints, and most if not all end users will understand which endpoints need to be used to transfer money between budgets, so they will likely simply do that. Plus in the real world there is likely already a UI for the API, so adding it to Copilot is not really going to add much value.
So with that thought in our heads, has my example been a poor test with which to evaluate the API plugin functionality? Have I been unfair? Is there immense power in these plugins that my human brain simply cannot imagine or comprehend (and thus I cannot demonstrate it here)?
I don’t know the right answers to these questions, if there are “right” answers to them at all. My opinion is that for as long as these plugins cannot perform multiple operations at once, they do not possess the “immense” power I describe above. Once they can perform multiple operations then perhaps, just maybe, there might be something worth pursuing here.