Debugging API Performance Problems
In this article
A performance problem typically stems from a call to an endpoint taking longer than expected. Today, we have come to expect everything to be processed immediately. Whether you're tasked with taking over an existing API or developing one from scratch, these problems can rear their heads at any time. Over the course of a project, when an API is built piece by piece with the help of a team of developers, the impact of each modification can easily be lost in relation to the rest of the application.
Imagine having a set of micro-services responsible for implementing a Kanban board website. One service could be responsible for holding the information related to each card. Another service could be responsible for controlling the flow of the board. Another service could be responsible for notifications to users when a card moves. Any given user of the application does not fully know the level of complexity behind such a simple-looking website!
With that picture in mind, what do you do when a customer of the website mentions to the development team that loading the page takes longer than expected after the last release?
Now that we've set the stage for the issue at hand, let's walk through how to narrow down which areas we should start looking at. If you don't have in-depth knowledge of the entire application, it's great to think about the overall picture of the system. We have a website that communicates with several APIs to deliver its content. Breaking it down further, we have APIs that need to store information about the current state of each card, properties on each card, etc. Taking that knowledge, a great place to start by monitoring the application when it's running is to observe which calls are being made. If the APIs use SQL Server to store data, there is a perfect tool to do just that.
SQL Profiler is an application that is included with SQL Management Studio and provides this functionality. When running Profiler, you start by connecting to the database you want to monitor. Once you select a database and start a trace, you can now start using the application to monitor the queries issued against the database.
Analyzing the results
Now to make sense of the results. The main evidence we are looking for is how often does our application make queries against the database to provide the page to users? A sample query to group like-queries together could look something like what is pictured to the left.
With results like those below, we can now connect the dots with the operation that we performed in the application. If we were just loading the initial Kanban board and saw results like this, we have plenty of avenues to pursue.
From the results, it looks like we're calling to get the card information one card at a time then calling to get the different column states, and finally the different people assigned to the cards. If the database lived on the same machine, results like this would not be as big of an issue. Since the database lives on a different machine and we must query for these results over a network connection, there can be considerable delay causing the page to load much slower than expected.
At this point, we can look through the code to better understand how we are making these calls and how best to refactor them. In this case, the best solution would be to retrieve the cards from the database in a bulk operation to save on the number of round trips we have to make, resulting in a quicker page load for the user.