From Memory to Evidence: Understanding Cloud Usage at Scale

With over 1,800 cloud projects delivered by one organization, even a simple question becomes hard to answer: which cloud services are actually used, and where? As projects grow, teams change, and repositories multiply, tracking which AWS or Microsoft Azure services are used in each project, or across industries, becomes a real challenge. This knowledge often lives only in people’s heads or is scattered across codebases, making it harder to find than expected.

For me, this is not just curiosity. I needed this information for three practical reasons.
First, I want to understand patterns across industries and customers. One customer uses a lot of storage and messaging. Another leans more toward identity, monitoring, and eventing. With many repos, you cannot keep that map in your head.
Second, RFP questions can be very specific. They don’t ask “Do you have cloud experience?” They ask, “How many projects have you delivered with DynamoDB?” or “Do you have experience with Azure Functions and Event Grid?” and they want an answer fast. Doing a manual repo audit every time is slow.
Third, cloud vendor competencies (or similar programs) often require evidence. It’s not enough to say “we used this service”; you need to point to real projects and prove you have done the work.
So I decided to build a Python app to help me in this journey. You can find it here: https://github.com/vunvulear/Cloud.ServiceAggregator


The goal is simple: give me a quick report of what cloud services appear to be used in a repository. The app can scan the repo and search for signals like SDK imports, IaC templates, configuration files, deployment scripts, and naming patterns. It’s not magic, and it’s not trying to “understand” the whole architecture. But it is good at the boring part: collecting hints and turning them into a list you can review.
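To make that concrete, here is a minimal sketch of this kind of signal-based detection, assuming a simple pattern-per-service map. The service names, regexes, and file extensions below are my own illustrative assumptions, not the actual detectors in Cloud.ServiceAggregator:

```python
import re
from pathlib import Path

# Illustrative signals only; the real tool's detectors are more complete.
SIGNALS = {
    "AWS DynamoDB": [
        r"boto3\.(client|resource)\(['\"]dynamodb['\"]\)",  # Python SDK call
        r"AWS::DynamoDB::Table",                            # CloudFormation resource
        r'resource\s+"aws_dynamodb_table"',                 # Terraform resource
    ],
    "Azure Functions": [
        r"import\s+azure\.functions",                       # Python SDK import
        r'"kind":\s*"functionapp',                          # ARM template kind
    ],
    "Azure Event Grid": [
        r"azure\.eventgrid",                                # Python SDK package
        r"Microsoft\.EventGrid/",                           # ARM resource type
    ],
}

SCAN_EXTENSIONS = {".py", ".cs", ".ts", ".js", ".json", ".yaml", ".yml", ".tf", ".bicep"}

def scan_repo(root: str) -> dict[str, list[str]]:
    """Walk a repo and record which files match each service's signals."""
    hits: dict[str, list[str]] = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in SCAN_EXTENSIONS:
            continue
        text = path.read_text(errors="ignore")
        for service, patterns in SIGNALS.items():
            if any(re.search(p, text) for p in patterns):
                hits.setdefault(service, []).append(str(path))
    return hits

if __name__ == "__main__":
    for service, files in sorted(scan_repo(".").items()):
        print(f"{service}: {len(files)} file(s)")
```

The output is only a set of hints, exactly as described above: a reviewable list, not an architecture diagram.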
What I like most is the feeling of clarity. Instead of jumping between folders and guessing, I get a structured output I can share with teammates. It becomes easier to say: “This repo touches these AWS services and these Azure services.” If I need more detail, I can go deeper, but at least I start with a map.
It also changes the conversation. When someone asks, “Do we have experience with X?”, I can answer with data, not just from memory. And when preparing for vendor programs, I can filter repos that use a target service and focus on the best examples.
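For example, assuming each scan is saved as one JSON report per repo with a top-level `services` list (an assumed layout, not necessarily the tool’s real output format), answering “Do we have experience with X?” becomes a few lines:

```python
import json
from pathlib import Path

def repos_using(service: str, reports_dir: str = "reports") -> list[str]:
    """Return repo names whose saved report mentions the given service."""
    matches = []
    for report in Path(reports_dir).glob("*.json"):
        data = json.loads(report.read_text())
        if service in data.get("services", []):
            matches.append(report.stem)  # file name doubles as repo name
    return matches

# e.g. answer an RFP question directly from the reports:
print(repos_using("AWS DynamoDB"))
```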
I’m still improving it. Service names vary, and code styles differ. Sometimes a repo uses a service but hides it behind an internal wrapper, which makes detection harder. Even with these limitations, the tool already saves time and reduces stress as deadlines approach.
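To illustrate why wrappers are hard (a hypothetical example; `internal_storage` is a made-up module name): the only SDK signal lives in the wrapper’s repo, so the consuming repo shows nothing a pattern scan can catch.

```python
# internal_storage.py -- in a shared library repo; the only place the
# SDK (and any detectable signal) appears.
import boto3

_table = boto3.resource("dynamodb").Table("orders")

def save_order(order: dict) -> None:
    _table.put_item(Item=order)


# consumer code -- in the project repo; no "boto3" or "dynamodb" text
# anywhere, so a signal-based scan of this repo misses the dependency.
from internal_storage import save_order

save_order({"id": "42", "status": "shipped"})
```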
If you have the same problem, feel free to try it and adapt it as needed. Share your ideas, whether that means new detectors, better reports, or new output formats; I’m eager to collaborate and learn. My goal with this project is simple: make cloud knowledge easier to find before you urgently need it. Join me and let’s improve it together!
