From Memory to Evidence: Understanding Cloud Usage at Scale

With over 1,800 cloud projects delivered by one organization, even a simple question becomes hard to answer: which cloud services are actually used, and where? As projects grow, teams change, and repositories multiply, tracking which AWS or Microsoft Azure services are used in each project, or across industries, becomes a real challenge. This knowledge often lives only in people’s heads or is scattered across codebases, making it harder to find than expected.

For me, this is not just curiosity. I needed this information for three practical reasons.
First, I want to understand patterns across industries and customers. One customer uses a lot of storage and messaging. Another leans more toward identity, monitoring, and eventing. With many repos, you cannot keep that map in your head.
Second, RFP questions can be very specific. They don’t ask “Do you have cloud experience?” They ask, “How many projects have you delivered with DynamoDB?” or “Do you have experience with Azure Functions and Event Grid?” and they want an answer fast. Doing a manual repo audit every time is slow.
Third, cloud vendor competencies (or similar programs) often require evidence. It’s not enough to say “we used this service”; you need to point to real projects and prove you have done the work.
So I decided to build a Python app to help me in this journey. You can find it here: https://github.com/vunvulear/Cloud.ServiceAggregator


The goal is simple: give me a quick report of what cloud services appear to be used in a repository. The app can scan the repo and search for signals like SDK imports, IaC templates, configuration files, deployment scripts, and naming patterns. It’s not magic, and it’s not trying to “understand” the whole architecture. But it is good at the boring part: collecting hints and turning them into a list you can review.
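To make that concrete, here is a minimal sketch of this kind of signal-based detection, assuming a simple pattern-per-service map. The service names, regexes, and file extensions below are my own illustrative assumptions, not the actual detectors in Cloud.ServiceAggregator:

```python
import re
from pathlib import Path

# Illustrative signals only; the real tool's detectors are more complete.
SIGNALS = {
    "AWS DynamoDB": [
        r"boto3\.(client|resource)\(['\"]dynamodb['\"]\)",  # Python SDK call
        r"AWS::DynamoDB::Table",                            # CloudFormation resource
        r'resource\s+"aws_dynamodb_table"',                 # Terraform resource
    ],
    "Azure Functions": [
        r"import\s+azure\.functions",                       # Python SDK import
        r'"kind":\s*"functionapp',                          # ARM template kind
    ],
    "Azure Event Grid": [
        r"azure\.eventgrid",                                # Python SDK package
        r"Microsoft\.EventGrid/",                           # ARM resource type
    ],
}

SCAN_EXTENSIONS = {".py", ".cs", ".ts", ".js", ".json", ".yaml", ".yml", ".tf", ".bicep"}

def scan_repo(root: str) -> dict[str, list[str]]:
    """Walk a repo and record which files match each service's signals."""
    hits: dict[str, list[str]] = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in SCAN_EXTENSIONS:
            continue
        text = path.read_text(errors="ignore")
        for service, patterns in SIGNALS.items():
            if any(re.search(p, text) for p in patterns):
                hits.setdefault(service, []).append(str(path))
    return hits

if __name__ == "__main__":
    for service, files in sorted(scan_repo(".").items()):
        print(f"{service}: {len(files)} file(s)")
```

The output is only a set of hints, exactly as described above: a reviewable list, not an architecture diagram.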
What I like most is the feeling of clarity. Instead of jumping between folders and guessing, I get a structured output I can share with teammates. It becomes easier to say: “This repo touches these AWS services and these Azure services.” If I need more detail, I can go deeper, but at least I start with a map.
It also changes the conversation. When someone asks, “Do we have experience with X?”, I can answer with data, not just from memory. And when preparing for vendor programs, I can filter repos that use a target service and focus on the best examples.
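For example, assuming each scan is saved as one JSON report per repo with a top-level `services` list (an assumed layout, not necessarily the tool’s real output format), answering “Do we have experience with X?” becomes a few lines:

```python
import json
from pathlib import Path

def repos_using(service: str, reports_dir: str = "reports") -> list[str]:
    """Return repo names whose saved report mentions the given service."""
    matches = []
    for report in Path(reports_dir).glob("*.json"):
        data = json.loads(report.read_text())
        if service in data.get("services", []):
            matches.append(report.stem)  # file name doubles as repo name
    return matches

# e.g. answer an RFP question directly from the reports:
print(repos_using("AWS DynamoDB"))
```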
I’m still improving it. Service names vary, and code styles differ. Sometimes a repo uses a service but hides it behind an internal wrapper, which makes detection harder. Even with these limitations, the tool already saves time and reduces stress as deadlines approach.
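To illustrate why wrappers are hard (a hypothetical example; `internal_storage` is a made-up module name): the only SDK signal lives in the wrapper’s repo, so the consuming repo shows nothing a pattern scan can catch.

```python
# internal_storage.py -- in a shared library repo; the only place the
# SDK (and any detectable signal) appears.
import boto3

_table = boto3.resource("dynamodb").Table("orders")

def save_order(order: dict) -> None:
    _table.put_item(Item=order)


# consumer code -- in the project repo; no "boto3" or "dynamodb" text
# anywhere, so a signal-based scan of this repo misses the dependency.
from internal_storage import save_order

save_order({"id": "42", "status": "shipped"})
```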
If you have the same problem, feel free to try it and adapt it as needed. Share your ideas, whether that means new detectors, better reports, or new output formats; I’m eager to collaborate and learn. My goal with this project is simple: make cloud knowledge easier to find before you urgently need it. Join me and let’s improve it together!
