The last posts covered blueprints of a money laundering detection
solution for Microsoft Azure and AWS. You can check the previous posts if you want
to find out more about the proposed solutions:
- Blueprint of a transaction monitoring solution on top of AWS and custom ML algorithm for money laundering
- Blueprint of a transaction monitoring solution on top of Azure and custom ML algorithm for money laundering
- Comparison of a money laundering solution built on top of AWS and Azure
The blueprints were designed with two key goals in mind:
1. Reduce the operations and management costs to a minimum
2. Reduce the configuration and development costs to a minimum by using out-of-the-box cloud services as much as possible
In this post, we compare the services used for each layer and
try to identify the key differentiating factors between them.
Ingest
For this purpose, we used Azure Event Hub and AWS Kinesis
Data Streams. Both services are capable of ingesting an event stream
over HTTPS. Azure Event Hub has the advantage of supporting the AMQP protocol,
which offers us a stateful connection between the bank data centres and us. On the
other side, using Amazon Kinesis, we can push payloads of up to 1MB, in
comparison with Azure Event Hub, where the maximum size of a single request is
limited to 256KB.
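To make the ingestion part concrete, here is a minimal sketch in Python of pushing a single transaction event into each service, using boto3 and the azure-eventhub SDK; the stream name, hub name and connection string are hypothetical:

```python
import json

import boto3
from azure.eventhub import EventData, EventHubProducerClient

transaction = {"transactionId": "tx-001", "amount": 125.50}  # hypothetical payload

# AWS: push a single record into a Kinesis Data Stream over HTTPS.
kinesis = boto3.client("kinesis", region_name="eu-west-1")
kinesis.put_record(
    StreamName="transactions",  # hypothetical stream name
    Data=json.dumps(transaction).encode("utf-8"),
    PartitionKey=transaction["transactionId"],
)

# Azure: send the same event to an Event Hub (AMQP under the hood).
producer = EventHubProducerClient.from_connection_string(
    conn_str="<event-hub-connection-string>",  # hypothetical
    eventhub_name="transactions",
)
with producer:
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(transaction)))
    producer.send_batch(batch)
```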
The audit and log events are retained inside Azure Event Hub
for up to 7 days, in comparison with AWS Kinesis, where retention is limited
by default to 1 day. This can be an essential factor when our backend system is
throttled and we need to define NFRs related to recovery time.
AWS Kinesis Data Streams has no built-in concept of consumer
groups; all consumers share the read throughput of the stream. Azure Event Hub
provides us with the concept of consumer groups, allowing us to extend the
platform and hook other listeners to the stream of audit and log events.
The monitoring capabilities of AWS Kinesis are fully
integrated with AWS CloudWatch, which enables us to monitor the stream instances. Azure
Event Hub has only basic monitoring capabilities, which are well integrated with
Azure Monitor.
Aggregator
To aggregate audit streams from multiple locations,
we would use AWS Kinesis Data Firehose or Azure Event Grid. At this moment in
time, Azure offers Azure Event Grid, which can be used with success to
aggregate content from multiple Azure Event Hubs, combining the data streams
with content from other sources and pushing the output to another instance of
Azure Event Hub.
This service enables us to aggregate all the transactions
into one stream of data and execute data filtering. Logs or audit data arriving
in the wrong format from some of the data centres is a common situation,
especially when the system lifecycle is independent in each subsidiary.
This can be handled easily using a filter that pushes the invalid content to
another stream, where data transformation can be executed.
AWS Kinesis does not support merging multiple Kinesis streams
out of the box. A classical approach using AWS Lambda is available,
but it requires substantial computation power and extra implementation cost
in comparison with a solution built on top of Azure Event Grid. Because of this,
the AWS approach relies on only one instance of AWS Kinesis Data
Streams, avoiding the use of AWS Lambda between multiple instances of AWS Kinesis
Data Streams. A sketch of this Lambda fan-in approach is shown below.
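For completeness, here is a minimal sketch (in Python, with a hypothetical merged stream name) of what the classical Lambda fan-in would look like: a function triggered by one source stream that forwards its records into a single merged stream. A production version would also need retry handling for throttled writes.

```python
import base64

import boto3

kinesis = boto3.client("kinesis")
TARGET_STREAM = "merged-transactions"  # hypothetical merged stream name

def lambda_handler(event, context):
    """Forward records from a source Kinesis stream into the merged stream."""
    records = []
    for record in event["Records"]:
        # Kinesis delivers record payloads to Lambda base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        records.append({
            "Data": payload,
            "PartitionKey": record["kinesis"]["partitionKey"],
        })
    if records:
        kinesis.put_records(StreamName=TARGET_STREAM, Records=records)
```

One such function would be attached to every source stream, which is exactly the extra compute and cost the Azure Event Grid approach avoids.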
Custom ML algorithm
Inside Azure, we can combine Azure Stream Analytics and the
Azure Machine Learning service in one solution, where the stream of data is
processed inside Azure Stream Analytics, applying a custom machine learning model
hosted and running inside Azure Machine Learning. There is no need to store the
data stream inside storage before applying the data analytics part.
For the machine learning part of the AWS package, we can use SageMaker,
which enables us to run our custom ML algorithm. At this moment in time,
SageMaker can consume data only from S3, in comparison with Azure Machine
Learning, which is well integrated with Azure Stream Analytics. Because of this
limitation, we are forced to push the transaction information from the output
of Kinesis Data Firehose to AWS S3 before feeding it to SageMaker.
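Once the model is trained on the data landed in S3 and deployed, scoring a transaction comes down to invoking the hosted endpoint. A minimal sketch with boto3, assuming a hypothetical endpoint name:

```python
import json

import boto3

# Invoke a custom model hosted behind a (hypothetical) SageMaker endpoint.
runtime = boto3.client("sagemaker-runtime", region_name="eu-west-1")

transaction = {"transactionId": "tx-001", "amount": 125.50}

response = runtime.invoke_endpoint(
    EndpointName="laundering-detector",  # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps(transaction),
)
score = json.loads(response["Body"].read())
print(score)
```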
Both machine learning solutions are robust and flexible,
offering a great UI and UX experience. From some perspectives, you
could say that Azure Machine Learning targets citizen data scientists, in
comparison with SageMaker, which looks like a solution closer to developers
and data scientists. At the same time, don't forget that both of them
support the R language, and Jupyter Notebook interfaces are available (SageMaker)
or can be imported (Azure Machine Learning). From a features perspective, both
platforms support anomaly detection, ranking, recommendation and many
more.
Laundering alerts
For transactions that are marked as suspect, both blueprints
use a message-based solution to send notifications to other systems.
For Azure, we used an ESB service offered out of the box, Azure Service Bus,
which provides us with the capability to use the subscriber concept. In this way,
we can hook multiple subscribers dynamically. For enterprise applications
where a reliable message broker is required, Azure Service Bus is the way to
go, and a good starting point for most situations.
Similar to Azure, AWS offers a wide variety of options
related to messaging solutions. Inside the blueprint, the initial concept was
based on one AWS SQS instance, where the AWS Lambda that consumes the
messages decides whether the subsidiary system needs to be notified. The
solution can be extended with success using Amazon MQ, which provides similar
functionality to an ESB system.
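As a minimal sketch of the publishing side in Python (the connection string, topic name and queue URL are hypothetical), an alert could be pushed to each messaging service like this:

```python
import json

import boto3
from azure.servicebus import ServiceBusClient, ServiceBusMessage

alert = {"transactionId": "tx-001", "reason": "suspicious-pattern"}

# Azure: publish the alert to a Service Bus topic; each subsidiary system
# receives its own copy through a subscription.
with ServiceBusClient.from_connection_string("<service-bus-connection-string>") as client:
    with client.get_topic_sender(topic_name="laundering-alerts") as sender:
        sender.send_messages(ServiceBusMessage(json.dumps(alert)))

# AWS: push the alert into the SQS queue consumed by the deciding Lambda.
sqs = boto3.client("sqs", region_name="eu-west-1")
sqs.send_message(
    QueueUrl="https://sqs.eu-west-1.amazonaws.com/123456789012/laundering-alerts",
    MessageBody=json.dumps(alert),
)
```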
Alerts processing
In both solutions, we rely on a serverless architecture that
can scale dynamically, where we pay only for what we consume. On AWS, we rely on
Lambda functions to run our logic. They are flexible enough to provide us with
connectivity to almost any kind of internal or external system. The same
functionality is implemented inside Azure using Azure Functions. From an
implementation point of view, there are differences related to how we need to
design our serverless solutions. Inside Azure Functions, we work with bindings
and triggers, in comparison with AWS, where the hooks are done using a handler
and a context. The concepts are similar, but the way you use them is a little different.
Both services offer similar functionality and feature sets.
The services are evolving so fast that a comparison between them becomes obsolete
in just a few days. We need to remember that even if they offer the same
functionality, the way you implement the logic inside AWS Lambda is different
from the way you do it inside Azure Functions, mainly because of the
handlers, bindings, triggers and context, as the sketch below illustrates.
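Here is a minimal side-by-side sketch in Python; the queue, topic, subscription and connection names are hypothetical, and the two functions would of course live in separate deployments:

```python
import json

# AWS Lambda style: a handler receives the raw event plus a context object,
# and the trigger (e.g. an SQS queue) is wired up outside the code.
def lambda_handler(event, context):
    for record in event["Records"]:  # batch of SQS alert messages
        alert = json.loads(record["body"])
        print(f"Processing alert {alert.get('transactionId')}")

# Azure Functions style (Python v2 programming model): the trigger and its
# bindings are declared on the function itself via decorators.
import azure.functions as func

app = func.FunctionApp()

@app.service_bus_topic_trigger(
    arg_name="msg",
    topic_name="laundering-alerts",        # hypothetical topic
    subscription_name="processor",         # hypothetical subscription
    connection="ServiceBusConnection",     # app setting holding the connection string
)
def process_alert(msg: func.ServiceBusMessage):
    alert = json.loads(msg.get_body().decode("utf-8"))
    print(f"Processing alert {alert.get('transactionId')}")
```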
Data store
The blueprints use a NoSQL, document-based approach. Azure
Cosmos DB and AWS DynamoDB are the canonical databases offered by the two cloud
providers. You will find that the Azure Cosmos DB offer includes multiple database
models, not only a document store and a key-value store. This does not mean that
inside AWS you don't have an option for graph storage or a wide-column store; AWS
offers them as separate services: Amazon Neptune and a managed Cassandra-compatible service.
The most significant difference between them is the consistency
model. AWS DynamoDB has only 2 consistency levels, in comparison
with Azure Cosmos DB, which has 5 different levels. Inside Azure, the cost
of running the solution is the same for all consistency levels, in
comparison with AWS DynamoDB, where the impact of the consistency level can be seen
in the running cost of the service. The sketch below shows how each service
exposes consistency.
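A minimal sketch in Python; the table name, account URL and key are hypothetical. DynamoDB picks the consistency per read, while Cosmos DB sets a default level on the client:

```python
import boto3
from azure.cosmos import CosmosClient

# DynamoDB: consistency is chosen per read, eventually consistent by default.
dynamodb = boto3.client("dynamodb", region_name="eu-west-1")
item = dynamodb.get_item(
    TableName="Transactions",                      # hypothetical table
    Key={"transactionId": {"S": "tx-001"}},
    ConsistentRead=True,  # strongly consistent read, consumes more capacity
)

# Cosmos DB: a consistency level is set on the client; five levels are
# available (Strong, Bounded Staleness, Session, Consistent Prefix, Eventual).
cosmos = CosmosClient(
    url="https://<account>.documents.azure.com:443/",  # hypothetical account
    credential="<key>",
    consistency_level="Session",
)
```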
Dashboard
The dashboards are hosted inside AWS EC2 and Azure Web Apps.
Azure Web Apps offers a more SaaS-like approach, where you only need to deploy the
web application and you are done. The configuration, infrastructure and
security parts are managed by Azure by default, and the development team can focus
on the content. Inside AWS, even if you are using EC2, you already have
templates that can configure your machine with the right Linux distribution and
Apache.
Another approach for the web interface would be to use microservices,
where you could host your web application inside AWS ECS or Azure Kubernetes
Service. The blueprint does not cover the load balancer and caching topics,
where both providers offer strong solutions: Application Load Balancer
together with Route 53 inside AWS, or Azure Load Balancer and Traffic Manager
inside Azure.
Bots support
For bot integration, the blueprints proposed the SaaS solutions from
both providers: AWS Lex and Azure Bot Service.
AWS Lex is well integrated with the AWS ecosystem and supports
a wide variety of technology stacks, with text and speech support. The
integration with AWS Lambda and mobile applications offers an out-of-the-box
solution for our case. The limitations that could affect our business use cases
are the multilingual support and the data set preparation and mapping, which are
not so easy. When you need AWS Lex inside a web application, things could be
slower than you would expect at the beginning.
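As an illustration of the integration on the AWS side, here is a minimal sketch of sending a user utterance to a Lex (V1) bot with boto3; the bot name, alias and user id are hypothetical:

```python
import boto3

# Send a text utterance to a (hypothetical) Lex bot and read the reply.
lex = boto3.client("lex-runtime", region_name="eu-west-1")

response = lex.post_text(
    botName="LaunderingAssistant",  # hypothetical bot name
    botAlias="prod",                # hypothetical alias
    userId="analyst-42",            # hypothetical conversation/user id
    inputText="Show me the suspicious transactions from today",
)
print(response["message"])
```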
The bot framework offered by Microsoft inside Azure has
support for multiple languages, is well integrated with multiple channels,
and is built on top of the LUIS API. The documentation is excellent, and the ramp-up
phase is much faster in comparison with AWS. The biggest drawback is the SDK
limitation: we have libraries for C# and Node.js, and for the rest we need to use
the REST API.
Reporting Tool
Both blueprints use reporting capabilities offered out of the box.
Power BI is well known inside the Microsoft ecosystem, being well integrated with
different systems. It's super easy to use, and the ML layer that can analyse your
report data can help users discover insights that they were not aware of until then.
Amazon QuickSight offers capabilities similar to
Power BI, including ML capabilities. An interesting fact is the payment mechanism,
which is per session. From a learning-curve perspective, the Amazon solution is
easier to use, but the number of features is not as high.
Blueprints comparison
Looking at the blueprints, they are similar in the type of
cloud services used as building blocks. The functionality and capabilities
of each cloud provider are almost the same. The differences are not large, and
you can accomplish the same things using both cloud providers. You still need to
be aware of the small differences that might affect the way you design your
solution. Good examples are the input connectors supported by AWS SageMaker and
the lack of an Azure Event Grid equivalent inside AWS.