Designing a Robust Data Architecture: Capturing data in streaming and batch mode, then storing it in Data Lakes and databases.
Setting up an ETL Process: Configuring ETL pipelines for extracting, transforming and loading data.
Real-time Data Analysis and Anomaly Detection: Implementing solutions for data analysis and anomaly detection, with a particular focus on credit card fraud prevention.
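To make the ETL objective above more concrete, here is a minimal sketch of the three stages in plain Python. It is an illustration only, not the project's actual pipeline: the column names, the sample rows and the in-memory sink are all assumptions, and a real deployment would read from and write to the cloud services listed further down.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical raw batch export; column names and values are assumptions.
RAW_CSV = """transaction_id,card_id,amount,currency,timestamp
t1,c-100,25.50,eur,2024-01-05T10:15:00Z
t2,c-100,,eur,2024-01-05T10:16:00Z
t3,c-200,1300.00,usd,2024-01-05T10:17:30Z
"""


def extract(raw_text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: drop rows with no amount, then normalise types and casing."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # incomplete record, skipped in this sketch
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        cleaned.append({
            "transaction_id": row["transaction_id"],
            "card_id": row["card_id"],
            "amount": float(row["amount"]),
            "currency": row["currency"].upper(),
            "timestamp": ts.replace(tzinfo=timezone.utc).isoformat(),
        })
    return cleaned


def load(records, sink):
    """Load: write records as newline-delimited JSON to a file-like sink."""
    for record in records:
        sink.write(json.dumps(record) + "\n")
    return len(records)


# In-memory stand-in for a data lake object or database table.
sink = io.StringIO()
loaded = load(transform(extract(RAW_CSV)), sink)
```

Keeping each stage as a separate function means the source and the sink can be swapped (for example a streaming consumer in place of the CSV text) without touching the transformation logic.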
Platform
For this project, we evaluated the three main public cloud providers (Azure, AWS and GCP) to identify the tools and services best suited to the specific needs of a banking data streaming system.
Used Services
The following services were used, grouped by category:
Data Streaming:
Azure Event Hubs
AWS Kinesis
GCP Pub/Sub
Data Lakes:
Azure Data Lake Storage Gen2
AWS S3
GCP Cloud Storage (GCS)
API Management:
Azure API Management
AWS API Gateway
GCP API Gateway
NoSQL Databases:
Azure Cosmos DB
AWS DynamoDB
GCP Bigtable
Visualization:
Power BI (PBI)
Tableau
Metabase
Anomaly Detection:
Azure Stream Analytics
AWS Kinesis Data Analytics
GCP BigQuery
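As an illustration of the kind of rule such services can evaluate on a transaction stream, the sketch below flags card transactions whose amount deviates strongly from that card's recent history, using a rolling z-score. This is an assumed, simplified technique chosen for clarity, not the detection logic used in the project. The class name, window size, threshold and sample amounts are all hypothetical.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class AmountAnomalyDetector:
    """Flags amounts that deviate strongly from a card's recent history."""

    def __init__(self, window=20, threshold=3.0, min_history=5):
        self.threshold = threshold      # z-score above which we flag
        self.min_history = min_history  # observations needed before scoring
        # One bounded history of recent amounts per card.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, card_id, amount):
        """Score one transaction against past amounts, then record it."""
        past = self.history[card_id]
        is_anomaly = False
        if len(past) >= self.min_history:
            mu = mean(past)
            sigma = pstdev(past)
            if sigma > 0:
                is_anomaly = abs(amount - mu) / sigma > self.threshold
            else:
                # All past amounts identical: any different amount stands out.
                is_anomaly = amount != mu
        past.append(amount)
        return is_anomaly


# Hypothetical stream of amounts for a single card.
detector = AmountAnomalyDetector()
amounts = [20, 25, 22, 19, 24, 21, 950]
flags = [detector.observe("c-100", a) for a in amounts]
# Only the last amount (950) is flagged.
```

Note that this sketch also records flagged amounts in the history, which lets a burst of fraudulent transactions shift the baseline. A production rule would likely exclude or down-weight them and combine amount with other signals such as location or merchant.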
Conclusion
This project audited and compared the capabilities of the three main public cloud providers for banking data streaming. By combining streaming services, storage, API management, NoSQL databases, visualization and anomaly detection tools, we were able to design a robust and efficient data architecture.
Our expertise in evaluating and implementing cutting-edge technologies has enabled us to provide a comprehensive and adaptable solution that meets the critical needs of fraud detection and compliance analysis in the banking sector.