Technologies used

Kubernetes

Talend

Jupyter Notebook

YARN

Sqoop

Telecom

Intelligent Data Platform for a multi-national telecom operator

Impact by the numbers

2X

reduction in the cost of data analytics

75+

use cases implemented

20TB+

data migrated from 50+ diverse sources into IDP

The challenge

A telco wanted to build an analytics infrastructure that would provide a 360-degree, real-time view of customers and business operations, and to reduce costs by sunsetting expensive legacy systems once their data and reports had been migrated to the new data platform.

The solution

Addo designed and provisioned a modern data platform architecture and optimized it for performance, scalability and cost-efficiency. The goal was to build a data analytics platform to support real-time insights and decision-making. We followed best practices to design and deploy an HA-enabled private cloud infrastructure that offers zero downtime. The platform was to serve as the centralized data analytics platform for various business units, enabling efficient decision-making for business users and supporting data science and artificial intelligence workloads.

The solution included: 

  • Data Management: Data modeling and governance were set up for the sales, marketing, CDR and billing domains.
  • Data Migration: We migrated 20TB+ of data from 500+ tables across 15+ source systems so that the legacy systems could be sunset. We identified the ETL/ELT jobs required for the migration from Teradata to Big Data and sourced data from the source systems through HDFS->Hive->Spark->Hive->Vertica.
  • Storage and Analytics: We implemented a multi-tiered data lake to store and curate data, and set up an EDW for self-service analytics, ad hoc reporting and canned reports. Multi-tiered storage enabled query and cost optimization. MDM was implemented on the semantic layer for multiple domains.
  • Data Virtualization: A schema-on-read technique was used to conceptually organize and combine data physically stored in Cloudera/HDFS and HP Vertica, making it available via a virtualization layer.
  • Data Integration: We integrated data from different autonomous and heterogeneous data sources using the Talend Data Integration solution. The following components were used:
      • Talend Studio: used to design and write jobs
      • Talend Source Repository: used to store job artifacts
      • Talend Real-Time Big Data Platform: acted as an ETL/ELT virtualization layer
      • Talend Remote Engines: set up on Red Hat OpenStack Nova compute engines
      • Talend Cloud: used to manage and run tasks and plans, administer projects, users and user roles, and manage execution engines
  • Big Data Solutions: Big data services were implemented using Cloudera. The following Cloudera products and services were used in this project:
      • Cloudera Manager: used to deploy, manage, monitor, and diagnose issues with CDH deployments
      • Apache Spark: an interface for programming entire clusters with implicit data parallelism and fault tolerance
      • Apache Hadoop: a software framework for distributed storage and processing of big data using the MapReduce programming model
  • Private Cloud Infrastructure: We designed and set up a zero-downtime, HA-enabled Big Data architecture on a 45-node cluster, with configuration and performance tuning as per best practices. Hardware resizing of Cloudera was done as per best practices over the available hardware. Data modeling was done within Cloudera tools and technologies. Kerberos and Sentry were used on the CDH/CDS clusters.
  • BI Reports: The data virtualization layer was enabled and integrated seamlessly with BI reporting for improved access management and governance.
  • Managed Services: We provided production support by setting up a service help desk using Jira to track issues and by implementing 24/7 monitoring and logging. We instituted an L1, L2 and L3 support engineering environment and implemented two core service levels for software support: Response Time and Resolution Time.
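To illustrate the schema-on-read idea mentioned under Data Virtualization, here is a minimal Python sketch. It is not the platform's actual implementation: the sample records, the field names (`started_at`, `msisdn`, `duration_seconds`) and the `read_with_schema` helper are all hypothetical. The point is only that raw data is stored as-is and a schema is applied when the data is read.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw CDR-like records, stored exactly as they arrived
# (no schema is enforced at write time).
RAW_CDR_LINES = """\
2023-01-05T10:15:00,000000000001,120
2023-01-05T10:17:30,000000000002,45
"""

# The schema lives with the reader, not with the stored data.
# Field names and types are assumptions made for this illustration.
CDR_SCHEMA = [
    ("started_at", datetime.fromisoformat),
    ("msisdn", str),
    ("duration_seconds", int),
]


def read_with_schema(raw_text, schema):
    """Parse raw delimited text into typed dicts, applying the schema at read time."""
    for row in csv.reader(io.StringIO(raw_text)):
        if not row:
            continue  # skip blank lines
        yield {name: cast(value) for (name, cast), value in zip(schema, row)}


records = list(read_with_schema(RAW_CDR_LINES, CDR_SCHEMA))
total_seconds = sum(r["duration_seconds"] for r in records)
```

Because the schema is supplied by the reader, a different consumer could read the same stored text with a different schema (for example, treating every field as a string) without rewriting the underlying data.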

The Addo team comprised the following specialised roles: Enterprise Delivery Owner, Project Manager, Project Coordinator, Scrum Master, Solutions Architect, Principal Data Engineer, Data Modeler, DevOps Engineer, Data Engineer, QA Engineer, Prod/Live Team (Software Engineers), Service Delivery Manager, Senior System Support Engineer, L2 System Support Engineer, L1 System Support Engineer.

The results

The data analytics platform enabled business users to gain real-time insights for decision-making, and a 2x reduction in the cost of data analytics infrastructure and licensing was observed. Within two years of the platform's launch, 75+ use cases were implemented successfully, and a daily volume of ~25 TB is generated for the data platform from over 50 source systems.
