
The Client Domain

A major insurance company sought assistance in driving its AI/ML initiatives, including the provision of high-quality training data and the automation of business processes through easily deployable ML models and integrated human-in-the-loop workflows.

To support their business expansion strategy, the client aimed to elevate their data transformation services to a new level. This required the implementation of a highly scalable, cloud-native data platform capable of generating near real-time reports and serving as a foundational service for AI.

The client’s leadership envisioned the new platform as a solution to address the architectural and technological limitations of their legacy monolithic system, which was implemented during the early stages of the company. Their objective was to enhance data quality and improve the scalability of data collection, processing, and reporting services.

The project encompasses the following key objectives:

Highly Scalable Solution: The solution aims to support up to 1,000 parallel workflows, enabling efficient data collection, processing, and reporting operations at scale.

Real-Time Analytics: An efficient analytics solution will be implemented, empowering data annotation teams and business units to generate data reports in real time, enabling timely insights and decision-making.

Advanced Architecture: The solution will leverage microservices and incorporate specific ML models within its data processing workflows. This advanced architecture will enhance the system’s capabilities and optimize data processing efficiency.

Collaboration with iPivot: The client has partnered with iPivot to develop the solution on the AWS platform. iPivot’s NextGen Data Platform will serve as the foundation, with customization and fine-tuning performed to align with the specific requirements of the client’s business.

By combining iPivot’s expertise with AWS infrastructure and technologies, the aim is to create a scalable, real-time analytics solution that leverages advanced architecture and tailored workflows to meet the client’s unique needs.

Building a Next-Generation Data Platform with a Data Lake and Advanced Real-Time Analytics

To support Client’s data transformation goals, iPivot initiated the development of a new data platform by conducting a series of discovery workshops. These workshops aimed to evaluate the existing operations, infrastructure, and architecture of Client’s applications, while also identifying the key performance indicators (KPIs) necessary for the success of the platform.

During this assessment phase, iPivot uncovered various architectural and infrastructural limitations within Client’s legacy monolithic solution. These limitations prevented the data annotation and analytics teams from processing data at scale and generating real-time reports. They also increased overhead, introduced inefficiencies, raised the total cost of ownership (including the cost of service for Client’s clients), and hindered Client’s data expansion efforts in the AI market.

To address these challenges, iPivot proposed the development of a new data platform on the AWS infrastructure. This platform would incorporate a data lake and advanced capabilities, enabling near real-time analytics and report generation. By leveraging the scalability and flexibility of AWS, iPivot aimed to create a solution that overcame the limitations of the legacy system and positioned Client for enhanced data processing, analytics, and market competitiveness.

The engineering team at iPivot undertook the design and development of scalable data pipelines utilizing Apache Kafka, Amazon S3, and Amazon ECS. Thorough testing and fine-tuning were conducted before migrating Client’s non-relational data streams and workloads to these new pipelines.

For the creation of a cloud-native data lake, iPivot leveraged AWS Glue, Amazon Athena, and Amazon S3. These services were selected for their flexibility, scalability, and efficiency, empowering Client’s teams to handle data faster and on a larger scale.
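
One common way to lay out an S3-backed data lake so that AWS Glue can catalog it and Amazon Athena can skip irrelevant data is Hive-style partitioning, where object keys contain `column=value` path segments. The sketch below only illustrates that key convention. The table name `annotations`, the date-based partition columns, and the helper names are assumptions made for illustration; the case study does not describe the actual layout used.

```python
from datetime import datetime, timezone


def partitioned_key(table: str, event_time: datetime, file_name: str) -> str:
    """Build a Hive-style partitioned object key (column=value path segments).

    The year/month/day scheme is an assumed example, not the platform's real layout.
    """
    t = event_time.astimezone(timezone.utc)
    return f"{table}/year={t:%Y}/month={t:%m}/day={t:%d}/{file_name}"


def partition_values(key: str) -> dict[str, str]:
    """Recover the partition columns encoded in a key."""
    return dict(seg.split("=", 1) for seg in key.split("/") if "=" in seg)


# Hypothetical table and file names, used only to show the resulting key shape.
key = partitioned_key(
    "annotations",
    datetime(2023, 11, 16, 9, 30, tzinfo=timezone.utc),
    "part-0000.parquet",
)
print(key)
print(partition_values(key))
```

With a layout like this, a query that filters on the partition columns only needs to read the objects under the matching prefixes, which is one reason partitioned layouts tend to help report latency.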

To meet the scalability requirements, iPivot decomposed the monolithic data workflows into specific microservices, utilizing Kafka Streams for their implementation. Apache Kafka served as a central communication hub between the microservices and databases. As part of the migration strategy, the Change Data Capture (CDC) mechanism from PostgreSQL was utilized as the initial data feed for consuming microservices.
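
To make the CDC feed concrete, here is a minimal sketch of how a consuming microservice might fold change events into its own copy of a table. It assumes a Debezium-style event envelope with `op`, `before`, and `after` fields (`c` = create, `r` = snapshot read, `u` = update, `d` = delete). The case study does not name the CDC tool, so the envelope, the `id` key field, and the sample rows are assumptions, and a plain list stands in for the Kafka topic.

```python
from typing import Any

Row = dict[str, Any]


def apply_change(state: dict[Any, Row], event: dict[str, Any], key_field: str = "id") -> None:
    """Apply one CDC event to an in-memory view keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Inserts, snapshot reads, and updates all carry the new row image.
        row = event["after"]
        state[row[key_field]] = row
    elif op == "d":
        # Deletes carry only the old row image (at least the primary key).
        state.pop(event["before"][key_field], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


# Hypothetical events; in the described platform these would arrive via Kafka.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "queued"}},
    {"op": "u", "before": {"id": 1, "status": "queued"}, "after": {"id": 1, "status": "annotated"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "queued"}},
    {"op": "d", "before": {"id": 2, "status": "queued"}, "after": None},
]

state: dict[Any, Row] = {}
for event in events:
    apply_change(state, event)

print(state)
```

Because each event is applied by primary key, replaying the same ordered stream rebuilds the same state, which is what lets new microservices bootstrap from the CDC feed while the legacy database remains the source of truth during migration.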

The initial testing of the proposed solution demonstrated both linear scalability and significant performance improvements. This reinforced the effectiveness of the architectural decisions and validated the capability of the new data platform to handle larger volumes of data efficiently while ensuring seamless scalability for future growth.

Key Products/Services Used

Faster, More Efficient Data Processing and Analytics Drive Efficiencies and Accelerate Client’s Business Expansion

Client’s growth potential was hindered by outdated technologies powering its data services, resulting in limitations in accelerating and scaling data transformations. As a trusted partner, iPivot played a vital role in helping Client overcome these technological barriers, leading to reduced overhead costs, operational efficiencies, and accelerated business growth.

Leveraging its expertise in developing streaming data platforms, iPivot designed and implemented a new, highly scalable cloud-native data platform for Client. This platform integrates a data lake with advanced real-time analytics capabilities, enabling Client to accelerate and scale its data transformation services while equipping its teams with powerful tools for near real-time data analytics and reporting.

With the implementation of the new data platform, Client’s end-to-end data pipelines now operate five times faster, resulting in a 20-fold increase in throughput for data annotation jobs and data rows. Additionally, the time required to generate a report on processed data has been dramatically reduced from almost 30 minutes to less than a minute. These impressive outcomes are made possible by the advanced architecture of the data platform, ensuring linear scalability and significant performance improvements.
