
Reducing Integration and Analytics Costs for a Fortune 500 Cloud Solutions Provider

Definian helped a Fortune 500 cloud solutions provider reduce integration and analytics costs by migrating its analytics from Snowflake to a modern data platform built on AWS.
Mathieu Stark
Principal, Data Value Realization Practice Lead

Background

A Fortune 500 customer relationship management (CRM) and cloud solutions provider faced exponential growth in integration, compute, and data warehousing costs, driven by the rapid acceleration of data volumes across its data landscape. To reduce costs and improve performance, the organization needed to move its data and analytics from Snowflake to a custom-built Hive solution on Amazon Web Services (AWS).

Why Definian

While the Client has significant in-house data analytics and cloud infrastructure skills, they needed data engineering expertise to complete this initiative in the desired timeframe. The Client had previously worked with Definian on data governance and integration projects and, through that experience, knew Definian had the skills, accelerators, and methods to carry out this project efficiently and effectively. Definian's three decades of experience building complex data engineering solutions reinforced that confidence.

The Project: A Joint Effort Across Four Milestones

Like many large transformative initiatives, this project simultaneously posed significant risk and value. To reduce project risk and maximize value along the way, the initiative was split into four distinct milestones. This modular approach enabled the Client to realize value throughout the initiative without disrupting current processes. It also enabled the Client and Definian to focus their energy on their respective strengths.

Being a pioneer in cloud applications and data modeling, the Client owned the design and development of the new analytics platform. The Client leveraged Definian’s data engineering solutions to minimize development time and maximize data pipeline throughput. While Definian upgraded the data pipelines that fed their analytics platforms, the Client focused on data models and cloud architectures.

Milestone 1: Migrate Jitterbit Integrations to AWS Glue

The project's first milestone focused on replacing approximately 300 Jitterbit integrations that connected the Client's operational data to their primary analytics data stores in Snowflake and Redshift. To help keep this milestone on track, Definian used its integration design frameworks and reference library to reverse engineer the poorly documented legacy Jitterbit integrations and replicate them in AWS Glue.

Milestone 2: Design and Build the Pipelines for the Future State Data and Analytics Platform

While the Client focused on designing and developing the Hive database on AWS infrastructure, Definian designed and built the future-state integration framework and process. Collaborating closely with the Client, Definian enhanced the integrations from Milestone 1 so they could easily be re-pointed to the new analytics warehouse during cut-over. Additionally, Definian and the Client found opportunities to rationalize existing integrations and improve their performance. As part of these improvements, Definian increased pipeline efficiency by mirroring and transitioning the ETL processes from AWS Glue to Apache Airflow.
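One way to make integrations easy to re-point, sketched here as an assumption rather than a description of Definian's actual framework, is to have each pipeline reference a logical target name that a single registry resolves to a physical warehouse. Every name below (`Pipeline`, `TARGETS`, `resolve_target`, `cut_over`, and the connection strings) is hypothetical and used only for illustration.

```python
from dataclasses import dataclass

# Hypothetical registry: logical target name -> physical destination.
# Cut-over means changing this one mapping, not every pipeline.
TARGETS = {
    "analytics": "snowflake://legacy-account/analytics",
}


@dataclass(frozen=True)
class Pipeline:
    name: str    # integration name (illustrative)
    source: str  # operational source system
    target: str  # logical target, resolved at run time


def resolve_target(pipeline: Pipeline, targets: dict) -> str:
    """Return the physical destination for a pipeline's logical target."""
    try:
        return targets[pipeline.target]
    except KeyError:
        raise ValueError(
            f"unknown target {pipeline.target!r} for {pipeline.name}"
        ) from None


def cut_over(targets: dict, logical: str, new_destination: str) -> dict:
    """Return a new registry with one logical target re-pointed."""
    if logical not in targets:
        raise ValueError(f"unknown target {logical!r}")
    return {**targets, logical: new_destination}


# The same pipeline definition resolves differently before and after cut-over.
orders = Pipeline(name="orders_daily", source="crm", target="analytics")
before = resolve_target(orders, TARGETS)
after = resolve_target(
    orders, cut_over(TARGETS, "analytics", "hive://aws-cluster/analytics")
)
```

Because `cut_over` returns a new registry instead of mutating the old one, the legacy mapping stays available for rollback in this sketch.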

Milestone 3: Migrate from Snowflake to Hive

With the new analytics platform operational, it was time to migrate the data and shut down Snowflake. Definian built a Snowflake-to-Hive pipeline that executed the migration using PySpark jobs orchestrated in Apache Airflow. This approach maximized throughput and minimized development time. To reduce downtime during the cut-over, Definian and the Client collaborated on a tight cut-over plan. The execution of the plan exceeded expectations, resulting in no downtime and an on-time go-live.
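A per-table copy step in such a pipeline might look roughly like the sketch below. It is not the pipeline Definian built; it assumes a SparkSession created with Hive support and with the Snowflake Spark connector available, and the function names, the lowercase table-naming convention, and the overwrite mode are all illustrative choices.

```python
def snowflake_options(account_url, user, password, database, schema, warehouse):
    """Build the option map used by the Snowflake Spark connector."""
    return {
        "sfURL": account_url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def hive_table_name(hive_db, snowflake_table):
    """Map a Snowflake table to a lowercase Hive table name (assumed convention)."""
    return f"{hive_db}.{snowflake_table.lower()}"


def migrate_table(spark, options, snowflake_table, hive_db):
    """Copy one table from Snowflake into Hive using an existing SparkSession.

    `spark` is assumed to have Hive support enabled and the Snowflake
    Spark connector on its classpath.
    """
    # Read the full source table through the Snowflake connector.
    df = (
        spark.read.format("net.snowflake.spark.snowflake")
        .options(**options)
        .option("dbtable", snowflake_table)
        .load()
    )
    # Write it as a managed Hive table, replacing any earlier load.
    target = hive_table_name(hive_db, snowflake_table)
    df.write.mode("overwrite").saveAsTable(target)
    return target
```

In an orchestrator such as Airflow, a function like `migrate_table` could be called once per table from separate tasks so that tables migrate in parallel and failed tables can be retried individually.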

Milestone 4: Consolidate Data Silos

After the new analytics platform went live, the last step was consolidating additional data silos into the new platform and then decommissioning them. Definian designed the pipelines and processes for this step so that the Client could execute the plan on their own when ready. When the Client was ready to migrate, Definian provided backup support as needed.

Impact: Improved Data Pipelines, Improved Data Analytics, Lower Costs

This complex initiative gave the Client sustainable, long-term analytics capabilities. They now have a pathway to more intelligent AI, sharper analytics, and data-driven decisions. The new data pipelines in Apache Airflow run at lower cost and with greater efficiency than the prior Jitterbit framework.
