This post explains how to get the best performance when replicating data from Salesforce. Leverage Salesforce’s chunkSize parameter to extract large amounts of data.
Category: Data Integration
As data ecosystems evolved to offer more functionality, more flexibility and new technologies, Data Integration became increasingly critical to making use of all those possibilities.
Data Integration is the means to eliminate data silos and leverage all the data’s potential for an organization.
To clarify, Data Ingestion comprises Data Acquisition, Data Landing and Data Preparation.
First, Acquisition is the method of extracting or receiving a defined data set before further consumption. It covers how we obtain the data and what characteristics the data has.
Landing is the act of unloading a data set on a target database or storage system.
Data Preparation is the process of making the data ready for use.
Ideally, we keep data as-is when we move it. That is, we first acquire the data and, once it has landed in the target, we prepare it for consumption. This way, we eliminate the business and technical problems that inaccurate, contradictory and inconsistent data causes.
However, some scenarios require minimal transformations to make the ingestion technically possible, such as reformatting a date (15.3.22) so that all dates share one format (15/3/2022). In other cases, we choose to include transformations to ensure a certain level of data quality when the data reaches the target (e.g., if we receive records as “Vanilla Inc”, “vanila”, or “vanilla In”, we may unify them to reduce duplication).
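As a rough illustration of these two kinds of minimal transformation, here is a small Python sketch. It is only an example under stated assumptions: the list of accepted date formats, the canonical name list and the function names `normalize_date` and `unify_name` are hypothetical, and the fuzzy matching relies on Python's standard `difflib` module rather than on any specific replication tool.

```python
from datetime import datetime
import difflib

# Assumption: the input formats we expect to receive from the sources.
DATE_FORMATS = ("%d.%m.%y", "%d/%m/%Y", "%Y-%m-%d")

# Assumption: a reference list of clean names maintained for the target.
CANONICAL_NAMES = ["Vanilla Inc"]


def normalize_date(raw: str) -> str:
    """Return the date as D/M/YYYY, trying each known input format in turn."""
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # not this format, try the next one
        return f"{parsed.day}/{parsed.month}/{parsed.year}"
    raise ValueError(f"Unrecognized date format: {raw!r}")


def unify_name(raw: str, cutoff: float = 0.6) -> str:
    """Map a name to its closest canonical spelling, or return it unchanged."""
    lookup = {name.lower(): name for name in CANONICAL_NAMES}
    matches = difflib.get_close_matches(
        raw.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else raw


if __name__ == "__main__":
    print(normalize_date("15.3.22"))
    for name in ("Vanilla Inc", "vanila", "vanilla In"):
        print(name, "->", unify_name(name))
```

The cutoff of 0.6 is simply the `difflib` default. In a real pipeline, you would tune it and review the matches, because a threshold that is too low may merge names that belong to different entities.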
In this category, you will find how-to articles, tips, tricks, best practices and use cases for database migrations, replications, data copies, ETL/ELT, streaming and CDC (Change Data Capture). They focus on Data Acquisition. I only include a Data Preparation process when necessary to achieve the final result.
Impact of changing time in a Qlik Replicate server
When you change the time on a Qlik Replicate server, for example, to switch to winter or summer time, you should follow the best practices we explain in this post.
Latency in Qlik Replicate
When replicating data with Qlik Replicate, latency is a crucial metric for monitoring performance. This post explains latency.
Data Replication & Qlik Replicate
This post explains what Data Replication is and how it differs from ETL/ELT, what Change Data Capture (CDC) is, and a few Qlik Replicate use cases.
Retrieve meteorological information from AEMET and load it into BigQuery
This post explains how to download meteorological information from the Spanish Meteorological Agency and load it into BigQuery.
Migration from PostgreSQL to MongoDB on AWS
This post explains how to differentiate among the different types of NoSQL databases and how to plan a migration from PostgreSQL to MongoDB.