Companies

Fuzzy Matching Demo: Inconsistent Company Names


This exercise tests fuzzy matching in Google Cloud with Cloud Dataprep by Trifacta. The goal is to standardise the company names in a file (VANILLA Ltd, *** VANILLA LTD ***, vanilla ltd, vanila ltd, Vanilla Ltd., etc.) before loading them into the target table in BigQuery.
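To make the goal concrete, here is a minimal sketch of the same idea in plain Python, using only the standard library. It is not how Cloud Dataprep implements its matching; the function names, the canonical list and the 0.85 similarity threshold are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lowercase, replace punctuation/decoration with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def standardise(name: str, canonical_names: list[str], threshold: float = 0.85) -> str:
    """Return the closest canonical name, or the input unchanged if nothing is close.

    The threshold is a hypothetical value; tune it on real data.
    """
    key = normalise(name)
    best, best_score = None, 0.0
    for candidate in canonical_names:
        # ratio() is a similarity score between 0.0 and 1.0
        score = SequenceMatcher(None, key, normalise(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else name


# Hypothetical reference list of clean company names
CANONICAL = ["Vanilla Ltd"]

raw = ["VANILLA Ltd", "*** VANILLA LTD ***", "vanilla ltd", "vanila ltd", "Vanilla Ltd."]
cleaned = [standardise(n, CANONICAL) for n in raw]
print(cleaned)
```

Normalising first handles the case, punctuation and decoration differences exactly, so the fuzzy comparison only has to absorb genuine misspellings such as "vanila". Names that are not close to any canonical entry are left untouched rather than forced into a match.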

Cloud Dataprep combines Trifacta's data preparation software with Cloud Dataflow, which runs the resulting jobs and loads the data. As a reminder, Cloud Dataflow is an Apache Beam runner.
