Using Hadoop to Solve Supplier Normalization

It is an unspoken truth that users of large ERP systems frequently record their suppliers as strings in a database, so a single supplier may be represented in many different ways. Because accurate spend analysis requires consistent supplier data, we set out to algorithmically assist JAGGAER Solutions Consultants with a common business problem: supplier normalization. The technical challenge we posed ourselves was this: how can you quickly find all the strings that represent a single supplier?