Duplicate parts are the software bugs of the data world, hard to identify without specific tools and processes in place. If these “bugs” are not found in implementation they can be ultimately time consuming and costly down the road. For these reasons it is crucial to get ahead of the issue as soon as possible, optimally during a data migration project or switch to a new platform. For example, an SAP ERP data migration involving cleansing and classifying purchased parts from multiple legacy systems from US and International locations would be the perfect time to detect, isolate and deal with duplicate parts. The best case would be to have lead time of 6 to 12 months prior to the actual cutover to clearly find these duplicates and to give enough time to formulate the best plan to deal with these duplicates. Some may want to remove duplicate parts all together, while others may want to look for ways to consolidate and rationalize this data to help mitigate duplicate inventory, still others may want to associate them for purchaser’s reference for ordering and price comparisons. Regardless of final goal for your company you first need to identify them through these steps.
Duplicate purchased parts – same part assigned to different internal part numbers
- Collect then classify the data
It is key to run any duplicate analysis off of a specific category of data. As a result, the first order of business is to classify the parts. From each legacy ERP system, you would need to gather all the purchase part data together for analysis including: Manufacturing part numbers, supplier names, pricing, existing internal part numbers and descriptions plus any existing commodity codes can be useful to help with the initial classification.
Typically, data coming from different systems needs to be normalized to a single standard classification model – you can do this by classifying all the purchase part data against a single data model. This data model will help dictate the categories which in turn drive the attributes and allowed values each part should have.
- Enrich and validate the data
Once all the data from each system is classified – you need to enrich the data. Using either the manuifacturer part numbers or associated documentation – you can harvest attribute data for each part. (This can be time consuming if you don’t have the expertise in house, you may want to contract this work out.) The data is typically obtained from approved manufacturer websites or company approved documentation e.g specifications, drawings, etc. The harvested data should be loaded into a database and validated against the data model.
For example – for AC Motors – enriching each motor with key information – horsepower, speed, number of poles, phases, frequency, etc. Enriching the data allows us to group duplicate motors and similar motors more easily. In some cases, motors with different manufacturer part numbers can have very similar characteristics.
- Duplicate and Near Duplicate Analysis
Once the data has been enriched, normalized and validated you are able to execute duplication analysis, preferable using duplicate analysis tools that are based on attribute value matches on critical attributes that drive pricing and differentiation.
For grouping similar parts to help find near duplicates, you can use a clustering tool that allowed different weighting factors to be set on the critical attributes and the overall cluster – the CDS tool calls this a neighbor distance. Critical to making this analysis work is having normalized data and having numeric data with separate units of measure.
Figure 1: Examples of duplicate, 5 HP Motors, with different part numbers and supplier pricing.
- Processing Identified Duplicates
We recommend taking the groups of duplicates and select one internal part number to be the master for each manufacturer part number (MPN). Since you will likely end up with multiple internal part numbers for a MPN, we recommend setting up a master cross reference. That way purchasing could consolidate spend and rationalize inventory – without disrupting existing BOMs. You could allow existing BOMS to use any of the part numbers to save time with engineering change control updating BOMs with just one internal part number.
We also recommend that you manage the master cross reference of internal part numbers to MPN in the ERP system. This may be the easiest thing to do given all of the other priorities in a ERP deployment. Overtime, engineering can use the master internal part number for new products – allowing a slow phase out the other internal part numbers.
Summary steps to the process:
- Select the one Internal Part Number, Manufacturer Part Number (MPN) and Manufacturer (MFG) based on purchase history that will be the preferred part.
- Assign all other records as alternates to the assigned preferred part.
- Do not change the Engineering Bills of Materials.
- Move to a global part number on all documentation.
- Maintain the Global Part number to MPN and MFG internally in SAP, or other ERP/PLM systems.
Rich attribution and classification of data is pivotal to preventing and identifying duplicate and near duplicate data that lies within your systems today. Collect then classify part data followed by enrichment and validation of the part data will help ensure your analysis can extract true duplicates. Once duplicates or near duplicates are identified it is essential to have a corporate plan on how to handle these duplicate on an ongoing basis. A good plan will avoid over purchasing of parts and streamline your inventory.