Two Approaches to Data Cleansing
Do you have good data? Can your engineers and designers quickly locate the parts they need to complete their designs? Data cleansing is important because it improves data quality, which in turn helps increase overall productivity. Cleansing removes outdated or incorrect information, leaving only the highest-quality information visible.
When engineers find that their data is a mess, working around it wastes their time. Engineers are programmed to work efficiently and effectively, and poor data quality can hold back a project. There are two approaches that can help an organization begin data cleansing: “as-you-go” and “all-at-once.”
Here are the two different approaches one could take to cleanse their data:
“As-you-go” entails adding steps to the existing processes for reviewing part data to ensure the data meets quality requirements. If it does not, process steps to normalize, cleanse, and enhance it must be added. This effort requires a budget, since it may add steps to the current product development process or call for more resources. The advantages of this approach are:
- There is no need to prioritize parts to cleanse since parts being processed in the system are the priority.
- Out-of-date components are not processed by the system. This means no resources are used to cleanse them.
- The initial budget is less than an “all-at-once” budget. This makes it easier to sell management on this process.
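The “as-you-go” review step described above can be sketched as a small quality gate that runs whenever a part passes through an existing process. This is a minimal illustration, not a prescribed implementation: the `Part` fields, the `REQUIRED_FIELDS` list, and the normalization rules are all hypothetical stand-ins for whatever quality requirements an organization actually defines.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical quality requirement: these fields must be filled in.
REQUIRED_FIELDS = ("part_number", "manufacturer", "description")


@dataclass
class Part:
    part_number: str
    manufacturer: str
    description: str


def normalize(part: Part) -> Part:
    """Return a copy with whitespace collapsed and the part number upper-cased."""
    return Part(
        part_number=part.part_number.strip().upper(),
        manufacturer=" ".join(part.manufacturer.split()),
        description=" ".join(part.description.split()),
    )


def meets_requirements(part: Part) -> bool:
    """A part passes when every required field is non-empty."""
    return all(getattr(part, name) for name in REQUIRED_FIELDS)


def review_part(part: Part) -> Tuple[Part, bool]:
    """Normalize a part that is being processed, then report whether it passes."""
    cleaned = normalize(part)
    return cleaned, meets_requirements(cleaned)
```

A part that fails the check would be routed to the additional cleansing and enhancement steps. Parts that never pass through the process are never reviewed, which is why out-of-date components consume no resources.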
“All-at-once” cleanses all part data in a separate, organized project. Because it is a separate project, it usually requires a dedicated team and software tools, and it will normally cost more than the “as-you-go” approach. If you are working with a large amount of data, this may be the route for your organization to consider. This approach takes advantage of economies of scale and may involve third parties to ensure all data is cleansed and enriched. Once the cleansed data is introduced back into the company, the company must implement an NPI (New Part Introduction) process. An NPI process ensures that new part data arrives at an acceptable quality and that no duplicate components are created.
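The duplicate check in an NPI process can be illustrated with a short sketch. It assumes, purely for illustration, that two records describe the same component when their manufacturer and part number match after ignoring case, spacing, and punctuation. The function names and the matching rule are hypothetical. Real part numbers sometimes differ only by punctuation, so an actual NPI process would need its own matching rules.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]


def normalized_key(manufacturer: str, part_number: str) -> Key:
    """Build a comparison key that ignores case, spacing, and punctuation."""
    maker = " ".join(manufacturer.split()).casefold()
    number = "".join(ch for ch in part_number if ch.isalnum()).casefold()
    return maker, number


def admit_new_part(catalog: Dict[Key, dict], manufacturer: str, part_number: str) -> bool:
    """Add the part to the catalog unless an equivalent record already exists.

    Returns True when the part was added, False when it was rejected as a duplicate.
    """
    key = normalized_key(manufacturer, part_number)
    if key in catalog:
        return False
    catalog[key] = {"manufacturer": manufacturer, "part_number": part_number}
    return True
```

For example, once “Acme Corp” part “AB-123” is in the catalog, a later request for “ACME  corp” part “ab 123” is rejected as a duplicate, while “AB-124” is admitted as a new part.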
Both approaches require process changes that will affect people's responsibilities and will inevitably create some hesitation. To counter this hesitation, management needs to be entirely supportive of the project and engage regularly with project members and users to overcome any setbacks and maintain a positive attitude. As the data is provided to the company, users need to be given tools to access the data and encouraged to use it. Users must be educated on the benefits of using the data, how to access it, and the governance processes that maintain data quality.
In conclusion, if your company is looking for the biggest return on investment, we recommend the “all-at-once” approach. To obtain the most value, we recommend you prioritize the parts you process: focus on the highest-spend purchased parts, which will make your management happy.
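Prioritizing by spend can be as simple as ranking parts by what the company pays for them and cleansing the top of the list first. The sketch below assumes a hypothetical mapping from part number to annual purchase spend. The function name and the figures in the usage note are illustrative only.

```python
from typing import Dict, List


def prioritize_by_spend(annual_spend: Dict[str, float], top_n: int) -> List[str]:
    """Return the top_n part numbers, ordered from highest spend to lowest."""
    ranked = sorted(annual_spend.items(), key=lambda item: item[1], reverse=True)
    return [part_number for part_number, _ in ranked[:top_n]]
```

With spend of 100 for part A, 5000 for part B, and 750 for part C, `prioritize_by_spend(spend, 2)` puts B first and C second, so those two parts would be cleansed before A.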
Characterizations of the Two Approaches to Data Cleansing
| As-you-go | All-at-once |
| --- | --- |
| Initially requires a lower budget, which can sometimes be distributed across different departments. This makes it an easier sell to management. | Requires a detailed budget, typically higher than the initial budget for as-you-go. Management needs to be sold and will likely need a thorough ROI analysis, which can be difficult to determine. |
| Cleanses and enriches data that is active in the system, so priority is set automatically and non-active parts are not cleansed. | Effective when there are large groups of data, so economies of scale can be leveraged. Typically good for large companies with many divisions and for companies that have grown through acquisition. |
| Requires additional steps to be added to product development processes, but these steps should be incorporated anyway to ensure ongoing data quality. | Additional processes are not initially required but should be added once users start to use the data. Fewer people require changes to their processes at the beginning, but when changes are added they can benefit immediately from having access to good data. |
| A good fit when there are no planned major changes to IT systems. | When migrating to new IT systems, it makes sense to have all the data cleansed and enriched so that change and user training can be incorporated with the migration. |