1. Overview
Most migrations of data from old (source) to new (target) systems require more than just mapping / transforming old to new values for equivalent columns. Often the new system “thinks” of the data in a different way and has thus modeled it completely differently, requiring more fundamental changes which entail serious data reorganization, and not just “column to column mappings”. A few examples of these would be mapping multiple rows to a single row, eliminating rows altogether, normalizing or denormalizing structures, i.e., turning multiple tables into a single one, or vice versa. Some data wrangling or ETL tools have evolved to do a fair job for the simpler migration data reorg efforts, but even the best still make it cumbersome or impossible in more challenging situations. In these cases, actual procedural code almost always beats off-the-shelf ETL tools for simplicity and time-to-implement. Often the off-the-shelf tool cannot do the job, period. This case study shows one such migration example and how the problem was solved using Oracle procedural language invoked from shell scripts.
2. Problem
The source system contained Parts List data that did not fit the new Windchill Parts List model, and in fact would simply not load to Windchill without structural transformation. Specifically, Windchill disallows the same reference designator value to be loaded twice for the same Bill of Materials (This is understandable since a reference designators (ref des for short) is commonly used to unambiguously identify a component within an electrical schematic or printed circuit board).
However the Parts List in this particular source system used the reference designators to denote call-outs on the drawing for design or assembly specifics, and when those specifics were common between different component part numbers on the same assembly, these call-out notes were shared by having both or several separate component part numbers share the same reference designator(s). For example:
Row | Parent Part# | Child Part# | Find# | Reference Designator |
---|---|---|---|---|
1 | 03-0800 | M11467S | 25 | R1-R3 |
2 | 03-0800 | M11468S | 49 | R1-R3 |
3 | 03-0800 | M11469S | 48 | R1-R3 |
4 | 03-0800 | M11470S | 24 | R1, R3 |
5 | 03-0800 | M11471S | 26 | R1, R3 |
6 | 03-0800 | M11472S | 47 | R1, R3 |
7 | 03-0800 | M11480 | 29 | R2 |
8 | 03-0800 | M11481 | 44 | R2 |
Windchill will attempt to load this Bill Of Materials into its two tables WTPartUsageLink, which is at the child part and find number level, and PartUsesOccurrence, which is a child table of WTPartUsageLink, containing that child component’s reference designators. Many PartUsesOccurrence rows can exist for one WTPartUsageLink, containing multiple reference designators. However our target Windchill implementation will not allow the same reference designator to appear more than once in the BOM, thus this data will be rejected by Windchill during loading.
Because no two child components of the same parent can share any reference designator, our source data could not be migrated into Windchill without further transformation.
3. Solution - Overview
The general solution path as decided in conjunction with the Windchill SMEs was to transform the source data before loading, so that just ONE child component on a BOM would carry any specific reference designators, and the other components which also needed to refer to that designator for assembly callouts, would have their find numbers modified to match the first (carrying) component’s find number. Further, the shared find number would be marked, for those non-carrying components, with a custom attribute (IBA) to inform the user that this new find number was a REFERENCE back to another component’s shared reference designators.
Thus our transformation would turn this:
Row | Parent Part# | Child Part# | Find# | Reference Designator | Find# for Reference Only Flag (IBA) |
---|---|---|---|---|---|
1 | 03-0800 | M11467S | 10 | U1-U4 | |
2 | 03-0800 | M11468S | 25 | U1-U4 | |
3 | 03-0800 | M11469S | 30 | U1-U4 |
Into this:
Row | Parent Part# | Child Part# | Find# | Reference Designator | Find# for Reference Only Flag (IBA) |
---|---|---|---|---|---|
1 | 03-0800 | M11467S | 10 | U1-U4 | |
2 | 03-0800 | M11468S | 10 | R | |
3 | 03-0800 | M11469S | 10 | R |
A document was produced as well during the transformation listing the old and new find numbers and reference designators for all components on the parent part, and stored in Windchill as an attachment to the parent part.
4. Solution - Detailed Transformation Rules
Of course, much more complicated reference designator sharing existed in our source data and our solution had to accommodate these. Returning to our first example:
Row | Parent Part# | Child Part# | Find# | Reference Designator |
---|---|---|---|---|
1 | 05478-00 | G88367S | 25 | R1-R3 |
2 | 05478-00 | G88368S | 49 | R1-R3 |
3 | 05478-00 | G88369S | 48 | R1-R3 |
4 | 05478-00 | G88370S | 24 | R1, R3 |
5 | 05478-00 | G88371S | 26 | R1, R3 |
6 | 05478-00 | G88372S | 47 | R1, R3 |
7 | 05478-00 | G88380 | 29 | R2 |
8 | 05478-00 | G88381 | 44 | R2 |
From the table we see 8 children of the same parent, with their find numbers and reference designators. Together they share the three reference designators of R1, R2 and R3. Rows 1-3 have the common set of R1, R2 and R3. Rows 4-6 have R1 and R3 only, while the child components in rows 7 and 8 have only the ref designator of R2. On the source system’s related CAD drawing, there is a single specification callout for R1, one for R2 and one for R3. The R1 and R3 callouts are meant to apply to the child components named here in rows 1 through 6, while the R2 callout applies to the child components shown in rows 1, 2, 3, 7 and 8.
The first set of transformation rules we applied was:
Transformation Step 1:
1a. Break apart sequenced reference designator designations to create one row for each ref design (e.g., R1-R3 value generates three distinct rows… R1, R2, R3)
1b. Identify children WTPartUsageLink rows which share reference designators with other children of same parent.
1c. Select ONE WTPartUsageLink child row to be the PRIMARY PARTUSAGELINK, and carry the reference designator. The one selected must not itself have reference designators which are not included in other children which share ANY of its ref designators (“all who share any must share all” rule).
1d. Take other children which share (all of) the selected primary’s refdesigs, remove their refdesigs, and update their PARTUSAGELINK rows to be REFERENCE ONLY.
After Transformation Step 1:
Row | Parent Part# | Child Part# | Find# | Reference Designator | Find# for Reference Only Flag (IBA) |
---|---|---|---|---|---|
4 | 05478-00 | G88370S | 24 | R1, R3 | |
1 | 05478-00 | G88367S | 24 | R | |
2 | 05478-00 | G88368S | 24 | R | |
3 | 05478-00 | G88369S | 24 | R | |
5 | 05478-00 | G88371S | 24 | R | |
6 | 05478-00 | G88372S | 24 | R | |
7 | 05478-00 | G88380 | 29 | R2 | |
8 | 05478-00 | G88381 | 29 | R | |
1 | 05478-00 | G88367S | 29 | R | |
2 | 05478-00 | G88368S | 29 | R | |
3 | 05478-00 | G88369S | 29 | R |
Before transformation step 1, above, about 30% of the components rows in our source data shared reference designators with another component for the part and therefore would not load to Windchill. Following step 1, that percentage was down to 2%.
Examining this remaining 2%, the reason was discovered to be a “deadlock” condition as illustrated below:
Parent# | Child# | Find Number | REFDES |
---|---|---|---|
61-70277-002 | RNC55H1004FS | 50 | R69, R103 |
61-70277-002 | RNC55H1274FS | 51 | R69, R103 |
61-70277-002 | RNC55H1474FS | 52 | R69, R103 |
61-70277-002 | RNC55H1604FS | 53 | R69, R103 |
61-70277-002 | RNC55H1964FS | 54 | R69, R103 |
61-70277-002 | RNC55H9533FS | 55 | R60, R103 |
Neither the R69,103 pair nor the R60,103 pair can be used due to the ‘all who share any must share all’ rule, so these must be split up into 3 separate PartUsageLink rows carrying R69, R103 and R60 ref designators, respectively.
Transformation Step 2 (to cover above ‘deadlock’ cases):
2a. After application of transformation rule 1, identify those component parts which still contain reference designators shared with other components
2b. For each such component part, identify its’ “Problem Reference Designators” (those still shared with other component part).
2c. Creating new PartUsageLink row for each unique component and problem reference designator
2d. Re-apply transformation Step 1, to take advantage of newly created find numbers and allow duplicate reference designators to be removed
For example, after the initial application of step 1, our code identified the following problem data, with the problem reference designators identified (steps 2a and 2b):
PARENT | CHILD | FIND | REFDESIGNATORS | Problem (shared) Reference Designators |
---|---|---|---|---|
6055283-001-01 | 7018023-001 | 108 | U28-U29, U40-U41 | U28, |
6055283-001-01 | 7018023-002 | 109 | U1-U9, U19-U27, U31-U33, U43-U45 | U21,U31,U32 |
6055283-001-01 | 7018023-003 | 117 | U21, U28, U31-U32 | U21,U28,U31,U32 |
Step 2c then creates the following new PartUsageLinks each with single reference designator:
PARENT | CHILD | FIND | REFDESIGNATORS | Problem (shared) Reference Designators |
---|---|---|---|---|
6055283-001-01 | 7018023-001 | 108 | U28-U29, U40-U41 | U28, |
6055283-001-01 | 7018023-002 | 109 | U1-U9, U19-U27, U31-U33, U43-U45 | U21,U31,U32 |
6055283-001-01 | 7018023-003 | 117 | U21, U28, U31-U32 | U21,U28,U31,U32 |
After re-running the pass 1 transforms (step 2d), the PUL/finds look like this:
PARENT | CHILD | FIND | REFDESIGNATOR | Find# for Reference Only Flag (IBA) |
---|---|---|---|---|
6055283-001-001 | 7018023-001 | 108 | U29, U40, U41 | |
6055283-001-001 | 7018023-002 | 109 | U1-U9, U19-U20, U22-U27, U33, U43-45 | |
6055283-001-001 | 7018023-001 | 142 | U28 | |
6055283-001-001 | 7018023-003 | 142 | R | |
6055283-001-001 | 7018023-002 | 143 | U32 | |
6055283-001-001 | 7018023-003 | 143 | R | |
6055283-001-001 | 7018023-002 | 144 | U31 | |
6055283-001-001 | 7018023-003 | 144 | R | |
6055283-001-001 | 7018023-002 | 145 | U21 | |
6055283-001-001 | 7018023-003 | 145 | R |
This illustrates how the code isolated those reference designators which were causing the “deadlock”, isolated them off to their own PartUsageLink / Find Number rows, and pointed back up to them with the “Reference” find number flag (IBA).
Wrap-up transformation rules then removed holes (compressed) new find numbers and looked for opportunities to combine multiple reference designators into a single find number when possible (opportunities created after splitting into “one find number per reference designator”. These are not detailed here).
After these transformations were applied no duplicate reference designators remained and the data could be 100% loaded to Windchill without errors.
5. Solution - Code
These transformations were implemented as a series of Oracle Steps, which were statements, and, in some cases procedural language implemented as stored procedures. These Oracle steps were invoked sequentially from an invoking Unix Shell script. They were broken up such that each one did a small and distinct piece of work, and they generally updated “working” tables to be used in the next Oracle step. Multiple Oracle statements were performed for each above-listed “transformation rule step”, with about 30 Oracle statements in total for the entire transformation.
As an example, here’s a SQL statement for step 1b which identifies children WTPartUsageLink rows which share reference designators with other children of the same parent:
/*
Build table of shared refdesignators containing:
Parent info (partnumber/rev/cage) + child info (partnumber/cage/find) +
refdesig + ‘shared child info’ (partnumber/cage/find)
This is done by linking refdesig back to refdesig, looking for same parent, refdesig, but different
child+findnumber
*/
INSERT INTO refdesig_sharing_trans (
parentpartnumber,
parentpartorganization_name,
parentpartrevision,
childpartnumber,
childpartorganization_name,
findnumber,
schildpartnumber,
schildpartorganization_name,
sfindnumber,
refdesignator
)
SELECT
r1.parentpartnumber,
r1.parentpartorganization_name,
r1.parentpartrevision,
r1.childpartnumber,
r1.childpartorganization_name,
r1.findnumber,
r2.childpartnumber,
r2.childpartorganization_name,
r2.findnumber,
r1.refdesignator
FROM refdesig_trans r1
INNER JOIN refdesig_trans r2 ON (
r2.parentpartnumber=r1.parentpartnumber AND
r2.parentpartorganization_name=r1.parentpartorganization_name AND
r2.parentpartrevision=r1.parentpartrevision AND
(
r2.findnumber <> r1.findnumber OR
r2.childpartnumber <> r1.childpartnumber OR
r2.childpartorganization_name <> r1.childpartorganization_name
)
AND r2.refdesignator=r1.refdesignator;
6. Conclusion
This case study illustrates our migration “90-10 rule”, wherein 10% of the source data requires 90% of the effort to make it conform to the rules of the target system. Nevertheless by a detailed investigation of the source data, a thorough understanding of the business and data modeling rules driving the non-uniqueness (done in conjunction with engineering SMEs), and utilization of some custom Oracle coding, we were able to transform the data to an acceptable state and achieve 100% loading to WTPartUsageLink and PartUsesOccurrence with a 2 week effort, including investigation, rule enumeration, coding and testing.
Enter your email below: