Wednesday, April 29, 2009

How to make inbound 997's catalogue themselves

In Gentran for Windows, the document browsers show you the Document Name and the Reference Data.

When you create a map, for say an 810, you typically capture the BIG02 Invoice Number and use a standard rule to update the Document Record, assigning it to DocumentName, right? And if the BIG04 Purchase Order Number is available, you may assign that to ReferenceData. This makes it easy to browse and find your invoices in the system. And you do the equivalent with any document, inbound or outbound.
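If you prefer to see those rules as code, the extended-rule equivalent would be a sketch like this (assuming the usual 810 element IDs: 0076 for BIG02 and 0324 for BIG04):

update Document set DocumentName = #0076;    // BIG02 Invoice Number
update Document set ReferenceData = #0324;   // BIG04 Purchase Order Number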

You've noticed that the outbound 997's like to mark their referenced group control numbers (AK102) in DocumentName and group status (AK901) in ReferenceData. How nice! If you only had the group id (AK101) in there as well, you could nicely cross-reference your outbound 997's to your inbound functional groups.

But the inbound 997's catalogue nothing in the metadata. (Bummer) Like, if you received a negative 997, you'd like to know which one, right? Oh I know it'll mark your outbound group as failed, but just try finding that 997 in the In Drawer. After all, they might have sent you some AK3's and AK4's to tell you why.

If you want the 997, you can always do a file search: you know, browse into Documents/#year/month/day and search for AK1*IN*653~ ... provided you know the delimiters it came in with. But to find it by query or document browser requires that the 997 catalogue itself in the metadata.

The only solution is to modify the 997 break map.

Caution: Modifying build and break maps is systems level stuff and is not supported by Sterling, so be sure to keep backups so you can rollback if something goes wrong. If you mess up and you need Sterling to help you fix it, the support call may be expensive.

What I always do is to write an extended rule on AK101 [0479], something like this:

string[40] id;

id = #0479 + "*" + #0028; // Essentially, AK101*AK102
update Document set DocumentName = id;

Then down on AK901 [0715], I use a standard rule to update Document-ReferenceData.
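Or, if you prefer to mirror the extended rule above, a one-liner sketch on AK901 does the same thing:

update Document set ReferenceData = #0715;   // AK901 Functional Group Acknowledge Code

Either way, the group status lands in the metadata.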

Now when you browse inbound 997's, you'll see a DocumentName like "IN*653" with ReferenceData = "A".

Another beauty of this is that you can easily query your inbound 997's directly:

select DocumentName, ReferenceData
from Document_tb
where Direction = 0 and TransactionSetID = '997'
and ReferenceData in ('E', 'P', 'R')
order by DateCreated desc

You can even write monitoring code that not only finds your failed outbounds, but cross-references the negative 997's that indicated the failure.

Now isn't that just simply lovely.

One disclaimer: I've noticed that Gentran for Windows seems to indicate a translation failure in the audit log when this modified 997 break map runs. It works correctly, and does update the outbound status as expected, but for some reason it reports an error to the audit log. I haven't figured this one out yet. Still, a minor nuisance on the audit log is a small price to pay for the ability to cross-reference your 997's.

EDI Joke: DOL Unit of Measure

The cXML specification requires that the UnitOfMeasure element contain a code from ISO Recommendation 20. When this is mapped into ANSI X12, it must therefore be translated to ANSI's code list for DE0355.

As long as a proper ISO code is used in the UnitOfMeasure element, the translation works quite well.

The most common problem I found there was that a customer's back-end system might have used ANSI Units of Measure, and when Ariba Buyer was installed, those codes were never translated to the ISO codes required by cXML. In these cases, the cXML data contained an ANSI code that would be interpreted as ISO and then mistranslated.

But the most laughable case I ever saw, which was escalated from the support team, had the customer sending Unit of Measure code DOL. The code DOL does not exist in the ISO R20 table.

Due to a defect in the translation code at the time, if the ISO table lookup failed, the cXML UofM was simply written to the DE0355 element as-is, truncated if necessary; thus DOL became code DO.

Now it just happens that code DO in ANSI represents "Dollars (US)".

I was mystified as to where this code DOL came from and what it was meant to represent.

The customer was using this code DOL to mean Dollars, hence they were happy that it came over as code DO in ANSI. One had to wonder how they were buying Dollars? Who was the supplier, I thought, the treasury? The mint?

No. The customer was actually buying "consulting services". So I said to them, Why don't you buy in units of time, like Man Hours? They wanted to buy in units of budget. And after pressing the customer with the what and why questions that we Systems Analysts always ask, I managed to get them to admit that they had created this non-standard Unit of Measure Code DOL to mean Dollars.

The problem was that while the process just happened to truncate the L and render code DO to the 850, the supplier's invoice could not translate code DO back to DOL.

There is no code in ISO R20 that corresponds to "Dollars (US)", and so DO is one of a handful of unsupported codes. And DOL, being homegrown, does not exist in the ISO R20 code list.

Tsk tsk; have you ever had to tell folks to not use their homegrown codes in this industry?

The solution I recommended to the customer was to use code M4 for Monetary Units. This code happens to be the same in both ISO and ANSI, so it worked both ways and was perfectly acceptable to the supplier.

How to populate the 856's HL segments using only standard rules with accumulators

Introduction

The 856 is one of those hierarchical documents that rely on the HL segment. The HL segment specifies the hierarchy level within the data itself, and then the detail segments used depend on the level specified.

Because of its innate complexity, some folks call the 856 the "document from hell". Typical implementations have either three or five hierarchy levels. Sample structures might be Shipment-Order-Item, or Shipment-Order-Tare-Pack-Item.

There is a certain temptation to use extended rules to populate the HL segments, and I've often seen this done haphazardly. There is a perfectly straightforward way to render the HL segments using standard rules with accumulators only, and it requires nothing more than some simple organization.

In this article, I show my preferred technique for building HL segments. This method can also be used for building an 811.

This method works perfectly well in both Gentran for Windows and GIS. The sample I give here is for an 856 with a Shipment-Order-Item structure, for simplicity. It can easily be adapted to the SOTPI structure. I shall present another article on how a variation of this method can be used to fill the record headers of an IDOC.

HL structure

We recall that the HL segment has this structure:
  1. HL01 [0628] AN 1/12  Hierarchical ID Number
  2. HL02 [0734] AN 1/12  Hierarchical Parent ID Number
  3. HL03 [0735] ID 1/2   Hierarchical Level Code
  4. HL04 [0736] ID 1/1   Hierarchical Child Code
With our example, we'd expect to see HL data something like this:

HL*1**S*1~
HL*2*1*O*1~
HL*3*2*I*0~
HL*4*2*I*0~
HL*5*1*O*1~
HL*6*5*I*0~


Segment and Group Structure and Naming

On the output side, I will physically rearrange the data records. I will paste the HL group inside itself to the required depth. The child HL will of course be at the end of the parent's data segments.

The segment naming will be impacted, since we are implementing concrete repetitions of 2010_HL_grp. My solution is to simply include the level code in the identifier. So 2010_HL_S_grp, 2010_HL_O_grp, and 2010_HL_I_grp. I do this also with the segments. If my patience is thin, I do it only with the segments that I am activating.

Accumulator Usage

What I do to implement this on output is use Accumulator 0 as the HL sequence number, and then assign the next few accumulators to the respective levels.

The accumulators will be named as follows:
  0. HL_id_seq
  1. HL_shipment_id
  2. HL_order_id
  3. HL_item_id
[Technically you don't really need HL_item_id, since it is a leaf, but we define it for consistency and later expansion, if needed.]

So on the first HL for shipment level, on the HL01 [0628] element, I create a standard rule for Use Accumulator. I name Accumulator 0 as "HL_id_seq" and then specify:
  1. Primary=0: HL_id_seq
    1. Increment primary
    2. Use primary

  2. Primary=0: HL_id_seq
    1. Move primary to secondary 1: HL_shipment_id
I find that two parts are needed, as the first part does not allow the move-primary-to-secondary option alongside increment and use; fortunately, an Accumulator type standard rule can have two parts.

HL01 [0628] will always have the first part: inc primary; use primary. This simply generates the HL sequence numbers. The second part of the rule will change depending on the level. The secondary accumulator is used to maintain the ancestry.

At the shipment level, HL02 [0734] is unused, since it represents the root.

For HL03 [0735], I simply use a standard rule with a constant: HL_level_shipment = "S". And HL04 [0736] will be another constant: HL_child_yes = 1.

Now see what happens at the order level.

HL01 [0628] will have its usual stuff:
  1. Primary=0: HL_id_seq
    1. Increment primary
    2. Use primary

  2. Primary=0: HL_id_seq
    1. Move primary to secondary 2: HL_order_id
Notice the only change is to use Accumulator 2: HL_order_id as our secondary.

At the order level, HL02 [0734] now has a rule of its own, since it must capture the parent id, namely that of the shipment. Conveniently, we have that in Accumulator 1: HL_shipment_id. All we have to do is use it.
  1. Primary=1: HL_shipment_id
    1. Use primary
HL03 [0735] is simply a constant: HL_level_order = "O", and HL04 [0736] is again HL_child_yes = 1.

We do much the same thing at the item level.

HL01 [0628] follows its normal convention:
  1. Primary=0: HL_id_seq
    1. Increment primary
    2. Use primary

  2. Primary=0: HL_id_seq
    1. Move primary to secondary 3: HL_item_id
And HL02 [0734] captures the current order id:
  1. Primary=2: HL_order_id
    1. Use primary
HL03 [0735] has its usual constant: HL_level_item = "I", and now HL04 [0736] marks itself as a leaf: HL_child_no = 0.

Technically there is no need to capture HL_item_id, since being a leaf it can never be a parent. The idea here is to keep the rules as consistent as possible. This makes the map much easier for the client to maintain. Suppose they needed to modify this map to have an SOIP hierarchy: they need only change the hierarchy type in the BSN segment and add another level after item. Their changes to the rules will be minimal and thus highly stable through changes.

If you need to render a CTT segment, and CTT01 expects to have the total number of HL segments, all you need to do is create a standard rule with Use primary on the HL_id_seq.
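If you prefer the extended-rule form, the equivalent is a one-liner on CTT01 (a sketch; 0354 is CTT01's element ID, and accumulator 0 holds the final count because every HL incremented it):

#0354 = accum(zero);   // total number of HL segments rendered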

In some implementations there is a concrete 1:1 relationship between shipment and order, that is, strictly one order per shipment. You need only arrange this in your physical segment/group layout to parallel your source data. The standard rules for constructing the HL segment remain exactly the same.

Gotcha!

There is still one outstanding gotcha. There must be an actual mapping from input to output to initiate output of the segment, even though HL is the group lead.

My favourite solution is to use kicker elements on input. These of course are done in temporary records with a $$$ tag. We simply install these elements at the appropriate levels in the input. By convention I name the elements kick_HL_shipment, kick_HL_order, and kick_HL_item. I define them as string 1/1, use an extended rule to populate them with their level codes (S, O, and I respectively), and then map them to HL03. The constant rule on HL03 overrides the value anyway, but I use the appropriate letter for consistency.
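The populating rule itself is a one-liner per level; for example, on the shipment-level temporary record (a sketch, using the kick_HL_shipment name from above):

#kick_HL_shipment = "S";   // level code for the shipment HL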

Summary

We have now created an outbound 856 with the standard in-order traversal through its hierarchy.

If you need to tap the accumulators in extended rules on the output side, you can conveniently use the accum(n) function. (Remember to use the integer constants that we defined to ensure that actual integers are used.) By convention, accum(0) is the global sequencer. This way, the natural numbers 1, 2, 3, etc. correspond directly with their respective levels.
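For example, a minimal sketch of an output-side extended rule tapping the accumulators (assuming the assignments above, and declaring a local constant two, since only zero, one, and inc are in our standard set):

integer two, hl_count, parent_order;

two = 2;                      // pass real integers to accum()
hl_count = accum(zero);       // accumulator 0: HL_id_seq, the running HL count
parent_order = accum(two);    // accumulator 2: HL_order_id, the current order's HL id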

Other Applications

811 Consolidated Services Invoice

Exactly the same methodology can be used to create an outbound 811 Consolidated Services Invoice. The HL03 level codes are different, and there are typically more levels, but the theory is identical.

SAP IDOCs

An SAP IDOC has a segment-level hierarchy: each segment carries a sequence number, a parent segment sequence number, and a level number. It is easy to see how this methodology could be used to populate the IDOC segment header.

(Of course if you are simply importing the IDOC into SAP, you can get away with populating only the record id, but technically that's cheating. I usually populate at least the segment sequence number.)

I intend to create a separate article dedicated to the application of this technique to fully build the IDOC segment header.

Tuesday, April 21, 2009

How to default the N404 country code

This was a cute little feature that I built at Ariba.

The cXML specification mandates the isoCountryCode in any PostalAddress element. Therefore, when processing incoming EDI, such as from an 810 -> InvoiceDetailRequest, the N404 country code is mandatory if the N2, N3 or N4 is used at all.

The problem is that most EDI suppliers in North America do not include N404. A typical N4 may look like this:
N4*Watertown*CT*06795~

Whereas cXML would require this:
N4*Watertown*CT*06795*US~

N404 had to be present, as the cXML.dtd will fail the PostalAddress element if isoCountryCode is absent. This was initially a problem with many suppliers, as they had neglected the N404 element despite its being marked as required in the Implementation Guide.

The solution was based on three premises:
1) An N4 semantic rule requires N402 for any address in USA or Canada.
2) Suppliers were reliable at using correct official state or province codes.
3) US State and Canadian Province codes are mutually exclusive. No state shares the same code with any province.

Ariba uses a specialized system. I present here a workable solution for Gentran for Windows. An equivalent solution can be set up in GIS.

The solution is to create a Division Lookup table called STATES. The Item code is your two-letter code. The Description is the name and Text1 is the country code: US or CA. A few sample rows are shown here:
Item  Description     Text1
NB    New Brunswick   CA
NC    North Carolina  US
ND    North Dakota    US
NE    Nebraska        US
NH    New Hampshire   US
NJ    New Jersey      US
NL    Newfoundland    CA

Now you can have an extended rule on your N404 on input, or your country code on output. The N404 version would look something like this:
string[2] state;
string[3] country;

state = #0156;     // N402 State or Province Code
country = #0026;   // N404 Country Code

if state <> "" & country = "" then begin
    select TEXT1 into country from DivisionLookup
        where TableName = "STATES" and Item = state;
    if country <> "" then begin
        #0026 = country;
    end
    else begin
        cerror(ERR_invalidElemCode, #0156);
    end
end
(Note that ERR_invalidElemCode you've defined as a constant for the appropriate code; I don't have the list in front of me.)

Now you can just map N404 as if it were always there.

This code of course does not take into account an N404 that is inconsistent with N402, and it does nothing to inspect the zip/postal code. But it is a nice solution for when your output mandates a country code and the N4 has not provided it.

Friday, April 17, 2009

Maintain index variables for the segment groups

How often do you find that you need to index the repeating segment groups in your extended rules? I hardly ever do a map where this is not needed. You know, you're down at the detail level and you have to selectively map the N1's or the REF's. So you go and write the piece of code, and then the compiler says that additional indices are needed to represent this data, and it's a pain to dig up all those index values.

Say in our 850 example, you're down in the N1 group (2/350), child of PO1 (2/010) and you need to iterate through your REF segment (2/390). It happens.

Even if you have prudently cleaned up the segment and group id's, you still have this nuisance of trying to do this:

$2390_REF[index(1)][index(2)][idx_REF].#0128 ...

Those index values can get confusing and can lead to performance questions if they are needed frequently. While the code may look simple enough, it is not clearly obvious from the code what the segment group hierarchy really is. Furthermore, there is a temptation to copy such code from another section in which the indices may not apply and thus produce questionable results.

The convention I follow is to maintain a running group level, grplvl, and an index. I want to avoid hardcoding the group level and keep it relative to its parent group. We would like to capture those values for their scope, hold them in variables, and have these variables follow a fixed convention.

This is where our document_grplvl = 0 comes into play. A group at the document level is a child of the document. Let us take a look at the entry to the header level N1 group (1/310).

In the group's onBegin() rule I define a standard preamble:

integer this_N1_grplvl, this_N1;

this_N1_grplvl = document_grplvl + inc;
this_N1 = index(this_N1_grplvl);
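For completeness, here is a sketch of the document.onBegin() declarations this preamble assumes: document_grplvl at the document level, plus the named integer constants (zero, one, inc) used throughout these rules. Your own names may differ.

integer document_grplvl, zero, one, inc;

zero = 0;                 // named constants so that actual integers are used
one = 1;
inc = 1;
document_grplvl = zero;   // the document is the root of the group hierarchy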

Now, anytime in my group that I need the current index, I do not need to use the index function and run the risk of using the wrong index number. The variable this_N1 will have the value that I need.

If I had reason in there to reference the REF segment (1/350), then my code is simply like this:

qual = $1350_REF[this_N1][idx_REF].#0128;

This code is also very portable. If I am going to work with the detail level N1 (2/350), the preamble is copied with only a simple change:

integer this_N1_grplvl, this_N1;

this_N1_grplvl = this_PO1_grplvl + inc;
this_N1 = index(this_N1_grplvl);

Now you see, it is taken for granted that the PO1 group (2/010) had its preamble in place. Since this N1 group is a child of PO1, we simply increment off PO1's grplvl.
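With both preambles in place, the cryptic reference from the top of this article now reads cleanly (a sketch, assuming the PO1 preamble defined this_PO1 in the same way):

qual = $2390_REF[this_PO1][this_N1][idx_REF].#0128;

The reader can see at a glance exactly which occurrences are being addressed.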

Another case: say we need to flag that the bill-to and ship-to addresses were actually present in the header, and then report an error at document.onEnd() if one was missing. Or consider a real-life case I had in which the ship-to and ship-from were optional in themselves, but if either one was used they were both required. Also, if carrier was present, then ship-from and ship-to were both required. We could use booleans to flag that the N1 entity was present, but since the segment groups are 1-based, I like to keep a variable capturing the group that had it, and 0 (zero) if it is not present. This way if I want to use it at document.onEnd(), I can.

What I do for this is, in the document.onBegin() rule, define index capture variables, like this:

integer the_N1_CA, the_N1_SF, the_N1_ST;   // carrier, ship from, ship to

the_N1_CA = zero;
the_N1_SF = zero;
the_N1_ST = zero;

Now when I am in the N1 group, I can then do my test. There is a little gotcha here in that if you are going to be testing for validation, you must capture the value at the field level. Such a rule is really in proper context in the group.onEnd() rule, since it concerns the overall group. But in the validation pass on the map, field data isn't always visible in the onEnd() rule, and so errors are falsely reported.

What I do for this is, in the N1 group's preamble, add another variable called this_N1_entity.

string[3] this_N1_entity;

this_N1_entity = "";

In the N1's 0098 element, all I have is the single assignment this_N1_entity = #0098; now I can use that variable in my onEnd() rule safely. I can also use it further down, as the current N1 entity may influence which REF's I am interested in capturing. So now, to tie this back to our the_N1_... variables, our N1 group's onEnd() may look like this:

if this_N1_entity = "CA" then begin
    the_N1_CA = this_N1;
end
else if this_N1_entity = "SF" then begin
    the_N1_SF = this_N1;
end
else if this_N1_entity = "ST" then begin
    the_N1_ST = this_N1;
end

This code will always remember the last occurrence of any of these entities. But if we wanted to capture only the first occurrence and ignore others, we can do that too, very easily, as shown with a modified CA test:

if this_N1_entity = "CA" then begin
    if the_N1_CA = zero then begin
        the_N1_CA = this_N1;   // capture the first occurrence only
    end
    else begin
        // a repeated CA entity: call this an error or ignore it
    end
end

The test at the document.onEnd() is now greatly simplified:

if the_N1_CA <> zero then begin
    if the_N1_SF = zero | the_N1_ST = zero then begin
        // CA present, but SF or ST is missing
    end
end
else if the_N1_SF <> zero & the_N1_ST = zero then begin
    // SF present without ST
end
else if the_N1_SF = zero & the_N1_ST <> zero then begin
    // ST present without SF
end

Keeping proper index variables for the group hierarchy greatly simplifies your group array references wherever you need them. The extended rules retain very high portability: they can be copied from one group to another with only simple and well-represented changes.

Thursday, April 16, 2009

Segment and Group Identifiers for EDIFACT

This article is further to my earlier article on Segment and Group Identifiers for X12. In this article I discuss how I apply this technique to EDIFACT.

With EDIFACT there are no tables of Header, Detail, Summary. Even though the transaction set (called Message in EDIFACT) still conceptually has that structure, the segments are numbered sequentially straight through. They use 4-digit numbers, so we just use the same methodology. The segment groups each have their own unique 4-digit sequence number, distinct from the segments (so nice!), and are in themselves sequentially numbered, making for a nice little naming convention we can put in the group descriptions.

Looking at the D.00A ORDERS as an example, I would identify the first few segments with descriptions like this:

0020_BGM  Beginning of Message
0030_DTM  Date/Time/Period
0040_PAI  Payment Instructions
0050_ALI  Additional Information
0060_IMD  Item Description
0070_FTX  Free Text
0080_GIR  Related Identification Numbers
0090_RFF  SG1: RFF-DTM
0100_RFF  Reference
0110_DTM  Date/Time/Period
...
Notice the 0090, which is the first segment group. I always put SG1, SG2, ... into the description followed by the segment id range as is normally found in the EDIFACT documentation. Also, I don't worry about the _grp suffix, since it has a unique number anyway. I only added the _grp in X12 because I needed it to make the identifier unique. It is perfectly appropriate to be consistent and name it 0090_RFF_grp if you prefer.

Once again, this naming convention always pays off with much more readable and less error-prone extended rules, and a much more maintainable map for the client.

Wednesday, April 15, 2009

Clean up the Segment and Group Identifiers

This step I find invaluable when creating a new map. An hour or so spent up front to clean up the identifiers will make the rest of development much easier, make error reports and debugging much easier, and improve readability down the road. I have never seen this done on any maps other than those which I wrote, and I really wish others would follow such a standard. When this step is skipped, extended rules remain brutally cryptic; when it is done, they become so much cleaner to work with. In this post, I discuss the technique I always use.

When you create a new map, the EDI template is imported from the standard.mdb MS Access file. The field and record names in the map are generated at load time from a query on this database. By default, the field names are all modeled after the element identifiers and the records after the segment identifiers. The repeating segment groups have "loop ids" assigned arbitrarily from the Sterling file, and do not relate to the actual segment positions.

This creates a few problems, as the map editor requires record and group identifiers to be unique, and it strives to make the field identifiers unique across the board as well. Using the segment id for the record id is fine at the beginning, say with BEG, CUR, REF, PER, etc. But once these segments occur again, such as inside a group, the map editor has to uniquify the name with something like REF:2, DTM:4, and so on. This leads to confusion, as these :2 and :4 suffixes are little more than random numbers.

This leads to even further confusion with the groups. The SAC group might be given a name like 1000_SAC and the AMT group might be 2000_AMT. These numbers are meaningless to the transaction.

What I always do is to include the true segment position with the segment id. This makes it genuinely unique, and very self-documenting.

Take for example the good old 850 from 004010. As we know, X12 divides the documents into tables of segments, typically Header, Detail, and Summary, numbered as 1-2-3. In 004010, the segment positions have 3 digits. So I simply make 4-digit numbers. In such an 850, I will rename the segment records to something like this:

1020_BEG
1040_CUR
1050_REF
1060_PER
...
The group presents a slight problem since X12 does not assign a separate number to the group itself, as EDIFACT does. In our 850 example, the first group is SAC at 1/120, so I just append _grp to the group id, like this:

1120_SAC_grp
1120_SAC
1125_CUR

Notice how very nicely now we have the 1125_CUR totally distinct from the 1040_CUR. Furthermore, the number distinctly identifies the segment's true position within the transaction set. Now your group references within your extended rules become unambiguous and self-documenting.

Let's say at your document.onEnd() rule you have a reason to iterate through your header SAC groups. The code for the extended rule is self-documenting:

integer max_SAC, idx;
string[4] qual;

max_SAC = count($1120_SAC[*]);
idx = one;
while idx <= max_SAC do begin
    qual = $1120_SAC[idx].#1300;   // SAC02 Service/Promotion/Allowance/Charge Code
    if qual = "G830" then begin
        ...
    end
    idx = idx + inc;
end

From this code, it is obvious to the reader which SAC group is being searched here.

As an additional practice, I usually replace the segment descriptions with mixed-case, as all upper-case is awkward to read, and often gets clipped in the map editor.

It takes about an hour to go through a new map and correct all of the segment and group names. I always update them all, even if I am not going to use the segment. I find this hour is always very well spent: an hour here saves me many hours of coding and debugging time. To make such changes later would require fixing all of the extended rules, which might then be incorrectly updated, leading to more bugs.

While this initial clean up may seem to be tedious or redundant, it is an effort that always pays off.