https://DevOpsCloud.io -- Cloud Monk Losang Jinpa

Sams Teach Yourself COBOL in 24 Hours - Hour 7 Manipulating String Data


(TYCb24H 1998)

Hour 7 Manipulating String Data

In Hour 6, “Manipulating Data,” you learned some basic methods for manipulating data. However, programming often requires more-complex data manipulation, especially when working with character strings. In this hour, you learn about manipulating character strings. The following topics are covered:

• The definition of a string

• The String statement

• Using delimiters with the String statement

• The Unstring statement

• Using delimiters with the Unstring statement

A string is a set of characters. It can be described as being all the characters in any particular field. The field, or data item, can also be referred to as a string. Working with strings is a common task in all kinds of computer programming. In your programming, you may need to disassemble a field, or string, of data. On the other hand, you might be required to create a string of data for some special use. For example, you might have a file that has first and last names in separate fields, and you want to print the combined name on an address label.

Some database and spreadsheet systems generate a delimited text file. You may want to read one of these files and separate the values in the individual delimited records into different fields. In other cases, you might want to create a comma- or comma-quote–delimited file to import into one of these systems.


A delimiter is a field-separation character. When data fields are strung together by these different systems, the program that reads the data needs some way to distinguish the individual fields that make up the string. Many systems create what is termed a CSV file, which is a file made up of strings where the individual fields are separated by a comma. CSV stands for “comma-separated values.” Some systems further separate fields by placing quotation marks around the alphanumeric fields. This practice is the origin of the term comma-quote–delimited file. The comma separates the individual fields, and this separation character is known as a delimiter.

COBOL's string-handling statements are very robust. The two basic statements for manipulating a string of data are String and Unstring. String combines data into a single string. Unstring separates a string of data into individual fields.

The String Statement

When you need to merge, or string, multiple data fields into a single field, you should use the String statement. The simplest form of the String statement uses one or more input fields and moves them consecutively into an output field, sometimes referred to as the target field.

           String "ABC" "123" Delimited By Size Into Output-Field.

This String statement results in the value “ABC123” being stored in Output-Field. The Delimited By Size clause indicates that the entire input field is to be used in the String operation.

There are some important rules to remember when using the String statement.

• The target field cannot be reference modified. That is, you may not String Into Output-Field (3:5).

• Numeric fields must be Usage Display data items.

• You may string into Group Level items. I discourage this practice, however, because it is too easy to get invalid data into subordinate numeric data fields.

• The target field is not cleared, or padded with spaces, as in a Move statement. Use caution to ensure that your target field is properly initialized.

If the target of your String operation is too small to contain the characters that are being strung into it, an overflow condition occurs. You may capture this occurrence by coding the On Overflow clause. After this clause, you may place any logic that you desire to execute when an output field overflow occurs. You can also code a corresponding clause—Not On Overflow—to execute any time an overflow condition does not occur.


An overflow condition does not exist if your String statement fails to fill the target field.

Examine the following snippet of code:

       Working-Storage Section.
       01  Data-Field Pic X(20).
       01  Field-1    Pic X(12) Value "Total".
       01  Field-2    Pic X(12) Value "Price".
       Procedure Division.
       Start-String-Example.
           String Field-1 Delimited By Size
                  Field-2 Delimited By Size
              Into Data-Field
              On Overflow
                 Display "String Overflow"
           End-String
           Stop Run
           .

This code contains several notable items. First, the Delimited By clause is repeated on each field that is being strung into the output field. You may list as many fields as you desire before any Delimited clause. The next Delimited clause encountered applies to all prior fields after the preceding Delimited clause. Second, an explicit scope terminator is associated with the String statement. I suggest you use End-String any time you code an On Overflow or Not On Overflow clause and any time the String statement is very long or complex. The End-String makes your code easier to understand.

Notice also that this String always triggers the overflow condition. The reason is that the two fields being strung together are each 12 characters, and the target field is only 20. Because 24 characters cannot fit into 20 positions, the overflow always occurs. In COBOL the actual values of the fields used in the String statement have no bearing on the results when Delimited By Size is used. It is not the size of the data within the field that matters, but the field size itself.
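The point about field size can be checked with a small side-by-side test. The following is a minimal sketch, not part of the book's listings: the program name OvflDemo and the fields Small-Target and Large-Target are hypothetical, and the directive-free layout assumes a compiler that accepts standard fixed-format source, such as GnuCOBOL.

```cobol
       Identification Division.
       Program-Id. OvflDemo.
      * Hypothetical demo: overflow depends on declared field sizes.
       Data Division.
       Working-Storage Section.
       01  Field-1      Pic X(12) Value "Total".
       01  Field-2      Pic X(12) Value "Price".
       01  Small-Target Pic X(20) Value Spaces.
       01  Large-Target Pic X(24) Value Spaces.
       Procedure Division.
       Demo-Start.
      * 12 + 12 = 24 characters into 20 positions: overflows.
           String Field-1 Field-2 Delimited By Size
              Into Small-Target
              On Overflow
                 Display "Small-Target: overflow"
              Not On Overflow
                 Display "Small-Target: no overflow"
           End-String
      * 24 characters into 24 positions: fits exactly.
           String Field-1 Field-2 Delimited By Size
              Into Large-Target
              On Overflow
                 Display "Large-Target: overflow"
              Not On Overflow
                 Display "Large-Target: no overflow"
           End-String
           Display "[" Large-Target "]"
           Stop Run
           .
```

The first String should report an overflow and the second should not, even though the visible text “TotalPrice” is only 10 characters long. The bracketed display shows why: the trailing spaces of each 12-character field are strung along with the text.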

String Delimiters

Why would anyone code a String statement where the target field cannot hold the full size of the source fields? Look at a real-world example. Assume that you have two fields defined: one for a person’s first name and one for his or her last name. You want to print an address label with the full name, and your label is only wide enough for 30 characters. The fields in which you are storing first and last name are 25 characters each. The potential exists for a complete name to exceed the target field, especially when the single space separating the names is added. If the full name would be cut off, or truncated, because it is too long, you want to print only the last name on the label instead. This way you avoid any insulting renditions of the person’s name. To accomplish this task, you need to use a delimiter other than Size in your String statement.

You may delimit, or stop, the operation of the String statement using any value you desire. When the value indicated is encountered, the string operation stops and the delimiter itself is not included in the target field. For the task specified here, you use the space character to terminate the String for the first and last names.


When working with real-world data, you cannot be sure that the first and last names contain single words. It is essential to remember that when a character delimiter is specified, the String operation is terminated the first time that character is encountered. Therefore, if you delimit by space and the field contains “Bobby Sue”, only “Bobby” makes it to the target field. Hour 22, “Other Intrinsic Functions,” covers an efficient way to handle this situation.
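Until the intrinsic-function approach is covered, one common workaround is to delimit by two consecutive spaces instead of one, so that a single embedded space does not stop the operation. The sketch below is an illustration only and is not the technique the book presents in Hour 22; the program name TwoSpace and the sample values are hypothetical.

```cobol
       Identification Division.
       Program-Id. TwoSpace.
      * Hypothetical demo: a two-space delimiter keeps "Bobby Sue".
       Data Division.
       Working-Storage Section.
       01  First-Name    Pic X(25) Value "Bobby Sue".
       01  Last-Name     Pic X(25) Value "Van Horn".
       01  Combined-Name Pic X(30) Value Spaces.
       Procedure Division.
       Demo-Start.
      * One space would stop at "Bobby"; two spaces skip past it.
           String First-Name Delimited By "  "
                  " "        Delimited By Size
                  Last-Name  Delimited By "  "
              Into Combined-Name
              On Overflow
                 Move Last-Name To Combined-Name
           End-String
           Display "[" Combined-Name "]"
           Stop Run
           .
```

The display should begin with “Bobby Sue Van Horn”. The workaround has limits: a value that contains two adjacent spaces is still cut short, and a value that leaves fewer than two trailing spaces in its field never matches the delimiter, so the whole field is strung.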

Key Listing 7.1 into the editor.

Listing 7.1 String Example

       @OPTIONS MAIN
       Identification Division.
       Program-Id. Chapt07a.
      * String Example
       Environment Division.
       Configuration Section.
       Source-Computer. IBM-PC.
       Object-Computer. IBM-PC.
       Data Division.
       Working-Storage Section.
       01  First-Name    Pic X(25) Value Spaces.
       01  Last-Name     Pic X(25) Value Spaces.
       01  Combined-Name Pic X(30) Value Spaces.
       Procedure Division.
       Chapt07a-Start.
           Move "First" To First-Name
           Move "Last" To Last-Name
           String First-Name Delimited By Space
                  " "        Delimited By Size
                  Last-Name  Delimited By Space
              Into Combined-Name
              On Overflow
                 Move Last-Name To Combined-Name
           End-String
           Display "1 " Combined-Name
           Move "A" To First-Name
           Move "B" To Last-Name
           String First-Name Delimited By Space
                  " "        Delimited By Size
                  Last-Name  Delimited By Space
              Into Combined-Name
              On Overflow
                 Move Last-Name To Combined-Name
           End-String
           Display "2 " Combined-Name
           Move Spaces To Combined-Name
           Move "ReallyLongFirst Name" To First-Name
           Move "ReallyLongLastName" To Last-Name
           String First-Name Delimited By Space
                  " "        Delimited By Size
                  Last-Name  Delimited By Space
              Into Combined-Name
              On Overflow
                 Move Last-Name To Combined-Name
           End-String
           Display "3 " Combined-Name
           Stop Run
           .

A single alphanumeric literal, space, has been added to the String statements to separate the two names. For this example, the two input fields, First-Name and Last-Name, are strung into the target field until a space is encountered. Compile and run the program. Your output should look like Figure 7.1.

Figure 7.1 Output from Listing 7.1.

    1 First Last
    2 A Bst Last
    3 ReallyLongLastName

Line 1 of the display is what you might expect. However, line 2 looks strange because the target field, Combined-Name, was not cleared between the String statements. Line 3 contains only the Last-Name because the overflow condition occurred and the Move statement coded for that condition was executed.

The delimiters used by the String statement need not be single characters only. Delimiters can be any character or string of characters. Delimiters do not have to be literals, but can instead be data items. Table 7.1 illustrates the results of stringing different data items using various delimiters.

Table 7.1 Results of String Operations with Various Delimiters


Notice in the second example, when in the first field the delimiter is not encountered, the entire field contents are moved. Notice also that all characters including and after the delimiter are omitted.
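These delimiter rules can be tried with a short program. The sketch below is an illustration with hypothetical names (DelimVar, Source-1, Source-2, End-Marker, Result-Field) and sample values that do not come from Table 7.1; it uses a multi-character delimiter held in a data item and a literal delimiter that never occurs in its field.

```cobol
       Identification Division.
       Program-Id. DelimVar.
      * Hypothetical demo: data-item and multi-character delimiters.
       Data Division.
       Working-Storage Section.
       01  Source-1     Pic X(20) Value "ALPHA<END>IGNORED".
       01  Source-2     Pic X(10) Value "BETA".
       01  End-Marker   Pic X(5)  Value "<END>".
       01  Result-Field Pic X(20) Value Spaces.
       Procedure Division.
       Demo-Start.
      * Source-1 stops at the marker held in End-Marker.
      * Source-2 has no "*", so the whole 10-byte field is moved.
           String Source-1 Delimited By End-Marker
                  "/"      Delimited By Size
                  Source-2 Delimited By "*"
              Into Result-Field
           End-String
           Display "[" Result-Field "]"
           Stop Run
           .
```

The result should begin with “ALPHA/BETA”. Everything from “<END>” onward in Source-1 is dropped, while Source-2 contributes all 10 of its positions, trailing spaces included, because its delimiter is never found.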

Occasionally, you may want to String fields into a target field starting from other than the first position. The obvious answer might be to use reference modification on the target field, but COBOL prohibits that practice. However, there is another way to accomplish this task.

You may add a Pointer clause to the String statement. The Pointer indicates the starting position in the target field for the String operation. When the String operation is complete, this Pointer is updated to contain the next position in the target field. The Pointer must be a numeric variable of sufficient size to hold the number of character positions in the target field. If the field is 100 characters long, a pointer variable with a Picture of 9(2) is too small. The pointer variable must always have a value greater than zero.


When using the String statement with the Pointer clause, you must be certain that you have initialized the field with the desired value.

Assume that Target-Field is defined with a value of “TEST FIELD”, and you want to change the word “FIELD” to “FILES”, using the String statement. You can define a numeric field named String-Pointer, set its value to 6, and then issue the following COBOL statement:

           String "FILES" Delimited By Size
              Into Target-Field
              With Pointer String-Pointer
           End-String

After this string operation, the value of String-Pointer is 11.

One common use for the Pointer clause is to format data that requires special edit patterns. Sometimes these edit patterns can change based on the number of positions or values of the specific data items. For example, a telephone number might be formatted (999) 999-9999, or just 999-9999 if the area code is not provided. The Pointer clause on the String statement can hold the starting position for the seven-digit number portion of the telephone number. If the area code exists and is strung first as “(999)”, the value of the pointer will be 6 (or 7 once the separating space has also been strung); otherwise, it will be 1. When the rest of the telephone number is strung into the target field, the number will be properly positioned.
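The telephone example can be sketched as follows. This is a minimal illustration under stated assumptions: the program name PhoneFmt and all field names and sample digits are hypothetical, and the separating space is strung together with the area code, so the pointer stands at 7 rather than 6 before the number itself is strung.

```cobol
       Identification Division.
       Program-Id. PhoneFmt.
      * Hypothetical demo: Pointer positions the 7-digit number.
       Data Division.
       Working-Storage Section.
       01  Area-Code     Pic X(3)  Value "212".
       01  Exchange-Code Pic X(3)  Value "555".
       01  Subscriber-No Pic X(4)  Value "0142".
       01  Phone-Out     Pic X(14) Value Spaces.
       01  Phone-Pointer Pic 9(2)  Value 1.
       Procedure Division.
       Demo-Start.
           Move Spaces To Phone-Out
           Move 1 To Phone-Pointer
      * String the area code only when one was supplied.
           If Area-Code Not = Spaces
              String "(" Area-Code ") " Delimited By Size
                 Into Phone-Out
                 With Pointer Phone-Pointer
              End-String
           End-If
      * Phone-Pointer is now 7 (area code present) or still 1.
           String Exchange-Code "-" Subscriber-No
                  Delimited By Size
              Into Phone-Out
              With Pointer Phone-Pointer
           End-String
           Display "[" Phone-Out "]"
           Display "Next position: " Phone-Pointer
           Stop Run
           .
```

With the sample values, Phone-Out should hold “(212) 555-0142” and the pointer should finish at 15, one past the last position filled. Moving spaces to Area-Code before the run should instead place “555-0142” at the start of the field.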

The Unstring Statement

Sometimes, instead of creating a new string, you need to split an existing string into separate fields. You might receive data in a file that contains a first name, middle initial, and last name, and need to place each part in its own data field. To handle this task, COBOL provides a statement called Unstring.

Unstring Delimiters

Unstring, in its simplest form, merely splits a field into parts based on a delimiter. As with the String statement, the delimiter may be a single character, a nonnumeric variable, or a nonnumeric literal. The target field or fields of an Unstring statement are not initialized before the Unstring statement moves values into them. You must use caution to ensure that the target fields are properly initialized.

Unstring uses a single source field and one or more target fields. The source field may not be reference modified. Unstring examines the source field character by character, moving the data into the first target field. When the specified delimiter is encountered, the Unstring process begins to fill the next target field. If you have a data item that contains a name, for example, “John Joe Jones”, that you want to split into separate fields, code the following Unstring statement:

000044     Unstring Source-Field Delimited By Space
000045        Into Target-1, Target-2, Target-3
000046     End-Unstring

Unstring supports the use of the End-Unstring explicit scope terminator. I suggest that you use End-Unstring whenever your Unstring statement uses any optional clauses or extends over several lines.

What would happen if your source field contained “John Joe   Jones”, with several spaces between Joe and Jones? If you use the code example in lines 44-46, you will end up with Target-1 containing “John”, Target-2 containing “Joe”, and Target-3 containing spaces. Without the word All, every individual space is a delimiter, so the second of two adjacent spaces ends an empty field, and that empty field is what lands in Target-3. To treat a run of repeated delimiters as a single delimiter, insert the word All before the delimiter. The following Unstring statement properly handles the input field example:

           Unstring Source-Field Delimited By All Space
              Into Target-1, Target-2, Target-3
           End-Unstring

When you use Unstring, you may use multiple delimiters. Your source field might contain “Jones, Joe John”, and you might want to separate this into three different fields. If you were restricted to only a single delimiter, you would have to issue two Unstring statements to handle this input. However, Unstring allows you to use multiple delimiters:

           Unstring Source-Field Delimited By All Space Or
                                              All ","
              Into Target-1, Target-2, Target-3
           End-Unstring

In this example, if either a space or a comma is encountered, the Unstring proceeds to the next target field.

In addition, the Unstring statement enables you to count the number of target fields that it actually changes. For example, you can determine whether the source field has two names or three by coding the Tallying In clause. When you use this clause, the numeric variable that is specified after Tallying In is incremented by the number of target fields changed.


When using Tallying In, you must make sure to reset to zero the numeric data item being used before each Unstring statement. The tally is incremented by the Unstring statement, but is not set to zero at the start.

           Move Zeros To Numeric-Counter
           Unstring Source-Field Delimited By All Space Or
                                              All ","
              Into Target-1, Target-2, Target-3
              Tallying In Numeric-Counter
           End-Unstring

In this example, if the source field contains “David Jones”, the field Numeric-Counter has a value of 2 after the Unstring operation.
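The tally can then drive later logic, for example deciding whether a middle name was present. The following is a minimal sketch with a hypothetical program name (TallyDem) and assumed field sizes; it delimits by spaces only, to keep the example small.

```cobol
       Identification Division.
       Program-Id. TallyDem.
      * Hypothetical demo: Tallying In counts the targets changed.
       Data Division.
       Working-Storage Section.
       01  Source-Field    Pic X(30) Value "David Jones".
       01  Target-1        Pic X(10) Value Spaces.
       01  Target-2        Pic X(10) Value Spaces.
       01  Target-3        Pic X(10) Value Spaces.
       01  Numeric-Counter Pic 9(2)  Value Zeros.
       Procedure Division.
       Demo-Start.
      * Reset the targets and the tally before every Unstring.
           Move Spaces To Target-1 Target-2 Target-3
           Move Zeros To Numeric-Counter
           Unstring Source-Field Delimited By All Space
              Into Target-1, Target-2, Target-3
              Tallying In Numeric-Counter
           End-Unstring
           If Numeric-Counter = 2
              Display "Two names: " Target-1 Target-2
           Else
              Display "Names found: " Numeric-Counter
           End-If
           Stop Run
           .
```

With “David Jones” in the source field, the two-name branch should be taken and Target-3 should stay blank. If the Move Zeros statement were dropped and the paragraph executed twice, the second pass would leave 4 in Numeric-Counter, which is why the reset matters.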

It might be desirable to know how many characters from the source field were examined for a particular target field. You can capture this information by coding the Count In clause immediately after that target field. Delimiters that are encountered are not included in the count. Count In references a numeric data item, and each target field may have its own.


When using the Count In clause and a delimiter other than spaces, the result might not be what you expect. If you are not using spaces as a delimiter, any spaces encountered are added to the character count that is stored in the associated Count In data item.

           Move Zeros To Numeric-Counter
           Move Zeros To Character-Counter
           Unstring Source-Field Delimited By All Space Or
                                              All ","
              Into Target-1, Target-2, Target-3
                 Count In Character-Counter
              Tallying In Numeric-Counter
           End-Unstring

If the source field has a value of “Expect A Miracle”, the value in Character-Counter is 7 after the Unstring is executed, because the Count In clause follows Target-3 and therefore counts only “Miracle”; Numeric-Counter is 3. The space characters between the words are delimiters and are not added to the data item specified by Count In. To obtain the total of 14 characters for all three words, code a Count In clause for each target field and add the results.
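Because a Count In clause belongs to the target field it follows, a total across several fields needs one counter per target. The sketch below illustrates this with hypothetical names (CountDem, Count-1 through Count-3, Total-Count, Field-Tally) and assumed field sizes.

```cobol
       Identification Division.
       Program-Id. CountDem.
      * Hypothetical demo: one Count In per target, then a total.
       Data Division.
       Working-Storage Section.
       01  Source-Field Pic X(20) Value "Expect A Miracle".
       01  Target-1     Pic X(10) Value Spaces.
       01  Target-2     Pic X(10) Value Spaces.
       01  Target-3     Pic X(10) Value Spaces.
       01  Count-1      Pic 9(2)  Value Zeros.
       01  Count-2      Pic 9(2)  Value Zeros.
       01  Count-3      Pic 9(2)  Value Zeros.
       01  Total-Count  Pic 9(3)  Value Zeros.
       01  Field-Tally  Pic 9(2)  Value Zeros.
       Procedure Division.
       Demo-Start.
      * Each Count In belongs to the target field it follows.
           Unstring Source-Field Delimited By All Space
              Into Target-1 Count In Count-1
                   Target-2 Count In Count-2
                   Target-3 Count In Count-3
              Tallying In Field-Tally
           End-Unstring
           Add Count-1 Count-2 Count-3 Giving Total-Count
           Display "Counts: " Count-1 " " Count-2 " " Count-3
           Display "Total : " Total-Count
           Display "Fields: " Field-Tally
           Stop Run
           .
```

The three counts should be 6, 1, and 7, giving a total of 14 with a tally of 3 fields. The delimiting spaces, including the trailing padding of the source field, are not counted.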

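An overflow condition occurs in an Unstring statement when all the target fields have been used and unexamined characters still remain in the source field, or when the Pointer value is less than 1 or greater than the length of the source field. A target field that is too small for its portion of the data does not raise the condition by itself; the data is simply truncated, as in a Move statement. As with the String statement, you can capture the overflow by coding the On Overflow clause. However, the On Overflow clause does not tell you how much of the source field was left unprocessed.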

The last delimiter encountered can be captured by using the Delimiter In clause. When this clause is used, the last delimiter is stored in the associated data item. If the end of the source field is encountered, the stored delimiter is spaces if alphanumeric or zeros if numeric.
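The Delimiter In and On Overflow clauses can be observed together in a short test. The following sketch uses hypothetical names (DelimIn, Key-Part, Value-Part, Found-Delim) and sample data; it assumes the standard rule that Unstring raises overflow when source characters remain after every target field has been used.

```cobol
       Identification Division.
       Program-Id. DelimIn.
      * Hypothetical demo: Delimiter In and On Overflow.
       Data Division.
       Working-Storage Section.
       01  Source-Field Pic X(20) Value "COLOR:RED".
       01  Key-Part     Pic X(10) Value Spaces.
       01  Value-Part   Pic X(10) Value Spaces.
       01  Found-Delim  Pic X     Value Space.
       Procedure Division.
       Demo-Start.
      * Two targets cover both pieces, so no overflow is expected.
           Unstring Source-Field Delimited By "=" Or ":"
              Into Key-Part Delimiter In Found-Delim
                   Value-Part
              On Overflow
                 Display "Pass 1: overflow"
              Not On Overflow
                 Display "Pass 1: no overflow"
           End-Unstring
           Display "Key [" Key-Part "] Delim [" Found-Delim "]"
           Display "Value [" Value-Part "]"
      * One target, two words: data is left over, so overflow.
           Move "ONE TWO" To Source-Field
           Move Spaces To Key-Part
           Unstring Source-Field Delimited By All Space
              Into Key-Part
              On Overflow
                 Display "Pass 2: overflow, got [" Key-Part "]"
           End-Unstring
           Stop Run
           .
```

On a standard-conforming compiler, the first pass should store “:” in Found-Delim and report no overflow. The second pass should report overflow with “ONE” in Key-Part, because “TWO” is never examined.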

The Pointer clause can indicate the starting position in the source field where you desire the Unstring operation to begin. The data item associated with the Pointer clause must be numeric and have a value greater than zero. You should be sure the field is properly initialized before the next Unstring statement. Listing 7.2 combines many of the features discussed so far. This example accepts a simple mathematical expression and dissects it, displaying the components of the expression. The program requires two Unstring statements.

Listing 7.2 Unstring Example

       @OPTIONS MAIN
       Identification Division.
       Program-Id. Chapt07x.
      * Unstring Example
       Environment Division.
       Configuration Section.
       Source-Computer. IBM-PC.
       Object-Computer. IBM-PC.
       Data Division.
       Working-Storage Section.
       01  Expression-In    Pic X(10) Value Spaces.
       01  First-Term       Pic X(5)  Value Spaces.
       01  Second-Term      Pic X(5)  Value Spaces.
       01  Operation        Pic X     Value Spaces.
       01  Unstring-Pointer Pic 9(2)  Value Zeros.
       Screen Section.
       01  Main-Screen Blank Screen.
           03  Line 01 Column 01 Value "Enter Expression:".
           03  Line 01 Column 19 Pic X(10) Using Expression-In.
           03  Line 03 Column 01 Value "First Term ".
           03  Line 04 Column 01 Value "Second Term ".
           03  Line 05 Column 01 Value "Operation ".
           03  Line 03 Column 13 Pic X(5) From First-Term.
           03  Line 04 Column 13 Pic X(5) From Second-Term.
           03  Line 05 Column 13 Pic X From Operation.
       Procedure Division.
       Chapt07x-Start.
           Display Main-Screen
           Accept Main-Screen
           Unstring Expression-In
              Delimited By "+" Or "-" Or "*" Or "/"
              Into First-Term
                 Delimiter In Operation
                 Count In Unstring-Pointer
           End-Unstring
           Add 2 To Unstring-Pointer
           Unstring Expression-In
              Delimited By "="
              Into Second-Term
              Pointer Unstring-Pointer
           End-Unstring
           Display Main-Screen
           Stop Run
           .

It is entirely permissible, and often desirable, to use Unstring to strip off only a single portion of a source field. The preceding program uses this technique to capture the delimiter. The delimiter, which is the mathematical symbol of the expression entered, is stored in the Operation field. The first term of the expression is stored in the First-Term field. The length of the first term is stored in the Unstring-Pointer field.

The next step is to position the pointer for the start of the next Unstring. The pointer needs to be positioned at the first character after the delimiter. If the first term had three characters before the delimiter, the value of Unstring-Pointer is 3. Then 2 is added to achieve the start position for the next Unstring, which is 5. The delimiter is in the fourth position, and the first character of the second term is in the fifth.
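When the Pointer clause is coded on the Unstring itself, the statement advances the pointer past the delimiter automatically, so no manual addition is needed. The sketch below uses that behavior to walk through a comma-delimited record one field at a time. It is an illustration only: the program name PtrLoop, the field names, and the sample record are hypothetical, and the loop limit of 20 is simply the declared length of Csv-Record.

```cobol
       Identification Division.
       Program-Id. PtrLoop.
      * Hypothetical demo: step through a record with Pointer.
       Data Division.
       Working-Storage Section.
       01  Csv-Record   Pic X(20) Value "RED,GREEN,BLUE".
       01  One-Field    Pic X(10) Value Spaces.
       01  Scan-Pointer Pic 9(2)  Value 1.
       Procedure Division.
       Demo-Start.
           Move 1 To Scan-Pointer
      * Stop once the pointer has moved past position 20.
           Perform Until Scan-Pointer > 20
              Move Spaces To One-Field
              Unstring Csv-Record Delimited By ","
                 Into One-Field
                 With Pointer Scan-Pointer
              End-Unstring
              Display "[" One-Field "]"
           End-Perform
           Stop Run
           .
```

The loop should display RED, GREEN, and BLUE in turn, with Scan-Pointer taking the values 5, 11, and 21. The last pass finds no comma, so the rest of the record is examined and the pointer ends one past the field length, which ends the loop.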

Enter, compile, and run the program. Experiment with it. Enter various expressions and examine the results. Try things like “17-6=”, and “A*123=”.

Summary

In this hour, you learned the following:

• You can use the String and Unstring statements to manipulate data fields.

• You can use delimiters to determine the action of the statements.

• With the String statement, Delimited By Size causes the entire source field to be moved into the target field.

• The Pointer clause can position the String statement at various points in the target field.

• Unstring can strip characters from a source field into one or more target fields.

Q&A

Q When using the String statement, is the target field cleared before the String operation is performed?

A No, so that you can execute multiple string operations into a single target. The Pointer clause allows you to position the next character in the target field.

Q Can I string more than two fields together in a single String statement?

A Yes. You can list the different fields you want to use in a single String statement.

Q What if I want to use a different delimiter for each field?

A You can list the different delimiters with each field you are stringing. If all the fields use the same delimiter, you only need to specify the Delimited By clause once, after all the fields are listed.

Q Must the delimiters always be single characters? Can I use something like “SEPARATOR” as a delimiter?

A Delimiters can be of any size that can be contained in the source field. The word “SEPARATOR” can be used as a delimiter.

Q How do I find out how many fields are found when I unstring a field? I don’t know how many to expect.

A You can determine the number of target fields used by an Unstring operation by specifying the Tallying In clause on your Unstring statement. The tally field is incremented by the number of fields changed. Be careful that you initialize the tally field each time it is used, as the Unstring statement does not automatically do this for you.

Q I already used Unstring to operate on part of a field. I want to Unstring some more data, but I don’t want to start over at the beginning of the field. I know that reference modification is not allowed. What should I do?

A You may use the Pointer clause of the Unstring statement to indicate a data field containing the position of the next character that should be included in the Unstring operation.

Workshop

To help reinforce your understanding of the material presented in this hour, refer to the section “Quiz and Exercise Questions and Answers” that can be found on the CD. This section contains quiz questions and exercises for you to complete, as well as the corresponding answers.

Fair Use Sources:

B001FWIJGE (TYCb24H 1998)
