Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The OCCURS DEPENDING ON clause is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often than required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
Depends on the content of the table. Only one type, say, binary?
Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization is a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
Perhaps the COBOL compiler you are using already knows the best way to initialize an array/table? You could, for example, say:
MOVE LOW-VALUES TO WS-TABLE
I would also be interested to know whether or not you have tried
different methods of initializing the table and timed the different
attempts? What is the difference between the try illustrated above
versus (say) "INITIALIZE WS-TABLE" or other methods like:
PERFORM VARYING IND FROM 1 BY 1 UNTIL IND > WS-BYTES-IN-WS-TABLE
MOVE SPACE TO WS-TABLE (IND:1)
END-PERFORM
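Something along these lines (an untested sketch; WS-TABLE, the one-million-byte size and the repeat count are placeholders to adjust) would let you time all three approaches on your own compiler:

identification division.
program-id. init-timing.
data division.
working-storage section.
01  ws-table               pic x(1000000).
01  ws-bytes-in-ws-table   pic 9(7) comp-5 value 1000000.
01  ind                    pic 9(7) comp-5.
01  t1                     pic 9(8).
01  t2                     pic 9(8).
procedure division.
*> ACCEPT FROM TIME only gives hhmmsscc, so repeat each method
*> enough times for the differences to show
    accept t1 from time
    perform 100 times
        move low-values to ws-table
    end-perform
    accept t2 from time
    display "move low-values : " t1 " " t2
    accept t1 from time
    perform 100 times
        initialize ws-table
    end-perform
    accept t2 from time
    display "initialize      : " t1 " " t2
    accept t1 from time
    perform 100 times
        perform varying ind from 1 by 1
                until ind > ws-bytes-in-ws-table
            move space to ws-table (ind:1)
        end-perform
    end-perform
    accept t2 from time
    display "refmod byte loop: " t1 " " t2
    stop run.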
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
Probably the most efficient way is to set up an independent
record with all values initialized. Move that record to the
table as needed.
I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.
A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.
I am not really answering for an in memory table and may advise against it. Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.
I will only address a suggestion that you write a temporary sort-work using an ISAM file. It is extremely quick and efficient. I gave up using COBOL Sort in the 1980's and moved to using temporary ISAM files and let the file system handle sorting.
I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.
If you need more on what primary key to write, what secondary key to write then I can expand.
You wrote - initialized constantly - and that needs more explanation.
Greg
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Probably the most efficient way is to set up an independent
record with all values initialized. Move that record to the
table as needed.
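For what it's worth, the straightforward reading of that suggestion is a separate record with VALUE clauses whose layout matches one table entry byte for byte, moved in just before the entry is used. A rough sketch (the template names are made up; the ODO layout is borrowed from Kellie's later posts):

01  ws-table-counter        pic 9(5) comp-5 value 0.
01  ws-entry-template.
    05  tpl-plan            pic x(8)            value spaces.
    05  tpl-member          pic s9(9)v99 comp-3 value zero.
01  ws-repository.
    03  ws-table-items occurs 1 to 53000 times
            depending on ws-table-counter.
        05  table-plan      pic x(8).
        05  table-member    pic s9(9)v99 comp-3.

*> in the procedure division, just before filling occurrence n:
    add 1 to ws-table-counter
    move ws-entry-template to ws-table-items (ws-table-counter)

Because the template and the entry are the same length, the group MOVE is a straight byte copy, so the packed zero in the template arrives intact.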
I just tested this code as an independent record to initialize the table:
for every needed occurrence: move ws-repository to ws-table-items.
it should work just fine since an alphanumeric move is done one byte at
a time from left to right, and stops when the end of the shortest field
is encountered. I think the compiler should issue a warning message though about moving a field to a part of itself just as a notice information.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-repository.
05 format-table.
10 format-alphanumeric pic x(8) value spaces.
10 format-numeric pic s9(9)v99 comp-3 value +0.
05 ws-table-items.
10 filler occurs 1 to 53000 times depending on ws-table-counter.
15 table-plan pic x(8).
15 table-member pic s9(9)v99 comp-3.
Depends on the content of the table. Only one type, say, binary?
Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization are a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
Depends on the content of the table. Only one type, say, binary?
Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization are a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
You say that you use a binary search. Wouldn't that need all
elements of the table to be initialised first?
On Friday, June 1, 2018 at 1:18:26 AM UTC-4, r.....@gmail.com wrote:
On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
Depends on the content of the table. Only one type, say, binary?
Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization are a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
You say that you use a binary search. Wouldn't that need all
elements of the table to be initialised first?
No. The number of entries in the table is variable.
Given that 0 <= N <= 53000, only N values will participate
in the binary search. Those M where N < M <= 53000 need not
be initialized.
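To make that concrete: with an ODO table declared roughly as below (Kellie's field names; ws-wanted-plan is invented for the example), SEARCH ALL only ever examines occurrences 1 through ws-table-counter, so whatever happens to sit in the unused tail is never looked at:

01  ws-table-counter   pic 9(5) comp-5 value 0.
01  ws-wanted-plan     pic x(8).
01  ws-repository.
    03  ws-table-items occurs 1 to 53000 times
            depending on ws-table-counter
            ascending key is table-plan
            indexed by table-index.
        05  table-plan     pic x(8).
        05  table-member   pic s9(9)v99 comp-3.

    search all ws-table-items
        at end
            display "not found"
        when table-plan (table-index) = ws-wanted-plan
            display "found " table-plan (table-index)
    end-search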
On Saturday, June 2, 2018 at 12:44:49 AM UTC+10, Rick Smith wrote:
On Friday, June 1, 2018 at 1:18:26 AM UTC-4, r.....@gmail.com wrote:
On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
Depends on the content of the table. Only one type, say, binary?
Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization are a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
You say that you use a binary search. Wouldn't that need all
elements of the table to be initialised first?
No. The number of entries in the table is variable.
Given that 0 <= N <= 53000, only N values will participate
in the binary search. Those M where N < M <= 53000 need not
be initialized.
Let's hear it from the OP.
She says it's a "mammoth table".
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Your thoughts and opinions are appreciated.
On Friday, June 1, 2018 at 11:35:23 AM UTC-4, robin....@gmail.com wrote:
On Saturday, June 2, 2018 at 12:44:49 AM UTC+10, Rick Smith wrote:
On Friday, June 1, 2018 at 1:18:26 AM UTC-4, r.....@gmail.com wrote:
On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
Depends on the content of the table. Only one type, say, binary? Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization are a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
You say that you use a binary search. Wouldn't that need all
elements of the table to be initialised first?
No. The number of entries in the table is variable.
Given that 0 <= N <= 53000, only N values will participate
in the binary search. Those M where N < M <= 53000 need not
be initialized.
Let's hear it from the OP.
She says it's a "mammoth table".
Let's not. The 0 and 53000 were given by the OP. The rest is
derivable from the COBOL standard.
Have you considered using a hash table rather than using a binary search?
Make the table larger, say double, and calculate a hash from the key. For example take the alpha and redefine as a binary numeric, divide by the table size and use the remainder as the 'bucket number' index to store the entry.
Then the lookup (in idealized conditions) will be a single calculation and lookup rather than a series of divides and comparisons.
Of course it is unlikely to be idealized and so an overflow mechanism will be required for when several items calculate the same 'bucket number'. This can be done by adding an 'overflow chain' field to each item. Several different strategies could be used. For example: on overflow try to put the item in the next empty bucket, or at some offset, or in a reserved overflow area.
Packing density needs to be quite low to avoid as much overflow as possible. It is usual to analyze the actual data with several algorithms in order to choose a reasonable one.
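A minimal open-addressing sketch of that idea (untested; the 8-byte key, the bucket count of 106001 and the fold-the-key loop are all invented for illustration -- the REDEFINES trick described above would do as well):

01  ws-hash-size        pic 9(9) comp-5 value 106001.
01  ws-hash             pic 9(9) comp-5.
01  ws-probe            pic 9(9) comp-5.
01  i                   pic 9(4) comp-5.
01  ws-key              pic x(8).
01  ws-buckets.
    05  ws-bucket occurs 106001 times.
        10  bkt-key     pic x(8).
        10  bkt-member  pic s9(9)v99 comp-3.

clear-buckets.
*> once, before any inserts: mark every bucket empty
    move low-values to ws-buckets.

find-bucket.
*> fold the key into a number, take the remainder as the bucket,
*> then probe forward on collision
    move 0 to ws-hash
    perform varying i from 1 by 1 until i > 8
        compute ws-hash = function mod
            (ws-hash * 31 + function ord (ws-key (i:1)), ws-hash-size)
    end-perform
    compute ws-probe = ws-hash + 1
    perform until bkt-key (ws-probe) = low-values
               or bkt-key (ws-probe) = ws-key
        compute ws-probe = function mod (ws-probe, ws-hash-size) + 1
    end-perform.
*> ws-probe now indexes the matching entry or the empty slot to
*> insert into; keep the packing density low or this probe loop
*> crawls (and spins if the table ever fills completely)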
There's no indication that her test table is the same
as the one actually used.
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
Hi Kellie
Anyway, I can give code samples if you email me directly to gregwebace at gmail.com for more. It seems Google Groups does not allow a direct email address.
So far you talk about a binary search but do not reveal what the search key is. You say it needs to be refreshed from some master data.
I can add more re:
I am very intrigued by your two pass report structure.
Greg
Have you considered using a hash table rather than using a binary search ?
Make the table larger, say double, and calculate a hash from the key. For example take the alpha and redefine as a binary numeric, divide by the table size and use the remainder as the 'bucket number' index to store the entry.
Then the lookup (in idealized conditions) will be a single calculation and lookup rather than a series of divides and comparisons.
Of course it is unlikely to be idealized and so an overflow mechanism will be required for when several items calculate the same 'bucket number'. This can be done by adding an 'overflow chain' field to each item. Several different strategies could be used. For example: on overflow try to put the item in the next empty bucket, or at some offset, or in a reserved overflow area.
Packing density needs to be quite low to avoid as much overflow as possible. It is usual to analyze the actual data with several algorithms in order to choose a reasonable one.
The table is 1 MByte sized and will be searched often so
a binary search would be more efficient and simpler. Hash
tables are used for relative files, my system is using ISAM
files. I always use ISAM files as lookup search tables when
the table size is rather huge for a binary search all.
You seem to miss the point that a hash can be used for an array in memory as well as for a relative file. A hash can be _much_ more efficient than a binary search given an adequate algorithm and a sufficiently small packing density to avoid too much overflow.
On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.
A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.
I am not really answering for an in memory table and may advise against it. Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.
I will only address a suggestion that you write a temporary sort-work using an ISAM file. It is extremely quick and efficient. I gave up using COBOL Sort in the 1980's and moved to using temporary ISAM files and let the file system handle sorting.
I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.
If you need more on what primary key to write, what secondary key to write then I can expand.
You wrote - initialized constantly - and that needs more explanation.
Greg
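If it helps, a bare-bones version of that sort-work idea looks something like the following (a sketch only; the file name, the 32-byte key and the sample records are invented). OPEN OUTPUT is the whole "initialization", and READ NEXT hands the records back in key order with no SORT verb in sight:

identification division.
program-id. sortwork-demo.
environment division.
input-output section.
file-control.
    select sort-work assign to "sortwork.dat"
        organization is indexed
        access mode  is dynamic
        record key   is sw-key
        file status  is ws-status.
data division.
file section.
fd  sort-work.
01  sort-work-record.
    05  sw-key          pic x(32).
    05  sw-amount       pic s9(9)v99 comp-3.
working-storage section.
01  ws-status           pic xx.
procedure division.
*> pass 1: open output creates (or empties) the file, then write
*> whatever the user's selection criteria produce
    open output sort-work
    move "PLAN-B" to sw-key
    move 200.00   to sw-amount
    write sort-work-record
        invalid key display "write failed " ws-status
    end-write
    move "PLAN-A" to sw-key
    move 100.00   to sw-amount
    write sort-work-record
        invalid key display "write failed " ws-status
    end-write
    close sort-work
*> pass 2: sequential READ NEXT returns the records in key order,
*> so the file system has already done the sorting
    open input sort-work
    perform until ws-status = "10"
        read sort-work next record
            at end continue
            not at end display sw-key
        end-read
    end-perform
    close sort-work
    stop run.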
On 31/05/2018 3:15 PM, Greg Wallace wrote:
On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
This might not be the "right" question.
Maybe you need to think about whether you need a table at all, rather
than how it should be initialized...? See below.
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.
A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.
I am not really answering for an in memory table and may advise against it. Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.
I will only address a suggestion that you write a temproary sort-work using an ISAM file. It is extremly quick and efficient. I gave up using COBOL Sort in the 1980's and moved to using temproary ISAM files and let the file system handle sorting.
I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.
If you need more on what primary key to write, what secondary key to write then I can expand.
You wrote - initialized constantly - and that needs more explanation.
Greg
I just wanted to note in passing that I was betting someone would
suggest using an ISAM file.
It's a very good solution.
(Like Greg, I too have been writing COBOL for 40+ years, so maybe it's
an "Olde Tyme Solution"... :-))
You can discuss all kinds of clever ways to optimize a binary search
(no-one so far has suggested an unbalanced or skewed chop...), You can
look at clever hashing algorithms and re-invent in memory the file
system with buckets and overflow that was implemented by ICL in the
1960s, you can use refmodding to split the table as you insert each
entry in sequence (having first initialized to high-values), but they
all obfuscate what the real requirement is:
You need to build and organize a list into a specific key sequence (and
it is a "big" list...)
Kellie put it in memory because "everybody knows" "Memory must be faster".
(Generally, of course, it is... but if you spend a great deal of time messing around with your memory-based entries and moving great hunks of
your table around, it certainly won't be as fast as you might hope.)
Given the same requirements, (and given I can't use LINQ) I would opt
for the same solution that Greg has suggested.
Here's why:
1. I HATE, LOATHE, and DETEST OCCURS DEPENDING and simply won't use it.
It is a pointless bloody waste of time that lulls you into thinking you
are using memory in an optimized way but goes ahead and allocates
maximum space anyway. You save nothing with it.
(OK, as Rick pointed out, in this case it effectively "limits" the scope
of the binary chop, but that is not compelling enough for me to change
my mind about it... :-))
2. The problem of initializing the table for different data types is
removed if you simply load it sequentially from an ISAM file.
At the same time, you can obtain a count of the entries actually loaded,
so you know what the limit is and don't NEED OCCURS
DEPENDING...(Hooray!) (You will need to write your own binary chop to
search it, but that's pretty trivial. If you REALLY want to use SEARCH
ALL then you need to use OCCURS DEPENDING.)
3. There is no need for SORT (either external or internal); ISAM sorts
it as it is created.
4. I don't like re-inventing the wheel; everything you need has already
been written by the people who wrote ISAM...
So...
1. Set up an ISAM file for "temporary" use that has the required key and element (record) structure you need. (Each record on the file will be an element in the table.) Define this file for sequential access and give
it a "fairly large" block size. (Most of the data manipulation will then
be in memory, but you don't have to worry about it.)
2. As you receive the elements, write them to the ISAM file.
3. When you need to use the table, perform a routine that reads the ISAM file and writes sequentially to the table. (Loads the table from the
file with one sequential pass.)
At this point you should stop and ask yourself why you need the table at all. Why not just get records randomly from the ISAM file?
The answer will depend on how you use the table. Are you sharing it
between several modules, for example? Once it is built does it not
change? (Until the next "set" of data causes it to be re-loaded...)
Is there a great deal of access to it (where physical IO could
accumulate to slow things down...)?
Kellie originally asked for opinions.
Given the constraints imposed by using COBOL (no LINQ available), mine
is pretty close to Greg's...
Pete.
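A sketch of step 3, assuming the sort-work file and the table fields already discussed in the thread (sw-plan and sw-amount here stand in for whatever the real record fields are, and ws-sw-status is assumed to be the file's FILE STATUS item):

load-table-from-isam.
    move 0 to ws-table-counter
    open input sort-work
    perform until ws-sw-status not = "00"
        read sort-work next record
            at end continue
            not at end
                add 1 to ws-table-counter
                move sw-plan   to table-plan   (ws-table-counter)
                move sw-amount to table-member (ws-table-counter)
        end-read
    end-perform
    close sort-work.

One sequential pass both loads and counts the entries; nothing in the table needs a separate initialization because every occurrence up to ws-table-counter has just been written.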
On Saturday, June 2, 2018 at 6:20:07 PM UTC-7, pete dashwood wrote:
On 31/05/2018 3:15 PM, Greg Wallace wrote:
On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
This might not be the "right" question.
Maybe you need to think about whether you need a table at all, rather
than how it should be initialized...? See below.
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.
A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.
I am not really answering for an in memory table and may advise against it. Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.
I will only address a suggestion that you write a temproary sort-work using an ISAM file. It is extremly quick and efficient. I gave up using COBOL Sort in the 1980's and moved to using temproary ISAM files and let the file system handle sorting.
I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.
If you need more on what primary key to write, what secondary key to write then I can expand.
You wrote - initialized constantly - and that needs more explanation.
Greg
I just wanted to note in passing that I was betting someone would
suggest using an ISAM file.
It's a very good solution.
(Like Greg, I too have been writing COBOL for 40+ years, so maybe it's
an "Olde Tyme Solution"... :-))
You can discuss all kinds of clever ways to optimize a binary search
(no-one so far has suggested an unbalanced or skewed chop...), You can
look at clever hashing algorithms and re-invent in memory the file
system with buckets and overflow that was implemented by ICL in the
1960s, you can use refmodding to split the table as you insert each
entry in sequence (having first initialized to high-values), but they
all obfuscate what the real requirement is:
You need to build and organize a list into a specific key sequence (and
it is a "big" list...)
Kellie put it in memory because "everybody knows" "Memory must be faster".
(Generally, of course, it is... but if you spend a great deal of time
messing around with your memory-based entries and moving great hunks of
your table around, it certainly won't be as fast as you might hope.)
Given the same requirements, (and given I can't use LINQ) I would opt
for the same solution that Greg has suggested.
Here's why:
1. I HATE, LOATHE, and DETEST OCCURS DEPENDING and simply won't use it.
It is a pointless bloody waste of time that lulls you into thinking you
are using memory in an optimized way but goes ahead and allocates
maximum space anyway. You save nothing with it.
(OK, as Rick pointed out, in this case it effectively "limits" the scope
of the binary chop, but that is not compelling enough for me to change
my mind about it... :-))
2. The problem of initializing the table for different data types is
removed if you simply load it sequentially from an ISAM file.
At the same time, you can obtain a count of the entries actually loaded,
so you know what the limit is and don't NEED OCCURS
DEPENDING...(Hooray!) (You will need to write your own binary chop to
search it, but that's pretty trivial. If you REALLY want to use SEARCH
ALL then you need to use OCCURS DEPENDING.)
3. There is no need for SORT (either external or internal); ISAM sorts
it as it is created.
4. I don't like re-inventing the wheel; everything you need has already
been written by the people who wrote ISAM...
So...
1. Set up an ISAM file for "temporary" use that has the required key and
element (record) structure you need. (Each record on the file will be an
element in the table.) Define this file for sequential access and give
it a "fairly large" block size. (Most of the data manipulation will then
be in memory, but you don't have to worry about it.)
2. As you receive the elements, write them to the ISAM file.
3. When you need to use the table, perform a routine that reads the ISAM
file and writes sequentially to the table. (Loads the table from the
file with one sequential pass.)
At this point you should stop and ask yourself why you need the table at
all. Why not just get records randomly from the ISAM file?
The answer will depend on how you use the table. Are you sharing it
between several modules, for example? Once it is built does it not
change? (Until the next "set" of data causes it to be re-loaded...)
Is there a great deal of access to it (where physical IO could
accumulate to slow things down...)?
Kellie originally asked for opinions.
Given the constraints imposed by using COBOL (no LINQ available), mine
is pretty close to Greg's...
Pete.
I think Greg's suggestion to use an ISAM file instead of a table
is a far superior method since this will eliminate the need to
initialize the table, and will shorten the runtime instruction
path since COBOL programs are I/O bound rather than CPU bound.
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
I think if Pete agrees then the ISAM idea carries more weight.
I tend to Open for Output then close and then Open again for I-O. There was a good reason for this that escapes me. Even elephants/mammoths don't have perfect memory.
Next is the file name. If you have multiple simultaneous users you may want a unique file name for each user session and there are several ways to do this.
If your KEY is not unique then you can generate a sequence number as the key and have a secondary key for searches (Start, Read-Next).
Another tip I employ is to always have a flag to indicate whether a file is open. I tend to use myfilename-open which is Y or N. If the file is open successfully set the flag to Y. When it closes set the flag to N. This way you can open and close the file in many places.
Also most I-O to this file will be in cache memory which can be a bit slower but you are semi-employing an in memory table without reinventing the wheel re binary searches.
I hope this is sufficiently clear.
I'd use an 88 level with a value of 1 or 0... :-)
01  filler pic x value space.
    88 myfilenameOPEN   value '1'.
    88 myfilenameCLOSED value '0'.
In terms of execution efficiency of the ISAM solution, it comes down
largely to how much of the file you can buffer in memory, but if you ran
a benchmark I think you would be agreeably surprised by the speed of it.
The actual processing logic is certainly much simpler than manipulating
and initializing your table, if the table is truly "large".
On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:
In terms of execution efficiency of the ISAM solution, it comes down largely to how much of the file you can buffer in memory, but if you ran
a benchmark I think you would be agreeably surprised by the speed of it. The actual processing logic is certainly much simpler than manipulating and initializing your table, if the table is truly "large".
ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.
I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.
YMMV
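A sketch of how such a comparison could be run (not the actual benchmark program; k, ws-lookup-key and the timing fields are invented helpers, the ODO table and the indexed sort-work file are as defined elsewhere in the thread, and the file is assumed already open):

01  k               pic 9(9) comp-5.
01  ws-lookup-key   pic x(8).
01  t1              pic 9(8).
01  t2              pic 9(8).

*> 5,000,000 in-memory lookups with SEARCH ALL
    accept t1 from time
    perform 5000000 times
        compute k = function random * ws-table-counter + 1
        move table-plan (k) to ws-lookup-key
        search all ws-table-items
            at end continue
            when table-plan (table-index) = ws-lookup-key
                continue
        end-search
    end-perform
    accept t2 from time
    display "search all: " t1 " " t2

*> the same lookups done as keyed reads against the indexed file
    accept t1 from time
    perform 5000000 times
        compute k = function random * ws-table-counter + 1
        move table-plan (k) to sw-key
        read sort-work record
            invalid key continue
        end-read
    end-perform
    accept t2 from time
    display "isam reads: " t1 " " t2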
Depends on the content of the table. Only one type, say, binary?
Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization are a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
Probably the most efficient way is to set up an independent
record with all values initialized. Move that record to the
table as needed.
I just tested this code as an independent record to initialize the table:
for every needed occurrence: move ws-repository to ws-table-items.
it should work just fine since an alphanumeric move is done one byte at
a time from left to right, and stops when the end of the shortest field
is encountered. I think the compiler should issue a warning message though about moving a field to a part of itself just as a notice information.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-repository.
05 format-table.
10 format-alphanumeric pic x(8) value spaces.
10 format-numeric pic s9(9)v99 comp-3 value +0.
05 ws-table-items.
10 filler occurs 1 to 53000 times depending on ws-table-counter.
15 table-plan pic x(8).
15 table-member pic s9(9)v99 comp-3.
I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.
A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.
I am not really answering for an in memory table and may advise against it.
Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.
I will only address a suggestion that you write a temproary sort-work using an ISAM file. It is extremly quick and efficient. I gave up using COBOL Sort in the 1980's and moved to using temproary ISAM files and let the file system handle sorting.
I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.
If you need more on what primary key to write, what secondary key to write then I can expand.
You wrote - initialized constantly - and that needs more explanation.
Greg
Table-wise this table is small for an ISAM file, that's why I
elected to use an in-memory table since the search all is still
very fast for a 1 MByte table.
The table needs to get refreshed/reset periodically so it can
accommodate a new set of fresh data collected from a master file.
Hence, the initialization algorithm must reset the old data and
prepare the table for the re-populate process.
I am very intrigued by your two pass report structure. I hope
you have the time to elaborate on the process of varying the
keys according to the user's selection criteria. Thanks...
The only 'reset' you need is MOVE ZERO TO ws-table-count.
On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
Hi Folks,
One of my programs is handling a mammoth table that needs to be
initialized constantly. It is a million-byte table and used for
lookup records (binary search all) to increase the speed of the
program. The clause occurs depending on is used to create the
table accordingly. Moreover, to ensure reduced CPU consumption,
the initialization algorithm is using reference modifications to
obviate initializing the whole table more often that required.
I need your kind help with the following question:
What is the most optimized method to initialize a mammoth table?
Your thoughts and opinions are appreciated.
COBOL - the elephant that can stand on its trunk...
I think if Pete agrees then the ISAM idea carries more weight. I just add that you must have a KEY. If you were doing a binary search you must be searching for some value. This should be the primary key to what I call the sort-work file. You just close and open it for output and it is initialized.
I tend to Open for Output then close and then Open again for I-O. There was a good reason for this that escapes me. Even elephants/mammoths don't have perfect memory.
Next is the file name. If you have multiple simultaneous users you may want a unique file name for each user session and there are several ways to do this.
If your KEY is not unique then you can generate a sequence number as the key and have a secondary key for searches (Start, Read-Next).
Another tip I employ is to always have a flag to indicate whether a file is open. I tend to use myfilename-open which is Y or N. If the file is open successfully set the flag to Y. When it closes set the flag to N. This way you can open and close the file in many places. E.g. to refresh the file, test whether myfilename-open = Y, then close it, then reopen it. This is pseudo code for convenience rather than actual correct COBOL syntax. It is also a very good way to make sure all files are closed on exit if you have, and you should have, one exit point.
Also most I-O to this file will be in cache memory which can be a bit slower but you are semi-employing an in memory table without reinventing the wheel re binary searches.
I hope this is sufficiently clear.
Greg
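A sketch of that non-unique-key arrangement (the file name, field sizes and the ws- helper items are invented): the prime key is a running sequence number, the value actually searched on is an alternate key declared WITH DUPLICATES, and lookups go through START followed by READ NEXT:

    select sort-work assign to ws-sortwork-name
        organization is indexed
        access mode  is dynamic
        record key   is sw-seq-no
        alternate record key is sw-search-key
            with duplicates
        file status  is ws-status.

fd  sort-work.
01  sort-work-record.
    05  sw-seq-no        pic 9(9).
    05  sw-search-key    pic x(32).
    05  sw-data          pic x(100).

*> writing: bump the sequence number so the prime key is always
*> unique even when the search key is not
    add 1 to ws-seq-no
    move ws-seq-no to sw-seq-no
    write sort-work-record
        invalid key display "write failed " ws-status
    end-write

*> searching: position on the alternate key, then read forward
    move ws-wanted-key to sw-search-key
    start sort-work key is >= sw-search-key
        invalid key display "no such key"
    end-start
    read sort-work next record
        at end continue
    end-read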
On Thursday, May 31, 2018 at 3:20:52 PM UTC+12, Kellie Fitton wrote:
Probably the most efficient way is to set up an independent
record with all values initialized. Move that record to the
table as needed.
I just tested this code as an independent record to initialize the table: for every needed occurrence: move ws-repository to ws-table-items.
it should work just fine since an alphanumeric move is done one byte at
a time from left to right, and stops when the end of the shortest field
is encountered. I think the compiler should issue a warning message though about moving a field to a part of itself just as a notice information.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-repository.
05 format-table.
10 format-alphanumeric pic x(8) value spaces.
10 format-numeric pic s9(9)v99 comp-3 value +0.
05 ws-table-items.
10 filler occurs 1 to 53000 times depending on ws-table-counter.
15 table-plan pic x(8).
15 table-member pic s9(9)v99 comp-3.
"""for every needed occurrence: move ws-repository to ws-table-items. it should work just fine since an alphanumeric move is done one byte atThe sending variable format-alphanumeric is size pic x(8)
a time from left to right, and stops when the end of the shortest field
is encountered."""
No. An alphanumeric move will pad out the end of the receiving field with spaces. Think of MOVE "A" TO WS-Name which is PIC X(40). Do you expect an "A" followed by whatever was in the remainder of that field before the move ? You should expect an "A" followed by 39 spaces.
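A tiny demonstration of that, if anyone wants to see it for themselves (ws-name is just an example field):

identification division.
program-id. movedemo.
data division.
working-storage section.
01  ws-name   pic x(40).
procedure division.
    move all "*" to ws-name
    move "A" to ws-name
    display ">" ws-name "<"
*> shows >A ...< with 39 trailing spaces; none of the old
*> asterisks survive the move
    stop run.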
On Thursday, May 31, 2018 at 11:47:11 AM UTC+12, Kellie Fitton wrote:
Depends on the content of the table. Only one type, say, binary?
Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization are a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
No. Your description tells me that you are initializing one more than the 'exact number of occurrences only'. When you have put something into position 1 the 'exact number of occurrences' is 1, but you are then initializing position 2.
Do you need an extra empty initialized occurrence at the end of the array which is then included in the ODO ? So that after 'putting something in the first position' the 'occurrences' is 2 ?
What is done in 'initialize the next position (2)' that cannot be done in 'put something into position 2'.
Micro Focus had an option that allowed OPEN I-O to create the file if it did not already exist. Other systems would give a '35' file status and fail. However OPEN OUTPUT would delete an existing file and recreate a new empty one which may not be useful unless you have just detected a '35' file status.
On Sunday, June 3, 2018 at 3:33:14 PM UTC-7, Richard wrote:
On Thursday, May 31, 2018 at 3:20:52 PM UTC+12, Kellie Fitton wrote:
Probably the most efficient way is to set up an independent
record with all values initialized. Move that record to the
table as needed.
I just tested this code as an independent record to initialize the table: for every needed occurrence: move ws-repository to ws-table-items.
it should work just fine since an alphanumeric move is done one byte at
a time from left to right, and stops when the end of the shortest field is encountered. I think the compiler should issue a warning message though
about moving a field to a part of itself just as a notice information.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-repository.
05 format-table.
10 format-alphanumeric pic x(8) value spaces.
10 format-numeric pic s9(9)v99 comp-3 value +0.
05 ws-table-items.
10 filler occurs 1 to 53000 times depending on ws-table-counter.
15 table-plan pic x(8).
15 table-member pic s9(9)v99 comp-3.
"""move ws-repository to ws-table-items""""""for every needed occurrence: move ws-repository to ws-table-items. it should work just fine since an alphanumeric move is done one byte at
a time from left to right, and stops when the end of the shortest field
is encountered."""
No. An alphanumeric move will pad out the end of the receiving field with spaces. Think of MOVE "A" TO WS-Name which is PIC X(40). Do you expect an "A" followed by whatever was in the remainder of that field before the move ? You should expect an "A" followed by 39 spaces.
The sending variable format-alphanumeric is size pic x(8)
The receiving variable table-plan is size pic x(8)
both variables are same size, same length - NO remainder - NO Pad Out..."""and stops when the end of the shortest field is encountered."""
On Sunday, June 3, 2018 at 3:14:10 PM UTC-7, Richard wrote:
On Thursday, May 31, 2018 at 11:47:11 AM UTC+12, Kellie Fitton wrote:
Depends on the content of the table. Only one type, say, binary?
Or, mixed types, binary and alphanumeric?
It would be helpful to know the organization as defined in working-storage.
But, since you are using ODO, what do you need to initialize?
The table organization are a combination of binary comp-3
and alphanumeric. The table is populated based on ODO and
the initialization technique is to initialize as needed
only. Initializing the first occurrence in the table then
when putting something in the first position the algorithm
will initialize the next. Therefore, initializing the exact
number of occurrences only.
No. Your description tells me that you are initializing one more than the 'exact number of occurrences only'. When you have put something into position 1 the 'exact number of occurrences' is 1, but you are then initializing position 2.
Do you need an extra empty initialized occurrence at the end of the array which is then included in the ODO ? So that after 'putting something in the first position' the 'occurrences' is 2 ?
What is done in 'initialize the next position (2)' that cannot be done in 'put something into position 2'.
The initialization logic is: [format-table-items-as-needed-only].
It will initialize the first occurrence in the table, then after
putting something in the first-position-of-the-table (1), the
next position will be initialized when the table has the second
occurrence and is ready to get populated with data for position (2).
I think Greg's suggestion to use an ISAM file instead of a table
is a far superior method since this will eliminate the need to
initialize the table, and will shorten the runtime instruction
path since COBOL programs are I/O bound rather than CPU bound.
Have you considered using a hash table rather than using a binary search ?
On Sunday, June 3, 2018 at 2:47:08 PM UTC-7, Greg Wallace wrote:
I don't want to write a new program so I used one of my two-pass report programs. It was reading 200,000 records and writing to a Sort file and took about 6 seconds to when the sort file is produced. When I used standard user selection options to reduce it to 50,000 output records in the Sort file it still took about 6 seconds. So the reading of the entire file was the main delay.
Greg
Greg,
Do you use Micro Focus COBOL compiler? if so:
Did you set the environment variable IDXDATBUF to increase the buffer size? The default value is 0; increasing its value will improve file access speed. The variable must be set in increments of 4096:
SET IDXDATBUF=8192
This elephant was using MicroFocus (MF) COBOL on the first IBM PC that only had two floppy disks. MF COBOL was one of only 20 Apps certified for the release of the first IBM PC. I was not happy with MF ISAM for many reasons and in about 1990 switched to AcuCobol. Their ISAM method is called Vision and I have found it very reliable to this day. MF no doubt improved theirs subsequently.
The table is 1 MByte sized and will be searched often so
a binary search would be more efficient and simpler.
I tend to Open for Output then close and then Open again for I-O. There
was a good reason for this that escapes me. Even elephants/mammoths
don't have perfect memory.
On 3/06/2018 7:43 PM, Greg Wallace wrote:
I think if Pete agrees than the ISAM idea carries more weight.
It was a nice thing to say, Greg, but it really isn't true; ideas stand
on their own merit, not on who espouses them or doesn't... :-)
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
Why do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,"""The table needs to be initialized (formatted) prior to being populated"""
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter holds the highest position effectively occupied
in the repository table when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
Why do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
"""The table needs to be initialized (formatted) prior to being populated"""
NO IT DOES NOT. You seem to be incredibly resistant to advice.
'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.
In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.
If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.
"""The initialization is done As-Needed-Only for each table-row."""
That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.
Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.
At the very least you can now speed up the 'initialization' by simply doing:
add 1 to ws-table-counter
move low-values to ws-table-items(ws-table-counter)
which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.
Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.
I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.
You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
Why do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
Just to show that I am capable of running tests and timing them, which is what you should be doing, I have done 'initializing' a table 3 ways; the results are:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
Why do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:
In terms of execution efficiency of the ISAM solution, it comes down
largely to how much of the file you can buffer in memory, but if you ran
a benchmark I think you would be agreeably surprised by the speed of it.
The actual processing logic is certainly much simpler than manipulating
and initializing your table, if the table is truly "large".
ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.
I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.
YMMV
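For readers following along, the in-memory lookup being timed here is just a standard SEARCH ALL against the table as defined earlier in the thread. A minimal sketch only (ws-lookup-plan, ws-found-member and the not-found flag are made-up names; the table is assumed to be already loaded in ascending table-plan order):
01 ws-lookup-plan pic x(8).
01 ws-found-member pic s9(9)v99 comp-3.
01 ws-lookup-flag pic x value "N".
88 lookup-missed value "Y".
*> binary search on the key named in the ASCENDING KEY clause
move "N" to ws-lookup-flag
search all ws-table-items
at end
set lookup-missed to true
when table-plan (table-index) = ws-lookup-plan
move table-member (table-index) to ws-found-member
end-search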
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
"""The table needs to be initialized (formatted) prior to being populated"""
NO IT DOES NOT. You seem to be incredibly resistant to advice.
'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.
In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.
If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.
"""The initialization is done As-Needed-Only for each table-row."""
That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.
Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.
At the very least you can now speed up the 'initialization' by simply doing:
add 1 to ws-counter
move low-values to ws-table-items(ws-counter)
which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.
Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.
I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.
You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
Just to show that I am capable of running tests and timing them, which is what you should be doing, I have done 'initializing' a table 3 ways, the results are:
Each time is for 10,000 repeats and for 50,000 entries
1. move low-values to ws-repository : 0.12 seconds
2. move low-values to ws-table-items(index) : 7.2 seconds
3. move low-values to ws-repository(calculated:length of entry) : 59 seconds
So, not only is your code wrong but it is the worst way of doing the initialization by a factor of 48000%.
Your original code was at least 6000% slower than the best.
And it doesn't need to be done anyway.
Also, as yet another failure, ws-table-position is only 5 digits (pic 9(5)) and this will overflow when you multiply 14 * 53000, or 14 * any number more than 7142.
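For anyone who wants to reproduce numbers like these, a very small harness is enough. This is only a sketch: the data names reuse those from the thread, the repeat count of 10,000 matches the test above, and FUNCTION CURRENT-DATE is assumed to be available (it resolves to hundredths of a second at best, hence the large repeat count):
01 ws-start-stamp pic x(21).
01 ws-end-stamp pic x(21).
01 ws-repeat pic 9(9) comp-5.
*> set the ODO counter first so the group move covers the whole loaded area;
*> the length of ws-repository normally follows the current counter value
move 50000 to ws-table-counter
move function current-date to ws-start-stamp
perform varying ws-repeat from 1 by 1 until ws-repeat > 10000
move low-values to ws-repository
end-perform
move function current-date to ws-end-stamp
display "started " ws-start-stamp " ended " ws-end-stamp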
On Tue, 5 Jun 2018 08:26:12 -0700 (PDT), Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
If both table-plan and table-member are filled in then the
initialization is a waste of computer cycles since the filling of
table-plan will overwrite the low-values in position 1 of table-plan.
Clark Morris
On 6/5/2018 4:25 PM, Richard wrote:
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
"""The table needs to be initialized (formatted) prior to being populated"""
NO IT DOES NOT. You seem to be incredibly resistant to advice.
'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.
In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.
If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.
"""The initialization is done As-Needed-Only for each table-row."""
That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.
Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.
At the very least you can now speed up the 'initialization' by simply doing:
add 1 to ws-counter
move low-values to ws-table-items(ws-counter)
which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.
Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.
Roger the above.
I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.
You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?
+1 to this.
On 3/06/2018 9:53 PM, Richard wrote:
On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:
In terms of execution efficiency of the ISAM solution, it comes down
largely to how much of the file you can buffer in memory, but if you ran
a benchmark I think you would be agreeably surprised by the speed of it.
The actual processing logic is certainly much simpler than manipulating
and initializing your table, if the table is truly "large".
ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.
I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.
YMMV
Thanks for that, Richard. It appears that random ISAM access may be
worse than I would have expected...
That reinforces the case for loading the table and then using the table
for random retrieval (where this makes sense to do, of course.)
I noted (and completely agree with) comments by you and Clark under
Kellie's post.
There seems to be some fundamental misunderstanding about
"initializing" then overwriting.
Hopefully, the posts have helped to clear it up.
Pete.
--
I used to write COBOL; now I can do anything...
On Tuesday, June 5, 2018 at 7:24:30 PM UTC-7, pete dashwood wrote:
On 3/06/2018 9:53 PM, Richard wrote:
On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:
In terms of execution efficiency of the ISAM solution, it comes down
largely to how much of the file you can buffer in memory, but if you ran
a benchmark I think you would be agreeably surprised by the speed of it.
The actual processing logic is certainly much simpler than manipulating
and initializing your table, if the table is truly "large".
ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.
I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.
YMMV
Thanks for that, Richard. It appears that random ISAM access may be
worse than I would have expected...
That reinforces the case for loading the table and then using the table for random retrieval (where this makes sense to do, of course.)
I noted (and completely agree with) comments by you and Clark under Kellie's post.
There seems to be some fundamental misunderstanding about
"initializing" then overwriting.
Hopefully, the posts have helped to clear it up.
Pete.
--
I used to write COBOL; now I can do anything...
Pete,
As mentioned in my question: the table needs to get refreshed
and reset Constantly. The initialize process must REMOVE the
old data from the table, then the NEW SET OF FRESH DATA will
re-populate the table periodically. Hence, re-initialize...
On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
On 6/5/2018 4:25 PM, Richard wrote:
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
"""The table needs to be initialized (formatted) prior to being populated"""
NO IT DOES NOT. You seem to be incredibly resistant to advice.
'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.
In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.
If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.
"""The initialization is done As-Needed-Only for each table-row."""
That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.
Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.
At the very least you can now speed up the 'initialization' by simply doing:
add 1 to ws-counter
move low-values to ws-table-items(ws-counter)
which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.
Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.
Roger the above.
I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.
You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?
+1 to this.
Richard,
I already tested and compared several sets of initialization
methods before posting my question. The Initialize process was
conducted with the following methods:
initialize ws-repository
move low-values to ws-repository
perform varying loop
move spaces to table-plan
move zeros to table-member
end-perform
calculate ws-table-position
The last method was faster by a considerable margin [70%].
On Wednesday, June 6, 2018 at 4:05:26 PM UTC+12, Kellie Fitton wrote:
On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
On 6/5/2018 4:25 PM, Richard wrote:
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
"""The table needs to be initialized (formatted) prior to being populated"""
NO IT DOES NOT. You seem to be incredibly resistant to advice.
'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.
In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.
If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.
"""The initialization is done As-Needed-Only for each table-row."""
That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.
Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.
At the very least you can now speed up the 'initialization' by simply doing:
add 1 to ws-counter
move low-values to ws-table-items(ws-counter)
which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.
Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.
Roger the above.
I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.
You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?
+1 to this.
Richard,
I already tested and compared several sets of initialization
methods before posting my question. The Initialize process was
conducted with the following methods:
initialize ws-repository
move low-values to ws-repository
perform varying loop
move spaces to table-plan
move zeros to table-member
end-perform
calculate ws-table-position
The last method was faster by a considerable margin [70%].
Then I would posit that your description of what your code does is simply not true.
The 'calculate' code would move low-values to the table from byte 1 for the number of entries to whatever the ws-table-count held. You claimed that you did:
"""The initialization is done As-Needed-Only for each table-row."""
and previously you had claimed:
"""The initialization logic is: [format-table-items-as-needed-only].
It will initialize the first occurrence in the table, then after
putting something in the first-position-of-the-table (1), the
next position will be initialized when the table have the second
occurrence and ready to get populated with data for position (2)"""
It is as if you are unaware of what you are actually doing in the code.
It may well be that the 'calculate ws-table-position' is 'faster', especially when ws-table-counter is zero, as it will be in your code sample, because it will do nothing.
If ws-table-counter is > zero and you are doing this 'for each table-row' as you load the data then it is overwriting the data already loaded.
Get your act together and work out what your code really is and what it is supposed to be doing; post actual code from your compiled program instead of retyping what you guess it to be; and stop wasting everyone's time.
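For contrast, initialization that really is 'as-needed-only for each table-row', which is what the description claims, would clear exactly one new occurrence, along these lines (a sketch only, reusing the data names from the posts; as already argued, even this is unnecessary if both fields are about to be overwritten):
*> claim the next occurrence and clear just that one row
add 1 to ws-table-counter
move spaces to table-plan (ws-table-counter)
move zero to table-member (ws-table-counter)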
On Tuesday, June 5, 2018 at 5:19:58 PM UTC-7, Clark F Morris wrote:
On Tue, 5 Jun 2018 08:26:12 -0700 (PDT), Kellie Fitton
<KELLIEFITTON@yahoo.com> wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
If both table-plan and table-member are filled in then the
initialization is a waste of computer cycles since the filling of
table-plan will overwrite the low-values in position 1 of table-plan.
Clark Morris
Clark,
As I have mentioned in this thread previously, the initialize
process must happen periodically to reset, refresh the table
and prepare it for a new set of replacement data. The reset
and initialize process is much faster when done based on the
number of entries in the table: [calculate table-position].
On Tuesday, June 5, 2018 at 7:24:30 PM UTC-7, pete dashwood wrote:
On 3/06/2018 9:53 PM, Richard wrote:
On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:
In terms of execution efficiency of the ISAM solution, it comes down
largely to how much of the file you can buffer in memory, but if you ran
a benchmark I think you would be agreeably surprised by the speed of it.
The actual processing logic is certainly much simpler than manipulating
and initializing your table, if the table is truly "large".
ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.
I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.
YMMV
Thanks for that, Richard. It appears that random ISAM access may be
worse than I would have expected...
That reinforces the case for loading the table and then using the table
for random retrieval (where this makes sense to do, of course.)
I noted (and completely agree with) comments by you and Clark under
Kellie's post.
There seems to be some fundamental misunderstanding about
"initializing" then overwriting.
Hopefully, the posts have helped to clear it up.
Pete.
--
I used to write COBOL; now I can do anything...
Pete,
As mentioned in my question: the table needs to get refreshed
and reset Constantly. The initialize process must REMOVE the
old data from the table, then the NEW SET OF FRESH DATA will
re-populate the table periodically. Hence, re-initialize...
As a business analyst, I would want to know more about the application to look at why this is necessary. There may be some other solution.
Greg
Why not just reset ws-table-counter and possibly table-index,
depending on what you're using to store newly read records?
Louis
As others have pointed out, there is no need to clear the table if your program simply keeps track of the 'currently' highest used entry in the table (setting that back to 0 or 1 or whatever you like to use as the
first entry effectively "clears" the table)
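In code terms the 'reset instead of clear' idea amounts to little more than the sketch below. The master-file names, mf-plan, mf-member and the end-of-file flag are illustrative only, and whether the counter may legally sit at 0 between loads when the ODO minimum is 1 can depend on the compiler's bounds checking:
01 ws-master-flag pic x value "N".
88 master-at-end value "Y".
*> logically empty the table: nothing is 'in use' above occurrence zero
move 0 to ws-table-counter
*> reload; each qualifying record simply claims the next occurrence
perform until master-at-end
read master-file
at end set master-at-end to true
not at end
add 1 to ws-table-counter
move mf-plan to table-plan (ws-table-counter)
move mf-member to table-member (ws-table-counter)
end-read
end-perform
The entries still have to end up in ascending table-plan order, of course, or the SEARCH ALL against the table is no longer valid.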
As a business analyst, I would want to know more about the application to look at why this is necessary. There may be some other solution.
Greg
Greg,
The initialize process resets the table before a new set of
data re-populate the table. Moving low-values to the table
with the calculated table position works very well and fast.
It clears only the number of data entry occurrences without
initializing the entire table [max size].
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
You haven't answered the question that Greg asked which is: "why this is necessary". You have been told, several times, that it is not, and yet you continue to 'refine and optimize the unrequired'.
On Wednesday, 6 June 2018 15:20:29 UTC+10, Richard wrote:
On Wednesday, June 6, 2018 at 4:05:26 PM UTC+12, Kellie Fitton wrote:
On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
On 6/5/2018 4:25 PM, Richard wrote:
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
"""The table needs to be initialized (formatted) prior to being populated"""
NO IT DOES NOT. You seem to be incredibly resistant to advice.
'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.
In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.
If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.
"""The initialization is done As-Needed-Only for each table-row."""
That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.
Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.
At the very least you can now speed up the 'initialization' by simply doing:
add 1 to ws-counter
move low-values to ws-table-items(ws-counter)
which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.
Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.
Roger the above.
I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.
You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?
+1 to this.
Richard,
I already tested and compared several sets of initialization
methods before posting my question. The Initialize process was
conducted with the following methods:
initialize ws-repository
move low-values to ws-repository
perform varying loop
move spaces to table-plan
move zeros to table-member
end-perform
calculate ws-table-position
The last method was faster by a considerable margin [70%].
Then I would posit that your description of what your code does is simply not true.
The 'calculate' code would move low-values to the table from byte 1 for the number of entries to whatever the ws-table-count held. You claimed that you did:
"""The initialization is done As-Needed-Only for each table-row."""
and previously you had claimed:
"""The initialization logic is: [format-table-items-as-needed-only].
It will initialize the first occurrence in the table, then after
putting something in the first-position-of-the-table (1), the
next position will be initialized when the table have the second
occurrence and ready to get populated with data for position (2)"""
It is as if you are unaware of what you are actually doing in the code.
It may well be that the 'calculate ws-table-position' is 'faster', especially when ws-table-counter is zero, as it will be in your code sample, because it will do nothing.
If ws-table-counter is > zero and you are doing this 'for each table-row' as you load the data then it is overwriting the data already loaded.
Get your act together and work out what your code really is and what it is supposed to be doing; post actual code from your compiled program instead of retyping what you guess it to be; and stop wasting everyone's time.
I would like to take a step back. You opened a thread with 'Can mighty Cobol carry an elephant'. That is a brilliant title and you have engaged much discussion. While I may soften Richard's remarks, you did leave a lot of gaps in the explanation.
Discussion about in memory tables and initialization somewhat bores me.
As a business analyst, I would want to know more about the application to look at why this is necessary. There may be some other solution.
On Wednesday, June 6, 2018 at 12:13:45 AM UTC-7, Louis Krupp wrote:
Why not just reset ws-table-counter and possibly table-index,
depending on what you're using to store newly read records?
Louis
Louis,
I reset the ws-table-counter back to 1 and it did clear
the old entries from the table. Thanks...
Now the code has ws-table-items with the occurs (which
may be an improvement) rather than a subsidiary filler field with the
occurs.
The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.
When I posted my question I said: the table needs to be
initialized CONSTANTLY.
Also, as I have mentioned in this thread Previously, the table needs to get Refreshed
and Reset PERIODICALLY before a new set of fresh data
can re-populate the table.
On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
On 6/5/2018 4:25 PM, Richard wrote:
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
"""The table needs to be initialized (formatted) prior to being populated"""
NO IT DOES NOT. You seem to be incredibly resistant to advice.
'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.
In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.
If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.
"""The initialization is done As-Needed-Only for each table-row."""
That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.
Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.
At the very least you can now speed up the 'initialization' by simply doing:
add 1 to ws-counter
move low-values to ws-table-items(ws-counter)
which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.
Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.
Roger the above.
I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.
You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?
+1 to this.
Richard,
I already tested and compared several sets of initialization
methods before posting my question. The Initialize process was
conducted with the following methods:
initialize ws-repository
move low-values to ws-repository
perform varying loop
move spaces to table-plan
move zeros to table-member
end-perform
calculate ws-table-position
The last method was faster by a considerable margin [70%].
I have done some testing of your 'calculate ws-table-position' and can't get it significantly faster than 'move low-values to ws-repository' without reducing the number of entries that it clears to being much smaller numbers. To get it 70% faster would require only clearing about a third or less of the table. So how are you determining the number that does need to be cleared? What number did you use in your test?
In article <d630c630-3881-4a87-881a-a6c7a55aaf55@googlegroups.com>,
Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
[snip]
When I posted my question I said: the table needs to be
initialized CONSTANTLY.
This is, at best, loose terminology. If something is being done
constantly then nothing else is being done.
Also, as I have mentioned in
this thread Previously, the table needs to get Refreshed
and Reset PERIODICALLY before a new set of fresh data
can re-populate the table.
Sorry, there's no time to refresh and reset... something else is being
done constantly, remember?
I suggest we start afresh. Assuming that the program to which you are referring has already been written:
1) What does the program currently do?
2) What should the program be doing better?
Assuming that the program has not already been written:
0) What is the program going to do?
DD
On Wednesday, June 6, 2018 at 6:46:45 AM UTC-7, Kerry Liles wrote:
As others have pointed out, there is no need to clear the table if your
program simply keeps track of the 'currently' highest used entry in the
table (setting that back to 0 or 1 or whatever you like to use as the
first entry effectively "clears" the table)
Kerry,
I reset the ws-table-counter back to 1 and it did clear
the old entries from the table. Thanks...
On Thursday, 7 June 2018 06:08:53 UTC+10, docd...@panix.com wrote:
In article <d630c630-3881-4a87-881a-a6c7a55aaf55@googlegroups.com>,
Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
[snip]
When I posted my question I said: the table needs to be
initialized CONSTANTLY.
This is, at best, loose terminology. If something is being done
constantly then nothing else is being done.
Also, as I have mentioned in
this thread Previously, the table needs to get Refreshed
and Reset PERIODICALLY before a new set of fresh data
can re-populate the table.
Sorry, there's no time to refresh and reset... something else is being
done constantly, remember?
I suggest we start afresh. Assuming that the program to which you are
referring has already been written:
1) What does the program currently do?
2) What should the program be doing better?
Assuming that the program has not already been written:
0) What is the program going to do?
DD
Kellie, you are sending everyone into a spin and I see many trying to help.
You have some master data that is constantly updated. Why? What is the nature of it.
An example of a different approach could be that the master data needs an alternate key.
On Tuesday, June 5, 2018 at 3:22:54 PM UTC-7, Richard wrote:
On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:
The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
"ready to get populated with data"
What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
Your question was: "What is the most optimized method to initialize a mammoth table?".
The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
Richard,
The table needs to be initialized (formatted) prior to being
populated with the data collected from the master file. The
initialization is done As-Needed-Only for each table-row. The
ws-table-counter has the higher position in the repository
table effectively occupied when the initialization is going
to be made. Below is the code that calculates the new table
position prior to the data population into the table-row.
01 ws-table-counter pic 9(5) comp-5 value 0.
01 ws-table-position pic 9(5) comp-5 value 0.
01 ws-repository.
03 ws-table-items occurs 1 to 53000 times depending on
ws-table-counter
ascending table-plan
indexed by table-index.
05 table-plan pic x(8).
05 table-member pic s9(9)v99 comp-3.
compute ws-table-position =
(length of ws-table-items * ws-table-counter)
end-compute
move low-values to ws-repository (1:ws-table-position)
Just to show that I am capable of running tests and timing them, which is what you should be doing, I have done 'initializing' a table 3 ways, the results are:
Each time is for 10,000 repeats and for 50,000 entries
1. move low-values to ws-repository : 0.12 seconds
2. move low-values to ws-table-items(index) : 7.2 seconds
3. move low-values to ws-repository(calculated:length of entry) : 59 seconds
So, not only is your code wrong but it is the worst way of doing the initialization by a factor of 48000%.
Your original code was at least 6000% slower than the best.
And it doesn't need to be done anyway.
Also, as yet another failure, ws-table-position is only 5 digits (pic 9(5)) and this will overflow when you multiply 14 * 53000, or 14 * any number more than 7142.
The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.
In article <7621f8ab-def8-4886-9363-8bfa8993dd39@googlegroups.com>,
Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
[snip]
The ws-table-position variable should be 9 digits, I was typing fast to
explain the process logic while talking on my cellphone.
Don't worry about not paying any attention to work-related questions you
are asking others to assist you with for free. I, for one, am paying double-less attention in my responses.
DD
On Tue, 5 Jun 2018 20:38:02 -0700 (PDT), Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.
Please say the cell phone conversation was about COBOL.
Louis
On Thursday, June 7, 2018 at 12:33:22 AM UTC-7, Louis Krupp wrote:
On Tue, 5 Jun 2018 20:38:02 -0700 (PDT), Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.
Please say the cell phone conversation was about COBOL.
Louis
Hi Louis,
As a matter of fact, it was. However, it was some bad news about my
favorite programming language---COBOL. It is making me angry and
emotional to say the least. I will explain why in a new post shortly.
I would like to hear your unbiased opinion, though. Thanks...
On Thursday, 7 June 2018 18:02:59 UTC+10, Kellie Fitton wrote:
On Thursday, June 7, 2018 at 12:33:22 AM UTC-7, Louis Krupp wrote:
On Tue, 5 Jun 2018 20:38:02 -0700 (PDT), Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.
Please say the cell phone conversation was about COBOL.
Louis
Hi Louis,
As a matter of fact, it was. However, it was some bad news about my favorite programming language---COBOL. It is making me angry and
emotional to say the least. I will explain why in a new post shortly.
I would like to hear your unbiased opinion, though. Thanks...
Sorry to hear that, Kellie, and I look forward to a new post.
If I was your Boss, I would not be questioning the COBOL language but the why. Why do you need a binary search on a constantly varying table? Why is it constantly varying? Is there a better way of doing it?
Greg
On Thursday, June 7, 2018 at 1:26:22 AM UTC-7, Greg Wallace wrote:
If I was your Boss, I would not be questioning the COBOL language but the why. Why do you need a binary search on a constantly varying table?
Why is it constantly varying? Is there a better way of doing it?
Hi Greg,
First, one of my programs functions as a sifting thread; it will
collect certain data from a master file based on some qualifying
criteria, patterns and relevant information.
The collected data
are loaded into the table temporarily for the purpose of lookup
and comparison against counterpart data mined from another file.
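Based on that description, the comparison pass is presumably something along the lines of the sketch below; counterpart-file, cp-plan and the two performed paragraphs are invented names, and the repository table is assumed to be freshly loaded and in ascending table-plan order:
01 ws-counterpart-flag pic x value "N".
88 counterpart-at-end value "Y".
perform until counterpart-at-end
read counterpart-file
at end set counterpart-at-end to true
not at end
search all ws-table-items
at end perform report-missing-plan
when table-plan (table-index) = cp-plan
perform compare-members
end-search
end-read
end-perform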