• Can mighty COBOL carry an elephant?

    From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Wed May 30 09:21:05 2018
    From Newsgroup: comp.lang.cobol

    Hi Folks,

    One of my programs is handling a mammoth table that needs to be
    initialized constantly. It is a million-byte table and used for
    lookup records (binary search all) to increase the speed of the
    program. The OCCURS DEPENDING ON clause is used to size the
    table accordingly. Moreover, to ensure reduced CPU consumption,
    the initialization algorithm is using reference modifications to
    obviate initializing the whole table more often than required.

    I need your kind help with the following question:

    What is the most optimized method to initialize a mammoth table?

    Your thoughts and opinions are appreciated.





    COBOL - the elephant that can stand on its trunk...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Rick Smith@rs847925@gmail.com to comp.lang.cobol on Wed May 30 10:20:37 2018

    On Wednesday, May 30, 2018 at 12:21:06 PM UTC-4, Kellie Fitton wrote:
    > Hi Folks,
    >
    > One of my programs is handling a mammoth table that needs to be
    > initialized constantly. It is a million-byte table and used for
    > lookup records (binary search all) to increase the speed of the
    > program. The OCCURS DEPENDING ON clause is used to size the
    > table accordingly. Moreover, to ensure reduced CPU consumption,
    > the initialization algorithm is using reference modifications to
    > obviate initializing the whole table more often than required.
    >
    > I need your kind help with the following question:
    >
    > What is the most optimized method to initialize a mammoth table?

    Depends on the content of the table. Only one type, say, binary?
    Or, mixed types, binary and alphanumeric?

    It would be helpful to know the organization as defined in
    working-storage.

    But, since you are using ODO, what do you need to initialize?
  • From Kerry Liles@kerry.liles@gmail.com to comp.lang.cobol on Wed May 30 13:34:13 2018

    On 5/30/2018 12:21 PM, Kellie Fitton wrote:
    > Hi Folks,
    >
    > One of my programs is handling a mammoth table that needs to be
    > initialized constantly. It is a million-byte table and used for
    > lookup records (binary search all) to increase the speed of the
    > program. The OCCURS DEPENDING ON clause is used to size the
    > table accordingly. Moreover, to ensure reduced CPU consumption,
    > the initialization algorithm is using reference modifications to
    > obviate initializing the whole table more often than required.
    >
    > I need your kind help with the following question:
    >
    > What is the most optimized method to initialize a mammoth table?
    >
    > Your thoughts and opinions are appreciated.
    >
    > COBOL - the elephant that can stand on its trunk...


    I am not entirely sure what you mean by "the initialization algorithm is
    using reference modifications to obviate initializing the whole table
    more often than required." (To me, that implies that you may be
    initializing some portion or portions of the table rather than the
    entire table.)

    If you could post some code snippets that would be helpful.

    Perhaps the COBOL compiler you are using already knows the best way to initialize an array/table? You could, for example, say:

    MOVE LOW-VALUES TO WS-TABLE

    I would also be interested to know whether or not you have tried
    different methods of initializing the table and timed the different
    attempts? What is the difference between the try illustrated above
    versus (say) "INITIALIZE WS-TABLE" or other methods like:

    PERFORM VARYING IND FROM 1 BY 1 UNTIL IND > WS-BYTES-IN-WS-TABLE
        MOVE SPACE TO WS-TABLE (IND:1)
    END-PERFORM


  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Wed May 30 16:47:10 2018


    > Depends on the content of the table. Only one type, say, binary?
    > Or, mixed types, binary and alphanumeric?
    > It would be helpful to know the organization as defined in
    > working-storage.
    > But, since you are using ODO, what do you need to initialize?

    The table organization is a combination of binary comp-3
    and alphanumeric. The table is populated based on ODO and
    the initialization technique is to initialize as needed
    only: initialize the first occurrence in the table, then,
    when putting something in the first position, the algorithm
    initializes the next. Therefore, only the exact number of
    occurrences is initialized.

  • From Rick Smith@rs847925@gmail.com to comp.lang.cobol on Wed May 30 17:00:00 2018

    On Wednesday, May 30, 2018 at 7:47:11 PM UTC-4, Kellie Fitton wrote:
    >> Depends on the content of the table. Only one type, say, binary?
    >> Or, mixed types, binary and alphanumeric?
    >> It would be helpful to know the organization as defined in working-storage.
    >> But, since you are using ODO, what do you need to initialize?
    >
    > The table organization is a combination of binary comp-3
    > and alphanumeric. The table is populated based on ODO and
    > the initialization technique is to initialize as needed
    > only: initialize the first occurrence in the table, then,
    > when putting something in the first position, the algorithm
    > initializes the next. Therefore, only the exact number of
    > occurrences is initialized.

    Probably the most efficient way is to set up an independent
    record with all values initialized. Move that record to the
    table as needed.
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Wed May 30 17:58:19 2018

    On Wednesday, May 30, 2018 at 10:34:18 AM UTC-7, Kerry Liles wrote:

    > Perhaps the COBOL compiler you are using already knows the best way to initialize an array/table? You could, for example, say:
    >
    > MOVE LOW-VALUES TO WS-TABLE
    >
    > I would also be interested to know whether or not you have tried
    > different methods of initializing the table and timed the different
    > attempts? What is the difference between the try illustrated above
    > versus (say) "INITIALIZE WS-TABLE" or other methods like:
    >
    > PERFORM VARYING IND FROM 1 BY 1 UNTIL IND > WS-BYTES-IN-WS-TABLE
    >     MOVE SPACE TO WS-TABLE (IND:1)
    > END-PERFORM


    I am initializing only as needed, which is the exact number of
    occurrences of populated items, thus Formatting Only On-Demand.
    When I used the verb INITIALIZE ws-repository-table it was slower
    than move low-values to ws-repository-table. The CPU usage was
    reduced by a large margin when I calculated the length of the
    table items, maintained a high-water mark of the last table item
    number in use, and, when the next table element was needed,
    initialized only that element through reference modification
    with move low-values to repository-table.

    Also, another approach I tried was the perform varying:

    INITIALIZE WS-REPOSITORY (1)
    PERFORM VARYING TABLE-INDEX FROM 2 BY 1
            UNTIL TABLE-INDEX > WS-ELEMENTS-IN-REPOSITORY-TABLE
        MOVE WS-REPOSITORY (1) TO WS-REPOSITORY (TABLE-INDEX)
    END-PERFORM
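
    The high-water-mark technique described in this post could look roughly
    like the sketch below. It is an illustration only, not the poster's
    program: the field layout, the 53000 limit, the program name and every
    data name other than WS-REPOSITORY and WS-REPOSITORY-TABLE are
    assumptions. One caveat that is safe to state: MOVE LOW-VALUES leaves a
    COMP-3 field without a valid sign nibble, so that field must be given a
    value before it is used in arithmetic.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HWMINIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Assumed layout and hypothetical names, for illustration only.
       01  WS-ENTRY-LEN            PIC 9(4) COMP-5.
       01  WS-OFFSET               PIC 9(9) COMP-5.
       01  WS-HIGH-WATER           PIC 9(5) COMP-5 VALUE 0.
       01  WS-REPOSITORY-TABLE.
           05  WS-REPOSITORY OCCURS 1 TO 53000 TIMES
                   DEPENDING ON WS-HIGH-WATER.
               10  REPO-PLAN       PIC X(8).
               10  REPO-MEMBER     PIC S9(9)V99 COMP-3.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM ADD-ENTRY
      *>   Give both fields real values before they are read.
           MOVE "PLAN-A" TO REPO-PLAN (WS-HIGH-WATER)
           MOVE 12.34    TO REPO-MEMBER (WS-HIGH-WATER)
           DISPLAY "entries in use: " WS-HIGH-WATER
           STOP RUN.

       ADD-ENTRY.
      *>   Raise the high-water mark, then clear only the new slot.
           ADD 1 TO WS-HIGH-WATER
           MOVE FUNCTION LENGTH (WS-REPOSITORY (WS-HIGH-WATER))
               TO WS-ENTRY-LEN
           COMPUTE WS-OFFSET =
               (WS-HIGH-WATER - 1) * WS-ENTRY-LEN + 1
           MOVE LOW-VALUES
               TO WS-REPOSITORY-TABLE (WS-OFFSET:WS-ENTRY-LEN).
```

    The reference-modified MOVE touches exactly one entry, which is the
    point of the technique. A plain MOVE LOW-VALUES TO WS-REPOSITORY
    (WS-HIGH-WATER) addresses the same bytes and may be worth timing
    against it.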

  • From Greg Wallace@gregwebace@gmail.com to comp.lang.cobol on Wed May 30 20:15:47 2018

    On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
    > Hi Folks,
    >
    > One of my programs is handling a mammoth table that needs to be
    > initialized constantly. It is a million-byte table and used for
    > lookup records (binary search all) to increase the speed of the
    > program. The OCCURS DEPENDING ON clause is used to size the
    > table accordingly. Moreover, to ensure reduced CPU consumption,
    > the initialization algorithm is using reference modifications to
    > obviate initializing the whole table more often than required.
    >
    > I need your kind help with the following question:
    >
    > What is the most optimized method to initialize a mammoth table?
    >
    > Your thoughts and opinions are appreciated.
    >
    > COBOL - the elephant that can stand on its trunk...

    I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.
    A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.
    I am not really answering for an in-memory table and may advise against it. Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.
    I will only address a suggestion that you write a temporary sort-work using an ISAM file. It is extremely quick and efficient. I gave up using COBOL Sort in the 1980s and moved to using temporary ISAM files and let the file system handle sorting.
    I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.
    If you need more on what primary key to write, what secondary key to write then I can expand.
    You wrote - initialized constantly - and that needs more explanation.
    Greg
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Wed May 30 20:20:51 2018


    > Probably the most efficient way is to set up an independent
    > record with all values initialized. Move that record to the
    > table as needed.


    I just tested this code as an independent record to initialize the table:
    for every needed occurrence: move ws-repository to ws-table-items.
    It should work just fine since an alphanumeric move is done one byte at
    a time from left to right, and stops when the end of the shorter field
    is encountered. I think the compiler should issue a warning message,
    though, about moving a field to a part of itself, purely as an
    informational notice.


    01  ws-table-counter            pic 9(5) comp-5 value 0.
    01  ws-repository.
        05  format-table.
            10  format-alphanumeric pic x(8) value spaces.
            10  format-numeric      pic s9(9)v99 comp-3 value +0.
        05  ws-table-items.
            10  filler occurs 1 to 53000 times
                       depending on ws-table-counter.
                15  table-plan      pic x(8).
                15  table-member    pic s9(9)v99 comp-3.

  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Thu May 31 02:07:19 2018

    > I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.
    >
    > A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.
    >
    > I am not really answering for an in-memory table and may advise against it. Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.
    >
    > I will only address a suggestion that you write a temporary sort-work using an ISAM file. It is extremely quick and efficient. I gave up using COBOL Sort in the 1980s and moved to using temporary ISAM files and let the file system handle sorting.
    >
    > I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.
    >
    > If you need more on what primary key to write, what secondary key to write then I can expand.
    >
    > You wrote - initialized constantly - and that needs more explanation.
    >
    > Greg

    Size-wise, this table is small for an ISAM file. That's why I
    elected to use an in-memory table, since the search all is still
    very fast for a 1 MByte table.
    The table needs to get refreshed/reset periodically so it can
    accommodate a new set of fresh data collected from a master file.
    Hence, the initialization algorithm must reset the old data and
    prepare the table for the re-populate process.
    I am very intrigued by your two pass report structure. I hope
    you have the time to elaborate on the process of varying the
    keys according to the user's selection criteria. Thanks...
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Thu May 31 12:10:08 2018

    In article <790ce753-cbd5-4f4b-a34f-57c3852ee0f6@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
    > Hi Folks,
    >
    > One of my programs is handling a mammoth table that needs to be
    > initialized constantly. It is a million-byte table and used for
    > lookup records (binary search all) to increase the speed of the
    > program. The OCCURS DEPENDING ON clause is used to size the
    > table accordingly. Moreover, to ensure reduced CPU consumption,
    > the initialization algorithm is using reference modifications to
    > obviate initializing the whole table more often than required.
    >
    > I need your kind help with the following question:
    >
    > What is the most optimized method to initialize a mammoth table?

    When accessing a WORKING-STORAGE table becomes a problem the processing-design, in my experience, has been outgrown and needs to be re-visited.

    That being said: what have you tried so far and what reasons are there
    that it is considered a failure? Please post some code.

    DD
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Thu May 31 12:37:46 2018

    In article <6a44026e-0677-4375-8ac5-044a8abf906a@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    >> Probably the most efficient way is to set up an independent
    >> record with all values initialized. Move that record to the
    >> table as needed.
    >
    > I just tested this code as an independent record to initialize the table:
    > for every needed occurrence: move ws-repository to ws-table-items.
    > It should work just fine since an alphanumeric move is done one byte at
    > a time from left to right, and stops when the end of the shorter field
    > is encountered. I think the compiler should issue a warning message,
    > though, about moving a field to a part of itself, purely as an
    > informational notice.

    There was a recent discussion about errors that compilers should or should
    not issue. I've worked in places where the link-edit step of the compile
    JCL was predicated on a zero return-code from the compile step.



    > 01  ws-table-counter            pic 9(5) comp-5 value 0.
    > 01  ws-repository.
    >     05  format-table.
    >         10  format-alphanumeric pic x(8) value spaces.
    >         10  format-numeric      pic s9(9)v99 comp-3 value +0.
    >     05  ws-table-items.
    >         10  filler occurs 1 to 53000 times
    >                    depending on ws-table-counter.
    >             15  table-plan      pic x(8).
    >             15  table-member    pic s9(9)v99 comp-3.

    Is there a benefit to losing the ability to address individual entries?
    It may not be needed but I do not believe anything would be lost (and a
    great deal gained) by re-coding as follows:

    *
    01  ws-table-counter                    pic 9(5) comp-5 value 0.
    *
    01  ws-repository-clear-line.
        05  ws-format-clear-line-alpha      pic x(8) value spaces.
        05  ws-format-clear-line-9v99-c3    pic s9(9)v99 comp-3 value +0.
    *
    01  ws-table-items.
        05  ws-tbl-lin occurs 1 to 53000 times
                       depending on ws-table-counter.
            10  ws-tbl-lin-plan             pic x(8).
            10  ws-tbl-lin-member           pic s9(9)v99 comp-3.

    DD
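
    With the layout recoded as above, clearing one entry on demand could be
    sketched as follows. The data names are the ones from the post; the
    program name, the paragraph names and the sample values are
    hypothetical, and this is only one way such a record might be used.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLRLINE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ws-table-counter          pic 9(5) comp-5 value 0.
       01  ws-repository-clear-line.
           05  ws-format-clear-line-alpha
                   pic x(8) value spaces.
           05  ws-format-clear-line-9v99-c3
                   pic s9(9)v99 comp-3 value +0.
       01  ws-table-items.
           05  ws-tbl-lin occurs 1 to 53000 times
                   depending on ws-table-counter.
               10  ws-tbl-lin-plan   pic x(8).
               10  ws-tbl-lin-member pic s9(9)v99 comp-3.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM ADD-CLEAN-LINE
           MOVE "PLAN-A" TO ws-tbl-lin-plan (ws-table-counter)
      *>   The member field already holds a valid packed zero,
      *>   so arithmetic on it is safe straight away.
           ADD 12.50 TO ws-tbl-lin-member (ws-table-counter)
           DISPLAY ws-tbl-lin-plan (ws-table-counter) " "
                   ws-tbl-lin-member (ws-table-counter)
           STOP RUN.

       ADD-CLEAN-LINE.
      *>   Extend the table by one entry and copy the clear line
      *>   into it as a single group move.
           ADD 1 TO ws-table-counter
           MOVE ws-repository-clear-line
               TO ws-tbl-lin (ws-table-counter).
```

    Because the clear line is a separate 01 record, the source and target
    of the MOVE never overlap, and each entry can be addressed by name and
    subscript.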
  • From robin.vowels@robin.vowels@gmail.com to comp.lang.cobol on Thu May 31 22:18:25 2018

    On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
    >> Depends on the content of the table. Only one type, say, binary?
    >> Or, mixed types, binary and alphanumeric?
    >> It would be helpful to know the organization as defined in working-storage.
    >> But, since you are using ODO, what do you need to initialize?
    >
    > The table organization is a combination of binary comp-3
    > and alphanumeric. The table is populated based on ODO and
    > the initialization technique is to initialize as needed
    > only: initialize the first occurrence in the table, then,
    > when putting something in the first position, the algorithm
    > initializes the next. Therefore, only the exact number of
    > occurrences is initialized.

    You say that you use a binary search. Wouldn't that need all
    elements of the table to be initialised first?
  • From Rick Smith@rs847925@gmail.com to comp.lang.cobol on Fri Jun 1 07:44:48 2018

    On Friday, June 1, 2018 at 1:18:26 AM UTC-4, robin....@gmail.com wrote:
    > On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
    >>> Depends on the content of the table. Only one type, say, binary?
    >>> Or, mixed types, binary and alphanumeric?
    >>> It would be helpful to know the organization as defined in working-storage.
    >>> But, since you are using ODO, what do you need to initialize?
    >>
    >> The table organization is a combination of binary comp-3
    >> and alphanumeric. The table is populated based on ODO and
    >> the initialization technique is to initialize as needed
    >> only: initialize the first occurrence in the table, then,
    >> when putting something in the first position, the algorithm
    >> initializes the next. Therefore, only the exact number of
    >> occurrences is initialized.
    >
    > You say that you use a binary search. Wouldn't that need all
    > elements of the table to be initialised first?

    No. The number of entries in the table is variable.

    Given that 0 <= N <= 53000, only N values will participate
    in the binary search. Those M where N < M <= 53000 need not
    be initialized.
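
    As an illustration of the point above, here is a minimal sketch of
    SEARCH ALL against an OCCURS DEPENDING ON table. The names, keys and
    values are hypothetical; only the 53000 limit and the field pictures
    come from the thread. It assumes, as the reply says, that only the
    first WS-COUNT occurrences take part in the search.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ODOSRCH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Hypothetical names; layout borrowed from the thread.
       01  WS-COUNT                PIC 9(5) COMP-5 VALUE 0.
       01  WS-TABLE.
           05  WS-ENTRY OCCURS 1 TO 53000 TIMES
                   DEPENDING ON WS-COUNT
                   ASCENDING KEY IS ENTRY-PLAN
                   INDEXED BY ENTRY-IDX.
               10  ENTRY-PLAN      PIC X(8).
               10  ENTRY-MEMBER    PIC S9(9)V99 COMP-3.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *>   Only three entries are in use, so only three entries
      *>   are ever given values. Keys must be in ascending order.
           MOVE 3 TO WS-COUNT
           MOVE "PLAN-A" TO ENTRY-PLAN (1)
           MOVE 10.00    TO ENTRY-MEMBER (1)
           MOVE "PLAN-B" TO ENTRY-PLAN (2)
           MOVE 20.00    TO ENTRY-MEMBER (2)
           MOVE "PLAN-C" TO ENTRY-PLAN (3)
           MOVE 30.00    TO ENTRY-MEMBER (3)
      *>   The search range is bounded by the current WS-COUNT.
           SEARCH ALL WS-ENTRY
               AT END
                   DISPLAY "PLAN-B not found"
               WHEN ENTRY-PLAN (ENTRY-IDX) = "PLAN-B"
                   DISPLAY "PLAN-B member: "
                           ENTRY-MEMBER (ENTRY-IDX)
           END-SEARCH
           STOP RUN.
```

    Under that assumption, refreshing the table amounts to lowering
    WS-COUNT and reloading, with no need to clear the unused occurrences.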
  • From robin.vowels@robin.vowels@gmail.com to comp.lang.cobol on Fri Jun 1 08:35:22 2018

    On Saturday, June 2, 2018 at 12:44:49 AM UTC+10, Rick Smith wrote:
    > On Friday, June 1, 2018 at 1:18:26 AM UTC-4, r.....@gmail.com wrote:
    >> On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
    >>>> Depends on the content of the table. Only one type, say, binary?
    >>>> Or, mixed types, binary and alphanumeric?
    >>>> It would be helpful to know the organization as defined in working-storage.
    >>>> But, since you are using ODO, what do you need to initialize?
    >>>
    >>> The table organization is a combination of binary comp-3
    >>> and alphanumeric. The table is populated based on ODO and
    >>> the initialization technique is to initialize as needed
    >>> only: initialize the first occurrence in the table, then,
    >>> when putting something in the first position, the algorithm
    >>> initializes the next. Therefore, only the exact number of
    >>> occurrences is initialized.
    >>
    >> You say that you use a binary search. Wouldn't that need all
    >> elements of the table to be initialised first?
    >
    > No. The number of entries in the table is variable.
    >
    > Given that 0 <= N <= 53000, only N values will participate
    > in the binary search. Those M where N < M <= 53000 need not
    > be initialized.

    Let's hear it from the OP.
    She says it's a "mammoth table".
  • From Rick Smith@rs847925@gmail.com to comp.lang.cobol on Fri Jun 1 10:11:38 2018

    On Friday, June 1, 2018 at 11:35:23 AM UTC-4, robin....@gmail.com wrote:
    > On Saturday, June 2, 2018 at 12:44:49 AM UTC+10, Rick Smith wrote:
    >> On Friday, June 1, 2018 at 1:18:26 AM UTC-4, r.....@gmail.com wrote:
    >>> On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
    >>>>> Depends on the content of the table. Only one type, say, binary?
    >>>>> Or, mixed types, binary and alphanumeric?
    >>>>> It would be helpful to know the organization as defined in working-storage.
    >>>>> But, since you are using ODO, what do you need to initialize?
    >>>>
    >>>> The table organization is a combination of binary comp-3
    >>>> and alphanumeric. The table is populated based on ODO and
    >>>> the initialization technique is to initialize as needed
    >>>> only: initialize the first occurrence in the table, then,
    >>>> when putting something in the first position, the algorithm
    >>>> initializes the next. Therefore, only the exact number of
    >>>> occurrences is initialized.
    >>>
    >>> You say that you use a binary search. Wouldn't that need all
    >>> elements of the table to be initialised first?
    >>
    >> No. The number of entries in the table is variable.
    >>
    >> Given that 0 <= N <= 53000, only N values will participate
    >> in the binary search. Those M where N < M <= 53000 need not
    >> be initialized.
    >
    > Let's hear it from the OP.
    > She says it's a "mammoth table".

    Let's not. The 0 and 53000 were given by the OP. The rest is
    derivable from the COBOL standard.
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Fri Jun 1 14:35:19 2018

    On Thursday, May 31, 2018 at 4:21:06 AM UTC+12, Kellie Fitton wrote:
    > Hi Folks,
    >
    > One of my programs is handling a mammoth table that needs to be
    > initialized constantly. It is a million-byte table and used for
    > lookup records (binary search all) to increase the speed of the
    > program. The OCCURS DEPENDING ON clause is used to size the
    > table accordingly. Moreover, to ensure reduced CPU consumption,
    > the initialization algorithm is using reference modifications to
    > obviate initializing the whole table more often than required.
    >
    > I need your kind help with the following question:
    >
    > What is the most optimized method to initialize a mammoth table?
    >
    > Your thoughts and opinions are appreciated.

    Have you considered using a hash table rather than using a binary search?
    Make the table larger, say double, and calculate a hash from the key. For example take the alpha and redefine as a binary numeric, divide by the table size and use the remainder as the 'bucket number' index to store the entry.
    Then the lookup (in idealized conditions) will be a single calculation and lookup rather than a series of divides and comparisons.
    Of course it is unlikely to be idealized and so an overflow mechanism will be required for when several items calculate the same 'bucket number'. This can be done by adding an 'overflow chain' field to each item. Several different strategies could be used. For example: on overflow try to put the item in the next empty bucket, or at some offset, or in a reserved overflow area.
    Packing density needs to be quite low to avoid as much overflow as possible. It is usual to analyze the actual data with several algorithms in order to choose a reasonable one.
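
    A rough sketch of such an in-memory hash table, for illustration only.
    It uses the "next empty bucket" overflow strategy mentioned above
    (linear probing) rather than an overflow chain, and it swaps in a
    simple character-by-character hash instead of redefining the key as a
    binary number. The hash function, the bucket count of 106000 (double
    the 53000 entries posted in the thread) and all names are assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HASHDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Hypothetical names and sizes throughout.
       01  WS-BUCKET-COUNT         PIC 9(6) COMP-5 VALUE 106000.
       01  WS-HASH                 PIC 9(9) COMP-5.
       01  WS-POS                  PIC 9(2) COMP-5.
       01  WS-KEY                  PIC X(8).
       01  WS-VALUE                PIC S9(9)V99 COMP-3.
       01  WS-FOUND-FLAG           PIC X VALUE "N".
           88  KEY-FOUND           VALUE "Y".
       01  WS-HASH-TABLE.
           05  WS-BUCKET OCCURS 106000 TIMES.
               10  BUCKET-PLAN     PIC X(8).
               10  BUCKET-MEMBER   PIC S9(9)V99 COMP-3.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *>   A key of LOW-VALUES marks an empty bucket.
           MOVE LOW-VALUES TO WS-HASH-TABLE
           MOVE "PLAN-A" TO WS-KEY
           MOVE 10.50    TO WS-VALUE
           PERFORM STORE-ENTRY
           MOVE "PLAN-B" TO WS-KEY
           MOVE 20.25    TO WS-VALUE
           PERFORM STORE-ENTRY
           MOVE "PLAN-A" TO WS-KEY
           PERFORM FIND-ENTRY
           IF KEY-FOUND
               DISPLAY "PLAN-A member: " WS-VALUE
           END-IF
           STOP RUN.

       COMPUTE-HASH.
      *>   Leaves a bucket number from 1 to WS-BUCKET-COUNT.
           MOVE 0 TO WS-HASH
           PERFORM VARYING WS-POS FROM 1 BY 1 UNTIL WS-POS > 8
               COMPUTE WS-HASH = WS-HASH * 31
                   + FUNCTION ORD (WS-KEY (WS-POS:1))
               COMPUTE WS-HASH =
                   FUNCTION MOD (WS-HASH, WS-BUCKET-COUNT)
           END-PERFORM
           ADD 1 TO WS-HASH.

       STORE-ENTRY.
      *>   Probe forward to a free bucket or to the same key.
      *>   Assumes the table never fills up completely.
           PERFORM COMPUTE-HASH
           PERFORM UNTIL BUCKET-PLAN (WS-HASH) = LOW-VALUES
                      OR BUCKET-PLAN (WS-HASH) = WS-KEY
               PERFORM NEXT-BUCKET
           END-PERFORM
           MOVE WS-KEY   TO BUCKET-PLAN (WS-HASH)
           MOVE WS-VALUE TO BUCKET-MEMBER (WS-HASH).

       FIND-ENTRY.
      *>   Stops at a match or at the first empty bucket.
           MOVE "N" TO WS-FOUND-FLAG
           PERFORM COMPUTE-HASH
           PERFORM UNTIL BUCKET-PLAN (WS-HASH) = LOW-VALUES
                      OR KEY-FOUND
               IF BUCKET-PLAN (WS-HASH) = WS-KEY
                   MOVE BUCKET-MEMBER (WS-HASH) TO WS-VALUE
                   SET KEY-FOUND TO TRUE
               ELSE
                   PERFORM NEXT-BUCKET
               END-IF
           END-PERFORM.

       NEXT-BUCKET.
      *>   Step to the next bucket, wrapping at the end.
           ADD 1 TO WS-HASH
           IF WS-HASH > WS-BUCKET-COUNT
               MOVE 1 TO WS-HASH
           END-IF.
```

    One trade-off worth noting for this thread: with this scheme every
    bucket has to be marked empty before each reload, so the whole table
    is cleared on every refresh, which is exactly the cost the original
    question is trying to avoid.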
  • From robin.vowels@robin.vowels@gmail.com to comp.lang.cobol on Fri Jun 1 19:23:59 2018

    On Saturday, June 2, 2018 at 3:11:39 AM UTC+10, Rick Smith wrote:
    > On Friday, June 1, 2018 at 11:35:23 AM UTC-4, robin....@gmail.com wrote:
    >> On Saturday, June 2, 2018 at 12:44:49 AM UTC+10, Rick Smith wrote:
    >>> On Friday, June 1, 2018 at 1:18:26 AM UTC-4, r.....@gmail.com wrote:
    >>>> On Thursday, May 31, 2018 at 9:47:11 AM UTC+10, Kellie Fitton wrote:
    >>>>>> Depends on the content of the table. Only one type, say, binary? Or, mixed types, binary and alphanumeric?
    >>>>>> It would be helpful to know the organization as defined in working-storage.
    >>>>>> But, since you are using ODO, what do you need to initialize?
    >>>>>
    >>>>> The table organization is a combination of binary comp-3
    >>>>> and alphanumeric. The table is populated based on ODO and
    >>>>> the initialization technique is to initialize as needed
    >>>>> only: initialize the first occurrence in the table, then,
    >>>>> when putting something in the first position, the algorithm
    >>>>> initializes the next. Therefore, only the exact number of
    >>>>> occurrences is initialized.
    >>>>
    >>>> You say that you use a binary search. Wouldn't that need all
    >>>> elements of the table to be initialised first?
    >>>
    >>> No. The number of entries in the table is variable.
    >>>
    >>> Given that 0 <= N <= 53000, only N values will participate
    >>> in the binary search. Those M where N < M <= 53000 need not
    >>> be initialized.
    >>
    >> Let's hear it from the OP.
    >> She says it's a "mammoth table".
    >
    > Let's not. The 0 and 53000 were given by the OP. The rest is
    > derivable from the COBOL standard.

    Let's do so.
    The lower bound of the test table is 1, not 0.
    And 53000 is NOT a "mammoth table".

    There's no indication that her test table is the same
    as the one actually used.
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Fri Jun 1 20:21:42 2018


    > Have you considered using a hash table rather than using a binary search?
    >
    > Make the table larger, say double, and calculate a hash from the key. For example take the alpha and redefine as a binary numeric, divide by the table size and use the remainder as the 'bucket number' index to store the entry.
    >
    > Then the lookup (in idealized conditions) will be a single calculation and lookup rather than a series of divides and comparisons.
    >
    > Of course it is unlikely to be idealized and so an overflow mechanism will be required for when several items calculate the same 'bucket number'. This can be done by adding an 'overflow chain' field to each item. Several different strategies could be used. For example: on overflow try to put the item in the next empty bucket, or at some offset, or in a reserved overflow area.
    >
    > Packing density needs to be quite low to avoid as much overflow as possible. It is usual to analyze the actual data with several algorithms in order to choose a reasonable one.

    The table is 1 MByte sized and will be searched often so
    a binary search would be more efficient and simpler. Hash
    tables are used for relative files; my system is using ISAM
    files. I always use ISAM files as lookup search tables when
    the table size is rather huge for a binary search all.
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Fri Jun 1 20:31:29 2018


    > There's no indication that her test table is the same
    > as the one actually used.

    As I have mentioned above, the number of entries in the table
    is variable. The ODO clause is shown in the table I posted,
    which is the actual table that uses an independent record
    to format the table on demand.

  • From Greg Wallace@gregwebace@gmail.com to comp.lang.cobol on Sat Jun 2 06:00:48 2018

    On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
    > Hi Folks,
    >
    > One of my programs is handling a mammoth table that needs to be
    > initialized constantly. It is a million-byte table and used for
    > lookup records (binary search all) to increase the speed of the
    > program. The OCCURS DEPENDING ON clause is used to size the
    > table accordingly. Moreover, to ensure reduced CPU consumption,
    > the initialization algorithm is using reference modifications to
    > obviate initializing the whole table more often than required.
    >
    > I need your kind help with the following question:
    >
    > What is the most optimized method to initialize a mammoth table?
    >
    > Your thoughts and opinions are appreciated.
    >
    > COBOL - the elephant that can stand on its trunk...

    Hi Kellie
    It seems to be getting complicated with responses. I don't really want to address initialising an in-memory table and doing a binary search.
    You seem to understand Cobol ISAM files.
    You seem to have what may not really be a Mammoth table but a large one at say 53,000 entries. Doing it in-memory is always faster but writing a COBOL ISAM file eliminates the binary search and is fast. It could be that you need to allocate a unique name to the temporary sort-work file based on userid or session number.
    Anyway, I can give code samples if you email me directly to gregwebace at gmail.com for more. It seems Google Groups does not allow a direct email address.
    So far you talk about a binary search but do not reveal what the search key is. You say it needs to be refreshed from some master data.
    I can add more re:
    > I am very intrigued by your two pass report structure.
    Greg
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Sat Jun 2 07:07:18 2018

    On Saturday, June 2, 2018 at 6:00:49 AM UTC-7, Greg Wallace wrote:

    > Anyway, I can give code samples if you email me directly to gregwebace at gmail.com for more. It seems Google Groups does not allow a direct email address.
    >
    > So far you talk about a binary search but do not reveal what the search key is. You say it needs to be refreshed from some master data.
    >
    > I can add more re:
    >> I am very intrigued by your two pass report structure.
    >
    > Greg

    Hi Greg,

    The binary search all is based on the primary and secondary keys,
    which are in ascending order (KEY-1 and KEY-2).

    I just sent you an email. Thanks...

  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Sat Jun 2 14:32:46 2018

    On Saturday, June 2, 2018 at 3:21:44 PM UTC+12, Kellie Fitton wrote:
    >> Have you considered using a hash table rather than using a binary search?
    >>
    >> Make the table larger, say double, and calculate a hash from the key. For example take the alpha and redefine as a binary numeric, divide by the table size and use the remainder as the 'bucket number' index to store the entry.
    >>
    >> Then the lookup (in idealized conditions) will be a single calculation and lookup rather than a series of divides and comparisons.
    >>
    >> Of course it is unlikely to be idealized and so an overflow mechanism will be required for when several items calculate the same 'bucket number'. This can be done by adding an 'overflow chain' field to each item. Several different strategies could be used. For example: on overflow try to put the item in the next empty bucket, or at some offset, or in a reserved overflow area.
    >>
    >> Packing density needs to be quite low to avoid as much overflow as possible. It is usual to analyze the actual data with several algorithms in order to choose a reasonable one.
    >
    > The table is 1 MByte sized and will be searched often so
    > a binary search would be more efficient and simpler. Hash
    > tables are used for relative files; my system is using ISAM
    > files. I always use ISAM files as lookup search tables when
    > the table size is rather huge for a binary search all.


    You seem to miss the point that a hash can be used for an array in memory as well as for a relative file. A hash can be _much_more_ efficient than a binary search given an adequate algorithm and sufficiently small packing density to avoid too much overflow.

    A binary search will do a comparison with the mid point of the table and then do a divide and comparison until it finds the entry (or doesn't). For a 50,000-entry table it will do a comparison at (approx) 25,000, 12,500, 6,250, 3,125, 1,562, 781, 390, 195, 97, 48, 24, 12, 6, 3, 2, 1 or so unless it finds a match at one of those.

    A hash table will, most of the time, find a match at the calculated position. Occasionally it would need to step along a chain of overflow, so the number of comparisons may average out to, say, 1.1 per lookup while a search all may take a dozen or more, each with a divide and an add or subtract.
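
    A minimal sketch of the in-memory hash idea, under stated assumptions: every name, the 8-byte key, and the 1009-bucket table are hypothetical. Instead of redefining the alpha key as a binary numeric, it builds the hash from the key's characters with FUNCTION ORD (a swapped-in technique, chosen to stay within the declared numeric range), and it handles overflow with the "next empty bucket" strategy. It assumes the table never fills completely.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HASHDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> All names and sizes are hypothetical. The bucket count is
      *> kept well above the entry count for low packing density.
       01  WS-BUCKET-COUNT      PIC 9(4) COMP-5 VALUE 1009.
       01  WS-HASH-TABLE.
           05  WS-BUCKET OCCURS 1009 TIMES.
               10  WS-B-USED    PIC X.
                   88  WS-B-EMPTY     VALUE SPACE.
                   88  WS-B-FULL      VALUE '1'.
               10  WS-B-KEY     PIC X(8).
               10  WS-B-AMOUNT  PIC S9(9)V99 COMP-3.
       01  WS-KEY               PIC X(8).
       01  WS-AMOUNT            PIC S9(9)V99 COMP-3.
       01  WS-HASH              PIC 9(9) COMP-5.
       01  WS-SLOT              PIC 9(4) COMP-5.
       01  WS-I                 PIC 9(4) COMP-5.
       01  WS-PROBES            PIC 9(4) COMP-5.
       01  WS-FOUND-FLAG        PIC X.
           88  WS-FOUND               VALUE 'Y'.
           88  WS-NOT-FOUND           VALUE 'N'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> INITIALIZE sets every used-flag to SPACE, i.e. empty
           INITIALIZE WS-HASH-TABLE
           MOVE 'PLAN0001' TO WS-KEY
           MOVE 125.50 TO WS-AMOUNT
           PERFORM STORE-ENTRY
           MOVE 'PLAN0002' TO WS-KEY
           MOVE 99.00 TO WS-AMOUNT
           PERFORM STORE-ENTRY
           MOVE 'PLAN0001' TO WS-KEY
           PERFORM FIND-ENTRY
           IF WS-FOUND
               DISPLAY 'FOUND ' WS-KEY ' PROBES: ' WS-PROBES
           ELSE
               DISPLAY 'NOT FOUND ' WS-KEY
           END-IF
           STOP RUN.

       COMPUTE-SLOT.
      *> Hash the key characters, reduced modulo the bucket count
           MOVE 0 TO WS-HASH
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 8
               COMPUTE WS-HASH = FUNCTION MOD(
                   WS-HASH * 31 + FUNCTION ORD(WS-KEY(WS-I:1)),
                   WS-BUCKET-COUNT)
           END-PERFORM
           COMPUTE WS-SLOT = WS-HASH + 1.

       STORE-ENTRY.
      *> Overflow: step to the next bucket until one is free
           PERFORM COMPUTE-SLOT
           PERFORM UNTIL WS-B-EMPTY(WS-SLOT)
                      OR WS-B-KEY(WS-SLOT) = WS-KEY
               PERFORM NEXT-SLOT
           END-PERFORM
           SET WS-B-FULL(WS-SLOT) TO TRUE
           MOVE WS-KEY TO WS-B-KEY(WS-SLOT)
           MOVE WS-AMOUNT TO WS-B-AMOUNT(WS-SLOT).

       FIND-ENTRY.
      *> Usually one probe; more only when buckets collided
           PERFORM COMPUTE-SLOT
           MOVE 1 TO WS-PROBES
           SET WS-NOT-FOUND TO TRUE
           PERFORM UNTIL WS-FOUND OR WS-B-EMPTY(WS-SLOT)
               IF WS-B-KEY(WS-SLOT) = WS-KEY
                   SET WS-FOUND TO TRUE
               ELSE
                   PERFORM NEXT-SLOT
                   ADD 1 TO WS-PROBES
               END-IF
           END-PERFORM.

       NEXT-SLOT.
      *> Wrap around at the end of the table
           ADD 1 TO WS-SLOT
           IF WS-SLOT > WS-BUCKET-COUNT
               MOVE 1 TO WS-SLOT
           END-IF.
```

    Note that re-initializing this table still means clearing every bucket, so the hash approach helps lookup speed rather than the initialization cost raised in the original question.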
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Sun Jun 3 13:20:02 2018
    From Newsgroup: comp.lang.cobol

    On 31/05/2018 3:15 PM, Greg Wallace wrote:
    On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
    Hi Folks,

    One of my programs is handling a mammoth table that needs to be
    initialized constantly. It is a million-byte table and used for
    lookup records (binary search all) to increase the speed of the
    program. The clause occurs depending on is used to create the
    table accordingly. Moreover, to ensure reduced CPU consumption,
    the initialization algorithm is using reference modifications to
    obviate initializing the whole table more often that required.

    I need your kind help with the following question:

    What is the most optimized method to initialize a mammoth table?

    This might not be the "right" question.

    Maybe you need to think about whether you need a table at all, rather
    than how it should be initialized...? See below.


    Your thoughts and opinions are appreciated.





    COBOL - the elephant that can stand on its trunk...

    I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.

    A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.

    I am not really answering for an in memory table and may advise against it. Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.

    I will only address a suggestion that you write a temporary sort-work using an ISAM file. It is extremely quick and efficient. I gave up using COBOL Sort in the 1980s and moved to using temporary ISAM files and let the file system handle sorting.

    I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.

    If you need more on what primary key to write, what secondary key to write then I can expand.

    You wrote - initialized constantly - and that needs more explanation.

    Greg


    I just wanted to note in passing that I was betting someone would
    suggest using an ISAM file.

    It's a very good solution.

    (Like Greg, I too have been writing COBOL for 40+ years, so maybe it's
    an "Olde Tyme Solution"... :-))

    You can discuss all kinds of clever ways to optimize a binary search
    (no-one so far has suggested an unbalanced or skewed chop...), you can
    look at clever hashing algorithms and re-invent in memory the file
    system with buckets and overflow that was implemented by ICL in the
    1960s, you can use refmodding to split the table as you insert each
    entry in sequence (having first initialized to high-values), but they
    all obfuscate what the real requirement is:

    You need to build and organize a list into a specific key sequence (and
    it is a "big" list...)

    Kellie put it in memory because "everybody knows" "Memory must be faster".

    (Generally, of course, it is... but if you spend a great deal of time
    messing around with your memory-based entries and moving great hunks of
    your table around, it certainly won't be as fast as you might hope.)

    Given the same requirements, (and given I can't use LINQ) I would opt
    for the same solution that Greg has suggested.

    Here's why:

    1. I HATE, LOATHE, and DETEST OCCURS DEPENDING and simply won't use it.
    It is a pointless bloody waste of time that lulls you into thinking you
    are using memory in an optimized way but goes ahead and allocates
    maximum space anyway. You save nothing with it.
    (OK, as Rick pointed out, in this case it effectively "limits" the scope
    of the binary chop, but that is not compelling enough for me to change
    my mind about it... :-))

    2. The problem of initializing the table for different data types is
    removed if you simply load it sequentially from an ISAM file.
    At the same time, you can obtain a count of the entries actually loaded,
    so you know what the limit is and don't NEED OCCURS
    DEPENDING...(Hooray!) (You will need to write your own binary chop to
    search it, but that's pretty trivial. If you REALLY want to use SEARCH
    ALL then you need to use OCCURS DEPENDING.)

    3. There is no need for SORT (either external or internal); ISAM sorts
    it as it is created.

    4. I don't like re-inventing the wheel; everything you need has already
    been written by the people who wrote ISAM...
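
    For point 2, a hand-written binary chop over a fixed-size table might look like the sketch below. All names and the 53,000-entry size are hypothetical; the only assumptions are that the entries were loaded in ascending key order and that WS-ENTRY-COUNT holds the number actually loaded, which takes the place of OCCURS DEPENDING.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CHOPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Hypothetical fixed-size table; WS-ENTRY-COUNT records how
      *> many entries were actually loaded (no OCCURS DEPENDING).
       01  WS-ENTRY-COUNT       PIC 9(5) COMP-5 VALUE 0.
       01  WS-TABLE.
           05  WS-ENTRY OCCURS 53000 TIMES.
               10  WS-E-KEY     PIC X(8).
               10  WS-E-AMOUNT  PIC S9(9)V99 COMP-3.
       01  WS-SEARCH-KEY        PIC X(8).
       01  WS-LOW               PIC S9(5) COMP-5.
       01  WS-HIGH              PIC S9(5) COMP-5.
       01  WS-MID               PIC S9(5) COMP-5.
       01  WS-FOUND-AT          PIC 9(5) COMP-5.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Three demo entries, already in ascending key order
           MOVE 'AAA00001' TO WS-E-KEY(1)
           MOVE 10.00 TO WS-E-AMOUNT(1)
           MOVE 'BBB00002' TO WS-E-KEY(2)
           MOVE 20.00 TO WS-E-AMOUNT(2)
           MOVE 'CCC00003' TO WS-E-KEY(3)
           MOVE 30.00 TO WS-E-AMOUNT(3)
           MOVE 3 TO WS-ENTRY-COUNT
           MOVE 'BBB00002' TO WS-SEARCH-KEY
           PERFORM BINARY-CHOP
           IF WS-FOUND-AT > 0
               DISPLAY 'FOUND AT ENTRY ' WS-FOUND-AT
           ELSE
               DISPLAY 'NOT FOUND'
           END-IF
           STOP RUN.

       BINARY-CHOP.
      *> Search only entries 1 thru WS-ENTRY-COUNT; the unused
      *> tail of the table is never examined or initialized.
           MOVE 0 TO WS-FOUND-AT
           MOVE 1 TO WS-LOW
           MOVE WS-ENTRY-COUNT TO WS-HIGH
           PERFORM UNTIL WS-LOW > WS-HIGH OR WS-FOUND-AT > 0
      *>       Integer target, so the division truncates
               COMPUTE WS-MID = (WS-LOW + WS-HIGH) / 2
               EVALUATE TRUE
                   WHEN WS-E-KEY(WS-MID) = WS-SEARCH-KEY
                       MOVE WS-MID TO WS-FOUND-AT
                   WHEN WS-E-KEY(WS-MID) < WS-SEARCH-KEY
                       COMPUTE WS-LOW = WS-MID + 1
                   WHEN OTHER
                       COMPUTE WS-HIGH = WS-MID - 1
               END-EVALUATE
           END-PERFORM.
```

    Because the chop is bounded by the count rather than by the table's declared size, the unused entries need no initialization at all.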

    So...
    1. Set up an ISAM file for "temporary" use that has the required key and element (record) structure you need. (Each record on the file will be an element in the table.) Define this file for sequential access and give
    it a "fairly large" block size. (Most of the data manipulation will then
    be in memory, but you don't have to worry about it.)

    2. As you receive the elements, write them to the ISAM file.

    3. When you need to use the table, perform a routine that reads the ISAM
    file and writes sequentially to the table. (Loads the table from the
    file with one sequential pass.)
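
    The three steps above can be sketched roughly as follows. The file name, record layout, and table size are all hypothetical, and the sketch assumes a compiler built with indexed-file support. It uses ACCESS MODE IS DYNAMIC rather than sequential access, because with sequential access an indexed file must be written in ascending key order, while entries here arrive unordered.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ISAMLOAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> File name and record layout are hypothetical.
           SELECT WORK-FILE ASSIGN TO "lookup.idx"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS WK-KEY
               FILE STATUS IS WS-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  WORK-FILE.
       01  WORK-REC.
           05  WK-KEY           PIC X(8).
           05  WK-AMOUNT        PIC S9(9)V99 COMP-3.
       WORKING-STORAGE SECTION.
       01  WS-STATUS            PIC XX.
       01  WS-ENTRY-COUNT       PIC 9(5) COMP-5 VALUE 0.
       01  WS-TABLE.
           05  WS-ENTRY OCCURS 53000 TIMES.
               10  WS-E-KEY     PIC X(8).
               10  WS-E-AMOUNT  PIC S9(9)V99 COMP-3.
       01  WS-EOF-FLAG          PIC X VALUE '0'.
           88  WS-EOF                 VALUE '1'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Steps 1 and 2: OPEN OUTPUT creates (or empties) the file,
      *> then entries are written in arrival order, unsorted.
           OPEN OUTPUT WORK-FILE
           MOVE 'CCC00003' TO WK-KEY
           MOVE 30.00 TO WK-AMOUNT
           PERFORM WRITE-ENTRY
           MOVE 'AAA00001' TO WK-KEY
           MOVE 10.00 TO WK-AMOUNT
           PERFORM WRITE-ENTRY
           MOVE 'BBB00002' TO WK-KEY
           MOVE 20.00 TO WK-AMOUNT
           PERFORM WRITE-ENTRY
           CLOSE WORK-FILE
      *> Step 3: one sequential pass returns the records in key
      *> order and counts them as the table is loaded.
           OPEN INPUT WORK-FILE
           PERFORM UNTIL WS-EOF OR WS-ENTRY-COUNT >= 53000
               READ WORK-FILE NEXT RECORD
                   AT END
                       SET WS-EOF TO TRUE
                   NOT AT END
                       ADD 1 TO WS-ENTRY-COUNT
                       MOVE WK-KEY TO WS-E-KEY(WS-ENTRY-COUNT)
                       MOVE WK-AMOUNT TO WS-E-AMOUNT(WS-ENTRY-COUNT)
               END-READ
           END-PERFORM
           CLOSE WORK-FILE
           DISPLAY 'ENTRIES LOADED: ' WS-ENTRY-COUNT
           STOP RUN.

       WRITE-ENTRY.
      *> A duplicate key is reported rather than silently lost
           WRITE WORK-REC
               INVALID KEY
                   DISPLAY 'WRITE FAILED, STATUS ' WS-STATUS
           END-WRITE.
```

    Re-running the OPEN OUTPUT is what "initializes" the data set here; the in-memory table is simply overwritten by the next load, up to the new count.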

    At this point you should stop and ask yourself why you need the table at
    all. Why not just get records randomly from the ISAM file?

    The answer will depend on how you use the table. Are you sharing it
    between several modules, for example? Once it is built does it not
    change? (Until the next "set" of data causes it to be re-loaded...)
    Is there a great deal of access to it (where physical IO could
    accumulate to slow things down...)?

    Kellie originally asked for opinions.

    Given the constraints imposed by using COBOL (no LINQ available), mine
    is pretty close to Greg's...

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Sat Jun 2 19:40:28 2018
    From Newsgroup: comp.lang.cobol

    I think Greg's suggestion to use an ISAM file instead of a table
    is a far superior method since this will eliminate the need to
    initialize the table, and will shorten the runtime instruction
    path since COBOL programs are I/O bound rather than CPU bound.
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Sun Jun 3 19:28:12 2018
    From Newsgroup: comp.lang.cobol

    On 3/06/2018 2:40 PM, Kellie Fitton wrote:
    I think Greg's suggestion to use an ISAM file instead of a table
    is a far superior method since this will eliminate the need to
    initialize the table, and will shorten the runtime instruction
    path since COBOL programs are I/O bound rather than CPU bound.

    I can't remember if it is legal to write unordered entries to an ISAM
    file when ACCESS is SEQUENTIAL; it might not be... so you probably need
    to make ACCESS is DYNAMIC the access mode. (ACCESS is SEQUENTIAL would
    be ideal when you come to load the table, but it isn't much use if you
    can't write unordered entries to it.)

    Whether a program is IO or CPU bound has nothing to do with the
    language, and COBOL is no different to any other language in this regard.

    (You probably fell into this trap because COBOL is primarily used for
    business applications, which do a lot of "data processing", requiring
    frequent IO to the data, but it is really the logic of the program (what
    the application has to DO...) that will determine whether it is IO or
    CPU bound. You can write a CPU bound program in COBOL just as easily as
    in any other language. I recently wrote some stuff about this while
    arguing objects and layers. There is a hopefully amusing graphic that
    might make you smile...: https://primacomputing.co.nz/PRIMAMetro/ObjectsAndLayers2.aspx
    Unfortunately, the site is down at the moment with server problems and
    my ISP is looking at it. It may be a couple of days...)

    In terms of execution efficiency of the ISAM solution, it comes down
    largely to how much of the file you can buffer in memory, but if you ran
    a benchmark I think you would be agreeably surprised by the speed of it.
    The actual processing logic is certainly much simpler than manipulating
    and initializing your table, if the table is truly "large".

    Good Luck!

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From Greg Wallace@gregwebace@gmail.com to comp.lang.cobol on Sun Jun 3 00:43:26 2018
    From Newsgroup: comp.lang.cobol

    I think if Pete agrees then the ISAM idea carries more weight. I just add that you must have a KEY. If you were doing a binary search you must be searching for some value. This should be the primary key to what I call the sort-work file. You just close and open it for output and it is initialized.
    I tend to Open for Output, then close and then Open again for I-O. There was a good reason for this that escapes me. Even elephants/mammoths don't have perfect memory.
    Next is the file name. If you have multiple simultaneous users you may want a unique file name for each user session and there are several ways to do this.
    If your KEY is not unique then you can generate a sequence number as the key and have a secondary key for searches (Start, Read-Next).
    Another tip I employ is to always have a flag to indicate whether a file is open. I tend to use myfilename-open which is Y or N. If the file is opened successfully set the flag to Y. When it closes set the flag to N. This way you can open and close the file in many places. E.G. to refresh the file test whether myfilename-open = Y, then close it, then reopen it. This is pseudo code for convenience rather than actual correct COBOL syntax. It is also a very good way to make sure all files are closed on exit if you have, as you should, one exit point.
    Also most I-O to this file will be in cache memory, which can be a bit slower, but you are semi-employing an in-memory table without reinventing the wheel re binary searches.
    I hope this is sufficiently clear.
    Greg
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Sun Jun 3 20:21:13 2018
    From Newsgroup: comp.lang.cobol

    On 3/06/2018 7:43 PM, Greg Wallace wrote:
    I think if Pete agrees then the ISAM idea carries more weight.

    It was a nice thing to say, Greg, but it really isn't true; ideas stand
    on their own merit, not on who espouses them or doesn't... :-)

    I just add that you must have KEY. If you were doing a binary search you
    must be searching for some value. This should be the primary to what I
    call the sort-work file. You just close and open it for output and it is initialized.

    I tend to Open for Output, then close and then Open again for I-O. There was a good reason for this that escapes me. Even elephants/mammoths don't have perfect memory.

    It might be a ploy to get the indexes loaded; if you were only reading,
    they wouldn't need to be... I wouldn't do it because you won't need the
    indexes when you read the file for sequential load. (Also, OPEN/CLOSEs
    take quite a bit of time...) Personally, I'd open output, write the
    entries as they arrive, then close, and open input.

    Next is the file name. If you have multiple simultaneous users you may want a unique file name for each user session and there are several ways to do this.

    If your KEY is not unique then you can generate a sequence number as the key and have a secondary key for searches (Start, Read-Next).

    If your table wasn't worried about the sequence of non-unique keys, then neither should your ISAM file be... :-)


    Another tip I employ is to always have a flag to indicate whether a file is open. I tend to use myfilename-open which is Y or N. If the file is opened successfully set the flag to Y. When it closes set the flag to N. This way you can open and close the file in many places.

    I'd use an 88 level with a value of 1 or 0... :-)

    01 filler pic x value space.
    12 myfilenameOPEN value '1'.
    12 myfilenameCLOSED value '0'.

    Some people don't like 88 levels (I do...), some people use "Y" and "N"
    when they mean "Logical TRUE" and "Logical FALSE", (I don't...).

    I worked in some non-English-speaking countries where they considered it "English arrogance" if Y and N were used... :-) I changed to using 1 and
    0... then they all stopped using "S" and "N" and did the same... :-)

    (I use 1 and 0 because it seems more elegant to my eye, but I don't
    think anything you want to use is "wrong" (maybe Y for false... :-))

    I do think that the flag should recognize "indeterminate", hence the
    initial value, and I probably wouldn't test for negative values like
    "NOT myfilenameOPEN", rather using specific settings for the states I'm interested in.

    E.G. to refresh the file test whether myfilename-open = Y, then close
    it, then reopen it. This is pseudo code for convenience rather than
    actual correct COBOL syntax. It is also a very good way to make sure all
    files are closed on exit if you have and should have one exit point.

    Also most I-O to this file will be in cache memory, which can be a bit slower, but you are semi-employing an in-memory table without reinventing the wheel re binary searches.

    Not re-inventing the wheel... good stuff.

    I hope this is sufficiently clear.

    It was to me... :-)

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Sun Jun 3 20:28:46 2018
    From Newsgroup: comp.lang.cobol

    On 3/06/2018 8:21 PM, pete dashwood wrote:
    I'd use an 88 level with a value of 1 or 0... :-)

    01  filler pic x value space.
        12 myfilenameOPEN    value '1'.
        12 myfilenameCLOSED  value '0'.

    And then I coded it with level 12... :-)

    please read...

    01  filler pic x value space.
        88 myfilenameOPEN    value '1'.
        88 myfilenameCLOSED  value '0'.
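
    A short sketch of how such a flag might be used, assuming a hypothetical sequential work file (the file name, record, and paragraph names are illustrative only). The flag starts as space, i.e. "indeterminate", and each paragraph tests the specific state it cares about rather than a negated one.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FLAGDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> File name is hypothetical.
           SELECT WORK-FILE ASSIGN TO "flagdemo.dat"
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  WORK-FILE.
       01  WORK-REC             PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-STATUS            PIC XX.
      *> Space means "indeterminate": never opened so far
       01  FILLER               PIC X VALUE SPACE.
           88  myfilenameOPEN         VALUE '1'.
           88  myfilenameCLOSED       VALUE '0'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM OPEN-WORK-FILE
           PERFORM REFRESH-WORK-FILE
           PERFORM CLOSE-WORK-FILE
           STOP RUN.

       OPEN-WORK-FILE.
      *> Set the flag only when the OPEN actually succeeded
           OPEN OUTPUT WORK-FILE
           IF WS-STATUS = '00'
               SET myfilenameOPEN TO TRUE
           END-IF.

       CLOSE-WORK-FILE.
      *> Safe to perform from anywhere, including the exit point
           IF myfilenameOPEN
               CLOSE WORK-FILE
               SET myfilenameCLOSED TO TRUE
           END-IF.

       REFRESH-WORK-FILE.
      *> Close if open, then reopen for output to empty the file
           PERFORM CLOSE-WORK-FILE
           PERFORM OPEN-WORK-FILE.
```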

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Sun Jun 3 02:53:44 2018
    From Newsgroup: comp.lang.cobol

    On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:

    In terms of execution efficiency of the ISAM solution, it comes down
    largely to how much of the file you can buffer in memory, but if you ran
    a benchmark I think you would be agreeably surprised by the speed of it.
    The actual processing logic is certainly much simpler than manipulating
    and initializing your table, if the table is truly "large".

    ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.

    I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.

    YMMV
  • From Greg Wallace@gregwebace@gmail.com to comp.lang.cobol on Sun Jun 3 14:47:07 2018
    From Newsgroup: comp.lang.cobol

    On Sunday, 3 June 2018 19:53:45 UTC+10, Richard wrote:
    On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:

    In terms of execution efficiency of the ISAM solution, it comes down largely to how much of the file you can buffer in memory, but if you ran
    a benchmark I think you would be agreeably surprised by the speed of it. The actual processing logic is certainly much simpler than manipulating and initializing your table, if the table is truly "large".

    ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.

    I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.

    YMMV
    I don't want to write a new program so I used one of my two-pass report programs. It was reading 200,000 records and writing to a Sort file, and took about 6 seconds until the sort file was produced. When I used standard user selection options to reduce it to 50,000 output records in the Sort file it still took about 6 seconds. So the reading of the entire file was the main delay.
    It is a complex program and is building a sort-key according to user selection. In this case, the sort key was a compound of Name-Code, Invoice-No and Tran-Date. The building of the key is by an array and moving each character in a loop. It also has an in-progress display to the screen so the user can see some action going on.
    Not discussed is whether the data is local or on a remote server; mine is local.
    Furthermore, Kellie has indicated that re-loading the table is periodic rather than every time. So the reload cost is not paid on every lookup, and a normal search should be fast.
    There are so many ways of doing things in COBOL and that is why I call it a Chameleon language.
    Greg
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Sun Jun 3 15:14:09 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, May 31, 2018 at 11:47:11 AM UTC+12, Kellie Fitton wrote:
    Depends on the content of the table. Only one type, say, binary?
    Or, mixed types, binary and alphanumeric?
    It would be helpful to know the organization as defined in working-storage.
    But, since you are using ODO, what do you need to initialize?

    The table organization are a combination of binary comp-3
    and alphanumeric. The table is populated based on ODO and
    the initialization technique is to initialize as needed
    only. Initializing the first occurrence in the table then
    when putting something in the first position the algorithm
    will initialize the next. Therefore, initializing the exact
    number of occurrences only.


    No. Your description tells me that you are initializing one more than the 'exact number of occurrences only'. When you have put something into position 1 the 'exact number of occurrences' is 1, but you are then initializing position 2.

    Do you need an extra empty initialized occurrence at the end of the array which is then included in the ODO ? So that after 'putting something in the first position' the 'occurrences' is 2 ?

    What is done in 'initialize the next position (2)' that cannot be done in 'put something into position 2'?

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Sun Jun 3 15:33:13 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, May 31, 2018 at 3:20:52 PM UTC+12, Kellie Fitton wrote:
    Probably the most efficient way is to set up an independent
    record with all values initialized. Move that record to the
    table as needed.


    I just tested this code as an independent record to initialize the table:
    for every needed occurrence: move ws-repository to ws-table-items.
    it should work just fine since an alphanumeric move is done one byte at
    a time from left to right, and stops when the end of the shortest field
    is encountered. I think the compiler should issue a warning message though about moving a field to a part of itself just as a notice information.


    01  ws-table-counter            pic 9(5) comp-5 value 0.
    01  ws-repository.
        05  format-table.
            10  format-alphanumeric pic x(8) value spaces.
            10  format-numeric      pic s9(9)v99 comp-3 value +0.
        05  ws-table-items.
            10  filler occurs 1 to 53000 times depending on ws-table-counter.
                15  table-plan      pic x(8).
                15  table-member    pic s9(9)v99 comp-3.

    """for every needed occurrence: move ws-repository to ws-table-items. it should work just fine since an alphanumeric move is done one byte at
    a time from left to right, and stops when the end of the shortest field
    is encountered."""
    No. An alphanumeric move will pad out the end of the receiving field with spaces. Think of MOVE "A" TO WS-Name which is PIC X(40). Do you expect an "A" followed by whatever was in the remainder of that field before the move ? You should expect an "A" followed by 39 spaces.
    ws-table-items is a field that is 14 x 53000 bytes long. The move as given will move 14 bytes then fill the other 14 x 52999 bytes with spaces.
    If you are doing that "for every needed occurrence" then you are overwriting the whole field each time.
    Now, the ODO table may have a variable number of _usable_ occurrences, depending on ws-table-counter, but that does not change the size of ws-table-items which is a group field.
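
    The padding rule described above is easy to check with a small test program. What follows is a minimal sketch, assuming a compiler that accepts COBOL 2002 free format (GnuCOBOL, for example). The program name, the three-row fixed table and the "Z" filler are made up for illustration and do not come from the posts above.

    ```cobol
           >>SOURCE FORMAT IS FREE
    identification division.
    program-id. movepad.
    data division.
    working-storage section.
    01  ws-name             pic x(40) value all "Z".
    01  ws-init-row.
        05  init-plan       pic x(8)  value spaces.
        05  init-member     pic s9(9)v99 comp-3 value +0.
    *> Hypothetical 3-row table, 14 bytes per row (8 + 6 packed).
    01  ws-table.
        05  ws-row occurs 3 times.
            10  row-plan    pic x(8).
            10  row-member  pic s9(9)v99 comp-3.
    procedure division.
        *> Short sender, long receiver: the tail is space-filled.
        move "A" to ws-name
        display "[" ws-name "]"

        *> Group move of one 14-byte row into the whole 42-byte table:
        *> row 1 receives the data, rows 2 and 3 are padded with spaces.
        move all "Z" to ws-table
        move ws-init-row to ws-table
        if ws-row (2) = spaces and ws-row (3) = spaces
            display "whole-table move: rows 2-3 overwritten with spaces"
        end-if

        *> Subscripted move: only the addressed row changes.
        move all "Z" to ws-table
        move ws-init-row to ws-row (1)
        if row-plan (2) = "ZZZZZZZZ"
            display "subscripted move: row 2 left untouched"
        end-if
        goback.
    ```

    The contrast between the last two moves is the point: naming the whole group as the receiver rewrites every byte of it, while naming one occurrence rewrites only that occurrence.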
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Sun Jun 3 15:45:49 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, May 31, 2018 at 9:07:20 PM UTC+12, Kellie Fitton wrote:
    I may be a bit of an extinct Mammoth elephant but have been doing Cobol for 40 years.

    A 1 MByte memory table these days is small - but doing a binary search or sort - that is really extinct.

    I am not really answering for an in memory table and may advise against it.
    Your question: What is the most optimized method to initialize a mammoth table - that needs to be initialized constantly.

    I will only address a suggestion that you write a temporary sort-work using an ISAM file. It is extremely quick and efficient. I gave up using COBOL Sort in the 1980's and moved to using temporary ISAM files and let the file system handle sorting.

    I introduced a standard report structure that is two pass. The first pass is to build an ISAM file with a primary key that is say 32 bytes and the program varies the key according to user selection criteria. The 2nd pass processes the sort-work file. This works well today and is quick.

    If you need more on what primary key to write, what secondary key to write then I can expand.

    You wrote - initialized constantly - and that needs more explanation.

    Greg


    Table-wise this table is small for an ISAM file. That's why I
    elected to use an in-memory table, since SEARCH ALL is still
    very fast for a 1 MByte table.

    The table needs to get refreshed/reset periodically so it can
    accommodate a new set of fresh data collected from a master file.
    Hence, the initialization algorithm must reset the old data and
    prepare the table for the re-populate process.

    The only 'reset' you need is MOVE ZERO TO ws-table-count.
    Then as you load data the code would be:

        ADD 1 TO ws-table-count
        MOVE plan TO table-plan(ws-table-count)
        MOVE member TO table-member(ws-table-count)

    As you say, the ODO confines the data. There is no need for the data items beyond the ODO to have anything specific in them or to be 'initialized' and any 'initialization' within the ODO gets overwritten.
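
    The counter-only reset can be tried end to end. The sketch below is an illustration under stated assumptions, not anyone's production code: it assumes a free-format COBOL 2002-style compiler such as GnuCOBOL, borrows the table layout posted elsewhere in this thread, and invents the names odoload, in-plan, in-member, add-row and find-row. The table is only referenced after the counter has been incremented, since a counter of zero is below the declared minimum of 1.

    ```cobol
           >>SOURCE FORMAT IS FREE
    identification division.
    program-id. odoload.
    data division.
    working-storage section.
    01  ws-table-count      pic 9(5) comp-5 value 0.
    *> Hypothetical holders for one incoming master-file row.
    01  in-plan             pic x(8).
    01  in-member           pic s9(9)v99 comp-3.
    01  ws-repository.
        05  ws-table-items occurs 1 to 53000 times
                depending on ws-table-count
                ascending key is table-plan
                indexed by table-index.
            10  table-plan    pic x(8).
            10  table-member  pic s9(9)v99 comp-3.
    procedure division.
    main-para.
        *> First cycle: three rows, keys already in ascending order.
        move zero to ws-table-count
        move "PLAN-A" to in-plan
        move 10.50 to in-member
        perform add-row
        move "PLAN-B" to in-plan
        move 20.00 to in-member
        perform add-row
        move "PLAN-C" to in-plan
        move 30.25 to in-member
        perform add-row
        perform find-row

        *> Second cycle: the only reset is the counter. Row 3 still
        *> holds PLAN-C, but it now lies beyond the ODO bound.
        move zero to ws-table-count
        move "PLAN-A" to in-plan
        perform add-row
        move "PLAN-B" to in-plan
        perform add-row
        move "PLAN-C" to in-plan
        perform find-row
        goback.

    add-row.
        add 1 to ws-table-count
        move in-plan to table-plan (ws-table-count)
        move in-member to table-member (ws-table-count)
        .
    find-row.
        search all ws-table-items
            at end
                display in-plan " not found"
            when table-plan (table-index) = in-plan
                display in-plan " found"
        end-search
        .
    ```

    On a compiler that bounds SEARCH ALL by the current ODO value, the first lookup should find PLAN-C and the second should not, even though the stale row was never cleared.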

    I am very intrigued by your two pass report structure. I hope
    you have the time to elaborate on the process of varying the
    keys according to the users selection criteria. Thanks...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Sun Jun 3 16:00:19 2018
    From Newsgroup: comp.lang.cobol

    On Sunday, June 3, 2018 at 7:43:27 PM UTC+12, Greg Wallace wrote:
    On Thursday, 31 May 2018 02:21:06 UTC+10, Kellie Fitton wrote:
    Hi Folks,

    One of my programs is handling a mammoth table that needs to be
    initialized constantly. It is a million-byte table and used for
    lookup records (binary search all) to increase the speed of the
    program. The clause occurs depending on is used to create the
    table accordingly. Moreover, to ensure reduced CPU consumption,
    the initialization algorithm is using reference modifications to
    obviate initializing the whole table more often that required.

    I need your kind help with the following question:

    What is the most optimized method to initialize a mammoth table?

    Your thoughts and opinions are appreciated.





    COBOL - the elephant that can stand on its trunk...

    I think if Pete agrees then the ISAM idea carries more weight. I just add that you must have a KEY. If you were doing a binary search you must be searching for some value. This should be the primary key of what I call the sort-work file. You just close it and open it for output and it is initialized.

    I tend to Open for Output, then close, and then Open again for I-O. There was a good reason for this that escapes me. Even elephants/mammoths don't have perfect memory.

    Micro Focus had an option that allowed OPEN I-O to create the file if it did not already exist. Other systems would give a '35' file status and fail. However, OPEN OUTPUT would delete an existing file and recreate a new empty one, which may not be useful unless you have just detected a '35' file status.

    Next is the file name. If you have multiple simultaneous users you may want a unique file name for each user session and there are several ways to do this.

    If your KEY is not unique then you can generate a sequence number as the key and have a secondary key for searches (Start, Read-Next).

    Another tip I employ is to always have a flag to indicate whether a file is open. I tend to use myfilename-open which is Y or N. If the file is opened successfully, set the flag to Y. When it closes, set the flag to N. This way you can open and close the file in many places. E.g. to refresh the file, test whether myfilename-open = Y, then close it, then reopen it. This is pseudo code for convenience rather than actual correct COBOL syntax. It is also a very good way to make sure all files are closed on exit if you have, as you should, a single exit point.
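
    The open-flag idea can be written out roughly as follows. This is a sketch only, assuming a free-format COBOL 2002-style compiler with indexed-file support. The file name "sortwork.dat", the record layout and the names openflag, sort-file-open, refresh-sort-file and close-sort-file are all invented for the example.

    ```cobol
           >>SOURCE FORMAT IS FREE
    identification division.
    program-id. openflag.
    environment division.
    input-output section.
    file-control.
        select sort-file assign to "sortwork.dat"
            organization is indexed
            access mode is dynamic
            record key is sort-key
            file status is ws-file-status.
    data division.
    file section.
    fd  sort-file.
    01  sort-record.
        05  sort-key        pic x(30).
        05  sort-data       pic x(50).
    working-storage section.
    01  ws-file-status      pic xx.
    01  sort-file-open      pic x value "N".
        88  sort-file-is-open value "Y".
    procedure division.
    main-para.
        perform refresh-sort-file
        *> ... WRITE, START and READ NEXT would go here ...
        perform close-sort-file
        goback.

    refresh-sort-file.
        *> Empty the work file: close it if open, recreate it with
        *> OPEN OUTPUT, then reopen it for I-O.
        perform close-sort-file
        open output sort-file
        close sort-file
        open i-o sort-file
        if ws-file-status = "00"
            move "Y" to sort-file-open
        end-if
        .
    close-sort-file.
        *> Safe to call from anywhere, including the single exit point.
        if sort-file-is-open
            close sort-file
            move "N" to sort-file-open
        end-if
        .
    ```

    Because close-sort-file checks the flag first, it can be performed unconditionally on exit without risking a close on a file that was never opened.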

    Also most I-O to this file will be in cache memory, which can be a bit slower, but you are semi-employing an in-memory table without reinventing the wheel re binary searches.

    I hope this is sufficiently clear.
    Greg
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Mon Jun 4 06:36:29 2018
    From Newsgroup: comp.lang.cobol

    On Sunday, June 3, 2018 at 3:33:14 PM UTC-7, Richard wrote:
    On Thursday, May 31, 2018 at 3:20:52 PM UTC+12, Kellie Fitton wrote:
    Probably the most efficient way is to set up an independent
    record with all values initialized. Move that record to the
    table as needed.


    I just tested this code as an independent record to initialize the table: for every needed occurrence: move ws-repository to ws-table-items.
    it should work just fine since an alphanumeric move is done one byte at
    a time from left to right, and stops when the end of the shortest field
    is encountered. I think the compiler should issue a warning message though about moving a field to a part of itself just as a notice information.


    01  ws-table-counter            pic 9(5) comp-5 value 0.
    01  ws-repository.
        05  format-table.
            10  format-alphanumeric pic x(8) value spaces.
            10  format-numeric      pic s9(9)v99 comp-3 value +0.
        05  ws-table-items.
            10  filler occurs 1 to 53000 times depending on ws-table-counter.
                15  table-plan      pic x(8).
                15  table-member    pic s9(9)v99 comp-3.


    """for every needed occurrence: move ws-repository to ws-table-items. it should work just fine since an alphanumeric move is done one byte at
    a time from left to right, and stops when the end of the shortest field
    is encountered."""

    No. An alphanumeric move will pad out the end of the receiving field with spaces. Think of MOVE "A" TO WS-Name which is PIC X(40). Do you expect an "A" followed by whatever was in the remainder of that field before the move ? You should expect an "A" followed by 39 spaces.
    The sending variable format-alphanumeric is size pic x(8)
    The receiving variable table-plan is size pic x(8)
    both variables are same size, same length - NO remainder - NO Pad Out...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Mon Jun 4 07:08:15 2018
    From Newsgroup: comp.lang.cobol

    On Sunday, June 3, 2018 at 3:14:10 PM UTC-7, Richard wrote:
    On Thursday, May 31, 2018 at 11:47:11 AM UTC+12, Kellie Fitton wrote:
    Depends on the content of the table. Only one type, say, binary?
    Or, mixed types, binary and alphanumeric?
    It would be helpful to know the organization as defined in working-storage.
    But, since you are using ODO, what do you need to initialize?

    The table organization is a combination of binary comp-3
    and alphanumeric. The table is populated based on ODO and
    the initialization technique is to initialize as needed
    only. Initializing the first occurrence in the table then
    when putting something in the first position the algorithm
    will initialize the next. Therefore, initializing the exact
    number of occurrences only.


    No. Your description tells me that you are initializing one more than the 'exact number of occurrences only'. When you have put something into position 1 the 'exact number of occurrences' is 1, but you are then initializing position 2.

    Do you need an extra empty initialized occurrence at the end of the array which is then included in the ODO ? So that after 'putting something in the first position' the 'occurrences' is 2 ?

    What is done in 'initialize the next position (2)' that cannot be done in 'put something into position 2'?


    The initialization logic is: [format-table-items-as-needed-only].
    It will initialize the first occurrence in the table, then after
    putting something in the first-position-of-the-table (1), the
    next position will be initialized when the table has the second
    occurrence and is ready to get populated with data for position (2).


    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Mon Jun 4 07:53:53 2018
    From Newsgroup: comp.lang.cobol

    On Sunday, June 3, 2018 at 4:00:20 PM UTC-7, Richard wrote:

    Micro Focus had an option that allowed OPEN I-O to create the file if it did not already exist. Other systems would give a '35' file status and fail. However OPEN OUTPUT would delete an existing file and recreate a new empty one which may not be useful unless you have just detected a '35' file status.

    If the file does not exist, OPEN I-O will create the file only
    if the Runtime Configurable IO_CREATES is turned on in the Runtime
    configuration file: [ IO_CREATES ON ], or if the SELECT statement
    for that file includes the OPTIONAL phrase...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Mon Jun 4 10:38:41 2018
    From Newsgroup: comp.lang.cobol

    On Sunday, June 3, 2018 at 2:47:08 PM UTC-7, Greg Wallace wrote:

    I don't want to write a new program so I used one of my two-pass report programs. It was reading 200,000 records and writing to a Sort file, and took about 6 seconds until the sort file was produced. When I used standard user selection options to reduce it to 50,000 output records in the Sort file, it still took about 6 seconds. So reading the entire file was the main delay.
    Greg

    Greg,

    Do you use the Micro Focus COBOL compiler? If so:
    Did you set the environment variable IDXDATBUF to increase the buffer size?
    The default value is 0; increasing its value will improve file access speed. The variable must be set in increments of 4096:

    SET IDXDATBUF=8192
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Mon Jun 4 13:19:23 2018
    From Newsgroup: comp.lang.cobol

    On Tuesday, June 5, 2018 at 1:36:31 AM UTC+12, Kellie Fitton wrote:
    On Sunday, June 3, 2018 at 3:33:14 PM UTC-7, Richard wrote:
    On Thursday, May 31, 2018 at 3:20:52 PM UTC+12, Kellie Fitton wrote:
    Probably the most efficient way is to set up an independent
    record with all values initialized. Move that record to the
    table as needed.


    I just tested this code as an independent record to initialize the table: for every needed occurrence: move ws-repository to ws-table-items.
    it should work just fine since an alphanumeric move is done one byte at
    a time from left to right, and stops when the end of the shortest field is encountered. I think the compiler should issue a warning message though
    about moving a field to a part of itself just as a notice information.


    01  ws-table-counter            pic 9(5) comp-5 value 0.
    01  ws-repository.
        05  format-table.
            10  format-alphanumeric pic x(8) value spaces.
            10  format-numeric      pic s9(9)v99 comp-3 value +0.
        05  ws-table-items.
            10  filler occurs 1 to 53000 times depending on ws-table-counter.
                15  table-plan      pic x(8).
                15  table-member    pic s9(9)v99 comp-3.


    """for every needed occurrence: move ws-repository to ws-table-items. it should work just fine since an alphanumeric move is done one byte at
    a time from left to right, and stops when the end of the shortest field
    is encountered."""

    No. An alphanumeric move will pad out the end of the receiving field with spaces. Think of MOVE "A" TO WS-Name which is PIC X(40). Do you expect an "A" followed by whatever was in the remainder of that field before the move ? You should expect an "A" followed by 39 spaces.

    The sending variable format-alphanumeric is size pic x(8)
    The receiving variable table-plan is size pic x(8)
    """move ws-repository to ws-table-items"""
    I am glad to see that you are not actually doing what you said you were doing. It still seems completely unnecessary.
    both variables are same size, same length - NO remainder - NO Pad Out...
    """and stops when the end of the shortest field is encountered."""
    The MOVE stops when the end of the _receiving_ field is encountered (by adding spaces if necessary).
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Mon Jun 4 13:33:52 2018
    From Newsgroup: comp.lang.cobol

    On Tuesday, June 5, 2018 at 2:08:16 AM UTC+12, Kellie Fitton wrote:
    On Sunday, June 3, 2018 at 3:14:10 PM UTC-7, Richard wrote:
    On Thursday, May 31, 2018 at 11:47:11 AM UTC+12, Kellie Fitton wrote:
    Depends on the content of the table. Only one type, say, binary?
    Or, mixed types, binary and alphanumeric?
    It would be helpful to know the organization as defined in working-storage.
    But, since you are using ODO, what do you need to initialize?

    The table organization is a combination of binary comp-3
    and alphanumeric. The table is populated based on ODO and
    the initialization technique is to initialize as needed
    only. Initializing the first occurrence in the table then
    when putting something in the first position the algorithm
    will initialize the next. Therefore, initializing the exact
    number of occurrences only.


    No. Your description tells me that you are initializing one more than the 'exact number of occurrences only'. When you have put something into position 1 the 'exact number of occurrences' is 1, but you are then initializing position 2.

    Do you need an extra empty initialized occurrence at the end of the array which is then included in the ODO ? So that after 'putting something in the first position' the 'occurrences' is 2 ?

    What is done in 'initialize the next position (2)' that cannot be done in 'put something into position 2'?


    The initialization logic is: [format-table-items-as-needed-only].
    It will initialize the first occurrence in the table, then after
    putting something in the first-position-of-the-table (1), the
    next position will be initialized when the table has the second
    occurrence and is ready to get populated with data for position (2).

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?
    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?
    "ready to get populated with data"
    What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.
    Your question was: "What is the most optimized method to initialize a mammoth table?".
    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Mon Jun 4 20:54:42 2018
    From Newsgroup: comp.lang.cobol

    In article <8dbfa290-0ace-433d-8d45-c8ccedbb394a@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    [snip]

    I think Greg's suggestion to use an ISAM file instead of a table
    is a far superior method since this will eliminate the need to
    initialize the table, and will shorten the runtime instruction
    path since COBOL programs are I/O bound rather than CPU bound.

    Mr Plinston pointed out that loading the table in a certain manner
    eliminates the need to initialize the table.

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Mon Jun 4 20:57:01 2018
    From Newsgroup: comp.lang.cobol

    In article <9faef590-ab53-4f88-a6bd-3c56d8f58e75@googlegroups.com>,
    Richard <riplin@azonic.co.nz> wrote:

    [snip]

    Have you considered using a hash table rather than using a binary search ?

    I considered that, once, and then considered the 2-year programmer who
    would be stuck maintaining the code.

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Greg Wallace@gregwebace@gmail.com to comp.lang.cobol on Mon Jun 4 14:02:47 2018
    From Newsgroup: comp.lang.cobol

    On Tuesday, 5 June 2018 03:38:42 UTC+10, Kellie Fitton wrote:
    On Sunday, June 3, 2018 at 2:47:08 PM UTC-7, Greg Wallace wrote:

    I don't want to write a new program so I used one of my two-pass report programs. It was reading 200,000 records and writing to a Sort file, and took about 6 seconds until the sort file was produced. When I used standard user selection options to reduce it to 50,000 output records in the Sort file, it still took about 6 seconds. So reading the entire file was the main delay.

    Greg


    Greg,

    Do you use the Micro Focus COBOL compiler? If so:
    Did you set the environment variable IDXDATBUF to increase the buffer size? The default value is 0; increasing its value will improve file access speed. The variable must be set in increments of 4096:

    SET IDXDATBUF=8192

    This elephant was using Micro Focus (MF) COBOL on the first IBM PC that only had two floppy disks. MF COBOL was one of only 20 Apps certified for the release of the first IBM PC. I was not happy with MF ISAM for many reasons and in about 1990 switched to AcuCobol. Their ISAM system is called Vision and I have found it very reliable to this day. MF no doubt improved theirs subsequently.

    AcuCobol's config file has a similar option plus some I add:

    # Use 1 if you want files open for I-O to be created if they do not exist
    IO-CREATES 1
    # NEEDED LOTS OF FILES (DEFAULT IS 128)
    MAX-FILES 255
    # use VISION as the default data source
    DEFAULT_HOST VISION

    Kellie, can you expand on your Table with some sample data?
    So far I get this where XXXXXXXX and YYYYYYYY are values.

    Search Field Pic X(8)    Return Field Pic X(8)
    XXXXXXXX                 YYYYYYYY

    Re Dynamic: I tend to use this every time.

    file SLSRT.CPY - always the same

        SELECT SORT-FILE ASSIGN SORT-FILE-NAME
            ORGANIZATION INDEXED
            ACCESS MODE DYNAMIC
            RECORD KEY SORT-KEY-FIELD
            FILE STATUS FILSTAT
            LOCK MODE IS EXCLUSIVE.

    Here are follow-up code samples.

    file FDSRT.CPY - always the same

        FD  SORT-FILE.
        01  SORT-RECORD.
            02  SORT-KEY-FIELD.
                03  SORT-KEY        PIC X(30).
                03  SORT-REDEF REDEFINES SORT-KEY.
                    07  SORT-ARRAY  PIC X OCCURS 30.
                03  SORT-SEQ-NO     PIC S9(3) COMP-3.

    SORT-SEQ-NO is a tie breaker in case the sort produces duplicate keys.

    The following copy module varies according to the sort file being produced and must follow FDSRT.CPY.
    BMBI is a unique name for a particular file.
    file SRTBMBI.FD

            02  SRT-BMBI-RECORD.
                03  SRT-BMBI-KEY.
                    05  SRT-BMBI-TRAN-NO    PIC 9(7).
                03  SRT-BMBI-DATA.
                    05  SRT-BMBI-TRAN-DATE  PIC 9(8).

    Leaving out other files this is all the code one sees in the source program to define the sort-work file.
    ....
        INPUT-OUTPUT SECTION.
        FILE-CONTROL.
            COPY "SLSRT.SL".
    ....
        DATA DIVISION.
        FILE SECTION.
            COPY "FDSRT.FD ".
            COPY "SRTBMBI.FD ".
    ....
    Hoping this helps
    Greg
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Mon Jun 4 21:21:07 2018
    From Newsgroup: comp.lang.cobol

    In article <89492c6b-0d46-484a-874f-3f0b67856f04@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    [snip]

    The table is 1 MByte sized and will be searched often so
    a binary search would be more efficient and simpler.

    It is not the table's size that slows the search, it is the number of
    entries. Consider:

        01  WS-ONE-MEG-TBL.
            05  WS-ONE-MEG-LINE OCCURS 1 TO 1000 DEPENDING ON MEG-ENTRIES.
                10  WS-1ST-500  PIC X(500).
                10  WS-2ND-500  PIC X(500).

    Now... when MEG-ENTRIES = 1000 the table contains 1,000,000 addressable characters. Now, consider:

        01  WS-ONE-MEG-TBL-2.
            05  WS-ONE-MEG-TBL-2-LINE OCCURS 1 TO 50000 DEPENDING ON MEG-ENTRIES.
                10  WS-1ST-10   PIC X(10).
                10  WS-2ND-10   PIC X(10).

    ... and when MEG-ENTRIES = 50000 the table contains 1,000,000 addressable characters... but in fifty times as many entries.

    Both are 1-Meg tables but at full capacity the binary search (SEARCH ALL,
    in some compilers) of WS-ONE-MEG-TBL will take log2(1000) comparisons...
    call it 10.

    For WS-ONE-MEG-TBL-2, log2(50000) is... call it 16.

    It isn't the size, it's how it is organized.
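
    The "call it 10" and "call it 16" figures can be reproduced without logarithms by halving the entry count until nothing is left, which counts the worst-case probes of a binary search. The sketch below assumes a free-format COBOL 2002-style compiler such as GnuCOBOL; the names bincount, ws-entries, ws-remaining, ws-compares and show-worst-case are invented for the example.

    ```cobol
           >>SOURCE FORMAT IS FREE
    identification division.
    program-id. bincount.
    data division.
    working-storage section.
    01  ws-entries      pic 9(7).
    01  ws-remaining    pic 9(7).
    01  ws-compares     pic 9(3).
    procedure division.
    main-para.
        move 1000 to ws-entries
        perform show-worst-case
        move 50000 to ws-entries
        perform show-worst-case
        goback.

    show-worst-case.
        *> Each probe discards about half of the remaining entries,
        *> so the number of integer halvings needed to reach zero is
        *> floor(log2(n)) + 1, the worst-case comparison count.
        move ws-entries to ws-remaining
        move zero to ws-compares
        perform until ws-remaining = 0
            divide 2 into ws-remaining
            add 1 to ws-compares
        end-perform
        display ws-entries " entries -> " ws-compares " comparisons"
        .
    ```

    Working the halvings by hand gives 10 steps for 1000 entries and 16 steps for 50000, matching the rounded figures in the post.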

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Mon Jun 4 21:24:15 2018
    From Newsgroup: comp.lang.cobol

    In article <799aefa3-aa73-4bfb-bd67-26613d3892cf@googlegroups.com>,
    Greg Wallace <gregwebace@gmail.com> wrote:

    [snip]

    I tend to Open for Output then close and then Open again for I-O. There
    was a good reason for this that escapes me. Even elephants/mammoths
    don't have perfect memory.

    I was taught to do this in order to 'let the buffers set'... fill up the
    file faster by WRITE to OPEN OUTPUT, close, open I-O for your lookups.

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Mon Jun 4 21:26:30 2018
    From Newsgroup: comp.lang.cobol

    In article <fnhmnuFfh7jU1@mid.individual.net>,
    pete dashwood <dashwood@enternet.co.nz> wrote:
    On 3/06/2018 7:43 PM, Greg Wallace wrote:

    [snip]

    I think if Pete agrees then the ISAM idea carries more weight.

    It was a nice thing to say, Greg, but it really isn't true; ideas stand
    on their own merit, not on who espouses them or doesn't... :-)

    Ideas must have merit, Mr Dashwood said so... wait...

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Tue Jun 5 08:26:12 2018
    From Newsgroup: comp.lang.cobol

    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?

    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?

    "ready to get populated with data"

    What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.

    Your question was: "What is the most optimized method to initialize a mammoth table?".

    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.

    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row.
    ws-table-counter holds the highest position in the repository
    table that is actually occupied when the initialization is
    made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01  ws-table-counter   pic 9(5) comp-5 value 0.
    01  ws-table-position  pic 9(5) comp-5 value 0.
    01  ws-repository.
        03  ws-table-items occurs 1 to 53000 times depending on
                           ws-table-counter
                           ascending table-plan
                           indexed by table-index.
            05  table-plan    pic x(8).
            05  table-member  pic s9(9)v99 comp-3.

        compute ws-table-position =
            (length of ws-table-items * ws-table-counter)
        end-compute
        move low-values to ws-repository (1:ws-table-position)
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Tue Jun 5 13:25:35 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?

    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?

    "ready to get populated with data"

    What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.

    Your question was: "What is the most optimized method to initialize a mammoth table?".

    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.


    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row. The ws-table-counter has the higher position in the repository
    table effectively occupied when the initialization is going
    to be made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01  ws-table-counter   pic 9(5) comp-5 value 0.
    01  ws-table-position  pic 9(5) comp-5 value 0.
    01  ws-repository.
        03  ws-table-items occurs 1 to 53000 times depending on
                           ws-table-counter
                           ascending table-plan
                           indexed by table-index.
            05  table-plan    pic x(8).
            05  table-member  pic s9(9)v99 comp-3.

        compute ws-table-position =
            (length of ws-table-items * ws-table-counter)
        end-compute
        move low-values to ws-repository (1:ws-table-position)

    """The table needs to be initialized (formatted) prior to being populated"""

    NO IT DOES NOT. You seem to be incredibly resistant to advice.

    'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.

    In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.

    If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.

    """The initialization is done As-Needed-Only for each table-row."""

    That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.

    Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.

    At the very least you can now speed up the 'initialization' by simply doing:

    add 1 to ws-counter
    move low-values to ws-table-items(ws-counter)

    which will avoid overwriting all the current data items already loaded and will be faster than reference modification. But it is still a complete waste.

    Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.

    I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.

    You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?
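
    For readers who want to try the two clearing styles discussed above, here is a minimal sketch, assuming GnuCOBOL in free source format. The program name clear-demo and the counter ws-row are made up for the illustration; the table layout is the one posted in the thread. It only checks that both styles leave the table in the same state and does not measure time. The ODO counter is set to its maximum first so that the group move covers every entry whichever length rule the compiler applies.

```cobol
       >>SOURCE FORMAT IS FREE
identification division.
program-id. clear-demo.

data division.
working-storage section.
*> Counter starts at the maximum so the whole table is in range.
01 ws-table-counter  pic 9(5) comp-5 value 53000.
*> ws-row is a hypothetical loop counter, not from the thread.
01 ws-row            pic 9(5) comp-5 value 0.
01 ws-repository.
   03 ws-table-items occurs 1 to 53000 times
         depending on ws-table-counter
         ascending key is table-plan
         indexed by table-index.
      05 table-plan   pic x(8).
      05 table-member pic s9(9)v99 comp-3.

procedure division.
main-para.
    *> Style 1: a single group move over the whole table.
    move low-values to ws-repository
    if ws-repository = low-values
        display "group move: table is all low-values"
    end-if

    *> Dirty the table again, then clear it entry by entry.
    move all "X" to ws-repository

    *> Style 2: one move per entry; same end state, more work.
    perform varying ws-row from 1 by 1
            until ws-row > ws-table-counter
        move low-values to ws-table-items (ws-row)
    end-perform
    if ws-repository = low-values
        display "per-entry moves: table is all low-values"
    end-if
    stop run.
```

    Wrapping each style in a repeat loop and reading the clock before and after would turn this into a timing test of the kind asked for above.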
  • From Kerry Liles@kerry.liles@gmail.com to comp.lang.cobol on Tue Jun 5 18:11:49 2018
    From Newsgroup: comp.lang.cobol

    On 6/5/2018 4:25 PM, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.


    Roger the above.

    I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.

    You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?



    +1 to this.
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Tue Jun 5 15:22:53 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row. The ws-table-counter has the higher position in the repository
    table effectively occupied when the initialization is going
    to be made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01 ws-table-counter  pic 9(5) comp-5 value 0.
    01 ws-table-position pic 9(5) comp-5 value 0.
    01 ws-repository.
       03 ws-table-items occurs 1 to 53000 times depending on
             ws-table-counter
             ascending table-plan
             indexed by table-index.
          05 table-plan   pic x(8).
          05 table-member pic s9(9)v99 comp-3.

    compute ws-table-position =
       (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)

    Just to show that I am capable of running tests and timing them, which is what you should be doing, I have timed 'initializing' a table three ways. The results are:

    Each time is for 10,000 repeats and for 50,000 entries

    1. move low-values to ws-repository : 0.12 seconds
    2. move low-values to ws-table-items(index) : 7.2 seconds
    3. move low-values to ws-repository(calculated:length of entry) : 59 seconds

    So, not only is your code wrong but it is the worst way of doing the initialization by a factor of 48000%.

    Your original code was at least 6000% slower than the best.

    And it doesn't need to be done anyway.

    Also, as yet another failure, ws-table-position is only 5 digits (pic 9(5)) and this will overflow when you multiply 14 * 53000, or 14 * any number more than 7142.
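
    The figures above can be checked by hand. The arithmetic below is a sketch that assumes the usual packed-decimal layout for comp-3 (one nibble per digit plus a sign nibble) and takes the five-digit picture as the limit. Whether a comp-5 item is actually truncated to its picture depends on the compiler and its options, so treat the overflow point as the limit the picture implies rather than a guaranteed run-time failure.

```latex
\begin{align*}
\text{entry length} &= 8 + \tfrac{11 + 1}{2} = 8 + 6 = 14 \text{ bytes} \\
14 \times 53000 &= 742{,}000 > 99{,}999 \\
\lfloor 99{,}999 / 14 \rfloor &= 7142
  \quad (14 \times 7142 = 99{,}988,\ 14 \times 7143 = 100{,}002) \\
59 / 0.12 &\approx 492, \qquad 7.2 / 0.12 = 60
\end{align*}
```

    The last line gives the ratios behind the rounded percentages: method 3 takes roughly 492 times as long as method 1, and method 2 takes 60 times as long.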
  • From Clark F Morris@cfmpublic@ns.sympatico.ca to comp.lang.cobol on Tue Jun 5 21:19:54 2018
    From Newsgroup: comp.lang.cobol

    On Tue, 5 Jun 2018 08:26:12 -0700 (PDT), Kellie Fitton
    <KELLIEFITTON@yahoo.com> wrote:

    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    If both table-plan and table-member are filled in then the
    initialization is a waste of computer cycles since the filling of
    table-plan will overwrite the low-values in position 1 of table-plan.

    Clark Morris
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Wed Jun 6 14:24:25 2018
    From Newsgroup: comp.lang.cobol

    On 3/06/2018 9:53 PM, Richard wrote:
    On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:

    In terms of execution efficiency of the ISAM solution, it comes down
    largely to how much of the file you can buffer in memory, but if you ran
    a benchmark I think you would be agreeably surprised by the speed of it.
    The actual processing logic is certainly much simpler than manipulating
    and initializing your table, if the table is truly "large".

    ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.

    I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.

    YMMV

    Thanks for that, Richard. It appears that random ISAM access may be
    worse than I would have expected...

    That reinforces the case for loading the table and then using the table
    for random retrieval (where this makes sense to do, of course.)

    I noted (and completely agree with) comments by you and Clark under
    Kellie's post.

    There seems to be some fundamental mis-understanding about
    "initializing" then overwriting.

    Hopefully, the posts have helped to clear it up.

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Tue Jun 5 20:29:49 2018
    From Newsgroup: comp.lang.cobol

    On Tuesday, June 5, 2018 at 1:25:37 PM UTC-7, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    Richard,
    When I posted my question I said: the table needs to be
    initialized CONSTANTLY. Also, as I have mentioned in
    this thread Previously, the table needs to get Refreshed
    and Reset PERIODICALLY before a new set of fresh data
    can re-populate the table. The initialize process must
    erase, remove, clear all previous entries in the table,
    to prepare the table for the NEW set of collected data.
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Tue Jun 5 20:38:02 2018
    From Newsgroup: comp.lang.cobol

    On Tuesday, June 5, 2018 at 3:22:54 PM UTC-7, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    Also, as yet another failure, ws-table-position is only 5 digits (pic 9(5)) and this will overflow when you multiply 14 * 53000, or 14 * any number more than 7142.
    The ws-table-position variable should be 9 digits, I was typing fast to
    explain the process logic while talking on my cellphone.
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Tue Jun 5 20:49:31 2018
    From Newsgroup: comp.lang.cobol

    On Tuesday, June 5, 2018 at 5:19:58 PM UTC-7, Clark F Morris wrote:
    On Tue, 5 Jun 2018 08:26:12 -0700 (PDT), Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    If both table-plan and table-member are filled in then the
    initialization is a waste of computer cycles since the filling of
    table-plan will overwrite the low-values in position 1 of table-plan.

    Clark Morris
    Clark,
    As I have mentioned in this thread previously, the initialize
    process must happen periodically to reset, refresh the table
    and prepare it for a new set of replacement data. The reset
    and initialize process is much faster when done based on the
    number of entries in the table: [calculate table-position].
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Tue Jun 5 21:05:25 2018
    From Newsgroup: comp.lang.cobol

    On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
    On 6/5/2018 4:25 PM, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.

    You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?



    +1 to this.
    Richard,
    I already tested and compared several sets of initialization
    methods before posting my question. The Initialize process was
    conducted with the following methods:
    1. initialize ws-repository
    2. move low-values to ws-repository
    3. perform varying loop
          move spaces to table-plan
          move zeros to table-member
       end-perform
    4. calculate ws-table-position
    The last method was faster by a considerable margin [70%].
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Tue Jun 5 21:11:47 2018
    From Newsgroup: comp.lang.cobol

    On Tuesday, June 5, 2018 at 7:24:30 PM UTC-7, pete dashwood wrote:
    On 3/06/2018 9:53 PM, Richard wrote:
    On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:

    There seems to be some fundamental mis-understanding about
    "initializing" then overwriting.

    Hopefully, the posts have helped to clear it up.

    Pete.

    --
    I used to write COBOL; now I can do anything...


    Pete,

    As mentioned in my question: the table needs to get refreshed
    and reset Constantly. The initialize process must REMOVE the
    old data from the table, then the NEW SET OF FRESH DATA will
    re-populate the table periodically. Hence, re-initialize...
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Tue Jun 5 21:59:03 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, June 6, 2018 at 4:11:48 PM UTC+12, Kellie Fitton wrote:
    On Tuesday, June 5, 2018 at 7:24:30 PM UTC-7, pete dashwood wrote:
    On 3/06/2018 9:53 PM, Richard wrote:
    On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:

    Pete,

    As mentioned in my question: the table needs to get refreshed
    and reset Constantly. The initialize process must REMOVE the
    old data from the table, then the NEW SET OF FRESH DATA will
    re-populate the table periodically. Hence, re-initialize...

    There is _no_ need to 'remove' the old data from the table (based on the code in your posts). Re-populating the table will _overwrite_ the old data regardless of whether it was names and numbers or low values.

    The ODO set at the appropriate point will ensure that any 'old data' beyond that point is not accessed.
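
    A minimal sketch of that point, assuming GnuCOBOL in free source format. The program name reload-demo, the paragraphs load-table and lookup, and the items ws-row, ws-rows-wanted, ws-key-step, ws-key-num and ws-wanted-key are invented for the illustration; only the table layout comes from the thread. The table is loaded with five rows, then reloaded with three rows without any clearing step. A key from the first load is still physically in storage, but it sits beyond the ODO bound, so SEARCH ALL should not find it.

```cobol
       >>SOURCE FORMAT IS FREE
identification division.
program-id. reload-demo.

data division.
working-storage section.
01 ws-table-counter  pic 9(5) comp-5 value 1.
*> The next five items are hypothetical helpers for this demo.
01 ws-row            pic 9(5) comp-5 value 0.
01 ws-rows-wanted    pic 9(5) comp-5 value 0.
01 ws-key-step       pic 9(5) comp-5 value 0.
01 ws-key-num        pic 9(8) value 0.
01 ws-wanted-key     pic x(8).
01 ws-repository.
   03 ws-table-items occurs 1 to 53000 times
         depending on ws-table-counter
         ascending key is table-plan
         indexed by table-index.
      05 table-plan   pic x(8).
      05 table-member pic s9(9)v99 comp-3.

procedure division.
main-para.
    *> First data set: 5 rows, keys 00000010 to 00000050.
    move 5 to ws-rows-wanted
    move 10 to ws-key-step
    perform load-table

    *> Second data set: 3 rows, keys 00000100 to 00000300,
    *> written over the old rows with no clearing step.
    move 3 to ws-rows-wanted
    move 100 to ws-key-step
    perform load-table

    *> Old key: still stored in row 4, but beyond the ODO bound.
    move "00000040" to ws-wanted-key
    perform lookup
    *> New key: inside the ODO bound.
    move "00000200" to ws-wanted-key
    perform lookup
    stop run.

load-table.
    perform varying ws-row from 1 by 1
            until ws-row > ws-rows-wanted
        *> Raise the ODO bound first so the row is in range.
        move ws-row to ws-table-counter
        compute ws-key-num = ws-row * ws-key-step
        move ws-key-num to table-plan (ws-row)
        compute table-member (ws-row) = ws-row * 1.5
    end-perform.

lookup.
    search all ws-table-items
        at end
            display ws-wanted-key " not found"
        when table-plan (table-index) = ws-wanted-key
            display ws-wanted-key " found"
    end-search.
```

    After the second load ws-table-counter is 3, so the lookup for the old key is expected to take the AT END branch while the lookup for the new key succeeds.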

  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Tue Jun 5 22:20:28 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, June 6, 2018 at 4:05:26 PM UTC+12, Kellie Fitton wrote:
    On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
    On 6/5/2018 4:25 PM, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?

    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?

    "ready to get populated with data"

    What do you think happens that makes the entry "ready" other than just incrementing the ODO? In an ODO table _all_ the entries, all 53000 of them, exist all the time as defined. The only thing that ODO adds is a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.

    Your question was: "What is the most optimized method to initialize a mammoth table?".

    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.


    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row. The
    ws-table-counter has the higher position in the repository
    table effectively occupied when the initialization is going
    to be made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01 ws-table-counter pic 9(5) comp-5 value 0.
    01 ws-table-position pic 9(5) comp-5 value 0.
    01 ws-repository.
    03 ws-table-items occurs 1 to 53000 times depending on
    ws-table-counter
    ascending table-plan
    indexed by table-index.
    05 table-plan pic x(8).
    05 table-member pic s9(9)v99 comp-3.

    compute ws-table-position =
    (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)

    """The table needs to be initialized (formatted) prior to being populated"""

    NO IT DOES NOT. You seem to be incredibly resistant to advice.

    'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.

    In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.

    If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.

    """The initialization is done As-Needed-Only for each table-row."""

    That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.

    Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.

    At the very least you can now speed up the 'initialization' by simply doing:

    add 1 to ws-table-counter
    move low-values to ws-table-items(ws-table-counter)

    which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.

    Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.


    Roger the above.

    I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.

    You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?



    +1 to this.


    Richard,

    I already tested and compared several sets of initialization
    methods before posting my question. The Initialize process was
    conducted with the following methods:

    initialize ws-repository

    move low-values to ws-repository

    perform varying loop
    move spaces to table-plan
    move zeros to table-member
    end-perform

    calculate ws-table-position

    The last method was faster by a considerable margin [70%].

    Then I would posit that your description of what your code does is simply not true.

    The 'calculate' code would move low-values to the table from byte 1 for the number of entries, up to whatever ws-table-counter held. You claimed that you did:

    """The initialization is done As-Needed-Only for each table-row."""

    and previously you had claimed:

    """The initialization logic is: [format-table-items-as-needed-only].
    It will initialize the first occurrence in the table, then after
    putting something in the first-position-of-the-table (1), the
    next position will be initialized when the table have the second
    occurrence and ready to get populated with data for position (2)"""

    It is as if you are unaware of what you are actually doing in the code.

    It may well be that the 'calculate ws-table-position' is 'faster', especially when ws-table-counter is zero, as it will be in your code sample, because it will do nothing.

    If ws-table-counter is > zero and you are doing this 'for each table-row' as you load the data then it is overwriting the data already loaded.

    Get your act together and work out what your code really is and what it is supposed to be doing; post actual code from your compiled program instead of retyping what you guess it to be; and stop wasting everyone's time.
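
    The overwrite described above can be sketched as follows, assuming a fixed-format compiler such as GnuCOBOL. The program name, the 100-row limit and the sample values are invented for illustration. FUNCTION LENGTH on a subscripted row stands in for the posted 'length of ws-table-items', ws-table-position is widened to 9 digits, and the zero-length guard is an addition, because a reference modification of length zero is not something to rely on.

```cobol
       identification division.
       program-id. prefixwipe.
       data division.
       working-storage section.
       01  ws-table-counter      pic 9(5) comp-5 value 0.
       01  ws-table-position     pic 9(9) comp-5 value 0.
       01  ws-repository.
           03  ws-table-items occurs 1 to 100 times
                   depending on ws-table-counter
                   ascending key is table-plan
                   indexed by table-index.
               05  table-plan    pic x(8).
               05  table-member  pic s9(9)v99 comp-3.
       procedure division.
       main-para.
      * Load row 1 with real data.
           move 1 to ws-table-counter
           move "PLAN0001" to table-plan (1)
           move 100.25 to table-member (1)
      * Prepare row 2 the way the posted code does. The move
      * starts at byte 1, so it covers row 1 as well as row 2.
           add 1 to ws-table-counter
           compute ws-table-position =
               function length (ws-table-items (1))
                   * ws-table-counter
           end-compute
           if ws-table-position > 0
               move low-values
                   to ws-repository (1:ws-table-position)
           end-if
           if table-plan (1) = low-values
               display "prefix move: row 1 was wiped"
           else
               display "prefix move: row 1 survived"
           end-if
      * Reload row 1, then clear only the new row by subscript.
           move "PLAN0001" to table-plan (1)
           move 100.25 to table-member (1)
           move low-values to ws-table-items (ws-table-counter)
           if table-plan (1) = "PLAN0001"
               display "single-row move: row 1 survived"
           else
               display "single-row move: row 1 was wiped"
           end-if
           stop run.
```

    Neither move is needed if both fields of every row are stored during the load; the sketch only shows which bytes each form touches.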
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Greg Wallace@gregwebace@gmail.com to comp.lang.cobol on Tue Jun 5 23:42:04 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, 6 June 2018 15:20:29 UTC+10, Richard wrote:
    On Wednesday, June 6, 2018 at 4:05:26 PM UTC+12, Kellie Fitton wrote:
    On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
    On 6/5/2018 4:25 PM, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?

    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?

    "ready to get populated with data"

    What do you think happens that makes the entry "ready" other than just incrementing the ODO? In an ODO table _all_ the entries, all 53000 of them, exist all the time as defined. The only thing that ODO adds is a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.

    Your question was: "What is the most optimized method to initialize a mammoth table?".

    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.


    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row. The
    ws-table-counter has the higher position in the repository
    table effectively occupied when the initialization is going
    to be made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01 ws-table-counter pic 9(5) comp-5 value 0.
    01 ws-table-position pic 9(5) comp-5 value 0.
    01 ws-repository.
    03 ws-table-items occurs 1 to 53000 times depending on
    ws-table-counter
    ascending table-plan
    indexed by table-index.
    05 table-plan pic x(8).
    05 table-member pic s9(9)v99 comp-3.

    compute ws-table-position =
    (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)

    """The table needs to be initialized (formatted) prior to being populated"""

    NO IT DOES NOT. You seem to be incredibly resistant to advice.

    'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.

    In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.

    If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.

    """The initialization is done As-Needed-Only for each table-row."""

    That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.

    Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.

    At the very least you can now speed up the 'initialization' by simply doing:

    add 1 to ws-table-counter
    move low-values to ws-table-items(ws-table-counter)

    which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.

    Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.


    Roger the above.

    I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.

    You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?



    +1 to this.


    Richard,

    I already tested and compared several sets of initialization
    methods before posting my question. The Initialize process was
    conducted with the following methods:

    initialize ws-repository

    move low-values to ws-repository

    perform varying loop
    move spaces to table-plan
    move zeros to table-member
    end-perform

    calculate ws-table-position

    The last method was faster by a considerable margin [70%].

    Then I would posit that your description of what your code does is simply not true.

    The 'calculate' code would move low-values to the table from byte 1 for the number of entries, up to whatever ws-table-counter held. You claimed that you did:

    """The initialization is done As-Needed-Only for each table-row."""

    and previously you had claimed:

    """The initialization logic is: [format-table-items-as-needed-only].
    It will initialize the first occurrence in the table, then after
    putting something in the first-position-of-the-table (1), the
    next position will be initialized when the table have the second
    occurrence and ready to get populated with data for position (2)"""

    It is as if you are unaware of what you are actually doing in the code.

    It may well be that the 'calculate ws-table-position' is 'faster', especially when ws-table-counter is zero, as it will be in your code sample, because it will do nothing.

    If ws-table-counter is > zero and you are doing this 'for each table-row' as you load the data then it is overwriting the data already loaded.

    Get your act together and work out what your code really is and what it is supposed to be doing; post actual code from your compiled program instead of retyping what you guess it to be; and stop wasting everyone's time.

    I would like to take a step back. You opened a thread with 'Can mighty COBOL carry an elephant'. That is a brilliant title and you have engaged much discussion. While I may soften Richard's remarks, you did leave a lot of gaps in the explanation.

    Discussion about in-memory tables and initialization somewhat bores me.

    As a business analyst, I would want to know more about the application to look at why this is necessary. There may be some other solution.

    Greg
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Louis Krupp@lkrupp@nospam.pssw.com.invalid to comp.lang.cobol on Wed Jun 6 01:13:43 2018
    From Newsgroup: comp.lang.cobol

    On Tue, 5 Jun 2018 20:49:31 -0700 (PDT), Kellie Fitton
    <KELLIEFITTON@yahoo.com> wrote:
    On Tuesday, June 5, 2018 at 5:19:58 PM UTC-7, Clark F Morris wrote:
    On Tue, 5 Jun 2018 08:26:12 -0700 (PDT), Kellie Fitton
    <KELLIEFITTON@yahoo.com> wrote:

    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?

    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?

    "ready to get populated with data"

    What do you think happens that makes the entry "ready" other than just incrementing the ODO? In an ODO table _all_ the entries, all 53000 of them, exist all the time as defined. The only thing that ODO adds is a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.

    Your question was: "What is the most optimized method to initialize a mammoth table?".

    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.


    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row. The
    ws-table-counter has the higher position in the repository
    table effectively occupied when the initialization is going
    to be made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01 ws-table-counter pic 9(5) comp-5 value 0.
    01 ws-table-position pic 9(5) comp-5 value 0.
    01 ws-repository.
    03 ws-table-items occurs 1 to 53000 times depending on
    ws-table-counter
    ascending table-plan
    indexed by table-index.
    05 table-plan pic x(8).
    05 table-member pic s9(9)v99 comp-3.

    compute ws-table-position =
    (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)

    If both table-plan and table-member are filled in then the
    initialization is a waste of computer cycles since the filling of
    table-plan will overwrite the low-values in position 1 of table-plan.

    Clark Morris


    Clark,

    As I have mentioned in this thread previously, the initialize
    process must happen periodically to reset, refresh the table
    and prepare it for a new set of replacement data. The reset
    and initialize process is much faster when done based on the
    number of entries in the table: [calculate table-position].

    Why not just reset ws-table-counter and possibly table-index,
    depending on what you're using to store newly read records?

    Louis
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kerry Liles@kerry.liles@gmail.com to comp.lang.cobol on Wed Jun 6 09:46:42 2018
    From Newsgroup: comp.lang.cobol

    On 6/6/2018 12:11 AM, Kellie Fitton wrote:
    On Tuesday, June 5, 2018 at 7:24:30 PM UTC-7, pete dashwood wrote:
    On 3/06/2018 9:53 PM, Richard wrote:
    On Sunday, June 3, 2018 at 7:28:19 PM UTC+12, pete dashwood wrote:

    In terms of execution efficiency of the ISAM solution, it comes down
    largely to how much of the file you can buffer in memory, but if you ran
    a benchmark I think you would be agreeably surprised by the speed of it.
    The actual processing logic is certainly much simpler than manipulating
    and initializing your table, if the table is truly "large".

    ISAM lookups may be 10 times _slower_ than a table SEARCH ALL.

    I just did a benchmark on a slow system. 5,000,000 SEARCH ALLs on a 50,000 sized table is < 2 seconds. The same number of ISAM reads on the same data takes 20 sec. Load of the table from the ISAM file is insignificant.

    YMMV

    Thanks for that, Richard. It appears that random ISAM access may be
    worse than I would have expected...

    That reinforces the case for loading the table and then using the table
    for random retrieval (where this makes sense to do, of course.)

    I noted (and completely agree with) comments by you and Clark under
    Kellie's post.

    There seems to be some fundamental mis-understanding about
    "initializing" then overwriting.

    Hopefully, the posts have helped to clear it up.

    Pete.

    --
    I used to write COBOL; now I can do anything...


    Pete,

    As mentioned in my question: the table needs to get refreshed
    and reset Constantly. The initialize process must REMOVE the
    old data from the table, then the NEW SET OF FRESH DATA will
    re-populate the table periodically. Hence, re-initialize...


    As others have pointed out, there is no need to clear the table if your program simply keeps track of the 'currently' highest used entry in the
    table (setting that back to 0 or 1 or whatever you like to use as the
    first entry effectively "clears" the table)
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Wed Jun 6 10:26:03 2018
    From Newsgroup: comp.lang.cobol


    As a business analyst, I would want to know more about the application to look at why this is necessary. There may be some other solution.

    Greg


    Greg,

    The initialize process resets the table before a new set of
    data re-populates the table. Moving low-values to the table
    with the calculated table position works very well and fast.
    It clears only the number of data entry occurrences without
    initializing the entire table [max size].

    compute ws-table-position =
    (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)
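
    One caveat raised earlier in the thread is that low-values may not be a valid packed-decimal value, depending on the implementation. A small check along these lines, assuming a fixed-format compiler such as GnuCOBOL, shows what a given compiler makes of it. The program name and the single-row group ws-row are invented for illustration; the two fields copy the row layout above. No particular result is claimed for the first test, since that is the implementation-dependent part.

```cobol
       identification division.
       program-id. lowvalchk.
       data division.
       working-storage section.
       01  ws-row.
           05  table-plan        pic x(8).
           05  table-member      pic s9(9)v99 comp-3.
       procedure division.
       main-para.
      * Fill the whole row with X'00' bytes, as a group MOVE does.
           move low-values to ws-row
           if table-member is numeric
               display "after low-values: numeric"
           else
               display "after low-values: NOT numeric"
           end-if
      * Store a real value, as the load step would.
           move zero to table-member
           if table-member is numeric
               display "after move zero:  numeric"
           else
               display "after move zero:  NOT numeric"
           end-if
           stop run.
```

    Whatever the first test reports, the field only matters once a real value has been moved into it.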

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Wed Jun 6 10:28:52 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, June 6, 2018 at 12:13:45 AM UTC-7, Louis Krupp wrote:

    Why not just reset ws-table-counter and possibly table-index,
    depending on what you're using to store newly read records?

    Louis


    Louis,

    I reset the ws-table-counter back to 1 and it did clear
    the old entries from the table. Thanks...

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Wed Jun 6 10:30:13 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, June 6, 2018 at 6:46:45 AM UTC-7, Kerry Liles wrote:

    As others have pointed out, there is no need to clear the table if your program simply keeps track of the 'currently' highest used entry in the table (setting that back to 0 or 1 or whatever you like to use as the
    first entry effectively "clears" the table)


    Kerry,

    I reset the ws-table-counter back to 1 and it did clear
    the old entries from the table. Thanks...


    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Wed Jun 6 12:22:55 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, June 7, 2018 at 5:26:05 AM UTC+12, Kellie Fitton wrote:
    As a business analyst, I would want to know more about the application to look at why this is necessary. There may be some other solution.

    Greg


    Greg,

    The initialize process resets the table before a new set of
    data re-populates the table. Moving low-values to the table
    with the calculated table position works very well and fast.
    It clears only the number of data entry occurrences without
    initializing the entire table [max size].

    compute ws-table-position =
    (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)

    You haven't answered the question that Greg asked, which is: "why this is necessary". You have been told, several times, that it is not, and yet you continue to 'refine and optimize the unrequired'.

    But to a more substantial criticism of your latest mechanism.

    You originally posted the code above, and then later claimed it was 70% faster than the obvious and simpler code of a simple move of low-values to the whole table (given that is what you are comparing to). When you did that the code had a ws-table-counter value of zero, meaning that no move was done at all.

    So the question arises as to what value you are putting into ws-table-counter before finding it to be working 'well and fast' or indeed 70% faster than something else.

    And then we come to "It clears only the number of data entry occurrences" and "before a new set of data re-populates the table". What mechanism have you implemented to determine ahead of time how many occurrences will be required by 'the new set of data'?

    What value will you put in ws-table-counter (the ODO and basis of the calculation) when you are about to read the file that contains the data items? In particular you need to answer this for the first time the program populates the table.

    Will you pre-read the data file and count the items that require an entry? Your mechanism will either be "well" _or_ "fast" but certainly not both.

    If you do pre-read (and thus double the re-populate time) then how will you ensure that additional records are not added to the file by other processes between the pre-read prior to the initialization (which depends on knowing how many entries are required) and the re-read to do the populating?

    In fact _any_ mechanism that could possibly identify the correct number to use for ws-table-counter in order to "clear only the number of data entry occurrences" will take longer than the "70% faster" that you claim (unsubstantiated) for your code.
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Wed Jun 6 12:37:10 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, June 6, 2018 at 6:42:06 PM UTC+12, Greg Wallace wrote:
    On Wednesday, 6 June 2018 15:20:29 UTC+10, Richard wrote:
    On Wednesday, June 6, 2018 at 4:05:26 PM UTC+12, Kellie Fitton wrote:
    On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
    On 6/5/2018 4:25 PM, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?

    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?

    "ready to get populated with data"

    What do you think happens that makes the entry "ready" other than just incrementing the ODO? In an ODO table _all_ the entries, all 53000 of them, exist all the time as defined. The only thing that ODO adds is a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.

    Your question was: "What is the most optimized method to initialize a mammoth table?".

    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.


    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row. The
    ws-table-counter has the higher position in the repository
    table effectively occupied when the initialization is going
    to be made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01 ws-table-counter pic 9(5) comp-5 value 0.
    01 ws-table-position pic 9(5) comp-5 value 0.
    01 ws-repository.
    03 ws-table-items occurs 1 to 53000 times depending on
    ws-table-counter
    ascending table-plan
    indexed by table-index.
    05 table-plan pic x(8).
    05 table-member pic s9(9)v99 comp-3.

    compute ws-table-position =
    (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)

    """The table needs to be initialized (formatted) prior to being populated"""

    NO IT DOES NOT. You seem to be incredibly resistant to advice.

    'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.

    In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.

    If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.

    """The initialization is done As-Needed-Only for each table-row."""

    That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.

    Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.

    At the very least you can now speed up the 'initialization' by simply doing:

    add 1 to ws-table-counter
    move low-values to ws-table-items(ws-table-counter)

    which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.

    Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.


    Roger the above.

    I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.

    You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?



    +1 to this.


    Richard,

    I already tested and compared several sets of initialization
    methods before posting my question. The Initialize process was
    conducted with the following methods:

    initialize ws-repository

    move low-values to ws-repository

    perform varying loop
    move spaces to table-plan
    move zeros to table-member
    end-perform

    calculate ws-table-position

    The last method was faster by a considerable margin [70%].

    Then I would posit that your description of what your code does is simply not true.

    The 'calculate' code would move low-values to the table from byte 1 for the number of entries, up to whatever ws-table-counter held. You claimed that you did:

    """The initialization is done As-Needed-Only for each table-row."""

    and previously you had claimed:

    """The initialization logic is: [format-table-items-as-needed-only].
    It will initialize the first occurrence in the table, then after
    putting something in the first-position-of-the-table (1), the
    next position will be initialized when the table have the second
    occurrence and ready to get populated with data for position (2)"""

    It is as if you are unaware of what you are actually doing in the code.

    It may well be that the 'calculate ws-table-position' is 'faster', especially when ws-table-counter is zero, as it will be in your code sample, because it will do nothing.

    If ws-table-counter is > zero and you are doing this 'for each table-row' as you load the data then it is overwriting the data already loaded.

    Get your act together and work out what your code really is and what it is supposed to be doing; post actual code from your compiled program instead of retyping what you guess it to be; and stop wasting everyone's time.

    I would like to take a step back. You opened a thread with 'Can mighty COBOL carry an elephant'. That is a brilliant title and you have engaged much discussion. While I may soften Richard's remarks, you did leave a lot of gaps in the explanation.

    My remarks are likely to be no worse than he would get at his first code review should he ever get a job as a trainee coder. In fact, given his refusal to take advice, from several here, that initialization is not required beyond setting the correct ODO (and he even gets doing that wrong), and his explanations often not matching the code he supplies, then it is likely that he wouldn't survive his second code review.

    Discussion about in memory tables and initialization somewhat bores me.

    As a business analyst, I would want to know more about the application to look at why this is necessary. There may be some other solution.

    You are correct. While the table lookup may be significantly faster than the simple solution (as proposed by Pete and others) of just doing a lookup on the ISAM file - adding an index on 'plan' if it doesn't already exist, the complexity of determining when a re-population is necessary may completely nullify that saving.

    But I don't think that he has any real application; he certainly hasn't thought it through.
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Wed Jun 6 12:43:55 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, June 7, 2018 at 5:28:53 AM UTC+12, Kellie Fitton wrote:
    On Wednesday, June 6, 2018 at 12:13:45 AM UTC-7, Louis Krupp wrote:

    Why not just reset ws-table-counter and possibly table-index,
    depending on what you're using to store newly read records?

    Louis


    Louis,

    I reset the ws-table-counter back to 1 and it did clear
    the old entries from the table. Thanks...

    Actually that will "clear the old entries" except 1.

    Being 'off by 1' is a common failure of the novice programmer and something that must be vigorously checked for.
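
    A short sketch of the difference, assuming a fixed-format compiler such as GnuCOBOL. The program name, the 100-row limit, the OLD0000n keys and the lookup-old paragraph are invented for illustration. With the counter at 1 the first old row is still inside the ODO limit; with the counter at 0 the table is logically empty, and the guard keeps SEARCH ALL away from it.

```cobol
       identification division.
       program-id. offbyone.
       data division.
       working-storage section.
       01  ws-table-counter      pic 9(5) comp-5 value 0.
       01  ws-found              pic x value "N".
       01  ws-repository.
           03  ws-table-items occurs 1 to 100 times
                   depending on ws-table-counter
                   ascending key is table-plan
                   indexed by table-index.
               05  table-plan    pic x(8).
               05  table-member  pic s9(9)v99 comp-3.
       procedure division.
       main-para.
      * Old data set: two rows.
           move 2 to ws-table-counter
           move "OLD00001" to table-plan (1)
           move 1 to table-member (1)
           move "OLD00002" to table-plan (2)
           move 2 to table-member (2)
      * "Reset" to 1: row 1 is still inside the ODO limit.
           move 1 to ws-table-counter
           perform lookup-old
           display "counter = 1, OLD00001 found: " ws-found
      * Reset to 0: the table is logically empty. The loader would
      * then ADD 1 before storing each new row.
           move 0 to ws-table-counter
           perform lookup-old
           display "counter = 0, OLD00001 found: " ws-found
           stop run.
       lookup-old.
           move "N" to ws-found
      * Guard: do not search a table that has no rows in use.
           if ws-table-counter > 0
               search all ws-table-items
                   at end
                       continue
                   when table-plan (table-index) = "OLD00001"
                       move "Y" to ws-found
               end-search
           end-if.
```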

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Wed Jun 6 19:56:02 2018
    From Newsgroup: comp.lang.cobol

    In article <a8d646ea-8790-49cb-8888-a0bd2b3687f7@googlegroups.com>,
    Richard <riplin@azonic.co.nz> wrote:

    [snip]

    Now the code has ws-table-items with the occurs (which
    may be an improvement) rather than a subsidiary filler field with the
    occurs.

    Mr Plinston, upon seeing that first posting and its inability to address a given table entry my suspicions of 'how did this ever pass Prod review?' waxed.

    They've yet to wane. I don't know what's going on here but olfaction indicates a mix of both 'please do my job' and 'please do my homework'... maybe 'please do my on-the-job-training'?

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Wed Jun 6 20:03:13 2018
    From Newsgroup: comp.lang.cobol

    In article <7621f8ab-def8-4886-9363-8bfa8993dd39@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    [snip]

    The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.

    Don't worry about not paying any attention to work-related questions you
    are asking others to assist you with for free. I, for one, am paying double-less attention in my responses.

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Wed Jun 6 20:08:52 2018
    From Newsgroup: comp.lang.cobol

    In article <d630c630-3881-4a87-881a-a6c7a55aaf55@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    [snip]

    When I posted my question I said: the table needs to be
    initialized CONSTANTLY.

    This is, at best, loose terminology. If something is being done
    constantly then nothing else is being done.

    Also, as I have mentioned in
    this thread Previously, the table needs to get Refreshed
    and Reset PERIODICALLY before a new set of fresh data
    can re-populate the table.

    Sorry, there's no time to refresh and reset... something else is being
    done constantly, remember?

    I suggest we start afresh. Assuming that the program to which you are referring has already been written:

    1) What does the program currently do?

    2) What should the program be doing better?

    Assuming that the program has not already been written:

    0) What is the program going to do?

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Richard@riplin@azonic.co.nz to comp.lang.cobol on Wed Jun 6 14:01:01 2018
    From Newsgroup: comp.lang.cobol

    On Wednesday, June 6, 2018 at 4:05:26 PM UTC+12, Kellie Fitton wrote:
    On Tuesday, June 5, 2018 at 3:11:51 PM UTC-7, Kerry Liles wrote:
    On 6/5/2018 4:25 PM, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?

    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?

    "ready to get populated with data"

    What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.

    Your question was: "What is the most optimized method to initialize a mammoth table?".

    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.


    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row. The
    ws-table-counter has the higher position in the repository
    table effectively occupied when the initialization is going
    to be made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01  ws-table-counter    pic 9(5) comp-5 value 0.
    01  ws-table-position   pic 9(5) comp-5 value 0.
    01  ws-repository.
        03  ws-table-items occurs 1 to 53000 times depending on
                           ws-table-counter
                           ascending table-plan
                           indexed by table-index.
            05  table-plan      pic x(8).
            05  table-member    pic s9(9)v99 comp-3.

    compute ws-table-position =
        (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)

    """The table needs to be initialized (formatted) prior to being populated"""

    NO IT DOES NOT. You seem to be incredibly resistant to advice.

    'Initialization' does NOT 'format' the data area. The 'format' of the table items is set by picture clauses during the compile.

    In fact, low-values may not be valid in table-member, depending on implementation, because the final nibble will be the code for the sign value. This actually doesn't matter because you will be moving a valid number to it before it is used anyway.

    If you are populating the table sequentially, and thus moving data into both fields for every occurrence up to ws-table-counter, then the 'initialization' is just a waste of time. The resulting table will be identical without it.

    """The initialization is done As-Needed-Only for each table-row."""

    That is NOT what your code is doing. Your code is moving low-values from byte 1 of the whole table up to the current limit of the table. If you are executing this code for each data item then you are overwriting the data already in the table each time.

    Your code seems to change each time you post, and you seem to get it wrong each time. Now the code has ws-table-items with the occurs (which may be an improvement) rather than a subsidiary filler field with the occurs.

    At the very least you can now speed up the 'initialization' by simply doing:

    add 1 to ws-counter
    move low-values to ws-table-items(ws-counter)

    which will avoid overwriting all the current data items already loaded and will be faster than reference notation. But it is still a complete waste.

    Another improvement is to 'move low-values to ws-repository' before loading any data (as was suggested by Kerry). This is likely to be much faster than doing it item by item because of the overhead of using a subscript and of doing thousands of moves rather than just one.
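
    A minimal sketch of the two approaches compared above, for anyone who wants to time them on their own compiler. It assumes free-format source (for example GnuCOBOL's cobc -free); the program name clear-demo and the subscript ws-sub are made up for illustration, and the table layout is copied from the posted declaration with 'key is' spelled out. One point to check on your compiler: because ws-table-counter lies outside ws-repository, the length of the receiving group is normally taken from the current counter value, so the single move covers only the entries currently inside the ODO window.

    ```cobol
    identification division.
    program-id. clear-demo.
    data division.
    working-storage section.
    01  ws-table-counter     pic 9(5) comp-5 value 0.
    *> ws-sub is a hypothetical subscript, not part of the posted code.
    01  ws-sub               pic 9(5) comp-5 value 0.
    01  ws-repository.
        03  ws-table-items occurs 1 to 53000 times
                depending on ws-table-counter
                ascending key is table-plan
                indexed by table-index.
            05  table-plan   pic x(8).
            05  table-member pic s9(9)v99 comp-3.
    procedure division.
    main-para.
        *> Open the ODO window to its maximum so both methods
        *> touch the same number of entries.
        move 53000 to ws-table-counter

        *> Method 1: a single group move over the active entries.
        move low-values to ws-repository

        *> Method 2: one move per entry, addressed by subscript.
        perform varying ws-sub from 1 by 1
                until ws-sub > ws-table-counter
            move low-values to ws-table-items (ws-sub)
        end-perform

        *> Reset the window before loading the next data set.
        move 0 to ws-table-counter
        stop run.
    ```

    Neither method is needed when every field of every entry is going to be overwritten during the load; the sketch is only a starting point for measuring the difference.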


    Roger the above.

    I suggest that you post code that has been compiled and TESTED rather than just making up more stuff on the fly and getting it wrong.

    You should be testing the speed of each of these methods and also seeing that the results match what you expect. Why aren't you doing that?



    +1 to this.


    Richard,

    I already tested and compared several sets of initialization
    methods before posting my question. The Initialize process was
    conducted with the following methods:

    I am not convinced that your claim is true. If you had "tested and compared several methods" then you would not have posted code that was the _worst_ example, by many thousand percent, and that didn't do it as you claimed ('off by 1 error').

    initialize ws-repository

    move low-values to ws-repository

    perform varying loop
    move spaces to table-plan
    move zeros to table-member
    end-perform

    calculate ws-table-position

    The last method was faster by a considerable margin [70%].

    I have done some testing of your 'calculate ws-table-position' and can't get it significantly faster than 'move low-values to ws-repository' without reducing the number of entries that it clears to being much smaller numbers. To get it 70% faster would require only clearing about a third or less of the table. So how are you determining the number that does need to be cleared? What number did you use in your test?
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Greg Wallace@gregwebace@gmail.com to comp.lang.cobol on Wed Jun 6 17:53:34 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, 7 June 2018 06:08:53 UTC+10, docd...@panix.com wrote:
    In article <d630c630-3881-4a87-881a-a6c7a55aaf55@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    [snip]

    When I posted my question I said: the table needs to be
    initialized CONSTANTLY.

    This is, at best, loose terminology. If something is being done
    constantly then nothing else is being done.

    Also, as I have mentioned in
    this thread Previously, the table needs to get Refreshed
    and Reset PERIODICALLY before a new set of fresh data
    can re-populate the table.

    Sorry, there's no time to refresh and reset... something else is being
    done constantly, remember?

    I suggest we start afresh. Assuming that the program to which you are referring has already been written:

    1) What does the program currently do?

    2) What should the program be doing better?

    Assuming that the program has not already been written:

    0) What is the program going to do?

    DD

    Kellie, you are sending everyone into a spin and I see many trying to help. You are not explaining the application.

    You have some master data that is constantly updated. Why? What is the nature of it?

    An example of a different approach could be that the master data needs an alternate key.

    Greg
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Thu Jun 7 15:54:50 2018
    From Newsgroup: comp.lang.cobol

    On 7/06/2018 5:30 AM, Kellie Fitton wrote:
    On Wednesday, June 6, 2018 at 6:46:45 AM UTC-7, Kerry Liles wrote:

    As others have pointed out, there is no need to clear the table if your
    program simply keeps track of the 'currently' highest used entry in the
    table (setting that back to 0 or 1 or whatever you like to use as the
    first entry effectively "clears" the table)


    Kerry,

    I reset the ws-table-counter back to 1 and it did clear
    the old entries from the table. Thanks...


    Kellie,

    I think I can see where your confusion was and it is understandable. WHY
    no initialization is required is not immediately obvious. Fortunately,
    you received some good advice...

    To summarize; it works like this:

    1. If you use ODO, then ODO has the value for the highest entry in your
    table and it won't let anything access data that may have been
    previously stored "beyond" that entry number. (You have a "window" of
    entries between 1 and this number...)

    2. If you DON'T use ODO, and write your own binary chop to search the
    table, then you must count the entries as you add them and this will be
    the highest entry, just like ODO had, above. (In your homemade binary
    chop you will use this number for your search calculation but you will
    never access anything above it. Just like ODO, you have, in effect, the
    same "window" of entries between 1 and this number.)

    So, in either case, there is no need to ever initialize the table
    because anything above the high entry cannot be accessed, and everything
    in the active "window" is going to be overwritten...
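
    The "window" idea can be sketched as follows. This is an illustration only, assuming free-format source (for example GnuCOBOL's cobc -free); the program name odo-window, the work fields ws-new-plan, ws-new-member and ws-wanted, the paragraph names and the sample plan codes are all invented for the example. The table layout follows the declaration posted earlier in the thread, with 'key is' spelled out.

    ```cobol
    identification division.
    program-id. odo-window.
    data division.
    working-storage section.
    01  ws-table-counter     pic 9(5) comp-5 value 0.
    *> Hypothetical work fields, not part of the posted code.
    01  ws-new-plan          pic x(8).
    01  ws-new-member        pic s9(9)v99 comp-3.
    01  ws-wanted            pic x(8).
    01  ws-repository.
        03  ws-table-items occurs 1 to 53000 times
                depending on ws-table-counter
                ascending key is table-plan
                indexed by table-index.
            05  table-plan   pic x(8).
            05  table-member pic s9(9)v99 comp-3.
    procedure division.
    main-para.
        *> First data set: three entries, loaded in ascending key order.
        move 0 to ws-table-counter
        move "PLAN-A" to ws-new-plan
        move 100.00 to ws-new-member
        perform add-entry
        move "PLAN-B" to ws-new-plan
        move 200.00 to ws-new-member
        perform add-entry
        move "PLAN-C" to ws-new-plan
        move 300.00 to ws-new-member
        perform add-entry

        *> "Clear" the table by resetting the counter; no bytes are touched.
        move 0 to ws-table-counter

        *> Second data set: a single entry that overwrites slot 1.
        move "PLAN-X" to ws-new-plan
        move 999.00 to ws-new-member
        perform add-entry

        *> PLAN-B is still physically in slot 2 but lies outside the window.
        move "PLAN-B" to ws-wanted
        perform lookup
        move "PLAN-X" to ws-wanted
        perform lookup
        stop run.

    add-entry.
        *> Grow the window by one, then fill every field of the new entry.
        add 1 to ws-table-counter
        move ws-new-plan to table-plan (ws-table-counter)
        move ws-new-member to table-member (ws-table-counter).

    lookup.
        *> SEARCH ALL only considers entries 1 thru ws-table-counter.
        search all ws-table-items
            at end
                display ws-wanted " not found"
            when table-plan (table-index) = ws-wanted
                display ws-wanted " found"
        end-search.
    ```

    If the sketch behaves as intended, the lookup for PLAN-B reports "not found" after the reset even though its bytes were never overwritten, while PLAN-X is found. The counter is set to 0 only momentarily and is incremented before any table reference, since 0 lies outside the declared 1 to 53000 range.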

    Pete.
    --
    I used to write COBOL; now I can do anything...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Thu Jun 7 15:57:50 2018
    From Newsgroup: comp.lang.cobol

    On 7/06/2018 5:30 AM, Kellie Fitton wrote:
    On Wednesday, June 6, 2018 at 6:46:45 AM UTC-7, Kerry Liles wrote:

    As others have pointed out, there is no need to clear the table if your
    program simply keeps track of the 'currently' highest used entry in the
    table (setting that back to 0 or 1 or whatever you like to use as the
    first entry effectively "clears" the table)


    Kerry,

    I reset the ws-table-counter back to 1 and it did clear
    the old entries from the table. Thanks...


    It didn't really clear them, it just made the upper limit equal 1 entry
    so your active "window" became 1 entry...

    That's why Kerry said "effectively". :-)

    Pete.
    --
    I used to write COBOL; now I can do anything...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Thu Jun 7 16:15:19 2018
    From Newsgroup: comp.lang.cobol

    On 7/06/2018 12:53 PM, Greg Wallace wrote:
    On Thursday, 7 June 2018 06:08:53 UTC+10, docd...@panix.com wrote:
    In article <d630c630-3881-4a87-881a-a6c7a55aaf55@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    [snip]

    When I posted my question I said: the table needs to be
    initialized CONSTANTLY.

    This is, at best, loose terminology. If something is being done
    constantly then nothing else is being done.

    Also, as I have mentioned in
    this thread Previously, the table needs to get Refreshed
    and Reset PERIODICALLY before a new set of fresh data
    can re-populate the table.

    Sorry, there's no time to refresh and reset... something else is being
    done constantly, remember?

    I suggest we start afresh. Assuming that the program to which you are
    referring has already been written:

    1) What does the program currently do?

    2) What should the program be doing better?

    Assuming that the program has not already been written:

    0) What is the program going to do?

    DD

    Kellie, you are sending everyone into a spin and I see many trying to help.

    Speak for yourself, Greg... I'm not spun by anything here and I can see
    other posts that aren't either. :-)

    You are not explaining the application.

    I believe Kellie has given it his/her best shot, but there is some
    looseness in the language.

    It is pretty well established that a "large" table needs to be created
    and sequenced before it is used. It is also established that new "sets"
    of data arrive periodically to replace what was there before.

    Problems have happened due to a misconception about the need to
    initialize the table. I suspect those misconceptions are now resolved.




    You have some master data that is constantly updated. Why? What is the nature of it?

    As the original enquiry was about initializing the table, the
    application use of the data is not really pertinent to the discussion
    and risks just strewing more confusion.

    An example of a different approach could be that the master data needs an alternate key.

    Or it could be any one of the many suggestions already made... :-)

    Pete.
    --
    I used to write COBOL; now I can do anything...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Thu Jun 7 16:17:50 2018
    From Newsgroup: comp.lang.cobol

    On 6/06/2018 3:38 PM, Kellie Fitton wrote:
    On Tuesday, June 5, 2018 at 3:22:54 PM UTC-7, Richard wrote:
    On Wednesday, June 6, 2018 at 3:26:14 AM UTC+12, Kellie Fitton wrote:
    On Monday, June 4, 2018 at 1:33:53 PM UTC-7, Richard wrote:

    The question still arises: why are you bothering to initialize the fields that you are going to overwrite ?

    Is it because you wrongly think that a MOVE "stops when the end of the shortest field is encountered" and thus might leave junk in the receiving field ?

    "ready to get populated with data"

    What do you think happens that makes the entry "ready" other than just incrementing the ODO ? In an ODO table _all_ the entries, all 53000 of them exist all the time as defined. The only thing that ODO adds is setting a virtual upper bound check. If you are going to be moving data to all the subfields then 'initialization' adds nothing.

    Your question was: "What is the most optimized method to initialize a mammoth table?".

    The answer is, in the case you describe with ODO: Don't bother with initializing when the initialization just gets completely overwritten.


    Richard,

    The table needs to be initialized (formatted) prior to being
    populated with the data collected from the master file. The
    initialization is done As-Needed-Only for each table-row. The
    ws-table-counter has the higher position in the repository
    table effectively occupied when the initialization is going
    to be made. Below is the code that calculates the new table
    position prior to the data population into the table-row.

    01  ws-table-counter    pic 9(5) comp-5 value 0.
    01  ws-table-position   pic 9(5) comp-5 value 0.
    01  ws-repository.
        03  ws-table-items occurs 1 to 53000 times depending on
                           ws-table-counter
                           ascending table-plan
                           indexed by table-index.
            05  table-plan      pic x(8).
            05  table-member    pic s9(9)v99 comp-3.

    compute ws-table-position =
        (length of ws-table-items * ws-table-counter)
    end-compute
    move low-values to ws-repository (1:ws-table-position)

    Just to show that I am capable of running tests and timing them, which is what you should be doing, I have done 'initializing' a table 3 ways, the results are:

    Each time is for 10,000 repeats and for 50,000 entries

    1. move low-values to ws-repository : 0.12 seconds

    2. move low-values to ws-table-items(index) : 7.2 seconds

    3. move low-values to ws-repository(calculated:length of entry) : 59 seconds

    So, not only is your code wrong but it is the worst way of doing the initialization by a factor of 48000%.

    Your original code was at least 6000% slower than the best.

    And it doesn't need to be done anyway.

    Also, as yet another failure, ws-table-position is only 5 digits (pic 9(5)) and this will overflow when you multiply 14 * 53000, or 14 * any number more than 7142.
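
    The arithmetic behind that remark: each entry is 14 bytes (8 for pic x(8) plus 6 for the packed s9(9)v99 field, which holds 11 digits and a sign in 12 half-bytes), so a full table needs 14 * 53000 = 742000, which does not fit in five digits. A small sketch of the check, assuming free-format source (for example GnuCOBOL's cobc -free). The names size-check, ws-entry-length, ws-position-5 and ws-position-9 are invented for the example. It deliberately uses display-usage fields rather than the posted comp-5 declaration, because whether a comp-5 item is truncated to its picture depends on the compiler.

    ```cobol
    identification division.
    program-id. size-check.
    data division.
    working-storage section.
    *> 14 = 8 bytes of pic x(8) + 6 bytes of packed s9(9)v99.
    01  ws-entry-length      pic 9(3) value 14.
    01  ws-table-counter     pic 9(5) value 53000.
    *> Hypothetical result fields: five digits versus nine digits.
    01  ws-position-5        pic 9(5).
    01  ws-position-9        pic 9(9).
    procedure division.
    main-para.
        *> 742000 needs six digits, so the size error branch is taken.
        compute ws-position-5 = ws-entry-length * ws-table-counter
            on size error
                display "5 digits: size error"
            not on size error
                display "5 digits: " ws-position-5
        end-compute

        *> Nine digits leave ample room for the same product.
        compute ws-position-9 = ws-entry-length * ws-table-counter
            on size error
                display "9 digits: size error"
            not on size error
                display "9 digits: " ws-position-9
        end-compute
        stop run.
    ```

    Without an ON SIZE ERROR phrase the result of an overflowing compute is not something to rely on, which is one more reason to size the receiving field for the largest possible table.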

    The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.

    LOL! I think that last sentence says it all... :-)

    Pete.
    --
    I used to write COBOL; now I can do anything...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Thu Jun 7 16:28:51 2018
    From Newsgroup: comp.lang.cobol

    On 7/06/2018 8:03 AM, docdwarf@panix.com wrote:
    In article <7621f8ab-def8-4886-9363-8bfa8993dd39@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    [snip]

    The ws-table-position variable should be 9 digits, I was typing fast to
    explain the process logic while talking on my cellphone.

    Don't worry about not paying any attention to work-related questions you
    are asking others to assist you with for free. I, for one, am paying double-less attention in my responses.

    DD

    Yes, it would seem that posts on "mighty COBOL" might be "best
    avoided"... :-)

    Pete.
    --
    I used to write COBOL; now I can do anything...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Thu Jun 7 16:29:52 2018
    From Newsgroup: comp.lang.cobol

    On 7/06/2018 8:03 AM, docdwarf@panix.com wrote:
    In article <7621f8ab-def8-4886-9363-8bfa8993dd39@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:

    [snip]

    The ws-table-position variable should be 9 digits, I was typing fast to
    explain the process logic while talking on my cellphone.

    Don't worry about not paying any attention to work-related questions you
    are asking others to assist you with for free. I, for one, am paying double-less attention in my responses.

    DD

    Kellie,

    if you don't take posts here seriously, there is a real risk you will be considered a troll (or worse...) and you may not get much help when you
    really need it.

    I give you credit for being honest enough to admit what was happening,
    but try to understand that most people here are serious about their programming and although they really will help, no one likes to feel
    that their time is being wasted.

    You need to think before typing and choose your words carefully.

    Pete.
    --
    I used to write COBOL; now I can do anything...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Louis Krupp@lkrupp@nospam.pssw.com.invalid to comp.lang.cobol on Thu Jun 7 01:32:24 2018
    From Newsgroup: comp.lang.cobol

    On Tue, 5 Jun 2018 20:38:02 -0700 (PDT), Kellie Fitton
    <KELLIEFITTON@yahoo.com> wrote:
    The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.
    Please say the cell phone conversation was about COBOL.
    Louis
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Thu Jun 7 01:02:58 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, June 7, 2018 at 12:33:22 AM UTC-7, Louis Krupp wrote:
    On Tue, 5 Jun 2018 20:38:02 -0700 (PDT), Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:


    The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.

    Please say the cell phone conversation was about COBOL.

    Louis


    Hi Louis,

    As a matter of fact, it was. However, it was some bad news about my
    favorite programming language---COBOL. It is making me angry and
    emotional to say the least. I will explain why in a new post shortly.
    I would like to hear your unbiased opinion, though. Thanks...

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Greg Wallace@gregwebace@gmail.com to comp.lang.cobol on Thu Jun 7 01:26:21 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, 7 June 2018 18:02:59 UTC+10, Kellie Fitton wrote:
    On Thursday, June 7, 2018 at 12:33:22 AM UTC-7, Louis Krupp wrote:
    On Tue, 5 Jun 2018 20:38:02 -0700 (PDT), Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:


    The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.

    Please say the cell phone conversation was about COBOL.

    Louis


    Hi Louis,

    As a matter of fact, it was. However, it was some bad news about my
    favorite programming language---COBOL. It is making me angry and
    emotional to say the least. I will explain why in a new post shortly.
    I would like to hear your unbiased opinion, though. Thanks...

    Sorry to hear that Kellie and I look forward to a new post.

    If I was your Boss, I would not be questioning the COBOL language but the why. Why do you need a binary search on a constantly varying table? Why is it constantly varying? Is there a better way of doing it?

    Greg
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Thu Jun 7 03:01:45 2018
    From Newsgroup: comp.lang.cobol

    On Thursday, June 7, 2018 at 1:26:22 AM UTC-7, Greg Wallace wrote:
    On Thursday, 7 June 2018 18:02:59 UTC+10, Kellie Fitton wrote:
    On Thursday, June 7, 2018 at 12:33:22 AM UTC-7, Louis Krupp wrote:
    On Tue, 5 Jun 2018 20:38:02 -0700 (PDT), Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:


    The ws-table-position variable should be 9 digits, I was typing fast to explain the process logic while talking on my cellphone.

    Please say the cell phone conversation was about COBOL.

    Louis


    Hi Louis,

    As a matter of fact, it was. However, it was some bad news about my favorite programming language---COBOL. It is making me angry and
    emotional to say the least. I will explain why in a new post shortly.
    I would like to hear your unbiased opinion, though. Thanks...

    Sorry to hear that Kellie and I look forward to a new post.

    If I was your Boss, I would not be questioning the COBOL language but the why. Why do you need a binary search on a constantly varying table? Why is it constantly varying? Is there a better way of doing it?

    Greg


    Hi Greg,

    First, one of my programs functions as a sifting thread; it will
    collect certain data from a master file based on some qualifying
    criteria, patterns and relevant information. The collected data
    are loaded into the table temporarily for the purpose of lookup
    and comparison against counterpart data mined from another file.

    Once the analysis is done, these sets of data must be removed
    from the table and replaced with fresh data to repeat the
    same process again. Working with Tables Without the ODO clause,
    I always initialize the table to reset/format/refresh the table
    to prepare it for the re-populate process. The binary search all
    was selected to increase the speed of the table lookup process.

    I am looking forward to hearing your opinion in my new post.

    Regards...

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Thu Jun 7 13:30:50 2018
    From Newsgroup: comp.lang.cobol

    In article <2feb1ff3-fb62-47af-a521-61df3e782133@googlegroups.com>,
    Greg Wallace <gregwebace@gmail.com> wrote:

    [snip]

    Why do you need a binary search on a constantly varying table?
    Why is it constantly varying? Is there a better way of doing it?

    Mr Wallace, these need to be addressed in their own thread; the wealth of knowledge uncovered might astound some folks here.

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Thu Jun 7 13:49:25 2018
    From Newsgroup: comp.lang.cobol

    In article <69f834be-74d3-4031-8998-f851e2201375@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
    On Thursday, June 7, 2018 at 1:26:22 AM UTC-7, Greg Wallace wrote:

    [snip]

    If I was your Boss, I would not be questioning the COBOL language but
    the why. Why do you need a binary search on a constantly varying table?
    Why is it constantly varying? Is there a better way of doing it?

    Hi Greg,

    First, one of my programs functions as a sifting thread; it will
    collect certain data from a master file based on some qualifying
    criteria, patterns and relevant information.

    Ms Fitton - that reads more like advertising-copy and less like program
    specs, it is so broad as to be devoid of value.

    '... collect certain data' ... unless you want to 'collect all data' then almost all programs do some of this.

    '... from a master file ...' ... how is this different from collecting
    certain data 'from a subordinate file'? How does it assist the statement
    of the program's function to add this?

    '... based on some qualifying criteria, patterns and relevant
    information' ... as opposed to getting data that 'doesn't qualify for
    your criteria, is random and irrelevant'?

    This is noise, Ms Fitton, and is best kept to a minimum.

    The collected data
    are loaded into the table temporarily for the purpose of lookup
    and comparison against counterpart data mined from another file.

    So... from a master file you'll gather the customer numbers of all the left-handed taxi drivers who live in Chicago, load them into a Mammoth
    Table and then check what they purchased in October... and based on that
    you may or may not mount an ad campaign.

    Something like that?

    DD
    --- Synchronet 3.20a-Linux NewsLink 1.114