• Is mighty COBOL a fortune teller?

    From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Mon May 21 06:42:12 2018
    From Newsgroup: comp.lang.cobol

    Hi Folks.


    I cobbled together a program to perform data mining on a large
    collection of sizable ISAM files. The algorithms will mine the
    warehoused data for relevant statistics, and generate predictive
    analytics to guide management decisions and measure performance.

    However, extracting real meaning from data can be challenging,
    fiendishly complex to understand, and wildly counterintuitive.
    Major factors to consider: bad data, flawed processes, and
    misinterpretation of results can all produce false positives
    and false negatives, which can lead to inaccurate conclusions
    and injudicious business decisions. I would like to know your
    professional opinions on the following questions:

    1) In the social sciences, is it practical or useful to develop
    a predictive model?

    2) Are there any ironclad guarantees around predictive models?

    Thank you for your feedback.





    COBOL - the elephant that can stand on its trunk...
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Doc Trins O'Grace@doctrinsograce@gmail.com to comp.lang.cobol on Mon May 21 09:58:30 2018

    Good questions!
    Here are some particularly dubious answers, from the least authoritative member of this group.
    1) Yes
    2) No
    As a young fellow, I naively thought that good predictive models were unachievable because of the nature of the "soft sciences" themselves. My mother was a sociologist; I took my degrees in physics and astronomy. Hence my faulty prejudice! Unable to find employment after my education, I ended up in engineering... and they've been stuck with me for over 45 years.
    Anyway, let me give you an anecdote that could be helpful. In continuous process control we often gather large volumes of data. As part of quality control, lab testing can determine specific quality levels. However, there's a common problem: the lab analysis is often not available soon enough to adjust production controls. We have a discipline called statistical process control, but we are hardly ever really satisfied by what we can do.
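    A toy illustration of that discipline (not any plant's actual code; the baseline numbers are invented): a Shewhart-style individuals chart flags a new lab measurement that falls outside limits derived from in-control baseline data.

```python
# Minimal statistical-process-control sketch: an "individuals" chart
# with 3-sigma limits computed from a baseline of in-control results.
from statistics import mean, stdev

def control_limits(samples, k=3.0):
    """Return (lower, upper) control limits at k standard deviations."""
    m, s = mean(samples), stdev(samples)
    return m - k * s, m + k * s

def out_of_control(samples, new_value, k=3.0):
    """True if a new measurement falls outside the control limits."""
    lo, hi = control_limits(samples, k)
    return not (lo <= new_value <= hi)

baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 10.0, 9.9]
print(out_of_control(baseline, 10.05))  # False: inside the limits
print(out_of_control(baseline, 12.5))   # True: investigate the process
```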
    Nonetheless, the large volumes of data accumulated in process control and quality analysis are a wonderful resource with which a neural network can be trained.
    So, I've set up neural networks specifically to perform some type of real-time process/quality control.
    Yet, even when the neural net achieves higher than 90% accuracy, people complain. They ask, "But HOW does the net know? What rules is it using?" The nature of a neural net, however, does not easily expose how it arrives at its predictions.
    In the end, people prefer inaccuracies from a predictive model that they understand, to accuracies from one they cannot understand.
    So in my opinion, given the data issues you describe, plus all the presuppositions on which they may be founded, I think you may be in a situation that is so fundamentally complex that you will never be able to prove to yourself -- let alone others -- that you have a useful predictive model.
    Of course these issues may have been solved by others since my day. But I rather doubt it.
    I will not be offended if you find this completely useless. However, if you just say "Thank you," I can maintain the illusion that I can still be helpful.
    Good luck, ma'am.
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Mon May 21 16:55:18 2018

    Hi Doc Trins O'Grace,

    I appreciate your informative feedback. I interviewed some
    of my friends who are working as engineers and their answers
    rather surprised me. They said that to increase the accuracy
    and clairvoyant logic of their predictive analytics, they must
    leverage the power of a new wrinkle in their field: reliance
    on Machine Learning and Artificial Intelligence (AI). I find
    it shocking that the most sophisticated predictive software
    can become fully non-predictive in just a two-week period, due
    to the complexity, uncertainty and unpredictability of our
    connected world. Case in point: the collapse of financial
    services firm Lehman Brothers, and the Great Recession of
    2007, which was not predicted by economists who are trained
    to forecast and use predictive analysis.

    I must find a way to make mighty COBOL as accurate as a traveling
    gypsy. :-)


    Regards.
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Tue May 22 20:08:05 2018

    On 22/05/2018 1:42 AM, Kellie Fitton wrote:
    Hi Folks.


    I cobbled together a program to perform data mining on a large
    collection of sizable ISAM files. The algorithms will mine the
    warehoused data for relevant statistics, and generate predictive
    analytics to guide management decisions and measure performance.

    However, extracting real meaning from data can be challenging,
    fiendishly complex to understand, and wildly counterintuitive.
    Major factors to consider: bad data, flawed processes, and
    misinterpretation of results can all produce false positives
    and false negatives, which can lead to inaccurate conclusions
    and injudicious business decisions. I would like to know your
    professional opinions on the following questions:

    1) In the social sciences, is it practical or useful to develop
    a predictive model?

    Yes. BUT, there are some provisos...:

    (These provisos apply to changing the alpha factor on an ancient
    Inventory Control system to reflect seasonal demand fluctuations,
    to constructing a deep neural network for a specific AI application
    and loading tens of millions of data points into it, or to a simple
    self-modifying heuristic programming example; in other words, to ANY
    kind of software where the "results" are predicated on previous
    results and modified within desirable constraints.)

    1. The predictions (no matter what the algorithm claims) should be
    considered to be accurate to within 50%. In other words, the model can
    be used to give a "general likelihood" of what is going to happen.

    2. No financial risk of any kind must be taken, based on the prediction.

    3. The rules above don't get changed if the model is within 10%
    (unless it is run across at least 1000 datasets and ALWAYS predicts
    within 10% of the actual outcome). In other words, the "credibility"
    of the model may improve, but that doesn't alter the rules for
    using it.
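    Rule 3 can even be checked mechanically. A hypothetical harness (names, data, and the tolerance all follow the rules above; nothing here is a real system) that only credits a model's within-10% claim after at least 1000 datasets, every one within tolerance:

```python
# Hypothetical credibility check for rule 3: a model earns its
# within-10% claim only after hitting that tolerance on every one
# of at least 1000 datasets.
def within_tolerance(predicted, actual, tol=0.10):
    """True if the prediction is within tol (10%) of the actual outcome."""
    if actual == 0:
        return predicted == 0
    return abs(predicted - actual) / abs(actual) <= tol

def model_is_credible(results, min_datasets=1000, tol=0.10):
    """results: list of (predicted, actual) pairs, one per dataset."""
    if len(results) < min_datasets:
        return False
    return all(within_tolerance(p, a, tol) for p, a in results)

good = [(105.0, 100.0)] * 1000          # 1000 runs, each 5% off
print(model_is_credible(good))           # True
print(model_is_credible(good[:500]))     # False: too few datasets
print(model_is_credible(good + [(150.0, 100.0)]))  # False: one 50% miss
```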


    2) Are there any ironclad guarantees around predictive models?

    No.

    But that doesn't mean they are worthless.

    Some classes of problem can ONLY be solved by a computer using
    heuristics or AI, because solving them using traditional methods
    would take longer than the time available.

    If a heuristic model finds its way through a complex maze, the solution
    may not be the BEST one, but it is better than NO solution.

    If an AI net predicts cases of cholera within 5 miles of your location,
    you might well laugh it off but you'd probably renew your vaccination,
    just in case...

    This whole field is expanding rapidly and it is likely that much more
    reliable predictions will be available within the next few years. It
    might then become possible to relax rules 1 and 2, but for now you
    should treat the output from a predictive model with extreme skepticism,
    even when it gets it pretty much right...

    It's like the "Pirate Rules" in Pirates of the Caribbean; more of a "guideline", really.

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Tue May 22 20:13:30 2018

    On 22/05/2018 4:58 AM, Doc Trins O'Grace wrote:
    [snip]

    A very interesting read and a very good example.

    "In the end, people prefer inaccuracies from a predictive model that
    they understand, to accuracies from one they cannot understand."

    So true.

    For myself, I would never act on evidence from a model alone, but I
    would let it steer me in a direction where other confirmation could be obtained.

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Tue May 22 02:32:57 2018

    On Tuesday, May 22, 2018 at 1:08:09 AM UTC-7, pete dashwood wrote:
    [snip]

    Hi Pete,

    Thanks for the insightful pointers, will definitely keep them in mind...

    Regards.
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Tue May 22 12:15:56 2018

    In article <66385ec0-9ded-4d4b-aba9-06d151a86331@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
    Hi Folks.


    I cobbled together a program to perform data mining on a large
    collection of sizable ISAM files. The algorithms will mine the
    warehoused data for relevant statistics, and generate predictive
    analytics to guide management decisions and measure performance.

    Wow. If I had a nickel for every time I was expected to sit through
    this kind of buncombe I'd have to balance my ballast. Who's feeding
    you this tripe?


    However, extracting real meaning from data can be challenging,
    fiendishly complex to understand, and wildly counterintuitive.
    Major factors to consider: bad data, flawed processes, and
    misinterpretation of results can all produce false positives
    and false negatives, which can lead to inaccurate conclusions
    and injudicious business decisions. I would like to know your
    professional opinions on the following questions:

    1) In the social sciences, is it practical or useful to develop
    a predictive model?

    Archaeology is a social science. Linguistics is a social science.
    Psychology is a social science.


    2) Are there any ironclad guarantees around predictive models?

    The ironclad guarantee is 'there are no ironclad guarantees', including
    this one.


    Thank you for your feedback.

    Honest question: is there a client who's trolling for tripe or are you
    taking a course?

    DD
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Tue May 22 12:21:56 2018

    In article <406a4753-1048-483d-b12e-1af0cc87c389@googlegroups.com>,
    Kellie Fitton <KELLIEFITTON@yahoo.com> wrote:
    Hi Doc Trins O'Grace,

    I appreciate your informative feedback. I interviewed some
    of my friends who are working as engineers and their answers
    rather surprised me. They said that to increase the accuracy
    and clairvoyant logic of their predictive analytics, they must
    leverage the power of a new wrinkle in their field: reliance
    on Machine Learning and Artificial Intelligence (AI).

    Mechanical engineers? Software engineers? People who drive railroad locomotives?

    I find it shocking that the most sophisticated predictive
    software can become fully non-predictive in just a two-week
    period, due to the complexity, uncertainty and unpredictability
    of our connected world. Case in point: the collapse of financial
    services firm Lehman Brothers, and the Great Recession of 2007,
    which was not predicted by economists who are trained to
    forecast and use predictive analysis.

    I was working on Wall Street on 19 October 1989 and this doesn't surprise
    me in the least. Lord John Maynard Keynes remarked that 'Markets can
    remain irrational a lot longer than you or I can remain solvent.'

    DD
  • From Doc Trins O'Grace@doctrinsograce@gmail.com to comp.lang.cobol on Tue May 22 09:21:38 2018

    Ma'am, these other responses seem very helpful!
    I was going to add -- and maybe it is obvious -- that the trouble is knowing your inputs. In process control one might think it would be possible to identify everything you need to measure at any point. But the largest factories I worked in had continuous processing machinery in a single building covering over 40 acres -- this was a steel processing plant. That's a lot of stuff going on. We know that you can only control what you can measure, and neither control nor measurement is possible if the inputs aren't available.
    With the neural nets, we could learn which measurements were most or least significant by excluding a measurement, retraining the network, and seeing how its performance changed. (Historic data is wonderful for those purposes.)
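    That ablation loop can be sketched like this. The model (a one-nearest-neighbour regressor) and the data are invented stand-ins, not the original plant networks; only the drop-retrain-compare procedure mirrors the anecdote.

```python
# Toy ablation loop: drop one input column, "retrain", compare test
# error. A larger error after dropping a column means that input
# mattered more.
def knn_predict(train_x, train_y, x):
    """Predict with the single nearest training point (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_x]
    return train_y[dists.index(min(dists))]

def mean_abs_error(train, test, keep):
    """Train/test using only the input columns listed in keep."""
    tx = [[row[i] for i in keep] for row, _ in train]
    ty = [y for _, y in train]
    errs = [abs(knn_predict(tx, ty, [row[i] for i in keep]) - y)
            for row, y in test]
    return sum(errs) / len(errs)

# Column 0 drives the output; column 1 is pseudo-noise.
data = [([float(x), float((7 * x) % 5)], 2.0 * x) for x in range(20)]
train, test = data[:15], data[15:]

without_0 = mean_abs_error(train, test, keep=[1])  # real driver removed
without_1 = mean_abs_error(train, test, keep=[0])  # noise removed
print(without_0 > without_1)  # True: dropping the real driver hurts most
```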
    In a polyester manufacturing facility we found out that the ambient temperature around one of the larger machines was more important than the temperature of the polyester itself. That surprised everyone.
    Of course, a huge problem is our presuppositions regarding our measurements. In the case of some of the economic predictors that were mentioned, what if the frequency of a street light in Hyderabad turns out to be a significant measurement we never thought to include? (That might be a silly example.)
    Chaos theory was really only articulated in the '70s. It has been incredibly helpful. I think we are pretty good at simplified systems. We aren't so good at reality. Maybe that means that any predictive model will have endogenous issues that we may never understand.
    Personally, that discourages me. Nonetheless, I have an enormous amount of respect for people like you. Your unrelenting efforts to tackle these kinds of problems cannot but yield beneficial results.
    I'll be quiet now. These other guys are a lot smarter and more perspicacious than I've ever hoped to be!
  • From Kellie Fitton@KELLIEFITTON@yahoo.com to comp.lang.cobol on Tue May 22 13:07:50 2018

    Hi Doc Trins O'Grace,

    You are absolutely right -- the trouble is knowing your inputs.
    I must ask myself whether the data are good, and make sure that
    the data, as well as the processes that generate and organize
    them, are of the highest quality and fully understood. I don't
    want to spend a long time and many resources only to find a bug
    in the data. One can still get problems even with the best data.
    Garbage in, garbage out. Predictive analytics are risky by
    nature; they are valid only as long as the input data are valid.
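    As a sketch of that idea (the field names and ranges here are invented, not from my actual ISAM files), a validation pass can reject out-of-range or incomplete records before they ever reach the analytics:

```python
# Hypothetical pre-mining validation pass: keep only records whose
# fields are present and within expected ranges.
EXPECTED = {"qty": (0, 100000), "unit_price": (0.01, 10000.0)}

def validate(record):
    """Return a list of problems found in one record (empty = clean)."""
    problems = []
    for field, (lo, hi) in EXPECTED.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not (lo <= value <= hi):
            problems.append(f"{field}: {value} outside [{lo}, {hi}]")
    return problems

records = [
    {"qty": 12, "unit_price": 3.99},  # clean
    {"qty": -4, "unit_price": 3.99},  # negative quantity
    {"qty": 7},                       # missing price
]
clean = [r for r in records if not validate(r)]
print(len(clean))  # 1
```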

    Thank you for your valuable insights.

    Regards.
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Wed May 23 17:19:37 2018

    On 23/05/2018 8:07 AM, Kellie Fitton wrote:
    Hi Doc Trins O'Grace,

    You are absolutely right -- the trouble is knowing your inputs.
    I must ask myself whether the data are good, and make sure that
    the data, as well as the processes that generate and organize
    them, are of the highest quality and fully understood. I don't
    want to spend a long time and many resources only to find a bug
    in the data.

    You actually can't find a bug in the data; data just is, and it is what
    it is. You can find data that falls outside expected parameters, but it's
    still just data. (Marshall McLuhan had a lot to say about this when he
    was formulating Information Theory; he even suggested that the "value"
    of the data was proportional to its "unexpectedness"...)
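    For what it's worth, that "value proportional to unexpectedness" idea has a standard information-theoretic form: the surprisal of an event, -log2(p). The probabilities below are just illustrative numbers.

```python
# Surprisal: the information content, in bits, of observing an
# event with probability p. Rarer (more unexpected) observations
# carry more information.
from math import log2

def surprisal_bits(p):
    """Information content, in bits, of an event with probability p."""
    return -log2(p)

print(surprisal_bits(0.5))    # 1.0 bit: a coin flip
print(surprisal_bits(0.001))  # ~9.97 bits: a rare, surprising event
```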

    You find bugs in the algorithms used to process the data.

    One can still get problems even with the best data. Garbage in,
    garbage out. Predictive analytics are risky by nature; they are
    valid only as long as the input data are valid.

    Sadly, no, they are not.

    You can have "perfect data" all validated to be within the expected constraints and the predictions derived from it can still be "wrong".

    Certainly, there is more probability of the results being "righter" if
    the data is "good", but that probability is asymptotic and will never be "Certainty".

    Hence the rules I gave you previously... :-)

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Thu May 24 18:12:44 2018

    In article <fmkbvdFmonU1@mid.individual.net>,
    pete dashwood <dashwood@enternet.co.nz> wrote:
    On 23/05/2018 8:07 AM, Kellie Fitton wrote:

    [snip]

    One can still get problems even with the best data. Garbage in,
    garbage out. Predictive analytics are risky by nature; they are
    valid only as long as the input data are valid.

    Sadly, no, they are not.

    Mr Dashwood, it seems that folks no longer study the Hawthorne effect.

    DD
  • From pete dashwood@dashwood@enternet.co.nz to comp.lang.cobol on Fri May 25 12:42:51 2018

    On 25/05/2018 6:12 AM, docdwarf@panix.com wrote:
    In article <fmkbvdFmonU1@mid.individual.net>,
    pete dashwood <dashwood@enternet.co.nz> wrote:
    On 23/05/2018 8:07 AM, Kellie Fitton wrote:

    [snip]

    One can still get problems even with the best data. Garbage in,
    garbage out. Predictive analytics are risky by nature; they are
    valid only as long as the input data are valid.

    Sadly, no, they are not.

    Mr Dashwood, it seems that folks no longer study the Hawthorne effect.

    DD

    Sorry Doc, not sure of your allusion here. My position has been
    consistent throughout the thread (whether it was observed or not... :-)):

    "Don't trust the results of analytics."

    (This could change in a few years but, for now at least, that's my
    position on it.)

    Pete.
    --
    I used to write COBOL; now I can do anything...
  • From docdwarf@docdwarf@panix.com () to comp.lang.cobol on Fri May 25 01:59:56 2018

    In article <fmp4gfF3fonU1@mid.individual.net>,
    pete dashwood <dashwood@enternet.co.nz> wrote:
    On 25/05/2018 6:12 AM, docdwarf@panix.com wrote:
    In article <fmkbvdFmonU1@mid.individual.net>,
    pete dashwood <dashwood@enternet.co.nz> wrote:
    On 23/05/2018 8:07 AM, Kellie Fitton wrote:

    [snip]

    One can still get problems even with the best data. Garbage in,
    garbage out. Predictive analytics are risky by nature; they are
    valid only as long as the input data are valid.

    Sadly, no, they are not.

    Mr Dashwood, it seems that folks no longer study the Hawthorne effect.

    Sorry Doc, not sure of your allusion here. My position has been
    consistent throughout the thread (whether it was observed or not... :-)):

    "Don't trust the results of analytics."

    Is that an assertion based on experience? Seriously... the Hawthorne
    effect is that the results you get might not be the results you seek;
    having studied this might prevent someone from posting that 'predictive analytics are valid as long as the input data are valid'.

    DD