The Development of Uncertainty in Logic and Probability
by David D. Olmsted (Copyright - 2000, 2006. Free to use for personal and
educational purposes)
Last Revised October 25, 2006
Logic Values Were Modified by Modes from the Beginning
Logic has never explicitly incorporated time into itself, yet time effects seem to be the major source of logic's theoretical problems because they introduce uncertainty.
Logic’s difficulty with uncertainty has
resulted in many proposed logic systems, each one invented to overcome some limitation
in the uncertainty characterization of some other logic yet each one having uncertainty
characterization problems of its own (N. Rescher - 1969).
Logic originated with the Greeks as
a method for proper reasoning and not as an information processing theory.
Yet not until Aristotle (384 to 322 BC) wrote a book covering
the art of discourse called the Organon (meaning instrument or tool) was
the methodology of logic clearly defined. Michael Grant (1989) says this about it:
"The most significant parts of the Organon are
the Prior and Posterior Analytics, containing two books each. The former work reviews
the general principles of inductive inference, while the Posterior Analytics applies
these methods of proof and definition to the nature and validity of knowledge (epistemology),
indicating that there is a proper method of constructing generalizations, applicable
to all the sciences, and explaining how language can and should be employed for
this purpose."
"Aristotle stresses the novelty of his logical writings
and even if, contrary to what has sometimes been claimed, he did not invent the
discipline of logic (since Parmenides and the sophists and Plato had prepared the
ground) he was perhaps the first to comprehend the importance, not only of the content
of statements, but of their form, and their formal relation to each other. This
was a major breakthrough, for no one had ever before offered a general account of
what is valid in argument and what is not."
Standard truth functional binary logic only makes use of two state symbols, true
or false, to represent the degree of truth of a meaning. Yet even Aristotle realized
that any binary logic that divided meanings into either true or false had its limitations.
Even though he thought logic should be binary for past and present events with its
law of the excluded middle (no intermediate values for truth) he did not accept
binary truth-values for future events since that would result in determinism. Consequently,
he coined the terms "definite truth" and "indefinite truth" (chapter 9 of De interpretatione).
Indefinite truth was indicated by the use of the modifying word "possible" and definite
truth was indicated by the use of the modifying word "necessary". This resulted
in Aristotle's two sided possibilities such as "it is possible that this cloak should
be cut but it is also possible that it should not be cut". In later years logic
using these modifiers would be called modal logic (see McCawley, 1981, for a good
introduction to modal logic).
The modes of modal logic, necessity and possibility, would be defined with subtle
differences by different authors (Burks, 1977). The modes are most relevant
to how the IMPLICATION operation is treated. The modes can be seen as representing
a two-state second dimension that has been added to two-state logic. Recently various abstract modal
logics called multidimensional modal logics have been developed which go beyond
two-state modes into many discrete states with most having rules for transitioning
between states (Marx and Venema, 1997). Yet, as with most logics, the rules are arbitrary,
which is why so many modal logics exist.
Yet the obvious generalization of modes is to extend them from discrete states to
a continuous analog value ranging between 0 and 1 with 1 representing absolute necessity
and anything less representing some level of possibility. With this generalization
modal values are seen to define the confidence value for the truth value.
Consider the lying Cretan paradox and how it can be solved using the confidence
value dimension.
All Cretans lie.
I am a Cretan.
I am lying.
So is the Cretan lying or telling the truth? This concerns the reliability and predictability
of the source; the probability it is telling the truth, the confidence one has in it. Confidence can be built up over time by testing the correlation of the statement
or it can be determined by contradiction as in the case of the Cretan paradox. With
a confidence level of 0 the truth of the Cretan statement is not relevant.
If the correlation is tested over time, the correlation is described by conditional
probability, that is, given some event, the probability that some later event
occurs is “p”. Consequently, probability properly enters logic via the confidence value.
A perfect correlation results in complete predictability and
complete confidence.
In other words the correlation would have a confidence value of 1.
For an incorrect attempt
at dealing with continuous confidence values on binary truth-values from a purely
logical point of view having no statistical correlation coefficient
or probability tie-in see Faust (2000).
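The idea of building up confidence by testing a correlation over time can be sketched as a relative-frequency estimate of a conditional probability. This is only a minimal illustration: the function name `confidence` and the sample observations are hypothetical and do not come from the sources cited here.

```python
def confidence(observations):
    """Estimate P(later event | earlier event) from (earlier, later) boolean pairs.

    Returns None when the earlier event was never observed, because the
    correlation has then not been tested at all.
    """
    outcomes = [later for earlier, later in observations if earlier]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


# Hypothetical record of (earlier event seen, later event followed).
history = [(True, True), (True, True), (True, False), (False, False), (True, True)]
print(confidence(history))
```

In this made-up record the later event followed the earlier one on 3 of the 4 occasions on which the earlier event occurred. A perfectly correlated record would give a confidence value of 1.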
Boole's Attempt to Merge Probability and Logic
The first great attempt at unifying probability and logic
occurred in 1854 when George Boole (1815-1864) unified algebra with truth functional
logic in his book, The Laws of Thought, to create Boolean algebra that uses
numbers to represent truth-values. Thus "true" is represented by "1" and "false"
is represented by "0".
Not only did Boole quantify logic but he also attempted to incorporate logic and
probability into one unified theory. Boole stated this as his purpose at the beginning
of chapter 1:
"The design of the following treatise is to investigate the fundamental laws of
those operations of the mind by which reasoning is performed; to give expression
to them in the symbolical language of a Calculus, and upon this foundation to establish
the science of Logic and construct its method; to make that method itself the basis
of a general method for the application of the mathematical doctrine of Probabilities..."
(Page 1)
Boole realized that the mathematics of probability and the mathematics of logic
were similar except that probability had a continuous range of values
between 0 and 1 while logic was binary. Despite this asymmetry Boole still viewed
logic as the deductive process of reasoning and probability as the inductive process
of reasoning.
"The general laws of Nature are not, for the most part, immediate objects of perception.
They are either inductive inferences from a large body of facts... They are in all
cases, and in the strictest sense of the term, probable conclusions, approaching,
indeed, ever and ever nearer to certainty, as they receive more and more confirmation
of experience." (page 4).
The asymmetry in logic and probability came about because Boole limited himself
to using the common linear operations of mathematics, which are addition, subtraction,
and multiplication. Consequently he ended up using multiplication for the AND operation.
This self-imposed limitation on possible operations led him to the conclusion that
logic has only the two values of 0 and 1 after considering the observation that
a proposition must retain its meaning if ANDed with itself. For example, sheep AND
sheep must equal sheep as symbolized by x^2 = x. This equation is valid
only if the truth-values are limited to 1 or 0. Boole did not realize that using
the minimum operator (as fuzzy logic would later do) for conjunction would also
satisfy this restriction for all numbers between 0 and 1. Because of this error,
logic and probability remained separate theories.
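The difference between the two candidate AND operations can be checked numerically. The short sketch below tests which sample values satisfy "x AND x = x" under multiplication and under the minimum operator. The function names and the sample values are chosen here purely for illustration.

```python
def and_product(x, y):
    """Boole's choice: conjunction as multiplication."""
    return x * y


def and_min(x, y):
    """The later fuzzy-logic choice: conjunction as the minimum."""
    return min(x, y)


values = [0.0, 0.25, 0.5, 0.75, 1.0]

# Keep only the values for which x AND x returns x again.
product_idempotent = [x for x in values if and_product(x, x) == x]
min_idempotent = [x for x in values if and_min(x, x) == x]

print(product_idempotent)
print(min_idempotent)
```

Only 0 and 1 survive the multiplication test, while every sample value survives the minimum test.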
Multivalued (Fuzzy) Logic
Following the trend toward a mathematical description of logic started by George Boole in
1854, Jan Lukasiewicz in 1920 wondered how Aristotle's indefinite truth for future
events could be represented by the truth value itself without the need for modes or probability.
Lukasiewicz wondered what would be the truth-value for such future statements as
"I will be in city Y at time X". He concluded:
"Therefore the proposition considered is at this moment neither true nor false and must possess a third value, different from '0' or falsity and '1'
or truth. This value we can designate by '1/2'. It represents the possible" (Lukasiewicz,
1930).
Later in 1922 he defined the multivalued IMPLICATION operation shown below for the
interval between 0 and 1 after realizing that there could be infinitely many degrees
of truth-values. From that he defined the other logic operations (Lukasiewicz,
1930).
IMPLICATION (p implies q) = 1 - p + q for p > q, 1 otherwise
NEGATION or COMPLEMENT (~ p) = 1 - p
INCLUSIVE OR (p or q) = IMPLICATION (IMPLICATION (p, q), q)
AND (p and q) = ~ ((~ p) OR (~ q))
Figure 1: Wilkinson's Analog Logic Showing MIN and MAX definitions
Close examination of the AND operation and the INCLUSIVE OR operation reveals that
the AND operation takes the minimum value of its inputs while the INCLUSIVE OR operation
takes the maximum value of its inputs. Yet these simple definitions were not explicitly
expressed until they were independently proposed by R.H. Wilkinson (1963) and Lotfi Zadeh (1965)
in a logic that would come to be called fuzzy logic. Finally, Ackermann (1967) presented
a proof showing that the Lukasiewicz formulas were indeed equivalent to the maximum
and minimum formulas of fuzzy logic.
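The equivalence between the Lukasiewicz formulas and the maximum and minimum formulas can be spot-checked directly. The sketch below codes the four definitions as given above and compares them with `max` and `min` on a grid of truth-values. It is a numerical check on sample points, not a proof, and the function names and the grid are assumptions made for this illustration. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import product


def implication(p, q):
    """Lukasiewicz IMPLICATION: 1 - p + q for p > q, 1 otherwise."""
    return 1 - p + q if p > q else Fraction(1)


def negation(p):
    """NEGATION or COMPLEMENT: 1 - p."""
    return 1 - p


def inclusive_or(p, q):
    """INCLUSIVE OR built from two nested implications."""
    return implication(implication(p, q), q)


def logical_and(p, q):
    """AND built from OR and NEGATION."""
    return negation(inclusive_or(negation(p), negation(q)))


# Truth-values 0, 1/10, ..., 1 as exact fractions.
grid = [Fraction(i, 10) for i in range(11)]

or_is_max = all(inclusive_or(p, q) == max(p, q) for p, q in product(grid, grid))
and_is_min = all(logical_and(p, q) == min(p, q) for p, q in product(grid, grid))
print(or_is_max, and_is_min)
```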
R.H. Wilkinson, in his remarkable electrical engineering master's thesis of 1961, re-invented a part of multivalued logic in terms
of set theory using the simple
MIN operation for the AND and the simple MAX operation for the INCLUSIVE OR. He published his
results in 1963 (Wilkinson, 1963). The main purpose of his paper was to show how any mathematical
function could be simulated using hardwired analog electronic circuits based upon
analog logic. He did this by first creating various linear voltage ramps which were
then selected in a "logic block" using diodes and resistor circuits which implemented
the maximum and minimum fuzzy logic rules of the INCLUSIVE OR and the AND operations
respectively. He called his logic analog logic. It did not have the IMPLICATION
operation or its negated form the CONDITIONAL operation.
Using just set theory, Lotfi Zadeh (then an electrical systems professor) in 1965 re-derived the logic of Wilkinson without the electrical circuits. He called the
sets fuzzy sets and the logic came to be called fuzzy logic. Zadeh was concerned that conventional set theory could not adequately
handle smoothly transitioning set assignments. He realized that a person who turns
fifty is not suddenly considered to have gone from the class of young people to
the class of old people. So an individual who is 49 may have a truth-value
of belonging to the fuzzy set of young people of 0.6 and a truth-value of belonging
to the fuzzy set of old people of 0.4. Notice that the analog truth-value is not
limited only to future events as was the original Lukasiewicz derivation.
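A smoothly transitioning set assignment of this kind can be sketched with a simple linear membership function. The breakpoints below (full membership in "young" up to age 25, none from age 85) are assumptions chosen only so that age 49 reproduces the 0.6 and 0.4 values of the example. They are not taken from Zadeh's paper.

```python
def young(age, full=25, none=85):
    """Degree of membership in 'young': 1 up to `full`, 0 from `none`, linear between.

    The breakpoints are hypothetical and serve only to illustrate the idea.
    """
    if age <= full:
        return 1.0
    if age >= none:
        return 0.0
    return (none - age) / (none - full)


def old(age):
    """Degree of membership in 'old', taken here as the complement of 'young'."""
    return 1.0 - young(age)


for age in (20, 49, 50, 90):
    print(age, round(young(age), 2), round(old(age), 2))
```

With these breakpoints the membership values change only slightly between ages 49 and 50, so no one jumps suddenly from one class to the other.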
Yet neither Zadeh nor Wilkinson defined the IMPLICATION operation since no good
rationale exists for that operation within a set theory context leaving Lukasiewicz’s
definitions as the only complete logic set. Significantly, the negated form of Lukasiewicz’s
IMPLICATION operation forms the simpler bounded difference operation as can be seen
when the NEGATION (1 - x) is applied to the definition as follows:
1 - (1 - p + q) = p - q for p > q, 0 otherwise
Figure 2: Proper Derivation of Multivalued Logic Operations is by the Temporal Summation Method
Logically this bounded difference is the CONDITIONAL operation. Significantly, Yamakawa
(1987) used the bounded difference operation to construct the standard multivalued
(fuzzy) logic operations for electronic circuit applications without realizing its
logical significance. He used it because it worked and because it is a very elegant
and simple method of construction.
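The role of the bounded difference can be sketched in a few lines. The code below checks that the bounded difference equals the negated Lukasiewicz IMPLICATION and then builds the minimum (AND) and maximum (INCLUSIVE OR) from it. The two constructions used here are standard identities and are shown only as an illustration. They are not claimed to be the exact circuit arrangement used by Yamakawa, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def implication(p, q):
    """Lukasiewicz IMPLICATION: 1 - p + q for p > q, 1 otherwise."""
    return 1 - p + q if p > q else Fraction(1)


def bounded_difference(p, q):
    """CONDITIONAL: p - q for p > q, 0 otherwise."""
    return p - q if p > q else Fraction(0)


def and_from_difference(p, q):
    """min(p, q) obtained by applying the bounded difference twice."""
    return bounded_difference(p, bounded_difference(p, q))


def or_from_difference(p, q):
    """max(p, q) obtained by adding the bounded difference of q over p to p."""
    return p + bounded_difference(q, p)


# Truth-values 0, 1/10, ..., 1 as exact fractions.
grid = [Fraction(i, 10) for i in range(11)]

checks = [
    bounded_difference(p, q) == 1 - implication(p, q)
    and and_from_difference(p, q) == min(p, q)
    and or_from_difference(p, q) == max(p, q)
    for p, q in product(grid, grid)
]
print(all(checks))
```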
With set theory not powerful enough to derive the multivalued logic IMPLICATION
or CONDITIONAL operation a better method is to derive the multivalued operations
from the binary operations using the Temporal Summation Method as shown in figure
2. Since binary logic tests for fundamental state symmetries from an information
processing point of view simply summing the result over time gives the proper multivalued
logic operation definitions for when states exist over a period of time.
The IMPLICATION definition in multivalued logic does not conform to its standard
binary construction which is "(not a) or (b)" when the multivalued INCLUSIVE OR
is used. That this occurs should not be a surprise since often theorems developed
for a special case do not hold up in a more general theory. The binary construction
fails in the multivalued case because it does not conserve the truth assumption
inherent in the binary IMPLICATION operation.
The truth assumption means that the
implication statement is assumed true unless directly contradicted. The statement
“clouds” (a) imply “rain” (b) is proved false only when
“clouds” occur without “rain”. “Rain” can occur without “clouds” and the
implication would still be assumed true. In multivalued (fuzzy) logic “clouds” and
“rain” would get a truth-value proportional to the nearness of the observed clouds
or rain to the ideal definition. Consequently, the greater is the difference between
the two truth-values for "clouds" and "rain" the greater is the statement contradiction
(falsity). Only the difference in the truth-values is important yet the binary construction
can produce different output truth-values for the same difference when different
truth-values are used. For example, when "clouds" = 0.7 and "rain" = 0.3 the binary
construction produces 0.3 while the multivalued construction produces 0.6 (indicating
that the implication is more true than false).
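The mismatch between the two constructions can be reproduced with a short calculation. The sketch below evaluates both for several pairs of truth-values that share the same difference of 0.4. The first pair is the example from the text. The other two pairs, and the function names, are hypothetical additions for illustration.

```python
def implication_binary_form(a, b):
    """(not a) or (b), using NEGATION = 1 - a and INCLUSIVE OR = max."""
    return max(1 - a, b)


def implication_multivalued(a, b):
    """Lukasiewicz IMPLICATION: 1 - a + b for a > b, 1 otherwise."""
    return 1 - a + b if a > b else 1.0


# Three (clouds, rain) pairs, all with the same difference of 0.4.
pairs = [(0.7, 0.3), (0.9, 0.5), (0.4, 0.0)]
for clouds, rain in pairs:
    print(clouds, rain,
          round(implication_binary_form(clouds, rain), 2),
          round(implication_multivalued(clouds, rain), 2))
```

The binary construction returns 0.3, 0.5, and 0.6 for the three pairs, while the multivalued definition returns 0.6 each time, since it depends only on the difference between the two truth-values.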
Another way to state the truth assumption of the IMPLICATION operation is that the
absence of evidence is not evidence of absence. Problems can arise when using logical
IMPLICATIONS in mathematical proofs where a proven chain of causation is wanted
but not always attainable from beginning to end. The logician and mathematician
Bertrand Russell tried to use binary logic to prove the foundations of mathematics
yet the whole approach was found to be either inconsistent or incomplete by Kurt
Godel in 1931. Godel’s first theorem states that for any formal theory rich enough
to include all the formulas of number theory, some formulas must be undecidable
if the theory is to be consistent. That is, such a theory must be deliberately
incomplete and ambiguous if it is to be consistent. His second theorem, which is
a corollary of the first, asserts that the consistency of such a theory is impossible
to prove by methods “formalizable within the theory”. In 1936 Rosser strengthened
Godel’s theorems by showing that consistency alone implies the incompleteness of
natural numbers, N (Stoll, 1963).
These results are not due to any property inherent in mathematics but are due instead
to the nature of the binary logic used in the proofs. The predicate calculus logic
statement used in mathematical proofs is “modus ponens” stated as: “given that A
and B are formulas, if A is true and A implies B is true, then B is true”. This linkage form
with its IMPLICATION means that not all the links in a chain of causation need be
proven true, just that none need be proven false as follows:
If A is true and B is true then the implication A implies B is true.
If A is false and B is true then the implication is true since the
implication has not been tested so the fact that A implies B must be assumed
to be true.
If A is true and B is false then the implication is false.
If A is false and B is false then the implication is true because
it is not tested.
Here the binary implication operation is forced to become a link in a chain of causation
so assuming an implication is true when its causality is not tested leads to either
inconsistencies or situations that must remain ambiguous and undefined. Proofs would
be much harder to accomplish (if not impossible) if they were required to assume
all causation links to be false when such links are not testable.
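The four cases listed above amount to the ordinary truth table of binary IMPLICATION. The sketch below tabulates it and marks the rows in which the implication is actually tested (A true) as opposed to merely assumed true (A false). The function name and the labels are chosen for this illustration only.

```python
from itertools import product


def implies(a, b):
    """Binary IMPLICATION: false only when a is true and b is false."""
    return (not a) or b


# Map each (A, B) combination to the truth of "A implies B".
table = {(a, b): implies(a, b) for a, b in product([True, False], repeat=2)}

for (a, b), value in table.items():
    status = "tested" if a else "not tested, assumed true"
    print(f"A={a!s:5} B={b!s:5} A implies B={value!s:5} ({status})")
```

Only the row with A true and B false yields a false implication. Both rows with A false are true by the truth assumption alone.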
The Meaning of Multivalued Logic Truth Values and Confidence Probability Values
Multivalued logic values and confidence probability values make up different dimensions
in an abstract decision space. A multivalued logic value defines the distance between
an actual meaning and its ideal definition. A probability value defines the distance
between an actual confidence level and its perfect confidence level.
But what is the confidence in the probability when the value is
based on only a few observations? Until a large
number of events are observed the probability value is not very reliable. Usually a precise
probability is obtained after 10 trials. So after 2 trials the probability that
the probability value is precise may be 0.5, and after 7 trials it
may be 0.8. This second level of confidence
will be defined as the precision value.
Handling such preliminary and tentative knowledge in probability is
the subject of imprecise probability theory. Presently the field of imprecise probabilities
deals more with decision analysis than with the fundamental problem of representation
in an information processing system.
It generally uses various forms of upper and lower probabilities to confine an imprecise
probability within a certain range (Walley, 1991). This is derived from reliability
analysis in which the best and worst failure probabilities for each component connected
in series are used to determine a range of failure probabilities for the whole series.
The greater the interval between upper and lower probabilities the less is the resultant
probability of failure. Yet this doesn’t really address the problem of how to characterize
the precision
value of a probability due to incomplete information (insufficient event
observations). A decision system often needs to compare supporting and contradictory
evidence using the level of confidence and precision for each piece of evidence. The decision of
a system that tests for contradictory information will have a higher confidence
and precision (validity) than
one that does not test even if no contradictory information is actually present.
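The reliability calculation that upper and lower probabilities are derived from can be sketched as follows. The example assumes that component failures are independent, so that a series system fails unless every component survives. That independence assumption, the function name, and the sample numbers are additions made here for illustration and are not taken from Walley (1991).

```python
from math import prod


def series_failure_bounds(component_bounds):
    """Return (lower, upper) failure probability for components connected in series.

    `component_bounds` is a list of (best case, worst case) failure
    probabilities, one pair per component. Failures are assumed independent.
    """
    lower = 1 - prod(1 - best for best, _ in component_bounds)
    upper = 1 - prod(1 - worst for _, worst in component_bounds)
    return lower, upper


# Two hypothetical components, each failing with probability between 0.1 and 0.2.
low, high = series_failure_bounds([(0.1, 0.2), (0.1, 0.2)])
print(round(low, 2), round(high, 2))
```

For these two components the failure probability of the whole series lies between 0.19 and 0.36. The interval describes a range, but it says nothing about how many observations stand behind each bound.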
The representation of precision values is an important consideration
in other contexts. One might have a range of expert opinions on what some estimated
probability should be (called second order uncertainty in Bayesian network theory).
The greater the variability of opinions the less precise is the probability. The less
knowledge of a situation the greater the possible range of values for probability
and logic.
Using these uncertainty values in an integrated fashion in an information processing
system is described by Relativistic State Automata Theory.
References
Ackermann, Robert (1967). An Introduction to Many-valued Logics. New York: Dover.
Boole, G. (1854). The Laws of Thought. London: MacMillan. Republished by Dover: New York in 1958.
Borkowski, L., ed. (1970). Jan Lukasiewicz - Selected Works. Amsterdam, London: North-Holland Publishing Co.
Burks, A. W. (1977). Chance, Cause, Reason - An Inquiry into the Nature of Scientific Evidence. Chicago: University of Chicago Press.
David, F.N. (1962). Games, Gods and Gambling. London: Griffin & Co. Republished by Dover: New York in 1998.
Grant, M. (1989). The Classical Greeks. Michael Grant Publications, Inc.
Holmblad, P. & Ostergaard, J-J. (1982). "Control of a cement kiln by fuzzy logic", in M.M. Gupta and Elie Sanchez, eds., Fuzzy Information and Decision Processes, pp. 398-399. Amsterdam: North Holland.
Lukasiewicz, Jan (1930). Philosophical remarks on many-valued systems of propositional logic. Reprinted (1967) in Storrs McCall (Ed.), Polish Logic 1920 - 1930. Oxford: Oxford University Press.
Marx, M. & Venema, Y. (1997). Multi-Dimensional Modal Logic. Dordrecht: Kluwer Academic Publishers.
McCawley, J. D. (1981). Everything that Linguists have Always Wanted to Know about Logic* (*but were ashamed to ask). Chicago: University of Chicago Press.
McNeill, D. & Freiberger, P. (1994). Fuzzy Logic. New York: Simon & Schuster.
Mamdani, E. & Assilian, S. (1975). An experiment in linguistic synthesis with a fuzzy logic controller. International Journal of Man-Machine Studies, 7, 1-13.
Rescher, N. (1969). Many-valued Logics. New York, NY: McGraw-Hill.
Wilkinson, R.H. (1963). A method of generating functions of several variables using analog diode logic. IEEE Transactions on Electronic Computers, EC12, 112-129.
Yamakawa, Y. (1987). Fuzzy logic circuits in current mode. In J.C. Bezdek (ed.), Analysis of Fuzzy Information. CRC Press.
Zadeh, L.A. (1965). Fuzzy sets. Information and Control, 8, 338-353.
Zadeh, L.A. (1973). Outline of a new approach to the analysis of complex systems & decision processes. IEEE Transactions on Systems, Man, & Cybernetics, SMC-3(1), 28-44.