Knowledge Representation and Reasoning (KRR)
Knowledge representation and reasoning (KRR) is about converting information from an area of knowledge into machine-understandable form and then enabling a machine, such as a computer running software, to process that information as well as a human could have performed the same task or process, or even better.
For example, a task or process currently performed by humans that, if measured, would achieve a sigma level of 3, which is a defect rate of 6.7% (about 67,000 defects per million opportunities), could be improved to achieve a sigma level of 6, which is a defect rate of 0.00034% (about 4 defects per million opportunities).
You are hearing me right, defects go from a whopping 67,000 down to 4. Think I am joking or on drugs? The Federal Deposit Insurance Corporation (FDIC) call report collection system went from 18,000 defects (reporting errors) down to 0 defects when it modernized its call report system to make use of XBRL. That was in 2003, and that was an easier forms-based system, but it still had around 18,000 reporting errors every quarter before the change. But now the same results can be obtained for a customizable reporting system.
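The sigma-level figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the conventional 1.5-sigma long-term shift that the commonly quoted Six Sigma tables use; the function name `dpmo_for_sigma_level` is just an illustrative choice.

```python
import math

def dpmo_for_sigma_level(sigma_level: float, shift: float = 1.5) -> float:
    """Defects per million opportunities (DPMO) for a given sigma level.

    Assumption: a one-sided normal tail with the conventional 1.5-sigma shift.
    """
    z = sigma_level - shift
    tail = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for a standard normal
    return tail * 1_000_000

for level in (3, 6):
    dpmo = dpmo_for_sigma_level(level)
    # DPMO / 10,000 converts defects per million into a percentage
    print(f"sigma {level}: {dpmo:,.1f} DPMO ({dpmo / 10_000:.5f}% defect rate)")
```

Under that assumption, sigma level 3 works out to roughly 67,000 defects per million and sigma level 6 to roughly 3.4, which matches the rounded numbers above.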
So how do you make all this work? How do you get a machine to perform work better, faster, and/or cheaper than humans? The answer is: very carefully, very deliberately. Here are some things that you need to consider.
- Knowledge
- Knowledge representation approach
- Acquiring knowledge to represent
- Approach to reasoning on the represented knowledge
- Technical implementation of software for selected reasoning approach
- Operator of implemented software
The sections below look into the choices you have to make for each of these areas in order to get knowledge representation and reasoning to work effectively.
Knowledge
Knowledge is a form of familiarity with information from some specific area or corpus. Knowledge is often understood to be awareness of facts, having learned skills, or having gained experience using the things and the state of affairs (situations) within some area of knowledge. An area of knowledge (corpus) is a highly organized socially constructed aggregation of shared knowledge for a distinct subject matter. An area of knowledge has a specialized insider vocabulary, underlying assumptions (axioms, theorems, constraints), and persistent open questions that have not necessarily been resolved (i.e. flexibility is necessary). You can think about an area of knowledge as being characterized in a spectrum with two extremes:
- Kind area of knowledge: clear rules, lots of repetitive patterns, and unchanging tasks.
- Wicked area of knowledge: obscure data, few or no rules, constant change, and abstract ideas.
Sensemaking is the process of determining the deeper meaning or significance or essence of the collective experience for those within an area of knowledge or corpus. System stakeholders need to be in agreement as to an undisputed core knowledge of a system. The Cynefin Framework provides a tool for understanding and categorizing knowledge and rules within a corpus. Per the Cynefin Framework, knowledge can be categorized as being:
- Best practice (obvious)
- Good practice (only obvious if you have the right skills and experience)
- Emergent practice (tends to require more skills and experience, after which principles can be used to group alternatives)
- Novel practice (tends to be unique, but describable)
Knowledge of facts is distinct from opinion or guesswork by virtue of justification or proof. Knowledge is objective. Opinions and guesswork are subjective. In our case we are talking about certain specific knowledge; the facts that make up that knowledge; being able to create a proof that shows the knowledge graph system is complete, consistent, and precise; and all of this logic being put into a form that a machine can read in order to reach a conclusion as to whether the information in the knowledge graph is functioning properly. Effectively, a machine can read that knowledge and mimic understanding of the knowledge represented in a knowledge graph. The information available to a human reader and to a machine reader would be the same, and therefore the human and the machine should reach the same conclusion.
Knowledge must be managed. Machine-readable knowledge needs to be curated to keep it current. This curation and management is valuable because the machine-readable rules themselves are valuable, and it takes effort.
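To make the idea of a machine checking that represented knowledge is complete and consistent a little more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the fact names, the values, the `required_terms` set, and the two rules are hypothetical and not taken from any real reporting system.

```python
# Hypothetical facts reported for one period; names and values are illustrative.
facts = {"Assets": 5000, "Liabilities": 3000, "Equity": 2000}

# Terms that must be reported for the facts to count as complete (assumption).
required_terms = {"Assets", "Liabilities", "Equity"}

# Each rule is a (description, check) pair; check returns True when the rule holds.
rules = [
    ("Assets = Liabilities + Equity",
     lambda f: f["Assets"] == f["Liabilities"] + f["Equity"]),
    ("Assets are not negative", lambda f: f["Assets"] >= 0),
]

def verify(facts, required_terms, rules):
    """Return a list of problems; an empty list means every check passed."""
    problems = [f"missing fact: {t}" for t in sorted(required_terms - set(facts))]
    if problems:
        return problems  # rules cannot be evaluated on incomplete facts
    problems += [f"rule violated: {name}" for name, check in rules if not check(facts)]
    return problems

print(verify(facts, required_terms, rules))
print(verify({**facts, "Equity": 1500}, required_terms, rules))
```

A human reading the same three numbers and the same two rules would reach the same conclusion as the machine, which is the point: the rules are explicit, so the check is repeatable.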
Knowledge representation approach
There are a number of different approaches that a knowledge representation might take, each approach having a different level of expressivity, and together these form a knowledge representation spectrum. A logical theory is the most powerful approach in terms of expressive power.
Acquiring knowledge to represent
There tend to be three approaches to acquiring knowledge for some area of knowledge. These three approaches are:
- Handcrafted knowledge: Skilled and experienced subject matter experts for some area of knowledge create/construct the knowledge representation. This approach can be costly and take time, but it also yields the highest-quality result if done correctly.
- Statistical learning: Also referred to as machine learning, of which there are various forms, but all approaches are based on probability and statistics. While this approach can cost less, the quality can be significantly lower. When no human guidance is involved, this tends to be referred to as unsupervised learning.
- Combining handcrafted knowledge approach and statistical learning approach: Combining both approaches, called supervised statistical learning, is where humans and machines work together to achieve the highest-quality result with the least expense and time involved.
Statistical learning tends to be suited to situations that involve:
- capturing associations or discovering regularities within a set of patterns;
- a very great volume, number of variables, or diversity of data;
- relationships between variables that are only vaguely understood; or,
- relationships that are difficult to describe adequately with conventional approaches.
Important terms and associations
Approach to reasoning on the represented knowledge
Logic is a formal system that defines the rules of correct reasoning. Logic involves logical reasoning. Inferences are the steps in reasoning. There are three types of logical reasoning, or types of steps in inference: deductive reasoning, inductive reasoning, and abductive reasoning. This forms what is sometimes referred to as a "triad of reasoning approaches" or reasoning types. Those reasoning approaches are different tools that have different sets of capabilities and different sets of pros and cons.
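The difference between the three reasoning types can be sketched in a few lines. This is a toy illustration only, under the assumption that rules are simple premise/conclusion pairs and that plausibility is a hand-assigned number; the function names and the example data are hypothetical.

```python
from collections import Counter

def deduce(facts, rules):
    """Deduction: apply (premises, conclusion) rules until nothing new follows."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

def induce(observations):
    """Induction: generalize to the most frequent pairing and how often it was seen."""
    pair, count = Counter(observations).most_common(1)[0]
    return pair, count / len(observations)

def abduce(effect, explanations):
    """Abduction: pick the most plausible known cause of an observed effect."""
    candidates = {cause: p for cause, (e, p) in explanations.items() if e == effect}
    return max(candidates, key=candidates.get)

rules = [
    (frozenset({"is_bank"}), "files_call_report"),
    (frozenset({"files_call_report"}), "must_pass_validation"),
]
print(deduce({"is_bank"}, rules))
print(induce([("cloudy", "rain"), ("cloudy", "rain"), ("cloudy", "dry")]))
print(abduce("wet_grass", {"rain": ("wet_grass", 0.7),
                           "sprinkler": ("wet_grass", 0.2),
                           "sun": ("dry_grass", 0.9)}))
```

Deduction gives a conclusion that must hold if the rules and facts hold. Induction and abduction give conclusions that are only probable, which is one reason the pros and cons of each tool differ.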
A hybrid system can be created that combines all three approaches into one single tool that leverages the best of each approach. Again, a craftsman's or craftswoman's task is to figure that out.
Technical implementation of software for selected reasoning approach
There tend to be three primary groups of problem solving tools for implementing knowledge representation and reasoning against the representation:
- Semantic web stack of technologies
- Graph databases
- Logic programming
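Whichever group of tools is chosen, the underlying idea of storing statements and then asking questions of them is similar. The following is a minimal plain-Python sketch of that idea, not the API of any real semantic web, graph database, or logic programming product; the triples, the `match` function, and the `?x` variable convention are all assumptions made for illustration.

```python
# Knowledge as (subject, predicate, object) statements; all names are illustrative.
triples = {
    ("BalanceSheet", "is_a", "Statement"),
    ("Assets", "part_of", "BalanceSheet"),
    ("Liabilities", "part_of", "BalanceSheet"),
    ("Revenue", "part_of", "IncomeStatement"),
}

def match(pattern, triples):
    """Yield variable bindings for a pattern; terms starting with '?' are variables."""
    for triple in triples:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(p, t) != t:
                    break
            elif p != t:
                break
        else:
            yield binding

# Ask: which things are part of the balance sheet?
parts = sorted(b["?x"] for b in match(("?x", "part_of", "BalanceSheet"), triples))
print(parts)
```

Real tools add far more on top of this (schemas, indexes, inference, query languages), but the pattern of statements plus queries is a reasonable mental model for comparing them.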
Operator of implemented software
There tend to be two primary groups of users of the software used to implement knowledge representation and reasoning:
- Technical professionals
- Nontechnical professionals (business professionals)
Irreducible complexity (a.k.a. essential complexity) is a term used to describe a characteristic of complex systems whereby the complex system needs all of its individual component systems in order to effectively function.
In other words, it is impossible to reduce the complexity of a system (or to further simplify a system) by removing any of its component parts and still maintain its functionality objective because all those component parts are essential to the proper functioning of the system. So for example, consider a simple mechanism such as a mousetrap. If you remove a piece, the mousetrap will not be able to function properly.
The Law of Conservation of Complexity states that: Every software application has an inherent amount of irreducible or essential complexity. The question is who will have to deal with that complexity:
- the application developer,
- the platform developer that the software runs on, or
- the software user.