Thinking Beyond the Document

There is an elephant in the room.

Accountants, auditors, and analysts have, for thousands of years, performed work related to creating information, attesting to the trustworthiness of that information, and supporting decision making and other services that make use of that information. The basis of that information, the medium used to convey it, has been the "physical hard copy": the physical paper-based document; the physical paper-based spreadsheet. Those paper-based documents were the sources of data used by accountants, auditors, and analysts, and they evolved from earlier mediums including physical objects, clay tablets, papyrus, and other early forms of hard copy. Those physical, paper-based, document-oriented artifacts literally drove the universal technology of accountancy for thousands of years.

The internetworked computer (that is, the computer hardware, plus the software, plus the internet) offers new approaches, new mediums, for conveying those sources of data and the information and knowledge provided by those sources to the end of performing work.

And yet, many accountants, auditors, and analysts are still constrained in their thinking by those "physical hard copy" mediums of information exchange. Rather than physical hard copy documents and document-oriented electronic spreadsheets, they think in terms of digital copies of those documents and spreadsheets. Then, when computers and software have a hard time making use of that data and information because computer-based processes cannot interpret it, they invent approaches that attempt to get computers and software to somehow interpret those digital copies of documents and document-oriented electronic spreadsheets.

But that document-oriented approach is a dead end. Why? As I have pointed out before, computers are dumb beasts.

What is necessary is an information-oriented approach that both humans and machines can interpret and work with. Think of this as universal, industrial-strength information plug-and-play. Put the information in machine-interpretable form first; then use computer-based processes to convert that machine-interpretable information into something that is also interpretable by humans.
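
As a minimal sketch of what "machine-interpretable form first" might look like, consider the Python below. The field names (concept, value, unit, period) are hypothetical and do not follow any particular standard; the point is that the human-readable sentence is derived from the structured fact, not the other way around.

```python
# The fact is captured as structured, machine-interpretable data first;
# the human-readable presentation is then derived from it by software.
# Field names here are hypothetical, not any particular standard.

fact = {
    "concept": "Revenues",
    "value": 1_250_000,
    "unit": "USD",
    "period": "2024-01-01/2024-12-31",
}

def render_for_humans(fact: dict) -> str:
    """Derive a human-readable sentence from the machine-interpretable fact."""
    start, end = fact["period"].split("/")
    return (f"{fact['concept']} for the period {start} to {end}: "
            f"{fact['value']:,} {fact['unit']}")

print(render_for_humans(fact))
# Revenues for the period 2024-01-01 to 2024-12-31: 1,250,000 USD
```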

Don't get me wrong; documents and document-oriented spreadsheets are incredibly useful tools. We just need to put things in the right order, and starting with the human-readable version is doing things in the wrong order.

We need to approach this by doing things in the proper order.

An internetworked computer is an extremely useful tool; it can store, retrieve, process, and provide instant access to data and information. But there are obstacles that must be overcome to make effective use of that tool. I will not even get into the technical obstacles. Here are the business-oriented obstacles:

  1. Business professionals use different terminologies to refer to exactly the same thing.
  2. Business professionals have inconsistent understandings of an area of knowledge (a.k.a. area of interest, community of practice, field, domain, subject domain, universe of discourse, society).
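
To make the first obstacle concrete, here is a minimal sketch in Python of a curated synonym mapping that resolves different preferred terms to one canonical concept. The terms and the canonical name are hypothetical, purely for illustration.

```python
# Different professionals use different terms for exactly the same thing.
# A curated synonym mapping resolves each term to one canonical concept.
# The terms below are illustrative, not an authoritative list.

SYNONYMS = {
    "revenues": "Revenues",
    "sales": "Revenues",
    "turnover": "Revenues",
    "net sales": "Revenues",
}

def canonical_concept(term: str) -> str:
    """Resolve a reported label to its canonical concept, if known."""
    return SYNONYMS.get(term.strip().lower(), "UNKNOWN")

print(canonical_concept("Turnover"))      # Revenues
print(canonical_concept("Gross margin"))  # UNKNOWN
```
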
Artificial intelligence is a tool; it is not magic.

A key ingredient of an effective computer-based system is metadata. One example of metadata is a machine-interpretable model of the conceptualization of a business report. Another example is the language of accounting itself.

Metadata is structured information that describes, explains, categorizes, gives context to, or otherwise enhances the usability of other information so that humans and machines can understand and use that information effectively.

In his book, Everything is Miscellaneous, David Weinberger explains that there are three orders of order. Understanding these three orders of order can help you understand the value of metadata.
  • Putting books on shelves in some sort of order is an example of the first order of order.
  • Creating a list of the books on your shelves is an example of the second order of order. This can be done on paper or it can be done in a database.
  • Adding even more information to information is an example of the third order of order. Using the book example: classifying books by genre, best sellers, featured books, bargain books, books one of your friends has read. Basically, there are countless ways to organize something.
The third-order practices that make a company's existing assets more profitable, increase customer loyalty, and seriously reduce costs are the Trojan horse of the information age. As we all get used to them, third-order practices undermine some of our most deeply ingrained ways of thinking about the world and our knowledge of it.
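
To make the third order of order concrete, here is a minimal sketch in Python: attach open-ended tags (metadata) to each book, and any number of "shelves" can then be computed from those tags. The titles and tags are made up for illustration.

```python
# Third order of order: add metadata (tags) to each item, then organize the
# same items countless different ways by querying that metadata.

books = [
    {"title": "Book A", "tags": {"mystery", "best seller"}},
    {"title": "Book B", "tags": {"mystery", "bargain"}},
    {"title": "Book C", "tags": {"biography", "featured"}},
]

def shelf(tag: str) -> list[str]:
    """One of countless possible 'shelves', computed from the metadata."""
    return [b["title"] for b in books if tag in b["tags"]]

print(shelf("mystery"))      # ['Book A', 'Book B']
print(shelf("best seller"))  # ['Book A']
```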

The power of a computer-based system is proportional to the amount of high-quality metadata available to that system. What is not in dispute is the need for a "thick metadata layer" and the benefits of that metadata in terms of getting a computer to perform useful and meaningful work.

But what is sometimes disputed is how to most effectively and efficiently get that thick metadata layer. There are two basic approaches:
  • Have the computer figure out what the metadata is: This approach uses artificial intelligence, machine learning, and other high-tech approaches to detect patterns and figure out the metadata.
  • Tell the computer what the metadata is: This approach leverages talented, skilled, and experienced business domain experts and knowledge engineers to piece together the metadata so that it becomes available to the computer-based system.
Because acquiring this knowledge (knowledge acquisition) can be slow and tedious, much of the future of internetworked computer-based systems depends on breaking the metadata acquisition bottleneck and on codifying and representing a large knowledge infrastructure. However, this is not an "either/or" question. Both manual and automated knowledge acquisition methods can be used together: manually created metadata primes the pump; then machine learning builds on that foundation. Humans and machines can work together to curate this important metadata.
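
Here is a minimal sketch, in Python, of how the two approaches might work together: a small human-curated seed mapping primes the pump, and a simple string-similarity measure (standing in here for machine learning) proposes mappings for unseen terms, which a human would then review. The terms and the 0.6 cutoff are illustrative assumptions.

```python
# Human-curated metadata primes the pump; an automated matcher then
# proposes mappings for new terms, subject to human review.
from difflib import get_close_matches

CURATED = {
    "revenues": "Revenues",
    "sales": "Revenues",
    "cost of goods sold": "CostOfGoodsSold",
}

def propose_mapping(term: str) -> str | None:
    """Suggest a canonical concept for a term; a proposal, not a fact."""
    term = term.strip().lower()
    if term in CURATED:  # already curated by a human
        return CURATED[term]
    matches = get_close_matches(term, CURATED.keys(), n=1, cutoff=0.6)
    return CURATED[matches[0]] if matches else None

print(propose_mapping("Net sales"))  # proposes 'Revenues' for human review
```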

Machine learning and deep learning systems work best when the system you are modeling has a high tolerance for error. These types of systems work best for things like:
  • capturing associations or discovering regularities within a set of patterns;
  • situations where the volume, the number of variables, or the diversity of the data is very great;
  • situations where the relationships between variables are vaguely understood; or
  • situations where relationships are difficult to describe adequately with conventional approaches.
Machine learning basically uses probability and statistics: correlations. This is not to say that machine learning is a bad thing. Machine learning is a tool. Any craftsman knows that you need to use the right tool for the job; using the wrong tool will leave you unsatisfied. Ultimately, what you create will either work or not work to achieve your objectives or the objectives of system stakeholders.
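
As a tiny illustration of that point, the Python below computes a correlation: it can tell you that two made-up series are strongly associated, but nothing about why. That kind of answer is useful only when some error is tolerable.

```python
# A correlation captures an association between two variables without
# explaining it. The figures below are made up for illustration.
from statistics import correlation  # Python 3.10+

shipping_costs = [120, 150, 180, 210, 260]
revenues = [1_000, 1_300, 1_500, 1_900, 2_300]

r = correlation(shipping_costs, revenues)
print(f"Pearson correlation: {r:.3f}")  # near 1.0: strongly associated
```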

There are no shortcuts. Again, no one really disputes the need for this thick layer of metadata to get a computer to perform work effectively. Also, this metadata provides leverage, similar to how software code creates leverage. Metadata, like software, only has to be created once, and then millions can use that metadata, similar to how many people can use the same software application.

Accounting does not have a high tolerance for error. The tolerance for error in many aspects of accounting, reporting, auditing, and analysis is ZERO. So, the threat of inaccuracy needs to be managed. Epistemic risk is manageable.

Accounting information wants to be connected, to be linked. Accounting has built-in transparency and traceability mechanisms. Accounting information also has built-in quality control mechanisms such as double entry and articulation. However, when you start at the "end of the chain", or in the middle, much of the traceability provided by that linking is lost. If metadata is missing, linking might still be available, but the functionality is less than it could be.
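
To make those two quality control mechanisms concrete, here is a minimal sketch in Python. The account names and figures are made up for illustration.

```python
# Double entry: within every transaction, debits must equal credits.
transaction = [  # (account, debit, credit)
    ("Cash", 500, 0),
    ("Revenues", 0, 500),
]
debits = sum(d for _, d, _ in transaction)
credits = sum(c for _, _, c in transaction)
assert debits == credits, "transaction does not balance"

# Articulation: the statements tie together; for example, ending retained
# earnings equals beginning retained earnings plus net income less dividends.
beginning_retained_earnings = 10_000
net_income = 2_500
dividends = 500
ending_retained_earnings = 12_000
assert ending_retained_earnings == (
    beginning_retained_earnings + net_income - dividends
), "statements do not articulate"

print("double entry and articulation checks passed")
```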

Remember the 1-10-100 rule: the relative cost of preventing an error at the source is $1, as contrasted with the $10 cost of fixing the mistake after it has been made and the $100 cost of dealing with the ramifications of a mistake that goes uncorrected.

If we want internetworked computer-based systems to work better, why don't we create the inputs that would give us better outputs?

We need to consciously build a new paradigm rather than try to fix the current paradigm. While computers seem smart, they are actually quite dumb. We need to provide things in a form they can grasp, as contrasted with forcing them to decipher the messy situation we have created over the past 50 years. We humans need to fix the mess we created. Making this investment will yield significant dividends.
