Tokens

In his article, "A Theory of Types", Graham Berrisford discusses the difference between a "thing", an "instance", a "type", a "set", and a "token", and provides a graphic to differentiate those five notions.

In this blog post I want to point out one very specific notion that a lot of people seem to have a hard time getting their head around.  This idea relates to the difference between a "thing" and the "token" used to represent the thing such that computer software can perform work for humans.

A "thing" is about the notion or concept or idea that you are trying to represent and model.  A "token" is the physical implementation of that thing using some sort of technical syntax that a computer software system can read.

So, for example, in accounting, the accounting equation has three different ideas: 

  • assets, 
  • liabilities, 
  • equity.

If you look at the Wikipedia article about the Accounting Equation, you will notice that, in the example of the equation, the idea of "equity" is represented in four different ways.  The first is "E".  The second is "Equity".  The third is "Owner's Equity".  The fourth is "Shareholders' Equity".

There is a lot going on in this very simple example.  Does a computer understand that "E", "Equity", "Owner's Equity", and "Shareholders' Equity" are all referring to exactly the same thing, our idea of "equity"?  Look at where the apostrophe sits in "Owner's Equity" versus "Shareholders' Equity".  Would "Owners' Equity" mean the same thing as "Owner's Equity" to a computer?  How about "Shareholder's Equity", "Shareholder Equity", or "Shareholders' Equity"?  Then there is the whole thing about the different computer symbols for the apostrophe.
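To make the apostrophe problem concrete, here is a minimal Python sketch. The `normalize` function and the particular list of apostrophe variants it folds together are my own assumptions for illustration, not a standard. It shows that a computer sees a straight and a curly apostrophe as different strings until it is told otherwise, and that even after that cleanup "Owner's" and "Owners'" remain different.

```python
import unicodedata

# Three labels a human would read as roughly "the same".
# The second one uses a curly apostrophe (U+2019) instead of a straight one.
labels = ["Owner's Equity", "Owner\u2019s Equity", "Owners' Equity"]

# Plain comparison: different characters, so different strings.
print(labels[0] == labels[1])  # False


def normalize(label: str) -> str:
    """Hypothetical cleanup: unify apostrophe variants, case, and spacing."""
    text = unicodedata.normalize("NFKC", label)
    for variant in ("\u2019", "\u2018", "\u02bc", "`"):
        text = text.replace(variant, "'")
    return " ".join(text.casefold().split())


# After cleanup the curly and straight versions agree...
print(normalize(labels[0]) == normalize(labels[1]))  # True
# ...but moving the apostrophe still produces a different string.
print(normalize(labels[0]) == normalize(labels[2]))  # False
```

Cleanup like this only removes accidental differences. Deciding that "Owner's Equity" and "Owners' Equity" mean the same thing is still something a human has to tell the computer.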

The first important idea or notion you want to understand here is that a computer is a dumb beast. I mean really, really dumb.  If you have not told the computer in some way that, for example, all those representations refer to the same notion of "equity", the computer will not understand what you are talking about.

The second important idea is the notion of the "token" used to represent that idea.  Does "EFG7742F" represent equity?  Maybe.  You could use pretty much any set of characters to represent the idea of "equity" to a computer, but then you also have to map every possible string someone might come up with to that set of characters the computer would be using, like "EFG7742F" or "equity" or "EQUITY" or "Equity" or whatever you or a software developer might come up with.  That physical representation of the notion of "equity" that humans might refer to in many different ways is called a "token".  That token symbolizes in physical form the idea of what you refer to as "equity".
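Here is a minimal Python sketch of that mapping, assuming the arbitrary token "EFG7742F" from the paragraph above. The table name `LABEL_TO_TOKEN` and the `resolve` function are hypothetical names I made up for illustration.

```python
# The arbitrary token used in the text to stand for the idea of "equity".
TOKEN_EQUITY = "EFG7742F"

# Every human spelling has to be mapped to the token explicitly.
# Keys are stored lower-cased so "EQUITY" and "Equity" resolve alike.
LABEL_TO_TOKEN = {
    "e": TOKEN_EQUITY,
    "equity": TOKEN_EQUITY,
    "owner's equity": TOKEN_EQUITY,
    "shareholders' equity": TOKEN_EQUITY,
}


def resolve(label: str):
    """Return the token for a label, or None if nobody mapped it."""
    return LABEL_TO_TOKEN.get(label.strip().casefold())


print(resolve("EQUITY"))                # EFG7742F
print(resolve("Shareholders' Equity"))  # EFG7742F
print(resolve("Net Assets"))            # None
```

The last line is the point: a string that nobody mapped simply does not resolve, no matter how obvious its meaning is to an accountant.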

The third important idea or notion is that of meaning.  What exactly do you mean by "equity"?  Do you mean whatever that Wikipedia article discussing the Accounting Equation is trying to convey?  If you read through the Wikipedia article, there is a definition in there.  Sort of.  What authority does Wikipedia have to define "equity"?  Might not the Financial Accounting Standards Board (FASB) be better qualified to define equity?  What if you are in the European Union?  Might not the International Accounting Standards Board (IASB) definition of equity be preferable?  Saying this another way, you might have:

  • Wikipedia:Equity
  • FASB:Equity
  • IASB:Equity
  • SEC:Equity
  • Australia:Net Assets (i.e. they might use a different version of the accounting equation)
  • Accounting system:19855-00-9001 (the chart of accounts code for equity in an accounting system)

Which version of equity is the version you want to be using?  How exactly would you distinguish one version of equity from some other version of equity?

The fourth important idea or notion that I want to point out is the difference between some local notion of "equity" and some global version of "equity" and how the two relate.  You may have some local way that you refer to the notion of "equity".  Someone else might, very justifiably, have their own definition and representation. There are mechanisms that technical people have created for making these distinctions.  LOTS of them.  There is also a global standard approach that has been developed to distinguish between different instances of what might be the same thing.  This approach is called the Internationalized Resource Identifier, or IRI. There is actually a robust technical specification that defines an IRI (which looks a lot like what you might call a URL):

  • https://en.wikipedia.org/wiki/Accounting_equation#Equity
  • https://www.ifrs.org/groups/international-accounting-standards-board/#equity
  • https://fasb.org#EQUITY
  • https://sec.gov/us-gaap#equity
  • https://sec.gov/ifrs#equity
  • https://aasb.gov.au#netassets
  • https://microsoft.com/definitions/coa/#19855-00-9001

Note that the above IRIs are approximations meant only as examples.  They are probably legal IRIs, but they might not be.  The idea of an IRI is to have a global standard approach to uniquely define "things" and provide a machine readable "token" for each thing.
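As a rough illustration of how an IRI separates "whose definition" from "which thing", this Python sketch splits a few of the approximate example IRIs above into a namespace part and a local name. It uses `urldefrag` from the standard library. The `split_iri` helper is a name I made up, and treating the fragment as the local name is an assumption that happens to fit these examples.

```python
from urllib.parse import urldefrag

# A few of the approximate example IRIs from the list above.
iris = [
    "https://en.wikipedia.org/wiki/Accounting_equation#Equity",
    "https://sec.gov/us-gaap#equity",
    "https://sec.gov/ifrs#equity",
]


def split_iri(iri: str):
    """Split an IRI into (namespace, local name) at the '#' fragment."""
    namespace, local_name = urldefrag(iri)
    return namespace, local_name


for iri in iris:
    namespace, local_name = split_iri(iri)
    print(f"{local_name!r} as defined by {namespace}")

# Same local name, different namespace: two distinct tokens.
us_gaap = split_iri(iris[1])
ifrs = split_iri(iris[2])
print(us_gaap[1] == ifrs[1])  # True, both say "equity"
print(iris[1] == iris[2])     # False, the full IRIs differ
```

The local name alone is ambiguous. It is the full IRI that acts as the globally unique token.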

The fifth important idea or notion that I want to point out is the distinction between a "concept" and a "preferred label" for a concept.  Subject matter experts within an area of knowledge need to decide if, say, "Equity" and "Shareholders' Equity" are (1) two different things or (2) one thing with two different preferred labels. Would "Owners Equity" be a third thing, or another preferred label for the one thing "equity"?  What about "Net Assets"?  Is that a thing or just yet another preferred label?
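One possible way to record such a decision is sketched below in Python. It is loosely inspired by the way SKOS separates a preferred label from alternative labels. The `Concept` class, the example.com IRIs, and the choice to treat "Net Assets" as a separate thing are all assumptions made purely for illustration. The real answer is for the subject matter experts to decide.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One 'thing': a single token (IRI) plus the labels humans use for it."""
    iri: str
    pref_label: str
    alt_labels: set = field(default_factory=set)

    def matches(self, label: str) -> bool:
        """True if the label is one of this concept's known labels."""
        wanted = label.casefold()
        known = {self.pref_label.casefold()}
        known.update(alt.casefold() for alt in self.alt_labels)
        return wanted in known


# Option (2) from the text: one thing, several labels.
equity = Concept(
    iri="https://example.com/concepts#equity",  # hypothetical IRI
    pref_label="Equity",
    alt_labels={"Shareholders' Equity", "Owner's Equity"},
)

# Option (1): a separate thing with its own token.
net_assets = Concept(
    iri="https://example.com/concepts#netAssets",  # hypothetical IRI
    pref_label="Net Assets",
)

print(equity.matches("shareholders' equity"))  # True
print(equity.matches("Net Assets"))            # False
```

Either way, the decision ends up written down where software can read it instead of living in someone's head.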

Finally, the last important idea that I want to point out is the fact that not everyone in the world speaks English.  There are many different languages, and "equity" might be "Eigenkapital" in German, "حقوق صاحبان سهام" in Persian, or "Equidad" in Spanish.  An argument could be made that a GUID (globally unique identifier) is better than a human readable name in any specific language for defining a token.
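A minimal Python sketch of that argument follows: the token is a language-neutral identifier and the labels hang off it by language code. The example.com IRI, the `LABELS` table, and the `label_for` function are hypothetical. Deriving the identifier from an IRI with `uuid.uuid5` is just one option; a random `uuid.uuid4()` would serve equally well as a token.

```python
import uuid

# Hypothetical IRI for our idea of "equity".
EQUITY_IRI = "https://example.com/concepts#equity"

# A language-neutral token. uuid5 is deterministic: the same IRI
# always yields the same identifier.
token = uuid.uuid5(uuid.NAMESPACE_URL, EQUITY_IRI)

# Human readable labels, keyed by token and then by language code.
LABELS = {
    token: {
        "en": "Equity",
        "de": "Eigenkapital",
        "fa": "حقوق صاحبان سهام",
        "es": "Equidad",
    },
}


def label_for(tok: uuid.UUID, lang: str, fallback: str = "en") -> str:
    """Return the label in the requested language, else the fallback."""
    names = LABELS[tok]
    return names.get(lang, names[fallback])


print(label_for(token, "de"))  # Eigenkapital
print(label_for(token, "fr"))  # Equity (no French label, falls back)
```

The software only ever compares tokens. Which label a person sees is a presentation choice.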

There are probably additional considerations when defining the necessary physical objects that a computer needs to use to do work for us humans.  I am not even going to get into the technical syntax choice considerations.

My example shows only three "things".  The US GAAP XBRL taxonomy has about 20,000 things (so far) and the IFRS XBRL Taxonomy has about 7,000 but will very likely grow to be of similar size as US GAAP.

But neither the US GAAP nor the IFRS XBRL taxonomy has a "token" for the notion of a disclosure.  Referring to a part of an XBRL-based report can be problematic because (a) XBRL has two ways of defining the pieces of a report (network, hypercube), (b) sometimes hypercubes are not even explicitly provided, and (c) sometimes the exact same hypercube is used to describe two different disclosures.  (Actually, Microsoft used the same hypercube to represent 128 different financial disclosures.)  Folks, networks and hypercubes are both tokens.  A missing token means you have to figure out your own way to address the pieces of a report that you want to work with.
