Parts of Training Data

Parts of a successful AI Software

Successful AI software is made up of two parts. Half of it is AI algorithms, and the other half is Training Data. This training data is fine-tuned so that the AI algorithm:

  • can feed upon the supplied training data and make the AI software act smart.
  • can add more and more training data on its own, and make itself even smarter.

In this article we zoom inside the Training Data to see how we arrange it at Smarter.Codes.

DIKW

DIKW at Smarter.Codes

At Smarter.Codes we divide the Training Data into 4 parts: Data, Information, Knowledge, and Wisdom. This division is inspired by the DIKW pyramid.

Data

Data is raw input. In the example of an Autonomous Driving car, data could be

  • Color code: RGB (255, 0, 0). Sourced from Camera
  • Geographical coordinates: 41°24'12.2"N 2°10'26.5"E. Sourced from GPS module
  • Direction our Camera 1 is looking at: 15°. Sourced from Compass

The more creative you are about the sources that bring data in, the more useful the AI software becomes. Think of each input source as a sensory organ of your AI software. The more sensory organs it has, the better.

Information

Raw data, once understood, is called Information.

In the Autonomous Driving car example, Information could be

  • Color is Red. This Color Red is of Traffic Light.
  • This Geo coordinate is of Acme Street's cross road.
  • We are seeing the traffic light facing north.
  • This particular traffic light is signaling our car.

Information stores the interpretation of stimuli. It often contains facts about the current environment.

Knowledge

Information is deduced from Data by referring to pre-existing Knowledge.

In an Autonomous Driving car example, Knowledge can be

  • RGB (255, 0, 0) is given the name Red. Traffic lights tend to have a Red color signal.
  • There are also many shades of red. It is okay to treat a shade of red as red if it is seen on a traffic light.
  • Geo coordinate 41°24'12.2"N 2°10'26.5"E (with 5% variance) means we are at Acme Street's cross road.
  • Looking in a direction between 0° and 30° means we are facing North.
  • You are supposed to stop at traffic lights while the signal is red.

Knowledge guides how to interpret stimuli and how to respond to them. Because it is pre-existing, it is easy to believe that Knowledge alone is the Training Data. But as more Data comes in and more Information is made, Knowledge gets better. Hence the layers before Knowledge, namely Information and Data, are also part of the Training Data.
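As a rough illustration of how Knowledge turns Data into Information, here is a minimal Python sketch. The names `RawData`, `name_color`, `name_heading` and `interpret`, and the numeric thresholds for "a shade of red", are assumptions made up for this example and not part of any Smarter.Codes software. Only the 0° to 30° rule for North comes from the list above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class RawData:
    """Data layer: uninterpreted sensor readings."""
    rgb: Tuple[int, int, int]  # sourced from the camera
    heading_deg: float         # sourced from the compass


def name_color(rgb: Tuple[int, int, int]) -> str:
    """Knowledge: many shades of red count as red (thresholds are assumed)."""
    r, g, b = rgb
    if r >= 180 and g <= 80 and b <= 80:
        return "red"
    return "unknown"


def name_heading(heading_deg: float) -> str:
    """Knowledge: a heading between 0 and 30 degrees is treated as north."""
    if 0 <= heading_deg <= 30:
        return "north"
    return "unknown"


def interpret(data: RawData) -> Dict[str, str]:
    """Information layer: the raw data once understood."""
    return {
        "color": name_color(data.rgb),
        "facing": name_heading(data.heading_deg),
    }


info = interpret(RawData(rgb=(255, 0, 0), heading_deg=15.0))
print(info)  # {'color': 'red', 'facing': 'north'}
```

Because the rules live in small named functions, improving Knowledge (for example, widening the range of accepted reds) does not require touching the Data or Information structures.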

Wisdom

Wisdom is the output of Knowledge applied to Information.

In an Autonomous Driving car example, Wisdom can be

  • I must stop my car while the traffic signal is red.
  • Because
    • I am facing a traffic light on Acme Street cross road.
    • And you must stop your car when the signal is red.

Wisdom is the set of decisions the system has taken based on the Information that came in. It also explains why each decision was taken.
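One way to picture Wisdom is as a decision that carries its own reasons. The sketch below is only an assumption about how that could look in Python. `Decision`, `decide` and the dictionary keys are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Decision:
    """Wisdom layer: an action plus the reasons behind it."""
    action: str
    because: List[str] = field(default_factory=list)


def decide(information: Dict[str, object]) -> Decision:
    """Apply the 'stop on red' Knowledge to the incoming Information."""
    red_light_for_us = (
        information.get("light_color") == "red"
        and information.get("light_signals_our_car") is True
    )
    if red_light_for_us:
        location = information.get("location", "an unknown location")
        return Decision(
            action="stop",
            because=[
                f"I am facing a traffic light on {location}.",
                "You must stop your car when the signal is red.",
            ],
        )
    # Fallback used only for this sketch; a real system would have more rules.
    return Decision(action="continue", because=["No red signal applies to our car."])


decision = decide({
    "light_color": "red",
    "light_signals_our_car": True,
    "location": "Acme Street cross road",
})
print(decision.action)
for reason in decision.because:
    print(" -", reason)
```

Keeping the `because` list next to the action is what lets the system explain why a decision was taken.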

Anatomy of Knowledge

Knowledge can be sub-divided into multiple parts.

Propositions

Propositions are facts. Like

Logic

Low Level Logic

Stop on Red signal

  • IF a traffic light IS red
  • AND you ARE driving on the road which HAS this traffic light
  • THEN Stop
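The rule above can be kept as plain, readable data rather than buried in code. The following is a minimal sketch under that assumption. The `Rule` class and the fact names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


class Rule:
    """A low-level IF/AND/THEN rule stored as plain data."""

    def __init__(self, name: str, conditions: List[str], action: str) -> None:
        self.name = name
        self.conditions = conditions
        self.action = action

    def fires(self, facts: Dict[str, bool]) -> bool:
        # Every condition must be a known, true fact; missing facts count as false.
        return all(facts.get(condition, False) for condition in self.conditions)


stop_on_red = Rule(
    name="Stop on Red signal",
    conditions=["traffic_light_is_red", "driving_on_road_with_this_light"],
    action="stop",
)

facts = {"traffic_light_is_red": True, "driving_on_road_with_this_light": True}
if stop_on_red.fires(facts):
    print(stop_on_red.action)  # stop
```

A red light on a road we are not driving on leaves one condition false, so the rule does not fire.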

Strategic Logic

IF it's a Sunny moment

THEN Stop on Red Signal

IF it's a moment of Emergency

(a matter of life and death, say you are driving an ambulance today and have a patient to save) THEN

  • Drive on the Ambulance lane if available
  • Prefer NOT to Stop on Red Signal
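Strategic logic decides which low-level behaviour applies in a given situation. Here is a small, hedged sketch of that idea. The function `driving_strategy`, the situation labels and the returned keys are assumptions made for this example, and treating every non-emergency situation like the "sunny" case is also an assumption.

```python
from typing import Dict


def driving_strategy(situation: str, ambulance_lane_available: bool = False) -> Dict[str, object]:
    """Pick which low-level behaviour applies in the given situation."""
    if situation == "emergency":
        return {
            # Drive on the ambulance lane only if one is available.
            "lane": "ambulance" if ambulance_lane_available else "regular",
            # "Prefer NOT to stop" is recorded as a preference, not a guarantee.
            "stop_on_red": False,
        }
    # Assumption: any other situation is handled like the ordinary "sunny" case.
    return {"lane": "regular", "stop_on_red": True}


print(driving_strategy("sunny"))
print(driving_strategy("emergency", ambulance_lane_available=True))
```

The low-level rule itself is unchanged. The strategy only switches whether it should be applied.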

Style of Response

IF you like it Easy

THEN choose the expressway route that contains tolls

IF you want to optimize for frugality

THEN choose the route without tolls, especially if the additional fuel expense would be less than the toll expense
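The two styles can be compared with a little arithmetic. The sketch below is one possible reading, not a definitive rule: the `Route` class, the `choose_route` function and the cost figures are all made up for illustration, and falling back to the expressway when the extra fuel costs more than the toll is an assumption about what "frugal" should mean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    toll_cost: float  # money spent on tolls
    fuel_cost: float  # money spent on fuel


def choose_route(style: str, expressway: Route, toll_free: Route) -> Route:
    """Choose a route according to the preferred style of response."""
    if style == "easy":
        return expressway
    if style == "frugal":
        # Extra fuel burned by avoiding the expressway.
        extra_fuel = toll_free.fuel_cost - expressway.fuel_cost
        # Assumption: skip the tolls only when that is actually cheaper.
        return toll_free if extra_fuel < expressway.toll_cost else expressway
    raise ValueError(f"unknown style: {style!r}")


expressway = Route("expressway", toll_cost=5.0, fuel_cost=10.0)
toll_free = Route("toll-free", toll_cost=0.0, fuel_cost=12.0)

print(choose_route("easy", expressway, toll_free).name)    # expressway
print(choose_route("frugal", expressway, toll_free).name)  # toll-free
```

With these example numbers the toll-free route costs 2.0 more in fuel but saves 5.0 in tolls, so the frugal style picks it.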

Applying DIKW in your AI project

Whatever AI software you are building, conduct a Data Sprint and prototype the Training Data that your AI algorithms will use. Depending on how ambitious your AI project is, sometimes you prototype only Propositions and no Logic. Other times you prototype only Low Level Logic and nothing beyond that. Below are examples of DIKW we prototyped in some AI projects.

Personal Assistant for Learning, Time Managing, Chatting & Social Networking

Explainability

In typical AI projects around the world, the Data and Information layers are transparent (often referred to as a white box), but Knowledge and Wisdom tend to be opaque (often referred to as a black box). At Smarter.Codes the training data in Knowledge and Wisdom also tends to be transparent.