Visualize | Models
Models address specific kinds of domains and are useful in comprehending complex systems. Models take many forms. Some exist only as internal mental constructs of the observer. Some exist as physical, three-dimensional objects. Language itself is a model, and it shapes our comprehension of the world. We ratchet towards various goals as a species through models, because human cognition cannot handle more than a couple of actors using a couple of tools towards a goal. Some goals are self-directed.
Understanding models is a hot area right now as we try to create machines that understand reality; for instance, self-driving vehicles that understand the reality of transport in a city. This is an interesting shift from human models, because the same set of inferences, from an AI perspective, forms a consistent model of the world. For a machine, there should be no difference between one server comprehending a stop sign and another, all other things being equal. This is not the case for humans.
Cruft Buster can visualize any model that follows its constraints and conventions. The knowledge is rendered into handbooks and diagrams with Python scripts and Graphviz.
Data flow is a likely primary focus for IT analysis. Chris Gane and Trish Sarson developed a model for data flow in the 1970s that uses just three symbols and some conventions for connecting the symbols to make diagrams. The first symbol is called an external entity, and is a source or destination of data. Normally this is a role or group of people, like accounting or human resources. The second symbol is called a process, and it transforms data that flows through it from one form to another. An example of a process would be an inventory report, or saving a document when using a word processor. The third symbol is called a data store, which holds data at rest, like a disk drive or a thumb drive. These symbols are not limited to computing. As an example, I might fill out a form for employment (external entity) and mail it to a recruiter, who reviews it for fit and files it (process) in a file cabinet (data store). The data store and process symbols are identified with both an ID and a short description, while the entity symbol only has a description.
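The three symbols and the flows between them can be sketched as plain data. This is a minimal illustration of the employment form example, not Cruft Buster's own schema: the class names, the `Flow` record, the flow labels, and the IDs `P1` and `D1` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalEntity:
    """Source or destination of data; has a description only."""
    description: str


@dataclass(frozen=True)
class Process:
    """Transforms data flowing through it; has an ID and a description."""
    id: str
    description: str


@dataclass(frozen=True)
class DataStore:
    """Data at rest; has an ID and a description."""
    id: str
    description: str


@dataclass(frozen=True)
class Flow:
    """A directed data flow between two symbols, labeled with the data carried."""
    source: object
    target: object
    label: str


# Hypothetical IDs and labels, chosen only for this sketch.
applicant = ExternalEntity("Job applicant")
review = Process("P1", "Review and file application")
cabinet = DataStore("D1", "File cabinet")

flows = [
    Flow(applicant, review, "employment form"),
    Flow(review, cabinet, "filed form"),
]


def describe(flow: Flow) -> str:
    """Render one flow as 'source -> target [label]'."""
    return f"{flow.source.description} -> {flow.target.description} [{flow.label}]"


for flow in flows:
    print(describe(flow))
```

Note that the list of flows records direction only. Nothing in it says which flow happens first.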
Let's dive in a bit deeper by looking at an example data flow using a Gane and Sarson model. Review the diagram above as you read the narrative: A job applicant submits a resume either by postal mail or online via the company's recruiting website. Resumes submitted online are immediately stored in the database and scanned for keywords. A recruiter can then run a report to match candidates for particular jobs. If the resume is mailed in, recruiting removes the resume from the incoming mail slot, loads it in the feed tray, and enters the assigned physical location in the grey file cabinet.
As with the Christmas tree, there are other domains beyond the data flow. The Feed Tray Load process has a particular order to it that this diagram doesn't catch. This is an important distinction. The diagram shows the direction of data flow. It does not show sequence. The numbers at the top of the process boxes are intended to show a rough sequence and make the diagram easier to understand, but the nature of the domain is such that they are never completely accurate. As an example, there is no way to tell whether the job applicant will mail an application or submit one online, so sequence numbers cannot be shown in a way that is accurate.
The data flow model here uses only the direction of data as a relationship. This is shown by the direction of the arrows. The Christmas Tree allegory example has a wider variety of relationships. These are called predicates, and are listed in the order subject, predicate, object, much like in English sentences. The resulting statements are called triples. For the Christmas tree diagram above, find this relationship: "fir has_needle_type pointy". The data flow model example has "recruiting has_specified_output assign_file". An important thing to notice about the data flow triple is that the predicate is not "location info". Location info is a convention of the diagram that signifies what data is flowing. This is illustrated by the Christmas Tree payment type. We will interpret the diagram predicate between tree lot and cash only as has_payment_type and end up with "tree_lot has_payment_type cash_only". Refer to the diagram above to find that triple. Perhaps it is useful to note what kind of cash. Maybe the tree lot takes Canadian money. This is similar to what is happening with the data flow diagram and the labels on the flows.
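To make the subject, predicate, object ordering concrete, here is a small sketch that stores the three triples quoted above and looks one up. The `Triple` type and the `objects_of` helper are hypothetical names for illustration, and this is not a description of how Cruft Buster stores triples internally.

```python
from collections import namedtuple

# One relationship, in the same order as an English sentence.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# The three triples quoted in the text.
triples = [
    Triple("fir", "has_needle_type", "pointy"),
    Triple("recruiting", "has_specified_output", "assign_file"),
    Triple("tree_lot", "has_payment_type", "cash_only"),
]


def objects_of(triples, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [
        t.object
        for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


print(objects_of(triples, "tree_lot", "has_payment_type"))
```

A flow label such as "location info" has no place in the triple itself. In a sketch like this it would have to be carried as separate data attached to the relationship.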
The above example data flow is fictional. It illustrates a system that has been around for 20 years and is maintained without a full understanding of what is needed. The Resume Wizard Database, Text Scan, Feed Tray Load, and Match Report are all part of a software suite the organization purchased in 1999. The company that makes the software has been purchased multiple times, and the software is no longer being updated, yet the support contract still costs several thousand dollars a year for the security and application patches that keep the system running as operating systems are upgraded. Only two resumes a year come in through snail mail, and not one of those resumes has been used for a hire since 2006. Two-thirds of this system is cruft.
For further illustration of how data flow diagrams work with hybrid documentation, see IT Docent.
In addition to data flow, Cruft Buster includes schemas for network diagrams:
And flow charts:
Custom nodes are possible, even with scaled vector labels. Note that the tree is directly translated to DOT format, and is placed in the /graph/ folder for the domain. It is also possible to embed Mermaid diagrams in the markdown of the handbook, if that level of abstraction is preferred.
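As a rough idea of what a translation to DOT can look like, the sketch below turns subject, predicate, object triples into a Graphviz `digraph` as plain text. It is an assumption-laden illustration, not Cruft Buster's actual translator: the `to_dot` function, the graph name, and the choice to render predicates as edge labels are all invented for the example.

```python
def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping embedded quotes for DOT."""
    return '"' + value.replace('"', '\\"') + '"'


def to_dot(triples, name="model"):
    """Build DOT source with one labeled edge per (subject, predicate, object)."""
    lines = [f"digraph {name} {{"]
    for subject, predicate, obj in triples:
        lines.append(
            f"    {quote(subject)} -> {quote(obj)} [label={quote(predicate)}];"
        )
    lines.append("}")
    return "\n".join(lines)


# Triples taken from the Christmas tree example in the text.
example = [
    ("fir", "has_needle_type", "pointy"),
    ("tree_lot", "has_payment_type", "cash_only"),
]

dot_source = to_dot(example, name="christmas_tree")
print(dot_source)
```

If the output is saved to a file, Graphviz can render it with a command such as `dot -Tsvg christmas_tree.dot -o christmas_tree.svg`.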