Ontology merging

Ontology merging defines the act of bringing together two conceptually divergent ontologies or the instance data associated to two ontologies. This is similar to work in database merging (schema matching). This merging process can be performed in a number of ways, manually, semi automatically, or automatically. Manual ontology merging although ideal is extremely labour-intensive and current research attempts to find semi or entirely automated techniques to merge ontologies. These techniques are statistically driven often taking into account similarity of concepts and raw similarity of instances through textual string metrics and semantic knowledge. These techniques are similar to those used in information integration employing string metrics from open source similarity libraries.

Randomized Hough transform

Hough transforms are techniques for object detection, a critical step in many implementations of computer vision, or data mining from images. Specifically, the Randomized Hough transform is a probabilistic variant to the classical Hough transform, and is commonly used to detect curves (straight line, circle, ellipse, etc.) The basic idea of Hough transform (HT) is to implement a voting procedure for all potential curves in the image, and at the termination of the algorithm, curves that do exist in the image will have relatively high voting scores. Randomized Hough transform (RHT) is different from HT in that it tries to avoid conducting the computationally expensive voting process for every nonzero pixel in the image by taking advantage of the geometric properties of analytical curves, and thus improve the time efficiency and reduce the storage requirement of the original algorithm. == Motivation == Although Hough transform (HT) has been widely used in curve detection, it has two major drawbacks: First, for each nonzero pixel in the image, the parameters for the existing curve and redundant ones are both accumulated during the voting procedure. Second, the accumulator array (or Hough space) is predefined in a heuristic way. The more accuracy needed, the higher parameter resolution should be defined. These two needs usually result in a large storage requirement and low speed for real applications. Therefore, RHT was brought up to tackle this problem. == Implementation == In comparison with HT, RHT takes advantage of the fact that some analytical curves can be fully determined by a certain number of points on the curve. For example, a straight line can be determined by two points, and an ellipse (or a circle) can be determined by three points. The case of ellipse detection can be used to illustrate the basic idea of RHT. The whole process generally consists of three steps: Fit ellipses with randomly selected points. Update the accumulator array and corresponding scores. Output the ellipses with scores higher than some predefined threshold. === Ellipse fitting === One general equation for defining ellipses is: a ( x − p ) 2 + 2 b ( x − p ) ( y − q ) + c ( y − q ) 2 = 1 {\displaystyle a(x-p)^{2}+2b(x-p)(y-q)+c(y-q)^{2}=1} with restriction: a c − b 2 > 0 {\displaystyle ac-b^{2}>0} However, an ellipse can be fully determined if one knows three points on it and the tangents in these points. RHT starts by randomly selecting three points on the ellipse. Let them be X 1 {\displaystyle X_{1}} , X 2 {\displaystyle X_{2}} and X 3 {\displaystyle X_{3}} . The first step is to find the tangents of these three points. They can be found by fitting a straight line using least squares technique for a small window of neighboring pixels. The next step is to find the intersection points of the tangent lines. This can be easily done by solving the line equations found in the previous step. Then let the intersection points be T 12 {\displaystyle T_{12}} and T 23 {\displaystyle T_{23}} , the midpoints of line segments X 1 X 2 {\displaystyle X_{1}X_{2}} and X 2 X 3 {\displaystyle X_{2}X_{3}} be M 12 {\displaystyle M_{12}} and M 23 {\displaystyle M_{23}} . Then the center of the ellipse will lie in the intersection of T 12 M 12 {\displaystyle T_{12}M_{12}} and T 23 M 23 {\displaystyle T_{23}M_{23}} . Again, the coordinates of the intersected point can be determined by solving line equations and the detailed process is skipped here for conciseness. Let the coordinates of ellipse center found in previous step be ( x 0 , y 0 ) {\displaystyle (x_{0},y_{0})} . Then the center can be translated to the origin with x ′ = x − x 0 {\displaystyle x'=x-x_{0}} and y ′ = y − y 0 {\displaystyle y'=y-y_{0}} so that the ellipse equation can be simplified to: a x ′ 2 + 2 b x ′ y ′ + c y ′ 2 = 1 {\displaystyle ax'^{2}+2bx'y'+cy'^{2}=1} Now we can solve for the rest of ellipse parameters: a {\displaystyle a} , b {\displaystyle b} and c {\displaystyle c} by substituting the coordinates of X 1 {\displaystyle X_{1}} , X 2 {\displaystyle X_{2}} and X 3 {\displaystyle X_{3}} into the equation above. === Accumulating === With the ellipse parameters determined from previous stage, the accumulator array can be updated correspondingly. Different from classical Hough transform, RHT does not keep "grid of buckets" as the accumulator array. Rather, it first calculates the similarities between the newly detected ellipse and the ones already stored in accumulator array. Different metrics can be used to calculate the similarity. As long as the similarity exceeds some predefined threshold, replace the one in the accumulator with the average of both ellipses and add 1 to its score. Otherwise, initialize this ellipse to an empty position in the accumulator and assign a score of 1. === Termination === Once the score of one candidate ellipse exceeds the threshold, it is determined as existing in the image (in other words, this ellipse is detected), and should be removed from the image and accumulator array so that the algorithm can detect other potential ellipses faster. The algorithm terminates when the number of iterations reaches a maximum limit or all the ellipses have been detected. Pseudo code for RHT: while (we find ellipses AND not reached the maximum epoch) { for (a fixed number of iterations) { Find a potential ellipse. if (the ellipse is similar to an ellipse in the accumulator) then Replace the one in the accumulator with the average of two ellipses and add 1 to the score; else Insert the ellipse into an empty position in the accumulator with a score of 1; } Select the ellipse with the best score and save it in a best ellipse table; Eliminate the pixels of the best ellipse from the image; Empty the accumulator; }

Attribute–value system

An attribute–value system is a basic knowledge representation framework comprising a table with columns designating "attributes" (also known as "properties", "predicates", "features", "dimensions", "characteristics", "fields", "headers" or "independent variables" depending on the context) and "rows" designating "objects" (also known as "entities", "instances", "exemplars", "elements", "records" or "dependent variables"). Each table cell therefore designates the value (also known as "state") of a particular attribute of a particular object. == Example of attribute–value system == Below is a sample attribute–value system. It represents 10 objects (rows) and five features (columns). In this example, the table contains only integer values. In general, an attribute–value system may contain any kind of data, numeric or otherwise. An attribute–value system is distinguished from a simple "feature list" representation in that each feature in an attribute–value system may possess a range of values (e.g., feature P1 below, which has domain of {0,1,2}), rather than simply being present or absent (Barsalou & Hale 1993). == Other terms used for "attribute–value system" == Attribute–value systems are pervasive throughout many different literatures, and have been discussed under many different names: Flat data Spreadsheet Attribute–value system (Ziarko & Shan 1996) Information system (Pawlak 1981) Classification system (Ziarko 1998) Knowledge representation system (Wong & Ziarko 1986) Information table (Yao & Yao 2002)

Cleverpath AION Business Rules Expert

Cleverpath AION Business Rules Expert (formerly Platinum AIONDS, and before that Trinzic AIONDS, and originally Aion) is an expert system and Business rules engine owned by Computer Associates by 2000. == History == The product was created around 1986 as "Aion" by the Aion company. In its initial release Aion was multi-platform and continues to be deliverable to the PC, Unixs, and Mainframe computer's. In addition it ties in seamlessly with a variety of databases including Oracle, Microsoft SQL Server, and ODBC. Aion was founded by Harry Reinstein, Larry Cohn, Garry Hallee, Scott Grinis, and others. From Scott Grinis's bio: Scott founded Aion, a company that developed expert systems and whose advanced inference engine and object technology were used by financial services and insurance firms to develop risk-scoring and underwriting applications. Harry Reinstein was quoted as saying: “Our biggest competitor was not AICorp, it was COBOL” Trinzic owned AION by 1993. A reference in a 1993 announcement indicates that Trinzic's formation was the result of a merger (paraphased): Trinzic set three development initiatives shortly after its formation from the merger of Aion Corp. and AICorp. The other initiatives -- adding SQL extensions to Aion/DS and evaluating the unbundling of some of that product's object-oriented programming capabilities -- are still active. Writing in 1993 Judith Hodges and Deborah Melewski give the date for the merger: Two rival artificial intelligence software vendors -- AICorp, Inc. and Aion Corp. -- merged in September 1992 to form Trinzic Corp. As part of the merger, redundant jobs were eliminated (20% of the combined work force), leaving a total work force of 245 employees worldwide. The new firm also boasted a combined installed base of more than 1,200 sites representing more than 10,000 software licenses. Although in the merger, technically AICorp bought Aion, as AICorp was a public company and Aion was still private, the reality was that Aion's leadership and technology subsumed AICorp's. Jim Gagnard, the CEO of Aion, became CEO of Trinzic and AICorp's flagship product, KBMS, was discontinued, while the Aion Development System continued to be enhanced and KBMS customers were assisted in converting to AIONDS, under the continued technical leadership of Garry Hallee and Scott Grinis. On August 1, 1994 Trinzic released version 6.4 of AIONDS saying, in part: Trinzic Corp., Palo Alto, Calif., has unveiled The Aion Development System (AionDS) Version 6.4, an upgrade to the company's development environment for building business process automation applications. Version 6.4 provides a visual development environment for Microsoft Windows or OS/2 PM applications using business rules. Trinzic was acquired by PLATINUM Technologies in 1995 which retained at least some of Trinzic's acquisitions Platinum Technologies was acquired by Computer Associates in 1999. CA changed the system's name to CA Aion Business Rules Expert" on or before 2009. It is currently (June 2011) at Release 11 on a wide range of supported platforms. == Applications using Aion == Aion has been used in a variety of industries including Energy, Insurance, Military, Aviation, and Banking. At one point an Aion expert system application written by Covia, LLC existed to do airport gate assignment. Colossus, a computer program, developed by Computer Sciences Corporation is the insurance industry’s leading expert system for assisting adjusters in the evaluation of bodily injury claims (aka "pain and suffering"). Colossus helps adjusters reduce variance in payouts on similar bodily injury claims through objective use of industry standard rules.

Rule-based system

In computer science, a rule-based system is a computer system in which domain-specific knowledge is represented in the form of rules and general-purpose reasoning is used to solve problems in the domain. Two different kinds of rule-based systems emerged within the field of artificial intelligence in the 1970s: Production systems, which use if-then rules to derive actions from conditions. Logic programming systems, which use conclusion if conditions rules to derive conclusions from conditions. The differences and relationships between these two kinds of rule-based system has been a major source of misunderstanding and confusion. Both kinds of rule-based systems use either forward or backward chaining, in contrast with imperative programs, which execute commands listed sequentially. However, logic programming systems have a logical interpretation, whereas production systems do not. == Production system rules == A classic example of a production rule-based system is the domain-specific expert system that uses rules to make deductions or choices. For example, an expert system might help a doctor choose the correct diagnosis based on a cluster of symptoms, or select tactical moves to play a game. Rule-based systems can be used to perform lexical analysis to compile or interpret computer programs, or in natural language processing. Rule-based programming attempts to derive execution instructions from a starting set of data and rules. This is a more indirect method than that employed by an imperative programming language, which lists execution steps sequentially. === Construction === A typical rule-based system has four basic components: A list of rules or rule base, which is a specific type of knowledge base. An inference engine or semantic reasoner, which infers information or takes action based on the interaction of input and the rule base. The interpreter executes a production system program by performing the following match-resolve-act cycle: Match: In this first phase, the condition sides of all productions are matched against the contents of working memory. As a result a set (the conflict set) is obtained, which consists of instantiations of all satisfied productions. An instantiation of a production is an ordered list of working memory elements that satisfies the condition side of the production. Conflict-resolution: In this second phase, one of the production instantiations in the conflict set is chosen for execution. If no productions are satisfied, the interpreter halts. Act: In this third phase, the actions of the production selected in the conflict-resolution phase are executed. These actions may change the contents of working memory. At the end of this phase, execution returns to the first phase. Temporary working memory, which is a database of facts. A user interface or other connection to the outside world through which input and output signals are received and sent. Whereas the matching phase of the inference engine has a logical interpretation, the conflict resolution and action phases do not. Instead, "their semantics is usually described as a series of applications of various state-changing operators, which often gets quite involved (depending on the choices made in deciding which ECA rules fire, when, and so forth), and they can hardly be regarded as declarative". == Logic programming rules == The logic programming family of computer systems includes the programming language Prolog, the database language Datalog and the knowledge representation and problem-solving language Answer Set Programming (ASP). In all of these languages, rules are written in the form of clauses: A :- B1, ..., Bn. and are read as declarative sentences in logical form: A if B1 and ... and Bn. In the simplest case of Horn clauses (or "definite" clauses), which are a subset of first-order logic, all of the A, B1, ..., Bn are atomic formulae. Although Horn clause logic programs are Turing complete, for many practical applications, it is useful to extend Horn clause programs by allowing negative conditions, implemented by negation as failure. Such extended logic programs have the knowledge representation capabilities of a non-monotonic logic. == Differences and relationships between production rules and logic programming rules == The most obvious difference between the two kinds of systems is that production rules are typically written in the forward direction, if A then B, and logic programming rules are typically written in the backward direction, B if A. In the case of logic programming rules, this difference is superficial and purely syntactic. It does not affect the semantics of the rules. Nor does it affect whether the rules are used to reason backwards, Prolog style, to reduce the goal B to the subgoals A, or whether they are used, Datalog style, to derive B from A. In the case of production rules, the forward direction of the syntax reflects the stimulus-response character of most production rules, with the stimulus A coming before the response B. Moreover, even in cases when the response is simply to draw a conclusion B from an assumption A, as in modus ponens, the match-resolve-act cycle is restricted to reasoning forwards from A to B. Reasoning backwards in a production system would require the use of an entirely different kind of inference engine. In his Introduction to Cognitive Science, Paul Thagard includes logic and rules as alternative approaches to modelling human thinking. He does not consider logic programs in general, but he considers Prolog to be, not a rule-based system, but "a programming language that uses logic representations and deductive techniques" (page 40). He argues that rules, which have the form IF condition THEN action, are "very similar" to logical conditionals, but they are simpler and have greater psychological plausibility (page 51). Among other differences between logic and rules, he argues that logic uses deduction, but rules use search (page 45) and can be used to reason either forward or backward (page 47). Sentences in logic "have to be interpreted as universally true", but rules can be defaults, which admit exceptions (page 44). He does not observe that all of these features of rules apply to logic programming systems.

Graphics

Graphics (from Ancient Greek γραφικός (graphikós) 'pertaining to drawing, painting, writing, etc.') are visual images or designs on some surface, such as a wall, canvas, screen, paper, or stone, to inform, illustrate, or entertain. In contemporary usage, it includes a pictorial representation of data, as in design and manufacture, in typesetting and the graphic arts, and in educational and recreational software. Images that are generated by a computer are called computer graphics. Examples are photographs, drawings, line art, mathematical graphs, line graphs, charts, diagrams, typography, numbers, symbols, geometric designs, maps, engineering drawings, or other images. Graphics often combine text, illustration, and color. Graphic design may consist of the deliberate selection, creation, or arrangement of typography alone, as in a brochure, flyer, poster, web site, or book without any other element. The objective can be clarity or effective communication, association with other cultural elements, or merely the creation of a distinctive style. Graphics can be functional or artistic. The latter can be a recorded version, such as a photograph, or an interpretation by a scientist to highlight essential features, or an artist, in which case the distinction with imaginary graphics may become blurred. It can also be used for architecture. == History == The earliest graphics known to anthropologists studying prehistoric periods are cave paintings and markings on boulders, bone, ivory, and antlers, which were created during the Upper Palaeolithic period from 40,000 to 10,000 B.C. or earlier. Many of these were found to record astronomical, seasonal, and chronological details. Some of the earliest graphics and drawings are known to the modern world, from almost 6,000 years ago, are that of engraved stone tablets and ceramic cylinder seals, marking the beginning of the historical periods and the keeping of records for accounting and inventory purposes. Records from Egypt predate these and papyrus was used by the Egyptians as a material on which to plan the building of pyramids; they also used slabs of limestone and wood. From 600 to 250 BC, the Greeks played a major role in geometry. They used graphics to represent their mathematical theories such as the Circle Theorem and the Pythagorean theorem. In art, "graphics" is often used to distinguish work in a monotone and made up of lines, as opposed to painting. === Drawing === Drawing generally involves making marks on a surface by applying pressure from a tool or moving a tool across a surface. In which a tool is always used as if there were no tools it would be art. Graphical drawing is an instrumental guided drawing. === Printmaking === Woodblock printing, including images is first seen in China after paper was invented (about A.D. 105). In the West, the main techniques have been woodcut, engraving and etching, but there are many others. ==== Etching ==== Etching is an intaglio method of printmaking in which the image is incised into the surface of a metal plate using an acid. The acid eats the metal, leaving behind roughened areas, or, if the surface exposed to the acid is very thin, burning a line into the plate. The use of the process in printmaking is believed to have been invented by Daniel Hopfer (c. 1470–1536) of Augsburg, Germany, who decorated armour in this way. Etching is also used in the manufacturing of printed circuit boards and semiconductor devices. === Line art === Line art is a rather non-specific term sometimes used for any image that consists of distinct straight and curved lines placed against a (usually plain) background, without gradations in shade (darkness) or hue (color) to represent two-dimensional or three-dimensional objects. Line art is usually monochromatic, although lines may be of different colors. === Illustration === An illustration is a visual representation such as a drawing, painting, photograph or other work of art that stresses the subject more than form. The aim of an illustration is to elucidate or decorate a story, poem or piece of textual information (such as a newspaper article), traditionally by providing a visual representation of something described in the text. The editorial cartoon, also known as a political cartoon, is an illustration containing a political or social message. Illustrations can be used to display a wide range of subject matter and serve a variety of functions, such as: giving faces to characters in a story displaying a number of examples of an item described in an academic textbook (e.g. A Typology) visualizing step-wise sets of instructions in a technical manual communicating subtle thematic tone in a narrative linking brands to the ideas of human expression, individuality, and creativity making a reader laugh or smile for fun (to make laugh) funny === Graphs === A graph or chart is a graphic that represents tabular or numeric data. Charts are often used to make it easier to understand large quantities of data and the relationships between different parts of the data. === Diagrams === A diagram is a simplified and structured visual representation of concepts, ideas, constructions, relations, statistical data, etc., used to visualize and clarify the topic. === Symbols === A symbol, in its basic sense, is a representation of a concept or quantity; i.e., an idea, object, concept, quality, etc. In more psychological and philosophical terms, all concepts are symbolic in nature, and representations for these concepts are simply token artifacts that are allegorical to (but do not directly codify) a symbolic meaning, or symbolism. === Maps === A map is a simplified depiction of a space, a navigational aid which highlights relations between objects within that space. Usually, a map is a two-dimensional, geometrically accurate representation of a three-dimensional space. One of the first 'modern' maps was made by Waldseemüller. === Photography === One difference between photography and other forms of graphics is that a photographer, in principle, just records a single moment in reality, with seemingly no interpretation. However, a photographer can choose the field of view and angle, and may also use other techniques, such as various lenses to choose the view or filters to change the colors. In recent times, digital photography has opened the way to an infinite number of fast, but strong, manipulations. Even in the early days of photography, there was controversy over photographs of enacted scenes that were presented as 'real life' (especially in war photography, where it can be very difficult to record the original events). Shifting the viewer's eyes ever so slightly with simple pinpricks in the negative could have a dramatic effect. The choice of the field of view can have a strong effect, effectively 'censoring out' other parts of the scene, accomplished by cropping them out or simply not including them in the photograph. This even touches on the philosophical question of what reality is. The human brain processes information based on previous experience, making us see what we want to see or what we were taught to see. Photography does the same, although the photographer interprets the scene for their viewer. === Engineering drawings === An engineering drawing is a type of drawing and is technical in nature, used to fully and clearly define requirements for engineered items. It is usually created in accordance with standardized conventions for layout, nomenclature, interpretation, appearance (such as typefaces and line styles), size, etc. === Computer graphics === There are two types of computer graphics: raster graphics, where each pixel is separately defined (as in a digital photograph), and vector graphics, where mathematical formulas are used to draw lines and shapes, which are then interpreted at the viewer's end to produce the graphic. Using vectors results in infinitely sharp graphics and often smaller files, but, when complex, like vectors take time to render and may have larger file sizes than a raster equivalent. In 1950, the first computer-driven display was attached to MIT's Whirlwind I computer to generate simple pictures. This was followed by MIT's TX-0 and TX-2, interactive computing which increased interest in computer graphics during the late 1950s. In 1962, Ivan Sutherland invented Sketchpad, an innovative program that influenced alternative forms of interaction with computers. In the mid-1960s, large computer graphics research projects were begun at MIT, General Motors, Bell Labs, and Lockheed Corporation. Douglas T. Ross of MIT developed an advanced compiler language for graphics programming. S.A.Coons, also at MIT, and J. C. Ferguson at Boeing, began work in sculptured surfaces. GM developed their DAC-1 system, and other companies, such as Douglas, Lockheed, and McDonnell, also made significant developments. In 1968, ray tracing was first described by Arthur Appel of the IBM Research Center, Yorktown Heights, N

Open Knowledge Base Connectivity

Open Knowledge Base Connectivity (OKBC) is a protocol and an API for accessing knowledge in knowledge representation systems such as ontology repositories and object–relational databases. It is somewhat complementary to the Knowledge Interchange Format that serves as a general representation language for knowledge. It is developed by SRI International's Artificial Intelligence Center for DARPA's High Performance Knowledge Base program (HPKB).