COMPUTER DEPARTMENTS

Program System LOREG
LOgical REGularities
(software for classification, recognition, data mining)

1. INTRODUCTION
1.1. "LOREG" - What is it?
1.2. Information requirements
1.3. Application areas
1.4. System requirements
1.5. Loreg Setup
1.6. Bibliography

2. THEORY
2.1. Loreg - new approach for data analysis and pattern recognition
2.2. Standard information
2.3. Logical regularities
2.4. Logical descriptions
2.5. Pattern recognition algorithms
2.6. New information for features, objects and classes
2.7. Collective solutions

3. FORMAT OF INITIAL INFORMATION

4. LOGICAL REGULARITIES DETERMINATION AND PATTERN RECOGNITION PROBLEM SOLVING
4.1. Pattern recognition algorithms based upon voting over sets of logical regularities
4.2. MCL application
4.3. Collective rule recognition algorithm (Colw application)
4.4. Table Transformation (Tran application)

5. VISUALISATION
5.1. Why is it necessary
5.2. How to start Loreg Visual
5.3. What is displayed and how
5.4. How to change active items
5.5. How to get information
5.6. Menu commands
5.7. Getting help

6. LOREG IN THE SOLUTION OF PRACTICAL PROBLEMS (EXAMPLE OF A MEDICAL DIAGNOSTIC PROBLEM)

7. HOW TO SOLVE A RECOGNITION TASK - FIRST STEPS
7.1. Original information
7.2. Learning
7.3. Recognition for objects from file
7.4. Recognition in on-line mode
7.5. What else?

1. INTRODUCTION

1.1. "LOREG" -- What is it?

The program system LOREG is designed for logical regularity search, data analysis, and pattern recognition. Objects are described by their features. There is a finite number of classes (patterns), and the training information is given as a sample of object descriptions.
The following problems can be solved with LOREG:

determination of logical regularities for each class in terms of conjunctions of elementary predicates that define intervals of feature values;
determination of logical regularities for classes in the form of a disjunction (which may also be a minimal or shortest one);
recognition of new objects on the basis of the regularities found;
estimation of feature informativity;
calculation of new parameters that generalize the information about features, objects, and classes contained in the training data;
visualization of data analysis and recognition results against the background of the training data and the data being recognized;
determination of collective solutions for pattern recognition and feature informativity estimation.

LOREG provides extensive tools for adapting the system to the user's requirements and to the restrictions of real data. Each stage of the regularity search process is implemented in several variants, so the user can obtain the results most adequate to the size and quality of the training data (optimal pattern recognition algorithms, collectives of recognition algorithms, sets of the same parameters calculated by various methods). The theoretical background of the pattern recognition methods, based on a combination of new logical, optimization, and statistical techniques, has been developed at the Computing Center of the Russian Academy of Sciences.

1.2. Information Requirements

    The training objects and the objects to be recognized are described in terms of numerical features. Each class in the training table must contain two or more objects (they may be equal). Partial contradiction of class descriptions, up to their intersection, is allowed, as is incompleteness of data (absence of some feature values).
    NO additional restrictions are imposed on the information.
    From experience with LOREG you will obtain answers to questions about the quality of the training information, the informativity of the feature space, the existence of logical interrelations between features and classes, the accuracy of recognition problem solutions, and the correctness of the pattern recognition problem statement itself.

1.3. Application Areas

LOREG can be used to solve diagnostics, recognition, forecasting, and classification problems in medicine, politics, sociology, physics, chemistry, technology, economics, geology, business, and finance.
These algorithms have been applied to practical problems in the following areas:

industry (recognition of salt deposits on oil field equipment, quality control of textile goods, forecasting of alloy properties);
geology (forecasting of rare metal and oil deposits);
medicine (diagnostics of diseases, estimation of disease severity, forecasting of treatment results in oncology, neurology, cardiology, and pulmonology);
agriculture (forecasting of crop yields, recognition of dominant tree species in forestry);
recognition of handwritten digits;

and others (more than 60 applications).

1.4. System requirements

80386 processor or higher, Windows 95, 4 MB RAM, 10 MB HDD.

1.5. Loreg Setup

To install LOREG, insert the installation disk in the floppy drive, run Setup.exe from Windows, and follow the instructions displayed on the screen.

1.6. Bibliography

Zhuravlev, Yu.I., Correct Algebras over Sets of Inaccurate (Heuristic) Algorithms. 1, Kibernetika, 1977, no. 4, pp. 5-17; no. 6, pp. 21-27.
Zhuravlev, Yu.I., On the Algebraic Approach to the Solution of Problems of Recognition and Classification, Problemy Kibernetiki, Nauka, Moscow, 1978, issue 33, pp. 5-68.
Baskakova, L.V., Zhuravlev, Yu.I., A Model of Recognition Algorithms with Representative Collections and Systems of Support Sets, Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki, 1981, vol. 21, no. 5, pp. 1264-1275.
Ryazanov, V.V., On the optimal recognition and classification algorithms construction for solution of applied problems, In: Raspoznavanie, Klassifikatsiya, Prognoz: Matematicheskie Metody i Ikh Primenenie, Nauka, Moscow, 1988, issue 1, pp. 229-279.
Sen'ko, O.V., Algorithms of Voting Over the Set of Operators of Continuum Power, In: Voprosy Kibernetiki. Diskretnaya Matematika. Metody i Primeneniya, NSK AN SSSR, Moscow, 1989.
Zhuravlev, Yu.I., Ryazanov, V.V., Intelligence system for optimal algorithms construction for solution of recognition, classification, diagnostics problems, and decision making by precedents. In : Fundamental books for economy, Moscow, Science, 1990.
Ryazanov, V.V., Sen'ko, O.V., About some recognition models and methods for their optimization, In: Raspoznavanie, Klassifikatsiya, Prognoz: Matematicheskie Metody i Ikh Primenenie, Nauka, Moscow, 1990, issue 3, pp. 106-145.
Bogomolov V.P., Bushmanov O.N., Ryazanov V.V., Senco O.V. , Zhuravlev Yu.I. Data Analysis and Recognition Systems: Algorithms, Programs and Applications , Pattern Recognition and Image Analysis. 1991. Vol.1. no.3. P.335-346.
Ryazanov V.V. On the Optimization of a Class of Recognition Models, Pattern Recognition and Image Analysis, Vol.1. no.1. 1991. P.108-118.
Sen'ko O.V., The algorithm of prognosis, based on the procedure of voting by system of boxes on multidimensional space, Pattern Recognition and Image Analysis. 1993. Vol.3. no.3. P.283-284.
Ryazanov V.V. Recognition Algorithms Based on Local Optimality Criteria , Pattern Recognition and Image Analysis. 1994. Vol.4. no.2. pp. 98-109.
Larin S.B. Riazanov V.V., On a search of logical regularities for pattern recognition and data analysis, Pattern Recognition and Image Analysis. 1997. Vol.7. no.3.

2. THEORY

2.1. "Loreg" - new approach for data analysis and pattern recognition.

    There are various "classical" methods for data analysis, classification, and pattern recognition, based on statistical approaches, structural analysis, neural network modelling, and fuzzy sets.
    The mathematical background of LOREG is a combination of logical, optimization, and statistical techniques for data analysis.
    Logical analysis consists in searching the training table for special fragments and for neighbourhoods of fragments of training objects described in terms of features. Such fragment neighbourhoods are "typical" for some classes and "non-typical" for the others. Parametric and nonparametric methods have been developed for their description and search.
    The optimization approach consists in introducing various numerical criteria for the estimation of fragment neighbourhoods and in solving the corresponding mathematical programming problems to find optimal feature neighbourhoods. The optimal solutions are interpreted as logical regularities.
    Statistical ideas are used both in the introduction of various optimization criteria (functionals) and in the construction of decision rules in pattern recognition algorithms.
    In practice everything is much more complicated and interrelated. However, this technique is the "interior part" of LOREG and does not require any special knowledge from the user.
    The major advantage of LOREG is the automatic calculation, from the training data, of new useful quantities and knowledge that can be tested in a simple way.

2.2. Standard Information

    The following terminology is used to describe the initial information.
    A set M = {K₁, K₂,..., K_l} of objects, phenomena, or processes is considered that can be represented as the union of l non-intersecting subsets K₁, K₂,..., K_l called classes or patterns.
    Each object S is represented as a numerical row s₁,s₂,...,s_n of values of the features x₁,x₂,...,x_n, which characterize indirectly the membership of object S = {s₁,s₂,...,s_n} in class K_j.
    The initial information I₀ about classes is given as a sample of object descriptions S₁, S₂, ... , S_m containing representatives of all classes. It is presented as a numerical training table (table of templates) together with additional information consisting of the number of features n, the number of classes l, the distribution of training objects over classes (vector m = (m₀, m₁, m₂,..., m_l)), and the code denoting an absent feature value (some integer value r) in object descriptions. Here m₀ = 0, m₁ is the number of training objects in the first class, m₂ is the number of training objects in the first and second classes together, and so on. The total number of training objects is m_l.

2.3. Logical Regularities

The predicate P_j(S) is called a logical regularity for class K_j if the following conditions are satisfied:

1) P_j(S_i) = 1 for some S_i of class K_j,
2) P_j(S_i) = 0 for all training objects S_i that do not belong to class K_j,
3) f(P_j) = max, where f is some optimality criterion.

    The optimality criterion in LOREG is the functional f(P_j) = <the number of training objects S_i from class K_j such that P_j(S_i) = 1>.
    The predicate P_j(S) is called a partial logical regularity for class K_j if conditions 1) and 3) are satisfied.
    The logical regularities determined in LOREG are of the following type: P_j(S) = & (s_it - e_it <= s_t <= s_it + e_it), where S = {s₁,s₂,...,s_n} is the object being tested, the conjunction is taken over some feature subspace, and the training object S_i = {s_i1,s_i2,...,s_in} belongs to class K_j. The automatically calculated vector E_i = {e_i1,e_i2,...,e_in} characterizes the neighbourhood of object S_i for class K_j in the feature description space.
    The geometrical interpretation of this predicate is a hyperparallelepiped in some feature description subspace with object S_i as its central element. For a logical regularity this hyperparallelepiped contains training objects of the same class only, and an optimality condition is additionally satisfied. For a partial logical regularity the hyperparallelepiped contains training objects predominantly from the same class. Such neighbourhoods are called optimal feature neighbourhoods.
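
As an illustration only, the interval predicate and the criterion f can be sketched in Python. The function names, the argument layout, and the sample data below are hypothetical and are not part of LOREG; missing feature values are not handled in this sketch.

```python
from typing import Sequence, Tuple


def regularity_holds(center: Sequence[float], eps: Sequence[float],
                     features: Sequence[int], obj: Sequence[float]) -> bool:
    """Conjunction of the interval tests s_it - e_it <= s_t <= s_it + e_it
    over the chosen feature subset (0-based feature indexes)."""
    return all(center[t] - eps[t] <= obj[t] <= center[t] + eps[t]
               for t in features)


def coverage(center, eps, features, training, labels, cls) -> Tuple[int, int]:
    """Return (f, violations): the number of covered training objects of
    class `cls` (the criterion f) and the number of covered objects of
    other classes (violations of the second condition)."""
    own = foreign = 0
    for obj, label in zip(training, labels):
        if regularity_holds(center, eps, features, obj):
            if label == cls:
                own += 1
            else:
                foreign += 1
    return own, foreign


# Hypothetical training table: two classes, two features.
training = [(1.0, 1.0), (1.5, 1.2), (5.0, 5.0), (5.5, 4.0)]
labels = [1, 1, 2, 2]

# Neighbourhood centred on the first object, using both features.
f_value, violations = coverage(training[0], (1.0, 0.5), (0, 1),
                               training, labels, cls=1)
print(f_value, violations)
```

In this toy table the neighbourhood covers both objects of class 1 and no object of class 2, so the predicate satisfies the first two conditions; whether it is also optimal depends on the search procedure, which is not reproduced here.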

2.4. Logical Descriptions

    The disjunctive form D_j(S) = P_j¹(S) V P_j²(S) V ... V P_j^h(S) is called a logical description of class K_j if the disjunction is taken over the set of logical regularities P_jⁱ(S) for class K_j. Evidently D_j(S_i) = 1 for all training objects S_i from class K_j, and D_j(S_i) = 0 for all training objects S_i that do not belong to class K_j. So D_j(S) can be considered as the characteristic function of class K_j.
    The disjunctive form D_j(S) with the minimal number of conjunctions P_jⁱ(S) is called the shortest description of class K_j.
    The disjunctive form D_j(S) containing the minimal number of variables is called the minimal description of class K_j.

2.5. Pattern Recognition Algorithms

    The pattern recognition problem is formulated as the calculation of predicates a_j(S), j=1,2, ... , l, for an object S to be recognized.
    The condition a_j(S)=1 denotes that the pattern recognition algorithm assigns object S to class K_j.
    The condition a_j(S)=0 denotes that, in the opinion of the pattern recognition algorithm, object S does not belong to class K_j.
    A row a(S) = (a₁(S), a₂(S), ... , a_l(S)) containing a single unit denotes a single-valued solution of the pattern recognition problem for object S.
    If the row a(S) contains two or more units, the solution is many-valued: the algorithm indicates several classes that may contain object S.
    The zero row a(S) is interpreted as the absence of similarity of S to any class.
    The pattern recognition algorithms presented in LOREG are based on the determination of logical regularities and the use of voting procedures.
    The basic recognition scheme is a successive three-step procedure.
    1. Logical regularity determination.
    From the training information I₀, the sets {P_jⁱ(S)} of logical regularities for classes K_j are determined.
    2. Proximity measure calculation.
    The proximity measure G_j = Sum(b_i P_jⁱ(S)) of object S to class K_j is calculated. The coefficients b_i can be introduced in various ways; they define the mode of voting. The particular ways of determining them are considered in chapter 4. The measure is interpreted as a weighted sum of votes for class K_j.
    3. Recognition.
    The object S is assigned to class K_r if G_r = max {G_j , j=1, 2, ... , l}.
    The first step is the training step. The other two steps are recognition steps.
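
Steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration, not the LOREG implementation: the tuple layout of a regularity, the function names, and the sample model are hypothetical. Two behaviours are assumptions of the sketch: an object with all proximity measures equal to zero is reported as class 0 (no similarity to any class), and ties are resolved in favour of the lowest class number.

```python
from typing import Dict, List, Sequence, Tuple

# One regularity: (centre object, thresholds e, 0-based feature indexes, weight b).
Regularity = Tuple[Sequence[float], Sequence[float], Sequence[int], float]


def holds(reg: Regularity, obj: Sequence[float]) -> bool:
    """Interval predicate of one regularity evaluated on object `obj`."""
    center, eps, features, _weight = reg
    return all(center[t] - eps[t] <= obj[t] <= center[t] + eps[t]
               for t in features)


def proximity(regs: List[Regularity], obj: Sequence[float]) -> float:
    """G_j: the weighted sum of votes of the regularities of one class."""
    return sum(reg[3] for reg in regs if holds(reg, obj))


def recognise(model: Dict[int, List[Regularity]], obj: Sequence[float]):
    """Return (class number, proximity measures); class 0 means 'no class'."""
    scores = {cls: proximity(regs, obj) for cls, regs in model.items()}
    best = max(sorted(scores), key=lambda cls: scores[cls])
    if scores[best] == 0:
        return 0, scores
    return best, scores


# Hypothetical model with two classes.
model = {
    1: [((1.0, 1.0), (1.0, 0.5), (0, 1), 2.0),
        ((1.5, 1.2), (0.6, 0.3), (0,), 1.0)],
    2: [((5.0, 5.0), (1.0, 1.5), (0, 1), 2.0)],
}

label, scores = recognise(model, (1.2, 1.1))
print(label, scores)
```

For the object (1.2, 1.1) both regularities of class 1 vote and the regularity of class 2 does not, so the object is assigned to class 1.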

2.6. New information for features, objects and classes

    The sets {P_jⁱ(S)} of logical regularities for classes K_j can be used not only for the construction of logical descriptions of classes, the estimation of proximity measures of objects to classes, and the solution of pattern recognition problems, but also for the determination of some important informational characteristics of classes, training objects, and features.
    The parameter p_i = N_i/N is called the informativity measure (or weight) of feature x_i, where N_i is the number of logical regularities containing feature x_i and N is the total number of logical regularities. Informativity measures for objects, associated with particular classes, can be introduced in a similar way.
    Other important characteristics of training objects and classes are the average length of logical regularities for objects (for classes) and the number of logical regularities for objects (for classes).
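
The weight p_i = N_i/N can be computed directly from the feature subsets of the found regularities. The sketch below assumes, for illustration only, that each regularity is given as the collection of 0-based indexes of the features it contains; the function name and the sample data are hypothetical.

```python
from typing import Iterable, List


def feature_weights(regularities: List[Iterable[int]], n_features: int) -> List[float]:
    """p_i = N_i / N, where N_i counts the regularities that contain feature i."""
    total = len(regularities)
    if total == 0:
        raise ValueError("at least one regularity is required")
    counts = [0] * n_features
    for features in regularities:
        for t in set(features):      # count each feature once per regularity
            counts[t] += 1
    return [count / total for count in counts]


# Four hypothetical regularities over three features.
weights = feature_weights([(0, 1), (0,), (1, 2), (0, 2)], n_features=3)
print(weights)
```

Here feature 0 takes part in three of the four regularities, so its weight is 0.75, while features 1 and 2 each have weight 0.5.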

2.7. Collective solutions

The collective rule recognition algorithm builds a collective classification by taking into account the recognition results of individual rules weighted by given values. As a result, the relative proximity measure of a given object to each class is calculated. The object is assigned to the class with the maximal relative proximity measure.

3. FORMAT OF INITIAL INFORMATION

The initial information I₀ about classes (see 2.2) must be presented as a Standard Format Table in the following form:

Header: N L m0 m1 ... mL b

N             - number of features (columns)
L             - number of classes
m0=0
m1, ... , mL  - end of class object (row) indexes (mL=M)
b             - numeric value denoting blanks

Example:
3 2 0 2 5 -1 - header

1 4.5 -1
2.3 8.14 2.15 - the end of the 1st class

1.1   2.9    7
2.5   8.4   6.86
3.21 4.46 12          - the end of the 2nd class

Transformation of the table for the purpose of investigating the recognition problem can be performed by the TRAN application included in LOREG.
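
A table in this format can be read with a few lines of Python. The sketch below is not part of LOREG; the names StandardTable and parse_standard_table are hypothetical. It assumes that the file contains only whitespace-separated numbers, so the annotations shown in the example ("- header", "- the end of the 1st class") are taken to be comments of this manual and not part of the file.

```python
from typing import List, NamedTuple, Optional


class StandardTable(NamedTuple):
    n_features: int
    n_classes: int
    bounds: List[int]                   # m0..mL, cumulative row counts
    rows: List[List[Optional[float]]]   # None marks a blank value
    labels: List[int]                   # class number (1..L) of each row


def parse_standard_table(text: str) -> StandardTable:
    tokens = text.split()
    n, l = int(tokens[0]), int(tokens[1])
    bounds = [int(t) for t in tokens[2:3 + l]]       # m0, m1, ..., mL
    blank = float(tokens[3 + l])                     # code b of a blank
    values = [float(t) for t in tokens[4 + l:]]
    if bounds[0] != 0 or bounds != sorted(bounds):
        raise ValueError("m0 must be 0 and m1..mL must not decrease")
    if len(values) != bounds[-1] * n:
        raise ValueError("the table body does not match the header")
    rows = [[None if v == blank else v for v in values[i * n:(i + 1) * n]]
            for i in range(bounds[-1])]
    labels = [j for j in range(1, l + 1)
              for _ in range(bounds[j - 1], bounds[j])]
    return StandardTable(n, l, bounds, rows, labels)


example = """3 2 0 2 5 -1
1 4.5 -1
2.3 8.14 2.15
1.1 2.9 7
2.5 8.4 6.86
3.21 4.46 12
"""

table = parse_standard_table(example)
print(table.labels)     # class of each row
print(table.rows[0])    # the third value of the first row is a blank
```

Note that a genuine feature value equal to the blank code cannot be distinguished from a blank, so the code b should be chosen outside the range of the data.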

4. LOGICAL REGULARITIES DETERMINATION AND PATTERN RECOGNITION PROBLEM SOLVING

4.1. Pattern recognition algorithms based upon voting over sets of logical regularities

    The pattern recognition algorithms based on voting over sets of logical regularities are constructed from the training data by the MCL program. The program has six control parameters: <Y₁, Y₂, Y₃, Y₄, Y₅, Y₆>. Their definitions follow.

    Y₁ = "Weights of neighbourhoods for features". The determination of logical regularities (optimal feature neighbourhoods) is based on the solution of special integer linear programming problems. The coefficients of the objective functional and of the constraint matrix are calculated from the initial information I₀.
    The value "constant" of parameter Y₁ corresponds to the choice of the functional F(z₁, z₂, ... , z_n) = Sum(z_i), where each z_i is 0 or 1. The value z_i = 1 in the optimal solution means that feature x_i participates in the found logical regularity (optimal feature neighbourhood); z_i = 0 means that it does not.
    The value "functional" of parameter Y₁ corresponds to the choice of a functional F of a more complex type that explicitly takes into account the results of comparing feature values of objects from the same and from different classes.

    Y₂ = "Neighbourhood size". Usually there is a continuum of feature neighbourhoods equivalent to the found optimal feature neighbourhood. Feature neighbourhoods are called equivalent if they contain the same subset of training objects.
    The values "Max", "Min", "Norm" of parameter Y₂ give geometrically similar optimal feature neighbourhoods of maximal, minimal, and intermediate size (with respect to the inclusion relation). The corresponding recognition algorithms realise "greedy", "careful", and "compromise" approaches in the voting procedures.

    Y₃ = "Vote method". There are various approaches to the choice of the parameters b_i in the calculation of the proximity measure G_j = Sum(b_i P_jⁱ(S)).
    The value "Proportional" of parameter Y₃ corresponds to the choice b_i = f(P_jⁱ) (see 2.3). With the value "Statistical", the parameters b_i are calculated by a statistical weighting procedure.

    Y₄ = "Affiliation of neighbourhood to class". The existence of optimal feature neighbourhoods becomes problematic when the training data are partially contradictory or incomplete. The obtained values f(P_j) can be small, the optimal feature neighbourhoods are then "statistically unjustified", and the recognition algorithms become unstable.
    The parameter 50<=Y₄<=100 is introduced so that MCL can be applied effectively when the classes are poorly separable on the training data (partial contradictions, incompleteness, large random noise, and the like).
    Values Y₄<100 usually correspond to partial logical regularities P_j(S). As a rule, the number of violations of condition 2) (see 2.3) increases as Y₄ decreases. So this parameter gives a lower bound for the proximity measure of a feature neighbourhood to its class or, in other words, the degree to which condition 2) is fulfilled on the training data under consideration.
    The value Y₄=100 corresponds to the case when the predicates P_j(S) are logical regularities.

    Y₅ = "Exactness". The parameter characterises the accuracy with which the integer linear programming problem, the most important step of the logical regularity search, is solved. Y₅ is a natural number from 1 to 5. The value 1 corresponds to the minimal accuracy of the solution (minimal accuracy does not mean "bad"), and the value 5 to the maximal accuracy. Naturally, as Y₅ increases, the running time of the program grows and the dimensions of the data that can be processed decrease.

    Y₆ = "Minimal representativeness". For each found logical regularity, the number of objects of its own class that satisfy the corresponding predicate is calculated. If the ratio (in per cent) of this number to the total number of objects in the class is less than Y₆, the regularity is excluded from further consideration and use.
    The ability to vary the control parameters is important for two reasons.
    First, the user can select the parameter values most appropriate to the particular practical data.
    Second, the user can construct various recognition algorithms in order to solve the recognition problem by a collective of algorithms.
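
The filter controlled by Y₆ is simple enough to sketch. The Python fragment below only illustrates the rule stated above; the function name and the sample numbers are hypothetical, and the per-regularity counts are assumed to be already known.

```python
from typing import List


def keep_representative(own_class_counts: List[int], class_size: int,
                        y6_percent: float) -> List[int]:
    """Return the indexes of the regularities that survive the Y6 filter.

    own_class_counts[i] is the number of objects of the regularity's own
    class that satisfy regularity i; a regularity is dropped when this
    number, as a percentage of the class size, is below y6_percent.
    """
    return [i for i, count in enumerate(own_class_counts)
            if 100.0 * count / class_size >= y6_percent]


# Three hypothetical regularities of a class with 10 training objects.
kept = keep_representative([1, 3, 5], class_size=10, y6_percent=30)
print(kept)
```

With Y₆ = 30 the first regularity, which covers only 10 per cent of its class, is excluded, and the other two are kept.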

4.2. MCL application

The MCL program is used to construct optimal pattern recognition algorithms belonging to the logical regularities model and to recognize new objects. The program also presents, in a form convenient for the user, some practically useful information about the structure of the constructed algorithm.

Program initialization. Project file.

    Windows 3.1 is required to start the program. After start-up, the main program window appears on the screen. Before beginning calculations the user must create a project file or open an existing one with the command Create (Open) of the Project menu. The project file contains the names of the data files used for calculations and the values of the algorithm parameters.
    After creating or opening the project file, the user opens the dialogue Work with project by the command Work from the Project menu. Here it is possible to choose the values of the algorithm parameters, to choose the data file names, to start the training procedure, to start the recognition procedure, and to load the files for print (see the description of data files) into the MCL editor.
    The values of the algorithm parameters can be changed in the dialogue Parameters (command Parameters from the dialogue Work with project).
    After finishing work with the dialogue Parameters, the user can save the changes in the project file (command Save) or cancel them (command Cancel).
    The command Change from the dialogue Work with project is used to change the data file under consideration.

Training procedure.

In the training mode (command Training from the dialogue Work with project) the program constructs, from the table of templates and in accordance with the control parameters, the optimal recognition algorithm belonging to the logical regularities model. The found algorithm is written to a file whose name coincides with the name of the project file and whose extension is .mcl. The results of training are saved as a text file for print.
The most important results of training are written to a file of special format (the "Video file"), which is used by the program Loreg Visual.

The recognition procedure.

The recognition mode solves the task of recognition (classification) of objects from the table of objects to be recognized by applying the recognition algorithm found in the training mode. The classification results are written to the file for print (recognition) and to the Video file.

Examination of the results.

The results of the program's work are saved in the files for print and in the Video file. Files for print are plain text files. They can be examined in any text editor or in MCL's own editor. To load a file for print into the MCL editor, use the command Load from the dialogue Work with project or the command Open from the File menu. The Video file is processed by the special visualization program (see chapter 5, Loreg Visual).
The files containing tables of templates and tables of objects to be recognized must be in the standard format (see chapter 3).

File for print (training).

The file created at the training stage contains a heading with information about the project, followed by various useful information.

Informational characteristics of objects:

the number of logical regularities found for object;
the mean length of the logical regularity;
length, weight, share and composition of logical regularity, which is most typical for given object;

Informational characteristics of classes:

the number of logical regularities found for class;
the mean number of logical regularities found for the class object;
the mean length of logical regularity;
the informational measures (weights) of features;
the mean e-thresholds values for class regularities;
the shortest and minimal logical descriptions.

Informational characteristics of task:

the number of logical regularities;
the mean number of regularities per object;
the mean length of regularity;
the informational measures (weights) of features;
the mean numbers of e-thresholds.

File for print (recognition).

The file is created at the recognition stage. It contains a heading with information about the project and the recognition results:

the object number;
the number of the class to which the object is classified;
the number of satisfied logical regularities;
the found similarity measures of objects to every class.

The file also contains information about the agreement between the initial and the obtained classifications.

4.3. Collective Rule Recognition Algorithm.

The collective rule recognition algorithm builds a collective classification by taking into account the recognition results of individual rules weighted by given values. As a result, the relative proximity measure G_i of a given object to each class (i=1, 2, ... , l is the class number) is calculated. The object is assigned to class K_c if the corresponding relative proximity measure is maximal.
The recognition threshold D (0<D<=1) sets the caution level of recognition. If G_i/G_c > D for some class K_i=/=K_c, the object is assigned to class K₀ (not recognized). The lower the value of D, the more cautious the classifier.
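
The decision rule with the threshold D can be sketched in Python. The manual does not specify how the individual results are combined into G_i, so the sketch assumes a weighted average of the proximity measures produced by the individual rules; this formula, the function name, and the sample numbers are hypothetical. An object whose maximal measure is zero is also treated as not recognized, which is a further assumption.

```python
from typing import List, Sequence, Tuple


def collective_decision(rule_scores: Sequence[Sequence[float]],
                        rule_weights: Sequence[float],
                        threshold: float) -> Tuple[int, List[float]]:
    """Return (class number, G); class 0 means 'not recognized'.

    rule_scores[r][i] is the proximity measure of the object to class i+1
    according to individual rule r; rule_weights[r] >= 0 is its weight.
    """
    n_classes = len(rule_scores[0])
    total_weight = sum(rule_weights)
    g = [sum(w * scores[i] for w, scores in zip(rule_weights, rule_scores))
         / total_weight for i in range(n_classes)]
    c = max(range(n_classes), key=lambda i: g[i])
    if g[c] <= 0:
        return 0, g
    for i in range(n_classes):
        # A competing class that is too close to the winner blocks the decision.
        if i != c and g[i] / g[c] > threshold:
            return 0, g
    return c + 1, g


# Two hypothetical individual rules with equal weights, two classes.
scores = [[0.8, 0.2], [0.6, 0.4]]
print(collective_decision(scores, [1.0, 1.0], threshold=0.5)[0])
print(collective_decision(scores, [1.0, 1.0], threshold=0.4)[0])
```

In this example the ratio G₂/G₁ is about 0.43, so the object is assigned to class 1 with D = 0.5 and is left unrecognized with the more cautious D = 0.4.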

Input and Output

Input
    1. VIDEO.* files in the working directory, containing the training and recognition results of individual rules and the feature weights. The application relies on the files having the correct internal format and all belonging to the same problem. Otherwise, the behaviour of the application is unpredictable.
    2. Parameters (see below).
Output
    1. VIDEO.COL file in the working directory, containing the collective rule training and recognition results and the feature weights. The old version of the file is deleted when the application starts.
    2. Listing file - a text file in the working directory containing the collective rule recognition results. On successful completion the application opens this file in MS Windows NotePad, where you can view, edit, or print it.

Parameters


Listing File Name	The name of the text file where collective recognition results are written to (input).
Threshold	Sets the caution level of recognition (0<th<=1). The lower the threshold, the more cautious the classifier (input).
Individual Rules	List of individual recognition rules for which recognition results are available (output).
Individual Rule Weights	Relative weights of individual recognition rules (w>=0) (input).

4.4. Table Transformation (Tran application)

The Table transformation (Tran) application provides some simple operations on Standard Format Tables. By giving ranges for New Classes and New Features, the user constructs a new view of the recognition problem under consideration.

Parameters

Input Table	Source for transformation.
Output Table	Result of transformation.
New Classes	Defines new classes (1, 2, ...) in terms of input table row indexes (leave empty if no changes).
New Features	Defines new feature subset in terms of input table column indexes (leave empty if no changes).

Standard Format Table

Header: N L m0 m1 ... mL b

N             - number of features (columns)
L             - number of classes
m0=0
m1, ... , mL  - end of class object (row) indexes (mL=M)
b             - numeric value denoting blanks

Example:

3 2 0 2 5 -1 - header

1 4.5 -1
2.3 8.14 2.15 - the end of the 1st class

1.1   2.9   7
2.5   8.4   6.86
3.21 4.46 12             - the end of the 2nd class

New Classes

Class Number    Ranges

1                           Lo11   Up11
1                           Lo12    Up12
.........................................................
1                           Lo1k1 Up1k1
2                           Lo21    Up21
.........................................................
(Up >= Lo)

New Features

Ranges

Lo1 Up1
Lo2 Up2
......................
(Up >= Lo)
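
The effect of the two range lists can be illustrated with a short Python sketch. It is not the Tran implementation: the function name and data layout are hypothetical, and the sketch assumes that Lo and Up are 1-based inclusive indexes of rows (for New Classes) and columns (for New Features) of the input table, and that an empty feature list keeps all columns.

```python
from typing import List, Sequence, Tuple


def transform(rows: Sequence[Sequence[float]],
              new_classes: Sequence[Tuple[int, int, int]],
              new_features: Sequence[Tuple[int, int]]):
    """Build the rows and the vector m0..mL of the output table.

    new_classes holds (class number, Lo, Up) row ranges;
    new_features holds (Lo, Up) column ranges.
    """
    if new_features:
        cols = [c for lo, up in new_features for c in range(lo - 1, up)]
    else:
        cols = list(range(len(rows[0])))
    out_rows: List[List[float]] = []
    bounds = [0]
    for cls in sorted({c for c, _lo, _up in new_classes}):
        for c, lo, up in new_classes:
            if c == cls:
                out_rows += [[rows[r][k] for k in cols]
                             for r in range(lo - 1, up)]
        bounds.append(len(out_rows))     # end-of-class row index
    return out_rows, bounds


# Body of the example table above (-1 denotes a blank).
rows = [[1, 4.5, -1], [2.3, 8.14, 2.15],
        [1.1, 2.9, 7], [2.5, 8.4, 6.86], [3.21, 4.46, 12]]

# New class 1 = rows 1 and 4-5, new class 2 = rows 2-3; keep columns 1 and 3.
new_rows, m = transform(rows, [(1, 1, 1), (1, 4, 5), (2, 2, 3)],
                        [(1, 1), (3, 3)])
print(m)
print(new_rows)
```

The resulting vector m = (0, 3, 5) together with the number of selected columns and the blank code forms the header of the output table.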

5. VISUALISATION

This chapter explains how to use Loreg Visual to analyse training information and recognition results.

5.1. Why is it necessary?

Training and recognition produce a large amount of useful information in the form of files for print. However, the user is usually interested in observing the most important results. This is the task of the program Loreg Visual.
Loreg Visual can be used for visual analysis of training information, class descriptions, logical regularities, and recognition results against the background of the training data in different feature subspaces. It lets you study your task as thoroughly as possible.

5.2. How to start Loreg Visual?

To use this application you need information organized as a standard table or, preferably, a video-file. Such a file is produced by the MCL application.
Loreg Visual is a part of LOREG and can be started in two ways:

start LOREG and choose Visual / Loreg Visual from the menu;
choose the Loreg Visual icon in the Loreg program group.

The start window of Loreg Visual appears.

To load your information for visual analysis, choose File / Training Table. The Open dialog box appears. Choose one of the following file types from the box <List types of file>:

Video-file (Video.*) - file with training information, logical class description and recognition results;
Table (*.tab) - file with only training information;
Any file (*.*) - file with only training information and any extension.

ADVICE
The most interesting analysis is possible with a video-file, so run the MCL application before starting Loreg Visual.

Choose a file from the file list. If the file is correct, the main window of Loreg Visual appears.

5.3. What is displayed and how?

The main window of Loreg Visual has the following components:

header;
visual area;
menu;
tools panel.

An example 1 of the tools panel.

The main component for visual analysis is the visual area. It is intended for the representation of different types of information. The following types of information are displayed.

Features
One pair of features is active at any time. One of the features is displayed on the X axis, the other on the Y axis. The numbers of the active features are displayed as X_i near the axes. The displayed interval of each feature is the range of its values over the set of training objects. The origin corresponds to the pair of minimal values of the active features.

Objects
There are three types of objects:

training objects;
recognizing objects (from training table);
new objects (for interactive recognition).

    Training objects
    Every training object belongs to one of the classes. The projections of the objects onto the plane of the active feature pair are displayed as circles of different colours for different classes or as figures, each figure corresponding to some class. To change the display mode of the objects, choose Object / Mode from the menu.
    If an object is inside the active logical regularity, a white circle of smaller radius is displayed inside it.
    One of the training objects may be active. The active object blinks.

    Recognizing objects
    A recognizing object may belong to one of the classes if recognition has been performed previously and the results have been saved in the video-file.
    Recognizing objects are displayed as small rectangles of different colours for different classes or as question marks. If a recognizing object has not been recognized, its colour is grey. The display mode is switched by choosing Object / Mode from the menu.
    If an object is inside the active logical regularity, a white circle of smaller radius is displayed inside it.
    One of the recognizing objects may be active. The active object blinks.

New objects
Loreg Visual can recognize a new object in interactive mode. To do so, use the menu item Object / New. A dialogue appears. Input the feature values. If the value of some feature is unknown, the unknown-value code is used. Choose the button Recognize. After the calculation the following information appears: the class, a table of estimates, and a diagram of estimates for the recognized object. The radius of the circle depends on the number of logical regularities that took part in the recognition.

An example 2 of the tools panel.

Class logical descriptions
If you are viewing a video-file, one of the class descriptions (full, shortest or minimal) is active and one of the classes is active. To display the logical description of the class, choose Description / Output from the menu. The distribution of the logical regularities of the active class appears:

no colour means that there are no logical regularities in this area;
grey means that the area is covered by fewer than 1/3 of the regularities;
dark grey means that the area is covered by 1/3 to 2/3 of the regularities;
yellow means that the area is covered by more than 2/3 of the regularities.
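
The colour rule above can be sketched as a small function. This is only an illustration: the function name is hypothetical, and the treatment of the exact boundaries (a share of exactly 1/3 or 2/3 is shown as dark grey) is an assumption read from the wording "from 1/3 to 2/3".

```python
from fractions import Fraction


def coverage_colour(covering, total):
    """Return the display colour of an area of the feature plane.

    covering - number of regularities of the active class covering the area
    total    - number of regularities in the active class description (> 0)
    """
    if covering == 0:
        return None  # no colour: no regularities in this area
    share = Fraction(covering, total)  # exact arithmetic, no rounding issues
    if share < Fraction(1, 3):
        return "grey"
    if share <= Fraction(2, 3):  # assumption: both boundaries belong here
        return "dark grey"
    return "yellow"


for covering in (0, 2, 3, 6, 7):
    print(covering, coverage_colour(covering, 9))
```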

An example 3 of the tools panel.

    Logical regularities
    If you are viewing a video-file, one of the class descriptions (full, shortest or minimal) is active and one of the regularities is active.
    The active regularity is displayed as a rectangle if the displayed features are present in the regularity. The colour of the regularity corresponds to the colour of its class.
    If an object lies inside the active regularity, a smaller white circle is shown in the centre of its representation.
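
Since a logical regularity is a conjunction of interval conditions on feature values, the test "the object is inside the regularity" can be sketched as below. The data layout (a dictionary from 0-based feature index to left and right border), the closed intervals and the rule that an unknown value fails the condition are all assumptions made for this illustration; the regularity shown is hypothetical.

```python
def covers(regularity, obj):
    """Check whether an object satisfies every interval of a regularity.

    regularity - dict {feature_index: (left, right)}, 0-based indices
    obj        - list of feature values, None for an unknown value
    """
    for index, (left, right) in regularity.items():
        value = obj[index]
        # Assumption: an unknown value cannot satisfy an interval condition
        if value is None or not (left <= value <= right):
            return False
    return True


# Hypothetical regularity on the 2nd and 6th features
regularity = {1: (70, 90), 5: (80, 95)}
print(covers(regularity, [17, 77, 33, 43, 38, 88, 95]))    # inside
print(covers(regularity, [11, 15, 23, 12, 12, 19, None]))  # outside
```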

5.4. How to change active items

This chapter explains how to change the active features, object, logical regularity and class description that are displayed in the visual area of the main window of Loreg Visual.

Features
To change the active pair of features, use one of the following methods:

press the key X or the key Y. The index of the active feature increases by 1. If the active feature is the last one, the next active feature is feature 1;
choose Feature / on Axes from the menu. A dialogue box appears. Change the X and Y features and choose the button <OK>;
if you are viewing a video-file with information about feature weights, choose Feature / on Weight from the menu. The plane of the two most informative features and the button <Continue> appear automatically. Choose the button <Continue> to see the pairs of features (1,3), (2,3), (1,4), etc., ranked by informativity;
if an active logical regularity exists, choose Feature / on Regularity from the menu. Two features present in the regularity and the button <Continue> appear automatically. Choose this button to see the pairs of features (1,3), (2,3), (1,4), etc.

ADVICE
If you are analysing a logical regularity, use method 4 (Feature / on Regularity). It lets you examine the pairs of features present in the active logical regularity. If you are analysing a class description, use method 3 (Feature / on Weight). In that case you examine the pairs of features in order of informativity.

Training objects
To change the active training object, use one of the following methods:

click the object of interest with the mouse;
choose Object / Next or Object / Previous from the menu;
choose Object / Find and info from the menu.

Recognizing objects
To change the active recognizing object, use one of the following methods:

click the object of interest with the mouse;
choose Object / Recognizing from the menu.

    Class logical descriptions
    To change the type of class description, choose Description / (Full, Shortest or Minimal) from the menu. The active class description is marked with a check. When you change the active class description, the active logical regularity changes too.
    To display the logical description of the class, choose Description / Output from the menu.
    To change the active class, choose Regularity / Next class from the menu.

Logical regularities
To change the active logical regularity, use one of the following methods:

choose Regularity / Next or Regularity / Previous from the menu. The next or previous regularity of the active class becomes active;
choose Regularity / Next class from the menu. The next class and the first regularity of that class become active;
choose Regularity / Find and info from the menu.

5.5. How to get information

This chapter explains how to get information about the features, objects, logical regularities and class descriptions that are displayed in the visual area of the main window of Loreg Visual.
Information dialogue boxes provide the various kinds of information. To open such a dialogue box, use the menu. To return to the main mode, use the button <Exit>.

Common information
To get general information about the task, choose Information / Common from the menu. The file names of the training and recognizing tables and the numbers of classes, training objects and features appear.

Information about training object
To get information about a training object, choose Object / Find and info from the menu. The following information appears:

number of the object;
number of the class of the object;
position of the object relative to the active logical regularity (inside or outside);
number of logical regularities of its own class and of other classes that cover the object;
values of the features (a missing value is displayed as '-').

This dialogue box serves for displaying and changing the active training object. To change the active object, choose the button <Next> or <Previous>. When you press the button <Exit>, the last displayed object becomes active.

Information about recognizing object
To get information about a recognizing object, choose Object / Recognizing from the menu. The following information appears:

number of the object;
class of the object (if the object has not been recognized, a question mark appears);
position of the object relative to the active logical regularity (inside or outside);
numbers of logical regularities covering the object;
values of the features (a missing value is displayed as '-').

This dialogue box serves for displaying and changing the active recognizing object. To change the active object, choose the button <Next> or <Previous>. When you press the button <Exit>, the last displayed object becomes active.

Information about classes
To get information about classes, choose Information / Classes from the menu. Information about the distribution of the objects over the classes appears.

Information about features
To get information about features, choose Information / Features from the menu. The minimal and maximal values and the informativity of the features appear. The informativity of the features can also be displayed as a diagram: choose the button <Diagram>. You can change the order (by number or by informativity). The green line corresponds to the unit level of informativity.

Information about class description
To get information about class descriptions, choose Information / Description from the menu. The number of logical regularities in each description appears.

    Information about logical regularity
    To get information about a logical regularity, choose Regularity / Find and Info from the menu.
    The following information appears:

number of the regularity;
class of the regularity;
numbers of objects of each class that satisfy the regularity;
values of the features (a missing value is displayed as '-');
numbers of the features and the left and right borders that define the regularity.

This dialogue box serves for displaying and changing the active regularity and the active class. To change them, choose the button <Next>, <Previous> or <Next Class>. After you press the button <Exit>, the last displayed regularity and class become active.

5.6. Menu commands

This chapter explains the purpose of the menu commands. The description is structured as follows: each menu item is followed by the items of its submenu, with descriptions and references.

    File
    Training table - choose the file with information for visual analysis, see 5.2;
    Recognition table - choose the file with recognizing information. Use this item if you did not use the MCL application and have no video-file;
    Exit - exit from Loreg Visual.

    Regularity
    Next - make the next regularity of the active class active, see 5.4;
    Previous - make the previous regularity of the active class active, see 5.4;
    Next class - make the next class active, with its first regularity as the active regularity, see 5.4;
    Find and info - get information about a logical regularity, see 5.5.

    Object
    Next - make the next object active, see 5.4;
    Previous - make the previous object active, see 5.4;
    Mode - change the display mode of the objects (circles/figures), see 5.3;
    Recognizing - get information about a recognizing object, see 5.5;
    Find and info - get information about a training object, see 5.5.

    Feature
    on Axes - define the active pair of features, see 5.4;
    on Weights - view the features in order of their informativity, see 5.4;
    on Regularity - view the features that are present in the active regularity, see 5.4;
    Find and info - get information about features, see 5.5.

    Description
    Full, Shortest, Minimal - choose the active class description, see 5.4;
    Output - change the display mode of the class description (single regularity / class description), see 5.3.

    Information
    Common, Classes, Regularities, Descriptions, Features - get information, see 5.5.

    Help
    Contents - open the help contents, see 5.7;
    Language - change the language of the interface, see 5.7;
    Tools Panel - switch the tools panel on or off;
    About - get information about Loreg Visual.

5.7. Getting help

If you need help with any dialogue box or menu item of Loreg Visual, online Help is available at any time by pressing the <F1> key. Help also contains the main definitions (such as object, feature, video-file, etc.), a description of the working modes, and answers to the main questions. To open the contents of the online Help, choose Help / Contents from the menu.
Loreg Visual allows you to change the language of the interface. To change the language, choose Help / Language from the menu. The menu items and dialogue box texts are stored in the files lang1.txt and lang2.txt, which makes it possible to translate the interface into any language.

6. LOREG IN DECISION OF PRACTICAL PROBLEMS (EXAMPLE OF SOME MEDICINE DIAGNOSTICAL PROBLEM)

    The proceedings of the 9th Scandinavian conference (Uppsala, Sweden, June 6-9, 1995) presented preliminary results on the problem of melanoma recognition¹ from 32 features, of which the first 17 describe the geometrical form of the tumour and the last 15 its radiological characteristics. The initial information was a sample of numerical rows, each of which is a 32-dimensional description of either a malignant lesion (class 1), a benign formation (class 3), or an "intermediate" dysplastic object (class 2). The melanoma recognition problem consists in automatically assigning a row of 32 numbers to one of these three classes. The initial information was randomly split into two tables, each including representatives of all classes: the training table (17 objects of the first class, 20 of the second and 20 of the third) and the test table (12, 10 and 10 objects of the respective classes).
    Our task is therefore to investigate the training table TLMEL.TAB with the MCL application, to solve the recognition problem for the rows of the table TRMEL.TAB, including the construction of individual and collective decision rules, and to analyse the results obtained. We use no assumptions about, or knowledge of, the subject matter of the problem as a problem of medical diagnostics; we merely process the given numerical tables.
    We begin the computation by running the MCL application.
    We create a project file m1.mcp, using the options <PROJECT> and <CREATE> of the main menu of the program.
    Through the options <PROJECT> and <WORK> we set the file names and parameter values: tlmel.tab - reference table (training table), trmel.tab - table of objects to be recognized, m1.inf - file with the results of the analysis of the training information, m1.ans - file of recognition results, video.m1 - special-format file with the main results for visualization. We choose the parameters in option <CHOOSE>: "Weight of neighbourhoods for features" (Y₁) = "functional", "the size of a neighbourhood" (Y₂) = "Max", "A method of voting" (Y₃) = "proportional", "Affiliation of neighbourhood to a class" (Y₄) = 100, "Exactness" (Y₅) = 1, "Minimal representativity" (Y₆) = 3.
    After the commands <TRAINING> and <RECOGNITION> have been executed in sequence, the files of training and recognition results can be viewed with the command <LOADING> (or option <FILE> of the main menu of the program MCL). For each sample the files give the number of logical regularities (number of features in their record), the length of the best regularity, its weight (the number of samples from the same class satisfying the best regularity) and the share of these samples in the total number of samples. Finally, the best logical regularities are presented: feature numbers and the intervals of their variation.
    For each new applied problem we do not know a priori which control parameters are best, so it is worth running a series of calculations with various parameter values. We present the results of recognizing the test information with various algorithms of voting over logical regularities. The table below gives the results of 13 calculations: the values of the control parameters and the percentage of true answers on the test information:

N of calc.	Neighb. weights	Neighb. size	Voting method	Affiliat. to class	Min. repr.	Exactness	% of true answers

1	func.	max	prop.	100	1	3	68.7
2	func.	max	prop.	100	1	1	71.8
3	func.	max	stat.	100	1	1	59.3
4	func.	min	prop.	100	1	1	71.8
5	func.	min	prop.	80	1	1	75.0
6	func.	min	stat.	80	10	1	65.6
7	func.	min	prop.	80	10	1	65.6
8	func.	min	stat.	80	20	1	71.8
9	func.	min	stat.	80	30	1	68.7
10	func.	min	stat.	80	40	1	65.6
11	func.	min	stat.	100	20	1	71.8
12	const.	norm.	stat.	100	20	1	53.1
13	const.	norm.	prop.	100	20	1	59.3
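
The test table holds 12 + 10 + 10 = 32 objects, so each percentage in the last column appears to correspond to a whole number of correct answers out of 32, truncated to one decimal place. This reading is an inference from the numbers and is not stated explicitly in the text; the short check below, with hypothetical function names, shows that it is consistent with all 13 rows.

```python
import math

TEST_OBJECTS = 12 + 10 + 10  # sizes of the three classes in the test table

# Last column of the table, calculations 1..13
reported = [68.7, 71.8, 59.3, 71.8, 75.0, 65.6, 65.6,
            71.8, 68.7, 65.6, 71.8, 53.1, 59.3]


def correct_count(percent, total=TEST_OBJECTS):
    """Nearest whole number of correct answers for a reported percentage."""
    return round(percent / 100 * total)


def truncated_percent(correct, total=TEST_OBJECTS):
    """Percentage of correct answers, truncated to one decimal place."""
    return math.floor(correct * 1000 / total) / 10


counts = [correct_count(p) for p in reported]
print(counts)
print(all(truncated_percent(c) == p for c, p in zip(counts, reported)))
```

Under this reading the best calculation (N 5) corresponds to 24 correct answers out of 32 and the worst (N 12) to 17 out of 32.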

    When recognizing any unknown new objects, a natural question arises: how accurate is the decision? The table shows that accuracy can vary within rather wide limits. The system LOREG makes it possible to assess recognition results from the results of training (length of logical regularities, number of informative features, estimates (votes) of objects, etc.). For example, too small values of the estimates, of the lengths of regularities, or of the number of informative features usually indicate poor-quality training information, an essential difference between the recognition data and the training data, or an unsuccessful choice of parameter values for the MCL application. In that case it is recommended to repeat the training process with other parameter values (for example, in accordance with calculation N 1 or N 2). The preferable parameter choices usually become clear after a little experience of work with the system.
    Moreover, stable results can be obtained automatically with the adjusting module COLW: several calculations are carried out with various parameter values of the program MCL, and then a collective decision rule is computed by the module COLW. In that case the errors of the individual calculations usually "absorb" one another, and as a rule we obtain the best decision or one close to the best.
    The collective decision constructed on the basis of all 13 algorithms mentioned above gave 71.8 % of true answers, i.e. only slightly below the best of them. We now select those algorithms, and their decisions, that gave less than 70 % of true answers. These eight algorithms are marked by an information line in Fig. 12. The collective decision constructed on their basis gave 68.7 % of true answers and matched the best decisions in that list.
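
The idea that the errors of individual algorithms "absorb" one another can be illustrated with a plain majority vote. This is only a stand-in: the rule actually used by the module COLW is not described in this section, and the function name, the use of None for a refusal to classify and the handling of ties are assumptions made for the sketch.

```python
from collections import Counter


def collective_decision(answers):
    """Combine the class labels given by several algorithms for one object.

    answers - list of class numbers, None where an algorithm refused
    Returns the majority class, or None on a tie or if nobody answered.
    """
    votes = Counter(a for a in answers if a is not None)
    if not votes:
        return None
    ranked = votes.most_common()  # sorted by number of votes, descending
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumption: a tie is treated as a refusal
    return ranked[0][0]


# Hypothetical answers of five algorithms for one object
print(collective_decision([1, 1, 3, 2, 1]))
```

In this hypothetical example two of the five algorithms are wrong, yet the collective answer is still class 1.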

7. HOW TO DECIDE RECOGNIZING TASK - FIRST STEPS

We consider a simple pattern recognition example as an illustration. What are the minimal steps needed to solve a recognition task?

7.1. Original information

Consider a simple example with four classes, named "Normal", "Easy", "Hard" and "Catastrophe". The objects to be recognized are described by the following 7 features: "width", "depth", "colour", "weight", "temperature", "length", "size". We have the following 3 files in text format.

1. table_l.opi

#CLASSES
Normal
Easy
Hard
Catastrophy
#FEATURES
width
depth
color
weight
temperature
length
size

2. table_l.tab

7 4 0 3 6 9 12 -1

11 15 23 12 12 19 -1
23 36 23 17 18 26 34
19 42 22 31 15 73 11

17 77 33 43 38 88 95
15 75 38 -1 64 86 75
10 86 30 21 89 90 20

55 34 34 18 70 59 23
59 37 38 26 72 62 36
70 36 37 -1 81 70 40

10 45 56 92 32 66 40
17 42 59 38 23 67 43
14 -1 56 38 27 69 48

3. table_r.tab

7 4 0 1 2 3 4 -1

15 37 18 22 -1 -1 34
16 77 39 28 50 99 80
60 36 -1 27 77 60 35
15 40 51 40 25 60 50

Table_l.opi contains the names of the features and classes. These names are labels for the user.
Table_l.tab (learning table) contains the learning (training) information about objects and classes. The first line is the header: number of features (7), number of classes (4), 0, number of the last object of the first class (3), number of the last object of the second class (6), ... , number of the last object of the last class (12), code for an unknown value (-1). The remaining lines give the feature descriptions of the objects.
Table_r.tab (recognizing table) has a format similar to that of the learning table and contains the descriptions of unknown objects (testing objects). If it is a testing table, you may describe the class boundaries in its header. In our example they are <0, 1, 2, 3, 4>.
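
The header layout described above can be made concrete with a short reader for the .tab text format. This is a minimal sketch written from the description in this section, not code from LOREG: the function name is hypothetical, blank lines are assumed to be mere separators, and unknown values are returned as None.

```python
def parse_tab(text):
    """Parse a table in the .tab text format described above.

    Returns (objects, labels): objects is a list of feature lists with
    None in place of the unknown-value code; labels holds the class
    number (1-based) of every object, derived from the header.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header = rows[0]
    n_features, n_classes = int(header[0]), int(header[1])
    # 0, last object of class 1, ..., last object of class n_classes
    bounds = [int(v) for v in header[2:3 + n_classes]]
    unknown = float(header[3 + n_classes])

    objects = []
    for row in rows[1:]:
        values = [float(v) for v in row]
        if len(values) != n_features:
            raise ValueError("wrong number of features in row: %r" % row)
        objects.append([None if v == unknown else v for v in values])

    labels = []
    for cls in range(1, n_classes + 1):
        labels += [cls] * (bounds[cls] - bounds[cls - 1])
    return objects, labels


TABLE_L = """7 4 0 3 6 9 12 -1

11 15 23 12 12 19 -1
23 36 23 17 18 26 34
19 42 22 31 15 73 11

17 77 33 43 38 88 95
15 75 38 -1 64 86 75
10 86 30 21 89 90 20

55 34 34 18 70 59 23
59 37 38 26 72 62 36
70 36 37 -1 81 70 40

10 45 56 92 32 66 40
17 42 59 38 23 67 43
14 -1 56 38 27 69 48
"""

objects, labels = parse_tab(TABLE_L)
print(len(objects), labels)
```

The same function reads table_r.tab, whose header <0, 1, 2, 3, 4> assigns one object to each of the four classes.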

7.2. Learning

Run MCL from the LOREG group. The main window of MCL appears.
Choose Project/Create. The dialogue "Create project FILE" appears, in which the user must enter a project name with the .mcp extension.
Let the name be m1.mcp. Press OK. Choose Project/Work.
Set the files with the learning and recognizing objects.
Run Training. The results of training are placed in the special file m1.mcl and the ASCII file m1.inf.

7.3. Recognition for objects from file

Press Recognize if you have a recognizing table (table_r.tab). The results are written to the ASCII file m1.ans.

For each object you can see the result of its recognition (second column). For example, object 1 is assigned to class number 1, and its estimates (affiliation measures) are 0.379 (for class 1), 0.000 (for class 2), 0.030 (for class 3) and 0.049 (for class 4). This means that the object is classified as an object of the first class.
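
In this example the assigned class is the one with the largest estimate. The sketch below shows that reading; choosing the maximum is an assumption that fits the numbers quoted here, and the decision rule actually applied by MCL may depend on the chosen voting parameters.

```python
def decide(estimates):
    """Return the 1-based number of the class with the largest estimate."""
    best = max(range(len(estimates)), key=lambda i: estimates[i])
    return best + 1


# Estimates of object 1 for classes 1..4, as quoted above
estimates = [0.379, 0.000, 0.030, 0.049]
print(decide(estimates))
```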

7.4. Recognition in on-line mode

You can enter feature values for an unknown object and recognize it. This mode is available only after training (step 7.2).
Run Loreg Visual and execute File/Learning table.
Choose video.mcl.
Choose Object/New.

An example 4 of the tools panel.

Enter the feature values of the object to be recognized (for example, <21,72,39,59,41,68,25>) and press Recognize.

An example 5 of the tools panel.

The decision of the recognition task appears (class number, estimates and diagram).

7.5. What else?

After that you can change the parameters for a new training run. To assess the results you can use the print-files of MCL, visual analysis with Loreg Visual, and the collective (COLW) decision.

Foot-note

¹ The authors are very grateful to Prof. H.Ganster of the Technical University Graz (Austria) for placing the melanoma data at our disposal. (See H.Ganster, M.Gelautz, A.Pinz, Initial Results of Automated Melanoma Recognition. Proceedings of 9th SCIA, June 1995, Uppsala, Sweden, pp.209-218.)
