Integrating DROOLS and R software for intelligent map system

The paper describes an intelligent map system that checks map sheets for errors or assists in creating a map sheet. The system is based on the DROOLS expert system, an ontology created in Protégé, and the statistical software R. The prototype is meant to verify that this kind of integration is possible, so it does not contain a complete set of rules: it holds twenty rules written in the DRL language and more than thirty items from the ontology. The paper shows how these components can be integrated to support this kind of map sheet evaluation. The system is currently used to select the best method for data classification. The selection is suggested by DROOLS, which uses R to perform statistical tests of normality and uniformity.


Introduction
The world of cartography is changing, as any map available on the web these days shows. Any internet user can create their own map without basic knowledge of cartography. Tools for map creation are available free of charge, and so are geodata. When a tool keeps the process of map creation under its supervision, the resulting map is usually correct in terms of cartographic rules. When a tool offers many options for creating the map, the map is usually full of mistakes. This is described in [1]: "Process of making map is core of the whole cartography, but not only specialists are making maps nowadays. In last years, this process not involved only the cartographers, but also the common users. Production of map with using adequate software is a simple process now, which is used by non-cartographic users. These users do not know basic cartographic rules for making maps and they make maps intuitively. This situation needs the implementation of principles of cartography directly into the map production systems in pursuit of correct and effective maps producing. Instead of the final map is also important the explanation and the proposing of several possible solutions. According to progress in the artificial intelligence, the knowledge-base systems can be applied for this problem. These systems can partly substitute a role of the expert in this process."
We decided to research how to create an intelligent map system that can help in the process of map creation. Several tools have been inspected and tested for the purposes of the system development. We found that such a system cannot simply be created with one tool; several independent systems have to be integrated. This article describes the integration of an expert system and a statistical system.

Aim of the system
The aim of the system is to help users who are not familiar with cartographic rules to create a map. The system can help in two ways:
• by answering questions during the creation of a map sheet,
• by checking a created map sheet for mistakes.
When creating a map, the user repeatedly reaches steps where a decision must be made, for example which font size to use for the map title or which classification method to use for creating class breaks. The user simply (or sometimes not so simply) answers the system's questions and obtains recommendations on how to finish that step of the map creation. The answers can be entered in a simple graphical user interface with items such as text fields or check boxes. Several answers can be derived from the data the user supplies for the map. A similar approach was used in the Descartes project [2], and we adapted it for our project.

Another way, which we have not found researched in depth in any article, is to check an existing (already created) map for mistakes. In this approach the system obtains the map from the user and analyses its content. When needed, the system can ask the user for the original data. The map is checked against cartographic rules. This approach is mentioned in [3], but the system mentioned in that paper was not tested and not even developed. The result of checking the map for mistakes can be of three types:
• a list of mistakes and suggestions on how to avoid them,
• a map without mistakes based on the original map,
• a map without mistakes based on the original data and the original map.
The simplest way is to provide the user with a list of mistakes and some suggestions on how to avoid them. The system described in this article generally works in this simplest way. Repairing the map is more difficult. To be able to repair the map, several conditions must be met:
• the map must be available as a file that uses structures which make simple map repair easier (e.g. the Scalable Vector Graphics format),
• the mistakes must be of selected types, since not all mistakes can be repaired automatically,
• the original data must be available.
The aim is not to make the system so flexible that it can find any mistake in a map; it should be able to find several of the most serious mistakes and so help to improve map quality, for example to avoid the creation of maps such as the one in Figure 1.

Pilot project focus
The pilot project focuses only on a selected part of cartographic techniques, namely choropleth maps and cartograms. It has been tested on the Atlas of Fire Protection in the Czech Republic. The atlas allows the user to create a choropleth map or a cartogram based on a statistical database of events that required fire brigade action. The atlas allows the following conditions to be specified:
• year from/to of the events,
• type of events (e.g. fires in which firefighters were injured),
• statistical method for generating class intervals (Jenks, Equal interval, etc.),
• number of classes,
• type of frequency (per square km, per population),
• start colour and end colour for the visualization of classes.
The user must specify these conditions. In the pilot project, the selection of the statistical method is now based on the intelligent map system.
The resulting map can look like the one in Figure 2.
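To make the "statistical method for generating class intervals" and "number of classes" options concrete, the following is a minimal Python sketch of the equal interval method only. It is an illustration under stated assumptions, not the Atlas implementation: the function names `equal_interval_breaks` and `class_index` are hypothetical, and the convention that the last class is closed on the right is an assumption.

```python
from bisect import bisect_right
from typing import List, Sequence


def equal_interval_breaks(values: Sequence[float], n_classes: int) -> List[float]:
    """Return n_classes + 1 break points that split [min, max] into equal widths."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    breaks = [lo + i * width for i in range(n_classes)]
    breaks.append(hi)  # use the exact maximum as the last break
    return breaks


def class_index(value: float, breaks: Sequence[float]) -> int:
    """Return the zero-based class of a value; the last class includes the maximum."""
    idx = bisect_right(breaks, value) - 1
    # Clamp so that values on the outer breaks fall into the first or last class.
    return min(max(idx, 0), len(breaks) - 2)
```

For example, values ranging from 0 to 10 with five classes give breaks at 0, 2, 4, 6, 8 and 10. Other methods such as Jenks need an optimisation step and are not covered by this sketch.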

System architecture
The system is based on the integration of several components, shown in Figure 3.
Answering the question of which classification method to use involves the following steps:
• The client (any SOAP/REST-capable client; in our pilot project the client is the Atlas) sends data for classification to the service.
• The service reads an ontology (available in OWL format) and creates objects that will be placed in a session of the expert system based on DROOLS.

Geoinformatics FCE CTU 2011
• When an instance of the class named StatisticalValuesGeo is created, the data from the client are stored in the instance.
• After the data are stored, the instance creates an R software instance and runs tests on the data in it.
• The service creates the session of the expert system and fires all rules on the session.

• The results of running all the rules are stored in the InfoContainer class.
• The service reads the results from the InfoContainer class and returns a response containing them to the client.
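The steps above can be sketched in plain Python as follows. This is a hedged illustration of the control flow only, not the DROOLS API and not the authors' service code. The class names StatisticalValuesGeo and InfoContainer come from the text; `Rule`, `fire_all_rules`, `handle_request` and `EXAMPLE_RULES` are hypothetical, the `detect_distribution` callable stands in for the tests run in R, and the mapping from a distribution to a classification method is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class StatisticalValuesGeo:
    """Fact holding the client's data and the detected distribution."""
    values: List[float]
    distribution: str = "unknown"


@dataclass
class InfoContainer:
    """Collects the recommendations produced by the rules that fired."""
    results: List[str] = field(default_factory=list)


@dataclass
class Rule:
    """Plain-Python stand-in for one DRL rule (hypothetical, not DROOLS)."""
    name: str
    condition: Callable[[StatisticalValuesGeo], bool]
    recommendation: str


# Illustrative rules only: the distribution-to-method mapping is assumed.
EXAMPLE_RULES = [
    Rule("normal-data", lambda f: f.distribution == "normal", "standard deviation"),
    Rule("uniform-data", lambda f: f.distribution == "uniform", "equal interval"),
]


def fire_all_rules(rules: Sequence[Rule], fact: StatisticalValuesGeo,
                   info: InfoContainer) -> int:
    """Evaluate every rule against the fact and return how many fired."""
    fired = 0
    for rule in rules:
        if rule.condition(fact):
            info.results.append(rule.recommendation)
            fired += 1
    return fired


def handle_request(values: Sequence[float],
                   detect_distribution: Callable[[List[float]], str],
                   rules: Sequence[Rule] = EXAMPLE_RULES) -> List[str]:
    """Store the data, test the distribution, fire the rules, return results."""
    fact = StatisticalValuesGeo(values=[float(v) for v in values])
    fact.distribution = detect_distribution(fact.values)  # stand-in for R tests
    info = InfoContainer()
    fire_all_rules(rules, fact, info)
    return info.results
```

Passing the distribution test in as a callable keeps the rule evaluation independent of the statistical engine behind it.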

Ontology
The ontology is created in the Protégé software, with regard to the limits of export to Java classes. The export is done via the Protégé-OWL API, which has several limitations when exporting an ontology. The ontology is therefore just a simple hierarchy of super-classes and sub-classes. The classes have defined attributes, each with a data type definition and a cardinality relationship between class and attribute.

Class StatisticalValuesGeo
The class StatisticalValuesGeo extends the class StatisticalValues, which is defined in the ontology.
The extension reacts to the moment when the data used for a map are stored in this class: at that moment their statistical distribution is tested.
The distribution is tested only for three possible models:
• Normal distribution.
The distributions are tested in the R software via the rJava tool (JRI, http://rosuda.org/rJava/). The tool rJava is a Java native interface to the R software.
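The idea of a subclass that tests the distribution at the moment data are stored can be sketched as below. This is a minimal Python illustration, assuming a hypothetical `set_values` method and an injected `distribution_test` callable; in the described system the class is a Java class and the test is delegated to R through JRI, which this sketch does not reproduce.

```python
from typing import Callable, List, Optional, Sequence


class StatisticalValues:
    """Stand-in for the class generated from the ontology."""

    def __init__(self) -> None:
        self.values: List[float] = []

    def set_values(self, values: Sequence[float]) -> None:
        self.values = [float(v) for v in values]


class StatisticalValuesGeo(StatisticalValues):
    """Extension that tests the distribution whenever data are stored."""

    def __init__(self, distribution_test: Callable[[List[float]], Optional[str]]) -> None:
        super().__init__()
        # Hypothetical hook: stands in for the call into the statistical engine.
        self._distribution_test = distribution_test
        self.distribution: Optional[str] = None

    def set_values(self, values: Sequence[float]) -> None:
        super().set_values(values)
        # Storing data triggers the test, so rules can later read the result.
        self.distribution = self._distribution_test(self.values)
```

Because the test is injected, the statistical engine behind it can be exchanged without changing the class that the rules work with.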

Normal distribution
The normal distribution is tested with the Shapiro-Wilk test (R function shapiro.test). When the resulting value of W is greater than 0.95 and the resulting p-value is greater than 0.05, the data are identified as normally distributed. See the following code for details.
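The original listing is not reproduced in this text. As a stand-in, the decision rule alone can be sketched in Python; this is an illustration, not the authors' code. It assumes that W and the p-value have already been computed elsewhere (for example by shapiro.test in R), and the function name `looks_normal` and its parameter names are hypothetical.

```python
def looks_normal(w_statistic: float, p_value: float,
                 w_min: float = 0.95, alpha: float = 0.05) -> bool:
    """Apply the thresholds from the text to a Shapiro-Wilk result.

    Both comparisons are strict, following "more than 0.95" and
    "more than 0.05" in the description.
    """
    return w_statistic > w_min and p_value > alpha
```

For instance, W = 0.97 with a p-value of 0.20 is accepted as normal, while W = 0.97 with a p-value of 0.01 is not.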

Conclusion
As part of our research we integrated DROOLS with the R software. Our findings are simple, but possibly valuable:
• The integration is possible, but at the moment its performance is not good (CGI workaround).
• The solution based on the integration of DROOLS and R makes it possible to use other functions from R in the future.
• The R engine can be replaced with another tool (the solution is not directly dependent on the R engine).

Figure 1: A map with several mistakes

Figure 2: Choropleth map from the Atlas