Formal Data Validation Tutorial at ABZ 2014, Toulouse (slides available)

announcementAt the occasion of the ABZ 2014 conference, a tutorial on formal data validation is organized at ENSEEIHT together with workshops (see details here). The tutorial will happen on June 3rd 2014 from 2:00 pm to 5:30 pm.

During this tutorial, formal data validation principles will be exposed (key concepts, mathematical language to express properties, tool chain) and applied to simple examples covering different aspects.

The program is as follows:

  • Introduction
  • Formal Data Validation in a Nutshell
  • Industrial Examples
  • The approach: mathematical language et verification tools
  • Case studies

If you want to attend the tutorial or get more information about this event, please contact



During Rodin and DEPLOY EU-funded projects, track topology data model were developed in order to check real word railway system architectures. This data model is of paramount importance, as it is a strong precondition for software to behave safely (for example, if a signal is badly positioned, a collision is likely to occur). Validation of concrete data is complex because of the large amount of data (50k-100k excel cells, modified several times during a development) and the number of verifications to perform (up to 1000 rules, expressing intricate graph-based properties) over this dataset. As a consequence, human-based verification is time-consuming and subject to many mistakes. Automating the process, by formalizing the data model with the mathematical language of the B method and by performing the verification with the model-checker and constraint solver (ProB) has lead to dramatically reduced verification time from months to minutes.


Newcastle University Technical Seminar

nclClearSy is contributing to a technical seminar organized by Newcastle University on 24 October 2013.

Formal methods are used worldwide to improve safety in several railways applications, at various levels and with a wide range of results obtained. This presentation is intended to provide a unified (not agnostic) overview of existing practices and to assess to which extent safety is ensured.

At that occasion, some examples of railways data validation will be exhibited.

CAI 2013 Conference

CAI-2013ClearSy participated to CAI 2013 conference (International Conference on Algebraic Informatics) held in Porquerolles (France) 3-6 September 2013.

A talk, entitled “Railways formal data validation“, was given during the industrial session. It was a good occasion to expose to researchers the mathematical problems the railways industry is currently facing when validating safety-critical embedded data.

Data Validation & Reverse Engineering

graf-zoomData validation principles have been applied recently to a railways reverse-engineering project with great success. B and ProB have demonstrated again how efficient they are when used in combination.

This project was aimed at redeveloping an IDE for Embedded Diagnosis Systems (EDS), mostly from scratch as source code and development documentation are lost. Obsolete hardware (with no direct access to the filesystem) and original IDE were used for black box testing.

Continue Reading →

ProB: a model-checker for data validation


ProB is an animator and model-checker for the B-method. It can be used to systematically check a specification for range of errors. The constraint-solving capabilities of ProB can also be used for model finding, deadlock checking and test-case generation.

Prob is developed by the team of Michael Leuschel, University of Dusseldorf, while commercial support is provided by the spin-off company Formal Mind.

ProB is now being used within Siemens and Alstom for data validation of complicated properties.


PredicateB: a predicate animator

PredicateB was created by ClearSy in 2005, in order to calculate predicates expressed using the mathematical language of the B formal method. PredicateB has been initially developped in Java and is part of the Rodin framework hosted on sourceforge.

History of PredicateB uses for railways data validation

A new version of PredicateB, PredicateB++,  has been developed in 2008 in C++ on top of Qt libraries, with the aim of significantly improving performances.

Continue Reading →

Data validation in the railways

The Genesis

Metro de Malaga with Alstom Urbalis CBTC

Verifying railways systems covers many aspects and requires a large number of cross-verifications, performed by a wide range of actors including the designer of the system, the company in charge of its exploitation, the certification body, etc. Even if complete automation is not possible, any automatic verification is welcome as it helps to improve the overall level of confidence.

Indeed a railways system is a collection of highly dependent sub-system specification and these dependencies need to be checked. They may be based on railways signalling rules (that are specific to every country or even every company in a single country), on rolling stock features (constant or variable train size or configuration) and exploitation conditions.

Continue Reading →

Introduction to data validation

validationAccording to wikipedia, «in computer science, data validation is the process of ensuring that a program operates on clean, correct and useful data. It uses routines, often called  validation rules or check routines, that check for correctness, meaningfulness, and security of data that are input to the system. The rules may be implemented through the automated facilities of a data dictionary, or by the inclusion of explicit application program validation logic ».

In other words, the data are checked against a data model. The data model could be simple or elaborated : it is up to the human validator to define to which extent the data have to be validated.


Often data validation is understood as validating numbers, dates and text, i.e. verifying types and ranges. For that purpose, spreadsheet dedicated function can do the job. However in a number of applications the data model could be quite complex and verification difficult to automate, hence leaving room for human errors during validation.

For digital photographers, “data validation generally addresses three questions. Is the archive complete? Did the files transfer properly? And, are the files uncorrupted?“. This interpretation is out of the scope of this article.

Continue Reading →