Previous Up Next

5  Theoretical aspects

We would like to point out here that there is a strong relation between type systems and logic. More precisely, the so-called Curry-Howard correspondence23 relates types with propositions, and programs with proofs. It can be shown that type inference rules for the typed lambda calculus (i.e. functional programs) relate to natural deduction24 in propositional logic. While this is traditionally presented for implication introduction and elimination (type inference rules for function abstraction and application) only, it is also possible to find such relations for algebraic datatypes. In the following derivations the type declarations (right side of colons) in the type inference rules are structurally the same as the inference rules of propositional logic presented next to them.

Construction of values of sum type (here: two constructors Inl and Inr, each taking an argument of different type) corresponds to introduction of disjunctions (∨I):

  
  
t:τ1
Inl(t) : τ1 | τ2
 
A
A ∨ B
   
  
t:τ2
Inr(t) : τ1 | τ2
 
B
A ∨ B
 

Pattern matching on values of sum type corresponds to elimination of disjunctions (∨E):

  
 
  
t:τ1 | τ2
 
 x1:τ1
  
t:τ 
 
 x2:τ2
  
t:τ 
case t of Inl(x1).t1Inr(x2).t2 : τ
 
 
  
AB
 
 A 
C
 B 
C
C

Construction of values of product type (pairs) corresponds to introduction of conjunctions (∧I):

  
  
t1:τ1
 
  
t2:τ2
(t1t2) : (τ1, τ2)
 
A
 
B
A ∧ B

And finally, decomposition of pairs corresponds to elimination of conjunctions (∧E):

  
  
t:1, τ2)
fst(t) : τ1
 
A ∧ B
A
   
  
t:1, τ2)
snd(t) : τ2
 
A ∧ B
B

Considering this relation, the type of the data in attribute-value representations would correspond to propositions that are in conjunctive normal form (CNF), i.e. conjunctions of disjunctions, whereas algebraic datatypes relate to arbitrary propositions involving these logical connectives. Any formula in propositional logic can be translated to CNF, but as is well known, this may require exponential space (and hence time) in the worst case25. This, too, shows that our more general representation can yield exponential improvements over attribute-value encodings concerning size of unambiguous representation of structured data and induced models.

The demonstrated correspondence may henceforth be used to convert induced functional programs involving algebraic datatypes to proofs of logical propositions. Since logical propositions are especially relevant for complexity analysis within the PAC-framework of computational learning theory26, this bridge between two fields of research in machine learning may turn out to be very fertilizing.


Submitted to: Machine Learning Journal, Special Issue on Inductive Logic Programming and Relational Learning
Copyright   ©  2003 Kluwer Academic Publishers
Author: Markus Mottlmarkus@oefai.at⟩
Previous Up Next