# Introduction

# Some formalizing

IMHO the lexical selection (LS) problem can be reduced to a classification problem:

${\displaystyle \mathrm {classify} ({\text{word }}w,\ {\text{context }}c)\ \in \ \{\ t\ :\ t{\text{ is a possible translation of }}w\ \}.}$

The context ${\displaystyle c}$ could be a window of surrounding text, a bag of words, a tf-idf-weighted vector, etc.
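To make the context representations a bit more concrete, here is a minimal sketch in plain Python. The function names (`bag_of_words`, `tf_idf`), the window size, and the smoothed idf formula are my own assumptions for illustration, not anything that exists in Apertium.

```python
import math
from collections import Counter


def bag_of_words(tokens, index, window=2):
    """Count the words within `window` positions of tokens[index].

    The ambiguous word itself is left out of the bag.
    """
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return Counter(left + right)


def tf_idf(bag, document_frequency, n_documents):
    """Weight each count by a smoothed inverse document frequency.

    `document_frequency` maps a word to the number of documents that
    contain it; both it and `n_documents` are assumed to come from
    some corpus (hypothetical here).
    """
    return {
        word: count * math.log((1 + n_documents) / (1 + document_frequency.get(word, 0)))
        for word, count in bag.items()
    }


tokens = "he sat on the bank of the river".split()
context = bag_of_words(tokens, tokens.index("bank"), window=2)

# Made-up document frequencies, only to show the weighting.
weights = tf_idf(context, {"the": 9, "on": 4, "of": 4}, n_documents=9)
```

With these made-up frequencies a word that appears in every document ("the") gets weight zero, while rarer context words keep a positive weight.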

The possible translations for w could perhaps be obtained from WordNet, or from another dictionary.

We already have a set of attributes (srl and slr) to mark ambiguous words; it would be best to use those. en-ca and en-es have examples -- Jimregan 13:22, 21 June 2009 (UTC)

Awesome :D I'll give it a read in the next days :) Thanks a lot! -- Deadbeef 23:53, 30 June 2009 (UTC)

The classification problem can be solved in various ways: support vector machines, naive Bayes classifiers, decision trees, etc.
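As a rough illustration of one of these options, here is a small naive Bayes sketch in plain Python that picks a translation for one ambiguous word from a bag-of-words context. The class name `NaiveBayesSelector`, the add-one smoothing, and the toy training data (English "bank" into Spanish "banco" or "orilla") are all assumptions made up for the example; a real experiment would need a proper corpus.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSelector:
    """Naive Bayes over context words, one class per candidate translation."""

    def __init__(self):
        self.translation_counts = Counter()       # how often each translation was seen
        self.word_counts = defaultdict(Counter)   # context word counts per translation
        self.vocabulary = set()

    def train(self, examples):
        """`examples` is an iterable of (context_words, translation) pairs."""
        for context, translation in examples:
            self.translation_counts[translation] += 1
            for word in context:
                self.word_counts[translation][word] += 1
                self.vocabulary.add(word)

    def classify(self, context):
        """Return the translation with the highest log-probability, or None if untrained."""
        total = sum(self.translation_counts.values())
        best, best_score = None, -math.inf
        for translation, n in self.translation_counts.items():
            score = math.log(n / total)  # log prior
            # Add-one (Laplace) smoothing so unseen words do not zero out a class.
            denom = sum(self.word_counts[translation].values()) + len(self.vocabulary)
            for word in context:
                score += math.log((self.word_counts[translation][word] + 1) / denom)
            if score > best_score:
                best, best_score = translation, score
        return best


# Toy, hand-made training data for the English word "bank".
selector = NaiveBayesSelector()
selector.train([
    (["money", "account", "loan"], "banco"),
    (["deposit", "money", "cash"], "banco"),
    (["river", "water", "shore"], "orilla"),
    (["river", "fishing", "mud"], "orilla"),
])

river_choice = selector.classify(["river", "water"])
money_choice = selector.classify(["money", "loan"])
```

The set of classes here is simply whatever translations appear in the training data, which plays the role of the set of possible translations for w in the formula above.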

It seems that the word sense disambiguation (WSD) problem can be handled with an Inductive Logic Programming-oriented approach, as this paper states: http://www.mt-archive.info/ACL-2007-Specia.pdf

I'm currently trying to introduce probabilistic reasoning into Aleph[1], the Inductive Logic Programming framework cited in the paper, for a university project, and it might be interesting to see how it could handle lexical selection.

# Data Mining/Machine Learning tools supporting the classification task

I've tried many tools while taking AI and DM-related classes, like Weka[2] (which I've integrated into a Multi-Agent System to support agents in making decisions) and RapidMiner[3], but I think the most appropriate tool in this case could be Orange[4]. I'm now running some experiments with its APIs from C++ and Python.

