Difference between revisions of "User:Francis Tyers/Perceptron"
| Line 5: | Line 5: | ||
The classifier consists of: |
The classifier consists of: |
||
* An input vector, <math>x</math> |
|||
* Binary features |
|||
* |
* A weight vector, <math>w</math> |
||
* Threshold, <math>\theta</math> |
* Threshold, <math>\theta</math> |
||
Revision as of 22:33, 8 November 2014
A perceptron is a classifier that
The classifier consists of:
- An input vector,
- A weight vector, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w}
- Threshold, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \theta}
Learning algorithm
Example
Here is a worked example of a perceptron applied to the task of lexical selection. Lexical selection is the task of choosing a target translation Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle t*} for a given source word Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s} in a context Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle C} out of a set of possible translations Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle T} . A perceptron makes a classification decision for a single class, so we need to train a separate perceptron for each possible target word selection.
In the example,
- Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s} = estació
- Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle T} = {season, station}
- Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle t*} = season
Features
The features we will be working with are ngram contexts around the "problem word". These can be extracted from the word alignments calculated from a parallel corpus.
| Catalan | English |
|---|---|
| Durant l' estació seca les pluges són escasses. | During the dry season it rains infrequently. |
| L' estiu és una estació de l' any. | Summer of one of the seasons of the year. |
| Barcelona-Sants és una estació de tren a Barcelona. | Barcelona-Sants is a train station in Barcelona. |
| ... | |
Training data
- s = estació, t* = season
| Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle c_i} | |
|---|---|
_ sec |
1 |
_ de el any |
1 |
_ de tren |
0 |
_ de el línia |
0 |
_ humit |
1 |
_ plujós |
1 |
un _ a |
0 |
Feature vector
This is the above training data expressed as an input vector Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_j} to the perceptron.
| _ sec | _ de el any | _ de tren | _ de la línia | _ humit | _ plujós | un _ a | Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d} | |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | |
| 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | |
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | |
| 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | |
| 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | |
| 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | |
| 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Weight vector
| Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w_0} | Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w_1} | Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w_2} | Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w_3} | Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w_4} | Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w_5} | Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w_6} |
|---|---|---|---|---|---|---|
| _ sec | _ de el any | _ de tren | _ de la línia | _ humit | _ plujós | un _ a |
| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |