Emacs

Emacs stuff:

Quickstart for non-emacs users

If you just want to get emacs set up for dix editing with the minimum of hassle, here is a howto. This assumes you have emacs version 23 or higher installed (but see discussion page if you're stuck with an old version). First execute (paste) the following commands in your terminal:

mkdir -p ~/.elisp
cd ~/.elisp
wget http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/apertium-tools/dix.el
cd ..
touch ~/.emacs

Then open the file ~/.emacs in an editor (like vi) and enter the following:

 ; Start of dix-mode setup
(add-to-list 'load-path "~/.elisp") ; path to the folder where you have dix.el
(autoload 'dix-mode "dix" 
   "dix-mode is a minor mode for editing Apertium XML dictionary files."  t)

(add-to-list 'auto-mode-alist '("\\.dix\\'" . nxml-mode)) ; turn on nxml-mode for dix-files
(add-hook 'nxml-mode-hook               ; turn on dix-mode for dix-files after nxml-mode
 	  (lambda () (and buffer-file-name
 			  (string-match "\\.dix$" buffer-file-name)
 			  (dix-mode 1))))
(add-hook 'nxml-completion-hook 'rng-complete) ; turn on schema-based completion with C-RET (add-hook also works before nxml-mode is loaded)

 ; Start of CUA mode setup - to make Emacs behave like other editors - see http://www.emacswiki.org/CuaMode
(cua-mode t)
(setq cua-auto-tabify-rectangles nil) ; Don't tabify after rectangle commands
(setq cua-keep-region-after-copy t) ; Standard Windows behaviour

See also the Validation quickstart for auto-validation.
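
The quickstart above assumes Emacs 23 or newer. As a minimal sketch for checking this from the terminal, the helper below extracts the major version from the first line of `emacs --version`. The function name `emacs_major_version` is made up for this example, and it assumes that line has the usual form "GNU Emacs 27.1".

```shell
#!/bin/sh
# Hypothetical helper: print the major version found in a line such as
# "GNU Emacs 27.1". Prints nothing if the line has another shape.
emacs_major_version() {
    printf '%s\n' "$1" | sed -n 's/^GNU Emacs \([0-9][0-9]*\).*/\1/p'
}

# Only query the real Emacs if one is installed.
if command -v emacs >/dev/null 2>&1; then
    major=$(emacs_major_version "$(emacs --version | head -n 1)")
    if [ "${major:-0}" -ge 23 ]; then
        echo "Emacs $major: new enough for this quickstart"
    else
        echo "Emacs ${major:-unknown}: see the discussion page for old versions"
    fi
fi
```

If the first line of the version output looks different on your system, the helper prints nothing and the script reports the version as unknown.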

nxml-mode

Emacs has a nice xml editing mode called nXML, with syntax highlighting, movement commands to navigate through the XML (out of, into, across elements, etc.). It also has validation, and can auto-complete using the XML schema if a schema file is available.

Note: since dix files can get rather huge, syntax highlighting can make nXML a bit slow (at least if you're, e.g., planning on running a keyboard macro 10000 times). To speed it up, temporarily turn off syntax highlighting by typing M-x set-variable RET nxml-syntax-highlight-flag RET nil RET. Alternatively, use the dix.el function C-c H (dix-toggle-syntax-highlighting).

If your emacs doesn't turn on nxml-mode automatically when you open an xml-file, you can add the following line to your ~/.emacs file:

(add-to-list 'auto-mode-alist '("\\.dix\\'" . nxml-mode))

Emacs 23 or newer includes nxml-mode, but if your version of emacs doesn't: download nxml-mode-20041004.tar.gz (or whatever the newest version is) from http://www.thaiopensource.com/download/, extract somewhere, and add the following to your .emacs file:

  (load "/path/to/nxml-mode-20041004/rng-auto.el") ; full path to the _file_ rng-auto.el which you just extracted
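
To find out whether your Emacs already includes nxml-mode, one option is to ask Emacs itself in batch mode. The sketch below wraps that in a function. The name `has_nxml_mode` and its optional argument (the Emacs command to run) are inventions of this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given Emacs command (default: emacs)
# can load the nxml-mode library in batch mode.
has_nxml_mode() {
    "${1:-emacs}" --batch --eval "(require 'nxml-mode)" >/dev/null 2>&1
}

if has_nxml_mode; then
    echo "nxml-mode is available"
else
    echo "nxml-mode not found (or emacs is not installed)"
fi
```

If you have several Emacs builds installed, pass the one you want to check as the first argument, for example `has_nxml_mode /path/to/other/emacs`.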

dix-mode

Screenshot of dix.el in Aquamacs (fullscreen). Upper left window has output from dix-view-pardef, lower left shows rng schema completion. There is a red underline since a p can't be an empty element, as noted by the message in the minibuffer

In svn there is a minor mode for editing .dix files, dix.el (or use svn co https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-tools). It needs nxml-mode (see above).

Put the following in your ~/.emacs file to use it:

 (add-to-list 'load-path "/path/to/dix.el-folder") ; ie. path to the _folder_ containing dix.el
 (autoload 'dix-mode "dix" 
   "dix-mode is a minor mode for editing Apertium XML dictionary files."  t)
 (add-hook 'nxml-mode-hook
 	  (lambda () (and buffer-file-name
 			  (string-match "\\.dix$" buffer-file-name)
 			  (dix-mode 1))))

I use Apertium-dixtools-formatted dix; not all functions have been tested in the regular format.

Note: there's now a menu-bar, if you forget the keyboard shortcuts :-)

The minor mode adds the keyboard shortcuts C-c L and C-c R, which make LR- or RL-restricted copies of <e> elements. Use C-TAB to cycle between the restriction possibilities (LR, RL or none); C-c C creates a copy without modifying restrictions. C-c G finds the pardef of a dictionary entry (go back with C-u C-SPC), and C-c S sorts a pardef by its right-hand side <r>. M-n and M-p move to the next and previous "important bits" of <e> elements (just try it!). Inside a pardef, C-c A shows all usages of that pardef within the dictionaries listed in the variable `dix-dixfiles', while C-c D gives you a list of all pardefs that use these suffixes (where a suffix is the contents of an <l> element). The space bar inserts a <b/> inside <r>, <l> or <i> elements, and a regular space everywhere else.

Also, if you like having all <i> elements aligned at, e.g., column 25, the minor mode lets you run M-x align on a region to achieve that; it also aligns <p> to column 10 and <r> to column 44 (for bidix). These numbers are customizable with M-x customize-group RET dix. (That is, there is no extra indentation function, but nxml already has one.)

Validation (Relax NG-schemas)

Validation quickstart

Download and extract trang:

cd
wget http://jing-trang.googlecode.com/files/trang-20091111.zip
unzip trang-20091111.zip

Copy this script to a file like "makeschema.sh":

#!/bin/bash

## Set these to the correct paths:
APERTIUMSRC="$HOME/apertium-svn/trunk/apertium"
TRANGJAR="$HOME/trang-20091111/trang.jar"

## No changes needed below

echo "Creating ~/.elisp/schemas.xml"
cat > ~/.elisp/schemas.xml <<EOF
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <typeId id="dix" uri="${APERTIUMSRC}/apertium/dix.rnc"/>
  <typeId id="transfer" uri="${APERTIUMSRC}/apertium/transfer.rnc"/>
  <typeId id="interchunk" uri="${APERTIUMSRC}/apertium/interchunk.rnc"/>
  <typeId id="postchunk" uri="${APERTIUMSRC}/apertium/postchunk.rnc"/>
  <typeId id="format" uri="${APERTIUMSRC}/apertium/format.rnc"/>
  <typeId id="tagger" uri="${APERTIUMSRC}/apertium/tagger.rnc"/>
  <typeId id="modes" uri="${APERTIUMSRC}/apertium/modes.rnc"/>

  <documentElement localName="dictionary" typeId="dix"/>
  <documentElement localName="transfer" typeId="transfer"/>
  <documentElement localName="interchunk" typeId="interchunk"/>
  <documentElement localName="postchunk" typeId="postchunk"/>
  <documentElement localName="format" typeId="format"/>
  <documentElement localName="tagger" typeId="tagger"/>
  <documentElement localName="modes" typeId="modes"/>

  <uri pattern="*.dix" typeId="dix"/>
  <uri pattern="*.t1x" typeId="transfer"/>
  <uri pattern="*.t2x" typeId="interchunk"/>
  <uri pattern="*.t3x" typeId="interchunk"/>
  <!-- Some pairs have t3x as postchunk, others t4x or even t5x... but
       if one of the documentElement rules match, these rules are
       ignored since they're below them. -->
</locatingRules>
EOF

echo "Creating rnc files in ${APERTIUMSRC}/apertium"
# Stop here if the source directory does not exist
cd "${APERTIUMSRC}/apertium" || exit 1
for DTD in *.dtd; do
    # foo.dtd -> foo.rnc
    OUT=$(echo "$DTD" | sed 's/dtd$/rnc/')
    echo "java -jar ${TRANGJAR} $DTD $OUT"
    java -jar "${TRANGJAR}" "$DTD" "$OUT"
done

echo "Now inform nxml-mode about ~/.elisp/schemas.xml by appending this to ~/.emacs:"
cat <<'EOF'

(eval-after-load 'rng-loc
  '(add-to-list 'rng-schema-locating-files "~/.elisp/schemas.xml"))

EOF

Run it with

sh makeschema.sh

and add the printed snippet to your ~/.emacs as instructed.
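
The shell does not expand `~` inside double quotes, so it may be worth checking that the generated file refers to the schemas by full path, since nxml-mode may not resolve a literal `~` in a uri attribute. This is a minimal sketch, assuming the file location used by the script above; the function name `literal_tilde_uris` is invented here.

```shell
#!/bin/sh
# Hypothetical check: print every line of a locating-rules file whose uri
# attribute starts with a literal "~". Succeeds if any such line exists.
literal_tilde_uris() {
    grep -n 'uri="~' "$1"
}

SCHEMAS="$HOME/.elisp/schemas.xml"
if [ -f "$SCHEMAS" ]; then
    if literal_tilde_uris "$SCHEMAS"; then
        echo "The uri values above contain an unexpanded ~; consider full paths"
    else
        echo "No unexpanded ~ found in $SCHEMAS"
    fi
fi
```

If lines are reported, setting the path variables in the script with `$HOME` instead of `~` and re-running it should produce full paths.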

More information

nxml-mode uses compact Relax NG schemas for validation (without these, XML is only checked for well-formedness by nxml-mode).

(There is a non-compact dix.rng here, while transfer.rng and modes.rng are in trunk/apertium/apertium.)

You can make compact Relax NG schemas (.rnc) using trang. Use a script like this to keep all your rnc's up-to-date:

cd /path/to/trunk/apertium/apertium || exit 1
for DTD in *.dtd; do
    # foo.dtd -> foo.rnc
    OUT=$(echo "$DTD" | sed 's/dtd$/rnc/')
    echo "java -jar /path/to/trang.jar $DTD $OUT"
    java -jar /path/to/trang.jar "$DTD" "$OUT"
done
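
The output name in the loop above is derived by swapping the .dtd suffix for .rnc. As a sketch, the same mapping can be written with plain shell parameter expansion and tried as a dry run before any files are converted. The function name `rnc_name` is made up for this example, and the loop only prints the trang commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: map "foo.dtd" to "foo.rnc" using suffix removal.
rnc_name() {
    printf '%s\n' "${1%.dtd}.rnc"
}

# Dry run: show which trang command would be run for each DTD in the
# current directory. /path/to/trang.jar is a placeholder, as above.
for dtd in *.dtd; do
    [ -e "$dtd" ] || continue   # the pattern matched no files
    echo "java -jar /path/to/trang.jar $dtd $(rnc_name "$dtd")"
done
```

Run from the directory holding the DTDs, this lists one command per schema, which makes it easy to spot a wrong path before calling java.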

Note: if you want to auto-complete using the schema (keyboard shortcut: C-RET), you should have (add-hook 'nxml-completion-hook 'rng-complete) somewhere in your ~/.emacs.

You can toggle validation using the XML menu at the top of the screen, or the keyboard shortcut C-c C-v.

See http://www.dpawson.co.uk/relaxng/nxml/schemaloc.html#d574e168 for how to write a schemas.xml file to automatically find the right schema.

See also