Difference between revisions of "Emacs"
Revision as of 20:36, 6 April 2016
Info on using Emacs for Apertium-related tasks.
Quickstart
There is an init file in SVN that will give your Emacs some useful Apertium-related packages and settings, including:
- dix-mode, for XML dictionary and transfer editing
- cg-mode, for Constraint Grammar rule editing and testing
- hfst-mode, for lexc/twol syntax highlighting
- C++ settings to match the indentation settings most used in Apertium
- tab-completion on words you've used a lot
To get that set up, simply check out https://svn.code.sf.net/p/apertium/svn/trunk/apertium-tools/emacs somewhere, and put
(load "/PATH/TO/apertium-tools/emacs/init-apertium.el")
in the file ~/.emacs.d/init.el (you may have to mkdir ~/.emacs.d first).
Then start up Emacs, and it will download some new packages on first startup.
If you ever want to update your installed Emacs packages, do M-x list-packages, then press U x.
Full example:
cd
svn co https://svn.code.sf.net/p/apertium/svn/trunk/apertium-tools apertium-tools-emacs
mkdir -p ~/.emacs.d
echo '(load "~/apertium-tools-emacs/init-apertium.el")' > ~/.emacs.d/init.el
emacs
Validation slow?
The above init-apertium.el turns on on-the-fly XML validation, which can be slow on old computers. If editing large .dix files seems too slow, try turning off one or both of the validators by putting
(add-hook 'nxml-mode-hook (lambda () (rng-validate-mode 0)) 'append)
(add-hook 'dix-mode-hook (lambda () (flycheck-mode 0)) 'append)
in your ~/.emacs.d/init.el.
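If you would rather keep validation for small files, one option is to switch it off only above a size threshold. The following is a minimal sketch, not part of init-apertium.el; the names my-apertium-max-validate-size and my-apertium-maybe-disable-validation, and the 500000-character limit, are made up for illustration.

```elisp
;; Hypothetical threshold (in characters); tune it to your machine.
(defvar my-apertium-max-validate-size 500000
  "Buffers larger than this get on-the-fly validation turned off.")

(defun my-apertium-maybe-disable-validation ()
  "Turn off rng-validate-mode in the current buffer if it is large."
  (when (> (buffer-size) my-apertium-max-validate-size)
    (rng-validate-mode 0)))

;; 'append makes this run after the hooks that are already installed.
(add-hook 'nxml-mode-hook #'my-apertium-maybe-disable-validation 'append)
```

You can still turn validation back on by hand in a large buffer with M-x rng-validate-mode.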
dix and transfer
nxml-mode
Emacs has a nice XML editing mode called nXML, with syntax highlighting and movement commands to navigate through the XML (out of, into, across elements, etc.). It also has validation, and can auto-complete using the XML schema if a schema file is available.
If your emacs doesn't turn on nxml-mode automatically when you open an XML file, you can add the following line to your ~/.emacs file:
(add-to-list 'auto-mode-alist '("\\.dix\\'" . nxml-mode))
Emacs 23 or newer includes nxml-mode, but if your version of emacs doesn't: download nxml-mode-20041004.tar.gz (or whatever the newest version is) from http://www.thaiopensource.com/download/, extract somewhere, and add the following to your .emacs file:
(load "/path/to/nxml-mode-20041004/rng-auto.el") ; full path to the _file_ rng-auto.el which you just extracted
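If you share one init file between old and new Emacs versions, the load can be made conditional. This is a minimal sketch; the path is the same placeholder as above, and using fboundp as the test is just one assumed way of detecting a bundled nxml-mode.

```elisp
;; Only load the separately downloaded nxml-mode if Emacs does not
;; already provide the nxml-mode command (it does from version 23 on).
(unless (fboundp 'nxml-mode)
  (load "/path/to/nxml-mode-20041004/rng-auto.el"))
```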
keybindings
Some very handy nxml functions don't have keybindings by default. Here are some lines for your ~/.emacs that define keys for the most useful ones:
(setq nxml-sexp-element-flag t         ; treat <e><p>...</p></e> like (e (p ...)) for C-M-f/b/k/d/u
      nxml-slash-auto-complete-flag t) ; complete the element on </
(add-hook 'nxml-mode-hook
          (lambda ()
            (define-key nxml-mode-map (kbd "M-D") 'nxml-backward-down-element)))
With these in your ~/.emacs, you can use the following keys:
- C-M-f to move forward one element (e.g. from <e> to </e>)
- C-M-b to move backward one element (e.g. from </e> to <e>)
- C-M-d to move into one element (e.g. from <e> to <p>)
- M-S-d (meta-shift-d) to move into one element backwards (e.g. from after </e> to after </p>)
- C-M-u to move out of one element (e.g. from <p> to <e>)
- C-M-k to kill (cut) one element
- </ to write the end tag of whatever element you're in (e.g. after typing <e><p>…</p></, it'll complete with e>)
dix-mode
In svn there is a minor mode for editing .dix files, dix.el (or use svn co https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-tools). It needs nxml-mode (see above, installed by default in emacs version 23 or above). There are some short screencasts here.
Put the following in your ~/.emacs file to use it:
(add-to-list 'load-path "/path/to/dix.el-folder") ; ie. path to the _folder_ containing dix.el
(autoload 'dix-mode "dix"
  "dix-mode is a minor mode for editing Apertium XML dictionary files." t)
(add-hook 'nxml-mode-hook
          (lambda () (and buffer-file-name
                          (string-match "\\.dix$" buffer-file-name)
                          (dix-mode 1))))
I use Apertium-dixtools-formatted dix, so not all functions have been tested in the regular format. However, I've tried to make the functions use XML movements, so they should mostly work no matter how you format your files.
When you open emacs (after adding the above lines to ~/.emacs) and load a .dix-file, you should see a menu named dix. Most of the functions added by dix-mode are shown in this menu (which also shows their keyboard shortcuts). Hovering over a menu-item might give a little popup-help. The Help for dix-mode entry will show all the user functions defined by dix-mode. The keyboard shortcuts are in general a lot more useful than the menu bar, which is mostly there in case you forget which buttons to press... Remember: C is Control, S is Shift, M is alt (well, M stands for Meta, but that's typically alt).
Some useful functions in dix-mode:
- Movement and editing:
  - The space bar inserts a <b/> in <r>, <l> or <i> elements; a _ in par/pardef names; otherwise a plain space. This works with the . (repeat) command as well, if you use the vim keybindings.
  - M-n and M-p move to the next and previous "important bits" of <e>-elements (just try it!).
- Copying elements and adding restrictions:
  - C-c C just creates a copy of the current <e> element, putting it below the current one
  - C-c L and C-c R also make a copy of the current <e> element, but with an LR or RL restriction
  - C-TAB cycles between the restriction possibilities LR, RL or none for the current <e> element
  - C-S-TAB, used with elements that have the slr/srl attribute, will swap the sense translation of this <e> with the <e> above
- Creating elements from plain text:
  - C-c g in a monodix guesses the pardef for a word based on the suffix. Write a word at the bottom of a dix file, place point somewhere in the middle of the word, and hit C-c g; it'll try to find words earlier in the file that have the same ending (characters after point)
  - C-c x in a monodix or bidix turns a word-list into <e> entries using the above <e> entry as a template. Words should be written one per line. You can use it in a bidix by writing the left-side, then a colon (:), then the right-side. Assumes that the entry used as a template is written all on one line.
- Pardef viewing and manipulation:
  - C-c G will go to the pardef of the nearest <par>
    - the place you left is saved in the standard emacs fashion, so you can go back by pressing C-u C-SPACE
  - C-c V will show the pardef of the nearest <par> in another window
  - C-c S will sort a pardef by its right-hand-side, <r>.
    - You can also do M-x dix-sort-e-by-l to sort the selected <e> elements by the contents of their <l> element
  - C-c D (in a pardef or an <e>) will print a list of all pardefs which have the same suffixes as this one (where a 'suffix' is the contents of an <l>-element), useful for finding duplicates. Note: it ignores the tags
  - Inside a pardef, C-c A shows all usages of that pardef within the dictionaries represented by the variable `dix-dixfiles'
Note: capital letters mean you have to press shift. If you fancy other keyboard shortcuts, copy the relevant define-key entries from the bottom of dix.el and put them in your ~/.emacs, e.g. to add F12 as an alternative to C-c V:
(add-hook 'dix-mode-hook (lambda nil (define-key dix-mode-map (kbd "<f12>") 'dix-view-pardef)))
(the whole add-hook thing is needed since dix-mode is not loaded until the first .dix-file is loaded)
Also, if you like having all <i> elements aligned at eg. column 25, select a region and do M-x align to achieve that (this also aligns <p> to 10 and <r> to 44, for bidix). These numbers are customizable with M-x customize-group RET dix. (Ie. there's no extra indentation function, but then, nxml already has that.)
dix-mode for transfer rules
There are some transfer-specific functions in dix-mode that make it worth turning on in transfer files too, e.g. C-c n, which lets you enter a rule number to go to (useful when tracing with apertium-transfer -t). The init file in the Quickstart section will turn on nxml-mode and dix-mode in transfer files (i.e. all files with the suffix .t1x, .t2x, .t3x, etc.).
M-n and M-p (go to next/previous useful position) should also Do What You Mean in transfer files.
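If you set things up by hand instead, a hook along the following lines turns on dix-mode for both dictionary and transfer files. This is a sketch: the helper name my-apertium-xml-file-p is hypothetical, and the regex simply extends the .dix pattern from the dix-mode section to the .t1x, .t2x, ... suffixes.

```elisp
(defun my-apertium-xml-file-p (file)
  "Return t if FILE looks like a .dix or .tNx (transfer) file name."
  (and file
       (string-match-p "\\.\\(dix\\|t[0-9]x\\)\\'" file)
       t))

;; Open transfer files in nxml-mode, just like .dix files.
(add-to-list 'auto-mode-alist '("\\.t[0-9]x\\'" . nxml-mode))

;; Turn on dix-mode after nxml-mode, but only for Apertium files.
(add-hook 'nxml-mode-hook
          (lambda ()
            (when (my-apertium-xml-file-p buffer-file-name)
              (dix-mode 1))))
```

Keeping the file-name test in its own function makes it easy to try out with M-: before relying on it in the hook.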
Validation (Relax NG-schemas)
Validation quickstart
Copy this script to a file like "makeschema.sh", making sure to set APERTIUMSRC to the folder containing the apertium source, and LTTOOLBOXSRC to the folder containing the lttoolbox source:
#!/bin/bash

## Set these to the correct paths:
APERTIUMSRC="$HOME/apertium-svn/trunk/apertium"
LTTOOLBOXSRC="$HOME/apertium-svn/trunk/lttoolbox"

SCHEMAFILE=~/.emacs.d/schemas.xml
# Change SCHEMAFILE if you want to put your schema locating file somewhere else.
# Note: this path can't have quotes around it for some reason

## No changes needed below

tmp=$(mktemp -dt schema.XXXXXXXXXXX)
trap 'rm -rf "${tmp}"' EXIT
trangjar=${tmp}/trang-20091111/trang.jar

(
    cd "$tmp"
    echo "Downloading schema converter ..."
    wget -q https://jing-trang.googlecode.com/files/trang-20091111.zip
    unzip -q trang-20091111.zip
)

echo "Creating ${SCHEMAFILE}"
cat > ${SCHEMAFILE} <<EOF
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <typeId id="dix" uri="${LTTOOLBOXSRC}/lttoolbox/dix.rnc"/>
  <typeId id="transfer" uri="${APERTIUMSRC}/apertium/transfer.rnc"/>
  <typeId id="interchunk" uri="${APERTIUMSRC}/apertium/interchunk.rnc"/>
  <typeId id="postchunk" uri="${APERTIUMSRC}/apertium/postchunk.rnc"/>
  <typeId id="format" uri="${APERTIUMSRC}/apertium/format.rnc"/>
  <typeId id="tagger" uri="${APERTIUMSRC}/apertium/tagger.rnc"/>
  <typeId id="modes" uri="${APERTIUMSRC}/apertium/modes.rnc"/>

  <documentElement localName="dictionary" typeId="dix"/>
  <documentElement localName="transfer" typeId="transfer"/>
  <documentElement localName="interchunk" typeId="interchunk"/>
  <documentElement localName="postchunk" typeId="postchunk"/>
  <documentElement localName="format" typeId="format"/>
  <documentElement localName="tagger" typeId="tagger"/>
  <documentElement localName="modes" typeId="modes"/>

  <uri pattern="*.dix" typeId="dix"/>
  <uri pattern="*.t1x" typeId="transfer"/>
  <uri pattern="*.t2x" typeId="interchunk"/>
  <uri pattern="*.t3x" typeId="interchunk"/>
  <!-- Some pairs have t3x as postchunk, others t4x or even t5x...
       but if one of the documentElement rules match, these rules are
       ignored since they're below them. -->
</locatingRules>
EOF

convert () {
    echo "Creating rnc files in $(pwd)"
    for inf in *.dtd; do
        out=${inf%%.dtd}.rnc
        java -jar "${trangjar}" "${inf}" "${out}"
    done
}

for dir in "${APERTIUMSRC}/apertium" "${LTTOOLBOXSRC}/lttoolbox"; do
    ( cd "$dir" && convert )
done

cat <<EOF
Done! Now inform nxml-mode about ${SCHEMAFILE} by appending this to ~/.emacs.d/init.el:

(add-hook 'nxml-mode-hook (lambda () (add-to-list 'rng-schema-locating-files "${SCHEMAFILE}")))

EOF
Run it like
bash makeschema.sh
and add the hook to your ~/.emacs.d/init.el as instructed by the script.
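With the default SCHEMAFILE, the hook printed by the script should look roughly like the following sketch. The call to expand-file-name is an addition here (the script prints the already expanded path), used so that ~ is resolved to an absolute file name.

```elisp
(add-hook 'nxml-mode-hook
          (lambda ()
            ;; Tell nxml-mode where to find the locating rules
            ;; written by makeschema.sh.
            (add-to-list 'rng-schema-locating-files
                         (expand-file-name "~/.emacs.d/schemas.xml"))))
```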
More about nxml validation
nxml-mode uses compact Relax NG schemas (.rnc files) for validation (without these, XML is only checked for well-formedness by nxml-mode).
You can make compact Relax NG schemas using trang, see the above script.
Note: if you want to auto-complete using the schema (keyboard shortcut: C-RET in emacs 23; changed to M-TAB or C-M-i in emacs 24 and up), you should have (add-to-list 'nxml-completion-hook 'rng-complete) somewhere in your ~/.emacs.
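Since nxml-completion-hook may not be bound yet when your init file is read, a guarded form is a bit more robust. A minimal sketch:

```elisp
;; Add rng-complete to the hook if the hook variable already exists,
;; otherwise create the hook with rng-complete as its only entry.
(if (boundp 'nxml-completion-hook)
    (add-to-list 'nxml-completion-hook 'rng-complete)
  (setq nxml-completion-hook '(rng-complete)))
```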
You can toggle validation using the XML menu at the top of the screen, or the keyboard shortcut C-c C-v.
See http://www.dpawson.co.uk/relaxng/nxml/schemaloc.html#d574e168 for how to write a schemas.xml file that automatically finds the right schema, or just use the quickstart script above.
Linting with flycheck
If you use http://www.flycheck.org/, you can get linting of dix files like this:
(flycheck-define-checker xml-dix
  "Check using the dix.xsd from apertium-validate-dictionary."
  ;; TODO: Why doesn't plain apertium-validate-dictionary work here?
  :command ("xmllint" "--schema" "/usr/share/lttoolbox/dix.xsd" "--noout" "-")
  :standard-input t
  :error-patterns
  ((error line-start "-:" line ": " (message) line-end))
  :predicate (lambda ()
               (and (buffer-file-name)
                    (string-match "\\.dix$" buffer-file-name)))
  :modes (xml-mode nxml-mode))
(add-to-list 'flycheck-checkers 'xml-dix)
;; Turn on flycheck-mode automatically in dix-mode:
(add-hook 'dix-mode-hook #'flycheck-mode)
This assumes you installed lttoolbox from packages; if not, change /usr/share/lttoolbox/dix.xsd to the correct path.
If you've got https://github.com/ggm/vm-for-transfer-cpp compiled and installed to your $PATH, you can use that for some extra info on transfer errors like this:
(flycheck-define-checker apertium-transfervm
  "Alternative compiler for apertium transfer files."
  :command ("apertium-compile-transfer" "-i" source "-o" null-device)
  :error-patterns
  ((error line-start "Error: line " (id (one-or-more (not (any ",")))) ", "
          (message (one-or-more not-newline))
          line-end))
  :error-filter
  (lambda (errors)
    (dolist (err errors)
      (let* ((line (string-to-number
                    (replace-regexp-in-string "[^0-9]+" "" (flycheck-error-id err)))))
        ;; TODO: line number is at the end of the rule element, not very accurate!
        (setf (flycheck-error-line err) line)))
    errors)
  :predicate (lambda ()
               (and (buffer-file-name)
                    (string-match "\\.t[0-9s]x$" buffer-file-name)))
  :modes (nxml-mode))
(add-to-list 'flycheck-checkers 'apertium-transfervm)
;; If you've got the binary somewhere outside your $PATH, set it like this:
(setq flycheck-apertium-transfervm-executable "/home/me/src/vm-for-transfer-cpp/apertium-compile-transfer")
Note that the line numbers given by transfervm are at the end of the matching rule, not always at the exact line where the error occurred. But it's better than segfaults.
Yasnippet
Yasnippet is a snippet-expansion package for Emacs. It lets you write boilerplate faster. This section shows how to use the snippets made for dix-mode. There's a short screencast of it at https://asciinema.org/a/11192
To use, first install yasnippet by doing M-x package-refresh-contents and M-x package-install RET yasnippet RET (assuming you've added melpa to your package-archives).
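If MELPA is not yet in your package-archives, something like the following in ~/.emacs.d/init.el adds it. This is a sketch using the standard package.el variables; the archive URL is an assumption here (it is the one MELPA itself documents), not something this page specifies.

```elisp
(require 'package)
;; Append MELPA to the archives that list-packages and
;; package-install search.
(add-to-list 'package-archives
             '("melpa" . "https://melpa.org/packages/") t)
(package-initialize)
```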
Then get the dix-mode snippets. This will put them (and a bunch of other snippets) into the default snippets path:
git clone -b dix-mode https://github.com/unhammer/yasnippet-snippets.git ~/.emacs.d/snippets
Then put this into ~/.emacs.d/init.el to make the snippets available in dix-mode:
(eval-after-load 'yasnippet
  '(progn
     (setq yas-verbosity 1)
     (yas-reload-all)
     (remhash 'nxml-mode yas--tables) ; until https://github.com/AndreaCrotti/yasnippet-snippets/issues/41 is solved
     (add-to-list 'yas-key-syntaxes 'dix-yas-skip-backwards-to-key)
     ; The default is to use a point-and-click menu when there are several choices, I prefer ido:
     (setq yas-prompt-functions '(yas-ido-prompt
                                  yas-completing-prompt
                                  yas-dropdown-prompt
                                  yas-no-prompt))
     ))
(add-hook 'dix-mode-hook 'yas-minor-mode)
C++
See Emacs C style for Apertium hacking.
HFST
CG
There is a CG-mode for emacs in the vislcg3 repository (see Constraint Grammar). It's installed by default to $prefix/share/emacs/site-lisp/cg3-mode.el. Since /usr and /usr/local are in the default emacs load-path, you should be able to get going by simply putting this in your ~/.emacs.d/init.el:
(autoload 'cg-mode "cg.el" ; specify the full path to cg.el here if you installed to a non-standard prefix
  "cg-mode is a major mode for editing Constraint Grammar files." t)
(add-to-list 'auto-mode-alist '("\\.cg3\\'" . cg-mode))
(add-to-list 'auto-mode-alist '("\\.rlx\\'" . cg-mode))
This will load cg.el and run cg-mode when you open files with names ending in .cg3 or .rlx.
You can use C-; (alternative keybinding M-#) to quickly comment/uncomment a rule (quick demo). C-M-a/e move back and forth full rules (alternatively, M-a/e moves back/forth by "sentences" which includes commented rules).
If you want to test the CG while you're working on it from within Emacs, you can add a line like
# -*- cg-pre-pipe: "apertium -d . nb-nn-morph|cg-conv -a 2>/dev/null" -*-
to the top of your CG file (replace nb-nn-morph with whatever mode runs everything up until cg-proc in your regular mode, or just use something like lt-proc some.automorf.bin|cg-conv -a 2>/dev/null). Then close and re-open the file, and hit ! when you're asked whether you approve of the command (you only have to do this once).
Now you can do C-c C-i to type in some test text, then C-c C-c (either in that buffer or in the CG buffer) to test the CG on the text. You can do C-c c to toggle if you want to test the text for every change you do (some might find that annoying). You can click REMOVE, SELECT, MAP, ADD etc. in the output to go to the corresponding line, or use C-c C-n / C-c C-p to go back and forth between occurrences (also works for warnings and compile errors).
If you have a lot of input sentences you want to test at once, you can hide all analyses, except ones matching some regex. Select the output buffer, then hit u and type in a regex for analyses you want to see (e.g. vblex, or \b\(sg\|pl\)\b to match pl or sg but not the string "place"). Now you should see only the wordforms in the output buffer, except for analyses containing your exceptions. Type h to toggle between a full view and hiding (click a word when hiding and press h to ensure you're scrolled into the analysis of that word). See also the variable cg-sent-tag which is used to keep linebreaks after certain tags; if you use a non-Apertium sentence tag you may want to put in your ~/.emacs something like (setq cg-sent-tag "\\bpunct\\b") (if your sentence tag was punct).
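To check what such a regex matches before typing it into the prompt, you can evaluate a few calls in the *scratch* buffer. A small sketch; the helper name my-cg-regex-matches-p and the sample analysis strings are made up for illustration. Note that the backslashes are doubled because the regex is written inside a Lisp string, as in the cg-sent-tag example.

```elisp
(defun my-cg-regex-matches-p (regex analysis)
  "Return t if REGEX matches somewhere in the string ANALYSIS."
  (and (string-match-p regex analysis) t))

;; \\b is a word boundary, so "pl" matches only as a whole word.
(my-cg-regex-matches-p "\\b\\(sg\\|pl\\)\\b" "\"bil\" n m pl ind") ; matches pl
(my-cg-regex-matches-p "\\b\\(sg\\|pl\\)\\b" "\"place\" n")         ; no match
(my-cg-regex-matches-p "\\bpunct\\b" "\".\" punct")                ; matches
```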
IRC
Do M-x erc to start the IRC client. See http://www.emacswiki.org/emacs/ErcBasics and http://emacs-fu.blogspot.com/2009/06/erc-emacs-irc-client.html for more info.
See also
- ZenCoding lets you type section#main>e*2 and it turns it into the full <section id="main"><e></e><e></e></section>, etc.
- YASnippet is a template system (automatically expand abbreviations)