lt-proc(1)							    lt-proc(1)

NAME
       lt-proc - This application is part of the lexical processing modules
       and tools (lttoolbox)

       This tool is part of the	 apertium  machine  translation	 architecture:
       http://www.apertium.org.

SYNOPSIS
       lt-proc	[  -a | -b | -o | -c | -d | -e | -g | -n | -p | -s | -t | -v |
       -h -z -w ] fst_file [input_file [output_file]]

       lt-proc [ --analysis | --bilingual | --surf-bilingual |
       --case-sensitive | --debugged-gen | --decompose-nouns |
       --generation | --non-marked-gen | --tagged-gen |
       --post-generation | --sao | --transliteration |
       --null-flush --dictionary-case --decompose-compounds |
       --version | --help ] fst_file [input_file [output_file]]

DESCRIPTION
       lt-proc is the application responsible for providing the following
       four lexical processing functions:

	      · morphological analyser	( option -a )

	      · lexical transfer  ( option -b )

	      · morphological generator	 ( option -g )

	      · post-generator	( option -p )

       It  accomplishes	 these tasks by reading binary files containing a com‐
       pact and efficient representation of dictionaries (a class  of  finite-
       state transducers called augmented letter transducers). These files are
       generated by lt-comp(1).
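
       The compile-then-process workflow described above can be sketched
       from a script as follows. This is only an illustration: the helper
       names and the file names in the usage note are hypothetical, the
       "lr"/"rl" direction argument is taken from lt-comp(1), and running
       analyse() assumes lttoolbox is installed and on the PATH.

```python
import subprocess


def lt_comp_argv(direction, dix_file, fst_file):
    """Build the argument list for compiling a dictionary with lt-comp(1)."""
    if direction not in ("lr", "rl"):
        raise ValueError("direction must be 'lr' or 'rl'")
    return ["lt-comp", direction, dix_file, fst_file]


def lt_proc_argv(mode_flag, fst_file):
    """Build the argument list for lt-proc reading stdin, writing stdout."""
    return ["lt-proc", mode_flag, fst_file]


def analyse(text, fst_file):
    """Run `lt-proc -a fst_file` on text and return its standard output."""
    result = subprocess.run(
        lt_proc_argv("-a", fst_file),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

       For example, lt_comp_argv("lr", "es.dix", "es.bin") yields the
       command that would compile a (hypothetical) es.dix into es.bin,
       which analyse("daba", "es.bin") could then use.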

       Note that some characters (`[', `]', `$', `^', `/', `+') are special
       characters used for format and encapsulation, and should be escaped
       if they have to be used literally. For instance, text enclosed in
       `['...`]' is ignored, and a lexical unit is delimited as `^...$'.
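
       A minimal sketch of escaping the special characters listed above
       before handing literal text to lt-proc. This page does not name the
       escape character; the sketch assumes a backslash, and the function
       name is hypothetical.

```python
# Special characters as listed in this page.
SPECIAL_CHARS = "[]$^/+"


def escape_literal(text, escape_char="\\"):
    """Prefix each special character with escape_char.

    Assumption: a backslash is the escape character. The escape character
    itself is escaped as well, so the result is unambiguous.
    """
    out = []
    for ch in text:
        if ch in SPECIAL_CHARS or ch == escape_char:
            out.append(escape_char)
        out.append(ch)
    return "".join(out)
```

       With this sketch, escape_literal("a/b [c]") produces the text
       a\/b \[c\], while text without special characters is unchanged.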

OPTIONS
       -a, --analysis
	      Tokenizes	 the  text  in	surface	 forms	(lexical units as they
	      appear in texts) and delivers, for each  surface	form,  one  or
	      more  lexical  forms  consisting	of lemma, lexical category and
	      morphological  inflection	 information.  Tokenization   is   not
	      straightforward  due  to the existence, on the one hand, of con‐
	      tractions, and, on the other hand, of multi-word lexical	units.
	      For  contractions, the system reads in a single surface form and
	      delivers the corresponding sequence of lexical forms. Multi-word
	      surface  forms  are  analysed  in a left-to-right, longest-match
	      fashion. Multi-word surface forms may be invariable (such as a
	      multi-word preposition or conjunction) or inflected (for
	      example, in Spanish, "echaban de menos", "they missed", is a
	      form of the imperfect indicative tense of the verb "echar de
	      menos", "to miss"). Limited support for some kinds of
	      discontinuous multi-word units is also available. Analysis of
	      single-word surface forms produces output like that in these
	      examples: "cantar" -> `^cantar/cantar<vblex><inf>$' or
	      "daba" ->
	      `^daba/dar<vblex><pii><p1><sg>/dar<vblex><pii><p3><sg>$'.
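
	      The left-to-right, longest-match behaviour can be illustrated
	      with a toy analyser. This is a sketch only, not how lt-proc is
	      implemented (lt-proc uses compiled transducers): it works on
	      whitespace-separated words, the lexicon is a plain dictionary,
	      the multi-word and preposition entries carry made-up tags, and
	      marking unknown words with `*' in the output is an assumption.

```python
TOY_LEXICON = {
    "cantar": ["cantar<vblex><inf>"],
    "daba": ["dar<vblex><pii><p1><sg>", "dar<vblex><pii><p3><sg>"],
    # Hypothetical entries; these tags are illustrative only.
    "echaban de menos": ["echar de menos<vblex><pii><p3><pl>"],
    "de": ["de<pr>"],
}


def analyse(text, lexicon=TOY_LEXICON):
    """Tokenize left to right, preferring the longest known surface form."""
    words = text.split()
    max_len = max(len(key.split()) for key in lexicon)
    units = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in lexicon:
                analyses = "/".join(lexicon[candidate])
                units.append("^" + candidate + "/" + analyses + "$")
                i += n
                break
        else:
            # Unknown word: assumed to be marked with an asterisk.
            units.append("^" + words[i] + "/*" + words[i] + "$")
            i += 1
    return " ".join(units)
```

	      Here "echaban de menos" is matched as one unit even though "de"
	      is also a known single-word form, because the longer match is
	      tried first.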

       -b, --bilingual
	      Performs lexical transfer, attaching queues (trailing sequences)
	      of morphological symbols not specified in the dictionaries. Like
	      the analysis mode, it supports multiple lexical forms in the
	      target language for a given lexical form in the source language.
	      It typically works with the output of apertium-pretransfer.
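
	      The idea of attaching a queue of unspecified symbols can be
	      sketched as follows. The toy bilingual entries, the function
	      name, the `^source/target$' output layout and the `@' mark for
	      forms missing from the dictionary are all assumptions made for
	      illustration, not a description of lt-proc's internals.

```python
import re

TAG_RE = re.compile(r"<[^<>]+>")

# Hypothetical bilingual entries: source prefix -> target prefixes.
TOY_BIDIX = {
    "dar<vblex>": ["give<vblex>"],
    "casa<n><f>": ["house<n>", "home<n>"],
}


def transfer(unit, bidix=TOY_BIDIX):
    """Translate one ^lexical form$ and re-attach the unmatched tag queue."""
    source = unit[1:-1]
    lemma = source.split("<", 1)[0]
    tags = TAG_RE.findall(source)
    # Match the longest tag prefix present in the dictionary.
    for n in range(len(tags), -1, -1):
        key = lemma + "".join(tags[:n])
        if key in bidix:
            queue = "".join(tags[n:])  # symbols the dictionary did not specify
            targets = [target + queue for target in bidix[key]]
            return "^" + source + "/" + "/".join(targets) + "$"
    # Not found: assumed to be flagged with an `@' mark.
    return "^" + source + "/@" + source + "$"
```

	      For the input `^casa<n><f><pl>$' the sketch yields two target
	      forms, each with the queue `<pl>' appended.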

       -o, --surf-bilingual
	      Like -b, but takes input from apertium-tagger -p, which includes
	      surface forms; if the lexical form is not found in the bilingual
	      dictionary, it outputs the surface form of the word.

       -c, --case-sensitive
	      Use the literal case of the incoming characters

       -d, --debugged-gen
	      Morphological generation (like -g) with all marks kept in the
	      output, for debugging.

       -e, --decompose-compounds
	      Try to treat unknown words as compounds, and decompose them.

       -w, --dictionary-case
	      Use the case information contained in the	 lexicon,  instead  of
	      the surface case (only applied in analysis mode).

       -g, --generation
	      Delivers a target-language surface form for each target-language
	      lexical form, by suitably inflecting it.

       -n, --non-marked-gen
	      Morphological generation (like  -g)  but	without	 unknown  word
	      marks (asterisk `*').
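
	      The difference between -g and -n can be sketched with a toy
	      generator. The lookup table is hypothetical, unknown words are
	      assumed to arrive as `^*word$', and leaving an unrecognised
	      lexical form untouched is a choice made for this sketch only.

```python
import re

UNIT_RE = re.compile(r"\^(.*?)\$")

# Hypothetical generation table: lexical form -> surface form.
TOY_GENERATOR = {
    "cantar<vblex><inf>": "cantar",
    "dar<vblex><pii><p3><sg>": "daba",
}


def generate(stream, mark_unknown=True, table=TOY_GENERATOR):
    """Replace each ^lexical form$ with a surface form.

    mark_unknown=True mimics -g (keep the `*' mark on unknown words);
    mark_unknown=False mimics -n (drop the mark).
    """
    def replace(match):
        lexical_form = match.group(1)
        if lexical_form.startswith("*"):
            word = lexical_form[1:]
            return "*" + word if mark_unknown else word
        # Sketch-only choice: unknown lexical forms are left as they are.
        return table.get(lexical_form, match.group(0))

    return UNIT_RE.sub(replace, stream)
```

	      Text outside `^...$' units, such as the blanks between words,
	      is copied through unchanged.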

       -l, --tagged-gen
	      Morphological  generation (like -g) but retaining part-of-speech
	      tags.

       -p, --post-generation
	      Performs orthographical  operations  such	 as  contractions  and
	      apostrophations.	The  post-generator  is	 usually dormant (just
	      copies the input to the output) until  a	special	 alarm	symbol
	      contained	 in  some target-language surface forms wakes it up to
	      perform a particular string transformation if necessary; then it
	      goes back to sleep.
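
	      The dormant-until-woken behaviour can be sketched as below. The
	      alarm symbol is assumed to be `~', the contraction rules are
	      toy Spanish examples, and the function name is hypothetical;
	      real rules come from a compiled post-generation dictionary.

```python
ALARM = "~"  # assumed alarm symbol

# Hypothetical post-generation rules: source string -> replacement.
TOY_RULES = {
    "de el": "del",
    "a el": "al",
}


def post_generate(text, rules=TOY_RULES, alarm=ALARM):
    """Copy text through, applying a rule only right after an alarm symbol."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != alarm:
            out.append(text[i])  # dormant: plain copy
            i += 1
            continue
        rest = text[i + 1:]
        # Awake: try the longest rule that matches a whole word sequence.
        for source in sorted(rules, key=len, reverse=True):
            end = len(source)
            at_boundary = end == len(rest) or not rest[end:end + 1].isalpha()
            if rest.startswith(source) and at_boundary:
                out.append(rules[source])
                i += 1 + end
                break
        else:
            i += 1  # no rule applies: drop the alarm and go back to sleep
    return "".join(out)
```

	      For instance, "vengo ~de el mercado" becomes "vengo del
	      mercado", while text without the alarm symbol is copied as is.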

       -s, --sao
	      Input processing is in orthoepikon (previously `sao') annotation
	      system format: http://orthoepikon.sf.net.

       -t, --transliteration
	      Apply a transliteration dictionary

       -z, --null-flush
	      Flush output on the null character
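
	      The general pattern behind null flushing, which lets a
	      long-running process answer one request at a time, can be
	      sketched as follows. This illustrates the idea only and is not
	      lt-proc's code; the function name and the transform callback
	      are hypothetical.

```python
def process_null_flushed(reader, writer, transform):
    """Process NUL-terminated chunks, flushing the output after each one."""
    buffer = []
    while True:
        ch = reader.read(1)
        if ch == "":
            break  # end of input
        if ch == "\0":
            writer.write(transform("".join(buffer)))
            writer.write("\0")  # echo the delimiter so the reader can sync
            writer.flush()
            buffer = []
        else:
            buffer.append(ch)
    if buffer:
        # Trailing text without a final NUL is still processed.
        writer.write(transform("".join(buffer)))
        writer.flush()
```

	      A caller can then write one chunk followed by a null character
	      and read the reply up to the next null character, without
	      restarting the process for every chunk.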

       -v, --version
	      Display the version number.

       -h, --help
	      Display this help.

FILES
       fst_file   The input compiled dictionary.

SEE ALSO
       lt-expand(1), lt-comp(1), apertium-tagger(1), apertium(1).

BUGS
       Lots of...lurking in the dark and waiting for you!

AUTHOR
       (c) 2005,2006 Universitat d'Alacant / Universidad de Alicante.

				  2006-03-23			    lt-proc(1)