Machine Translation: Mindmap
Basic concepts
- homonymy, ambiguity, polysemy, coreference, anaphora, word order, presupposition, garden path, FAHQMT
History of MT
- Georgetown experiment
- ALPAC report
Rule-based MT (RBMT)
- Classification of rule-based MT systems
- Direct
- METEO
- Transfer
- Transfer rules
- PC Translator, SYSTRAN
- Interlingua (KBMT ‒ Knowledge-based MT)
- Rosetta, KBMT-89
- Vauquois’ triangle
- Tokenization
- Sentence segmentation
- Morphology level
- Lexical level
- listeme
- principle of compositionality
- multiword expression
- Named entity (NE)
- Word sense disambiguation (WSD)
- Word sense representation
- Lexicons
- Statistical dictionary
- Bilingual extraction from parallel data
- Co-occurrence statistics
- LogDice
- Syntax level
- Grammar
- Syntactic analysis / parsing
- Top-down analysis
- Bottom-up analysis
- Syntax representation / formalism
- Constituent (phrasal) structure
- Dependency
- Context-free grammar
- Tree-adjoining grammar
- Garden path
- Semantic level
Statistical MT
- Noisy channel principle
- Zipf’s law, probability distribution, conditional probability, Bayes’s rule
- Language model
- N-grams
- Chain rule
- Markov assumption
- Maximum likelihood estimation
- Entropy, cross entropy
- Perplexity
- Shannon’s game
- Smoothing
- Add-one (Laplace), Add-α, Deleted estimation, Good-Turing
- Interpolation, back-off
- Hapax legomenon, singleton
- Out-of-vocabulary, zero-frequency, rare word
- Translation model
- Word alignment
- Lexical translation
- IBM models I-V
- Fertility model
- Null token
- Expectation-Maximization algorithm
- Phrase-based translation model
- Consistent phrase
- Decoding
- Beam search
- Moses
- Parallel corpora
- Sentence alignment
- Gale-Church, Hunalign, Bleualign
- Word alignment (GIZA++, Moses)
- EUR-Lex, OPUS, Hansards, Europarl, Acquis communautaire, InterCorp, Tatoeba
- Translation memories (TMX, XLIFF)
- DGT, MyMemory
- Comparable corpora
- Sentence alignment
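
As a small illustration of the language-model items above (n-grams, maximum likelihood estimation, add-one / add-α smoothing, perplexity), here is a minimal sketch of a bigram model. The function names and the `<s>` / `</s>` padding symbols are assumptions made for this example, not part of any particular toolkit.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams over tokenized sentences padded with <s> and </s>."""
    histories, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        histories.update(padded[:-1])            # every token that is followed by something
        bigrams.update(zip(padded, padded[1:]))  # (previous word, word) pairs
    # Words the model can predict: all seen tokens plus the end-of-sentence symbol.
    vocab = {w for tokens in sentences for w in tokens} | {"</s>"}
    return histories, bigrams, vocab

def prob(word, prev, histories, bigrams, vocab, alpha=1.0):
    """Add-alpha estimate of P(word | prev); alpha=1.0 is add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + alpha) / (histories[prev] + alpha * len(vocab))

def perplexity(sentences, histories, bigrams, vocab, alpha=1.0):
    """Perplexity = 2 ** cross-entropy, measured in bits per predicted token."""
    log_sum, n = 0.0, 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            log_sum += math.log2(prob(word, prev, histories, bigrams, vocab, alpha))
            n += 1
    return 2 ** (-log_sum / n)

corpus = [["a", "b"], ["a", "c"]]
histories, bigrams, vocab = train_bigram(corpus)
print(prob("a", "<s>", histories, bigrams, vocab))
print(perplexity(corpus, histories, bigrams, vocab))
```

With `alpha=0` the estimate reduces to plain maximum likelihood, which assigns zero probability to unseen bigrams and makes the perplexity undefined; that is the problem smoothing addresses. Out-of-vocabulary words are not handled specially in this sketch.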
Neural MT
- 1-of-V coding, one-hot representation
- Word embeddings, distributed representation
- skip-gram, CBOW, fastText
- sentence embeddings
- document embeddings
- Feed-forward model
- Recurrent NN, bi-RNN
- Attention mechanism
- Long short-term memory (LSTM)
- Encoder-decoder model
Hybrid MT
Computer-assisted translation
- Translation memory
- SDL Trados
Evaluation of MT
- Fluency, adequacy, accuracy, intelligibility, correlation, metrics, post-editing, reference translation
- Interannotator agreement (IAA)
- Manual evaluation
- Automatic evaluation
- BLEU
- NEVA
- WAFT
- TER / HTER
- METEOR
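
To make the automatic-evaluation items more concrete, the following is a rough sketch of a BLEU-style score for one candidate against one reference translation: clipped (modified) n-gram precision, a geometric mean over n-gram orders, and a brevity penalty. It is a simplification under stated assumptions: BLEU is normally computed at corpus level, possibly with several references, and the function names here are made up for the example.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU sketch with a single reference and no smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat on the mat".split()
print(bleu("the cat sat on the mat".split(), reference))
print(bleu("the the the the".split(), reference, max_n=1))
```

The second call shows why clipping matters: "the" occurs four times in the candidate but only twice in the reference, so the unigram precision is 2/4 rather than 4/4, and the short candidate is further penalized by the brevity penalty.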
published: 2019-12-29
last modified: 2023-12-22
https://vit.baisa.cz/notes/learn/mt-mindmap/