MaltOptimizer: A System For MaltParser Optimization

Download MaltOptimizer-SPMRL

MaltOptimizer is distributed under this open source license. It is easy to install and use. If you have any questions, contact us.

Latest release

MaltOptimizer-SPMRL, September 2 2013: MaltOptimizerSPMRL.tar.gz, MaltOptimizerSPMRL.zip

Usage

Run the new MaltOptimizer implementation for each of the three phases as usual:
  java -jar MaltOptimizerSPMRL.jar -p 1 -m <MaltParser jar path> -c <training set>
  java -jar MaltOptimizerSPMRL.jar -p 2 -m <MaltParser jar path> -c <training set>
  java -jar MaltOptimizerSPMRL.jar -p 3 -m <MaltParser jar path> -c <training set>
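The three calls above differ only in the phase number, so they can be driven from a small wrapper. The following is a minimal sketch in Python, not part of the distribution: the function names are made up for illustration, and the jar and file paths are whatever you pass in. It only builds and runs the command lines shown above.

```python
import subprocess


def phase_commands(optimizer_jar, maltparser_jar, training_set):
    """Build the command line for each phase, in the order listed above."""
    return [
        ["java", "-jar", optimizer_jar, "-p", str(phase),
         "-m", maltparser_jar, "-c", training_set]
        for phase in (1, 2, 3)
    ]


def run_all_phases(optimizer_jar, maltparser_jar, training_set):
    """Run the phases one after another, stopping at the first failure."""
    for cmd in phase_commands(optimizer_jar, maltparser_jar, training_set):
        print(" ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
```

For example, run_all_phases("MaltOptimizerSPMRL.jar", "<MaltParser jar path>", "<training set>") with the placeholders replaced by your own paths.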
Once you have your optimized settings, run:
java -jar FromSimpleToComplex.jar <training-set> <training-set> <new-trainingset-10+n>
java -jar FromSimpleToComplex.jar <training-set> <test-set> <new-testset-10+n>
Here <training-set> is the training set used to obtain the optimized settings, and <test-set> is the held-out test set used for evaluation.
<new-trainingset-10+n> and <new-testset-10+n> are the new training and test sets, which contain extra columns with the morphological features.
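To make the idea of "10+n" columns concrete, here is a minimal Python sketch of one plausible reading: n is the number of distinct attribute names found in the FEATS column of the training set, and each token gets one extra column per attribute. This is an assumption made for illustration only. The actual column layout written by FromSimpleToComplex.jar is not described here and may differ, and the function names below are hypothetical.

```python
def feature_names(conll_lines):
    """Collect the sorted attribute names seen in the FEATS column (column 6)."""
    names = set()
    for line in conll_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 6 or cols[5] == "_":
            continue  # blank separator line, or token without features
        for pair in cols[5].split("|"):
            name, sep, _value = pair.partition("=")
            if sep:
                names.add(name)
    return sorted(names)


def expand_line(line, names):
    """Append one column per attribute name; '_' marks an absent attribute.

    Assumed layout only: the real tool may order or encode columns differently.
    """
    line = line.rstrip("\n")
    if not line:
        return line  # keep the blank line that separates sentences
    cols = line.split("\t")
    feats = {}
    if len(cols) > 5 and cols[5] != "_":
        for pair in cols[5].split("|"):
            name, sep, value = pair.partition("=")
            if sep:
                feats[name] = value
    return "\t".join(cols + [feats.get(name, "_") for name in names])
```

Under this reading, taking the attribute names from the training set for both conversions would keep the training and test files aligned on the same extra columns, which would be consistent with <training-set> being the first argument in both calls.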

After that, you can train and parse with MaltParser.
Check that the finalOptionsFile.xml produced by MaltOptimizerSPMRL points to the correct files:
java -jar <MaltParser jar path> -f finalOptionsFile.xml -F <path to the feature model suggested>
java -jar <MaltParser jar path> -c langModel.xml -i <new-testset-10+n>
You can also transform your output file back to the 10-column format with:
java -jar FromComplexToSimple.jar <training-set> <output-test-set-10+n> <output-10columns>
After that, you can evaluate the output with standard evaluation tools such as eval.pl or eval07.pl.
(The script FromSimpleToComplexForTestSet.jar, run with java -jar, is for blind test sets that have only 6 columns.)

The scripts mentioned above are available here: Scripts.zip
Note that your data set should be in the CoNLL-X data format, with the FEATS column in the format a=x|b=y|...|c=z.
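Before running the optimizer, it can help to check that a file meets this requirement. The following Python sketch is not part of the distribution and the function name is hypothetical. It assumes tab-separated CoNLL-X lines with 10 columns, FEATS in the sixth column, blank lines between sentences, and "_" as an accepted empty FEATS value (the CoNLL-X convention; whether the tools here accept it is an assumption).

```python
import re

# One or more name=value pairs joined by '|', e.g. a=x|b=y|c=z
FEATS_PATTERN = re.compile(r"[^=|\s]+=[^=|\s]+(\|[^=|\s]+=[^=|\s]+)*")


def check_conll_feats(lines):
    """Return (line_number, problem) pairs for lines that break the expected format."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # sentence separator
        cols = line.split("\t")
        if len(cols) != 10:
            problems.append((number, f"expected 10 columns, found {len(cols)}"))
            continue
        feats = cols[5]
        if feats != "_" and not FEATS_PATTERN.fullmatch(feats):
            problems.append((number, f"FEATS not in a=x|b=y form: {feats}"))
    return problems
```

A typical use is to open the training file, pass the file object to check_conll_feats, and print any reported line numbers before starting phase 1.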