CUED-RNNLM v1.0

Usage

./rnnlm.cued.v1.0 [-train | -ppl | -nbest | -sample] [-options]

Train

example (CE training): ./rnnlm.cued.v1.0 -train -trainfile data/train.dat -validfile data/dev.dat -device 0 -minibatch 64 -chunksize 32 -layers 31858:200i:200g:20002 -traincrit ce -inputwlist ./wlists/input.wlist -outputwlist ./wlists/output.wlist -debug 2 -randseed 1 -writemodel h200.mb64/rnnlm.txt -learnrate 1.0

example (VR training): ./rnnlm.cued.v1.0 -train -trainfile data/train.dat -validfile data/dev.dat -device 0 -minibatch 64 -chunksize 32 -layers 31858:200i:200g:20002 -traincrit vr -lognormconst 9.0 -vrpenalty 0.5 -inputwlist ./wlists/input.wlist -outputwlist ./wlists/output.wlist -debug 2 -randseed 1 -writemodel h200.mb64/rnnlm.txt -learnrate 1.0

example (NCE training): ./rnnlm.cued.v1.0 -train -trainfile data/train.dat -validfile data/dev.dat -device 0 -minibatch 64 -chunksize 32 -layers 31858:200i:200g:20002 -traincrit nce -lognormconst 9.0 -ncesample 1000 -inputwlist ./wlists/input.wlist -outputwlist ./wlists/output.wlist -debug 2 -randseed 1 -writemodel h200.mb64/rnnlm.txt -learnrate 1.0

example (CE training with additional input feature): ./rnnlm.cued.v1.0 -train -trainfile data/train.fea.dat -validfile data/dev.fea.dat -device 0 -minibatch 64 -chunksize 32 -layers 31858:200i:200g:20002 -traincrit ce -feafile ./data/feature.mat -inputwlist ./wlists/input.wlist -outputwlist ./wlists/output.wlist -debug 2 -randseed 1 -writemodel h200.mb64/rnnlm.txt -learnrate 1.0

Note: when a feature matrix is specified, each line of the train and valid data must begin with the feature id

-trainfile <string> : training text file; each line starts with <s> and ends with </s>

-validfile <string> : validation text file, same format as the training file

-feafile   <string> : additional feature file for the input layer; the first line gives the number and dimension of the features. When -feafile is used, the train and valid files must be modified so that the first item in each line is the feature id, followed by the sentence.
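The feature-file convention can be illustrated with a short parsing sketch. The exact on-disk layout is an assumption inferred from the description above, and `parse_feature_file` / `parse_train_line` are hypothetical helpers, not part of the toolkit:

```python
# Assumed formats (inferred from the documentation, not from the toolkit source):
#   feature file : first line "<num_features> <dim>", then one feature vector per line
#   train file   : "<feature_id> <s> w1 w2 ... </s>" per line

def parse_feature_file(lines):
    """Return a list of feature vectors, validated against the header line."""
    num, dim = (int(x) for x in lines[0].split())
    feats = [[float(v) for v in line.split()] for line in lines[1:]]
    assert len(feats) == num and all(len(f) == dim for f in feats)
    return feats

def parse_train_line(line):
    """Split a feature-annotated training line into (feature_id, word list)."""
    fields = line.split()
    return int(fields[0]), fields[1:]
```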

-device <int>: GPU device id for RNNLM training (default: 0)

-minibatch <int> : specify the minibatch size for RNNLM training (default: 32)

-chunksize <int> : specify the chunk size for RNNLM training (default: 32)

-layers <int>:<int>i:<int><char>:...:<int> : specify the model structure of the RNNLM (including the input and output layers).

The first layer has to be 'i', a linear projection layer; the second layer has several options:

'r': sigmoid based recurrent layer

'g': GRU based recurrent layer

'm': LSTM based recurrent layer

'x': GRU with highway connection based recurrent layer

'y': LSTM with highway connection based recurrent layer

'l': linear layer
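The -layers string can be read as alternating sizes and layer-type codes. The parser below is an illustration of that reading (drawn from the examples above, not from the toolkit source): "31858:200i:200g:20002" describes an input layer of 31858 words, a 200-unit projection ('i'), a 200-unit GRU ('g'), and a 20002-word output layer.

```python
# Layer-type codes as listed in the documentation above.
LAYER_TYPES = {"i": "projection", "r": "sigmoid RNN", "g": "GRU",
               "m": "LSTM", "x": "GRU-highway", "y": "LSTM-highway",
               "l": "linear"}

def parse_layers(spec):
    """Return [(size, kind), ...]; first and last fields are input/output sizes."""
    fields = spec.split(":")
    layers = [(int(fields[0]), "input")]
    for f in fields[1:-1]:
        size, code = int(f[:-1]), f[-1]        # e.g. "200g" -> (200, "g")
        layers.append((size, LAYER_TYPES[code]))
    layers.append((int(fields[-1]), "output"))
    return layers
```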

-clipping <float> : specify the clipping threshold for the backpropagated error (default: 5)

-dropout <float> : specify the dropout rate for training (default: 0.0)

-traincrit <string> : specify the training criterion for RNNLM [ce (default) | nce | vr]

-lrtune <string> : specify the method of learning rate tuning for RNNLM training [newbob (default) | adagrad | rmsprop]

-inputwlist <string> : specify the input word list for RNNLM training

-outputwlist <string> : specify the output word list for RNNLM training

-learnrate <float> : specify the initial learning rate for RNNLM training (default: 0.8)

-momentum <float> : specify the momentum for RNNLM training (default: 0.0)

-vrpenalty <float> : specify the penalty for RNNLM training with variance regularization (default: 0.0)
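One common formulation of variance regularization, which -vrpenalty and -lognormconst appear to control, adds a penalty that keeps each softmax log-normalizer log Z(h) close to a constant, so the normalization sum can be skipped at test time. The sketch below is our reading of that objective, not the toolkit's exact code:

```python
# Hypothetical sketch: cross-entropy plus a quadratic penalty pulling each
# softmax log-normalizer toward the constant log_const (cf. -lognormconst),
# weighted by `penalty` (cf. -vrpenalty).

def vr_loss(target_logprobs, log_norms, log_const, penalty):
    """Variance-regularized training loss over one minibatch."""
    ce = -sum(target_logprobs) / len(target_logprobs)
    var = sum((z - log_const) ** 2 for z in log_norms) / len(log_norms)
    return ce + 0.5 * penalty * var
```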

-ncesample <int> : specify the number of noise samples for NCE-based RNNLM training (default: 10)

-lognormconst <float> : specify the log normalization constant for NCE training and unnormalized evaluation (default: -1.0)
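The point of the constant is speed at test time: instead of summing over the full output vocabulary, the softmax denominator is treated as the fixed value exp(logC). The comparison below sketches that idea (our illustration, with made-up activations; not toolkit code):

```python
import math

def softmax_logprob(activations, idx):
    """Exact log probability: requires a sum over the whole output layer."""
    log_z = math.log(sum(math.exp(a) for a in activations))
    return activations[idx] - log_z

def unnormalized_logprob(activations, idx, log_const):
    """Approximate log probability: the normalizer is assumed constant."""
    return activations[idx] - log_const
```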

-cachesize <int> : specify the cache size for RNNLM training (default: 0)

-debug <int> : specify the debug level (default: 1)

-nthread <int> : specify the number of threads for computation (default: 1)

-randseed <int> : specify the random seed used to generate random values (default: 1)

-readmodel <string> : specify the RNNLM model to be read

-writemodel <string> : specify the RNNLM model to be written

PPL evaluation (-ppl)

example: ./rnnlm.cued.v1.0 -ppl -readmodel h200.mb64/rnnlm.txt -testfile data/test.dat -inputwlist ./wlists/input.wlist -outputwlist ./wlists/output.wlist -nglmstfile ng.st -lambda 0.5 -debug 2

-readmodel <string> : specify the RNNLM model to be read

-testfile <string> : specify the test file for RNNLM evaluation

-nglmstfile <string> : specify the N-gram LM stream file for interpolation

-lambda <float> : specify the interpolation weight for RNNLM when interpolating with N-Gram LM (default: 0.5)
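The -lambda option controls the standard linear interpolation of the two models: the combined probability of a word is lambda * P_rnn + (1 - lambda) * P_ngram, and perplexity is the exponential of the average negative log probability. A minimal sketch (variable names are ours):

```python
import math

def interpolate(p_rnn, p_ng, lam):
    """Linear interpolation of RNNLM and N-gram probabilities (cf. -lambda)."""
    return lam * p_rnn + (1.0 - lam) * p_ng

def perplexity(probs):
    """Perplexity of a word sequence from its per-word probabilities."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))
```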

-fullvocsize <int> : specify the full vocabulary size; all out-of-shortlist (OOS) words share the OOS probability mass

-inputwlist <string> : specify the input word list for RNNLM evaluation

-outputwlist <string> : specify the output word list for RNNLM evaluation

-debug <int> : specify the debug level (default: 1)

N-best rescore (-nbest)

example: ./rnnlm.cued.v1.0 -nbest -readmodel h200.mb64/rnnlm.txt.nbest -testfile data/test.dat -inputwlist ./wlists/input.wlist -outputwlist ./wlists/output.wlist -nglmstfile ng.st -lambda 0.5 -debug 2

-readmodel <string> : specify the RNNLM model to be read

-testfile <string> : specify the test file for RNNLM evaluation

-nglmstfile <string> : specify the N-gram LM stream file for interpolation

-lambda <float> : specify the interpolation weight for RNNLM when interpolating with N-Gram LM (default: 0.5)

-fullvocsize <int> : specify the full vocabulary size; all out-of-shortlist (OOS) words share the OOS probability mass

-inputwlist <string> : specify the input word list for RNNLM evaluation

-outputwlist <string> : specify the output word list for RNNLM evaluation

-debug <int> : specify the debug level (default: 1)