pytorch_end2end package

Module contents

class pytorch_end2end.CTCDecoder(beam_width=100, after_logsoftmax=False, blank_idx=0, time_major=False, labels=None, lm_path=None, lmwt=1.0, wip=1.0, oov_penalty=-10, case_sensitive=True)[source]

Decoder class to perform CTC decoding

Parameters
  • beam_width

    width of beam (number of stored hypotheses), default 100.

    If 1, the decoder always performs greedy (argmax) decoding.

  • after_logsoftmax

    whether the inputs are log-probabilities (logits after log softmax), default False.

    If False, the decoder expects raw logits, not passed through any softmax. Greedy decoding ignores this parameter and works with either raw logits or log-softmax outputs.

  • blank_idx – id of blank label, default 0

  • time_major – whether logits are time-major (otherwise batch-major), default False

  • labels – list of strings with labels (including blank symbol), e.g. ["_", "a", "b", "c"]

  • lm_path – path to language model (ARPA format or gzipped ARPA)

  • lmwt – language model weight, default 1.0; only meaningful when a language model is provided

  • wip – word insertion penalty, default 1.0; only meaningful when labels are provided

  • oov_penalty – penalty for each out-of-vocabulary (OOV) word, default -10.0

  • case_sensitive – whether to obtain language model scores with respect to case, default True
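
A minimal construction sketch; the label set and the "lm.arpa.gz" path are illustrative placeholders, not shipped with the package:

    import pytorch_end2end

    # Alphabet with the blank symbol at index 0, matching blank_idx=0.
    labels = ["_", "a", "b", "c", " "]

    # Plain beam-search decoder, no language model.
    decoder = pytorch_end2end.CTCDecoder(beam_width=100, labels=labels)

    # Decoder rescored with an ARPA language model; lmwt, wip and
    # oov_penalty only take effect when lm_path is given.
    lm_decoder = pytorch_end2end.CTCDecoder(
        beam_width=100,
        labels=labels,
        lm_path="lm.arpa.gz",  # hypothetical path
        lmwt=1.0,
        wip=1.0,
        oov_penalty=-10,
    )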

decode(logits, logits_lengths=None)[source]

Performs prefix beam search decoding as described in https://arxiv.org/abs/1408.2873

Parameters
  • logits

    tensor with neural network outputs (raw logits, or log-probabilities when the decoder was constructed with after_logsoftmax=True)

    of shape (sequence_length, batch_size, alphabet_size) if time_major

    else of shape (batch_size, sequence_length, alphabet_size)

  • logits_lengths – tensor with the valid length of each sequence in the batch; default None

Returns

namedtuple(decoded_targets, decoded_targets_lengths, decoded_sentences)

decoded_targets:

tensor with resulting targets of shape (batch_size, sequence_length); does not contain blank symbols

decoded_targets_lengths:

tensor with the lengths of the decoded targets

decoded_sentences:

list of strings of length batch_size. If labels is None, a list of empty strings is returned.
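
A usage sketch with random batch-major inputs (time_major defaults to False); the tensor shapes and the lengths values are illustrative assumptions:

    import torch
    import pytorch_end2end

    labels = ["_", "a", "b", "c"]  # blank symbol "_" at index 0
    decoder = pytorch_end2end.CTCDecoder(beam_width=10, labels=labels)

    batch_size, seq_len = 2, 50
    logits = torch.randn(batch_size, seq_len, len(labels))  # random raw logits
    lengths = torch.tensor([50, 42])  # valid length of each sequence

    result = decoder.decode(logits, lengths)
    print(result.decoded_targets)          # (batch_size, sequence_length), no blanks
    print(result.decoded_targets_lengths)  # lengths of the decoded targets
    print(result.decoded_sentences)        # batch_size strings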

decode_greedy(logits, logits_lengths=None)[source]

Performs greedy (argmax) decoding

Parameters
  • logits

    tensor with neural network outputs (raw logits or log-probabilities; greedy decoding accepts either)

    of shape (sequence_length, batch_size, alphabet_size) if time_major

    else of shape (batch_size, sequence_length, alphabet_size)

  • logits_lengths – tensor with the valid length of each sequence in the batch; default None

Returns

(decoded_targets, decoded_targets_lengths, decoded_sentences)

decoded_targets:

tensor with resulting targets of shape (batch_size, sequence_length); does not contain blank symbols

decoded_targets_lengths:

tensor with the lengths of the decoded targets

decoded_sentences:

list of strings of length batch_size. If labels is None, a list of empty strings is returned.
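
decode_greedy follows the same calling convention; a short sketch reusing the decoder and tensors from the example above:

    targets, target_lengths, sentences = decoder.decode_greedy(logits, lengths)
    print(sentences)  # same structure as decode, built without beam search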

class pytorch_end2end.CTCEncoder(characters, blank_id=0, transform_fn=str.upper)[source]

Simple CTC encoder for text

clean(text)[source]

decode(ids_list)[source]

decode_pure(ids_list)[source]

encode(text)[source]
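
The encoder methods carry no docstrings in this reference; the sketch below is a plausible reading, assuming encode maps text to label ids and decode maps ids back to text. The characters value is illustrative:

    import pytorch_end2end

    # Blank assumed at index 0 (blank_id=0); the default transform_fn
    # (str.upper) normalizes case before encoding.
    encoder = pytorch_end2end.CTCEncoder(characters="_ABC ")

    ids = encoder.encode("abc")  # assumed: list of label ids after str.upper
    text = encoder.decode(ids)   # assumed: inverse mapping back to a string
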
class pytorch_end2end.CTCLoss(size_average=None, reduce=None, after_logsoftmax=False, time_major=False, blank_idx=0)[source]

Criterion to compute CTC Loss as described in http://www.cs.toronto.edu/~graves/icml_2006.pdf

Parameters
  • size_average – whether to average the loss over the batch (only applies when reduce is True)

  • reduce – whether to reduce the per-sample losses to a single value (if None, returns the full tensor of shape (batch_size,))

  • after_logsoftmax

    whether log softmax has already been applied to the network outputs

    (if False, the criterion takes raw network outputs)

  • time_major – whether logits are time-major (otherwise batch-major), default False

  • blank_idx – id of blank label, default 0

training: bool
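
A sketch of typical use. The forward call signature (logits, targets, logits_lengths, targets_lengths) is an assumption by analogy with standard CTC criteria; it is not documented in this reference:

    import torch
    import pytorch_end2end

    criterion = pytorch_end2end.CTCLoss(reduce=None)  # keep per-sample losses

    batch_size, seq_len, alphabet_size = 2, 50, 4
    logits = torch.randn(batch_size, seq_len, alphabet_size)  # raw outputs, batch-major
    targets = torch.randint(1, alphabet_size, (batch_size, 20))  # ids > 0, no blanks
    logits_lengths = torch.tensor([50, 42])
    targets_lengths = torch.tensor([20, 15])

    loss = criterion(logits, targets, logits_lengths, targets_lengths)  # assumed signature
    print(loss.shape)  # (batch_size,) because reduce is None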