pytorch_end2end package
Module contents
class pytorch_end2end.CTCDecoder(beam_width=100, after_logsoftmax=False, blank_idx=0, time_major=False, labels=None, lm_path=None, lmwt=1.0, wip=1.0, oov_penalty=-10, case_sensitive=True)[source]

Decoder class to perform CTC decoding.

Parameters
- beam_width – width of the beam (number of stored hypotheses), default 100. If 1, the decoder always performs greedy (argmax) decoding.
- after_logsoftmax – whether the inputs are log-probabilities (logits after log softmax), default False (with False the decoder expects raw logits, not outputs after softmax). Greedy decoding ignores this parameter and works with raw logits as well as with log-softmax outputs.
- blank_idx – id of the blank label, default 0
- time_major – whether logits are time-major (else batch-major)
- labels – list of strings with the labels (including the blank symbol), e.g. ["_", "a", "b", "c"]
- lm_path – path to a language model (ARPA format or gzipped ARPA)
- lmwt – language model weight, default 1.0; meaningful only if a language model is present
- wip – word insertion penalty, default 1.0; meaningful only if labels are present
- oov_penalty – penalty for each out-of-vocabulary word, default -10.0
- case_sensitive – whether language model scores are obtained case-sensitively, default True (matching the signature above)
decode(logits, logits_lengths=None)[source]

Performs prefix beam search decoding as described in https://arxiv.org/abs/1408.2873

Parameters
- logits – tensor with neural network outputs after log softmax, of shape (sequence_length, batch_size, alphabet_size) if time_major, else of shape (batch_size, sequence_length, alphabet_size)
- logits_lengths – lengths of the sequences in the batch, default None

Returns
namedtuple (decoded_targets, decoded_targets_lengths, decoded_sentences)
- decoded_targets: tensor with the resulting targets of shape (batch_size, sequence_length); does not contain blank symbols
- decoded_targets_lengths: tensor with the lengths of the decoded targets
- decoded_sentences: list of strings of length batch_size; if labels is None, a list of empty strings is returned
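The prefix beam search behind `decode` can be illustrated with a minimal pure-Python sketch, following Hannun et al. (https://arxiv.org/abs/1408.2873). This is not the library's implementation: it omits the language model, works in the probability domain rather than the log domain, and the helper name `prefix_beam_search` is hypothetical.

```python
from collections import defaultdict

def prefix_beam_search(probs, beam_width=3, blank=0):
    """Simplified CTC prefix beam search (no language model).

    `probs` is a T x V list of per-frame label probabilities;
    returns the most probable label prefix as a tuple of ids.
    """
    # Each beam entry maps prefix -> (p_blank, p_non_blank): probability
    # of the prefix with paths ending in blank vs. in its last label.
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: (0.0, 0.0))
        for prefix, (p_b, p_nb) in beams.items():
            for c, p in enumerate(frame):
                if c == blank:
                    # A blank extends the paths without adding a label.
                    nb_b, nb_nb = next_beams[prefix]
                    next_beams[prefix] = (nb_b + (p_b + p_nb) * p, nb_nb)
                elif prefix and prefix[-1] == c:
                    # Repeated label: only paths ending in blank yield a
                    # new symbol; otherwise the repeat collapses.
                    nb_b, nb_nb = next_beams[prefix + (c,)]
                    next_beams[prefix + (c,)] = (nb_b, nb_nb + p_b * p)
                    sb, snb = next_beams[prefix]
                    next_beams[prefix] = (sb, snb + p_nb * p)
                else:
                    nb_b, nb_nb = next_beams[prefix + (c,)]
                    next_beams[prefix + (c,)] = (nb_b, nb_nb + (p_b + p_nb) * p)
        # Keep only the beam_width most probable prefixes.
        beams = dict(sorted(next_beams.items(),
                            key=lambda kv: kv[1][0] + kv[1][1],
                            reverse=True)[:beam_width])
    best = max(beams.items(), key=lambda kv: kv[1][0] + kv[1][1])
    return best[0]
```

With `beam_width=1` this degenerates to tracking a single hypothesis, which is why the class documentation says a beam width of 1 gives greedy decoding.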
decode_greedy(logits, logits_lengths=None)[source]

Performs greedy (argmax) decoding.

Parameters
- logits – tensor with neural network outputs (raw logits or after log softmax), of shape (sequence_length, batch_size, alphabet_size) if time_major, else of shape (batch_size, sequence_length, alphabet_size)
- logits_lengths – lengths of the sequences in the batch, default None

Returns
(decoded_targets, decoded_targets_lengths, decoded_sentences)
- decoded_targets: tensor with the resulting targets of shape (batch_size, sequence_length); does not contain blank symbols
- decoded_targets_lengths: tensor with the lengths of the decoded targets
- decoded_sentences: list of strings of length batch_size; if labels is None, a list of empty strings is returned
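Greedy decoding reduces to the standard CTC collapse rule: take the argmax label per frame, merge consecutive repeats, then drop blanks. A minimal sketch for a single sequence (plain lists instead of tensors; `ctc_greedy_decode` is a hypothetical helper, not the library API):

```python
def ctc_greedy_decode(logits, blank=0):
    """Greedy (argmax) CTC decoding on a T x V matrix of scores.

    Works on raw logits or log-probabilities alike, since the
    per-frame argmax is unchanged by a monotonic transform.
    """
    # Best label per frame.
    path = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    decoded = []
    prev = None
    for label in path:
        # Collapse consecutive repeats, then remove blanks.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded
```

For example, a frame-wise argmax path of `[1, 1, 0, 2]` (with blank id 0) collapses to `[1, 2]`.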
class pytorch_end2end.CTCEncoder(characters, blank_id=0, transform_fn=<method 'upper' of 'str' objects>)[source]

Simple CTC encoder for text.
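A CTC text encoder of this kind maps characters to integer label ids while reserving one id for the blank symbol, normalizing text first (the documented default `transform_fn` is `str.upper`). The following is a sketch under those assumptions; `make_ctc_encoder` is a hypothetical illustration, not the library's class.

```python
def make_ctc_encoder(characters, blank_id=0, transform_fn=str.upper):
    """Build a text -> label-id encoder in the spirit of CTCEncoder.

    `characters` lists the non-blank labels; ids are assigned in order,
    skipping `blank_id`, which stays reserved for the blank symbol.
    """
    table = {}
    idx = 0
    for ch in characters:
        if idx == blank_id:  # leave the blank slot unused
            idx += 1
        table[transform_fn(ch)] = idx
        idx += 1

    def encode(text):
        # Normalize, then look up each character's label id.
        return [table[c] for c in transform_fn(text)]

    return encode
```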
class pytorch_end2end.CTCLoss(size_average=None, reduce=None, after_logsoftmax=False, time_major=False, blank_idx=0)[source]

Criterion to compute the CTC loss as described in http://www.cs.toronto.edu/~graves/icml_2006.pdf

Parameters
- size_average – whether to average the loss over the batch (only meaningful if reduce is True)
- reduce – whether to reduce the per-sequence losses to a single value (if None, returns the full tensor of shape (batch_size,))
- after_logsoftmax – whether log softmax has already been applied to the neural network outputs (else raw network outputs are expected)
- time_major – whether logits are time-major (else batch-major), default False (matching the signature above)
- blank_idx – id of the blank label, default 0

training: bool
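The loss defined by Graves et al. (2006) is the negative log-probability of the target labelling, summed over all frame-level alignments via a forward (alpha) recursion over the blank-extended target. A single-sequence, pure-Python sketch of that recursion (not the library's batched implementation; `ctc_loss` here is a hypothetical helper taking log-softmax outputs):

```python
import math

def ctc_loss(log_probs, target, blank=0):
    """Forward-algorithm sketch of the CTC loss for one sequence.

    `log_probs` is T x V (outputs after log softmax), `target` a list
    of label ids; returns -log p(target | inputs).
    """
    # Extended target: a blank before, between, and after every label.
    ext = [blank]
    for t in target:
        ext += [t, blank]
    S = len(ext)
    NEG_INF = float("-inf")

    def logadd(a, b):
        # log(exp(a) + exp(b)), numerically stable.
        if a == NEG_INF:
            return b
        if b == NEG_INF:
            return a
        m = max(a, b)
        return m + math.log(math.exp(a - m) + math.exp(b - m))

    # alpha[s]: log-probability of all alignments of the first frames
    # to the first s+1 symbols of `ext`.
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]
    for t in range(1, len(log_probs)):
        new = [NEG_INF] * S
        for s in range(S):
            a = alpha[s]
            if s > 0:
                a = logadd(a, alpha[s - 1])
            # Skipping the previous blank is allowed only between
            # distinct non-blank labels.
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a = logadd(a, alpha[s - 2])
            new[s] = a + log_probs[t][ext[s]]
        alpha = new
    # Valid alignments end on the last label or the final blank.
    return -logadd(alpha[S - 1], alpha[S - 2] if S > 1 else NEG_INF)
```

For a one-frame input with p(label 1) = 0.6, the loss is simply -log 0.6, since the only valid alignment emits that label directly.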