ExtendedLSTM#
ExtendedLSTM - 1#
Version#
name: ExtendedLSTM
domain: com.amd.quark
Summary#
This is a customized version of the official ONNX LSTM operator. It computes a one-layer quantized LSTM with bfloat16.
Attributes#
direction - STRING (default is ‘forward’):
Specifies whether the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional; currently only “bidirectional” is supported.
hidden_size - INT:
Number of neurons in the hidden layer.
input_forget - INT (default is ‘0’):
Couples the input and forget gates if 1. Currently it can only be 0.
layout - INT (default is ‘0’):
The shape format of the inputs and outputs. Currently it can only be 0.
x_scale - FLOAT:
Scale for input X. Only per-tensor (per-layer) quantization is supported, so the scale must be a scalar.
x_zero_point - INT:
Zero point for input X. Its shape must match x_scale. Only uint16 quantization is supported, so the zero point must be in the range [0, 65535].
w_scale - FLOAT:
Scale for input W. Only per-tensor (per-layer) quantization is supported, so the scale must be a scalar.
w_zero_point - INT:
Zero point for input W. Its shape must match w_scale. Only uint16 quantization is supported, so the zero point must be in the range [0, 65535].
r_scale - FLOAT:
Scale for input R. Only per-tensor (per-layer) quantization is supported, so the scale must be a scalar.
r_zero_point - INT:
Zero point for input R. Its shape must match r_scale. Only uint16 quantization is supported, so the zero point must be in the range [0, 65535].
b_scale - FLOAT:
Scale for input B. Only per-tensor (per-layer) quantization is supported, so the scale must be a scalar.
b_zero_point - INT:
Zero point for input B. Its shape must match b_scale. Only uint16 quantization is supported, so the zero point must be in the range [0, 65535].
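The scale and zero-point attributes above describe per-tensor uint16 quantization. As a rough illustration, the sketch below assumes the standard affine convention (as in ONNX QuantizeLinear/DequantizeLinear): q = clamp(round(x / scale) + zero_point, 0, 65535). This page does not state the exact formula the operator uses, and the helper names `quantize_uint16` and `dequantize_uint16` are hypothetical.

```python
UINT16_MIN, UINT16_MAX = 0, 65535


def quantize_uint16(values, scale, zero_point):
    """Affine-quantize floats to uint16 using one scalar scale and zero point.

    Assumed convention (not confirmed by this page):
    q = clamp(round(x / scale) + zero_point, 0, 65535)
    """
    if scale <= 0:
        raise ValueError("scale must be a positive scalar")
    if not UINT16_MIN <= zero_point <= UINT16_MAX:
        raise ValueError("zero_point must be in [0, 65535]")
    quantized = []
    for v in values:
        q = round(v / scale) + zero_point  # Python's round() rounds half to even
        quantized.append(min(max(q, UINT16_MIN), UINT16_MAX))  # saturate to uint16
    return quantized


def dequantize_uint16(q_values, scale, zero_point):
    """Map uint16 values back to floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]
```

For example, with a scale of 0.25 and a zero point of 32768, `quantize_uint16([0.5], 0.25, 32768)` yields `[32770]`, and dequantizing that value recovers 0.5. Values far outside the representable range saturate at 0 or 65535.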
Inputs#
X (heterogeneous) - T:
The input sequences packed (and potentially padded) into one 3-D tensor with the shape of [seq_length, batch_size, input_size].
W (heterogeneous) - T:
The weight tensor for the gates. Concatenation of W[iofc] and WB[iofc] (if bidirectional) along dimension 0. The tensor has shape [num_directions, 4*hidden_size, input_size].
R (heterogeneous) - T:
The recurrence weight tensor. Concatenation of R[iofc] and RB[iofc] (if bidirectional) along dimension 0. This tensor has shape [num_directions, 4*hidden_size, hidden_size].
B (optional, heterogeneous) - T:
The bias tensor for the gates. Concatenation of [Wb[iofc], Rb[iofc]] and [WBb[iofc], RBb[iofc]] (if bidirectional) along dimension 0. This tensor has shape [num_directions, 8*hidden_size]. Optional: if not specified, it is assumed to be 0.
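The input shapes listed above can be derived from a handful of sizes. The following sketch computes them so that tensors can be checked before a node is built. The function name `expected_input_shapes` is hypothetical, and num_directions is assumed to be 2 for bidirectional and 1 otherwise, as in the standard ONNX LSTM.

```python
def expected_input_shapes(seq_length, batch_size, input_size, hidden_size,
                          direction="bidirectional"):
    """Return the expected shapes of X, W, R and B as tuples."""
    if direction not in ("forward", "reverse", "bidirectional"):
        raise ValueError(f"unknown direction: {direction!r}")
    # The operator currently supports only "bidirectional"; the other
    # values are accepted here purely to illustrate the shape rule.
    num_directions = 2 if direction == "bidirectional" else 1
    return {
        "X": (seq_length, batch_size, input_size),
        "W": (num_directions, 4 * hidden_size, input_size),
        "R": (num_directions, 4 * hidden_size, hidden_size),
        "B": (num_directions, 8 * hidden_size),
    }
```

With seq_length=10, batch_size=2, input_size=16, and hidden_size=32, this gives W the shape (2, 128, 16), R the shape (2, 128, 32), and B the shape (2, 256).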
Outputs#
Y (optional, heterogeneous) - T:
A tensor that concatenates all the intermediate output values of the hidden state. It has shape [seq_length, num_directions, batch_size, hidden_size].
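Because Y stacks both directions along dimension 1, downstream code often needs to separate them. The sketch below does this on a plain nested list. It assumes index 0 holds the forward direction and index 1 the backward direction, as in the standard ONNX LSTM; this page does not confirm the ordering, and the helper name `split_directions` is hypothetical.

```python
def split_directions(y):
    """Split Y into per-direction sequences.

    y is a nested list shaped [seq_length][num_directions][batch_size][hidden_size]
    with num_directions == 2. Each returned list is shaped
    [seq_length][batch_size][hidden_size].
    """
    forward = [step[0] for step in y]   # assumed: index 0 is the forward pass
    backward = [step[1] for step in y]  # assumed: index 1 is the backward pass
    return forward, backward
```

The same split on a NumPy array would be `y[:, 0]` and `y[:, 1]`.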
Type Constraints#
T in ( tensor(float) ):
Constrain input and output types to float tensors.