-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GPU] Support for converting zero-filled memory for i4 compressed weights #29937
base: master
Are you sure you want to change the base?
[GPU] Support for converting zero-filled memory for i4 compressed weights #29937
Conversation
strides.push_back(1); | ||
strides.push_back(padded_dims[1]); | ||
dnnl::memory::data_type dt = convert_data_type(weights_layout.data_type); | ||
weights_md = dnnl::memory::desc(dims, dt, strides); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think it would be better to be placed within
onednn::layout_to_memory_desc
. - This function supports
padded_dims[1]
only. Then could you add assert to checkpadded_dims[0] == 0
? - Could you add test case for this fix?
@@ -115,7 +189,17 @@ static void create_data(ProgramBuilder& p, const ov::Shape& const_shape, const s | |||
auto buf = lock.data(); | |||
auto bufSize = constLayout.bytes_count(); | |||
|
|||
std::memcpy(&buf[0], &data[0], bufSize); | |||
if (constLayout.data_padding) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it necessary to check also data_type == u4 || data_type == i4
?
std::memcpy(&padded_vec[padded_begin], &non_padded_vec[non_padded_begin], non_padded_row_size); | ||
if (non_padded_end < non_padded_dims_size) { | ||
uint8_t s0, s1; | ||
unpack8to4(non_padded_vec[non_padded_end], s0, s1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid it will create wrong data if you convert char
data into uint8_t
. For example, when char
contains negative value.
|
||
// Function to convert a non-padded vector to a padded vector | ||
static void copy_to_padded_vector(const std::vector<int> &non_padded_dims, const char* non_padded_vec, | ||
const std::vector<int> &padded_dims, char* padded_vec) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function works only for 4bit data element. I think it would be better to clarify from the name.
temp_padding_buffer[index] = s0; | ||
++index; | ||
temp_padding_buffer[index] = s1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does it take? I'm afraid this will be very slow. I think we need to find a better way.
Details:
Tickets: