chore: test snapshot in replica while seeding master #4867
base: main
Conversation
Signed-off-by: kostas <kostas@dragonflydb.io>
@adiholden I added a boilerplate test. Let's discuss here what else we would like.
# Start data stream
stream_task = asyncio.create_task(seeder.run(c_master, target_ops=5000))
Add a sleep so the stream task runs for a little bit first.
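A minimal sketch of that (placement mirrors the hunk above; the one-second duration is an arbitrary choice):

# Start data stream
stream_task = asyncio.create_task(seeder.run(c_master, target_ops=5000))
# Let the seeder produce some traffic before the replica snapshots
await asyncio.sleep(1)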
tests/dragonfly/replication_test.py
Outdated
# Start data stream
stream_task = asyncio.create_task(seeder.run(c_master, target_ops=5000))

assert await c_replica.execute_command("SAVE DF tmp_dump") == "OK"
I want to check that we actually trigger writes from the command-execution flow as well.
We can do this with the extract_int_after_prefix logic we have in the cluster tests.
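For reference, a minimal sketch of that helper; the real one lives in the cluster test utilities, so this regex-based version is only an approximation:

import re

def extract_int_after_prefix(prefix: str, line: str) -> int:
    # Return the integer that immediately follows `prefix` in `line`
    match = re.search(re.escape(prefix) + r"(\d+)", line)
    assert match is not None, f"prefix {prefix!r} not found in line"
    return int(match.group(1))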
tests/dragonfly/replication_test.py
Outdated
await wait_for_replicas_state(c_replica)

# Start data stream
stream_task = asyncio.create_task(seeder.run(c_master, target_ops=5000))
Better to use seeder.run without a target and call seeder.stop at the end.
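Sketch of the suggested flow, assuming SeederV2's run/stop API as used elsewhere in these tests:

# Run the seeder open-ended so traffic flows for the whole snapshot
stream_task = asyncio.create_task(seeder.run(c_master))
assert await c_replica.execute_command("SAVE DF tmp_dump") == "OK"
# Stop seeding explicitly once the snapshot has completed
await seeder.stop(c_master)
await stream_task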
# Check that everything is in sync
hashes = await asyncio.gather(*(SeederV2.capture(c) for c in [c_master, c_replica]))
assert len(set(hashes)) == 1
Let's load the snapshot, just to make sure it loads successfully and is not broken.
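Roughly along these lines (df_factory and SeederV2.capture as used elsewhere in this file; node.client() is assumed from the test harness):

# Boot a fresh instance from the produced dump and verify it loads
node = df_factory.create(dbfilename="tmp_dump")
node.start()
c_node = node.client()
assert await SeederV2.capture(c_node) == hashes[0]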
Signed-off-by: kostas <kostas@dragonflydb.io>
tests/dragonfly/replication_test.py
Outdated
assert int(sub[1]) > 0

# Check that the produced rdb is loaded correctly
node = df_factory.create(dbfilename="tmp_dump")
Use a random filename; we have had several test issues when not using one.
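For example (uuid is just one way to get a unique token, nothing harness-specific):

import uuid

# Unique dump name per test run to avoid collisions between tests
dbfilename = f"tmp_dump_{uuid.uuid4().hex[:8]}"
assert await c_replica.execute_command(f"SAVE DF {dbfilename}") == "OK"
node = df_factory.create(dbfilename=dbfilename)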
replica.stop()
lines = replica.find_in_logs("Exit SnapshotSerializer")
assert len(lines) == 3
for line in lines:
I think it would be better to change the log format of SnapshotSerializer to be similar to the RestoreStreamer print, and use the same Python function extract_int_after_prefix to extract the data.
+1
replica.stop()
lines = replica.find_in_logs("Exit SnapshotSerializer")
assert len(lines) == 3
Is this number 3 because of proactor_threads = 4?
If so, I suggest using len(lines) == proactor_threads - 1.
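That is, something along these lines, assuming one SnapshotSerializer flow per proactor thread minus the one serving the replication connection:

proactor_threads = 4  # keep in sync with the @dfly_args decorator
lines = replica.find_in_logs("Exit SnapshotSerializer")
assert len(lines) == proactor_threads - 1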
Yes, will do
lines = replica.find_in_logs("Exit SnapshotSerializer")
assert len(lines) == 3
for line in lines:
    side_saved = extract_int_after_prefix("side_saved ", line)
Let's add a comment that this check verifies we are also testing the serialization path of command execution.
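Something like this (the comment wording is only a suggestion):

for line in lines:
    # side_saved > 0 verifies that entries were also serialized through the
    # command-execution flow, not only by the background snapshot scan
    side_saved = extract_int_after_prefix("side_saved ", line)
    assert side_saved > 0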
@@ -2988,3 +2988,50 @@ async def test_bug_in_json_memory_tracking(df_factory: DflyInstanceFactory):
    await wait_for_replicas_state(*c_replicas)

    await fill_task


@dfly_args({"proactor_threads": 4, "serialization_max_chunk_size": 1})
Why do you need to set serialization_max_chunk_size = 1?
Setting huge_value_target=10000 already enters the big value serialization path.
Yep, I just wanted to force it even for smaller items. I can remove it, no issue with that.
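In that case the decorator can presumably drop the chunk-size override and rely on the seeder's huge values alone (test name here is illustrative):

@dfly_args({"proactor_threads": 4})
async def test_big_value_snapshot_in_replica(df_factory: DflyInstanceFactory):
    # big value serialization is still exercised via huge_value_target
    ...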
Add a test that seeds the master with big values while the replica takes a snapshot.
Resolves #4858