Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FlatMapPipelinedCursor starts background calculations of pipelined inner cursors #3072

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

hatyo
Copy link
Contributor

@hatyo hatyo commented Jan 29, 2025

This PR enables scheduling the calculation of the first cursor item of inner when as soon as it is added to the pipeline, this should improve the performance of the flat map pipeline cursor in situations where the calculation of inner is much more CPU intensive compared to outer, it comes however with the cost of potentially over computing.

- this should improve the performance of the pipeline now that it schedule
  up to N-calculations of the inner's first cursor item, where N is the size
  of the pipeline.
@@ -295,20 +296,26 @@ public PipelineQueueEntry(RecordCursor<V> innerCursor,
this.priorOuterContinuation = priorOuterContinuation;
this.outerResult = outerResult;
this.outerCheckValue = outerCheckValue;
// start calculating the next result in the background.
setInnerFuture();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a matter of style, I think it would be better not to fire off this future in the constructor. There are some security implications in general if this can throw an error (see: https://www.baeldung.com/java-constructors-exceptions, see "Security concerns"), but also, it's counter-intuitive for users of the API, who don't expect constructing an object to have side-effects like this. If we really want to always fire off this future when we make a new element, I think we should make a static initializer, which removes the security concern and is less likely to surprise API consumers (which, granted, is just us because this is a private class)

@@ -267,6 +267,7 @@ private synchronized void addEntryToPipeline(PipelineQueueEntry pipelineQueueEnt
if (closed) {
pipelineQueueEntry.close();
}
pipelineQueueEntry.getNextInnerPipelineFuture();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both this change and the change in the PipelineQueueEntry constructor? If I'm reasoning about this correctly, it seems like we only need one of the two (that this is extraneous if the next future is fired off during the constructor, and the constructor call is extraneous if the future is created here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this was an oversight, let me remove that.

@foundationdb-ci
Copy link
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: 0be931c
  • Duration 0:36:32
  • Result: ❌ FAILED
  • Error: Error while executing command: ./gradlew --no-daemon --console=plain -b ./build.gradle build destructiveTest -PcoreNotStrict -PreleaseBuild=false -PpublishBuild=false -PspotbugsEnableHtmlReport. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: b4322bb
  • Duration 0:36:48
  • Result: ❌ FAILED
  • Error: Error while executing command: ./gradlew --no-daemon --console=plain -b ./build.gradle build destructiveTest -PcoreNotStrict -PreleaseBuild=false -PpublishBuild=false -PspotbugsEnableHtmlReport. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

alecgrieser
alecgrieser previously approved these changes Jan 29, 2025
- scheduling the calculation of inner in the pipeline is triggering extra computation.
@foundationdb-ci
Copy link
Contributor

Result of fdb-record-layer-pr on Linux CentOS 7

  • Commit ID: aa38bd0
  • Duration 0:56:28
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@hatyo hatyo changed the title WIP - Start calculating inner's first item concurrently. FlatMapPipelinedCursor starts background calculations of pipelined inner cursors Jan 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants