If you submit a pipeline to an AWS Batch compute environment that cannot satisfy the job's resource requirements, the pipeline hangs forever.
Expected behavior and actual behavior
Expected: Nextflow detects that the job can never be scheduled and fails early with a clear error. Actual: the job stays pending in AWS Batch and the pipeline hangs indefinitely.
Steps to reproduce the problem
Create a compute environment with a small instance type. I used a c6id.large, which has 2 vCPUs; I created the CE through Seqera Platform with minimum CPUs set to 0, so that any ECS resources scale to zero before being used.
Submit any pipeline with process.cpus = 4. I used the following config:
```groovy
process {
    withName: '.*' {
        cpus = 4
    }
}
```
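As an aside (not part of the original report): recent Nextflow releases provide a `resourceLimits` process directive that caps requests at what the compute environment can actually provide, which would avoid submitting an unsatisfiable job in the first place. A sketch, with limits chosen to match the 2-vCPU c6id.large example above (the memory value is an assumption, not taken from the report):

```groovy
// Workaround sketch: cap resource requests at what the CE's instances offer.
// cpus: 2 matches the c6id.large example; the memory limit is illustrative.
process {
    resourceLimits = [ cpus: 2, memory: 3.GB ]
}
```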
Wait forever...
We should see Nextflow fail early and tell us the error.
On the AWS side, we can see the following error on the submitted job:
MISCONFIGURATION:JOB_RESOURCE_REQUIREMENT - The job resource requirement (vCPU/memory/GPU) is higher than that can be met by the CE(s) attached to the job queue.
Program output
A normal Nextflow log file, no errors.
Environment
Nextflow version: 24.10.5
Run on Seqera Platform 25.1.0-cycle5_3494631
Additional context
When using an automated head node to run Nextflow, this hang causes resources to be consumed needlessly.
I believe we have similar checks for other executors, like K8s and the grid executors, where it's possible to know that a job will never be scheduled due to its resource requirements. So it should just be a matter of catching this error and throwing an "unrecoverable" exception.
Note that at submission time there is no way to know that, because an AWS Batch queue can define any combination of instance types or families. It may also be determined by temporary unavailability of certain types when using spot instances.
This is the same case as the quota-exceeded warning in Google Cloud or the unschedulable status in K8s: there is an event message warning about the situation, but the job stays in a pending state rather than failing. In those cases, Nextflow just emits a warning. Here I could likewise check the status reason and add a warning. Do you think we should also cancel the task and produce a ProcessException?
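The two options discussed above (warn only, or cancel the task and fail) could be sketched as follows. The function name, the `strict` flag, and the `ProcessException` stand-in are hypothetical illustrations, not Nextflow's actual API:

```python
# Hypothetical sketch of the policy discussed above: when AWS Batch reports a
# MISCONFIGURATION statusReason on a pending job, either log a warning (the
# behaviour for the analogous Google Cloud / K8s cases) or fail the task.
# All names here (handle_status_reason, strict) are illustrative.

class ProcessException(Exception):
    """Stand-in for Nextflow's unrecoverable process error."""

def handle_status_reason(status_reason: str, strict: bool) -> str:
    """Return the action taken for a pending job's statusReason."""
    if not status_reason.startswith("MISCONFIGURATION:"):
        return "keep-waiting"
    if strict:
        # Cancel the task and surface an unrecoverable error.
        raise ProcessException(f"Job cannot be scheduled: {status_reason}")
    # Mirror the quota-exceeded / unschedulable handling: warn and keep polling.
    print(f"WARN: {status_reason}")
    return "warned"

reason = ("MISCONFIGURATION:JOB_RESOURCE_REQUIREMENT - The job resource "
          "requirement (vCPU/memory/GPU) is higher than that can be met by "
          "the CE(s) attached to the job queue.")

print(handle_status_reason(reason, strict=False))
```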