Add scheduling_map field to ParameterSchema so Application creators can
declare that a parameter (e.g. NP) maps to a scheduling field (e.g. cpus).
The backend auto-injects the scheduling value into script template variables
before rendering, eliminating duplicate user input. The frontend hides
mapped parameters from the form and injects their values on submit.
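
A minimal sketch of the backend injection step described above. The struct layout, the `injectScheduling` name, and the use of string maps are assumptions for illustration; the real ParameterSchema has more fields.

```go
package main

// ParameterSchema is a reduced, hypothetical version of the real schema;
// only the fields needed for this sketch are shown.
type ParameterSchema struct {
	Name          string
	SchedulingMap string // e.g. "cpus"; empty means the parameter is not mapped
}

// injectScheduling copies scheduling values into the template variables for
// every parameter that declares a scheduling_map. It returns a new map and
// leaves the caller's params untouched.
func injectScheduling(schema []ParameterSchema, params, scheduling map[string]string) map[string]string {
	out := make(map[string]string, len(params))
	for k, v := range params {
		out[k] = v
	}
	for _, p := range schema {
		if p.SchedulingMap == "" {
			continue
		}
		if v, ok := scheduling[p.SchedulingMap]; ok {
			out[p.Name] = v
		}
	}
	return out
}
```

With a schema entry `{Name: "NP", SchedulingMap: "cpus"}` and a scheduling value `cpus=8`, the rendered template sees `NP=8` without the user entering it twice.
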
RecoverStuckTasks scans for tasks whose updated_at is more than 5 minutes
old and re-enqueues them. This also matched tasks that the worker was
still actively processing (e.g. slow downloads), causing
double-processing.
Add inflight sync.Map to track taskIDs currently inside ProcessTask.
RecoverStuckTasks skips tasks found in inflight. On server restart
inflight is empty (in-memory), so genuinely stuck tasks are still
correctly recovered.
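
The in-flight tracking could look roughly like the sketch below. The function names and the `int64` task ID type are assumptions; only the use of a `sync.Map` keyed by task ID comes from this change.

```go
package main

import "sync"

// inflight holds the IDs of tasks that are currently inside processTask.
// It lives in memory only, so it is empty after a restart.
var inflight sync.Map

// processTask marks the task as in flight for the duration of work.
func processTask(taskID int64, work func()) {
	inflight.Store(taskID, struct{}{})
	defer inflight.Delete(taskID)
	work()
}

// filterRecoverable drops candidates that a worker is still processing,
// so only genuinely stuck tasks are re-enqueued.
func filterRecoverable(candidates []int64) []int64 {
	var out []int64
	for _, id := range candidates {
		if _, busy := inflight.Load(id); busy {
			continue
		}
		out = append(out, id)
	}
	return out
}
```
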
Also: increase taskCh buffer 16→10000, add periodic RecoverStuckTasks
goroutine in TaskPoller (every 5min), and add status guard in
ProcessTask as defense-in-depth against duplicate enqueues.
The cleanup goroutine used context.Background() with no timeout, so if
MinIO accepted TCP connections but never responded, the goroutine would
block indefinitely. Now uses context.WithTimeout to prevent leaks.
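
A sketch of the bounded cleanup, assuming the storage call accepts a context. The 30-second value and both function names are hypothetical; the source does not state the timeout that was chosen.

```go
package main

import (
	"context"
	"time"
)

// cleanupTimeout is a hypothetical bound, not the value used in the change.
var cleanupTimeout = 30 * time.Second

// runCleanup calls remove with a context that is cancelled once
// cleanupTimeout elapses, so a backend that never responds cannot pin
// the goroutine forever.
func runCleanup(remove func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), cleanupTimeout)
	defer cancel()
	return remove(ctx)
}

// hangUntilCancelled imitates a storage call that accepts the connection
// but never answers; it returns only when the context ends.
func hangUntilCancelled(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}
```

This only helps if the underlying client honors context cancellation; a call that ignores its context would still block.
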
mapSlurmStateToTaskStatus previously defaulted to 'running' for empty
state arrays and unrecognized states. That default was wrong: treating an
unknown state as actively running could overwrite a task's status with
'running' whenever Slurm returned unexpected or empty state data.
Now empty/unknown states return an empty string, and refreshTaskStatus
skips the update in that case.
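
The new behavior can be sketched as follows. The Slurm state names are real, but the task status strings and the two-value return of `refreshTaskStatus` are assumptions made for this example.

```go
package main

// mapSlurmStateToTaskStatus returns "" for empty or unrecognized input
// instead of guessing "running". Status strings here are hypothetical.
func mapSlurmStateToTaskStatus(states []string) string {
	if len(states) == 0 {
		return ""
	}
	switch states[0] {
	case "PENDING":
		return "queued"
	case "RUNNING":
		return "running"
	case "COMPLETED":
		return "completed"
	case "FAILED", "CANCELLED", "TIMEOUT":
		return "failed"
	default:
		return ""
	}
}

// refreshTaskStatus reports the status to store and whether an update
// should happen at all; an unknown mapping leaves the task untouched.
func refreshTaskStatus(current string, states []string) (string, bool) {
	next := mapSlurmStateToTaskStatus(states)
	if next == "" || next == current {
		return current, false
	}
	return next, true
}
```
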
RecoverStuckTasks now skips tasks that already have a slurm_job_id,
and ProcessTask adds a guard before the submitting step to prevent
re-submission even if a task is incorrectly re-enqueued.
Also deprecates the POST /api/v1/jobs/submit endpoint (replaced by POST /tasks)
and comments out the related handlers and tests.
- Map CPUs to CpusPerTask (not MinimumCpus) for consistent SlurmDBD history
- Add Set:true to memory Uint64NoVal on submission
- Filter number=0 in mapUint64NoValToInt64 to avoid false zeros
- Extract peak memory from Steps.Tres.Requested.Max across all steps
- Add formatTresList, parseGresDetail, extractMemoryFromSteps helpers
- Update mapJobInfo and mapSlurmdbJob with new field mappings
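
The zero filter in mapUint64NoValToInt64 might look like the sketch below. The `Uint64NoVal` struct here is a simplified stand-in with plain fields; the generated client type may use pointers or different names.

```go
package main

// Uint64NoVal is a simplified stand-in for the Slurm REST "no value" wrapper.
type Uint64NoVal struct {
	Set      bool
	Infinite bool
	Number   int64
}

// mapUint64NoValToInt64 returns nil when the value is absent, unset,
// infinite, or zero, so callers never report a false 0.
func mapUint64NoValToInt64(v *Uint64NoVal) *int64 {
	if v == nil || !v.Set || v.Infinite || v.Number == 0 {
		return nil
	}
	n := v.Number
	return &n
}
```
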
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Previously ValidateParams ran before WORK_DIR injection and file_ids mapping,
so auto-handled params failed validation with "required parameter missing"
errors. The execution order is now: inject WORK_DIR, map file_ids to file
params, validate, resolve.
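
The corrected ordering can be illustrated with the sketch below. `ParamDef`, `prepareParams`, and the shape of the `fileIDs` map (keyed by parameter name) are assumptions; the final resolve step is omitted.

```go
package main

import "errors"

// ParamDef is a reduced, hypothetical parameter definition.
type ParamDef struct {
	Name     string
	Type     string // "string", "file", "directory", ...
	Required bool
}

// prepareParams fills in auto-handled values first and validates last,
// so required params that the system supplies no longer fail validation.
func prepareParams(defs []ParamDef, values, fileIDs map[string]string, workDir string) (map[string]string, error) {
	out := make(map[string]string, len(values)+1)
	for k, v := range values {
		out[k] = v
	}

	// 1. Inject WORK_DIR.
	out["WORK_DIR"] = workDir

	// 2. Map file_ids onto file and directory params.
	for _, d := range defs {
		if d.Type != "file" && d.Type != "directory" {
			continue
		}
		if id, ok := fileIDs[d.Name]; ok {
			out[d.Name] = id
		}
	}

	// 3. Validate only after the auto-handled values are in place.
	for _, d := range defs {
		if d.Required && out[d.Name] == "" {
			return nil, errors.New("required parameter missing: " + d.Name)
		}
	}
	return out, nil
}
```
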
- ProcessTask injects $WORK_DIR only when script template uses it
- File/directory type params: resolves file_id to filename before rendering
- ValidateParams validates file/directory params as valid int64 file IDs
- RenderScript no longer shell-escapes file/directory type values
- Log rendered script before submitting to Slurm for debugging
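
Two of the checks in the list above are small enough to sketch directly. The helper names are hypothetical, and the literal `$WORK_DIR` substring test is an assumption about how template usage is detected.

```go
package main

import (
	"strconv"
	"strings"
)

// needsWorkDir reports whether the script template references $WORK_DIR,
// so the variable is injected only when it is actually used.
func needsWorkDir(script string) bool {
	return strings.Contains(script, "$WORK_DIR")
}

// parseFileID validates that a file or directory param holds an int64 file ID.
func parseFileID(value string) (int64, error) {
	return strconv.ParseInt(value, 10, 64)
}
```
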
- Comment out SubmitFromApplication method and its fallback path
- Comment out 5 tests that tested the old direct-submission code
- Remove unused imports after commenting out the method
When multiple chunk uploads race on the pending→uploading transition, ignore ErrRecordNotFound from UpdateSessionStatus since another request already completed the update.
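
A sketch of the race handling, using an in-memory session in place of the database. `ErrRecordNotFound` here is a local stand-in for the ORM's not-found error, and `beginChunkUpload` is a hypothetical name.

```go
package main

import (
	"errors"
	"sync"
)

// ErrRecordNotFound stands in for the ORM's not-found error.
var ErrRecordNotFound = errors.New("record not found")

// session is an in-memory stand-in for the upload session row.
type session struct {
	mu     sync.Mutex
	status string
}

// UpdateSessionStatus behaves like a conditional UPDATE ... WHERE status = from:
// it fails with ErrRecordNotFound when no row matches.
func (s *session) UpdateSessionStatus(from, to string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.status != from {
		return ErrRecordNotFound
	}
	s.status = to
	return nil
}

// beginChunkUpload tolerates losing the pending -> uploading race: if
// another request already made the transition, that is not an error.
func beginChunkUpload(s *session) error {
	err := s.UpdateSessionStatus("pending", "uploading")
	if errors.Is(err, ErrRecordNotFound) {
		return nil
	}
	return err
}
```
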
Slurm requires environment variables in job submission; without them it returns 'batch job cannot run without an environment'. Also chmod the entire directory path to 0777 to bypass umask, ensuring Slurm and compute node users can write.
Add ApplicationService with ValidateParams, RenderScript, SubmitFromApplication. Includes shell escaping, longest-first parameter replacement, and work directory generation. 15 tests.
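
Longest-first replacement matters when one parameter name is a prefix of another (e.g. `INPUT` and `INPUT_FILE`). The sketch below assumes `$NAME` placeholders and single-quote shell escaping; the real RenderScript may differ in both respects.

```go
package main

import (
	"sort"
	"strings"
)

// shellEscape wraps s in single quotes and escapes embedded single quotes.
func shellEscape(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// renderScript substitutes $NAME placeholders, longest name first, so that
// $INPUT_FILE is never partially replaced by the value of $INPUT.
func renderScript(tmpl string, params map[string]string) string {
	names := make([]string, 0, len(params))
	for n := range params {
		names = append(names, n)
	}
	sort.Slice(names, func(i, j int) bool {
		if len(names[i]) != len(names[j]) {
			return len(names[i]) > len(names[j])
		}
		return names[i] < names[j] // stable order for equal lengths
	})
	out := tmpl
	for _, n := range names {
		out = strings.ReplaceAll(out, "$"+n, shellEscape(params[n]))
	}
	return out
}
```

One limitation of this simple approach: a substituted value that itself contains a `$NAME` sequence would be rewritten by a later pass, so a single-pass tokenizer would be safer for untrusted input.
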
GetJob now falls back to SlurmDBD history when active queue returns 404 or empty jobs. Expand JobHistoryQuery from 7 to 16 filter params (add SubmitTime, Cluster, Qos, Constraints, ExitCode, Node, Reservation, Groups, Wckey).
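
The fallback logic might be structured as in the sketch below. `Job`, `ErrNotFound` (standing in for an HTTP 404 from the active queue), and the function-typed lookups are all assumptions made to keep the example self-contained.

```go
package main

import "errors"

// Job is a reduced, hypothetical job record.
type Job struct {
	ID    int64
	State string
}

// ErrNotFound stands in for a 404 response from the active queue.
var ErrNotFound = errors.New("job not found")

// getJob asks the active queue first and falls back to SlurmDBD history
// when the queue answers 404 or returns an empty job list. Any other
// error from the active queue is returned unchanged.
func getJob(id int64, active func(int64) ([]Job, error), history func(int64) (*Job, error)) (*Job, error) {
	jobs, err := active(id)
	if err != nil && !errors.Is(err, ErrNotFound) {
		return nil, err
	}
	if err == nil && len(jobs) > 0 {
		return &jobs[0], nil
	}
	return history(id)
}
```
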