Add scheduling_map field to ParameterSchema so Application creators can
declare that a parameter (e.g. NP) maps to a scheduling field (e.g. cpus).
The backend auto-injects the scheduling value into script template variables
before rendering, eliminating duplicate user input. The frontend hides
mapped parameters from the form and injects their values on submit.
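
The backend injection step might look roughly like the sketch below. The struct layout, the `SchedulingMap` field name, and `injectScheduling` are illustrative assumptions, not the project's actual types.

```go
package main

import "fmt"

// ParameterSchema is a simplified, hypothetical stand-in for the real schema.
type ParameterSchema struct {
	Name          string
	SchedulingMap string // scheduling field this parameter mirrors; "" if unmapped
}

// injectScheduling copies scheduling values into the template variables
// for every parameter that declares a mapping.
func injectScheduling(params []ParameterSchema, scheduling, vars map[string]string) {
	for _, p := range params {
		if p.SchedulingMap == "" {
			continue
		}
		if v, ok := scheduling[p.SchedulingMap]; ok {
			vars[p.Name] = v
		}
	}
}

func main() {
	params := []ParameterSchema{{Name: "NP", SchedulingMap: "cpus"}, {Name: "INPUT"}}
	vars := map[string]string{"INPUT": "case.dat"}
	injectScheduling(params, map[string]string{"cpus": "8"}, vars)
	fmt.Println(vars["NP"], vars["INPUT"])
}
```

In this sketch the scheduling value overwrites anything already stored under the mapped parameter name; whether the real code does the same is not stated here.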
RecoverStuckTasks scans for tasks whose updated_at is more than 5 minutes
old and re-enqueues them. This also matched tasks that the worker was
still actively processing (e.g. slow downloads), causing
double-processing.
Add inflight sync.Map to track taskIDs currently inside ProcessTask.
RecoverStuckTasks skips tasks found in inflight. On server restart
inflight is empty (in-memory), so genuinely stuck tasks are still
correctly recovered.
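
A minimal sketch of the inflight tracking, assuming int64 task IDs. The `processor` type and the `recoverable` helper are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

type processor struct {
	inflight sync.Map // taskID -> struct{}{} while the task is being processed
}

// processTask marks the task as in flight for the duration of work.
func (p *processor) processTask(id int64, work func()) {
	p.inflight.Store(id, struct{}{})
	defer p.inflight.Delete(id)
	work()
}

// recoverable drops candidates that are currently in flight.
func (p *processor) recoverable(candidates []int64) []int64 {
	var out []int64
	for _, id := range candidates {
		if _, busy := p.inflight.Load(id); busy {
			continue
		}
		out = append(out, id)
	}
	return out
}

func main() {
	p := &processor{}
	p.processTask(1, func() {
		// Task 1 is in flight here, so only task 2 is eligible.
		fmt.Println(p.recoverable([]int64{1, 2}))
	})
	// Nothing is in flight any more, so both are eligible.
	fmt.Println(p.recoverable([]int64{1, 2}))
}
```

Because the map lives only in process memory, a fresh process starts with nothing marked as in flight, which matches the restart behavior described above.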
Also: increase taskCh buffer 16→10000, add periodic RecoverStuckTasks
goroutine in TaskPoller (every 5min), and add status guard in
ProcessTask as defense-in-depth against duplicate enqueues.
The cleanup goroutine used context.Background() with no timeout, so if
MinIO accepted TCP connections but never responded, the goroutine would
block indefinitely. Now uses context.WithTimeout to prevent leaks.
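
The sketch below illustrates the timeout pattern. `removeObject` is a hypothetical stand-in for a storage call that honors context cancellation, and `demoTimeout` is an arbitrary value, not the one used in the project.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

const demoTimeout = 50 * time.Millisecond

// removeObject simulates a storage call. With hang=true it behaves like a
// server that accepts the connection but never answers.
func removeObject(ctx context.Context, hang bool) error {
	if !hang {
		return nil
	}
	<-ctx.Done()
	return ctx.Err()
}

// cleanup bounds the storage call so the calling goroutine cannot block forever.
func cleanup(timeout time.Duration, hang bool) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return removeObject(ctx, hang)
}

func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(timedOut(cleanup(demoTimeout, true)))
	fmt.Println(cleanup(demoTimeout, false))
}
```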
mapSlurmStateToTaskStatus previously defaulted to 'running' for empty
state arrays and unrecognized states. Treating an unknown state as
actively running could cause incorrect status updates when Slurm
returns unexpected or empty state data.
Now empty/unknown states return an empty string, and refreshTaskStatus
skips the update in that case.
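
A rough sketch of the new behavior follows. The task status strings and the set of recognized Slurm states are assumptions chosen for illustration; only the "empty string means skip" rule comes from this change.

```go
package main

import (
	"fmt"
	"strings"
)

// mapState returns "" when the state is empty or unrecognized.
func mapState(states []string) string {
	if len(states) == 0 {
		return ""
	}
	switch strings.ToUpper(states[0]) {
	case "PENDING":
		return "queued"
	case "RUNNING":
		return "running"
	case "COMPLETED":
		return "completed"
	case "FAILED", "CANCELLED", "TIMEOUT":
		return "failed"
	default:
		return ""
	}
}

// refresh keeps the current status when the mapping yields no answer.
func refresh(current string, states []string) string {
	next := mapState(states)
	if next == "" {
		return current
	}
	return next
}

func main() {
	fmt.Println(refresh("queued", []string{"RUNNING"}))
	fmt.Println(refresh("queued", nil))
	fmt.Println(refresh("queued", []string{"SOMETHING_NEW"}))
}
```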
RecoverStuckTasks now skips tasks that already have a slurm_job_id,
and ProcessTask adds a guard before the submitting step to prevent
re-submission even if a task is incorrectly re-enqueued.
Also deprecates POST /api/v1/jobs/submit endpoint (replaced by POST /tasks)
and comments out related handlers and tests.
- Map CPUs to CpusPerTask (not MinimumCpus) for consistent SlurmDBD history
- Add Set:true to memory Uint64NoVal on submission
- Filter number=0 in mapUint64NoValToInt64 to avoid false zeros
- Extract peak memory from Steps.Tres.Requested.Max across all steps
- Add formatTresList, parseGresDetail, extractMemoryFromSteps helpers
- Update mapJobInfo and mapSlurmdbJob with new field mappings
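
The zero filter could be sketched as follows. The `Uint64NoVal` struct here is a simplified assumption (only `Set` and `Number`), and `uint64NoValToInt64` and `ptr` are hypothetical names rather than the project's helpers.

```go
package main

import "fmt"

// Uint64NoVal is a reduced stand-in for the Slurm "number with no-value flag" type.
type Uint64NoVal struct {
	Set    *bool
	Number *int64
}

// ptr returns a pointer to v; it only keeps the example short.
func ptr[T any](v T) *T { return &v }

// uint64NoValToInt64 returns nil when the value is unset or zero, so callers
// can tell "no data" apart from a real measurement.
func uint64NoValToInt64(v *Uint64NoVal) *int64 {
	if v == nil || v.Set == nil || !*v.Set {
		return nil
	}
	if v.Number == nil || *v.Number == 0 {
		return nil
	}
	n := *v.Number
	return &n
}

func main() {
	zero := uint64NoValToInt64(&Uint64NoVal{Set: ptr(true), Number: ptr(int64(0))})
	mem := uint64NoValToInt64(&Uint64NoVal{Set: ptr(true), Number: ptr(int64(4096))})
	fmt.Println(zero == nil, *mem)
}
```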
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Previously ValidateParams ran before WORK_DIR injection and file_ids mapping,
causing "required parameter missing" errors for auto-handled params. Now the
execution order is: inject WORK_DIR, map file_ids to file params, validate, resolve.
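
The effect of the reordering can be illustrated with a small sketch. `validateRequired` and `prepare` are hypothetical helpers; the file_ids mapping and resolve steps are left out, and WORK_DIR is injected unconditionally for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// validateRequired reports which required parameters have no value.
func validateRequired(required []string, values map[string]string) error {
	var missing []string
	for _, name := range required {
		if values[name] == "" {
			missing = append(missing, name)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("required parameter missing: %s", strings.Join(missing, ", "))
	}
	return nil
}

// prepare injects the auto-handled value first and validates afterwards.
func prepare(required []string, values map[string]string, workDir string) error {
	values["WORK_DIR"] = workDir
	return validateRequired(required, values)
}

func main() {
	required := []string{"WORK_DIR", "INPUT"}
	user := map[string]string{"INPUT": "case.dat"}
	fmt.Println(validateRequired(required, user))      // validating first fails
	fmt.Println(prepare(required, user, "/tmp/job-1")) // injecting first succeeds
}
```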
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Reuse MockSlurm + MockMinIO + TestEnv wiring pattern to create a standalone
binary that serves all API endpoints with in-memory SQLite and seed data.
- internal/mockserver/server.go: assembly logic (New/Close/Run), option pattern,
4 accessors for seed data injection
- cmd/mockserver/main.go: CLI flags (--port, --seed) and MOCK_PORT environment
variable, 6 seed jobs covering all 5 states, 2 seed applications, signal handling
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- ProcessTask injects $WORK_DIR only when script template uses it
- File/directory type params: resolves file_id to filename before rendering
- ValidateParams validates file/directory params as valid int64 file IDs
- RenderScript no longer shell-escapes file/directory type values
- Log rendered script before submitting to Slurm for debugging
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- GetFile uses new GetFileResponse instead of manual FileResponse construction
- ListFiles handler parses optional user_id query parameter
- Wire FolderStore into FileService in app.go, testenv, and file_test
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Comment out submit route assertions in main_test.go and server_test.go
- Comment out TestTask_OldAPICompatibility in task_test.go
- Update expected route count 31→30 in testenv env_test.go
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Comment out SubmitApplication handler method
- Comment out route registration in server.go (interface + router + placeholder)
- Comment out related handler tests
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Comment out SubmitFromApplication method and its fallback path
- Comment out 5 tests that tested the old direct-submission code
- Remove unused imports after commenting out the method
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- mockminio: in-memory ObjectStorage with all 11 methods, thread-safe, SHA256 ETag, Range support
- mockslurm: httptest server with 11 Slurm REST API endpoints, job eviction from active to history queue
- testenv: one-line test environment factory (SQLite + MockSlurm + MockMinIO + all stores/services/handlers + httptest server)
- integration tests: 37 tests covering Jobs(5), Cluster(5), App(6), Upload(5), File(4), Folder(4), Task(4), E2E(1)
- no external dependencies, no existing files modified
When multiple chunk uploads race on the pending→uploading transition, ignore ErrRecordNotFound from UpdateSessionStatus since another request already completed the update.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Slurm requires environment variables in job submission; without them it returns 'batch job cannot run without an environment'. Also chmod the entire directory path to 0777 to bypass umask, ensuring Slurm and compute node users can write.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Add ApplicationService with ValidateParams, RenderScript, SubmitFromApplication. Includes shell escaping, longest-first parameter replacement, and work directory generation. 15 tests.
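
Longest-first replacement matters when one parameter name is a prefix of another (e.g. NP and NP_MAX). The sketch below assumes `$NAME` placeholders and single-quote shell escaping; `renderScript` and `shellEscape` are illustrative, not the service's actual functions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// shellEscape wraps s in single quotes and escapes embedded single quotes.
func shellEscape(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// renderScript substitutes $NAME placeholders, longest names first, so that
// $NP does not clobber the front of $NP_MAX.
func renderScript(tmpl string, values map[string]string) string {
	names := make([]string, 0, len(values))
	for name := range values {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		if len(names[i]) != len(names[j]) {
			return len(names[i]) > len(names[j])
		}
		return names[i] < names[j] // stable order for equal lengths
	})
	out := tmpl
	for _, name := range names {
		out = strings.ReplaceAll(out, "$"+name, shellEscape(values[name]))
	}
	return out
}

func main() {
	script := renderScript("mpirun -n $NP --max $NP_MAX", map[string]string{
		"NP":     "4",
		"NP_MAX": "16",
	})
	fmt.Println(script)
}
```

A limitation of this simple approach is that a substituted value which itself contains a placeholder such as `$NP` would be rewritten by a later pass.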
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Add inline comments to SubmitJobRequest, JobListResponse, JobHistoryQuery, JobTemplate, CreateTemplateRequest, and UpdateTemplateRequest fields, consistent with existing cluster.go and JobResponse style.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
GetJob now falls back to SlurmDBD history when active queue returns 404 or empty jobs. Expand JobHistoryQuery from 7 to 16 filter params (add SubmitTime, Cluster, Qos, Constraints, ExitCode, Node, Reservation, Groups, Wckey).
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace ErrorResponse with SlurmAPIError that extracts structured errors/warnings from JSON body when Slurm returns non-2xx (e.g. 404 with valid JSON). Add IsNotFound helper for fallback logic.
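
A sketch of such an error type follows. The field set and the JSON keys (`errors`, `warnings`, `error`, `description`) are assumptions about the response shape, and `newAPIError` is a hypothetical constructor.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// apiMessage is one entry of the errors/warnings arrays (assumed shape).
type apiMessage struct {
	Description string `json:"description"`
	Error       string `json:"error"`
}

// SlurmAPIError carries the HTTP status plus any structured details.
type SlurmAPIError struct {
	StatusCode int
	Errors     []apiMessage
	Warnings   []apiMessage
}

func (e *SlurmAPIError) Error() string {
	if len(e.Errors) > 0 {
		return fmt.Sprintf("slurm API %d: %s", e.StatusCode, e.Errors[0].Error)
	}
	return fmt.Sprintf("slurm API %d", e.StatusCode)
}

// newAPIError builds the error from a non-2xx response; a body that is not
// valid JSON simply yields an error without structured details.
func newAPIError(status int, body []byte) error {
	apiErr := &SlurmAPIError{StatusCode: status}
	var payload struct {
		Errors   []apiMessage `json:"errors"`
		Warnings []apiMessage `json:"warnings"`
	}
	if json.Unmarshal(body, &payload) == nil {
		apiErr.Errors, apiErr.Warnings = payload.Errors, payload.Warnings
	}
	return apiErr
}

// IsNotFound reports whether err is, or wraps, a 404 SlurmAPIError.
func IsNotFound(err error) bool {
	var apiErr *SlurmAPIError
	return errors.As(err, &apiErr) && apiErr.StatusCode == http.StatusNotFound
}

func main() {
	err := newAPIError(404, []byte(`{"errors":[{"error":"Invalid job id specified"}]}`))
	wrapped := fmt.Errorf("get job: %w", err)
	fmt.Println(wrapped, IsNotFound(wrapped))
}
```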
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>