Commit Graph

73 Commits

Author SHA1 Message Date
dailz
36d842350c refactor(service): disable SubmitFromApplication fallback, fully replaced by POST /tasks
- Comment out SubmitFromApplication method and its fallback path

- Comment out 5 tests that tested the old direct-submission code

- Remove unused imports after commenting out the method

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-16 15:15:42 +08:00
dailz
80f2bd32d9 docs(openapi): update spec to match code — add Tasks, fix schemas, remove submit endpoint
- Add Tasks tag, /tasks paths, and Task schemas (CreateTaskRequest, TaskResponse, TaskListResponse)

- Fix SubmitJobRequest.work_dir, InitUploadRequest mime_type/chunk_size, UploadSessionResponse.created_at

- Fix FolderResponse: add file_count/subfolder_count, remove updated_at

- Fix response wrapping for File/Upload/Folder endpoints to use ApiResponseSuccess

- Remove /applications/{id}/submit path and ApplicationSubmitRequest schema

- Update Applications tag description

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-16 15:15:31 +08:00
dailz
52a34e2cb0 feat(client): add CLI client entry point 2026-04-16 13:24:21 +08:00
dailz
b9b2f0d9b4 feat(testutil): add MockSlurm, MockMinIO, TestEnv and 37 integration tests
- mockminio: in-memory ObjectStorage with all 11 methods, thread-safe, SHA256 ETag, Range support
- mockslurm: httptest server with 11 Slurm REST API endpoints, job eviction from active to history queue
- testenv: one-line test environment factory (SQLite + MockSlurm + MockMinIO + all stores/services/handlers + httptest server)
- integration tests: 37 tests covering Jobs(5), Cluster(5), App(6), Upload(5), File(4), Folder(4), Task(4), E2E(1)
- no external dependencies, no existing files modified
2026-04-16 13:23:27 +08:00
dailz
73504f9fdb feat(app): add TaskPoller, wire DI, and add task integration tests 2026-04-15 21:31:17 +08:00
dailz
3f8a680c99 feat(handler): add TaskHandler endpoints and register task routes 2026-04-15 21:31:11 +08:00
dailz
ec64300ff2 feat(service): add TaskService, FileStagingService, and refactor ApplicationService for task submission 2026-04-15 21:31:02 +08:00
dailz
acf8c1d62b feat(store): add TaskStore CRUD and batch query methods for files and blobs 2026-04-15 21:30:51 +08:00
dailz
d46a784efb feat(model): add Task model, DTOs, and status constants for task submission system 2026-04-15 21:30:44 +08:00
dailz
79870333cb fix(service): tolerate concurrent pending-to-uploading status race in UploadChunk
When multiple chunk uploads race on the pending→uploading transition, ignore ErrRecordNotFound from UpdateSessionStatus since another request already completed the update.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 10:27:12 +08:00
dailz
d9a60c3511 fix(model): rename Application table to hpc_applications
Avoid table name collision with other systems.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:32:11 +08:00
dailz
20576bc325 docs(openapi): add file storage API specifications
Add 13 endpoints for chunked upload, file management, and folder CRUD with 6 new schemas.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:30:18 +08:00
dailz
c0176d7764 feat(app): wire file storage DI, cleanup worker, and integration tests
Add DI wiring with graceful MinIO fallback, background cleanup worker for expired sessions and leaked multipart uploads, and end-to-end integration tests.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:23:25 +08:00
dailz
2298e92516 feat(handler): add upload, file, and folder handlers with routes
Add UploadHandler (5 endpoints), FileHandler (4 endpoints), FolderHandler (4 endpoints) with Gin route registration in server.go.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:23:17 +08:00
dailz
f0847d3978 feat(service): add upload, download, file, and folder services
Add UploadService (dedup, chunk lifecycle, ComposeObject), DownloadService (Range support), FileService (ref counting), FolderService (path validation).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:23:09 +08:00
dailz
a114821615 feat(server): add streaming response helpers for file download
Add ParseRange, StreamFile, StreamRange for full and partial content delivery.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:22:58 +08:00
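As a rough idea of what a helper like ParseRange has to do, the sketch below parses a single-range `Range` header against a known file size. The function name, signature, and error value are assumptions made for illustration; the commit does not show the actual implementation, and multi-range requests are deliberately rejected here.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

var errBadRange = errors.New("unsatisfiable or malformed range")

// parseRange is a hypothetical single-range parser. It returns the inclusive
// start and end byte offsets for a header such as "bytes=0-99".
func parseRange(header string, size int64) (int64, int64, error) {
	if !strings.HasPrefix(header, "bytes=") || size <= 0 {
		return 0, 0, errBadRange
	}
	spec := strings.TrimPrefix(header, "bytes=")
	if strings.Contains(spec, ",") {
		return 0, 0, errBadRange // multi-range requests are out of scope here
	}
	first, last, ok := strings.Cut(spec, "-")
	if !ok {
		return 0, 0, errBadRange
	}
	if first == "" { // suffix form: "bytes=-N" means the last N bytes
		n, err := strconv.ParseInt(last, 10, 64)
		if err != nil || n <= 0 {
			return 0, 0, errBadRange
		}
		if n > size {
			n = size
		}
		return size - n, size - 1, nil
	}
	start, err := strconv.ParseInt(first, 10, 64)
	if err != nil || start < 0 || start >= size {
		return 0, 0, errBadRange
	}
	if last == "" { // open-ended form: "bytes=N-"
		return start, size - 1, nil
	}
	end, err := strconv.ParseInt(last, 10, 64)
	if err != nil || end < start {
		return 0, 0, errBadRange
	}
	if end >= size {
		end = size - 1 // clamp to the last byte of the file
	}
	return start, end, nil
}

func main() {
	for _, h := range []string{"bytes=0-99", "bytes=500-", "bytes=-100"} {
		start, end, err := parseRange(h, 1000)
		fmt.Println(h, start, end, err)
	}
}
```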
dailz
bf89de12f0 feat(store): add blob, file, folder, and upload stores
Add BlobStore (ref counting), FileStore (soft delete + pagination), FolderStore (materialized path), UploadStore (idempotent upsert), and update AutoMigrate.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:22:44 +08:00
dailz
c861ff3adf feat(storage): add ObjectStorage interface and MinIO client
Add ObjectStorage interface (11 methods) with MinioClient implementation using minio-go Core.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:22:33 +08:00
dailz
0e4f523746 feat(model): add file storage GORM models and DTOs
Add FileBlob, File, Folder, UploadSession, UploadChunk models with validators.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:22:25 +08:00
dailz
44895214d4 feat(config): add MinIO object storage configuration
Add MinioConfig struct with connection, bucket, chunk size, and session TTL settings.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-15 09:22:18 +08:00
dailz
a65c8762af fix(service): add environment variables and fix work directory permissions for Slurm job submission
Slurm requires environment variables in job submission; without them it returns 'batch job cannot run without an environment'. Also chmod the entire directory path to 0777 to bypass umask, ensuring Slurm and compute node users can write.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-14 13:06:51 +08:00
dailz
04f99cc1c4 docs(openapi): update spec for Application Definition
Add 6 application endpoints and schemas to OpenAPI spec. Update .gitignore.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-13 17:13:02 +08:00
dailz
32f5792b68 feat(service): pass work directory to Slurm job submission
Add WorkDir to SubmitJobRequest and pass it as CurrentWorkingDirectory to Slurm REST API. Fixes Slurm 500 error when working directory is not specified.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-13 17:12:28 +08:00
dailz
328691adff feat(config): add WorkDirBase for application job working directory
Add WorkDirBase config field for auto-generated job working directories. Pattern: {base}/{app_name}/{timestamp}_{random}/

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-13 17:11:48 +08:00
dailz
10bb15e5b2 feat(handler): add Application handler, routes, and wiring
Add ApplicationHandler with CRUD + Submit endpoints. Register 6 routes, wire in app.go, update main_test.go references. 22 handler tests.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-13 17:10:54 +08:00
dailz
d3eb728c2f feat(service): add Application service with parameter validation and script rendering
Add ApplicationService with ValidateParams, RenderScript, SubmitFromApplication. Includes shell escaping, longest-first parameter replacement, and work directory generation. 15 tests.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-13 17:10:09 +08:00
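To illustrate why longest-first replacement and shell escaping matter when rendering a script, here is a small sketch. The `$NAME` placeholder syntax and the function names are assumptions; the commit does not show the real template format. Ordering names longest-first ensures `$INPUT_FILE` is not mistaken for `$INPUT` followed by `_FILE`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// shellEscape wraps s in single quotes and escapes embedded single quotes,
// so the shell treats the value as one literal word.
func shellEscape(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// renderScript is a hypothetical renderer for "$NAME" placeholders.
func renderScript(tmpl string, params map[string]string) string {
	names := make([]string, 0, len(params))
	for name := range params {
		names = append(names, name)
	}
	// Longest names first; ties broken alphabetically for determinism.
	sort.Slice(names, func(i, j int) bool {
		if len(names[i]) != len(names[j]) {
			return len(names[i]) > len(names[j])
		}
		return names[i] < names[j]
	})
	pairs := make([]string, 0, 2*len(names))
	for _, name := range names {
		pairs = append(pairs, "$"+name, shellEscape(params[name]))
	}
	// strings.Replacer compares old strings in argument order and makes a
	// single pass, so substituted values are never rescanned for placeholders.
	return strings.NewReplacer(pairs...).Replace(tmpl)
}

func main() {
	script := renderScript("cat $INPUT_FILE > $INPUT", map[string]string{
		"INPUT":      "out dir",
		"INPUT_FILE": "it's.txt",
	})
	fmt.Println(script)
}
```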
dailz
4a8153aa6c feat(model): add Application model and store
Add Application and ParameterSchema models with CRUD store. Includes 10 store tests and ParamType constants.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-13 17:08:24 +08:00
dailz
dd8d226e78 refactor: remove JobTemplate production code
Remove all JobTemplate model, store, handler, migrations, and wiring. Replaced by Application Definition system.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-13 17:07:46 +08:00
dailz
62e458cb7a docs(openapi): update GET /jobs with pagination and JobListResponse
Add page/page_size query parameters, change response from JobResponse[] to JobListResponse, add 400 error code.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 15:15:24 +08:00
dailz
2cb6fbecdd feat(service): add pagination to GetJobs endpoint
GetJobs now accepts page/page_size query parameters and returns JobListResponse instead of raw array. Uses in-memory pagination matching GetJobHistory pattern.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 15:14:56 +08:00
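In-memory pagination of the kind mentioned here amounts to slicing an already-fetched list. The sketch below is an assumption about the general shape; the function name and its handling of invalid input are illustrative, and in the real service invalid parameters are presumably rejected earlier with a 400.

```go
package main

import "fmt"

// paginate returns the 1-based page of items with at most pageSize entries.
// It returns nil for invalid parameters and an empty slice past the last page.
func paginate[T any](items []T, page, pageSize int) []T {
	if page < 1 || pageSize < 1 {
		return nil
	}
	offset := page - 1
	if offset > len(items)/pageSize {
		return []T{} // also avoids integer overflow in offset*pageSize
	}
	start := offset * pageSize
	if start >= len(items) {
		return []T{}
	}
	end := start + pageSize
	if end > len(items) {
		end = len(items)
	}
	return items[start:end]
}

func main() {
	jobs := []int{101, 102, 103, 104, 105}
	fmt.Println(paginate(jobs, 1, 2), paginate(jobs, 3, 2), paginate(jobs, 4, 2))
}
```

The total reported alongside the page would simply be `len(items)` before slicing.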
dailz
35a4017b8e docs(model): add Chinese field comments to all model structs
Add inline comments to SubmitJobRequest, JobListResponse, JobHistoryQuery, JobTemplate, CreateTemplateRequest, and UpdateTemplateRequest fields, consistent with existing cluster.go and JobResponse style.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 13:53:54 +08:00
dailz
f4177dd287 feat(service): add GetJob fallback to SlurmDBD history and expand query params
GetJob now falls back to SlurmDBD history when active queue returns 404 or empty jobs. Expand JobHistoryQuery from 7 to 16 filter params (add SubmitTime, Cluster, Qos, Constraints, ExitCode, Node, Reservation, Groups, Wckey).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 13:43:31 +08:00
dailz
b3d787c97b fix(slurm): parse structured errors from non-2xx Slurm API responses
Replace ErrorResponse with SlurmAPIError that extracts structured errors/warnings from JSON body when Slurm returns non-2xx (e.g. 404 with valid JSON). Add IsNotFound helper for fallback logic.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 13:43:17 +08:00
dailz
30f0fbc34b fix(slurm): correct PartitionInfoMaximums CpusPerNode/CpusPerSocket types to Uint32NoVal
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 11:39:29 +08:00
dailz
34ba617cbf fix(test): update log assertions for debug logging and field expansion
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 11:13:13 +08:00
dailz
824d9e816f feat(service): map additional Slurm SDK fields and fix ExitCode/Default bugs
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 11:12:51 +08:00
dailz
85901fe18a feat(model): expand API response fields to expose full Slurm data
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 11:12:33 +08:00
dailz
270552ba9a feat(service): add debug logging for Slurm API calls with request/response body and latency
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 10:28:58 +08:00
dailz
347b0e1229 fix: remove redundant binding tags and clarify logger compress logic
- Remove binding:"required" from model fields that are manually validated in handlers.

- Add parentheses to logger compress default to clarify operator precedence.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 09:25:46 +08:00
dailz
c070dd8abc fix(slurm): add default 30s timeout to HTTP client
Replaces http.DefaultClient with a client that has a 30s timeout to prevent indefinite hangs when the Slurm REST API is unresponsive.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 09:25:35 +08:00
dailz
1359730300 fix(store): return ErrRecordNotFound when updating non-existent template
RowsAffected == 0 now returns gorm.ErrRecordNotFound so the handler can respond with 404 instead of silently returning 200.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 09:21:03 +08:00
dailz
4ff02d4a80 fix: remove redundant defer application.Close() in main()
Run() already calls Close() on every exit path, so the defer in main is redundant.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 08:48:42 +08:00
dailz
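1784331969 feat: add application skeleton with configurable zap logging wired through the whole call chain
- cmd/server/main.go: use logger.NewLogger(cfg.Log) instead of zap.NewProduction()

- internal/app: assemble DB/Slurm/Service/Handler via dependency injection and pass the logger through

- internal/middleware: RequestLogger request-logging middleware

- internal/server: unified response format and route registration

- go.mod: rename module to gcy_hpc_server and add gin/zap/lumberjack/gorm dependencies

- Fail fast (os.Exit(1)) when logger initialization fails

- Pass GormLevel from config to NewGormDB so it can be configured independently in YAML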

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 08:40:16 +08:00
dailz
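e6162063ca feat: add HTTP handler layer and structured logging
- JobHandler: submit/query/cancel/history, logging 5xx at Error and 4xx at Warn

- ClusterHandler: nodes/partitions/diagnostics, logging errors and not-found cases

- TemplateHandler: CRUD operations, logging create/update/delete at Info and not-found at Warn

- Do not log successful responses (handled by middleware.RequestLogger)

- Do not log request bodies or template contents (for security reasons)

- Full TDD tests, using zaptest/observer to verify log levels and fields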

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 08:40:06 +08:00
dailz
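4903f7d07f feat: add business service layer and structured logging
- JobService: submit, query, cancel, and history, logging key operations

- ClusterService: node, partition, and diagnostics queries, logging errors

- NewSlurmClient: factory for a JWT-authenticated HTTP client

- All constructors accept a *zap.Logger parameter for dependency injection

- Log successful submit/cancel at Info and API errors at Error

- Full TDD tests, using zaptest/observer to verify log output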

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 08:39:46 +08:00
dailz
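fbfd5c5f42 feat: add data models and store layer
- model: define JobTemplate, SubmitJobRequest, JobHistoryQuery, and related models

- store: NewGormDB MySQL connection pool, using zap logging in place of GORM's default logger

- store: TemplateStore CRUD operations with GORM AutoMigrate support

- NewGormDB accepts a gormLevel parameter whose value is passed in from config by the caller

- Full TDD test coverage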

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 08:39:30 +08:00
dailz
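f7a21ee455 feat: add zap logger factory and GORM logger bridge
- NewLogger factory function: supports JSON/Console encoding, stdout/file/multi-output, and lumberjack rotation

- NewGormLogger implements gorm.Interface: Trace distinguishes errors, slow queries, and normal queries

- output_stdout is handled as a tri-state *bool (nil=true, true, false)

- Defaults: level=info, encoding=json, max_size=100, max_backups=5, max_age=30

- Slow-query threshold is 200ms; ErrRecordNotFound is not treated as an error

- Compile-time interface check: var _ gormlogger.Interface = (*GormLogger)(nil)

- Full TDD test coverage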

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 08:39:21 +08:00
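The tri-state `*bool` handling for output_stdout can be illustrated with a small sketch. The struct and helper names below are assumptions and the YAML decoding step is omitted; the point is only that a nil pointer (field absent from the config) is distinguishable from an explicit false.

```go
package main

import "fmt"

// logConfig is a cut-down, hypothetical version of the logging config.
type logConfig struct {
	OutputStdout *bool // nil means "not set in the config file"
}

// stdoutEnabled applies the documented default: unset behaves like true.
func stdoutEnabled(c logConfig) bool {
	return c.OutputStdout == nil || *c.OutputStdout
}

// boolPtr is a small helper for building pointer values in examples and tests.
func boolPtr(b bool) *bool { return &b }

func main() {
	fmt.Println(
		stdoutEnabled(logConfig{}),                             // unset
		stdoutEnabled(logConfig{OutputStdout: boolPtr(true)}),  // explicit true
		stdoutEnabled(logConfig{OutputStdout: boolPtr(false)}), // explicit false
	)
}
```

A plain `bool` field could not express this, because its zero value is indistinguishable from an explicit `false`.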
dailz
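7550e75945 feat: add config loading and logging configuration support
- Add LogConfig struct with 9 logging fields (level, encoding, output_stdout, file_path, max_size, max_backups, max_age, compress, gorm_level)

- Add Log field to the Config struct, with YAML parsing support

- output_stdout uses a *bool pointer type; nil defaults to true

- Update config.example.yaml with a complete log section

- Add TDD tests: log config parsing, backward compatibility, and field completeness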

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 08:39:09 +08:00
dailz
246c19c052 feat(client): add functional option pattern for JWT auth config
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 10:33:31 +08:00
dailz
f8119ff9e5 feat(auth): add JWTAuthTransport with auto-refresh RoundTrip
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 10:31:39 +08:00