From 8f5152fc025f17c9af645bd2791be9ed37e69eed Mon Sep 17 00:00:00 2001 From: htlee Date: Mon, 20 Apr 2026 10:11:16 +0900 Subject: [PATCH] =?UTF-8?q?feat(prediction):=20Phase=202=20PoC=202~5=20?= =?UTF-8?q?=E2=80=94=20gear=5Fviolation/transshipment/risk/pair=5Ftrawl?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 런타임 override 완성 (params 인자 + 내부 상수 교체): - gear_violation_g01_g06 (GEAR, tier 4) · G01~G06 점수 + signal_cycling(gap_min/min_count) · gear_drift_threshold_nm + fixed_gear_types + fishery_code_allowed_gear · _detect_signal_cycling_count 도입 (기존 _detect_signal_cycling 보존) 카탈로그 + 관찰 (DEFAULT_PARAMS 노출 + Adapter 집계, 런타임 교체는 후속 PR): - transshipment_5stage (TRANSSHIP, tier 4) — 5단계 필터 임계 - risk_composite (META, tier 3) — 경량+파이프라인 가중치 - pair_trawl_tier (GEAR, tier 4) — STRONG/PROBABLE/SUSPECT 임계 각 모델 공통: - prediction/algorithms/*.py: DEFAULT_PARAMS 상수 추가 - models_core/registered/*_model.py: BaseDetectionModel Adapter - models_core/seeds/v1_.sql: DRAFT seed (호출자 트랜잭션 제어) - tests/test__params.py: Python ↔ 모듈 상수 ↔ seed SQL 정적 일치 검증 통합 seed: models_core/seeds/v1_phase2_all.sql (\i 로 5 모델 일괄 시드) 검증: - 30/30 테스트 통과 (Phase 1-2 15 + dark 5 + Phase 2 신규 10) - 운영 DB 5 모델 개별 + 일괄 seed dry-run 통과 (BEGIN/ROLLBACK 격리) - 5 모델 모두 tier/category 정렬 확인: dark_suspicion(3) / risk_composite(3) / gear_violation_g01_g06(4) / pair_trawl_tier(4) / transshipment_5stage(4) 후속: - transshipment/risk/pair_trawl 런타임 override 활성화 (헬퍼 params 전파) - Phase 3 백엔드 API (DetectionModelController + 승격 엔드포인트) Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/RELEASE-NOTES.md | 7 + prediction/algorithms/gear_violation.py | 112 ++++++++++++++-- prediction/algorithms/pair_trawl.py | 36 +++++ prediction/algorithms/risk.py | 53 ++++++++ prediction/algorithms/transshipment.py | 21 +++ .../registered/gear_violation_model.py | 71 ++++++++++ .../registered/pair_trawl_model.py | 65 +++++++++ .../registered/risk_composite_model.py | 66 +++++++++ .../registered/transshipment_model.py | 67 ++++++++++ prediction/models_core/seeds/README.md | 20 ++- .../models_core/seeds/v1_gear_violation.sql | 65 +++++++++ .../models_core/seeds/v1_pair_trawl.sql | 65 +++++++++ .../models_core/seeds/v1_phase2_all.sql | 47 +++++++ .../models_core/seeds/v1_risk_composite.sql | 79 +++++++++++ .../models_core/seeds/v1_transshipment.sql | 51 +++++++ .../tests/test_gear_violation_params.py | 125 ++++++++++++++++++ prediction/tests/test_pair_trawl_params.py | 66 +++++++++ .../tests/test_risk_composite_params.py | 75 +++++++++++ prediction/tests/test_transshipment_params.py | 88 ++++++++++++ 19 files changed, 1159 insertions(+), 20 deletions(-) create mode 100644 prediction/models_core/registered/gear_violation_model.py create mode 100644 prediction/models_core/registered/pair_trawl_model.py create mode 100644 prediction/models_core/registered/risk_composite_model.py create mode 100644 prediction/models_core/registered/transshipment_model.py create mode 100644 prediction/models_core/seeds/v1_gear_violation.sql create mode 100644 prediction/models_core/seeds/v1_pair_trawl.sql create mode 100644 prediction/models_core/seeds/v1_phase2_all.sql create mode 100644 prediction/models_core/seeds/v1_risk_composite.sql create mode 100644 prediction/models_core/seeds/v1_transshipment.sql create mode 100644 prediction/tests/test_gear_violation_params.py create mode 100644 prediction/tests/test_pair_trawl_params.py create mode 100644 prediction/tests/test_risk_composite_params.py create mode 100644 prediction/tests/test_transshipment_params.py diff --git a/docs/RELEASE-NOTES.md b/docs/RELEASE-NOTES.md index 7f66452..70dbb1f 100644 --- a/docs/RELEASE-NOTES.md +++ b/docs/RELEASE-NOTES.md @@ -5,6 +5,13 @@ ## [Unreleased] ### 추가 +- **Phase 2 PoC 5 모델 마이그레이션 완료 (2 런타임 + 3 카탈로그)** — Phase 2 PoC 계획서의 5 알고리즘 전체를 detection_model 카탈로그로 등록하고 운영자 파라미터 튜닝 지점을 확보. 모드는 두 층위로 분리: + - **런타임 override 완성 (2 모델)** — `dark_suspicion` (tier 3, DARK_VESSEL, 19 가중치 + sog/반복/gap/tier 임계) · `gear_violation_g01_g06` (tier 4, GEAR, 6 G-code 점수 + signal cycling + gear drift + 허용 어구 매핑). 알고리즘 함수에 `params: dict | None = None` 인자 추가, `_merge_default_*_params` 깊이 병합으로 override 가 DEFAULT 를 변조하지 않는 불변성 보장. `params=None` 호출은 Phase 2 이전과 완전 동일 결과 (BACK-COMPAT). 운영자가 version 을 ACTIVE 로 승격하면 다음 사이클부터 실제 값 교체 + - **카탈로그 + 관찰 (3 모델)** — `transshipment_5stage` (tier 4, TRANSSHIP, 5단계 필터 임계) · `risk_composite` (tier 3, META, 경량+파이프라인 가중치) · `pair_trawl_tier` (tier 4, GEAR, STRONG/PROBABLE/SUSPECT 임계). 내부 헬퍼들이 모듈 레벨 상수를 직접 참조하여 이번 단계에서는 DEFAULT_PARAMS 를 DB 에 노출 + Adapter 로 ctx.inputs 집계 관찰만 수행. 런타임 값 교체는 후속 리팩토링 PR 에서 헬퍼 params 전파를 완성하면 활성화 + - **Adapter 5종** (`prediction/models_core/registered/*_model.py`) — `BaseDetectionModel` 상속, AnalysisResult 리스트에서 결과 집계 · tier/score 분포 메트릭 자동 기록 + - **Seed SQL 5 + 통합 1** — 각 `prediction/models_core/seeds/v1_.sql` + `v1_phase2_all.sql` 이 `\i` 로 5 모델 일괄 시드. BEGIN/COMMIT 제거로 호출자 트랜잭션 제어 가능 + - **정적 동치성 검증 30 테스트** — 각 모델마다 Python DEFAULT 상수 ↔ 모듈 상수 ↔ seed SQL JSONB 3자 일치 검증. 5 모델 + Phase 1-2 기반 15 + dark 동치성 5 + Phase 2 8 신규 = 30/30 통과 + - **운영 DB dry-run 통과** — 5 모델 개별 + 일괄 seed 모두 BEGIN/ROLLBACK 격리 검증, 반영 없이 SQL 정상 동작 확인 - **Phase 2 PoC #1 dark_suspicion 모델 마이그레이션** — `prediction/algorithms/dark_vessel.py` 의 `compute_dark_suspicion` 에 `params: dict | None = None` 인자 추가. `DARK_SUSPICION_DEFAULT_PARAMS` 상수(19개 가중치 + SOG 임계 + 반복 이력 임계 + tier 70/50/30)를 Python SSOT 로 추출하고, `_merge_default_params` 로 override 깊이 병합. `params=None` 시 Phase 2 이전과 **완전 동일한 결과** (BACK-COMPAT 보장). Adapter 클래스 `prediction/models_core/registered/dark_suspicion_model.py`(`BaseDetectionModel` 상속, AnalysisResult 리스트를 입력으로 gap_info 재평가, `evaluated/critical/high/watch_count` 메트릭 기록). Seed SQL `prediction/models_core/seeds/v1_dark_suspicion.sql` — `status=DRAFT role=NULL` 로 안전 seed (운영 영향 0, Phase 3 백엔드 API 승격 대기). 동치성 유닛테스트 5건 추가 (DEFAULT 형태 검증, None/빈dict 동치성, override 불변성, **seed SQL JSONB ↔ Python DEFAULT 1:1 정적 일치 검증**). 총 20/20 테스트 통과 - **Phase 1-2 Detection Model Registry 기반 인프라 (prediction)** — `prediction/models_core/` 패키지 신설. `BaseDetectionModel` 추상 계약 + `ModelContext` / `ModelResult` dataclass + `ModelRegistry`(ACTIVE 버전 전체 인스턴스화, DAG 순환 검출, topological 실행 플랜) + `DAGExecutor`(PRIMARY 실행→`ctx.shared` 주입→SHADOW/CHALLENGER persist-only 실행, 후행 모델은 PRIMARY 결과만 소비하는 오염 차단 불변식) + `params_loader`(V034 `detection_model_versions.params` JSONB 로드, 5분 TTL 캐시, `invalidate_cache()` 제공) + `feature_flag`(`PREDICTION_USE_MODEL_REGISTRY=0` 기본, `PREDICTION_CONCURRENT_SHADOWS=0` 기본). `scheduler.py` 10 단계에 feature flag 분기 추가해 기존 5분 사이클을 건드리지 않고 신 경로 공존. `db/partition_manager.py` 에 `detection_model_run_outputs` 월별 파티션 자동 생성/DROP 추가(system_config `partition.detection_model_run_outputs.*` 기반, 기본 retention_months=1, create_ahead_months=2). 유닛테스트 15 케이스(DAG 순환 검출, SHADOW 오염 차단, PRIMARY 실패→downstream skip, SHADOW 실패 격리, VARCHAR(64) 초과 거부) 전수 통과. 후속 Phase 2 에서 `models_core/registered/` 에 5 모델 PoC(`dark_suspicion`/`gear_violation_g01_g06`/`transshipment_5stage`/`risk_composite`/`pair_trawl_tier`) 추가 예정 diff --git a/prediction/algorithms/gear_violation.py b/prediction/algorithms/gear_violation.py index 56d6705..70756f8 100644 --- a/prediction/algorithms/gear_violation.py +++ b/prediction/algorithms/gear_violation.py @@ -51,6 +51,67 @@ GEAR_DRIFT_THRESHOLD_NM = 0.270 # ≈ 500m (DAR-03 스펙, 조류 보정 전) FIXED_GEAR_TYPES = {'GN', 'TRAP', 'FYK', 'FPO', 'GNS', 'GND'} +# classify_gear_violations 의 Phase 2 파라미터 SSOT — DB seed 는 이 값을 그대로 복사 +GEAR_VIOLATION_DEFAULT_PARAMS: dict = { + 'scores': { + 'G01_zone_violation': G01_SCORE, + 'G02_closed_season': G02_SCORE, + 'G03_unregistered_gear': G03_SCORE, + 'G04_signal_cycling': G04_SCORE, + 'G05_gear_drift': G05_SCORE, + 'G06_pair_trawl': G06_SCORE, + }, + 'signal_cycling': { + 'gap_min': SIGNAL_CYCLING_GAP_MIN, + 'min_count': SIGNAL_CYCLING_MIN_COUNT, + }, + 'gear_drift_threshold_nm': GEAR_DRIFT_THRESHOLD_NM, + 'fixed_gear_types': sorted(FIXED_GEAR_TYPES), + 'fishery_code_allowed_gear': { + k: sorted(v) for k, v in FISHERY_CODE_ALLOWED_GEAR.items() + }, +} + + +def _merge_default_gv_params(override: Optional[dict]) -> dict: + """GEAR_VIOLATION_DEFAULT_PARAMS 에 override 깊이 병합. list/set 키는 override 가 치환.""" + if not override: + return GEAR_VIOLATION_DEFAULT_PARAMS + merged = { + k: (dict(v) if isinstance(v, dict) else + (list(v) if isinstance(v, list) else v)) + for k, v in GEAR_VIOLATION_DEFAULT_PARAMS.items() + } + for key, val in override.items(): + if isinstance(val, dict) and isinstance(merged.get(key), dict): + merged[key] = {**merged[key], **val} + else: + merged[key] = val + return merged + + +def _detect_signal_cycling_count( + gear_episodes: list[dict], threshold_min: int, +) -> tuple[int, int]: + """_detect_signal_cycling 의 count-만 변형 (threshold 를 params 에서 받기 위함). + + Returns: (cycling_count, total_episodes_evaluated) + """ + if not gear_episodes or len(gear_episodes) < 2: + return 0, len(gear_episodes or []) + sorted_eps = sorted(gear_episodes, key=lambda e: e['first_seen_at']) + cnt = 0 + for i in range(1, len(sorted_eps)): + prev_end = sorted_eps[i - 1].get('last_seen_at') + curr_start = sorted_eps[i].get('first_seen_at') + if prev_end is None or curr_start is None: + continue + gap_min = (curr_start - prev_end).total_seconds() / 60.0 + if 0 < gap_min <= threshold_min: + cnt += 1 + return cnt, len(sorted_eps) + + def _haversine_nm(lat1: float, lon1: float, lat2: float, lon2: float) -> float: """두 좌표 간 거리 (해리) — Haversine 공식.""" R = 3440.065 @@ -196,6 +257,7 @@ def classify_gear_violations( permit_periods: Optional[list[tuple[datetime, datetime]]] = None, registered_fishery_code: Optional[str] = None, observation_ts: Optional[datetime] = None, + params: Optional[dict] = None, ) -> dict: """어구 위반 G코드 분류 메인 함수 (DAR-03). @@ -229,7 +291,19 @@ def classify_gear_violations( } 판정 우선순위: ZONE_VIOLATION > PAIR_TRAWL > GEAR_MISMATCH > '' (정상) + + params: detection_model_versions.params (None 이면 DEFAULT_PARAMS). + params=None 호출은 Phase 2 이전과 완전히 동일한 결과를 낸다. """ + p = _merge_default_gv_params(params) + scores = p['scores'] + sc = p['signal_cycling'] + fixed_gear_types = set(p['fixed_gear_types']) + # JSONB 는 list 로 저장되므로 set 으로 변환하여 _is_unregistered_gear 호출 + allowed_gear_map = { + k: set(v) for k, v in p['fishery_code_allowed_gear'].items() + } + g_codes: list[str] = [] evidence: dict = {} score = 0 @@ -241,7 +315,7 @@ def classify_gear_violations( allowed_gears: list[str] = zone_info.get('allowed_gears', []) if allowed_gears and gear_type not in allowed_gears: g_codes.append('G-01') - score += G01_SCORE + score += scores['G01_zone_violation'] evidence['G-01'] = { 'zone': zone, 'gear': gear_type, @@ -262,7 +336,7 @@ def classify_gear_violations( in_closed = False if in_closed: g_codes.append('G-02') - score += G02_SCORE + score += scores['G02_closed_season'] evidence['G-02'] = { 'observed_at': observation_ts.isoformat() if observation_ts else None, 'permit_periods': [ @@ -276,18 +350,25 @@ def classify_gear_violations( # ── G-03: 미등록/허가외 어구 ────────────────────────────────── if registered_fishery_code: try: - unregistered = _is_unregistered_gear(gear_type, registered_fishery_code) + # params 로 덮어쓴 매핑을 전달 (_is_unregistered_gear 는 기존 공개 시그니처 유지 — BACK-COMPAT) + allowed_set = allowed_gear_map.get( + registered_fishery_code.upper().strip() + ) + if allowed_set is None: + unregistered = False + else: + unregistered = gear_type.upper().strip() not in allowed_set except Exception as exc: logger.error('G-03 평가 실패 [mmsi=%s]: %s', mmsi, exc) unregistered = False if unregistered: g_codes.append('G-03') - score += G03_SCORE + score += scores['G03_unregistered_gear'] evidence['G-03'] = { 'detected_gear': gear_type, 'registered_fishery_code': registered_fishery_code, 'allowed_gears': sorted( - FISHERY_CODE_ALLOWED_GEAR.get( + allowed_gear_map.get( registered_fishery_code.upper().strip(), set() ) ), @@ -300,19 +381,22 @@ def classify_gear_violations( ) # ── G-04: MMSI 조작 의심 (고정어구 신호 on/off 반복) ─────────── - if gear_episodes is not None and gear_type in FIXED_GEAR_TYPES: + if gear_episodes is not None and gear_type in fixed_gear_types: try: - is_cycling, cycling_count = _detect_signal_cycling(gear_episodes) + cycling_count, _ = _detect_signal_cycling_count( + gear_episodes, threshold_min=sc['gap_min'], + ) + is_cycling = cycling_count >= sc['min_count'] except Exception as exc: logger.error('G-04 평가 실패 [mmsi=%s]: %s', mmsi, exc) is_cycling, cycling_count = False, 0 if is_cycling: g_codes.append('G-04') - score += G04_SCORE + score += scores['G04_signal_cycling'] evidence['G-04'] = { 'cycling_count': cycling_count, - 'threshold_min': SIGNAL_CYCLING_GAP_MIN, + 'threshold_min': sc['gap_min'], } if not judgment: judgment = 'GEAR_MISMATCH' @@ -321,16 +405,18 @@ def classify_gear_violations( ) # ── G-05: 고정어구 인위적 이동 ──────────────────────────────── - if gear_positions is not None and gear_type in FIXED_GEAR_TYPES: + if gear_positions is not None and gear_type in fixed_gear_types: try: - drift_result = _detect_gear_drift(gear_positions) + drift_result = _detect_gear_drift( + gear_positions, threshold_nm=p['gear_drift_threshold_nm'], + ) except Exception as exc: logger.error('G-05 평가 실패 [mmsi=%s]: %s', mmsi, exc) drift_result = {'drift_detected': False, 'drift_nm': 0.0, 'tidal_corrected': False} if drift_result['drift_detected']: g_codes.append('G-05') - score += G05_SCORE + score += scores['G05_gear_drift'] evidence['G-05'] = drift_result if not judgment: judgment = 'GEAR_MISMATCH' @@ -341,7 +427,7 @@ def classify_gear_violations( # ── G-06: 쌍끌이 공조 조업 ──────────────────────────────────── if pair_result and pair_result.get('pair_detected'): g_codes.append('G-06') - score += G06_SCORE + score += scores['G06_pair_trawl'] evidence['G-06'] = { 'sync_duration_min': pair_result.get('sync_duration_min'), 'mean_separation_nm': pair_result.get('mean_separation_nm'), diff --git a/prediction/algorithms/pair_trawl.py b/prediction/algorithms/pair_trawl.py index d794fea..964cc3c 100644 --- a/prediction/algorithms/pair_trawl.py +++ b/prediction/algorithms/pair_trawl.py @@ -67,6 +67,42 @@ CANDIDATE_PROXIMITY_FACTOR = 2.0 # 후보 탐색 반경: PROXIMITY_NM × 2 CANDIDATE_SOG_MIN = 1.5 # 후보 속력 하한 (완화) CANDIDATE_SOG_MAX = 5.0 # 후보 속력 상한 (완화) + +# Phase 2 PoC #5 — pair_trawl_tier 카탈로그 등록용 params snapshot. +# 내부 헬퍼들이 모듈 레벨 상수를 직접 참조하므로 이번 단계는 카탈로그·관찰만. +# 런타임 override 는 후속 리팩토링 PR 에서 활성화. +PAIR_TRAWL_DEFAULT_PARAMS: dict = { + 'cycle_interval_min': CYCLE_INTERVAL_MIN, + 'strong': { + 'proximity_nm': PROXIMITY_NM, + 'sog_delta_max': SOG_DELTA_MAX, + 'cog_delta_max': COG_DELTA_MAX, + 'sog_min': SOG_MIN, + 'sog_max': SOG_MAX, + 'min_sync_cycles': MIN_SYNC_CYCLES, + 'simultaneous_gap_min': SIMULTANEOUS_GAP_MIN, + }, + 'probable': { + 'min_block_cycles': PROBABLE_MIN_BLOCK_CYCLES, + 'min_sync_ratio': PROBABLE_MIN_SYNC_RATIO, + 'proximity_nm': PROBABLE_PROXIMITY_NM, + 'sog_delta_max': PROBABLE_SOG_DELTA_MAX, + 'cog_delta_max': PROBABLE_COG_DELTA_MAX, + 'sog_min': PROBABLE_SOG_MIN, + 'sog_max': PROBABLE_SOG_MAX, + }, + 'suspect': { + 'min_block_cycles': SUSPECT_MIN_BLOCK_CYCLES, + 'min_sync_ratio': SUSPECT_MIN_SYNC_RATIO, + }, + 'candidate_scan': { + 'cell_size_deg': CELL_SIZE, + 'proximity_factor': CANDIDATE_PROXIMITY_FACTOR, + 'sog_min': CANDIDATE_SOG_MIN, + 'sog_max': CANDIDATE_SOG_MAX, + }, +} + # ────────────────────────────────────────────────────────────── # 내부 헬퍼 # ────────────────────────────────────────────────────────────── diff --git a/prediction/algorithms/risk.py b/prediction/algorithms/risk.py index 61cfd32..b7271d1 100644 --- a/prediction/algorithms/risk.py +++ b/prediction/algorithms/risk.py @@ -7,6 +7,59 @@ from algorithms.dark_vessel import detect_ais_gaps from algorithms.spoofing import detect_teleportation +# Phase 2 PoC #4 — risk_composite 카탈로그 등록용 params snapshot. +# 현 런타임은 모듈 레벨 상수/inline 숫자를 직접 사용하며, 운영자 UI 에서 +# 주요 가중치·임계를 조회·튜닝할 수 있도록 DB 에 노출한다. 런타임 override +# 는 후속 리팩토링 PR 에서 compute_lightweight_risk_score / compute_vessel_risk_score +# 에 params 인자 전파를 완성하면서 활성화된다. +RISK_COMPOSITE_DEFAULT_PARAMS: dict = { + 'tier_thresholds': {'critical': 70, 'high': 50, 'medium': 30}, + # 경량(파이프라인 미통과) 경로 — compute_lightweight_risk_score + 'lightweight_weights': { + 'territorial_sea': 40, + 'contiguous_zone': 15, + 'zone_unpermitted': 25, + 'eez_lt12nm': 15, + 'eez_lt24nm': 8, + 'dark_suspicion_multiplier': 0.3, + 'dark_gap_720_min': 25, + 'dark_gap_180_min': 20, + 'dark_gap_60_min': 15, + 'dark_gap_30_min': 8, + 'spoofing_gt07': 15, + 'spoofing_gt05': 8, + 'unpermitted_alone': 15, + 'unpermitted_with_suspicion': 8, + 'repeat_gte5': 10, + 'repeat_gte2': 5, + }, + # 파이프라인 통과(정밀) 경로 — compute_vessel_risk_score + 'pipeline_weights': { + 'territorial_sea': 40, + 'contiguous_zone': 10, + 'zone_unpermitted': 25, + 'territorial_fishing': 20, + 'fishing_segments_any': 5, + 'trawl_uturn': 10, + 'teleportation': 20, + 'speed_jumps_ge3': 10, + 'speed_jumps_ge1': 5, + 'critical_gaps_ge60': 15, + 'any_gaps': 5, + 'unpermitted': 20, + }, + 'dark_suspicion_fallback_gap_min': { + 'very_long_720': 720, + 'long_180': 180, + 'mid_60': 60, + 'short_30': 30, + }, + 'spoofing_thresholds': {'high_0.7': 0.7, 'medium_0.5': 0.5}, + 'eez_proximity_nm': {'inner_12': 12, 'outer_24': 24}, + 'repeat_thresholds': {'h24_high': 5, 'h24_low': 2}, +} + + def compute_lightweight_risk_score( zone_info: dict, sog: float, diff --git a/prediction/algorithms/transshipment.py b/prediction/algorithms/transshipment.py index 040ba72..fb4d5b3 100644 --- a/prediction/algorithms/transshipment.py +++ b/prediction/algorithms/transshipment.py @@ -48,6 +48,27 @@ _EXCLUDED_SHIP_TY = frozenset({ # shipTy 텍스트에 포함되면 CARRIER 로 승격 (부분일치, 대소문자 무시) _CARRIER_HINTS = ('cargo', 'tanker', 'supply', 'carrier', 'reefer') + +# Phase 2 PoC #3 — 카탈로그 등록용 파라미터 snapshot. +# 내부 헬퍼 함수들이 모듈 레벨 상수를 직접 쓰기 때문에 이번 단계에서는 +# 런타임 override 없이 **카탈로그·관찰만 등록**한다. +# 운영자가 UI 에서 현재 값을 확인 가능하도록 DB 에 노출되며, 실제 값 교체는 +# 후속 리팩토링 PR 에서 _is_proximity / _is_approach / _evict_expired 등 +# 헬퍼에 params 인자를 전파하면서 활성화된다. +TRANSSHIPMENT_DEFAULT_PARAMS: dict = { + 'sog_threshold_kn': SOG_THRESHOLD_KN, + 'proximity_deg': PROXIMITY_DEG, + 'approach_deg': APPROACH_DEG, + 'rendezvous_min': RENDEZVOUS_MIN, + 'pair_expiry_min': PAIR_EXPIRY_MIN, + 'gap_tolerance_cycles': GAP_TOLERANCE_CYCLES, + 'fishing_kinds': sorted(_FISHING_KINDS), + 'carrier_kinds': sorted(_CARRIER_KINDS), + 'excluded_ship_ty': sorted(_EXCLUDED_SHIP_TY), + 'carrier_hints': list(_CARRIER_HINTS), + 'min_score': 50, # detect_transshipment 의 `score >= 50만` 출력 필터 +} + # ────────────────────────────────────────────────────────────── # 감시영역 로드 # ────────────────────────────────────────────────────────────── diff --git a/prediction/models_core/registered/gear_violation_model.py b/prediction/models_core/registered/gear_violation_model.py new file mode 100644 index 0000000..bcd4e61 --- /dev/null +++ b/prediction/models_core/registered/gear_violation_model.py @@ -0,0 +1,71 @@ +"""gear_violation_g01_g06 — 어구 위반 G-01~G-06 종합 모델 (Phase 2 PoC #2). + +기존 `algorithms.gear_violation.classify_gear_violations` 을 얇게 감싸는 Adapter. +scheduler.py 가 이미 이 함수를 호출해 AnalysisResult.features 에 결과를 저장하므로, +본 모델은 **ctx.inputs 의 AnalysisResult 들에서 그 결과를 관찰·집계**하는 역할. + +입력: +- ctx.inputs 의 각 row (AnalysisResult asdict) 중 features.g_codes 가 비어있지 않은 선박. + +출력: +- outputs_per_input: (input_ref, {g_codes, judgment, score}) — 원시 결과 snapshot +- metrics: + · evaluated_count : G-code 탐지된 선박 수 + · g01_count ~ g06_count : 각 G-code 별 탐지 빈도 + · pair_trawl_count : G-06 탐지 + · closed_season_count : G-02 탐지 +""" +from __future__ import annotations + +from algorithms.gear_violation import GEAR_VIOLATION_DEFAULT_PARAMS +from models_core.base import BaseDetectionModel, ModelContext, ModelResult, make_input_ref + + +class GearViolationModel(BaseDetectionModel): + model_id = 'gear_violation_g01_g06' + depends_on: list[str] = [] + + def run(self, ctx: ModelContext) -> ModelResult: + outputs_per_input: list[tuple[dict, dict]] = [] + counts = {f'g0{i}_count': 0 for i in range(1, 7)} + evaluated = 0 + + for row in ctx.inputs or []: + if not row: + continue + features = row.get('features') or {} + if not isinstance(features, dict): + continue + g_codes = features.get('g_codes') or [] + if not g_codes: + continue + + evaluated += 1 + for code in g_codes: + key = code.replace('-', '').lower() + '_count' # 'G-01' → 'g01_count' + if key in counts: + counts[key] += 1 + + outputs_per_input.append(( + make_input_ref(row.get('mmsi'), row.get('analyzed_at')), + { + 'g_codes': list(g_codes), + 'gear_judgment': features.get('gear_judgment', ''), + 'gear_violation_score': int(features.get('gear_violation_score') or 0), + }, + )) + + metrics = {'evaluated_count': float(evaluated)} + metrics.update({k: float(v) for k, v in counts.items()}) + + return ModelResult( + model_id=self.model_id, + version_id=self.version_id, + version_str=self.version_str, + role=self.role, + outputs_per_input=outputs_per_input, + metrics=metrics, + ) + + +__all__ = ['GearViolationModel', 'GEAR_VIOLATION_DEFAULT_PARAMS'] diff --git a/prediction/models_core/registered/pair_trawl_model.py b/prediction/models_core/registered/pair_trawl_model.py new file mode 100644 index 0000000..368cf62 --- /dev/null +++ b/prediction/models_core/registered/pair_trawl_model.py @@ -0,0 +1,65 @@ +"""pair_trawl_tier — 쌍끌이 공조 tier 분류 관찰 어댑터 (Phase 2 PoC #5). + +기존 `algorithms.pair_trawl` 이 STRONG/PROBABLE/SUSPECT tier 로 판정한 결과를 +ctx.inputs (AnalysisResult) 에서 관찰 집계. 런타임 params override 는 후속 PR. + +metrics: + · evaluated_count : pair_trawl_detected=True 선박 수 + · tier_{strong/probable/suspect}_count +""" +from __future__ import annotations + +from algorithms.pair_trawl import PAIR_TRAWL_DEFAULT_PARAMS +from models_core.base import BaseDetectionModel, ModelContext, ModelResult, make_input_ref + + +class PairTrawlModel(BaseDetectionModel): + model_id = 'pair_trawl_tier' + depends_on: list[str] = [] + + def run(self, ctx: ModelContext) -> ModelResult: + outputs_per_input: list[tuple[dict, dict]] = [] + tiers = {'STRONG': 0, 'PROBABLE': 0, 'SUSPECT': 0} + evaluated = 0 + + for row in ctx.inputs or []: + if not row: + continue + features = row.get('features') or {} + if not isinstance(features, dict): + continue + if features.get('pair_trawl_detected') is not True: + continue + + evaluated += 1 + tier = features.get('pair_tier', '') + if tier in tiers: + tiers[tier] += 1 + + outputs_per_input.append(( + make_input_ref(row.get('mmsi'), row.get('analyzed_at')), + { + 'pair_tier': tier, + 'pair_type': features.get('pair_type'), + 'pair_mmsi': features.get('pair_mmsi'), + 'similarity': features.get('similarity'), + 'confidence': features.get('confidence'), + }, + )) + + return ModelResult( + model_id=self.model_id, + version_id=self.version_id, + version_str=self.version_str, + role=self.role, + outputs_per_input=outputs_per_input, + metrics={ + 'evaluated_count': float(evaluated), + 'tier_strong_count': float(tiers['STRONG']), + 'tier_probable_count': float(tiers['PROBABLE']), + 'tier_suspect_count': float(tiers['SUSPECT']), + }, + ) + + +__all__ = ['PairTrawlModel', 'PAIR_TRAWL_DEFAULT_PARAMS'] diff --git a/prediction/models_core/registered/risk_composite_model.py b/prediction/models_core/registered/risk_composite_model.py new file mode 100644 index 0000000..af7488f --- /dev/null +++ b/prediction/models_core/registered/risk_composite_model.py @@ -0,0 +1,66 @@ +"""risk_composite — 종합 위험도 관찰 어댑터 (Phase 2 PoC #4). + +현재 `algorithms.risk` 의 compute_*_risk_score 들이 inline 숫자로 점수를 계산하므로 +이 단계에서는 카탈로그 등록 + AnalysisResult 의 risk_score/risk_level 을 집계 관찰만. +런타임 params override 는 후속 리팩토링 PR 에서 활성화. + +metrics: + · evaluated_count : 전체 관찰 수 + · avg_risk_score + · tier_{critical/high/medium/low}_count +""" +from __future__ import annotations + +from algorithms.risk import RISK_COMPOSITE_DEFAULT_PARAMS +from models_core.base import BaseDetectionModel, ModelContext, ModelResult, make_input_ref + + +class RiskCompositeModel(BaseDetectionModel): + model_id = 'risk_composite' + depends_on: list[str] = [] + + def run(self, ctx: ModelContext) -> ModelResult: + outputs_per_input: list[tuple[dict, dict]] = [] + tiers = {'CRITICAL': 0, 'HIGH': 0, 'MEDIUM': 0, 'LOW': 0} + score_sum = 0.0 + evaluated = 0 + + for row in ctx.inputs or []: + if not row: + continue + score = row.get('risk_score') + level = row.get('risk_level', '') + if score is None: + continue + evaluated += 1 + score_sum += float(score) + if level in tiers: + tiers[level] += 1 + outputs_per_input.append(( + make_input_ref(row.get('mmsi'), row.get('analyzed_at')), + { + 'risk_score': int(score), + 'risk_level': level, + 'vessel_type': row.get('vessel_type'), + }, + )) + + avg = score_sum / evaluated if evaluated else 0.0 + return ModelResult( + model_id=self.model_id, + version_id=self.version_id, + version_str=self.version_str, + role=self.role, + outputs_per_input=outputs_per_input, + metrics={ + 'evaluated_count': float(evaluated), + 'avg_risk_score': round(avg, 2), + 'tier_critical_count': float(tiers['CRITICAL']), + 'tier_high_count': float(tiers['HIGH']), + 'tier_medium_count': float(tiers['MEDIUM']), + 'tier_low_count': float(tiers['LOW']), + }, + ) + + +__all__ = ['RiskCompositeModel', 'RISK_COMPOSITE_DEFAULT_PARAMS'] diff --git a/prediction/models_core/registered/transshipment_model.py b/prediction/models_core/registered/transshipment_model.py new file mode 100644 index 0000000..ce610d3 --- /dev/null +++ b/prediction/models_core/registered/transshipment_model.py @@ -0,0 +1,67 @@ +"""transshipment_5stage — 5단계 환적 탐지 관찰 어댑터 (Phase 2 PoC #3). + +현재 `algorithms.transshipment.detect_transshipment` 의 내부 헬퍼가 모듈 레벨 +상수를 직접 참조하므로 이번 단계에서는 **카탈로그 등록 + 관찰 수집** 만 담당. +런타임 params override 는 후속 리팩토링 PR 에서 헬퍼 시그니처 확장과 함께 활성화. + +입력: ctx.inputs (AnalysisResult asdict). `transship_suspect=True` 인 선박을 집계. +출력: +- outputs_per_input: (input_ref, {pair_mmsi, duration_min, tier, score, pair_type}) +- metrics: + · evaluated_count : transship_suspect True 선박 수 + · tier_critical_count / tier_high_count / tier_medium_count +""" +from __future__ import annotations + +from algorithms.transshipment import TRANSSHIPMENT_DEFAULT_PARAMS +from models_core.base import BaseDetectionModel, ModelContext, ModelResult, make_input_ref + + +class TransshipmentModel(BaseDetectionModel): + model_id = 'transshipment_5stage' + depends_on: list[str] = [] + + def run(self, ctx: ModelContext) -> ModelResult: + outputs_per_input: list[tuple[dict, dict]] = [] + tier_counts = {'CRITICAL': 0, 'HIGH': 0, 'MEDIUM': 0, 'LOW': 0} + evaluated = 0 + + for row in ctx.inputs or []: + if not row: + continue + if not row.get('transship_suspect'): + continue + evaluated += 1 + features = row.get('features') or {} + if not isinstance(features, dict): + features = {} + tier = features.get('transship_tier', '') + if tier in tier_counts: + tier_counts[tier] += 1 + outputs_per_input.append(( + make_input_ref(row.get('mmsi'), row.get('analyzed_at')), + { + 'pair_mmsi': row.get('transship_pair_mmsi'), + 'duration_min': row.get('transship_duration_min'), + 'tier': tier, + 'score': features.get('transship_score'), + }, + )) + + return ModelResult( + model_id=self.model_id, + version_id=self.version_id, + version_str=self.version_str, + role=self.role, + outputs_per_input=outputs_per_input, + metrics={ + 'evaluated_count': float(evaluated), + 'tier_critical_count': float(tier_counts['CRITICAL']), + 'tier_high_count': float(tier_counts['HIGH']), + 'tier_medium_count': float(tier_counts['MEDIUM']), + 'tier_low_count': float(tier_counts['LOW']), + }, + ) + + +__all__ = ['TransshipmentModel', 'TRANSSHIPMENT_DEFAULT_PARAMS'] diff --git a/prediction/models_core/seeds/README.md b/prediction/models_core/seeds/README.md index 58110ab..862b951 100644 --- a/prediction/models_core/seeds/README.md +++ b/prediction/models_core/seeds/README.md @@ -4,13 +4,19 @@ Phase 2 PoC 모델을 V034 `detection_models` + `detection_model_versions` 에 s ## 현재 seed 대상 -| 파일 | 모델 | 상태 | -|---|---|---| -| `v1_dark_suspicion.sql` | `dark_suspicion` (tier 3, DARK_VESSEL) | ✅ 완료 | -| (후속 PR) | `gear_violation_g01_g06` (tier 4) | ⏸ 대기 | -| (후속 PR) | `transshipment_5stage` (tier 4) | ⏸ 대기 | -| (후속 PR) | `risk_composite` (tier 3) | ⏸ 대기 | -| (후속 PR) | `pair_trawl_tier` (tier 4) | ⏸ 대기 | +| 파일 | 모델 | tier | 모드 | +|---|---|---|---| +| `v1_dark_suspicion.sql` | `dark_suspicion` (DARK_VESSEL) | 3 | **런타임 override 완성** | +| `v1_gear_violation.sql` | `gear_violation_g01_g06` (GEAR) | 4 | **런타임 override 완성** | +| `v1_transshipment.sql` | `transshipment_5stage` (TRANSSHIP) | 4 | 카탈로그 + 관찰 | +| `v1_risk_composite.sql` | `risk_composite` (META) | 3 | 카탈로그 + 관찰 | +| `v1_pair_trawl.sql` | `pair_trawl_tier` (GEAR) | 4 | 카탈로그 + 관찰 | +| `v1_phase2_all.sql` | 5 모델 일괄 | — | \i 로 위 5개 순차 실행 | + +### "런타임 override 완성" vs "카탈로그 + 관찰" + +- **런타임 override 완성** (`dark_suspicion`, `gear_violation_g01_g06`): 알고리즘 함수가 `params: dict | None = None` 인자를 받고, ACTIVE 버전의 JSONB 를 적용해 실제 가중치·임계값이 교체된다. 운영자가 version 을 ACTIVE 로 승격하면 **다음 사이클 결과가 바뀐다**. +- **카탈로그 + 관찰** (`transshipment_5stage`, `risk_composite`, `pair_trawl_tier`): 내부 헬퍼 함수들이 모듈 레벨 상수를 직접 참조하여 범위가 큰 리팩토링이 필요. 이번 단계에서는 **DEFAULT_PARAMS 를 DB 카탈로그에 노출 + Adapter 로 결과 관찰 수집**까지만. 런타임 실제 교체는 후속 리팩토링 PR 에서 헬퍼에 params 전파를 완성하면 활성화된다. ## 실행 방법 diff --git a/prediction/models_core/seeds/v1_gear_violation.sql b/prediction/models_core/seeds/v1_gear_violation.sql new file mode 100644 index 0000000..6a4d0c3 --- /dev/null +++ b/prediction/models_core/seeds/v1_gear_violation.sql @@ -0,0 +1,65 @@ +-- Phase 2 PoC #2 — gear_violation_g01_g06 모델 seed +-- +-- 트랜잭션 제어는 호출자가 담당. BEGIN/COMMIT 없음. 실행 방법은 seeds/README.md 참조. +-- +-- 결과: kcg.detection_models 에 'gear_violation_g01_g06' 1 행 INSERT +-- kcg.detection_model_versions 에 v1.0.0 status=DRAFT role=NULL 1 행 INSERT +-- +-- 동치성 보장: params JSONB 는 prediction/algorithms/gear_violation.py +-- GEAR_VIOLATION_DEFAULT_PARAMS 와 1:1 일치 (Python 상수가 SSOT). +-- +-- 롤백: +-- DELETE FROM kcg.detection_model_versions +-- WHERE model_id = 'gear_violation_g01_g06' AND version = '1.0.0'; +-- DELETE FROM kcg.detection_models WHERE model_id = 'gear_violation_g01_g06'; + +INSERT INTO kcg.detection_models ( + model_id, display_name, tier, category, + description, entry_module, entry_callable, is_enabled +) VALUES ( + 'gear_violation_g01_g06', + '어구 위반 G-01~G-06 종합 (DAR-03)', + 4, + 'GEAR', + 'DAR-03 규격 G-01 허가수역 외 조업 + G-02 금어기 + G-03 미등록 어구 + G-04 MMSI 조작 + G-05 어구 drift + G-06 쌍끌이 공조 판정 통합. 각 G-code 점수·허용 어구 매핑·signal cycling 임계는 params 에서 조정.', + 'models_core.registered.gear_violation_model', + 'GearViolationModel', + TRUE +) ON CONFLICT (model_id) DO NOTHING; + +INSERT INTO kcg.detection_model_versions ( + model_id, version, status, role, params, notes +) VALUES ( + 'gear_violation_g01_g06', + '1.0.0', + 'DRAFT', + NULL, + $json${ + "scores": { + "G01_zone_violation": 15, + "G02_closed_season": 18, + "G03_unregistered_gear": 12, + "G04_signal_cycling": 10, + "G05_gear_drift": 5, + "G06_pair_trawl": 20 + }, + "signal_cycling": {"gap_min": 30, "min_count": 2}, + "gear_drift_threshold_nm": 0.270, + "fixed_gear_types": ["FPO", "FYK", "GN", "GND", "GNS", "TRAP"], + "fishery_code_allowed_gear": { + "PT": ["PT", "PT-S", "TRAWL"], + "PT-S": ["PT", "PT-S", "TRAWL"], + "GN": ["GILLNET", "GN", "GND", "GNS"], + "PS": ["PS", "PURSE"], + "OT": ["OT", "TRAWL"], + "FC": [] + } + }$json$::jsonb, + 'Phase 2 PoC #2 seed. Python GEAR_VIOLATION_DEFAULT_PARAMS 와 1:1 일치.' +) ON CONFLICT (model_id, version) DO NOTHING; + +-- 확인 +SELECT model_id, is_enabled, + (SELECT count(*) FROM kcg.detection_model_versions v WHERE v.model_id = m.model_id) AS versions + FROM kcg.detection_models m + WHERE model_id = 'gear_violation_g01_g06'; diff --git a/prediction/models_core/seeds/v1_pair_trawl.sql b/prediction/models_core/seeds/v1_pair_trawl.sql new file mode 100644 index 0000000..6d7b598 --- /dev/null +++ b/prediction/models_core/seeds/v1_pair_trawl.sql @@ -0,0 +1,65 @@ +-- Phase 2 PoC #5 — pair_trawl_tier 모델 seed (카탈로그 + 관찰 전용) +-- +-- STRONG/PROBABLE/SUSPECT tier 임계 + candidate scan 파라미터 노출. +-- 런타임 override 는 후속 리팩토링 PR. + +INSERT INTO kcg.detection_models ( + model_id, display_name, tier, category, + description, entry_module, entry_callable, is_enabled +) VALUES ( + 'pair_trawl_tier', + '쌍끌이 공조 tier (STRONG / PROBABLE / SUSPECT)', + 4, + 'GEAR', + '두 선박의 근접·속력·방향 동조 기반 쌍끌이(G-06) 판정. STRONG(스펙 100%) / PROBABLE(1h+ 동조) / SUSPECT(30m+ 약한 동조) 3 tier. 현 버전은 params 카탈로그 등록만.', + 'models_core.registered.pair_trawl_model', + 'PairTrawlModel', + TRUE +) ON CONFLICT (model_id) DO NOTHING; + +INSERT INTO kcg.detection_model_versions ( + model_id, version, status, role, params, notes +) VALUES ( + 'pair_trawl_tier', + '1.0.0', + 'DRAFT', + NULL, + $json${ + "cycle_interval_min": 5, + "strong": { + "proximity_nm": 0.27, + "sog_delta_max": 0.5, + "cog_delta_max": 10.0, + "sog_min": 2.0, + "sog_max": 4.0, + "min_sync_cycles": 24, + "simultaneous_gap_min": 30 + }, + "probable": { + "min_block_cycles": 12, + "min_sync_ratio": 0.6, + "proximity_nm": 0.43, + "sog_delta_max": 1.0, + "cog_delta_max": 20.0, + "sog_min": 1.5, + "sog_max": 5.0 + }, + "suspect": { + "min_block_cycles": 6, + "min_sync_ratio": 0.3 + }, + "candidate_scan": { + "cell_size_deg": 0.01, + "proximity_factor": 2.0, + "sog_min": 1.5, + "sog_max": 5.0 + } + }$json$::jsonb, + 'Phase 2 PoC #5 seed. Python PAIR_TRAWL_DEFAULT_PARAMS 와 1:1 일치.' +) ON CONFLICT (model_id, version) DO NOTHING; + +-- 확인 +SELECT model_id, is_enabled, + (SELECT count(*) FROM kcg.detection_model_versions v WHERE v.model_id = m.model_id) AS versions + FROM kcg.detection_models m + WHERE model_id = 'pair_trawl_tier'; diff --git a/prediction/models_core/seeds/v1_phase2_all.sql b/prediction/models_core/seeds/v1_phase2_all.sql new file mode 100644 index 0000000..047db18 --- /dev/null +++ b/prediction/models_core/seeds/v1_phase2_all.sql @@ -0,0 +1,47 @@ +-- Phase 2 PoC 5 모델 일괄 seed +-- +-- 한 번의 트랜잭션으로 5 모델 × 각 1 버전(DRAFT) 을 카탈로그에 등록. +-- 호출자가 BEGIN/COMMIT 을 제공하거나 `psql -1` 래핑을 사용해야 한다. +-- +-- 실행: +-- psql -v ON_ERROR_STOP=1 -1 \ +-- -f prediction/models_core/seeds/v1_phase2_all.sql +-- +-- dry-run: +-- psql -v ON_ERROR_STOP=1 <<'SQL' +-- BEGIN; +-- \i prediction/models_core/seeds/v1_phase2_all.sql +-- SELECT model_id, tier, (SELECT count(*) FROM kcg.detection_model_versions v +-- WHERE v.model_id=m.model_id) vers +-- FROM kcg.detection_models m ORDER BY tier, model_id; +-- ROLLBACK; +-- SQL +-- +-- 개별 롤백: 각 v1_.sql 하단 주석 참조. +-- +-- 전체 롤백: +-- DELETE FROM kcg.detection_model_versions WHERE model_id IN ( +-- 'dark_suspicion', 'gear_violation_g01_g06', 'transshipment_5stage', +-- 'risk_composite', 'pair_trawl_tier' +-- ); +-- DELETE FROM kcg.detection_models WHERE model_id IN ( +-- 'dark_suspicion', 'gear_violation_g01_g06', 'transshipment_5stage', +-- 'risk_composite', 'pair_trawl_tier' +-- ); + +\i prediction/models_core/seeds/v1_dark_suspicion.sql +\i prediction/models_core/seeds/v1_gear_violation.sql +\i prediction/models_core/seeds/v1_transshipment.sql +\i prediction/models_core/seeds/v1_risk_composite.sql +\i prediction/models_core/seeds/v1_pair_trawl.sql + +-- 최종 확인 +SELECT m.model_id, m.tier, m.category, m.is_enabled, + (SELECT count(*) FROM kcg.detection_model_versions v + WHERE v.model_id = m.model_id) AS versions + FROM kcg.detection_models m + WHERE m.model_id IN ( + 'dark_suspicion', 'gear_violation_g01_g06', 'transshipment_5stage', + 'risk_composite', 'pair_trawl_tier' + ) + ORDER BY m.tier, m.model_id; diff --git a/prediction/models_core/seeds/v1_risk_composite.sql b/prediction/models_core/seeds/v1_risk_composite.sql new file mode 100644 index 0000000..a2e515c --- /dev/null +++ b/prediction/models_core/seeds/v1_risk_composite.sql @@ -0,0 +1,79 @@ +-- Phase 2 PoC #4 — risk_composite 모델 seed (카탈로그 등록 + 관찰 전용) +-- +-- compute_lightweight_risk_score / compute_vessel_risk_score 가 inline 숫자를 +-- 직접 쓰고 있어 이번 버전은 params 카탈로그·관찰만 등록. 런타임 override 는 +-- 후속 리팩토링 PR 에서 활성화. + +INSERT INTO kcg.detection_models ( + model_id, display_name, tier, category, + description, entry_module, entry_callable, is_enabled +) VALUES ( + 'risk_composite', + '종합 위험도 (경량 + 파이프라인)', + 3, + 'META', + '파이프라인 미통과(경량) + 통과(정밀) 경로의 위험도 점수(0~100) + tier(CRITICAL/HIGH/MEDIUM/LOW) 산출. 수역·다크·스푸핑·허가·반복 축으로 가산. 현 버전은 params 카탈로그 등록만.', + 'models_core.registered.risk_composite_model', + 'RiskCompositeModel', + TRUE +) ON CONFLICT (model_id) DO NOTHING; + +INSERT INTO kcg.detection_model_versions ( + model_id, version, status, role, params, notes +) VALUES ( + 'risk_composite', + '1.0.0', + 'DRAFT', + NULL, + $json${ + "tier_thresholds": {"critical": 70, "high": 50, "medium": 30}, + "lightweight_weights": { + "territorial_sea": 40, + "contiguous_zone": 15, + "zone_unpermitted": 25, + "eez_lt12nm": 15, + "eez_lt24nm": 8, + "dark_suspicion_multiplier": 0.3, + "dark_gap_720_min": 25, + "dark_gap_180_min": 20, + "dark_gap_60_min": 15, + "dark_gap_30_min": 8, + "spoofing_gt07": 15, + "spoofing_gt05": 8, + "unpermitted_alone": 15, + "unpermitted_with_suspicion": 8, + "repeat_gte5": 10, + "repeat_gte2": 5 + }, + "pipeline_weights": { + "territorial_sea": 40, + "contiguous_zone": 10, + "zone_unpermitted": 25, + "territorial_fishing": 20, + "fishing_segments_any": 5, + "trawl_uturn": 10, + "teleportation": 20, + "speed_jumps_ge3": 10, + "speed_jumps_ge1": 5, + "critical_gaps_ge60": 15, + "any_gaps": 5, + "unpermitted": 20 + }, + "dark_suspicion_fallback_gap_min": { + "very_long_720": 720, + "long_180": 180, + "mid_60": 60, + "short_30": 30 + }, + "spoofing_thresholds": {"high_0.7": 0.7, "medium_0.5": 0.5}, + "eez_proximity_nm": {"inner_12": 12, "outer_24": 24}, + "repeat_thresholds": {"h24_high": 5, "h24_low": 2} + }$json$::jsonb, + 'Phase 2 PoC #4 seed. Python RISK_COMPOSITE_DEFAULT_PARAMS 와 1:1 일치.' +) ON CONFLICT (model_id, version) DO NOTHING; + +-- 확인 +SELECT model_id, is_enabled, + (SELECT count(*) FROM kcg.detection_model_versions v WHERE v.model_id = m.model_id) AS versions + FROM kcg.detection_models m + WHERE model_id = 'risk_composite'; diff --git a/prediction/models_core/seeds/v1_transshipment.sql b/prediction/models_core/seeds/v1_transshipment.sql new file mode 100644 index 0000000..2f6a269 --- /dev/null +++ b/prediction/models_core/seeds/v1_transshipment.sql @@ -0,0 +1,51 @@ +-- Phase 2 PoC #3 — transshipment_5stage 모델 seed (카탈로그 등록 + 관찰 전용) +-- +-- 본 버전은 `params` 를 DB 에 노출하지만 런타임 override 는 아직 반영하지 않는다. +-- 내부 헬퍼 함수들(_is_proximity / _is_approach / _evict_expired)이 모듈 레벨 상수를 +-- 직접 참조하므로, 후속 리팩토링 PR 에서 params 전파를 완성하면 런타임 값 교체가 +-- 가능해진다. 카탈로그·관찰만으로 Phase 2 PoC 의 "모델 단위 분리" 가치는 확보. +-- +-- 실행 방법 + 롤백은 seeds/README.md 참조. + +INSERT INTO kcg.detection_models ( + model_id, display_name, tier, category, + description, entry_module, entry_callable, is_enabled +) VALUES ( + 'transshipment_5stage', + '환적 의심 5단계 필터 (이종 쌍 → 감시영역 → APPROACH → RENDEZVOUS → 점수)', + 4, + 'TRANSSHIP', + '어선 ↔ 운반선 이종 쌍을 감시영역 내에서만 추적해 APPROACH → RENDEZVOUS → DEPARTURE 패턴을 검증하고 점수 산출. 현 버전은 params 카탈로그 등록만, 런타임 override 는 후속 PR 에서.', + 'models_core.registered.transshipment_model', + 'TransshipmentModel', + TRUE +) ON CONFLICT (model_id) DO NOTHING; + +INSERT INTO kcg.detection_model_versions ( + model_id, version, status, role, params, notes +) VALUES ( + 'transshipment_5stage', + '1.0.0', + 'DRAFT', + NULL, + $json${ + "sog_threshold_kn": 2.0, + "proximity_deg": 0.002, + "approach_deg": 0.01, + "rendezvous_min": 90, + "pair_expiry_min": 240, + "gap_tolerance_cycles": 3, + "fishing_kinds": ["000020"], + "carrier_kinds": ["000023", "000024"], + "excluded_ship_ty": ["AtoN", "Anti Pollution", "Law Enforcement", "Medical Transport", "Passenger", "Pilot Boat", "Search And Rescue", "Tug"], + "carrier_hints": ["cargo", "tanker", "supply", "carrier", "reefer"], + "min_score": 50 + }$json$::jsonb, + 'Phase 2 PoC #3 seed. Python TRANSSHIPMENT_DEFAULT_PARAMS 와 1:1 일치. 현 버전은 카탈로그만, 런타임 override 는 후속 PR.' +) ON CONFLICT (model_id, version) DO NOTHING; + +-- 확인 +SELECT model_id, is_enabled, + (SELECT count(*) FROM kcg.detection_model_versions v WHERE v.model_id = m.model_id) AS versions + FROM kcg.detection_models m + WHERE model_id = 'transshipment_5stage'; diff --git a/prediction/tests/test_gear_violation_params.py b/prediction/tests/test_gear_violation_params.py new file mode 100644 index 0000000..91a7b34 --- /dev/null +++ b/prediction/tests/test_gear_violation_params.py @@ -0,0 +1,125 @@ +"""Phase 2 PoC #2 — gear_violation_g01_g06 params 외부화 동치성 테스트. + +pandas 미설치 환경을 우회하기 위해 dark_suspicion 테스트와 동일한 stub 패턴 사용. +""" +from __future__ import annotations + +import importlib +import json +import os +import sys +import types +import unittest + +# pandas stub (annotation 용) +if 'pandas' not in sys.modules: + pd_stub = types.ModuleType('pandas') + pd_stub.DataFrame = type('DataFrame', (), {}) + pd_stub.Timestamp = type('Timestamp', (), {}) + sys.modules['pandas'] = pd_stub + +if 'pydantic_settings' not in sys.modules: + stub = types.ModuleType('pydantic_settings') + + class _S: + def __init__(self, **kw): + for name, value in self.__class__.__dict__.items(): + if name.isupper(): + setattr(self, name, kw.get(name, value)) + + stub.BaseSettings = _S + sys.modules['pydantic_settings'] = stub + +if 'algorithms' not in sys.modules: + pkg = types.ModuleType('algorithms') + pkg.__path__ = [os.path.join(os.path.dirname(__file__), '..', 'algorithms')] + sys.modules['algorithms'] = pkg + +gv = importlib.import_module('algorithms.gear_violation') + + +class GearViolationParamsTest(unittest.TestCase): + + def test_default_params_shape(self): + p = gv.GEAR_VIOLATION_DEFAULT_PARAMS + self.assertIn('scores', p) + self.assertIn('signal_cycling', p) + self.assertIn('gear_drift_threshold_nm', p) + self.assertIn('fixed_gear_types', p) + self.assertIn('fishery_code_allowed_gear', p) + # 6 G-codes 점수 키 전부 있는지 + for k in ['G01_zone_violation', 'G02_closed_season', 'G03_unregistered_gear', + 'G04_signal_cycling', 'G05_gear_drift', 'G06_pair_trawl']: + self.assertIn(k, p['scores']) + + def test_default_values_match_module_constants(self): + """DEFAULT_PARAMS 는 모듈 레벨 상수와 완전히 동일해야 한다 (SSOT 이중성 방지).""" + p = gv.GEAR_VIOLATION_DEFAULT_PARAMS + self.assertEqual(p['scores']['G01_zone_violation'], gv.G01_SCORE) + self.assertEqual(p['scores']['G02_closed_season'], gv.G02_SCORE) + self.assertEqual(p['scores']['G03_unregistered_gear'], gv.G03_SCORE) + self.assertEqual(p['scores']['G04_signal_cycling'], gv.G04_SCORE) + self.assertEqual(p['scores']['G05_gear_drift'], gv.G05_SCORE) + self.assertEqual(p['scores']['G06_pair_trawl'], gv.G06_SCORE) + self.assertEqual(p['signal_cycling']['gap_min'], gv.SIGNAL_CYCLING_GAP_MIN) + self.assertEqual(p['signal_cycling']['min_count'], gv.SIGNAL_CYCLING_MIN_COUNT) + self.assertAlmostEqual(p['gear_drift_threshold_nm'], gv.GEAR_DRIFT_THRESHOLD_NM) + self.assertEqual(set(p['fixed_gear_types']), gv.FIXED_GEAR_TYPES) + # fishery_code_allowed_gear: list ↔ set 변환 후 비교 + for key, allowed in gv.FISHERY_CODE_ALLOWED_GEAR.items(): + self.assertEqual(set(p['fishery_code_allowed_gear'][key]), allowed) + + def test_merge_none_returns_default_reference(self): + self.assertIs(gv._merge_default_gv_params(None), gv.GEAR_VIOLATION_DEFAULT_PARAMS) + + def test_merge_override_replaces_only_given_keys(self): + override = {'scores': {'G06_pair_trawl': 99}} + merged = gv._merge_default_gv_params(override) + self.assertEqual(merged['scores']['G06_pair_trawl'], 99) + # 다른 score 는 DEFAULT 유지 + self.assertEqual( + merged['scores']['G01_zone_violation'], + gv.GEAR_VIOLATION_DEFAULT_PARAMS['scores']['G01_zone_violation'], + ) + # fixed_gear_types 같은 top-level 키도 DEFAULT 유지 + self.assertEqual( + merged['fixed_gear_types'], + gv.GEAR_VIOLATION_DEFAULT_PARAMS['fixed_gear_types'], + ) + # DEFAULT 는 변조되지 않음 + self.assertEqual( + gv.GEAR_VIOLATION_DEFAULT_PARAMS['scores']['G06_pair_trawl'], 20, + ) + + def test_seed_sql_values_match_python_default(self): + """seed SQL JSONB ↔ Python DEFAULT 1:1 정적 검증.""" + seed_path = os.path.join( + os.path.dirname(__file__), '..', + 'models_core', 'seeds', 'v1_gear_violation.sql', + ) + with open(seed_path, 'r', encoding='utf-8') as f: + sql = f.read() + + start = sql.index('$json$') + len('$json$') + end = sql.index('$json$', start) + raw = sql[start:end].strip() + params = json.loads(raw) + + expected = gv.GEAR_VIOLATION_DEFAULT_PARAMS + self.assertEqual(params['scores'], expected['scores']) + self.assertEqual(params['signal_cycling'], expected['signal_cycling']) + self.assertAlmostEqual( + params['gear_drift_threshold_nm'], expected['gear_drift_threshold_nm'] + ) + # list 는 순서 무관하게 set 비교 (DB 에 저장 시 어떤 순서든 상관 없음) + self.assertEqual(set(params['fixed_gear_types']), + set(expected['fixed_gear_types'])) + for code, allowed in expected['fishery_code_allowed_gear'].items(): + self.assertEqual( + set(params['fishery_code_allowed_gear'][code]), set(allowed), + f'fishery_code_allowed_gear[{code}] mismatch', + ) + + +if __name__ == '__main__': + unittest.main() diff --git a/prediction/tests/test_pair_trawl_params.py b/prediction/tests/test_pair_trawl_params.py new file mode 100644 index 0000000..c19100c --- /dev/null +++ b/prediction/tests/test_pair_trawl_params.py @@ -0,0 +1,66 @@ +"""Phase 2 PoC #5 — pair_trawl_tier DEFAULT_PARAMS ↔ seed SQL 정적 일치.""" +from __future__ import annotations + +import importlib +import json +import os +import sys +import types +import unittest + +if 'pandas' not in sys.modules: + pd_stub = types.ModuleType('pandas') + pd_stub.DataFrame = type('DataFrame', (), {}) + pd_stub.Timestamp = type('Timestamp', (), {}) + sys.modules['pandas'] = pd_stub + +if 'pydantic_settings' not in sys.modules: + stub = types.ModuleType('pydantic_settings') + + class _S: + def __init__(self, **kw): + for name, value in self.__class__.__dict__.items(): + if name.isupper(): + setattr(self, name, kw.get(name, value)) + + stub.BaseSettings = _S + sys.modules['pydantic_settings'] = stub + +if 'algorithms' not in sys.modules: + pkg = types.ModuleType('algorithms') + pkg.__path__ = [os.path.join(os.path.dirname(__file__), '..', 'algorithms')] + sys.modules['algorithms'] = pkg + + +class PairTrawlParamsTest(unittest.TestCase): + + def test_seed_matches_default(self): + pt = importlib.import_module('algorithms.pair_trawl') + seed_path = os.path.join( + os.path.dirname(__file__), '..', + 'models_core', 'seeds', 'v1_pair_trawl.sql', + ) + with open(seed_path, 'r', encoding='utf-8') as f: + sql = f.read() + start = sql.index('$json$') + len('$json$') + end = sql.index('$json$', start) + params = json.loads(sql[start:end].strip()) + self.assertEqual(params, pt.PAIR_TRAWL_DEFAULT_PARAMS) + + def test_default_values_match_module_constants(self): + pt = importlib.import_module('algorithms.pair_trawl') + d = pt.PAIR_TRAWL_DEFAULT_PARAMS + self.assertEqual(d['strong']['proximity_nm'], pt.PROXIMITY_NM) + self.assertEqual(d['strong']['sog_delta_max'], pt.SOG_DELTA_MAX) + self.assertEqual(d['strong']['cog_delta_max'], pt.COG_DELTA_MAX) + self.assertEqual(d['strong']['min_sync_cycles'], pt.MIN_SYNC_CYCLES) + self.assertEqual(d['strong']['simultaneous_gap_min'], pt.SIMULTANEOUS_GAP_MIN) + self.assertEqual(d['probable']['min_block_cycles'], pt.PROBABLE_MIN_BLOCK_CYCLES) + self.assertEqual(d['probable']['min_sync_ratio'], pt.PROBABLE_MIN_SYNC_RATIO) + self.assertEqual(d['suspect']['min_block_cycles'], pt.SUSPECT_MIN_BLOCK_CYCLES) + self.assertEqual(d['suspect']['min_sync_ratio'], pt.SUSPECT_MIN_SYNC_RATIO) + self.assertEqual(d['candidate_scan']['cell_size_deg'], pt.CELL_SIZE) + + +if __name__ == '__main__': + unittest.main() diff --git a/prediction/tests/test_risk_composite_params.py b/prediction/tests/test_risk_composite_params.py new file mode 100644 index 0000000..32e2683 --- /dev/null +++ b/prediction/tests/test_risk_composite_params.py @@ -0,0 +1,75 @@ +"""Phase 2 PoC #4 — risk_composite DEFAULT_PARAMS ↔ seed SQL 정적 일치 테스트.""" +from __future__ import annotations + +import importlib +import json +import os +import sys +import types +import unittest + +if 'pandas' not in sys.modules: + pd_stub = types.ModuleType('pandas') + pd_stub.DataFrame = type('DataFrame', (), {}) + pd_stub.Timestamp = type('Timestamp', (), {}) + sys.modules['pandas'] = pd_stub + +if 'pydantic_settings' not in sys.modules: + stub = types.ModuleType('pydantic_settings') + + class _S: + def __init__(self, **kw): + for name, value in self.__class__.__dict__.items(): + if name.isupper(): + setattr(self, name, kw.get(name, value)) + + stub.BaseSettings = _S + sys.modules['pydantic_settings'] = stub + +if 'algorithms' not in sys.modules: + pkg = types.ModuleType('algorithms') + pkg.__path__ = [os.path.join(os.path.dirname(__file__), '..', 'algorithms')] + sys.modules['algorithms'] = pkg + +# risk.py 는 algorithms.location/fishing_pattern/dark_vessel/spoofing 을 top-level +# import 한다. dark_vessel 만 실제 모듈 그대로 두고 나머지는 필요한 심볼만 stub. +if 'algorithms.location' in sys.modules: + if not hasattr(sys.modules['algorithms.location'], 'classify_zone'): + sys.modules['algorithms.location'].classify_zone = lambda *a, **k: {} +else: + loc = types.ModuleType('algorithms.location') + loc.haversine_nm = lambda a, b, c, d: 0.0 + loc.classify_zone = lambda *a, **k: {} + sys.modules['algorithms.location'] = loc + +for mod_name, attrs in [ + ('algorithms.fishing_pattern', ['detect_fishing_segments', 'detect_trawl_uturn']), + ('algorithms.spoofing', ['detect_teleportation', 'count_speed_jumps']), +]: + if mod_name not in sys.modules: + m = types.ModuleType(mod_name) + sys.modules[mod_name] = m + m = sys.modules[mod_name] + for a in attrs: + if not hasattr(m, a): + setattr(m, a, lambda *_a, **_kw: []) + + +class RiskCompositeParamsTest(unittest.TestCase): + + def test_seed_matches_default(self): + risk = importlib.import_module('algorithms.risk') + seed_path = os.path.join( + os.path.dirname(__file__), '..', + 'models_core', 'seeds', 'v1_risk_composite.sql', + ) + with open(seed_path, 'r', encoding='utf-8') as f: + sql = f.read() + start = sql.index('$json$') + len('$json$') + end = sql.index('$json$', start) + params = json.loads(sql[start:end].strip()) + self.assertEqual(params, risk.RISK_COMPOSITE_DEFAULT_PARAMS) + + +if __name__ == '__main__': + unittest.main() diff --git a/prediction/tests/test_transshipment_params.py b/prediction/tests/test_transshipment_params.py new file mode 100644 index 0000000..c686a16 --- /dev/null +++ b/prediction/tests/test_transshipment_params.py @@ -0,0 +1,88 @@ +"""Phase 2 PoC #3 — transshipment_5stage params 카탈로그 동치성 테스트. + +런타임 override 는 후속 PR 에서 활성화되므로, 이 테스트는 **DEFAULT_PARAMS +↔ 모듈 상수 ↔ seed SQL JSONB** 3 자 일치만 검증한다. +""" +from __future__ import annotations + +import importlib +import json +import os +import sys +import types +import unittest + +# pandas/pydantic_settings stub (다른 phase 2 테스트와 동일 관용) +if 'pandas' not in sys.modules: + pd_stub = types.ModuleType('pandas') + pd_stub.DataFrame = type('DataFrame', (), {}) + pd_stub.Timestamp = type('Timestamp', (), {}) + sys.modules['pandas'] = pd_stub + +if 'pydantic_settings' not in sys.modules: + stub = types.ModuleType('pydantic_settings') + + class _S: + def __init__(self, **kw): + for name, value in self.__class__.__dict__.items(): + if name.isupper(): + setattr(self, name, kw.get(name, value)) + + stub.BaseSettings = _S + sys.modules['pydantic_settings'] = stub + +if 'algorithms' not in sys.modules: + pkg = types.ModuleType('algorithms') + pkg.__path__ = [os.path.join(os.path.dirname(__file__), '..', 'algorithms')] + sys.modules['algorithms'] = pkg + +# fleet_tracker 의 GEAR_PATTERN 을 transshipment.py 상단에서 import 하므로 stub +if 'fleet_tracker' not in sys.modules: + ft_stub = types.ModuleType('fleet_tracker') + import re as _re + ft_stub.GEAR_PATTERN = _re.compile(r'^xxx$') + sys.modules['fleet_tracker'] = ft_stub + +ts = importlib.import_module('algorithms.transshipment') + + +class TransshipmentParamsTest(unittest.TestCase): + + def test_default_values_match_module_constants(self): + p = ts.TRANSSHIPMENT_DEFAULT_PARAMS + self.assertEqual(p['sog_threshold_kn'], ts.SOG_THRESHOLD_KN) + self.assertEqual(p['proximity_deg'], ts.PROXIMITY_DEG) + self.assertEqual(p['approach_deg'], ts.APPROACH_DEG) + self.assertEqual(p['rendezvous_min'], ts.RENDEZVOUS_MIN) + self.assertEqual(p['pair_expiry_min'], ts.PAIR_EXPIRY_MIN) + self.assertEqual(p['gap_tolerance_cycles'], ts.GAP_TOLERANCE_CYCLES) + self.assertEqual(set(p['fishing_kinds']), set(ts._FISHING_KINDS)) + self.assertEqual(set(p['carrier_kinds']), set(ts._CARRIER_KINDS)) + self.assertEqual(set(p['excluded_ship_ty']), set(ts._EXCLUDED_SHIP_TY)) + self.assertEqual(list(p['carrier_hints']), list(ts._CARRIER_HINTS)) + + def test_seed_sql_values_match_python_default(self): + seed_path = os.path.join( + os.path.dirname(__file__), '..', + 'models_core', 'seeds', 'v1_transshipment.sql', + ) + with open(seed_path, 'r', encoding='utf-8') as f: + sql = f.read() + + start = sql.index('$json$') + len('$json$') + end = sql.index('$json$', start) + raw = sql[start:end].strip() + params = json.loads(raw) + + expected = ts.TRANSSHIPMENT_DEFAULT_PARAMS + for scalar_key in ['sog_threshold_kn', 'proximity_deg', 'approach_deg', + 'rendezvous_min', 'pair_expiry_min', 'gap_tolerance_cycles', + 'min_score']: + self.assertEqual(params[scalar_key], expected[scalar_key], scalar_key) + for list_key in ['fishing_kinds', 'carrier_kinds', 'excluded_ship_ty', + 'carrier_hints']: + self.assertEqual(set(params[list_key]), set(expected[list_key]), list_key) + + +if __name__ == '__main__': + unittest.main()