1. The problem
Satellites in Low Earth Orbit at 1000 km altitude are resource-constrained machines that operate entirely on their own. For most of each orbit there is no ground contact, no engineer available, and no way to push a fix. Every decision about what to compute, when, and where must be made onboard — autonomously, continuously, under constraints that are constantly shifting.
The central constraint is power. At 1000 km, the satellite orbits Earth roughly every 105 minutes. For part of each orbit it is in Earth's shadow — an eclipse pass — during which the solar panel generates zero power and the satellite runs entirely on its battery. If too many compute tasks run during eclipse, the battery drops below the 25% safety floor, risking a system shutdown. If the scheduler is too conservative, mission data never gets processed.
"Power is only part of it. The scheduler must simultaneously track CPU load, memory, onboard temperature, task priority, time until the next phase transition, and the battery and CPU state of peer satellites in the constellation. A decision that is correct for one combination of conditions can be completely wrong for another."
Traditional schedulers handle this with hand-written rules defined before launch. The problem is that the space of possible constraint combinations grows faster than any rule set can enumerate. Rules cannot explain their reasoning. They cannot adapt after launch without a full OTA firmware push from the ground. And they fail silently on scenarios their authors never anticipated.
This project was built to answer one question: can a fine-tuned small language model, running entirely onboard, make smarter scheduling decisions than a rule engine — and tell you why?
2. The solution
The AI Workload Scheduler is an autonomous decision system running inside each satellite node. Every ~12 seconds it reads live sensor data, reasons over the current satellite state and constellation context, and outputs one of three decisions for the incoming task:
Execute on this satellite now. Battery, CPU, and thermal are all within safe margins for the task's full duration.
Forward to a peer satellite via ISL. Local resources are constrained or eclipse risk makes execution unsafe here.
Hold until conditions improve. The State Forecaster calculates the earliest viable execution window in the next 60 minutes.
What separates this from a rule engine is that every decision comes from a fine-tuned language model. The model receives a structured JSON prompt containing the full telemetry context — battery charge, CPU and memory utilisation, eclipse state, task requirements, and peer satellite states — and outputs a decision, a confidence score, and a plain-language reason. Every decision is traceable.
The system is designed to be safe. If the model produces bad output it retries up to three times. If JSON parsing still fails, a regex scan attempts to extract a valid action from the raw text. Only if output is completely empty does a deterministic rule-based oracle take over — and only until the model recovers. Model weights are SHA-256 verified on every boot to detect bit-flip corruption from cosmic rays.
3. System architecture
The scheduler is composed of six independent subsystems that together form a resilient, self-supervising decision loop.
The Core Agent is the decision orchestrator — reading from the Telemetry Collector every 2 s, querying the State Forecaster for a 60-min battery projection, and dispatching escalations via ISL Transport to the Constellation Coordinator (leader-elected by battery, CPU, and uptime score). The Fallback Controller monitors inference health: after three consecutive bad outputs or a failed SHA-256 weight checksum, it activates Safe Mode and hands control to the rule oracle. The Ground Station Handler is the sole trusted channel for OTA model and firmware updates.
4. How a decision gets made
Every ~12 seconds the following sequence runs inside each satellite node:
- Telemetry read — the Core Agent reads all live sensor values and packages them for inference.
- Forecast — the State Forecaster simulates battery trajectory for the next 60 minutes and identifies the earliest safe window if conditions are currently insufficient.
- Task selection — the next task is pulled from a rotating pool with varying resource profiles, from lightweight telemetry to heavy ML inference.
- AI inference — the Qwen model receives a JSON prompt with telemetry, task requirements, and peer satellite states. It outputs action, confidence score, and reason string.
- Enrichment — the system computes available vs required resources (CPU, memory, power), assigns a risk level (LOW / MEDIUM / HIGH), and builds a 4-step analysis pipeline for the dashboard.
- Dispatch — the enriched decision is logged, acted upon, and streamed to the monitoring dashboard via Server-Sent Events.
5. The AI model
Why Qwen 2.5-1.5B?
The choice of base model was driven by three hard constraints: it must fit on satellite-grade hardware after quantisation, it must handle structured reasoning over telemetry JSON reliably, and it must be fine-tunable with limited compute. Qwen 2.5-1.5B satisfies all three. Its instruction-following capability is strong enough to parse nested JSON context and produce valid structured output consistently — a requirement that many smaller models fail at under distribution shift.
Why AWQ INT4 quantisation?
Satellites have limited onboard memory. AWQ (Activation-aware Weight Quantisation) compresses model weights from 16-bit to 4-bit with minimal accuracy loss by accounting for the statistical distribution of activations before quantising each weight layer. The result is a reduction from approximately 3 GB to ~1.5 GB — a requirement for deployment on embedded space-grade hardware where DRAM is scarce and power draw per GB is a real constraint.
Why LoRA fine-tuning?
Rather than retraining the full model — which would require significant GPU time and produce weights too large for incremental OTA delivery — LoRA attaches small trainable adapter layers (rank-16) that teach the model the satellite scheduling task. The base weights stay completely frozen. The resulting adapter is small enough to deliver as an OTA update post-launch, enabling the scheduler's decision policy to be updated in orbit without replacing the entire model binary.
Prompt structure
The model receives three structured inputs in a single JSON prompt: current telemetry (battery SoC, CPU utilisation, memory, temperature, eclipse state, charging rate), the incoming task's resource requirements (CPU demand, memory demand, power draw, estimated duration), and the battery and CPU state of up to two peer satellites in the mesh. The model is expected to return a single JSON object with action, confidence, and reason fields on every inference call.
6. Building the training dataset with PASEOS
PASEOS — PAseos Simulates the Environment for Operating multiple Spacecraft
PASEOS is an open-source Python simulation framework developed by AI Sweden and the European Space Agency under the Φ-lab@Sweden programme. It models the complete physical environment of a satellite constellation: orbital mechanics via SGP4/SDP4 propagators, battery charge and discharge cycles, solar panel behaviour, eclipse detection, thermal effects, radiation modelling, and multi-node inter-satellite communication.
PASEOS was central to our training pipeline. Because we have no real satellite in orbit, we needed a way to generate physically grounded scheduling scenarios at scale. PASEOS gave us a simulation environment that respects the actual orbital physics of a 1000 km LEO satellite — correct eclipse durations, accurate battery discharge curves under compute load, and realistic ISL topologies.
What we simulated
Using PASEOS, we generated a synthetic dataset of satellite scheduling scenarios covering a wide range of constraint combinations — varying battery levels from 15% to 95%, eclipse and sunlight states, CPU and memory loads from idle to near-saturation, task types spanning the full operational spectrum (ML_INFERENCE, TELEMETRY, ISL_RELAY, DOWNLINK), and peer satellite states with various combinations of battery and CPU headroom.
Each scenario was formatted as a ChatML prompt with the telemetry state as input and the correct scheduling decision — derived from a deterministic oracle that knows the full simulation state — as the target output. This produced a large, physically grounded dataset without requiring real satellite logs.
Training pipeline
- Data generation via PASEOS — thousands of satellite scheduling scenarios across a wide parameter space, each tagged with ground-truth decisions from the oracle
- ChatML formatting — each scenario serialised as a structured system + user + assistant message triple
- LoRA fine-tuning on Qwen 2.5-1.5B — rank-16 adapters trained on the formatted dataset
- AWQ INT4 quantisation — model compressed from ~3 GB to ~1.5 GB via AutoAWQ
- OTA package bundling — model weights bundled with SHA-256 checksums for secure in-orbit delivery
7. Operational task classes
The scheduler handles the full range of satellite operations. Each task class has a distinct resource profile, priority level, and tolerance for deferral. The AI model was trained to reason over all of them in combination.
Housekeeping & telemetry
Continuous collection of onboard sensor data — battery voltage, CPU temperature, memory usage, solar panel output, and attitude control readings. Lightweight but time-critical: missed housekeeping cycles obscure early fault signals. Scheduled as near-constant background tasks with low CPU and power requirements.
Telemetry & data downlink
Transmission of collected data to ground stations during contact windows. Requires sufficient battery margin for the radio and the ground station contact window to be open. The scheduler predicts contact windows and queues downlink tasks ahead of the pass, avoiding eclipse-period transmission when power margins are tight.
Inter-satellite link operations
Coordination tasks across the optical mesh: ISL ranging (precise distance measurement for orbit determination), data relay between nodes, coordination sync for cluster management, and distress relay for failed-node recovery. These require active ISL link windows and consume moderate power; the scheduler checks link availability and peer battery state before escalating tasks through the mesh.
Gateway uplink & downlink
High-priority data transfers between the constellation and ground gateways — firmware updates, model OTA packages, priority tasking commands from ground operators. Gateway transfers have strict integrity requirements; each package is signature-verified before application. The scheduler protects battery margin during active gateway sessions.
Constellation self-management
The 10% of compute reserved for the constellation's own intelligence: the scheduler itself, the state forecaster, anomaly detection runs, and station-keeping planning. These tasks are self-referential — the scheduler must reason about the resources consumed by its own operation — and are given priority access to prevent decision-loop starvation.
8. Technology stack
| Layer | Technology | Role |
|---|---|---|
| AI model | Qwen 2.5-1.5B |
Base language model — instruction-following, structured JSON output |
| Fine-tuning | LoRA rank-16 |
Adapter training for satellite scheduling task; base weights frozen |
| Quantisation | AWQ INT4 via AutoAWQ |
Reduces model from ~3 GB to ~1.5 GB; ~2 GB VRAM at inference |
| Inference runtime | HuggingFace Transformers + PyTorch + CUDA |
Onboard model serving; ~11–13 s per decision cycle |
| Training data | PASEOS (AI Sweden / ESA) |
Physically grounded synthetic scheduling scenarios at 1000 km LEO |
| Backend | Python 3.11 · FastAPI · Uvicorn |
Decision loop, ISL transport, ground station handler, REST API |
| Dashboard | React 18 · Vite |
Real-time monitoring UI with telemetry charts and decision flow |
| Streaming | Server-Sent Events (SSE) |
Low-latency push of decisions and telemetry to the dashboard |
| Orbital physics | Custom Kepler propagator |
Eclipse detection, phase timing, battery trajectory forecasting |
| Security | SHA-256 weight verification |
Boot-time integrity check against cosmic ray bit-flip corruption |
The fine-tuned model achieves ~87% decision accuracy against the oracle on held-out PASEOS scenarios — compared to ~72% for the zero-shot base model and ~81% for the deterministic rule engine on out-of-distribution constraint combinations. More importantly, the fine-tuned model produces valid structured JSON output on >99.5% of inference calls with the retry mechanism active, and its plain-language reasons are human-readable and consistent with the telemetry context.
9. What's next
The current system is a working proof of concept in software simulation. The next milestones are hardware integration — running the AWQ INT4 model on space-grade ARM processors without CUDA — and federated learning, which would allow constellation nodes to improve the scheduling policy in orbit from real mission data without downlinking raw telemetry to the ground.
The scheduler is not yet flying. But the architecture is ready to fly, and the results from PASEOS simulation give us confidence that a fine-tuned SLM can make smarter, more transparent decisions than any rule engine we could write before launch.