Load your first model
A fresh hal0 install boots into the FirstRun wizard. Three steps,
plus a “done” coda. Open the dashboard at http://localhost:8080 and
the wizard takes over the screen until the primary slot has a model.
What the wizard does
The wizard is a guarded route at /firstrun. Until the API reports
first_run: false, every other dashboard navigation redirects back to
it — there’s no point operating an empty box.
It owns three jobs: pick a starting model from the curated list,
confirm any license requirements, and stream a model pull straight into
the primary slot. When the slot transitions to ready, the wizard
moves to the done step and unlocks the rest of the dashboard.
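The redirect rule above can be sketched as a small shell function. This only illustrates the logic, it is not the dashboard's actual code: route_for and its arguments are hypothetical names, and the flag is assumed to arrive as the literal string true or false.

```shell
# Illustrative sketch of the FirstRun guard (hypothetical helper, not hal0 code).
# $1: value of first_run as reported by the API (assumed "true" or "false")
# $2: the dashboard path the user asked for
route_for() {
  local first_run="$1" requested="$2"
  # Until the API reports first_run: false, everything except the wizard
  # itself is sent back to /firstrun.
  if [ "$first_run" != "false" ] && [ "$requested" != "/firstrun" ]; then
    echo "/firstrun"
  else
    echo "$requested"
  fi
}
```

Under these assumptions, route_for true /slots prints /firstrun, while route_for false /slots prints /slots.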
The three steps
- Pick model. A curated list of starting picks, sized to your detected hardware. The probe already wrote /etc/hal0/hardware.json during install, so the list shows fit warnings inline: “this model is larger than your detected GPU” appears next to anything that would offload heavily.

  The default highlight is Phi-3-mini-4k-instruct-q4, a 2.4 GB Q4 GGUF that downloads in roughly 10 seconds on a modern connection and is small enough to fit anywhere hal0 runs. Strix Halo users should jump straight to a Q4 7B-class model or a Q4 MoE 30B; see recommended loadouts on the Strix Halo page.
- License. Some weights have a click-through license (Llama, Gemma, others). When the picked model requires it, step 2 shows the license URL and asks for explicit confirmation. If the model is freely redistributable, this step is skipped.
- Install. The dashboard kicks off a streaming pull through POST /api/models/{id}/pull. Progress streams live over SSE: bytes, percent, and the slot state transitioning through pulling → starting → warming → ready. The slot starts automatically when the download completes.
When primary reaches ready, the wizard moves to the done step. From
there you can head straight to OpenWebUI for a first chat,
or to the dashboard to load more models.
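If you want to follow the same pull from a terminal, the sketch below shows one way to think about it. The endpoint path and the state order come from the steps above; everything else is an assumption: the API is assumed to share the dashboard's host and port, HAL0_API is a hypothetical override variable, and the helper names are made up for illustration. The SSE payload format is not documented here, so reached_ready simply expects one state name per line.

```shell
# Sketch only (bash). Base URL, HAL0_API and the helper names are assumptions.

# Build the pull endpoint for a model id (path taken from the Install step).
pull_url() {
  local base="${HAL0_API:-http://localhost:8080}"   # assumed API base URL
  printf '%s/api/models/%s/pull\n' "$base" "$1"
}

# Print the state that follows the given one in the documented sequence.
next_state() {
  case "$1" in
    pulling)  echo starting ;;
    starting) echo warming ;;
    warming)  echo ready ;;
    *)        return 1 ;;
  esac
}

# Read one state per line on stdin. Succeed only if the sequence starts at
# "pulling", advances in the documented order (repeats allowed), and ends
# at "ready".
reached_ready() {
  local current="" state
  while IFS= read -r state; do
    [ -z "$state" ] && continue
    if [ -z "$current" ]; then
      [ "$state" = "pulling" ] || return 1
    elif [ "$state" != "$current" ]; then
      [ "$state" = "$(next_state "$current")" ] || return 1
    fi
    current="$state"
  done
  [ "$current" = "ready" ]
}
```

A pull could then be started with something like curl -N -X POST "$(pull_url <model-id>)", where -N turns off curl's output buffering so SSE events appear as they arrive. How you extract the state names from those events depends on the real payload.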
When the pull fails
The wizard surfaces errors inline. Common ones:
- No disk space. /var/lib/hal0/models/ ran out mid-pull. Free up space and retry; partial downloads resume.
- Hugging Face rate limit. Anonymous pulls hit a rate cap on popular weights. Export HF_TOKEN (or set it in /etc/hal0/api.env) and retry.
- License not accepted on Hugging Face. Some gated models require acceptance on the HF side before the API will serve the files. The error message links out to the model page.
The slot stays in the error state with details in
/var/lib/hal0/slots/primary/state.json until you retry — nothing
hidden.
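Before retrying after a disk-space failure, it can help to check how much room the models directory has. The helpers below are a minimal sketch using POSIX df; avail_kb and has_room are hypothetical names, not part of the hal0 CLI.

```shell
# Sketch only (bash): hypothetical helpers for a pre-retry disk check.

# Print the available kilobytes on the filesystem holding a directory.
# -P forces the portable one-line-per-filesystem format, -k reports in KB.
avail_kb() {
  df -Pk "$1" | awk 'NR==2 { print $4 }'
}

# Succeed if the directory has at least the requested number of KB free.
# Fails if df could not report on the directory.
has_room() {
  local dir="$1" need_kb="$2" have_kb
  have_kb="$(avail_kb "$dir")"
  [ -n "$have_kb" ] && [ "$have_kb" -ge "$need_kb" ]
}
```

For the 2.4 GB default model, has_room /var/lib/hal0/models 2500000 leaves a little headroom. To see why the slot is in the error state, cat /var/lib/hal0/slots/primary/state.json prints the details mentioned above.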
Picking something other than the default
The wizard’s curated list is a starting point, not an exhaustive catalog. After it’s done you can:
- Add models from the Models page in the dashboard
- Assign them to slots from the Slots page
- Or do it from the CLI:

    hal0 model list
    hal0 slot swap primary --model qwen2.5-coder-7b-instruct-q4_k_m

See recommended loadouts for a hardware-by-hardware breakdown of what fits where.