The Epson Photo+ tool is quite inconvenient when selecting folders: you always have to browse to the path you want, and if some photos live on your main drive and some on your NAS, you have to click through the whole dialog every time.
One solution is a link, in the sense of a Linux file-system symlink. This also works on Windows.
## As admin, run in cmd:
mklink /D "C:\Users\[username]\Pictures\Pixel7a NAS mklink" "Z:\photos\Pixel 7a"
## Output:
symbolic link created for C:\Users\windo\Pictures\2024 Pixel7a NAS mklink <<===>> Z:\photos\Smartphone-Backups\2024 Pixel7a
Notes:
Deleting the mklink folder does not delete the remote folder.
Unlike on Linux, you must not create the new link folder in advance.
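The "deleting the link does not delete the target" behavior can be sanity-checked with a small Python sketch of my own (not from the post; symlink semantics are the same on Linux and Windows, and the paths here are made up):

```python
import os
import tempfile

# Hypothetical layout: a "photos" target dir and a "Pictures_link" symlink to it.
base = tempfile.mkdtemp()
target = os.path.join(base, "photos")
link = os.path.join(base, "Pictures_link")
os.mkdir(target)
open(os.path.join(target, "img.jpg"), "w").close()

# Create the symlink (like mklink /D on Windows; needs admin there).
os.symlink(target, link, target_is_directory=True)
assert os.path.islink(link)
assert os.listdir(link) == ["img.jpg"]  # the link resolves to the target

# Delete only the link -- the target folder and its contents survive.
os.unlink(link)
assert not os.path.lexists(link)
assert os.path.exists(os.path.join(target, "img.jpg"))
print("target survived:", os.path.isdir(target))
```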
I asked Gemini. Prompt:
generate a windows link to a folder (link a linked folder, linux: ln)
from: C:\Users\[user]\Pictures> mkdir “Pixel7a NAS” to Z:\photos\Pixel7a is that possible
For PowerShell (not tested):
New-Item -ItemType SymbolicLink -Path "C:\Users\[username]\Pictures\Pixel7a NAS mklink" -Target "Z:\photos\Pixel 7a"
TL;DR: I can run some basic inference and TTS, but there is no proper pipeline or Home Assistant integration available, so next I'll go back to Rhasspy.
In my last post I set up my Roborock S7 (aka Rocki) with Home Assistant and built a voice assistant with the Voice Preview device and Google Gemini models.
In this post I want to document my exploration into running a proper speech-to-speech omni model to control Rocki. First step: get the model to run and somehow be able to input speech.
git clone https://github.com/ictnlp/LLaMA-Omni2
cd LLaMA-Omni2
# sidetrack to install anaconda: go to https://repo.anaconda.com/archive/
# i selected https://repo.anaconda.com/archive/Anaconda3-2025.06-1-Linux-x86_64.sh
# I run linux in WSL on windows
conda create -n llama-omni2 python=3.10
conda activate llama-omni2
pip install -e .
# now run a python shell
python
>>> import whisper
>>> model = whisper.load_model("large-v3", download_root="models/speech_encoder/")
>>> exit()
huggingface-cli download --resume-download ICTNLP/cosy2_decoder --local-dir models/cosy2_decoder
model_name=LLaMA-Omni2-7B
huggingface-cli download --resume-download ICTNLP/$model_name --local-dir models/$model_name
# it's downloading a lot of large files...
# maybe the 7B is a little big for my RTX3060 with 12GB VRAM, and also might be slow, so for testing, let's get the smallest 0.5B model.
# And who knows how this will work, like does whisper run in parallel, then blocking VRAM?
model_name=LLaMA-Omni2-0.5B
huggingface-cli download --resume-download ICTNLP/$model_name --local-dir models/$model_name
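A back-of-envelope check of that VRAM worry (my own rough numbers: fp16 weights at 2 bytes per parameter, ignoring the KV cache, the Whisper encoder, and activations):

```python
# Rough VRAM needed just for the model weights, assuming fp16 (2 bytes/param).
# This deliberately ignores KV cache, Whisper, and the vocoder, which all add more.
def weight_vram_gib(params_billion, bytes_per_param=2):
    return params_billion * 1e9 * bytes_per_param / 1024**3

print(f"7B:   ~{weight_vram_gib(7):.1f} GiB")    # ~13.0 GiB -- already over a 12 GB RTX 3060
print(f"0.5B: ~{weight_vram_gib(0.5):.1f} GiB")  # ~0.9 GiB -- plenty of headroom
```

So even before the extra models, the 7B weights alone would not fit in 12 GB, which is why the 0.5B variant is the safer test bed.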
# FIX 1: we somehow need matcha-tts; I ran into errors, and installing it lets the demo run
pip install matcha-tts
# FIX 2: install ffmpeg (source: https://gist.github.com/ScottJWalter/eab4f534fa2fc9eb51278768fd229d70)
sudo add-apt-repository ppa:mc3man/trusty-media
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install ffmpeg
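Since Whisper shells out to ffmpeg for audio decoding, a quick check of my own (not from the repo) to confirm the install took:

```python
import shutil

# Look up ffmpeg on PATH; returns the full path, or None if it's missing.
path = shutil.which("ffmpeg")
if path is None:
    print("ffmpeg not found -- whisper will fail to load audio files")
else:
    print("ffmpeg at", path)
```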
# open 3 terminals, make sure to activate the conda environment in each
# 1)
python -m omni_speech.serve.controller --host 0.0.0.0 --port 10000
# 2)
python -m llama_omni2.serve.gradio_web_server --controller http://localhost:10000 --port 8000 --vocoder-dir models/cosy2_decoder
# this has problems: jsonable_encoder errors; Gemini recommended:
pip install --upgrade pydantic
pip install --upgrade fastapi
# problem persists...
# 3)
model_name=LLaMA-Omni2-0.5B
python -m llama_omni2.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path models/$model_name --model-name $model_name
OK, it didn't work out of the box. 😐
Trying the local inference Python script instead. This works! I adapted it with my own question, "Why is the sky blue?", recorded my own audio, and got a response.
run_llama_omni2.py takes ~18.6s
run_cose2_decoder.py takes ~14.1s
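Assuming the two scripts run back to back with no overlap, the measured times add up to an end-to-end latency that rules out interactive voice use:

```python
# Measured times from the two scripts above (assuming sequential execution,
# i.e. the decoder can't start before the LLM finishes).
llm_s = 18.6      # run_llama_omni2.py
decoder_s = 14.1  # run_cose2_decoder.py

total = llm_s + decoder_s
print(f"~{total:.1f}s per spoken answer")  # ~32.7s -- far too slow for a voice assistant
```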
Result:
So: I think I need a more out-of-the-box approach here. Maybe go back to Rhasspy?!
Update 29.03.2026: running openclaw and integrating Home Assistant via the openclaw-homeassistant plugin allows for proper natural-language commands! Not trivial to set up (neither openclaw nor the plugin), though!