vielhuber/runpodhelper
======================

Automates self-hosted llm inference

Version 1.5.4 · MIT · Shell · PHP >=8.1

[![GitHub Tag](https://camo.githubusercontent.com/3074d5edeeed4741e213fe10941e3712a29c14e242d49377444b000556d25e6f/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f762f7461672f7669656c68756265722f72756e706f6468656c706572)](https://github.com/vielhuber/runpodhelper/tags)[![Code Style](https://camo.githubusercontent.com/1540f8ce219727155ab62506c77b818b720421c22c4cf0b18a5f160942132e2d/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f636f64655f7374796c652d7073722d2d31322d6666363962342e737667)](https://www.php-fig.org/psr/psr-12/)[![License](https://camo.githubusercontent.com/21e456533aa4518421d521d559169b0219ada2f0d0f4d4936939e7c09cc65cda/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6963656e73652f7669656c68756265722f72756e706f6468656c706572)](https://github.com/vielhuber/runpodhelper/blob/main/LICENSE.md)[![Last Commit](https://camo.githubusercontent.com/3ff0477ea8d75f44cf66115c0a16fb6e2c515cffe9eda7ec2dcd0095fa92285a/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f7669656c68756265722f72756e706f6468656c706572)](https://github.com/vielhuber/runpodhelper/commits)[![PHP Version Support](https://camo.githubusercontent.com/a37bd357a2d387737abed18c84b3b09a64d15ad8c54124fe33b15c048479da60/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f7068702d762f7669656c68756265722f72756e706f6468656c706572)](https://packagist.org/packages/vielhuber/runpodhelper)[![Packagist Downloads](https://camo.githubusercontent.com/2d71845869d523e40e13cb39a7943f271784414c84a6be55c73056e3eff14cfa/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f64742f7669656c68756265722f72756e706f6468656c706572)](https://packagist.org/packages/vielhuber/runpodhelper)

⛈ runpodhelper ⛈
================

runpodhelper automates the full lifecycle of self-hosted llm inference on runpod gpu cloud. it provisions pods via the runpod graphql api, installs lm studio or llama.cpp, downloads gguf models from huggingface, and serves them behind a cloudflare tunnel.

usage
-----

```
./vendor/bin/runpod.sh create --config pods.yaml
./vendor/bin/runpod.sh delete --all
./vendor/bin/runpod.sh status

./vendor/bin/runpod.sh create \
    --gpu "4x RTX PRO 6000" \
    --hdd 250 \
    --model "unsloth/MiniMax-M2.7-GGUF-UD-Q4_K_XL" \
    --image "runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404" \
    --type "llamacpp" \
    --api-key "your-static-api-key" \
    --context-length 262144 \
    --parallel 2 \
    --datacenter "EUR-IS-2" \
    --auto-destroy 3600

./vendor/bin/runpod.sh delete --id 001
./vendor/bin/runpod.sh test quality --runs 5
./vendor/bin/runpod.sh test quantity --runs 80

./vendor/bin/runpod.sh scale --start \
    --gpu "RTX 5090" \
    --hdd 50 \
    --model "unsloth/Qwen3.5-27B-GGUF-UD-Q4_K_XL" \
    --image "runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404" \
    --type "lmstudio" \
    --api-key "your-static-api-key" \
    --context-length 65536 \
    --parallel 2 \
    --datacenter "EUR-IS-2" \
    --auto-destroy 3600 \
    --pod-count 3

./vendor/bin/runpod.sh scale --start \
    --gpu "L40S" \
    --hdd 60 \
    --model "unsloth/Qwen3.5-35B-A3B-GGUF" \
    --image "runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404" \
    --type "llamacpp" \
    --api-key "your-static-api-key" \
    --context-length 131072 \
    --parallel 1 \
    --pod-count 1

./vendor/bin/runpod.sh scale --start \
    --gpu "2x RTX PRO 6000" \
    --hdd 250 \
    --model "unsloth/MiniMax-M2.7-GGUF-UD-Q4_K_XL" \
    --image "runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404" \
    --type "llamacpp" \
    --api-key "your-static-api-key" \
    --context-length 131072 \
    --parallel 1 \
    --pod-count 1

./vendor/bin/runpod.sh scale --start \
    --gpu "4x RTX PRO 6000" \
    --hdd 250 \
    --model "unsloth/MiniMax-M2.7-GGUF-UD-Q4_K_XL" \
    --image "runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404" \
    --type "llamacpp" \
    --api-key "your-static-api-key" \
    --context-length 262144 \
    --parallel 2 \
    --pod-count 1

./vendor/bin/runpod.sh scale --stop
./vendor/bin/runpod.sh scale --pod-count 20
./vendor/bin/runpod.sh scale --refresh --context-length 65536 --parallel 2
./vendor/bin/runpod.sh scale --refresh
```

rules
-----

- `gpu-vram ≈ model-size + context-length * model-factor`
- `token-budget-per-session ≈ parallel * context-length`
- `workers-per-pod ≈ workers-count / pod-count`
- `running-workers-per-pod ≈ parallel`
- `concurrent-workers ≈ parallel * pod-count`
- `concurrent-workers ≈ 0.2 * parallel * workers-count`
- `pod-count ≈ 0.2 * workers-count`
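
The first and last rules can be turned into a quick sizing calculation. The sketch below is not part of runpodhelper: `max_context` and `pod_count` are hypothetical helper names, and the arithmetic only restates the approximations above (rounding `pod-count` to the nearest whole pod is an assumption).

```shell
#!/bin/sh
# Hypothetical helpers (not part of runpodhelper) that restate the rules above.

# max_context GPU_VRAM_GB MODEL_SIZE_GB MODEL_FACTOR
# Solves gpu-vram ≈ model-size + context-length * model-factor for context-length.
max_context() {
    awk -v vram="$1" -v size="$2" -v factor="$3" \
        'BEGIN { printf "%d\n", (vram - size) / factor }'
}

# pod_count WORKERS_COUNT
# Applies pod-count ≈ 0.2 * workers-count, rounded to the nearest pod, minimum one.
pod_count() {
    awk -v workers="$1" 'BEGIN {
        pods = int(workers * 0.2 + 0.5)
        if (pods < 1) pods = 1
        print pods
    }'
}

max_context 32 17.6 0.00022   # RTX 5090 + Qwen3.5-27B, prints 65454
pod_count 100                 # prints 20
```

The first result is a raw estimate; in practice it would be rounded to a nearby context size such as 65536.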

### RTX 5090 + Qwen3.5-27B

- `gpu-vram ≈ 32 GB`
- `model-size ≈ 17.6 GB`
- `model-factor ≈ 0.00022`
- `=> max-context-length ≈ 65536`
- `=> max-parallel ≈ 2` (at context-length 65536)

### L40S + Qwen3.5-27B

- `gpu-vram ≈ 48 GB`
- `model-size ≈ 17.6 GB`
- `model-factor ≈ 0.00022`
- `=> max-context-length ≈ 138240`
- `=> max-parallel ≈ 4` (at context-length 65536)

### RTX PRO 6000 + Qwen3.5-122B-A10B (MoE)

- `gpu-vram ≈ 96 GB`
- `model-size ≈ 66 GB`
- `model-factor ≈ 0.00013` (MoE, 10B active params)
- `=> max-context-length ≈ 131072` (128K model limit)
- `=> max-parallel ≈ 1` (at context-length 131072, ~83 GB total)
- `=> max-parallel ≈ 2` (at context-length 98304, ~92 GB total)

### RTX PRO 6000 + Qwen3.5-27B

- `gpu-vram ≈ 96 GB`
- `model-size ≈ 17.6 GB`
- `model-factor ≈ 0.00022`
- `=> max-context-length ≈ 356352`
- `=> max-parallel ≈ 10` (at context-length 65536)

### RTX PRO 6000 + gemma-4-26B-A4B (MoE)

- `gpu-vram ≈ 96 GB`
- `model-size ≈ 27.9 GB`
- `model-factor ≈ 0.00013` (MoE, 30 layers)
- `=> max-context-length ≈ 262144` (256K model limit)
- `=> max-parallel ≈ 8` (at context-length 65536)

### RTX PRO 6000 + gemma-4-31B

- `gpu-vram ≈ 96 GB`
- `model-size ≈ 27.5 GB`
- `model-factor ≈ 0.00025` (dense, 60 layers)
- `=> max-context-length ≈ 262144` (256K model limit)
- `=> max-parallel ≈ 4` (at context-length 65536)

### RTX PRO 6000 + Qwen3.6-35B-A3B (MoE)

- `gpu-vram ≈ 96 GB`
- `model-size ≈ 22 GB` (UD-Q4\_K\_XL)
- `model-factor ≈ 0.00013` (MoE, 3B active params)
- `=> max-context-length ≈ 262144` (256K model limit)
- `=> max-parallel ≈ 2` (at context-length 131072, ~56 GB total)
- `=> max-parallel ≈ 4` (at context-length 65536, ~56 GB total)
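
The "~… GB total" figures in the MoE entries above are consistent with `model-size + parallel * context-length * model-factor`. A small sketch for checking a planned configuration against the available VRAM follows; `vram_total` is a hypothetical helper name, not a runpodhelper command. The dense 27B entries do not follow this formula exactly, so treat the result as a rough estimate.

```shell
#!/bin/sh
# Hypothetical helper (not part of runpodhelper).
# vram_total MODEL_SIZE_GB CONTEXT_LENGTH PARALLEL MODEL_FACTOR
# Estimates total VRAM in GB as model-size + parallel * context-length * model-factor.
vram_total() {
    awk -v size="$1" -v ctx="$2" -v par="$3" -v factor="$4" \
        'BEGIN { printf "%.0f\n", size + par * ctx * factor }'
}

vram_total 66 131072 1 0.00013   # Qwen3.5-122B-A10B, prints 83
vram_total 66 98304 2 0.00013    # Qwen3.5-122B-A10B, prints 92
vram_total 22 65536 4 0.00013    # Qwen3.6-35B-A3B, prints 56
```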

installation
------------

- install library
    - `composer require vielhuber/runpodhelper`
    - `./vendor/bin/runpod.sh init`
- setup cloudflare
    - Create a domain `custom.xyz`
    - Profile > API Tokens > Create Token
        - Permissions:
            - `Zone / DNS / Edit`
            - `Zone / Single Redirect / Edit`
            - `Account / Cloudflare Tunnel / Edit`
        - Account Resources
            - `Include > Your account`
        - Zone Resources
            - Include / Specific zone / `custom.xyz`
    - Set `CLOUDFLARE_DOMAIN`/`CLOUDFLARE_API_KEY` in `.env`
    - Each pod gets a subdomain based on its config ID:
        - `001.custom.xyz`
        - `002.custom.xyz`
        - …
- edit config
    - `vi ./.env`
    - `vi ./models.yaml`
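
The subdomain scheme from the cloudflare step (`001.custom.xyz`, `002.custom.xyz`, …) can be reproduced when scripting against the pods. A minimal sketch, assuming IDs are always zero-padded to three digits as in the examples; `pod_host` is a hypothetical helper, not something runpodhelper ships.

```shell
#!/bin/sh
# Hypothetical helper (not part of runpodhelper).
# pod_host POD_ID DOMAIN
# Builds the per-pod hostname, zero-padding the ID to three digits
# (padding width is assumed from the examples above).
pod_host() {
    awk -v id="$1" -v domain="$2" 'BEGIN { printf "%03d.%s\n", id, domain }'
}

pod_host 1 custom.xyz     # prints 001.custom.xyz
pod_host 002 custom.xyz   # prints 002.custom.xyz
```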

mcp server
----------

```
{
    "mcpServers": {
        "runpodhelper": {
            "command": "/usr/bin/php",
            "args": ["/path/to/project/runpodhelper/bin/mcp-server.php"]
        }
    }
}
```

recommended models
------------------

| Name | HDD | Model | Context length | Parallel | tok/s | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA GeForce RTX 5090 | 50 GB | Qwen3.5-27B-GGUF-UD-Q4\_K\_XL | 65536 | 2 | ~43 | best current MCP/tool-use baseline |
| NVIDIA L40S | 50 GB | Qwen3.5-27B-GGUF-UD-Q4\_K\_XL | 65536 | 4 | ~25 | 2x parallel slots vs. RTX 5090 |
| NVIDIA RTX PRO 6000 | 50 GB | Qwen3.5-27B-GGUF-UD-Q4\_K\_XL | 65536 | 10 | ~20 | max parallel slots, single pod |
| NVIDIA RTX PRO 6000 | 50 GB | gemma-4-26B-A4B-it-GGUF-UD-Q8\_K\_XL | 65536 | 8 | ~65 | MoE: 3.8B active params, best parallelism on 96 GB |
| NVIDIA RTX PRO 6000 | 50 GB | gemma-4-31B-it-GGUF-UD-Q6\_K\_XL | 65536 | 4 | ~18 | dense, best reliability on 96 GB |
| NVIDIA RTX PRO 6000 | 80 GB | Qwen3.5-122B-A10B-GGUF-UD-Q4\_K\_XL | 131072 | 1 | ~? | MoE: 10B active params, #1 intelligence index |
| NVIDIA RTX PRO 6000 | 50 GB | Qwen3.6-35B-A3B-GGUF-UD-Q4\_K\_XL | 131072 | 2 | ~? | MoE: 3B active params, high throughput at 128K ctx |
| NVIDIA A40 | 50 GB | Qwen3.5-27B-GGUF-UD-Q4\_K\_XL | 65536 | 2 | ~20 | discontinued/unavailable as of 2026-03 |

manual deployment
-----------------

-  &gt; Pods &gt; Deploy
- Pod template > Edit
- Expose HTTP ports (comma separated): `1234`
- Container Disk: `100 GB`
- Copy: SSH over exposed TCP
- `ssh root@xxxxxxxxxx -p xxxxx`

```
curl -fsSL https://lmstudio.ai/install.sh | bash
export PATH="/root/.lmstudio/bin:$PATH"
# this is unreliable
#lms get -y qwen/qwen3-coder-next
mkdir -p ~/.lmstudio/models/unsloth/MiniMax-M2.1-GGUF
cd ~/.lmstudio/models/unsloth/MiniMax-M2.1-GGUF
wget -c https://huggingface.co/unsloth/MiniMax-M2.1-GGUF/resolve/main/MiniMax-M2.1-UD-TQ1_0.gguf
mkdir -p ~/.lmstudio/models/lmstudio-community/Qwen3.5-35B-A3B-GGUF
cd ~/.lmstudio/models/lmstudio-community/Qwen3.5-35B-A3B-GGUF
wget -c https://huggingface.co/lmstudio-community/Qwen3.5-35B-A3B-GGUF/resolve/main/Qwen3.5-35B-A3B-Q4_K_M.gguf
lms server start --port 1234 --bind 0.0.0.0
```

alternative: use runpodctl
--------------------------

- `ssh-keygen -t ed25519 -C "name@tld.com"`
- `wget https://github.com/Run-Pod/runpodctl/releases/download/v1.14.3/runpodctl-linux-amd64 -O runpodctl`
- `chmod +x runpodctl`
- `mv runpodctl /usr/bin/runpodctl`
- `runpodctl config --apiKey `
- `runpodctl version`

more commands
-------------

- `curl http://localhost:1234/v1/models`
- `lms --help`
- `lms status`
- `lms server stop`
- Copy: HTTP services > URL

```
curl https://xxxxxxxxx-1234.proxy.runpod.net/v1/responses \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-static-api-key" \
  -d '{
    "model": "xxxxxxxxxxxxx",
    "input": [
        {"role": "user", "content": [{"type": "input_text", "text": "hi"}]}
    ],
    "temperature": 1.0,
    "stream": true
  }'
```

