If Bash is your native tongue and you like breaking things on purpose (then writing down exactly why they broke), this role is basically “paid red team for an AI model.” You’ll push LLMs through real engineering scenarios, judge their code like a cranky senior reviewer, and document failure patterns so the model gets smarter and safer.

About Invisible Agency
Invisible supports companies by building scalable, high-quality operations. This project focuses on improving AI model performance using expert-created training data and rigorous evaluation, especially in technical domains.

Schedule
Remote (United States).
Contract / freelance.
Hours vary based on your availability (you’ll report average weekly hours).
You provide your own secure computer and high-speed internet.
No company benefits (contractor role).

What You’ll Do

Challenge AI models with software engineering tasks and technical scenarios, primarily implemented in Bash
Review outputs for logic, correctness, security, clarity, and “would this survive production?” quality
Identify and capture reproducible error traces and failure modes (bad assumptions, broken logic, unsafe commands, etc.)
Suggest improvements to prompts, evaluation metrics, and how the model is assessed on Bash fluency and engineering reasoning

What You Need

Strong Bash scripting experience (automation, pipelines, text processing, process control, environment management, debugging)
Comfort evaluating broader engineering concepts the prompts may touch (APIs, cloud, systems design, secure coding, distributed debugging)
Ability to communicate clearly and show your work when explaining what’s wrong and how to fix it
Bachelor’s, master’s, or PhD in computer science or software engineering (nice), or equivalent real-world experience (often better)
Bonus signals: open-source contributions, technical writing, production automation at scale

Benefits

Pay range: $8–$65/hour depending on experience, expertise, and location
Remote, flexible contract work
Direct impact: you’ll help harden how AI handles real technical work, not toy examples

Backbone moment: a lot of “Bash experts” are really “I can copy a one-liner from Stack Overflow.” If you’re actually strong, price yourself like it. If you’re rusty, still apply, but don’t claim “fluent” unless you can debug shell quoting in your sleep.

Happy Hunting,
~Two Chicks…
