Infra Assignment
Steps
- Sign up on Google Cloud. You will get $300 free credits, enough for the test.
- Find which of the inference frameworks are best for making an OpenAI-compatble inference server for a 7B LLM. Some examples: SGLang, vLLM
- Host a simple OpenAI compatible inference server image with a small 7B model (like Qwen-2.5-7B-Instruct) in Google K8s cluster (or equivalent)
- Clone and run SyllabAI, and change the base OpenAI URL in it with the URL generated by GKE.
- Run the repo.
- Check GKE on three parameters:
- Usage
- Tokens/s (by checking logs)
- Pricing of containers used
BROWNIE POINTS if you can set up OpenTelemetry and pair up Observability with Grafana
Assignment Duration
3 days
Submission
You expected a submission form, didn’t you? Sorry to break the mould, but you gotta show it to us, and as soon as you can!
Book a meet with the team here!
BASELINE It’s okay if you are unable to complete the entire assignment. We want to see the efforts you put in. Skills are transferable, the go-getter attitude isn’t! ;)