The Token Tax: Netflix Engineer Open-Sources ‘Project Headroom’ to Slash LLM API Costs
Tejas Chopra's Project Headroom aims to reduce LLM bills by up to 90% by pruning redundant tokens and optimizing KV cache efficiency.
An analysis of Hy3 preview's unexpected surge in the OpenRouter model rankings, exploring the roles of pricing, prompt caching, and SiliconFlow.
An analysis of the surprising rise of Tencent's Hy3 model on OpenRouter, exploring how pricing and provider dynamics can skew LLM popularity data.
Norway's National Library is leveraging 2PB of Huawei flash storage and the Sigma2 supercomputer to build a local-language LLM, avoiding reliance on US-based AI.
Norway's National Library is leveraging 2PB of Huawei flash storage and a massive digital archive to build a sovereign LLM that preserves Norwegian culture.
From 'DAN' to gaslighting, hackers are moving beyond code to exploit the simulated personalities of AI chatbots to bypass safety guardrails.
Beyond traditional coding, a new wave of AI security threats relies on psychological manipulation and 'gaslighting' to bypass safety guardrails in LLMs.
From the 'Grandma exploit' to sophisticated gaslighting, hackers are moving away from code and toward psychological manipulation to bypass AI guardrails.
A threat report from TrendAI reveals how a Russian actor used jailbroken Gemini LLMs to impersonate US veterans and steal cryptocurrency from MAGA supporters.
Researchers from the University of Maryland reveal how minor semantic edits to AI agent 'skills' can bypass security scanners and lead to prompt injection attacks.